
Why do they make a new camera coordinate system... so confusing.

TL;DR: CO3D uses the PyTorch3D camera coordinate system (+X left, +Y up, +Z pointing out of the image plane into the scene), which is different from both the OpenGL convention (+X right, +Y up, camera looking down -Z) and the conventional OpenCV-style convention (+X right, +Y down, +Z forward).

https://pytorch3d.readthedocs.io/en/latest/modules/renderer/cameras.html

CO3D also uses the "ndc_norm_image_bounds" convention for pixel space, which normalizes pixel coordinates so that the image bounds map to [-1, 1]. Also, the rotation has to be multiplied from the right! In terms of my code below, the R they store is a change-of-axes matrix, not a point-transformation matrix: it is the transpose of the conventional world-to-camera rotation.
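
To see what the right-multiply convention means in practice, here is a minimal numpy sketch (the rotation and translation values are made up for illustration):

import numpy as np

# a toy PyTorch3D-style extrinsic: X_cam = X_world @ R + T (points as row vectors)
theta = np.pi / 6
R = np.array([[np.cos(theta), -np.sin(theta), 0.],
              [np.sin(theta),  np.cos(theta), 0.],
              [0.,             0.,            1.]], dtype=np.float32)
T = np.array([0.1, -0.2, 3.0], dtype=np.float32)

X_world = np.random.rand(5, 3).astype(np.float32)  # (N, 3), one point per row

X_cam_right = X_world @ R + T          # PyTorch3D right-multiply form
X_cam_left = (R.T @ X_world.T).T + T   # conventional form: x_cam = R^T x + T

assert np.allclose(X_cam_right, X_cam_left, atol=1e-6)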

class ViewpointAnnotation:
    # In right-multiply (PyTorch3D) format. X_cam = X_world @ R + T
    R: Tuple[TF3, TF3, TF3]
    T: TF3

    focal_length: Tuple[float, float]
    principal_point: Tuple[float, float]

    intrinsics_format: str = "ndc_norm_image_bounds"
    # Defines the co-ordinate system where focal_length and principal_point live.
    # Possible values: ndc_isotropic | ndc_norm_image_bounds (default)
    # ndc_norm_image_bounds: legacy PyTorch3D NDC format, where image boundaries
    #     correspond to [-1, 1] x [-1, 1], and the scale along x and y may differ
    # ndc_isotropic: PyTorch3D 0.5+ NDC convention where the shorter side has
    #     the range [-1, 1], and the longer one has the range [-s, s]; s >= 1,
    #     where s is the aspect ratio. The scale is same along x and y.

https://github.com/facebookresearch/co3d/blob/83ded49453f287f0a330e7fc4a1bd354878cb517/co3d/dataset/data_types.py#L64
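
To make the two intrinsics formats concrete, here is a small helper of my own (denormalize_intrinsics is hypothetical, not part of the CO3D tooling) that converts either format to a pixel-space K; it follows the same sign handling as my projection code below:

import numpy as np

def denormalize_intrinsics(focal, princpt, img_h, img_w,
                           intrinsics_format='ndc_norm_image_bounds'):
    # hypothetical helper: NDC focal length / principal point -> pixel-space K
    half_w, half_h = img_w / 2., img_h / 2.
    if intrinsics_format == 'ndc_norm_image_bounds':
        # x and y are scaled independently; image bounds map to [-1, 1] x [-1, 1]
        scale_x, scale_y = half_w, half_h
    elif intrinsics_format == 'ndc_isotropic':
        # the shorter side maps to [-1, 1]; one isotropic scale for both axes
        scale_x = scale_y = min(half_w, half_h)
    else:
        raise ValueError(f'unknown intrinsics_format: {intrinsics_format}')
    return np.array([[focal[0] * scale_x, 0., princpt[0] * scale_x + half_w],
                     [0., focal[1] * scale_y, princpt[1] * scale_y + half_h],
                     [0., 0., 1.]], dtype=np.float32)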

 


My actual code for projection

# load camera (inside my dataset loader; np = numpy, osp = os.path, o3d = open3d)
cam_data = annot['viewpoint']
R = np.array(cam_data['R'], dtype=np.float32)  # (3,3), right-multiply rotation
t = np.array(cam_data['T'], dtype=np.float32).reshape(3,1)  # (3,1)
focal = np.array(cam_data['focal_length'], dtype=np.float32)  # (2,)
princpt = np.array(cam_data['principal_point'], dtype=np.float32)  # (2,)
K = np.array([[focal[0], 0., princpt[0]],
              [0., focal[1], princpt[1]],
              [0., 0.,       1.]], dtype=np.float32)  # still in NDC units


# parse PyTorch3D camera parameters to the conventional format
# (alternative: fold the axis flip into K instead of R; kept for reference)
# K[:2] = -1 * K[:2]
# denorm_factor = np.array([
#    [img_shape[1] / 2., 0., img_shape[1] / 2.],
#    [0., img_shape[0] / 2., img_shape[0] / 2.],
#    [0., 0., 1.]
# ], dtype=np.float32)
# K = denorm_factor @ K
# R = R.T
            
# denormalize K from ndc_norm_image_bounds to pixel coordinates:
# [-1, 1] x [-1, 1] maps to [0, W] x [0, H]
denorm_factor = np.array([
    [img_shape[1] / 2., 0., img_shape[1] / 2.],
    [0., img_shape[0] / 2., img_shape[0] / 2.],
    [0., 0., 1.]
], dtype=np.float32)
K = denorm_factor @ K

# flip the x,y axes and transpose R to match the conventional camera;
# the camera center is convention-independent, so recompute t from it
cam_center = - R @ t          # camera center in world coords (from X_cam = X_world @ R + T)
R[:, :2] = - R[:, :2].copy()  # flip the camera x,y axes (PyTorch3D -> conventional)
R = R.T                       # change-of-axes matrix -> point-transformation matrix
t = - R @ cam_center
            
# load pointcloud
pcd_path = osp.join(self.data_dir, scene_name, 'pointcloud.ply')
pcd = o3d.io.read_point_cloud(pcd_path)
points = np.asarray(pcd.points, dtype=np.float32)  # (num_points, 3)

# project
Rt = np.concatenate([R, t], axis=1)  # (3,4) world-to-camera extrinsic [R|t]
img_points = project(points, K, Rt)
            

where the project helper is:

def project(xyz, K, RT):
    """
    xyz: [N, 3] world-space points
    K: [3, 3] pixel-space intrinsics
    RT: [3, 4] world-to-camera extrinsic [R|t]
    Returns [N, 2] pixel coordinates.
    """
    xyz = np.dot(xyz, RT[:, :3].T) + RT[:, 3:].T  # world -> camera
    xyz = np.dot(xyz, K.T)                        # camera -> homogeneous pixel coords
    xy = xyz[:, :2] / xyz[:, 2:]                  # perspective divide
    return xy
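
A quick way to sanity-check the whole conversion is to overlay the projected points on the frame. This is just a sketch: img is assumed to be the corresponding RGB frame (e.g., loaded with cv2.imread), and points, K, Rt come from the code above.

import cv2

uv = project(points, K, Rt)  # (num_points, 2) pixel coordinates
vis = img.copy()
h, w = vis.shape[:2]
for u, v in uv[::50]:  # subsample for speed
    if 0 <= u < w and 0 <= v < h:
        cv2.circle(vis, (int(u), int(v)), 1, (0, 255, 0), -1)
cv2.imwrite('projection_check.jpg', vis)  # the dots should land on the object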

 

 

Extra)
