
 

Input: Two hands' keypoints from an SMPL-H mesh sequence

Output: Two Shadow robot hands' joint angles + base pose

No dynamics, just geometry. People say it's easy, but it was freaking difficult for me.


> Why did I choose this input and output? Or what are the technical differences?

Options: Humanoid, bi-manual arms with five-finger hands, flying robot hands (what I did)

Why didn't I do humanoid retargeting? No particular reason. I think humanoid retargeting and flying-robot-hand retargeting are more similar to each other than to bi-manual arm retargeting: because the bi-manual setup's base is fixed, there is no global transformation to solve. I don't know how others do it, but in the humanoid and flying-robot-hand cases, the global transformation and the local transformation (joint angles) should be solved separately. Maybe I should do this in jaxmp again to learn...
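A minimal sketch of the split I mean, with hypothetical helpers (`estimate_wrist_frame`, `solve_finger_ik`), not any library's actual API:

    import numpy as np

    def retarget_frame(keypoints_world, estimate_wrist_frame, solve_finger_ik):
        """keypoints_world: (21, 3) MANO-ordered hand keypoints in the world frame."""
        wrist_pos = keypoints_world[0]                     # global translation
        wrist_rot = estimate_wrist_frame(keypoints_world)  # global rotation, (3, 3)

        # Express keypoints in the local wrist frame (row-vector convention,
        # same as in load_detection below), then solve only the joint angles.
        keypoints_local = (keypoints_world - wrist_pos) @ wrist_rot
        joint_angles = solve_finger_ik(keypoints_local)    # local IK, base stays put

        return wrist_pos, wrist_rot, joint_angles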

I think my position-based retargeting failed to give proper finger flexion and extension because I used fewer joints, following Dex-retargeting. I don't know exactly how Dex-retargeting did it. Maybe they solved the global transformation and did IK using fewer joints. https://github.com/dexsuite/dex-retargeting/blob/main/dex_retargeting/configs/offline/shadow_hand_right.yml

Humanoid catalog: https://www.freethink.com/robots-ai/humanoids

Unitree G1 (Oct 2024): https://www.unitree.com/g1/

From the LEAP Hand authors' presentation. Or look at this too: https://github.com/dexsuite/dex-retargeting/tree/main

The Mink GitHub repo also has all kinds of robots explained with example code:

https://github.com/kevinzakka/mink/blob/main/examples/ufactory_xarm7/README.md

 

> What I did for the demo

1. Extract 21 hand keypoints per hand from SMPL-H mesh (6890 vertices)

If it were an SMPL-X mesh (10475 vertices), I could have used this: https://github.com/mks0601/Hand4Whole_RELEASE/blob/2afaa618b301b96e861d42865e92e3175dbed2a5/common/utils/human_models.py#L119

But it was SMPL-H (6890 vertices)...

https://github.com/hongsukchoi/generic_tools/blob/master/smplhmesh_to_mano_joints.py

This code provides hand joints in this order (MediaPipe): https://ai.google.dev/edge/mediapipe/solutions/vision/hand_landmarker
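Very roughly, the idea behind that script (this is a sketch of the idea, not the script itself) is a linear joint regressor that maps mesh vertices to 21 keypoints; the shapes below are assumptions:

    import numpy as np

    def mesh_to_hand_keypoints(vertices, joint_regressor):
        """
        vertices: (6890, 3) SMPL-H mesh vertices for one frame.
        joint_regressor: (21, 6890) weights mapping vertices to 21 hand keypoints
            in MANO/MediaPipe order (assumed shape; a real regressor typically
            acts only on the hand vertices, plus fingertip vertex indices).
        """
        return joint_regressor @ vertices  # (21, 3) hand keypoints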

2. Retarget local joint angles (solving the inverse kinematics of the robot hands using the human MANO keypoints as the objective) + global wrist rotation

Types of retargeting: position vs. vector 

I modified Dex-retargeting's vector retargeting code to load hand detections from my data and to get the global transformation information.
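For reference, this is roughly what I understand the vector-retargeting objective to be (a sketch with placeholder names like `robot_fk`, not Dex-retargeting's actual code): match scaled human keypoint vectors to the corresponding robot vectors by optimizing the joint angles.

    import numpy as np

    def vector_retarget_cost(qpos, human_keypoints, origin_ids, task_ids,
                             robot_fk, scaling=1.0):
        """
        qpos: robot joint angles being optimized.
        human_keypoints: (21, 3) keypoints in the hand's local frame.
        origin_ids / task_ids: index pairs defining the vectors to match
            (e.g., wrist -> each fingertip).
        robot_fk: callable mapping qpos to the corresponding robot keypoint
            positions (placeholder for the robot's forward kinematics).
        """
        human_vecs = human_keypoints[task_ids] - human_keypoints[origin_ids]
        robot_pts = robot_fk(qpos)
        robot_vecs = robot_pts[task_ids] - robot_pts[origin_ids]
        return np.sum((scaling * human_vecs - robot_vecs) ** 2)

Position retargeting would match absolute keypoint positions instead of these relative vectors.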

3. Load the result into Viser

I rotated the base orientation (an arm attached to the Shadow hand), but maybe I could have rotated the wrist instead? Though I don't know how. Btw, I also fixed the translation gap between the base and the actual wrist location.

The base is the entire robot geometry, and the origin seems to be at the bottom of the forearm.
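How I think about the translation fix (a sketch with placeholder names; the wrist offset would come from the robot model):

    import numpy as np

    def place_robot_base(mano_wrist_pos, base_rot_world, wrist_offset_in_base):
        """
        mano_wrist_pos: (3,) human wrist position in the world frame.
        base_rot_world: (3, 3) robot base orientation in the world frame.
        wrist_offset_in_base: (3,) robot wrist position expressed in the base
            frame (read off the MJCF/URDF; placeholder name).
        """
        # Choose the base translation so that base + R @ offset lands on the
        # MANO wrist, closing the gap between the forearm-bottom origin and
        # the actual wrist.
        return mano_wrist_pos - base_rot_world @ wrist_offset_in_base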

 

Loading the retargeting result into Viser was actually the hardest and most confusing part. I didn't know the correct viewpoint from which I could interpret the retargeted visualization. The hands seemed to be just randomly moving, especially when I was visualizing a single hand. The tip is to visualize the original body mesh too, so you can see whether the retargeted robot hand motion matches it.
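Something like this is what made the result interpretable for me: put the SMPL-H body mesh and the retargeted hand keypoints in the same Viser scene (a sketch assuming a recent viser where scene handles live under `server.scene`; the dummy arrays stand in for real per-frame data):

    import time
    import numpy as np
    import viser

    server = viser.ViserServer()

    # Replace with the real SMPL-H output for the current frame.
    body_vertices = np.zeros((6890, 3))
    body_faces = np.zeros((1, 3), dtype=np.int32)
    hand_keypoints = np.zeros((21, 3))

    server.scene.add_mesh_simple(
        name="/smplh_body",
        vertices=body_vertices,
        faces=body_faces,
        color=(180, 180, 180),
    )
    server.scene.add_point_cloud(
        name="/hand_keypoints",
        points=hand_keypoints,
        colors=np.tile([255, 0, 0], (21, 1)),
        point_size=0.005,
    )

    while True:
        time.sleep(0.1)  # keep the server (and browser view) alive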

The confusing part in Dex-retargeting's code (this is my modified version):

    def load_detection(self, keypoint_3d_array):
        # keypoint_3d_array: (J, 3) in input

        keypoint_3d_array_root_relative = keypoint_3d_array - keypoint_3d_array[0:1, :]
        mediapipe_wrist_rot = self.estimate_frame_from_hand_points(keypoint_3d_array_root_relative)  # the MANO wrist's local frame, following their convention
        joint_pos = keypoint_3d_array @ mediapipe_wrist_rot @ self.operator2mano

        # keypoint_3d_array @ mediapipe_wrist_rot: transforms keypoints into the wrist local frame
        # (...) @ self.operator2mano: converts the rotated keypoints from the wrist frame to a new frame, likely the MANO (right or left) hand model's coordinate frame.
 
        # dummy values
        num_box = 1
        keypoint_2d = np.zeros((21, 2))

        return num_box, joint_pos, keypoint_3d_array, mediapipe_wrist_rot @ self.operator2mano

Later, you multiply the base pose by the rotation matrix (mediapipe_wrist_rot @ self.operator2mano) from the left, which means you are transforming the base to the wrist frame and then to the world frame, where `keypoint_3d_array` lives. This aligns the original human hand source and the retargeted robot hand well.
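Written out (my reading of it, with placeholder names), the left multiplication is just composing rotations:

    import numpy as np

    def base_rot_in_world(mediapipe_wrist_rot, operator2mano, base_rot_retarget):
        """
        mediapipe_wrist_rot, operator2mano: the (3, 3) rotations from load_detection.
        base_rot_retarget: (3, 3) base orientation produced by the retargeting,
            expressed in the MANO/retargeting frame (placeholder name).
        """
        # operator2mano takes the MANO frame to the wrist frame, and
        # mediapipe_wrist_rot takes the wrist frame to the world frame,
        # so the composition maps the base orientation back to the world.
        return (mediapipe_wrist_rot @ operator2mano) @ base_rot_retarget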

 

? I think this means that the robot hand has the same orientation as the MANO hand when they are in the same coordinate frame, like the MANO frame.

=> It seems not. They do not seem to be aligned. I don't know...! Maybe this is just a Dex-retargeting thing?
