Calibration Method between Markerless Motion Capture and Optical Motion Capture
Summary
 There is a growing need for human body measurement using markerless motion capture. In this study, a markerless motion capture system (MV-OpenPose) that uses six calibrated RGB cameras is used to obtain measurements equivalent to those of marker-based optical motion capture systems. One issue is that the coordinate space used by MV-OpenPose differs from that of common optical motion capture systems. The purpose of this study is therefore to convert the coordinates of each joint acquired with MV-OpenPose into the coordinate space of an optical motion capture system and align the two.

MV-OpenPose
MV-OpenPose is a motion capture system that uses six RGB cameras and no markers. As shown in the figure below, the cameras are arranged to surround the subject to be captured, and the 3D coordinates of each joint are estimated by triangulation. The measurement accuracy is improved by averaging the 3D points estimated by triangulation across the different camera pairs, rather than relying on any single pair.
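
For illustration, a minimal version of this multi-view triangulation step can be sketched as below (assuming NumPy; the camera projection matrices and 2D joint detections are taken as given, and this is not the actual MV-OpenPose code). Each camera pair is triangulated with the standard linear DLT method, and the per-pair estimates are averaged.

```python
import itertools
import numpy as np

def triangulate_pair(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one joint from two calibrated views.

    P1, P2 : 3x4 camera projection matrices.
    x1, x2 : 2D joint detections (pixel coordinates) in each view.
    Returns the 3D point in world coordinates.
    """
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The 3D point is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def triangulate_joint(projections, detections):
    """Average the triangulations over all camera pairs that detected the joint.

    projections : list of 3x4 projection matrices (one per camera).
    detections  : list of 2D points, or None when the joint was not detected.
    """
    estimates = []
    for i, j in itertools.combinations(range(len(projections)), 2):
        if detections[i] is None or detections[j] is None:
            continue
        estimates.append(triangulate_pair(projections[i], projections[j],
                                          detections[i], detections[j]))
    return np.mean(estimates, axis=0) if estimates else None
```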



The figures below show actual motion capture using MV-OpenPose and an example of captured results.



ICP Algorithm
The proposed method utilizes the Iterative Closest Point (ICP) algorithm, which aligns two point clouds by iteratively adjusting their relative position and orientation, step by step, until the clouds match.
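
A minimal point-to-point ICP can be sketched as follows (assuming NumPy and SciPy; this is a simplified stand-in, not the project's exact implementation). Nearest-neighbor correspondences are found with a k-d tree, and the rigid update is solved in closed form with the Kabsch/SVD method, starting from a user-supplied initial transform.

```python
import numpy as np
from scipy.spatial import cKDTree

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst (Kabsch/SVD)."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:   # avoid reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = c_dst - R @ c_src
    return R, t

def icp(source, target, init=np.eye(4), max_iter=50, tol=1e-6):
    """Align `source` (Nx3) to `target` (Mx3); returns a 4x4 transform and mean error."""
    tree = cKDTree(target)
    T = init.copy()
    src = (init[:3, :3] @ source.T).T + init[:3, 3]
    prev_err = np.inf
    for _ in range(max_iter):
        dist, idx = tree.query(src)            # nearest-neighbor correspondences
        R, t = best_rigid_transform(src, target[idx])
        src = (R @ src.T).T + t                # apply the incremental update
        step = np.eye(4)
        step[:3, :3] = R
        step[:3, 3] = t
        T = step @ T
        err = dist.mean()
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return T, err
```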



Procedure for matching between the two motion capture systems
 The alignment procedure consists of a manual rough alignment to set the initial values for ICP, followed by fine alignment using the ICP algorithm.
  1. Manual setting of the conversion parameters and rough alignment
     First, the timing of the capture data acquired from the two motion capture systems is manually synchronized. Next, while viewing the acquired data, 4 to 8 keyframes in which the actor's position and pose are close in time are manually selected, and the keyframes of each system are combined into a single point cloud. Finally, the translation, scale, and rotation parameters are manually adjusted (a code sketch of this and the following steps is given after this list). Below is an example of the point clouds visualized in MeshLab with these initial values for ICP.



  2. Fine alignment using the manually transformed result as the initial value for ICP
     With the manually aligned parameters set as the initial values, the ICP algorithm is applied from the MV-OpenPose point cloud to the point cloud of the optical motion capture system. One difficulty is that the two systems use different skeletal models. However, because ICP minimizes the distance to the nearest neighbor point, the distance between corresponding points can still be optimized even when the joint positions are defined differently. Below is an example of running ICP from the MV-OpenPose point cloud to the optical motion capture point cloud, starting from the initial point cloud set above.
    (Red: optical motion capture, Blue: MV-OpenPose before ICP, Black: MV-OpenPose after ICP)




  3. Scale and synchronization parameter estimation by iterative processing
     Since the initially chosen frame synchronization may be inaccurate, ICP is run repeatedly while shifting the frame offset, and a greedy search finds the offset that minimizes the alignment error. The scale parameter is obtained by the same greedy search.

  4. Visualization of all frames
     The estimated alignment transformation is applied to all frames, and the joint point clouds acquired by the optical motion capture system and by MV-OpenPose are plotted on the same graph to check whether the alignment is correct.
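
As a rough illustration of the manual alignment in step 1 (not the project's actual code), the sketch below stacks hand-picked keyframes from both systems into single point clouds, applies a hand-tuned similarity transform (scale, rotation, translation) to the MV-OpenPose side, and writes PLY files that can be inspected in MeshLab. The frame indices, transform values, and file names are illustrative placeholders.

```python
import numpy as np
from scipy.spatial.transform import Rotation

# Assumed inputs: per-frame (num_joints x 3) joint arrays from each system,
# loaded elsewhere; random placeholders are used here so the sketch runs.
openpose_frames = [np.random.rand(25, 3) * 1000 for _ in range(400)]   # e.g. in mm
optical_frames = [np.random.rand(39, 3) for _ in range(400)]           # e.g. in m

def stack_keyframes(frames, keyframe_ids):
    """Concatenate the (num_joints x 3) joint arrays of the chosen keyframes."""
    return np.vstack([frames[i] for i in keyframe_ids])

def similarity_transform(points, scale, euler_deg, translation):
    """Apply a hand-tuned similarity transform: scaling, then rotation, then translation."""
    R = Rotation.from_euler("xyz", euler_deg, degrees=True).as_matrix()
    return scale * (R @ points.T).T + np.asarray(translation)

def write_ply(path, points):
    """Minimal ASCII PLY writer so the clouds can be inspected in MeshLab."""
    with open(path, "w") as f:
        f.write("ply\nformat ascii 1.0\n")
        f.write(f"element vertex {len(points)}\n")
        f.write("property float x\nproperty float y\nproperty float z\nend_header\n")
        for x, y, z in points:
            f.write(f"{x} {y} {z}\n")

# Keyframes chosen by hand so that the actor's pose is similar in both systems,
# then each set is merged into a single point cloud.
openpose_cloud = stack_keyframes(openpose_frames, keyframe_ids=[10, 120, 240, 360])
optical_cloud = stack_keyframes(optical_frames, keyframe_ids=[12, 122, 242, 362])

# Manually adjusted translation, scale and rotation (values are illustrative).
rough_cloud = similarity_transform(openpose_cloud, scale=0.001,
                                   euler_deg=[0, 0, 90],
                                   translation=[0.5, 0.0, -1.2])
write_ply("optical_keyframes.ply", optical_cloud)
write_ply("openpose_keyframes_rough.ply", rough_cloud)
```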
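
Steps 2 to 4 can then be sketched as a greedy search over the frame offset and the scale, reusing the icp() function from the ICP section above for each candidate and keeping the setting with the smallest mean nearest-neighbor error, before applying the result to all frames. The search ranges and helper names below are assumptions for illustration, not the project's implementation.

```python
import numpy as np

def keyframe_cloud(frames, keyframe_ids, offset=0):
    """Stack the chosen keyframes, shifting their frame indices by `offset`."""
    last = len(frames) - 1
    return np.vstack([frames[min(max(i + offset, 0), last)] for i in keyframe_ids])

def greedy_alignment(openpose_frames, optical_frames, op_keyframes, ot_keyframes,
                     init_rotation, init_translation,
                     offsets=range(-10, 11),
                     scales=np.linspace(0.0008, 0.0012, 9)):
    """Greedy search over frame offset and scale; keeps the ICP result with the
    smallest mean nearest-neighbor error. Returns (offset, scale, 4x4 transform)."""
    target = keyframe_cloud(optical_frames, ot_keyframes)
    init = np.eye(4)
    init[:3, :3] = init_rotation
    init[:3, 3] = init_translation
    best, best_err = None, np.inf
    for offset in offsets:
        for scale in scales:
            source = scale * keyframe_cloud(openpose_frames, op_keyframes, offset)
            T, err = icp(source, target, init=init)   # icp() from the earlier sketch
            if err < best_err:
                best, best_err = (offset, scale, T), err
    return best

def apply_to_all_frames(openpose_frames, scale, T):
    """Apply the estimated scale and rigid transform to every MV-OpenPose frame,
    so both systems can be plotted on the same graph for visual checking."""
    return [(T[:3, :3] @ (scale * joints).T).T + T[:3, 3] for joints in openpose_frames]
```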

Experimental results
 The alignment of the two motion capture systems using the proposed method was verified in two situations. The first is a slow, circular walking motion; the second is an acrobatic movement in which the actor stands on their head and spins around. Despite the complexity and intensity of the second motion, the two sets of motion capture data are in good agreement. The discrepancy is, however, larger than in the first situation, and this verification confirmed that in some frames MV-OpenPose either failed to detect the toes or clearly estimated them in the wrong position.

(Results for Motion 1 and Motion 2)


Resources
  • Code: GitHub (ICP calculation and visualization code)
  • Data: Captured by the Sports Performance Research Center of the Kanoya University of Health and Sport Sciences.


Publications
  • Takahiro Shirakawa, Hitoshi Teshima, Takumi Kitamura, Diego Thomas, Hiroshi Kawasaki
    Calibration Method between Markerless Motion Capture and Optical Motion Capture
    The 25th Meeting on Image Recognition and Understanding, 2022
  • Akihiko Sayo, Diego Thomas, Hiroshi Kawasaki, Yuta Nakashima, Katsushi Ikeuchi
    PoseRN: A 2D Pose Refinement Network For Bias-Free Multi-View 3D Human Pose Estimation
    IEEE International Conference on Image Processing (ICIP), pp.3233-3237, 2021
Computer Vision and Graphics Laboratory