Generalization of pixel-wise phase estimation by CNN and improvement of phase-unwrapping by MRF optimization for one-shot 3D scan
3D shape measurement is used in various fields, which demand high accuracy and stability. One common measurement technique is active stereo. An active stereo system is compact because it consists of only a projector and a camera, and it can measure a shape from a single captured image, which makes measurement fast. One severe drawback of such one-shot 3D scanning is sparse reconstruction, because the positional information of the projector coordinates is encoded in a spatial pattern. In addition, since the spatial pattern becomes complicated in order to embed this information efficiently, it is easily affected by noise, which results in unstable decoding and low accuracy of the reconstructed 3D shape.

Existing methods achieve densification with three deep learning components: one that estimates phase information for each pixel within the repetitive projected pattern, one that extracts features from images of the projected pattern, and one that uses the extracted features to infer, on a pixel-by-pixel basis, the corresponding points in the projected pattern. We add three further techniques to these to achieve highly accurate shape measurement.

1. Efficient pre-training by computer graphics

It is time-consuming to obtain training data for deep learning from real images. Therefore, the proposed method uses computer graphics to create the training data. To handle arbitrary shapes, a pattern is projected onto a large number of object surfaces generated by computer graphics. In addition, we add noise to the training data to account for the capture conditions, for example sensor noise and speckle noise from laser projectors. Adding this variation to the training data makes the estimation robust against noise.
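The noise augmentation described above can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the function name `add_capture_noise`, the parameters `sensor_sigma` and `speckle_sigma`, and the noise models (additive Gaussian for the sensor, multiplicative Gaussian for speckle) are all assumptions made for the example.

```python
import random


def add_capture_noise(image, sensor_sigma=0.02, speckle_sigma=0.1, seed=None):
    """Return a noisy copy of a grayscale image.

    `image` is a list of rows of floats in [0, 1], standing in for a
    rendered (computer graphics) view of the projected pattern.
    All names and noise models here are hypothetical.
    """
    rng = random.Random(seed)
    noisy = []
    for row in image:
        out_row = []
        for value in row:
            # Assumed speckle model: multiplicative noise on the projected light.
            speckled = value * (1.0 + rng.gauss(0.0, speckle_sigma))
            # Assumed sensor model: additive zero-mean Gaussian noise.
            sensed = speckled + rng.gauss(0.0, sensor_sigma)
            # Clamp to the valid intensity range.
            out_row.append(min(1.0, max(0.0, sensed)))
        noisy.append(out_row)
    return noisy


# Usage: augment one rendered training image with a fixed seed.
rendered = [[0.0, 0.5, 1.0], [0.2, 0.4, 0.6]]
augmented = add_capture_noise(rendered, seed=0)
```

Drawing a different seed for each rendered image would give a different noise realization per training sample.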

2. Corresponding nodes refinement by solving MRF

One of the important factors in active stereo is accurately finding the correspondences between the projection pattern and the pattern observed in the input image. This can be formulated as an optimization problem on a Markov random field (MRF). If the node inference is correct, the layout of a target node and its neighboring nodes in the input image should match their layout on the projection pattern. Among the corresponding-node candidates obtained by GCN inference, the one that best satisfies this condition is selected as the final inferred node.
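The selection rule above can be illustrated with a small sketch. The text does not say which solver is used, so this example swaps in iterated conditional modes (ICM), a simple greedy MRF optimizer. The function name `refine_correspondences`, the data layout, the cost values, and the 0/1 layout penalty are all hypothetical.

```python
def refine_correspondences(candidates, edges, weight=1.0, max_iters=10):
    """Pick one projector-grid label per detected node with ICM.

    candidates: {node: [((u, v), unary_cost), ...]} where (u, v) is a grid
        position on the projection pattern and unary_cost is lower for
        candidates the network prefers (assumed, e.g. 1 - probability).
    edges: [(a, b, (du, dv)), ...] meaning that in the input image node b
        sits at grid offset (du, dv) from node a.
    """
    # For every node, list its neighbors and the offset expected toward them.
    neighbors = {n: [] for n in candidates}
    for a, b, (du, dv) in edges:
        neighbors[a].append((b, (du, dv)))
        neighbors[b].append((a, (-du, -dv)))

    # Start from the candidate with the lowest unary cost.
    labels = {n: min(c, key=lambda item: item[1])[0]
              for n, c in candidates.items()}

    for _ in range(max_iters):
        changed = False
        for n, cands in candidates.items():
            def cost(item):
                (u, v), unary = item
                # Count neighbors whose current label breaks the layout.
                mismatches = sum(
                    (labels[m][0] - u, labels[m][1] - v) != off
                    for m, off in neighbors[n]
                )
                return unary + weight * mismatches

            best = min(cands, key=cost)[0]
            if best != labels[n]:
                labels[n] = best
                changed = True
        if not changed:
            break
    return labels


# Usage: three nodes in a row; the top-scoring candidate of "b" is wrong.
candidates = {
    "a": [((3, 5), 0.1), ((7, 2), 0.9)],
    "b": [((9, 9), 0.4), ((4, 5), 0.5)],
    "c": [((5, 5), 0.2), ((1, 1), 0.8)],
}
edges = [("a", "b", (1, 0)), ("b", "c", (1, 0))]
labels = refine_correspondences(candidates, edges)
```

In this toy case the layout term overrides the slightly better unary score of the wrong candidate for `"b"`. ICM only finds a local optimum, so a real system might prefer a stronger MRF solver.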

3. Pixel-wise phase refinement by Gaussian kernel

Because the phase estimation by U-Net is spatially limited by the sparse distribution of the dots on the projection pattern, it is hard to estimate phases on surfaces with high-frequency detail. Therefore, we propose a method that corrects the phase estimate using the lines in the pattern. The ideal phase is a sawtooth that ranges from 0 to 1 between one node and the next. We take the difference between the phase value inferred by U-Net and 0.5 (the ideal phase value on a line) as the correction value of the phase. We apply Gaussian filtering to the correction values, based on our assumption that their distribution is Gaussian. Using the resulting dense correction map, we obtain a phase that is close to the ideal phase.
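A one-dimensional sketch of this correction is given below. It assumes that the correction is `0.5 - estimated_phase` on line pixels and that the sparse corrections are spread to all pixels by Gaussian-weighted averaging (a normalized Gaussian filter). The function name `refine_phase`, the `line_mask` input, and the choice of `sigma` are hypothetical, and the actual method operates on 2D images.

```python
import math


def refine_phase(phase, line_mask, sigma=2.0):
    """Correct a 1D phase profile using pixels known to lie on pattern lines.

    phase: estimated phase per pixel (floats, nominally between 0 and 1).
    line_mask: True where the pixel lies on a pattern line, whose ideal
        phase is 0.5.
    """
    n = len(phase)
    # Sparse correction values, defined only on line pixels.
    sparse = [(0.5 - p) if on_line else 0.0
              for p, on_line in zip(phase, line_mask)]

    radius = int(3 * sigma)  # truncate the Gaussian kernel at 3 sigma
    refined = []
    for i in range(n):
        num = 0.0
        den = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            if line_mask[j]:
                w = math.exp(-((i - j) ** 2) / (2.0 * sigma ** 2))
                num += w * sparse[j]
                den += w
        # Dense correction: Gaussian-weighted mean of nearby line corrections.
        correction = num / den if den > 0.0 else 0.0
        # Apply the correction and wrap back into the unit interval.
        refined.append((phase[i] + correction) % 1.0)
    return refined


# Usage: a profile with a constant +0.1 bias; pixel 2 lies on a line.
estimated = [0.4, 0.5, 0.6, 0.7, 0.8]
on_line = [False, False, True, False, False]
refined = refine_phase(estimated, on_line)
```

With a single line pixel in range, every pixel receives the same correction, so the constant bias is removed. With several line pixels, the Gaussian weights blend their corrections smoothly between lines.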

Measurement results for various shapes. The results confirm that this method clearly and densely captures the fine irregularities of the objects.



Computer Vision and Graphics Laboratory