The Microsoft Kinect has two cameras: an RGB camera and a depth-finding camera. Like any camera lens, the Kinect's lenses are not perfectly shaped, so the captured scene is distorted. In addition, the two cameras are offset from each other, much like human eyes, so each camera has its own field of view. To create a point cloud with the correct RGB and depth information for each point, we have to merge the information from both cameras.
For calibration we used the RGB-Demo from Nicolas Burrus and built a checkerboard.
We used RGB-Demo to record pictures of the checkerboard at different positions. After taking a large number of pictures, we used the built-in 3D calibration of RGB-Demo to find the checkerboard in every picture and to calculate, over all the pictures, the lens distortion and lens intrinsics for both the RGB and the depth-finding camera. The relative transformation between the two cameras is calculated automatically as well.
To reconstruct the recorded scene, we project the data of the depth-finding camera into 3D space using the lens intrinsics. To colorize a 3D point, we first apply the relative transformation to move into the RGB camera's coordinate system. The second step is to project the transformed 3D point onto the 2D image plane of the RGB image using the lens intrinsics of the RGB camera.
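The three steps above can be sketched as follows. All intrinsics and the relative transform are assumed placeholder values (a real pipeline would use the numbers produced by the calibration, and would also undo lens distortion); the identity rotation plus a small horizontal baseline merely mimics the Kinect's camera offset.

```python
# Depth-camera intrinsics (fx, fy, cx, cy) -- assumed values.
DFX, DFY, DCX, DCY = 570.0, 570.0, 319.5, 239.5
# RGB-camera intrinsics -- assumed values.
CFX, CFY, CCX, CCY = 525.0, 525.0, 319.5, 239.5

# Relative transform depth -> RGB: identity rotation plus a small
# horizontal baseline (in metres). Placeholder, not real calibration output.
R = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
T = [-0.025, 0.0, 0.0]

def depth_pixel_to_point(u, v, depth_m):
    """Step 1: back-project a depth pixel (u, v) with metric depth into 3D."""
    x = (u - DCX) / DFX * depth_m
    y = (v - DCY) / DFY * depth_m
    return (x, y, depth_m)

def transform_to_rgb(p):
    """Step 2: apply the relative transform to enter the RGB coordinate system."""
    x, y, z = p
    return tuple(R[i][0] * x + R[i][1] * y + R[i][2] * z + T[i] for i in range(3))

def project_to_rgb(p):
    """Step 3: project the 3D point onto the RGB image plane (distortion omitted)."""
    x, y, z = p
    return (CFX * x / z + CCX, CFY * y / z + CCY)

# Example: find the colour pixel for the depth pixel (400, 300) at 1.5 m.
p3d = depth_pixel_to_point(400, 300, 1.5)
u_rgb, v_rgb = project_to_rgb(transform_to_rgb(p3d))
```

Reading the RGB image at `(u_rgb, v_rgb)` (after rounding and a bounds check) then gives the colour for that point of the cloud.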