The Framework

For our project “3D Scan 2.0” we developed a small C++ framework which depends on three main libraries:

  1. libfreenect – to use the Kinect hardware with our PC
  2. OpenSceneGraph – to visualize data retrieved from it
  3. ARToolKit – to handle positioning and orientation related things

The very first part…

…, and at the same time the heart of our framework, was to get data out of the very nice depth camera provided by Microsoft. After some research into current developments, we decided to use libfreenect as a starting point, because the main alternative, OpenNI, comes with too much overhead that we do not need for our approach. libfreenect also already offered a C++ wrapper, which seemed to suit our needs. After studying the code and getting familiar with the library, we found that we had to add some things. As a result, we decided to write our own wrapper class based on the original.

When our first prototype was designed, we planned to use OpenSceneGraph directly in order to have an automatic visualization mechanism. As development progressed, this turned out to be the wrong approach: the code grew drastically with every improvement and became hard to work with. So we decided to separate the data-related parts from the visualization. In the end we got two classes, which we describe now.

The first class is a kind of device class which provides access to the Kinect data streams and hardware information. Additionally, it is possible to retrieve the serial number of each connected Kinect camera. This is essential for correct visualization of RGB and depth values once a specific Kinect has been calibrated. The previously stored calibration data for the device serial is loaded as soon as the device is initialized and can be retrieved through getters.
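To illustrate the idea of calibration data keyed on the device serial, here is a minimal sketch in plain C++. It is not the code of our wrapper: the names `Calibration`, `CalibrationStore` and `KinectDevice`, as well as the two calibration fields, are hypothetical, and the libfreenect handle that a real device class would own is left out.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical calibration record; the fields are placeholders for illustration.
struct Calibration {
    double depthScale;
    double rgbOffsetX;
};

// Hypothetical store that maps a Kinect serial number to its saved calibration.
class CalibrationStore {
public:
    void save(const std::string& serial, const Calibration& c) { data_[serial] = c; }

    // Returns true and fills 'out' if calibration data exists for this serial.
    bool load(const std::string& serial, Calibration& out) const {
        auto it = data_.find(serial);
        if (it == data_.end()) return false;
        out = it->second;
        return true;
    }

private:
    std::map<std::string, Calibration> data_;
};

// Device class sketch: calibration is looked up once, when the device is initialized.
class KinectDevice {
public:
    KinectDevice(std::string serial, const CalibrationStore& store)
        : serial_(std::move(serial)) {
        assert(!serial_.empty());
        calibrated_ = store.load(serial_, calibration_);
    }

    const std::string& serial() const { return serial_; }
    bool isCalibrated() const { return calibrated_; }
    const Calibration& calibration() const { return calibration_; }

private:
    std::string serial_;
    Calibration calibration_{};
    bool calibrated_ = false;
};
```

A device whose serial has no stored calibration simply reports `isCalibrated() == false`, so the caller can decide whether to run a calibration first.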

The second wrapper is a kind of device factory. Its main function is to initialize libfreenect, but it also provides access to a connected Kinect. It is designed to support multiple devices simultaneously, although we do not use this feature yet. Further work on our framework includes enabling plug-and-play functionality for Kinect devices while the framework is running.
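The factory idea can be sketched as follows. This is an illustration under assumptions, not our actual class: `Device` and `DeviceFactory` are hypothetical names, and the list of serials passed to the constructor stands in for what the driver would report when enumerating connected hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a device wrapper; a real one would own a driver handle.
class Device {
public:
    explicit Device(std::string serial) : serial_(std::move(serial)) {}
    const std::string& serial() const { return serial_; }

private:
    std::string serial_;
};

// Hypothetical factory: knows which devices are connected and hands out
// one shared Device object per index.
class DeviceFactory {
public:
    // 'connectedSerials' is an assumption replacing real driver enumeration.
    explicit DeviceFactory(std::vector<std::string> connectedSerials)
        : serials_(std::move(connectedSerials)) {}

    std::size_t deviceCount() const { return serials_.size(); }

    // Returns the device at 'index', creating it on first access.
    std::shared_ptr<Device> open(std::size_t index) {
        assert(index < serials_.size());
        auto it = open_.find(index);
        if (it != open_.end()) return it->second;
        auto dev = std::make_shared<Device>(serials_[index]);
        open_[index] = dev;
        return dev;
    }

private:
    std::vector<std::string> serials_;
    std::map<std::size_t, std::shared_ptr<Device>> open_;
};
```

Handing out shared pointers means that two parts of a program asking for the same index get the same device object, which is one possible way to support several Kinects side by side.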

The main part …

… of our framework is a basic application interface and application starter. Every task is encapsulated within its own application, and each application has its own visualization routine where needed. Visualization varies from task to task, so we decided not to create a general module for it. This is left to the programmer, but we have already created some small apps to start scanning in three dimensions, visualize data in real time and create models of real objects anywhere. Unfortunately they are still at a testing stage more than anything else, but as shown we are getting some nice results with them.
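The following sketch shows one way an application interface and starter of this kind could be laid out. All names (`Application`, `ApplicationStarter`, `CountingApp`) are hypothetical and chosen for illustration; the real interface may look different.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical application interface: each task implements its own processing
// and, if it needs one, its own visualization.
class Application {
public:
    virtual ~Application() = default;
    virtual void update() = 0;          // process one frame of data
    virtual void visualize() {}         // optional; default draws nothing
    virtual bool finished() const = 0;
};

// Hypothetical starter: looks an application up by name and runs it until done.
class ApplicationStarter {
public:
    using Creator = std::function<std::unique_ptr<Application>()>;

    void registerApp(const std::string& name, Creator creator) {
        creators_[name] = std::move(creator);
    }

    // Returns the number of frames run, or -1 if the name is unknown.
    int run(const std::string& name) const {
        auto it = creators_.find(name);
        if (it == creators_.end()) return -1;
        std::unique_ptr<Application> app = it->second();
        assert(app != nullptr);
        int frames = 0;
        while (!app->finished()) {
            app->update();
            app->visualize();
            ++frames;
        }
        return frames;
    }

private:
    std::map<std::string, Creator> creators_;
};

// Toy application that stops after a fixed number of frames.
class CountingApp : public Application {
public:
    explicit CountingApp(int limit) : limit_(limit) {}
    void update() override { ++count_; }
    bool finished() const override { return count_ >= limit_; }

private:
    int limit_;
    int count_ = 0;
};
```

Because `visualize()` has an empty default, an application without a view does not have to implement it, which matches the decision not to force a general visualization module on every task.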


The first app, namely “Scan”, is used to build a segmented, registered and color-matched point cloud of the object of interest. The region of interest is defined by our self-made scan tablets, which are surrounded by special markers. This is where ARToolKit comes in. We use this library to detect the markers within the Kinect’s RGB image. Together with the depth information, we are able to calculate the position and orientation of the camera’s view of the object. What you see is a split real-time 3D view showing what the Kinect currently sees and the processed point cloud.
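How a marker found in the RGB image becomes a 3D point can be sketched with the standard pinhole camera model. This is a generic illustration, not code from the framework: the `Intrinsics` struct and `backProject` function are hypothetical, and the sketch assumes that the depth value is metric and already registered to the RGB pixel.

```cpp
#include <cassert>

struct Vec3 { double x, y, z; };

// Pinhole intrinsics (focal lengths and principal point, in pixels).
// Real values would come from the per-device calibration.
struct Intrinsics { double fx, fy, cx, cy; };

// Back-projects pixel (u, v) with depth 'z' into camera coordinates.
Vec3 backProject(const Intrinsics& k, double u, double v, double z) {
    assert(k.fx > 0.0 && k.fy > 0.0);
    return {(u - k.cx) * z / k.fx, (v - k.cy) * z / k.fy, z};
}
```

Applied to the center pixel of each detected marker, this yields the marker positions in camera space, from which the camera's position and orientation relative to the tablet can be derived.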

At any time, or more precisely whenever at least three markers are detected, a snapshot of the current view can be added to the resulting point cloud, and finally the cloud can be saved to a file. We deliberately divided scanning and meshing into different tasks because automatically creating a model from a point cloud with hundreds of thousands of vertices is a resource-intensive process that is not compatible with real-time visualization.
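Three non-collinear points are the minimum needed to define a coordinate frame, which is one way to see why three markers are required. The sketch below builds such a frame from three marker centers and expresses each snapshot in it, so that snapshots taken from different viewpoints end up in the same tablet coordinate system. It is an illustration only: all names are hypothetical, it assumes the markers arrive in a consistent order (the same physical marker is always `markers[0]`), and it is not necessarily the registration method used in the app.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 { double x, y, z; };

Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
Vec3 normalize(Vec3 v) {
    double len = std::sqrt(dot(v, v));
    assert(len > 0.0);
    return {v.x / len, v.y / len, v.z / len};
}

// Orthonormal frame attached to the scan tablet, expressed in camera coordinates.
struct Frame { Vec3 origin, xAxis, yAxis, zAxis; };

// Builds the frame from three non-collinear marker centers (camera space).
Frame frameFromMarkers(Vec3 m0, Vec3 m1, Vec3 m2) {
    Vec3 x = normalize(sub(m1, m0));
    Vec3 z = normalize(cross(x, sub(m2, m0)));  // tablet normal
    Vec3 y = cross(z, x);                        // completes a right-handed frame
    return {m0, x, y, z};
}

// Expresses a camera-space point in tablet coordinates.
Vec3 toTablet(const Frame& f, Vec3 p) {
    Vec3 d = sub(p, f.origin);
    return {dot(d, f.xAxis), dot(d, f.yAxis), dot(d, f.zAxis)};
}

// Appends one snapshot to the accumulated cloud; refuses if fewer than
// three markers were detected.
bool addSnapshot(std::vector<Vec3>& cloud, const std::vector<Vec3>& markers,
                 const std::vector<Vec3>& snapshot) {
    if (markers.size() < 3) return false;
    Frame f = frameFromMarkers(markers[0], markers[1], markers[2]);
    for (Vec3 p : snapshot) cloud.push_back(toTablet(f, p));
    return true;
}
```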

Mesh reconstruction

This is what the next application worth mentioning does. After the colored point cloud is loaded, a preprocessing step removes duplicated vertices from overlapping snapshots. A voxel grid filter then reduces the vertex count further. After these reduction steps, a plane is approximated for each voxel simultaneously with the RANSAC algorithm (RAndom SAmple Consensus). Each vertex which fits this model takes the plane’s normal as its own normal. (This was our first idea, but the results were not satisfactory for every input. Further research and manual testing gave us a better idea, which also has to be part of further work.) The final mesh is built with a technique called Poisson Surface Reconstruction, which is described in its own section on our website.
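The two central steps, voxel grid filtering and RANSAC plane fitting, can be sketched in plain C++ as follows. This is a simplified illustration with hypothetical function names, not the code of our application: the voxel filter replaces the points of each cell by their centroid, and the RANSAC routine fits a single plane to whatever point set it is given (for example the points of one voxel), using a distance threshold and iteration count chosen by the caller.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <random>
#include <tuple>
#include <vector>

struct Vec3 { double x, y, z; };

Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Voxel grid filter: all points in one cubic cell of edge 'leaf' become their centroid.
std::vector<Vec3> voxelGridFilter(const std::vector<Vec3>& points, double leaf) {
    assert(leaf > 0.0);
    struct Cell { Vec3 sum; std::size_t count; };
    std::map<std::tuple<long, long, long>, Cell> cells;
    for (Vec3 p : points) {
        auto key = std::make_tuple(static_cast<long>(std::floor(p.x / leaf)),
                                   static_cast<long>(std::floor(p.y / leaf)),
                                   static_cast<long>(std::floor(p.z / leaf)));
        Cell& c = cells[key];  // zero-initialized on first access
        c.sum = {c.sum.x + p.x, c.sum.y + p.y, c.sum.z + p.z};
        ++c.count;
    }
    std::vector<Vec3> out;
    for (const auto& kv : cells) {
        const Cell& c = kv.second;
        out.push_back({c.sum.x / c.count, c.sum.y / c.count, c.sum.z / c.count});
    }
    return out;
}

// Points p on the plane satisfy dot(normal, p) + d == 0; 'normal' has unit length.
struct Plane { Vec3 normal; double d; };

// RANSAC: fit planes through random point triples, keep the one with most inliers.
Plane ransacPlane(const std::vector<Vec3>& pts, double threshold, int iterations,
                  unsigned seed) {
    assert(pts.size() >= 3);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, pts.size() - 1);
    Plane best{{0.0, 0.0, 1.0}, 0.0};
    std::size_t bestInliers = 0;
    for (int i = 0; i < iterations; ++i) {
        Vec3 a = pts[pick(rng)], b = pts[pick(rng)], c = pts[pick(rng)];
        Vec3 n = cross(sub(b, a), sub(c, a));
        double len = std::sqrt(dot(n, n));
        if (len < 1e-12) continue;  // repeated or collinear sample, try again
        n = {n.x / len, n.y / len, n.z / len};
        Plane cand{n, -dot(n, a)};
        std::size_t inliers = 0;
        for (Vec3 p : pts)
            if (std::fabs(dot(cand.normal, p) + cand.d) < threshold) ++inliers;
        if (inliers > bestInliers) { bestInliers = inliers; best = cand; }
    }
    return best;
}

// Every point that fits the plane takes the plane normal; others keep a zero normal.
std::vector<Vec3> normalsFromPlane(const std::vector<Vec3>& pts, const Plane& pl,
                                   double threshold) {
    std::vector<Vec3> normals(pts.size(), Vec3{0.0, 0.0, 0.0});
    for (std::size_t i = 0; i < pts.size(); ++i)
        if (std::fabs(dot(pl.normal, pts[i]) + pl.d) < threshold) normals[i] = pl.normal;
    return normals;
}
```

Note that the sign of the fitted normal is arbitrary here; a real pipeline would additionally orient the normals consistently, for example towards the camera, before handing them to a surface reconstruction step.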

To download our framework, please look here.