Deep learning for the automated interpretation of camera images and point clouds
Monitoring the condition of large structures such as traffic routes, buildings or agricultural land creates enormous quantities of 3D and image data, which currently tend to be analyzed manually. Fraunhofer IPM instead relies on automated data interpretation using a »deep learning« approach, which saves both time and cost. This method performs a semantic segmentation of the data, in which each pixel or 3D point is attributed to a specific object class.
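At its core, semantic segmentation assigns each pixel (or 3D point) the class with the highest predicted score. The sketch below illustrates only this final labeling step, assuming per-class score maps as a network would produce them; the class names and score values are made-up examples, not Fraunhofer IPM's actual classes.

```python
import numpy as np

# Hypothetical class list for illustration only.
CLASSES = ["asphalt", "kerb", "vegetation"]

def segment(score_maps):
    """Assign each pixel the class with the highest score.

    score_maps: array of shape (n_classes, height, width) holding
    per-class scores, e.g. the output of a segmentation network.
    Returns a (height, width) array of class indices.
    """
    return np.argmax(score_maps, axis=0)

# Toy example: a 2x2 image with three class score maps.
scores = np.array([
    [[0.9, 0.1], [0.2, 0.3]],   # asphalt
    [[0.05, 0.8], [0.1, 0.1]],  # kerb
    [[0.05, 0.1], [0.7, 0.6]],  # vegetation
])
labels = segment(scores)  # class index per pixel
```

The same argmax-per-element idea carries over to point clouds, where the scores are predicted per 3D point instead of per pixel.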
»Deep learning« is a machine learning method and thus a subfield of artificial intelligence. Training data sets are used to teach the system to identify objects in an image, for example prototype objects such as traffic signs. Deep learning is based on artificial neural networks (ANNs) and has been shown to outperform traditional methods of object recognition.
ANNs learn the output patterns that correspond to specific input patterns with the help of manually annotated training data. On the basis of this »experience«, new types of input data can then be analyzed in real time. ANNs have proven to be very robust when confronted with variations in characteristic colors, edges and shapes.
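This supervised learning principle can be reduced to a minimal sketch: a single-neuron »network« (logistic regression) fitted by gradient descent to hand-labeled feature vectors, then applied to an unseen input. The features, labels and learning rate below are invented for illustration; real segmentation networks are far deeper but follow the same train-then-infer pattern.

```python
import numpy as np

# Manually annotated training data (made up): 2D features and class labels.
X = np.array([[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]])
y = np.array([0, 0, 1, 1])

w = np.zeros(2)   # weights
b = 0.0           # bias
for _ in range(500):                     # gradient-descent training loop
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid activation
    grad_w = X.T @ (p - y) / len(y)          # cross-entropy gradient
    grad_b = np.mean(p - y)
    w -= 1.0 * grad_w                        # learning rate 1.0
    b -= 1.0 * grad_b

# After training, a previously unseen input can be classified:
pred = 1.0 / (1.0 + np.exp(-(np.array([0.85, 0.85]) @ w + b)))
```

The learned weights encode the »experience« mentioned above: inference on new data is just a forward pass, which is why it can run in real time.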
2D camera data, 3D scanner data (point clouds), or merged scanner and camera data form a suitable data basis for automated object recognition. A large number of object classes (e.g. buildings, tree trunks, kerbs, rails) and surface classes (e.g. asphalt, concrete, paving) can be recognized reliably and in real time.
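Merging scanner and camera data typically means projecting each 3D point into the camera image and attaching the pixel value found there. A minimal sketch of this fusion step, assuming a simple pinhole camera model with invented intrinsic parameters (fx, fy, cx, cy):

```python
import numpy as np

fx, fy, cx, cy = 100.0, 100.0, 2.0, 2.0   # assumed camera intrinsics

def colorize(points, image):
    """points: (N, 3) XYZ in the camera frame; image: (H, W, 3) RGB.

    Returns (x, y, z, r, g, b) tuples for points that project into the image.
    """
    colored = []
    for x, y, z in points:
        u = int(round(fx * x / z + cx))   # pinhole projection to pixel column
        v = int(round(fy * y / z + cy))   # ... and pixel row
        if 0 <= v < image.shape[0] and 0 <= u < image.shape[1]:
            colored.append((x, y, z, *image[v, u]))
    return colored

# Toy data: a 5x5 image with one red pixel, one point on the optical axis.
image = np.zeros((5, 5, 3), dtype=np.uint8)
image[2, 2] = (255, 0, 0)
points = np.array([[0.0, 0.0, 10.0]])
fused = colorize(points, image)
```

In practice this also requires the extrinsic calibration between scanner and camera; the sketch assumes the points are already expressed in the camera frame.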
Depending on the data basis, generating annotated training data is a very time-consuming and largely manual process. Fraunhofer IPM has developed various tools and strategies to make this process more efficient.
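One common strategy for reducing annotation effort (sketched here as a generic approach, not Fraunhofer IPM's specific tooling) is pre-labeling: a preliminary model labels the data automatically, and only low-confidence predictions are routed to a human annotator for correction.

```python
# Split model predictions into auto-accepted labels and items that
# still need manual review, based on a confidence threshold.

def split_by_confidence(predictions, threshold=0.9):
    """predictions: list of (item_id, label, confidence) triples."""
    auto, manual = [], []
    for item_id, label, conf in predictions:
        (auto if conf >= threshold else manual).append((item_id, label))
    return auto, manual

# Invented example predictions from a hypothetical pre-labeling model.
preds = [(1, "asphalt", 0.97), (2, "kerb", 0.55), (3, "rail", 0.92)]
auto, manual = split_by_confidence(preds)
```

With a reasonable threshold, the bulk of the data is labeled automatically and human effort concentrates on the ambiguous remainder.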