Depth Technology is becoming well established in Human Machine Interaction (HMI), where it makes the interaction between humans and machines more intuitive through real-time gesture detection. It is also proving itself in the more traditional fields of industrial automation, 3D-sensing machines, and inspection tasks. 3D cameras with integrated Depth Technology are now available off-the-shelf, such as Intel® RealSense Technology. This 3D Depth Technology is easy to handle and affordable, and can accelerate a wide range of vision applications while taking them to the next level.
3D Positioning
Whether it is general robotic guidance, pick & place applications, or interactive gaming, machines in our 3D world need to know the location of objects: obstacles to avoid and articles to interact with. Precise 3D data is key to controlling, measuring, and interacting correctly with these items. A conventional camera transmitting 2D information is restricted to analyzing the form, size, and proportions of these bodies; the distances the machine estimates from a 2D camera are, in contrast, only approximate. These distances are inferred by comparing predefined 2D models to what the camera currently views, so the distance value will likely contain considerable error when previously unknown artifacts come into view. For example, to control a robot moving down a corridor, 2D information from a robot-mounted camera might be sufficient to stop the robot from crashing into a wall or a chair, and to detect the presence of empty space (via recognition of floor texture). A 3D camera, however, would also provide the location of new items in the field of view, which might include new carpet surfaces. Using 3D information directly for 3D navigation reduces processing resources, hardware, and time. In addition, it removes the approximations and assumptions inherent in 3D location information that has been extracted from a 2D image source.
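To illustrate how directly depth data maps onto navigation, the short Python sketch below reads a single depth frame from an Intel RealSense camera via the pyrealsense2 library and deprojects one pixel into a 3D point in camera coordinates. The stream settings and the chosen pixel are assumptions made purely for illustration; a real navigation stack would process the whole frame, not one pixel.

```python
import pyrealsense2 as rs

# Minimal sketch: grab one depth frame and turn a pixel of interest into an
# (x, y, z) point in metres, relative to the camera.
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)  # assumed mode
pipeline.start(config)

try:
    frames = pipeline.wait_for_frames()
    depth_frame = frames.get_depth_frame()

    u, v = 320, 240  # example pixel: the image centre
    distance_m = depth_frame.get_distance(u, v)  # range along that ray, in metres

    # The camera intrinsics let us deproject (pixel, depth) into a 3D point.
    intrinsics = depth_frame.profile.as_video_stream_profile().intrinsics
    x, y, z = rs.rs2_deproject_pixel_to_point(intrinsics, [u, v], distance_m)
    print(f"Obstacle at x={x:.2f} m, y={y:.2f} m, z={z:.2f} m from the camera")
finally:
    pipeline.stop()
```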
The precision and latency of the acquired 3D information are directly relevant to safety, especially in human-machine collaboration and factory automation. By detecting the movement and distance of other robots and of humans, a machine can avoid collisions and unsafe activities, enabling safe and efficient sharing of the work space. Appropriate 3D data provides exact answers to “what, where, and what size”, and makes decision making easier for the machine, without assumptions and without time-consuming processing. 3D information makes every application in need of positioning safer and faster.
How it Works – 2D versus 3D
Remember the movie “Minority Report[1]” and the billboard scene in which our hero, Chief John Anderton (played by Tom Cruise), walks by and the billboards change to target ads specifically at him? If there is a picture of a person on a billboard, a 2D camera recognizes it as a real person; only a 3D camera can determine that it is just a flat projection. Why is this so? A regular 2D vision camera creates a projection of a 3D object into 2D space, and the data must be interpreted to undo this conversion. Humans intuitively compensate for the fact that an object further away appears smaller than it really is. A 2D vision program is not capable of this; it must work with estimations and approximations learned via complicated algorithms to arrive at approximate 3D coordinates for any recognized object. For the vision engineer, this means writing software not only for navigating a 3D environment, but also specific software for both object recognition and distance estimation; a potentially unlimited task. With true 3D data, the application is no longer affected by the scaling variations that distance introduces. In addition, 3D data reduces the complexity of the application: because the depth data is calibrated and combined with the 2D data, scaling is automatic, and object recognition can be built with less approximation and at a known scale. A good example is a people-counting application: the engineer simply programs the system with the nominal shape and size of a person and the task is complete. Such a system could easily distinguish between a child’s doll and a real person.
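The scale ambiguity described above reduces to one line of pinhole-camera arithmetic once depth is known. The sketch below is a minimal illustration, assuming example focal lengths (FX, FY) rather than real calibration values: the same pixel-space bounding box corresponds to very different physical sizes depending on its depth, which is how the system tells a doll from a person.

```python
# Pinhole model: a bounding box of w_px pixels at depth Z metres spans
# W = w_px * Z / fx metres, where fx is the focal length in pixels.

FX, FY = 615.0, 615.0  # assumed intrinsics; read the real ones from calibration

def real_size_m(w_px: float, h_px: float, depth_m: float) -> tuple[float, float]:
    """Convert a pixel-space bounding box plus depth into metric width/height."""
    return w_px * depth_m / FX, h_px * depth_m / FY

# The same 120 x 300 px detection at two distances:
print(real_size_m(120, 300, 1.0))   # ~(0.20 m, 0.49 m): doll-sized at 1 m
print(real_size_m(120, 300, 3.5))   # ~(0.68 m, 1.71 m): person-sized at 3.5 m
```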
Quality Inspection
2D imaging with one camera merely captures the surface of an object as it would be projected onto a flat plane. If the camera is viewing a tire or a PCB, it detects patterns and characters based on lines and shadows. For a precise inspection, however, 3D data can provide the depth of the tire tread or the elevation of components on the PCB. Additionally, multiple cameras working in tandem can increase the accuracy of the depth measurement, or capture all sides of an object by viewing it from different perspectives. Merging the 3D data with the 2D data from the different cameras allows for a complete inspection to ensure quality standards are met.
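A minimal sketch of such a depth-based inspection measurement follows. The NumPy code, the depth map, and the regions of interest are hypothetical, assuming a top-down view of a PCB; a real system would also fit and compensate for tilt of the board (or tread) reference plane.

```python
import numpy as np

# Component elevation = depth of the bare board surface minus the depth
# measured over the part (closer to the camera means taller).

def component_height_mm(depth_map: np.ndarray,
                        board_roi: tuple[slice, slice],
                        part_roi: tuple[slice, slice]) -> float:
    """Estimate how far a component rises above the board surface."""
    board_z = np.median(depth_map[board_roi])  # robust estimate of bare-board depth
    part_z = np.median(depth_map[part_roi])    # depth measured over the component
    return (board_z - part_z) * 1000.0

# Hypothetical usage with synthetic data: a 2 mm tall part on a board 0.30 m away.
depth_map = np.full((480, 640), 0.300)
depth_map[200:240, 300:360] = 0.298
roi_board = (slice(0, 100), slice(0, 100))
roi_part = (slice(200, 240), slice(300, 360))
print(f"Component height: {component_height_mm(depth_map, roi_board, roi_part):.1f} mm")
```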
Consider, for example, a traffic system that levies tolls based on vehicle height. With 2D cameras, the application must either instruct the driver to “please park here” to obtain accurate data, or work with the aforementioned estimations combined with very complicated algorithms. With 3D data, by contrast, the car can be recognized and measured in motion, without any restriction on distance or position.
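For illustration, such a height check reduces to a few lines once an overhead depth camera is available. The mounting height, the synthetic depth frame, and the crude vehicle mask below are assumptions made purely for this sketch, not a production tolling algorithm.

```python
import numpy as np

MOUNT_HEIGHT_M = 6.0  # assumed camera mounting height above the road surface

def vehicle_height_m(depth_map: np.ndarray, vehicle_mask: np.ndarray) -> float:
    """Vehicle height = mount height minus the smallest depth (i.e. the roof)
    measured inside the detected vehicle region; a low percentile rejects noise."""
    roof_depth = np.percentile(depth_map[vehicle_mask], 5)
    return MOUNT_HEIGHT_M - roof_depth

# Synthetic frame: empty road reads ~6 m everywhere, a car roof reads 4.5 m.
depth_map = np.full((480, 640), 6.0)
depth_map[180:300, 220:420] = 4.5
vehicle_mask = depth_map < 5.8  # crude detection: anything rising above the road
print(f"Vehicle height: {vehicle_height_m(depth_map, vehicle_mask):.2f} m")  # ~1.50 m
```

The measurement holds regardless of where the vehicle sits in the lane or how fast it moves, which is exactly the restriction the 2D approach cannot remove.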
An engineer knows many ways to solve a problem, and nearly every problem can be solved with 2D scenarios and work-arounds. Nevertheless, 3D solutions can reduce software complexity significantly, while providing a flexible solution with increased precision. Machine Vision applications, especially those that require exact positioning, can dispense with a large portion of the costs related to precisely positioning the object relative to the camera. 3D data permits an object to be measured without the errors created by perspective or texture.
The Advantages of 3D Cameras
In addition to these direct benefits, 3D cameras enable the augmentation of 2D systems. For example, on top of a typical 2D surveillance system, 3D data allows for accurate height measurement of “Persons of Interest”. It is also possible to use 3D data for ‘abandoned luggage’ detection and validation in areas of high security sensitivity, such as an airport; here, a 2D solution would be unable to distinguish between a brown duffle bag and a spilled cup of coffee. 3D data can support and improve 2D applications by answering the need for location, volumetric, and navigational information. Compared with purely 2D systems, 3D data saves considerable time and reduces both software complexity and costs. 3D acquisition hardware is increasingly accessible to new users, and established Machine Vision users can look again at legacy projects and see new potential in 3D data solutions.
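One hedged sketch of the abandoned-luggage idea: compare the live depth image against a background depth model of the empty scene, so that only items with real height trigger an alert; a spilled coffee is essentially flat, while a duffle bag occupies real volume. The threshold and synthetic frames below are illustrative assumptions, not a production detector.

```python
import numpy as np

MIN_HEIGHT_M = 0.15  # assumed threshold: ignore anything flatter than 15 cm

def new_object_mask(background_depth: np.ndarray,
                    live_depth: np.ndarray) -> np.ndarray:
    """Pixels significantly closer to the camera than the learned background,
    i.e. something with real height has appeared in the scene."""
    return (background_depth - live_depth) > MIN_HEIGHT_M

# Hypothetical frames: a 40 cm tall bag appears in an otherwise unchanged hall.
background = np.full((480, 640), 5.0)
live = background.copy()
live[300:380, 200:280] = 4.6  # bag surface sits 0.4 m above the floor along the ray
mask = new_object_mask(background, live)
print(f"Candidate object pixels: {mask.sum()}")  # non-zero only for volumetric items
```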
[1] Copyright of Amblin Entertainment and Cruise/Wagner Productions