What Is Augmented Reality? – The Complete Guide
01/07/2018
Augmented reality is mostly associated with gaming, since it became widely known after the release of Pokémon GO. Millions of gamers first experienced the immersive technology while trying to catch Pokémon with their mobile devices. However, augmented reality is not just about gaming.
Augmented reality enhances our physical environment by superimposing computer-generated information on it through special hardware. This additional information can include text, animation, images, sound, or video.
How augmented reality works
While using AR apps, users see only the final result: computer-generated objects added to the real-world environment on a display. However, augmented reality software goes through four major phases to superimpose the real world with additional content:
- Capturing the environment – using input devices, an AR system captures a physical world image.
- Image processing – AR software processes the image to determine where to add computer-generated content.
- Requesting the necessary content – once the augmented reality system determines where to superimpose an additional content layer, it requests the content to be added.
- Superimposition – as soon as the system retrieves the necessary content, it forms the final image consisting of the real-world environment image and additional content layer.
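The four phases above can be sketched as a minimal pipeline. All function names and data in this example are hypothetical placeholders chosen to mirror the steps; a real AR system would use camera APIs, computer-vision libraries, and a rendering engine.

```python
# A minimal sketch of the four-phase AR pipeline described above.
# All names and values are illustrative stand-ins.

def capture_environment():
    # Phase 1: grab a frame from an input device (stubbed here).
    return {"frame": "raw_camera_image"}

def process_image(frame):
    # Phase 2: analyze the frame to find where content should go.
    return {"anchor": (120, 80)}  # e.g. pixel coordinates of a detected surface

def request_content(anchor):
    # Phase 3: fetch the text/image/3D model to overlay at that anchor.
    return {"content": "3d_model", "position": anchor}

def superimpose(frame, overlay):
    # Phase 4: compose the final image shown to the user.
    return {"frame": frame["frame"], "overlay": overlay}

frame = capture_environment()
anchor = process_image(frame)["anchor"]
overlay = request_content(anchor)
final_image = superimpose(frame, overlay)
print(final_image["overlay"]["position"])  # (120, 80)
```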
Along with overlaying real environments with additional content, augmented reality provides a set of features that can be easily transformed into significant benefits.
- Collecting data using different sensors. AR systems can collect various types of data about the real environment, including video (through depth or wearable cameras), audio (through a microphone), location (through GPS or GSM triangulation), motion (through cameras), or wireless signals (through WiFi or Bluetooth).
- Real-time data processing. Augmented reality systems collect data from sensors and then process it in real-time. In addition, some collected data can be stored for further analysis or sharing.
- Overlaying the typical world perception. Unlike virtual reality, AR doesn’t offer a totally different environment. Instead, it expands the existing surrounding with useful information through a display, haptic feedback (for example, air pulses or vibration), or speakers.
- Providing contextual information. AR provides users with information that relates to the environment they’re in. This information may either enhance or explain what users see. For example, using its augmented reality feature, the Google Translate mobile app explains the meaning of a particular word when users point their smartphone camera at text printed on a T-shirt.
- Recognizing and tracking physical objects. AR systems can be built to recognize specific real-world objects and track them as long as they remain in the user’s field of view. This is exactly how face recognition systems work.
- Availability through different types of devices. Augmented reality can be implemented on different types of hardware, including mobile and wearable devices.
Augmented reality is mostly used through the iOS and Android mobile platforms because of their popularity and wide user coverage. Features vary from solution to solution, but all AR mobile apps are based on the technology’s core capability to overlay the physical environment with additional content. However, AR applications may rely on different approaches to provide contextual information.
With AR, mobile devices can provide users with information about any object within the camera’s field of view using GPS. This expands the capabilities of modern navigation systems and contextualizes information about surroundings. By pointing a smartphone camera at real objects like shop signs, users can retrieve useful information using a location-based AR app. This information can include working hours, a product list, contact details, etc.
Location-based augmented reality mobile apps rely on a set of smartphone built-in features to augment the environment captured by a device camera. These features include:
- A GPS that helps an AR app accurately determine the user’s location.
- A solid-state compass (also known as a digital compass or magnetometer) that helps the app determine which direction the device is pointing when the user isn’t moving.
- An accelerometer that enables the augmented reality solution to determine changes in speed and orientation when users are moving.
- A gyroscope that supports the accelerometer to accurately track the device position within the environment.
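As a rough illustration of how these sensors work together, the sketch below uses GPS coordinates and a compass heading to decide whether a point of interest falls within the camera's field of view. The coordinates, heading, and field-of-view angle are invented for the example.

```python
import math

# Sketch: deciding whether a point of interest (POI) is inside the camera's
# horizontal field of view, using GPS coordinates and the compass heading.

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return math.degrees(math.atan2(y, x)) % 360

def in_field_of_view(user_lat, user_lon, heading, poi_lat, poi_lon, fov=60.0):
    """True if the POI lies within the camera's horizontal field of view."""
    b = bearing_deg(user_lat, user_lon, poi_lat, poi_lon)
    diff = (b - heading + 180) % 360 - 180  # signed angular difference
    return abs(diff) <= fov / 2

# User near a shop, phone pointing roughly north-east (heading 45 degrees).
print(in_field_of_view(48.8584, 2.2945, 45.0, 48.8600, 2.2960))  # True
```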
It was hard to imagine all these features in mobile devices a few years ago. However, they’re now an inherent part of nearly every smartphone or tablet and make the superimposition possible for AR apps. For example, Sky Guide AR, a location-based augmented reality app, can turn any user into an astronomer who can accurately recognize constellations in the sky by pointing a mobile device camera at stars in the clear sky.
Augmented reality apps aimed at recognizing physical objects rely on understanding the environment visually instead of the geolocation of particular objects. To determine where to attach computer-generated content, recognition-based AR relies on either markers or a markerless approach.
Marker-based augmented reality apps use special markers to recognize where to add digital content. These markers can take the form of small printed images, 2D barcodes, or QR codes. Using a mobile device camera, an AR app scans a marker and then overlays it with predefined information. For example, in 2017, Shazam added a new feature that allows brands to connect with their target audience through augmented reality. By scanning special Shazam codes, users can reveal 3D animations, 360-degree video, mini-games, or product visualizations.
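The marker-based flow can be reduced to a lookup: a scanned marker identifier is matched against a table of predefined content. The marker IDs and assets below are invented for illustration.

```python
# Sketch of the marker-based flow: a scanned marker ID is looked up in a
# table of predefined content. All IDs and asset names are invented.

MARKER_CONTENT = {
    "marker_001": {"type": "3d_animation", "asset": "dancing_robot.glb"},
    "marker_002": {"type": "360_video", "asset": "factory_tour.mp4"},
}

def handle_scanned_marker(marker_id):
    content = MARKER_CONTENT.get(marker_id)
    if content is None:
        return "no content registered for this marker"
    return f"overlaying {content['type']}: {content['asset']}"

print(handle_scanned_marker("marker_001"))
# overlaying 3d_animation: dancing_robot.glb
```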
SLAM (markerless AR)
GPS is useless for accurately locating indoor objects. To properly place a 3D object, an AR app has to measure the distance to the corresponding physical object and then attach the necessary digital content, such as text, video, or animation. Furthermore, an augmented reality mobile app has to take into account the orientation of the mobile device to properly place a virtual object.
It’s hard to achieve a high level of accuracy with a smartphone’s built-in sensors alone. Doing so also requires advanced algorithms that can efficiently process all the data retrieved from those sensors. This is where simultaneous localization and mapping (SLAM) technology comes in useful. SLAM digitizes the environment captured by a smartphone camera and creates maps to accurately place predefined content. This technology “sees” the environment as a set of points and overlays virtual objects using complex algorithms and data retrieved from sensors.
SLAM is a markerless AR technology based on direct object recognition. The main advantage of SLAM is that it ensures a high level of object placement accuracy even when the user is moving with a mobile device in hand.
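Once SLAM has estimated the camera's pose, a virtual object anchored at a 3D world point can be drawn by projecting that point onto the screen. The sketch below uses a simplified pinhole camera model with no rotation; all numbers are illustrative.

```python
# Sketch: projecting a 3D anchor point onto the screen with a pinhole
# camera model, as a rendering step after SLAM has estimated the pose.
# The camera looks down the +z axis with no rotation, for simplicity.

def project(point_world, cam_pos, focal_px, cx, cy):
    """Project a world point to pixel coordinates (cx, cy = image center)."""
    x = point_world[0] - cam_pos[0]
    y = point_world[1] - cam_pos[1]
    z = point_world[2] - cam_pos[2]
    if z <= 0:
        return None  # point is behind the camera
    u = focal_px * x / z + cx
    v = focal_px * y / z + cy
    return (round(u), round(v))

# Anchor 2 m in front of the camera, 0.5 m to the right.
print(project((0.5, 0.0, 2.0), (0.0, 0.0, 0.0), 800, 640, 360))  # (840, 360)
```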
Markerless indirect recognition
AR apps may use markerless indirect recognition to provide users with contextual information. This approach implies comparing a digital footprint with content stored in a database. This is how Google Translate’s AR feature works. However, to recognize text in the image, the app uses deep learning, a type of artificial intelligence.
Another Google product that uses markerless indirect recognition is Google Goggles. Once the app captures an image using a camera, the system compares it with a database of indexed information about a set of predefined images of objects, sights, logos, etc. To get information about a certain sight, users can simply point their mobile device camera at it. In addition, the app uses GPS and compass data to provide relevant search results.
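A toy version of this comparison step might fingerprint the captured image and match it against fingerprints stored in a database. The average-hash below is a deliberately crude stand-in for the much richer features real systems use, and the database entries are invented.

```python
# Sketch of "indirect recognition": a compact fingerprint of the captured
# image is compared against fingerprints stored in a database. The
# average-hash here is a toy; real systems use far richer features.

def average_hash(pixels):
    """pixels: flat list of grayscale values. Returns a bit string."""
    avg = sum(pixels) / len(pixels)
    return "".join("1" if p > avg else "0" for p in pixels)

def hamming(a, b):
    """Number of differing bits between two equal-length bit strings."""
    return sum(c1 != c2 for c1, c2 in zip(a, b))

database = {
    "eiffel_tower": average_hash([10, 200, 10, 200, 15, 190, 12, 205]),
    "coffee_cup":   average_hash([90, 95, 100, 92, 98, 91, 97, 94]),
}

captured = average_hash([12, 198, 11, 201, 14, 195, 13, 199])
best = min(database, key=lambda name: hamming(database[name], captured))
print(best)  # eiffel_tower
```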
AR based on the projection approach happens in the physical environment rather than on a display of a mobile device. Projection-based AR systems project artificial light onto physical surfaces to enable users to interact with the computer-generated content by comparing the expected projection with the projection altered by a user.
Projection-based AR falls into the category of spatial immersive technologies. Such augmented reality systems project photographic or computer-generated images using one or more projectors. These images can be either generated in real-time or pre-rendered. AR projectors can align digital content with real-world objects or register it to align with features of these physical objects. If accurately registered, those rendered images can even replicate the color and features of objects to form a unique visual effect, which can provide magnificent high dynamic range (HDR) outcomes.
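That comparison of expected versus altered projection can be illustrated with a toy example: cells where the observed projection differs from the expected one are treated as touched by the user. The grids here are invented stand-ins for real camera frames.

```python
# Sketch: projection-based AR detects interaction by comparing the projection
# it expects to see with what the camera actually observes; cells where the
# two differ are likely occluded by the user's hand. Values are illustrative.

EXPECTED = [
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 1],
]
OBSERVED = [
    [1, 1, 1],
    [1, 0, 1],  # the user's hand blocks the center of the projection
    [1, 1, 1],
]

def touched_cells(expected, observed):
    """Return (row, col) positions where the observation differs."""
    return [(r, c)
            for r, row in enumerate(expected)
            for c, val in enumerate(row)
            if observed[r][c] != val]

print(touched_cells(EXPECTED, OBSERVED))  # [(1, 1)]
```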
AR apps based on the superimposition approach either partially or fully replace real-world objects on a mobile device display with digital images, a 3D object, or animation. These solutions can also enhance physical objects with some virtual features. For superimposition-based AR, object recognition plays a vital role in accurate computer-generated content overlaying.
To properly place digital information, superimposition-based AR apps have to quickly recognize the necessary objects. One example of such an app is IKEA Place. This augmented reality app enables users to virtually place furnishings in their houses or apartments by simply pointing their mobile device camera at the desired spots in their homes.
Developers can use various programming languages typically utilized for backend software development. However, according to the 2017 State of the Developer Nation report, C# and C/C++ are the most popular programming languages for AR app development.
To build augmented reality apps, developers used to utilize third-party software development kits (SDKs) or write the code from scratch, which was an extremely time-consuming task. Android developers had another option: Google Tango, a high-end AR app development platform for specific devices like the Lenovo Phab 2 Pro and Asus ZenFone AR. Google no longer supports the platform because of its low scalability.
When Google and Apple released their official SDKs, augmented reality app development became significantly simpler. ARCore and ARKit offer complete toolsets that dramatically cut development time by enabling developers to skip writing code that is common to most AR apps.
ARCore is Google’s SDK for AR mobile app development. It supports mobile devices based on Android 7.0 Nougat and higher. The augmented reality app development platform has the following features:
- Motion Tracking. The SDK determines the mobile device orientation using the motion tracking feature to properly align computer-generated content to physical objects.
- Surface detection. ARCore can detect horizontal surfaces to properly place digital content.
- Light analysis. The platform can analyze light conditions to properly adjust reflections and shadows. This enables accurate virtual content displaying in real-time.
- Interactivity. The SDK enables users to interact with virtual objects using its pick correlation feature, which maps screen touches to objects in the scene by tracing light rays.
- Determining anchors. Google’s platform determines anchors to accurately track the position of real-world objects and properly place virtual objects within a physical environment.
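Conceptually, placing an anchor on a detected surface is a ray-plane intersection: a ray cast from the user's tap is intersected with the detected plane. The sketch below is not the ARCore API, just an illustration of the idea with a horizontal plane at floor height.

```python
# Conceptual sketch (not the ARCore API): placing an anchor where a ray cast
# from the user's tap intersects a detected horizontal plane at height y = 0.

def hit_test(ray_origin, ray_dir, plane_y=0.0):
    """Intersect a ray with the horizontal plane y = plane_y.
    Returns the 3D hit point, or None if the ray never reaches the plane."""
    oy, dy = ray_origin[1], ray_dir[1]
    if dy == 0:
        return None  # ray parallel to the plane
    t = (plane_y - oy) / dy
    if t <= 0:
        return None  # plane is behind the ray origin
    return tuple(o + t * d for o, d in zip(ray_origin, ray_dir))

# Camera 1.5 m above the floor, ray pointing forward and downward.
anchor = hit_test((0.0, 1.5, 0.0), (0.0, -1.0, 2.0))
print(anchor)  # (0.0, 0.0, 3.0)
```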
ARKit is Apple’s SDK for AR mobile app development. It supports mobile devices based on iOS 11 and higher. The augmented reality app development platform has the following features:
- Positional tracking. Apple’s SDK uses a mobile device’s built-in visual-inertial odometry to analyze data retrieved from the motion sensors and camera in order to accurately track the position of an iOS-based mobile device.
- Environment understanding. The environment understanding feature allows ARKit to quickly determine various types of surfaces, such as floors, furniture, and walls, as well as calculate their dimensions.
- Rendering. Advanced rendering capabilities allow ARKit to properly place 3D objects and accurately align them to the real-world environment. The integration with different software development frameworks and engines enables the SDK to render 3D animations on mobile device displays.
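The idea behind visual-inertial tracking can be illustrated with a complementary filter, which blends drift-prone but smooth gyroscope readings with noisy but drift-free accelerometer readings. This is not the ARKit API, just a conceptual sketch with made-up sample values.

```python
# Conceptual sketch (not the ARKit API): fusing gyroscope and accelerometer
# readings with a complementary filter to track the device's pitch angle.
# The gyroscope is accurate short-term but drifts; the accelerometer is
# noisy but drift-free; blending the two keeps the estimate stable.

def fuse_pitch(pitch_deg, gyro_rate_dps, accel_pitch_deg, dt, alpha=0.98):
    """One filter step: integrate the gyro rate, then blend with the
    accelerometer's absolute pitch estimate."""
    gyro_estimate = pitch_deg + gyro_rate_dps * dt
    return alpha * gyro_estimate + (1 - alpha) * accel_pitch_deg

pitch = 0.0
# Simulated samples at 100 Hz: gyro reports 5 deg/s of rotation while the
# accelerometer keeps estimating an absolute pitch of 10 degrees.
for _ in range(200):
    pitch = fuse_pitch(pitch, gyro_rate_dps=5.0, accel_pitch_deg=10.0, dt=0.01)

print(round(pitch, 1))
```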
To make overlaying real environments with computer-generated content possible, hardware has to include the following components:
- sensors to collect necessary data about a physical environment,
- a processor to process data retrieved from sensors,
- a display to show users the augmented surrounding,
- and input devices to enable users to interact with virtual objects.
Any device that supports augmented reality is a specific configuration of all these components.
Smartphones and tablets
Today’s tablets and smartphones contain the above-mentioned components. Most mobile devices have a camera and a set of microelectromechanical systems (MEMS) sensors that include GPS, a digital compass (magnetometer), an accelerometer, and a gyroscope. As of 2018, the industry standard is high-frequency central processing units and high-resolution touchscreens that double as input devices. These components make mobile devices suitable hardware for AR apps.
Head-up displays
For displaying visual content, specific augmented reality systems can use head-up displays (HUDs), which look like transparent displays. They provide users with computer-generated information within their field of view. Head-up displays are projection-based AR systems that typically consist of a video processor, which generates visual information; a projector, which projects this visual content onto a transparent display; and a combiner, which captures the light transmitted by the projector.
HUDs are widely used in modern cars for transmitting important information on windshields that play a role of transparent displays. Thus, with a color HUD, 2018 Toyota Camry drivers can see real-time data that includes speed, navigation, shift position, outside temperature, rotations per minute (RPM), compass, audio and phone settings.
Smart glasses
Some projection-based AR systems use smart glasses to overlay what users see with visual content. This wearable device enables users to see their real-world surroundings just as they would through traditional glasses. One of the best-known examples is Google Glass, which users can control with voice commands. The device has a built-in camera, touchpad, and LED display. With Google Glass, users have instant access to different types of useful information such as calendar events, weather, video, images, etc.
Smart lenses
Just as people with vision problems may wear either prescription glasses or contact lenses, augmented reality is also available through both smart glasses and smart lenses. The latter is an improved version of smart glasses. Smart lenses transmit visual content directly into the wearer’s eyes. For instance, in 2016, Samsung patented smart lenses with a camera and a set of sensors. Users can control this device with a wink.
Virtual retinal displays
AR developers combined ideas from smart glasses and smart lenses, which resulted in the virtual retinal display (VRD). This device looks like typical virtual reality glasses, but a VRD projects visual content directly into the user’s eye in a similar way to smart lenses. Virtual retinal displays have a significant advantage over smart glasses.
With a VRD, users can relax their eyes and view the augmented real-world surroundings naturally, without needing to focus on computer-generated content that appears close to their eyes. However, virtual retinal displays are currently too heavy to become a popular AR device.
Many businesses have been greatly benefiting from the immersive technology. They have been experimenting with different augmented reality use cases. For more than 20 years since its first application, augmented reality has deeply penetrated various industries.
One of the first AR games was Agent V, released in 2006 on the Nokia 3230. Using the device’s camera viewer and built-in motion sensor, players had to move the smartphone to shoot virtual creatures that appeared on the display. However, those virtual objects were hardly adjusted to the real-world environment; the physical surroundings looked more like a background for the computer-generated content. Since then, augmented reality has greatly improved. Its algorithms have become more complex, graphics have become more detailed, and hardware has become more powerful.
It’s currently hard to find a smartphone owner who has never heard about Pokémon GO. Millions of users all over the world have played this game, in which virtual objects look like part of the physical environment. This is how AR mobile games bring the playing experience closer to reality.
Projection-based AR systems are widely used in new car models to transmit valuable driving data onto a vehicle’s windshield in order to let drivers focus on the road and monitor necessary information at the same time.
Intelligent parking assist systems use the immersive technology to help drivers safely park. Using a rear camera, these systems transmit the vehicle trajectory onto the real-world image on a display in accordance with the steering wheel position.
Augmented reality also has a great potential for employee training within the automotive industry. For example, Mitsubishi created its AR-based maintenance support technology that enables technicians to conduct mechanism inspection and enter results by voice while wearing smart glasses.
BMW uses the immersive technology for marketing purposes. Their AR app allows customers to see through their smartphone display how a new i3 or i8 would look in their yard or inside their garage. This helps clients visualize their future purchase better than ordinary car images can.
The aviation industry may use augmented reality as a maintenance assistant. When wearing smart glasses, mechanics can review repair or assembly checklists on the fly without picking up printed materials or even tablets. Thus, their hands remain free, which significantly facilitates the maintenance process.
Using a camera, intelligent landing assist systems can help pilots safely land their airplanes by generating the aircraft trajectory on a display. This is similar to existing digital car parking assistants.
Transportation companies can speed up their completeness check processes by using augmented reality. To verify whether the loading is complete, employees currently need to manually scan barcodes on each box. When wearing smart glasses, workers can automatically scan barcodes with this smart device.
The immersive technology can also significantly optimize the loading process. With smart glasses, employees can get instant information about a certain box, its place on the shelf, and the optimal route to it within a warehouse.
AR has great potential for medical training and surgery assistance. With the immersive technology, medical students can learn human anatomy in a more engaging and interactive way than with anatomical models alone. By combining anatomical models with augmented reality, medical students can review the inner structure of the body in detail in a 3D view mode.
Intervention planning mostly relies on medical analysis data and visualization techniques like X-rays. To conduct a surgery, doctors always keep the intervention plan in mind. Computer-guided surgery can significantly facilitate the intervention process by providing surgeons with real-time visual information about the patient’s health condition. For example, when wearing smart glasses, a surgeon can see what a patient’s injured bone looks like in 3D. This can greatly increase surgical accuracy and reduce the error rate.
AR stands for augmented reality. This immersive technology has turned from an entertainment feature into a powerful tool for marketing, employee training, and process optimization. It’s currently widely used in various industries, including aviation, automotive, healthcare, transportation, and many more. By overlaying our physical environment with contextual information and valuable visual content on a display, AR can provide significant benefits for both commercial and public organizations.