The maps are constructed in the Immersal Cloud Service. You must first submit a set of images and metadata to the service and then start the map construction job. Small jobs of a few dozen images complete in seconds while large jobs with hundreds of images can take tens of minutes.
The input for map construction consists of camera images and metadata. The metadata consists of:
camera intrinsics (pixel focal length and principal point offset)
camera pose information (relative image position and orientation in metric scale)
optional GPS coordinates
Depth (LiDAR) data from the latest iPhones is currently not used because of its limited range.
The input must be images; the Cloud Service cannot convert a 3D point cloud into a machine-readable map for Visual Positioning.
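To make the metadata above concrete, here is a minimal sketch of a per-image record. The class name, field names, and units are assumptions chosen for illustration; they are not the Cloud Service's actual schema. The `project` method shows what the intrinsics (pixel focal length and principal point offset) do in a standard pinhole camera model.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = Tuple[Vec3, Vec3, Vec3]


@dataclass
class CaptureMetadata:
    """Hypothetical per-image metadata record (all names are illustrative)."""
    fx: float                    # focal length in pixels, horizontal
    fy: float                    # focal length in pixels, vertical
    ox: float                    # principal point offset in pixels, horizontal
    oy: float                    # principal point offset in pixels, vertical
    position: Vec3               # camera position in metres, relative to the other images
    rotation: Mat3               # camera orientation as a 3x3 rotation matrix (rows)
    gps: Optional[Vec3] = None   # optional (latitude, longitude, altitude)

    def project(self, point_cam: Vec3) -> Tuple[float, float]:
        """Project a point in camera coordinates (z forward) to pixel coordinates."""
        x, y, z = point_cam
        if z <= 0:
            raise ValueError("point must be in front of the camera")
        return (self.fx * x / z + self.ox, self.fy * y / z + self.oy)


IDENTITY: Mat3 = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
meta = CaptureMetadata(fx=1000.0, fy=1000.0, ox=640.0, oy=360.0,
                       position=(0.0, 0.0, 0.0), rotation=IDENTITY)

# A point on the optical axis lands on the principal point.
print(meta.project((0.0, 0.0, 2.0)))
```

The GPS field defaults to `None` because the coordinates are optional.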
In very simplified terms, the map construction process:
Finds distinct visual features, such as high contrast areas and shapes, in the input images
Matches visual features from images from different viewpoints
Uses the parallax in matched features to compute a 3D structure
The 3D structure forms the base for the output map. Much more goes into making Visual Positioning work, but understanding the basics of map construction will help you learn how to map.
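The third step, using parallax to recover 3D structure, can be illustrated with a textbook midpoint triangulation: given two camera positions and the viewing rays toward the same matched feature, find the point closest to both rays. This is a simplified sketch and not the Cloud Service's actual algorithm; the function name and inputs are assumptions.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def triangulate_midpoint(p1: Vec3, d1: Vec3, p2: Vec3, d2: Vec3,
                         eps: float = 1e-9) -> Vec3:
    """Return the midpoint of the closest approach of rays p1 + s*d1 and p2 + t*d2."""
    w0 = tuple(a - b for a, b in zip(p1, p2))
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w0), _dot(d2, w0)
    denom = a * c - b * b  # approaches zero when the rays are parallel
    if abs(denom) <= eps * a * c:
        raise ValueError("rays are parallel: no parallax, depth is undefined")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    q1 = tuple(p + s * v for p, v in zip(p1, d1))  # closest point on ray 1
    q2 = tuple(p + t * v for p, v in zip(p2, d2))  # closest point on ray 2
    return tuple((u + v) / 2.0 for u, v in zip(q1, q2))


# Two cameras one metre apart, both looking at the same feature.
point = triangulate_midpoint((0.0, 0.0, 0.0), (0.5, 0.0, 2.0),
                             (1.0, 0.0, 0.0), (-0.5, 0.0, 2.0))
print(point)
```

The parallel-ray error is the interesting case: if both viewing rays point in the same direction, there is no parallax and no depth can be computed. This is why the images need to come from different viewpoints.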
You can use the REST API to submit images to the Cloud Service, but the easiest way to map is with our Immersal Mapper App.
Here are some tips to get you started:
Remember to move around the target object or in the target location. The map construction needs to have different viewpoints of the target area
Don't be afraid to capture plenty of images, even for small locations. Any single visual feature should be seen from at least three different viewpoints
To make sure you capture the visual features from multiple viewpoints, you should try to capture images that overlap with each other
A 30-50% overlap between two images is a good rule of thumb
You can capture both portrait and landscape images. Just try to have as much useful visual information in the images as possible. The sky, for example, is not very helpful in the map construction process
For a great map, think about what your users' cameras will see when using the finished AR application. For best results, the map should include similar viewpoints
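The overlap rule of thumb can be turned into a rough capture spacing. The sketch below assumes a simplified setup that is not part of the original guidance: the camera faces a flat surface head-on and moves sideways parallel to it, and the horizontal field of view is known. All function names are hypothetical.

```python
import math


def footprint_width(distance_m: float, hfov_deg: float) -> float:
    """Width of the surface visible in one image taken head-on from distance_m."""
    return 2.0 * distance_m * math.tan(math.radians(hfov_deg) / 2.0)


def capture_step(distance_m: float, hfov_deg: float, overlap: float) -> float:
    """Sideways distance between two captures for the given overlap fraction."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    return footprint_width(distance_m, hfov_deg) * (1.0 - overlap)


def images_needed(length_m: float, distance_m: float,
                  hfov_deg: float, overlap: float) -> int:
    """Images needed to cover a surface of length_m in one sideways pass."""
    width = footprint_width(distance_m, hfov_deg)
    if length_m <= width:
        return 1
    step = capture_step(distance_m, hfov_deg, overlap)
    return math.ceil((length_m - width) / step) + 1


# A 30 m facade captured from 5 m away with a 60 degree field of view
# and 40% overlap between neighbouring images.
print(capture_step(5.0, 60.0, 0.4))
print(images_needed(30.0, 5.0, 60.0, 0.4))
```

This counts a single pass only. Additional passes from other distances and angles are still needed to give each feature several distinct viewpoints.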
Not all spaces can be mapped. Some locations are more suitable for Spatial Mapping and Visual Positioning than others.
For example, highly reflective surfaces don't have static visual features for map construction. The reflections move around depending on the viewpoint.
Locations that lack distinct visual detail and features are also difficult. An extreme example would be a blank white wall, which looks practically the same no matter which part of it you are looking at.
Highly dynamic lighting can cause problems as the mapped space can look visually very different in drastically different lighting. In these cases, it's best to use multiple maps of the same location for different lighting conditions.
For a good map, you should see the same area from different viewpoints. Any single visual feature needs to be seen from at least three viewpoints. More varied viewpoints mean better accuracy.
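The three-viewpoint rule can be expressed as a simple coverage check. The sketch below assumes you already have, for each image, the set of feature IDs visible in it, and that each image was taken from a distinct viewpoint. The data layout and function name are hypothetical, used only to illustrate the rule.

```python
from collections import Counter
from typing import Dict, List, Set


def under_observed(observations: Dict[str, Set[int]], min_views: int = 3) -> List[int]:
    """Return the IDs of features seen in fewer than min_views images."""
    counts: Counter = Counter()
    for feature_ids in observations.values():
        counts.update(set(feature_ids))  # count each feature once per image
    return sorted(f for f, n in counts.items() if n < min_views)


# Hypothetical example: image name -> IDs of the features visible in it.
observations = {
    "img_0": {1, 2, 3},
    "img_1": {1, 2},
    "img_2": {1, 2, 4},
    "img_3": {2, 4},
}
print(under_observed(observations))
```

Features reported by such a check mark the parts of the scene that would benefit from more images taken from new angles.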
AR Hotspots are small to medium-sized, focused locations enhanced with Augmented Reality content. These types of locations are a very good use case and do not require mapping everything around the user, just the focused target area.
Examples of AR Hotspots are statues, murals, and other street art. Storefronts, building facades, pop-up stores, exhibition booths, and art installations fall in this category.
When mapping AR Hotspots, try to cover the area from as many angles as possible. Take a series of images that overlap with each other. If you need to cover a specific part of the hotspot with extra detail, you can capture additional close-ups.
If you need to map a very long area, such as a building facade, that would be difficult to cover in one arc, you can try to cover it with multiple "mini-arcs". You can also take additional images from further away. These help the localizer when the target area is viewed from a distance.
Landmarks like statues are often easy to map by just capturing a series of images in a circle around them. Try to fit all important visual features in the images. You can map either in landscape or portrait mode. You can also mix orientations when needed and take close-ups for extra accuracy.
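As a planning aid, the circle around a landmark can be sketched as evenly spaced capture positions that all face the centre. The radius, the number of images, and the function name are assumptions for illustration. The positions do not need to be exact when you capture the images in practice.

```python
import math
from typing import List, Tuple

Pose2D = Tuple[float, float, float]  # x in metres, y in metres, yaw in radians


def circle_capture_poses(center: Tuple[float, float],
                         radius_m: float, count: int) -> List[Pose2D]:
    """Evenly spaced positions on a circle, each heading toward the centre."""
    if count < 3:
        raise ValueError("at least three viewpoints are needed")
    cx, cy = center
    poses = []
    for i in range(count):
        angle = 2.0 * math.pi * i / count
        x = cx + radius_m * math.cos(angle)
        y = cy + radius_m * math.sin(angle)
        yaw = math.atan2(cy - y, cx - x)  # heading that points the camera at the centre
        poses.append((x, y, yaw))
    return poses


# Eight viewpoints on a 2 m circle around a statue at the origin.
for x, y, yaw in circle_capture_poses((0.0, 0.0), 2.0, 8):
    print(f"x={x:+.2f} m  y={y:+.2f} m  yaw={math.degrees(yaw):+.0f} deg")
```

Repeating the circle at a second radius, or adding close-ups, gives the extra detail and distant viewpoints mentioned above.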
Large indoor spaces should be divided into separate maps, such as one map per room. You can use the separate maps at the same time or combine them later, but mapping them separately makes the mapping process easier, map construction faster, and the map update process more flexible.
To map an indoor location, use the "outside-in" method. Take a series of images while moving around the perimeter of the room, looking across the space. Remember that you can use either landscape or portrait mode. In smaller rooms, landscape often works very well.
This basic approach works for all types of indoor locations with only a little tweaking. Just try to cover the whole area from as many angles as possible.
If you need to map narrow areas or areas connected by narrow doorways, you should take extra care to make sure the different areas can be visually connected by the images.
Outdoor locations are usually just larger variations of the other types. You could map city streets the same way as you would map a mural. Capture images in multiple directions from multiple viewpoints.
For a wide and open area such as a market square, you could capture many "panoramas" from different viewpoints to get good coverage of the whole area.