Cultural Heritage Imaging | Photogrammetry
What is it?
Fundamentally, photogrammetry is about measurement: measuring the imaging subject. To perform high-quality photogrammetric measurement, the photographer capturing the data set must follow a rule-based procedure. This procedure guides how to configure, position, and orient the camera toward the imaging subject so that the photos provide the most useful information to the processing software and minimize the uncertainty in the resulting measurements. Those measurements will be only as good as the design of the measurement structure, or lack thereof, underlying the collection of the photographic data.
Recent technological advances in digital cameras, computer processors, and computational techniques, such as sub-pixel image matching, make photogrammetry a portable and powerful technique. It yields extremely dense and precise 3D surface data with an appropriately limited number of photos, captured with standard digital photography equipment, in a relatively short period of time. In the last five years, the variety and power of photogrammetry and related processes have increased dramatically.
Video: “Photogrammetry for Rock Art”
Watch this brief video to see an example of a petroglyph rock art panel as a 3D model created using photogrammetry.
Photogrammetry for Rock Art from Cultural Heritage Imaging on Vimeo.
How does it work?
CHI uses an image capture technique for photogrammetry based on the work of Neffra Matthews and Tommy Noble at the US Bureau of Land Management (BLM). The BLM Tech Note (PDF) and 2010 VAST tutorial provide additional information regarding the origins of our methods. Neffra and Tommy have been improving their photogrammetry methods at the BLM for over 20 years. Their image capture method acquires photo data sets that are software independent and will get the most information-rich results possible from the various photogrammetry software systems on the market. CHI has been working in collaboration with Tommy and Neffra for over a decade. The four-day photogrammetry training CHI offers was developed by and continues to feature this collaboration.
The method of image capture taught by CHI is software independent. A well-captured photogrammetry data set will repeatedly produce the same 3D model when processed by a knowledgeable user employing sufficiently robust software. Currently CHI uses Agisoft Metashape Pro software.
Structure from Motion (SfM)
The most advanced photogrammetry software uses the Structure from Motion (SfM) method. The SfM approach simultaneously determines how light passes through the camera’s optical system (the camera’s calibration) and the camera’s position and orientation (pose), relative to the imaging subject, for each photo. During processing, each camera’s calibration and pose is made increasingly precise through an iterative process. This is done by iteratively refining a sparse cloud of points in the virtual scene representing the real-world environment containing the imaging subject. The points in the sparse cloud are created from matches of similar pixel neighborhoods identified in multiple photos. If matching pixel neighborhoods are found in two, or preferably more, photos, the areas occupied by the pixel neighborhoods in the respective photos are projected into the virtual 3D scene. These projections intersect in a common volume in the 3D scene, which is represented as a point in the sparse cloud. The positional uncertainty of these points is reduced in a process discussed in more detail below. As the precision of the point positions increases, the precision of the camera calibration and pose also increases. When the camera calibration and pose reach a level of precision acceptable to the user, the SfM process is finished. In the following stage, Agisoft Metashape and other software packages offering SfM use one variety or another of multi-viewpoint stereo algorithm to build a dense point cloud, which can be transformed into a textured 3D model.
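To make the projection and intersection step concrete, the following is a minimal sketch (not CHI’s or any vendor’s actual implementation) of how a matched pixel in two or more posed photos can be back-projected into rays and triangulated into a single sparse-cloud point. The intrinsic matrix K, the rotation R, and the translation t stand in for the camera calibration and pose; lens distortion and the pixel’s footprint volume are ignored for simplicity.

```python
import numpy as np

def pixel_ray(K, R, t, pixel):
    """Back-project one pixel into a world-space ray (origin, unit direction).
    K: 3x3 camera intrinsics (the calibration); R, t: world-to-camera pose."""
    origin = -R.T @ t                                   # camera center in world coordinates
    d_cam = np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])
    d_world = R.T @ d_cam                               # ray direction rotated into world space
    return origin, d_world / np.linalg.norm(d_world)

def triangulate(rays):
    """Least-squares point closest to all rays: the 'intersection' of the
    projections that becomes one point in the sparse cloud."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for origin, d in rays:
        P = np.eye(3) - np.outer(d, d)                  # projects onto the plane normal to the ray
        A += P
        b += P @ origin
    return np.linalg.solve(A, b)
```

In a full SfM pipeline this triangulation runs over many thousands of matches, and the Bundle Adjustment described below then refines the calibration, the poses, and the triangulated points together.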
Using SfM algorithms, photographic capture sets can be acquired using uncalibrated camera/lens combinations. To generate the information necessary to characterize how light passes from the imaging subject through the given optical system, SfM algorithms need a set of matched point correspondences. These matched points are found in the overlapping photographs of a planned network of images, captured from different positions and orientations relative to the imaging subject. How the camera is moved relative to the subject has a great impact on the degree of precision (positional uncertainty) present in the measurements of the associated 3D representation.
SfM differs from previous photogrammetry software tools in that it relies solely on the photographs of a camera moving around the scene containing the imaging subject. No separate camera calibration is needed or desired. This separates SfM from older photogrammetry algorithms, which require either a precalibrated camera or an additional set of photos to calculate a calibration for the camera before point neighborhood matching commences.
In greater detail, the SfM software must take the information contained in the set of photogrammetry photos and optimally solve for three outcomes:
- Calibrate the camera’s interior geometry describing how bundles of light rays travel from the imaging subject through the camera’s optics to the digital sensor
- Determine the relative position and orientation of the camera pose for each photo relative to the imaging subject
- Generate a sparse point cloud of 3D points from finding and matching locations in two or more photographs that depict the same feature on the imaging subject
The camera calibration, pose, and sparse-cloud point uncertainties are reduced using an uncertainty (error) reduction processing workflow. This workflow is known as optimization. In SfM, error reductions in the camera calibration, pose, and 3D point matches are all solved simultaneously. A precision improvement in any one of these three components, calibration, pose, or sparse point locations in the cloud, will improve the precision of the other two. A complex algorithm called a Bundle Adjustment generates this three-part improvement. How the Bundle Adjustment works is beyond the scope of this photogrammetry introduction; however, it is useful to know that Bundle Adjustment algorithms are widely used in experimental science.
In SfM, the optimization operation continuously improves the camera calibration and pose as the matched points’ positional uncertainties are systematically reduced within the sparse cloud. This is usually done by iteration, at each stage removing the points in the sparse cloud that have the poorest precision. Each time the points with the poorest precision are removed, a Bundle Adjustment is run, and the calibration, pose, and point precisions improve. Points with initially poor precision, if not selected for deletion early on, can have their positional precision continuously improved over the iterations of the Bundle Adjustment. This is one reason why not all of the poor-precision points are deleted at once.
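The following is a minimal sketch of the iterate, cull, and re-adjust loop described above, assuming two hypothetical helpers: bundle_adjust(), standing in for the software’s Bundle Adjustment, and point_uncertainty(), returning each sparse-cloud point’s positional uncertainty. It is illustrative only; in Agisoft Metashape the equivalent steps are performed with the gradual selection and camera optimization tools, not with code like this.

```python
def refine_sparse_cloud(cameras, points, worst_fraction=0.10,
                        target_rmse_px=0.3, max_rounds=10):
    """Iteratively remove the least precise sparse-cloud points, then re-run the
    bundle adjustment so calibration, pose, and point precision improve together."""
    rmse = bundle_adjust(cameras, points)                 # hypothetical helper
    for _ in range(max_rounds):
        if rmse <= target_rmse_px:
            break
        # Drop only the worst slice of points each round; borderline points may
        # improve in later adjustments, which is why they are not all culled at once.
        points.sort(key=point_uncertainty, reverse=True)   # hypothetical helper
        del points[:int(len(points) * worst_fraction)]
        rmse = bundle_adjust(cameras, points)
    return cameras, points, rmse
```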
When these three operations have yielded a very high precision, low uncertainty camera calibration and pose, often expressed in small fractions of a pixel, the role of the SfM algorithm is finished. The sparse cloud has no further use. The remaining uncertainty of the SfM Bundle Adjustment solution is quantified in the form of a Root Mean Square Error (RMSE) residual. RMSE is analogous to the statistical concept of a standard deviation. This level of uncertainty serves as the foundation for all subsequent measurement operations.
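As a small worked example of the RMSE residual mentioned above, assuming a handful of illustrative reprojection residuals in pixels:

```python
import numpy as np

def reprojection_rmse(residuals_px):
    """Root Mean Square Error of reprojection residuals, in pixels."""
    r = np.asarray(residuals_px, dtype=float)
    return float(np.sqrt(np.mean(r ** 2)))

# Illustrative residuals for a few tie points, in fractions of a pixel
print(reprojection_rmse([0.21, 0.35, 0.18, 0.42, 0.27]))   # ~0.30 px
```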
Generating the Dense Point Cloud
The photogrammetry software must then use Multi-Viewpoint Stereo (MVS) algorithms, informed by the knowledge of camera calibration and pose, to build a dense point cloud in virtual space of a size determined by the user. The size of the dense cloud can reach into the hundreds of millions or billions of points. With a high precision camera calibration and camera pose, the camera sensor that captured each photograph can be positioned and oriented (posed) in virtual space to project each pixel's information through the virtual model of the lens (the camera calibration) in a direct line out toward the point on the virtual subject’s surface represented by that pixel.
It is important to understand that each of these projections is, in fact, a small, gradually widening “tube” running from the boundary of the pixel on the camera sensor to a spot on the imaging subject. This tube encloses a small volume: the “footprint” the projected pixel covers on the surface of the subject. When the projections from multiple photos intersect on the subject’s surface, they create a commonly shared volume. When the photos are captured from rule-based positions and orientations (poses), their projections work together to make the commonly shared volume smaller and smaller. The surface point in the dense point cloud made by these intersections falls within this shared volume; the smaller the volume, the less uncertain, and therefore the more precise, the point's location becomes. The rule-based photogrammetric capture method designed by Matthews and Noble is explicitly designed to produce a set of viewpoints of the subject whose projection intersections have the smallest possible common volume in 3D space. As described below, when nine projections from nine properly positioned and oriented photographs intersect, the common volume is very small and a highly precise, low positional uncertainty point results. When each point results from the intersection of nine well-located pixel projections, the dense cloud of points represents a precise, measurable virtual 3D version of the original imaging subject’s surface shape.
The photogrammetry software then employs surfacing algorithms, using the dense cloud’s 3D point positions and the look angles from the photos to the matched points, to build the geometrical mesh. A texture map is calculated from the color information in the pixels of the original photos and the knowledge of how those pixels map onto the 3D geometry. The result is a textured 3D model that can be measured with a known precision.
Example: Tlingit Helmet – Views of a 3D Photogrammetric Model
This is a Tlingit helmet made of carved wood by artist Richard Beasley, 1998. Above are three views of a 3D model of it, produced from a photogrammetry image sequence. Top to bottom, left side: detail of input image as object rests on turntable; model in wireframe viewing mode; model in solid viewing mode; model in texture viewing mode. Right side: large background image combines 3 views of the model, illustrating wireframe, solid, and texture.
Adding Measurability
Without the inclusion of objects of known length in the project, photogrammetry generates 3D representations without scale. The scale for the virtual representation is added during the SfM stage of processing. The scale provides the ability to introduce real-world measurement values to the virtual 3D model. At CHI, we accomplish this by adding at least three (and preferably four) calibrated scale bars of known dimension into the scene containing the imaging subject. The scale bars can be on, around, or next to the region of interest. Each scale bar must be included in multiple (at least three, preferably nine) overlapping images. Scale bars are flat, lightweight linear bars in several sizes with printed targets separated by a known, calibrated distance. The software can recognize the targets. The user then enters distances between the targets. Using calibrated scale bars can produce levels of measurement precision well below one tenth of a millimeter.
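A minimal sketch of how known scale-bar lengths convert the unitless model into real-world units follows; the inputs (two detected target positions in model coordinates plus the calibrated length for each bar) are hypothetical, and the actual workflow is performed inside the photogrammetry software, not with code like this.

```python
import numpy as np

def model_to_metric_scale(scale_bars):
    """scale_bars: iterable of (target_a_xyz, target_b_xyz, known_length) tuples,
    with target positions in unscaled model coordinates and known_length in metres.
    Returns the average factor that converts model units to metres."""
    factors = []
    for a, b, known_length in scale_bars:
        measured = np.linalg.norm(np.asarray(a) - np.asarray(b))   # unscaled model distance
        factors.append(known_length / measured)
    return float(np.mean(factors))
```

Using several bars also provides a cross-check: the spread of the per-bar factors reveals how consistently the scale has been recovered.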
Measurement structure design is the process of defining a sensor network and the subsequent methods to process the information it collects. In photogrammetry, the sensor network is the camera’s 3D location and orientation for each photo in the capture set in relation to the imaging subject. To get the best results, this network must collect enough data so that the impact of any incorrect data is minimized. The design of the measurement structure is influenced by the imaging subject’s 3D features and the number of images necessary to satisfy the given “accuracy” and quality requirements. Additionally, any restrictions on the placement of the camera can impact the sensor network design. The prerequisite for any successful measurement in any scientific data domain is the design of such a measurement structure. Reduction of measurement uncertainty is accomplished through the systematic reduction and elimination of error in photogrammetric image capture and its subsequent virtual 3D reconstruction.
The resolution of a surface model is governed by the area on the real-world subject represented by the pixels in the images from which it was generated. This resolution is known as ground sample distance (GSD). The GSD resolution is determined by the resolution of the camera sensor, the focal length of the lens, and the distance from the subject.
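The GSD relationship stated above can be written as a one-line calculation; the 4.4 µm pixel pitch in the example is an assumed, illustrative value.

```python
def ground_sample_distance_mm(pixel_pitch_mm, focal_length_mm, distance_mm):
    """Size on the subject covered by one pixel (its projected 'footprint'):
    GSD = pixel pitch x subject distance / focal length."""
    return pixel_pitch_mm * distance_mm / focal_length_mm

# Example: 4.4 micron pixels, 24 mm lens, subject 1 m away -> ~0.18 mm per pixel
print(ground_sample_distance_mm(0.0044, 24.0, 1000.0))
```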
How to Capture Photos
A crucial element of a successful photogrammetric process is obtaining a “good” photographic sequence. Good photographic sequences are based on a few simple rules. These methods are independent of any particular software package, and the data produced will work in all photogrammetry packages. The CHI photogrammetry training class explores the reasons behind these rules and shows how to make informed choices in the face of challenging subjects.
Camera/Lens Configuration: Strong Suggestions
- Begin your project using a wide-angle lens. CHI’s first lens selection usually has a 24mm focal length.
- Choose your distance from the subject and set the focus. Then set the lens to “manual-focus” and tape the focus ring in place.
- Use prime lenses rather than zoom lenses. If a zoom lens must be used, use the nearest or farthest extent of the zoom.
- The camera’s aperture must remain constant during the capture sequence. On a 35mm camera, it is good practice not to set the aperture smaller than f/11. At apertures smaller (higher f-numbers) than f/11, diffraction effects blur the image, significantly reducing the camera’s resolution (see the sketch following this list).
- Use the lowest possible ISO setting. The higher the ISO setting, the more electronic noise is generated in the camera sensor. This noise makes the matching of pixels in different photographs more difficult.
- Turn off image stabilization and auto-rotate camera functions.
- In variable light conditions (a partly cloudy day, for instance), the camera should be set to manual OR aperture priority mode (use f/5.6–f/11 to get the sharpest images). Aperture priority locks the aperture and evens out exposure by varying the shutter speed. It is necessary to keep the exposure metering point on the imaging subject for consistent results. Set your camera to center-point metering mode.
- To obtain the highest precision results, ensure that the camera configuration does not change for a given sequence of photos.
- If a change of camera or lens configuration, including focus, is necessary, group the subsequent photos together in a different set from the previous photos. Calibrate the sets of photos separately, each in its own calibration group.
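To illustrate the f/11 guidance in the list above, the standard Airy-disk estimate gives the approximate diameter of the diffraction spot on the sensor as 2.44 × wavelength × f-number. The 550 nm wavelength (green light) and the 4–6 µm pixel pitch used for comparison are illustrative assumptions.

```python
def airy_disk_diameter_um(f_number, wavelength_nm=550.0):
    """Approximate diameter of the diffraction (Airy) spot on the sensor, in microns."""
    return 2.44 * (wavelength_nm / 1000.0) * f_number

for n in (5.6, 8, 11, 16, 22):
    # Compare against a typical 4-6 micron pixel pitch: beyond f/11 the
    # diffraction spot is several pixels wide and the image visibly softens.
    print(f"f/{n}: {airy_disk_diameter_um(n):.1f} um")
```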
How to Determine Where to Take the Photographs
To maintain a consistent 66% overlap, the camera must be moved a distance equivalent to 34% of the camera’s field of view between photographs, from left to right.
Be sure to begin the first row of photos positioned such that two-thirds of the field of view is to the left of the imaging subject.
Ensure the entire subject is covered by at least three frames in each row.
Proceed systematically from left to right along the length of the subject and take as many photos as necessary to ensure complete coverage. The last photo of the row must have two-thirds of the field of view to the right of the subject.
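The 66% overlap rule translates directly into a camera step distance once the field-of-view width on the subject is known; the full-frame 36 mm sensor width in the example below is an assumption.

```python
def camera_step_mm(sensor_width_mm, focal_length_mm, distance_mm, overlap=0.66):
    """Distance to move the camera between photos in a row to keep the given overlap.
    Field-of-view width on the subject = sensor width x distance / focal length."""
    fov_width_mm = sensor_width_mm * distance_mm / focal_length_mm
    return (1.0 - overlap) * fov_width_mm

# Example: 36 mm sensor, 24 mm lens, 1 m from the subject:
# the frame covers ~1.5 m of the subject, so move ~0.51 m between photos.
print(camera_step_mm(36.0, 24.0, 1000.0))
```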
For higher quality results and greater imaging redundancy, follow this procedure. The steps below help lower point matching and depth uncertainty and provide essential redundancy.
- Raise the camera vertically and aim the camera downward 15 degrees to re-photograph the previously captured row.
- At the same time, rotate the camera 90 degrees to portrait mode and use the same 66% overlap from left to right.
- When the second row is finished, lower the camera vertically below the first row and aim the camera upward 15 degrees to re-photograph the captured area.
- Rotate the camera 180 degrees (for a total of 270 degrees), and again capture the area in the same way.
It is important to maintain a sufficiently consistent distance from the subject to retain sharp focus. Use a depth of field calculator app on a cell phone to understand how much freedom of movement in depth from the subject is available for your given camera and lens configuration.
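For readers without a depth of field app at hand, the standard hyperfocal approximation behind such calculators can be sketched as follows; the 0.03 mm circle of confusion is a common full-frame assumption and should be adjusted for other sensor sizes.

```python
def depth_of_field_mm(focal_mm, f_number, focus_distance_mm, coc_mm=0.03):
    """Near/far limits of acceptable focus (thin-lens / hyperfocal approximation)."""
    hyperfocal = focal_mm ** 2 / (f_number * coc_mm) + focal_mm
    near = (focus_distance_mm * (hyperfocal - focal_mm)
            / (hyperfocal + focus_distance_mm - 2 * focal_mm))
    far = (focus_distance_mm * (hyperfocal - focal_mm) / (hyperfocal - focus_distance_mm)
           if focus_distance_mm < hyperfocal else float("inf"))
    return near, far

# Example: 24 mm lens at f/8, focused 1 m away -> roughly 0.7 m to 1.7 m in focus
print(depth_of_field_mm(24.0, 8.0, 1000.0))
```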
For multi-resolution applications, or to increase or decrease resolution, the user can change the camera position (closer to or farther from the subject) or change the focal length of the lens (such as 24mm to 50mm), changing the resolution by no more than a factor of two (double or one-half the resolution of the previous set of photos).
- Follow this rule for as many sets of photos as necessary to reach the desired resolution.
- Calibrate each set of photos separately if you change the focus or the lens.
Because of the flexibility of this technique, it is possible to obtain high accuracy 3D data for subjects in almost any orientation or position (horizontal, vertical, above, or below the camera).
For round subjects, capture photos every 10 to 15 degrees and overlap the beginning and end photos to complete the circuit. Repeat the previous procedure to capture three rows of properly positioned photographs. For turntable projects, the subject’s background must be masked to allow only those pixels falling on the subject into the subsequent SfM processing pipeline. Failure to mask the background can corrupt your project.
Archiving the Results
Photogrammetry is archive friendly. Strictly speaking, all of the 3D information required to build a scaled, virtual, textured 3D representation is contained in the 2D photos present in a well-designed photogrammetric capture set. Today, the methods of long-term preservation of photographs are well understood. To preserve the textured 3D information of any imaging subject, all that is necessary is to archive the sets of photos and their associated metadata. When a 3D representation is desired, the archived photo sets can be used to generate or re-generate the virtual model. With a well-captured image set, a newly generated 3D representation will be the same as previous representations made from that image set. At the current rate of software and computing power development, it is likely that 3D models built from archived photogrammetry image sets will at some point be available “on demand.”
Example: Assyrian Genie Bas Relief in Sketchfab
The interactive 3D model embedded below was built in October 2016 by Cultural Heritage Imaging using photogrammetry. The imaging was done as part of a National Endowment for the Humanities (NEH) sponsored training session hosted by the Los Angeles County Museum of Art (LACMA). The model, published on Sketchfab, a platform for sharing 3D content, was made from 57 images from 2 different cameras and was decimated from a 3.5 million face model. No hole filling or smoothing was applied.
3D model created using photogrammetry: Eagle-Headed Deity, Neo-Assyrian Period (9th century B.C.), Los Angeles County Museum of Art, Gift of Anna Bing Arnold (66.4.4)
Here is a detail in Sketchfab of the genie's hand from the same bas relief.