Understanding LiDAR point cloud formats

Many file formats are in use for storing LiDAR point cloud data. For example, the open-source 3D point cloud and mesh processing software CloudCompare supports over 30 different file extensions covering over 20 different file types.

Generally, each file format is an attempt to optimise data storage for the needs of its developers and users.

Key characteristics of file formats that are evaluated for optimal suitability include:

  1. The compactness of stored data
  2. Accessibility of the data by third-party developers
  3. Length of record for each point, line, or face that is being stored
  4. Speed of loading the point clouds into memory
  5. Ability to load only a known cloud subsection into memory
  6. Storing point values specifically, or the possibility to simplify the data to lines and planes
  7. The ability to store large floating-point values with high precision, without using 64-bit or 128-bit storage for each value associated with each point, line, or plane
  8. Other possible characteristics

 

Figure: Stencil Pro scan of San Diego old town

Point cloud types

The type of point cloud relates to the underlying data and not the file type or format.

Structured v unstructured

To balance compact size against rapid access to sub-clouds within the data, many point cloud data formats use a “structured” layout, meaning the data is stored not just point record by point record but in groups of point records that share a spatial relationship or a common LiDAR scan location.

The alternative is an unstructured file type, in which each data record is an individual point and the points are reported in an often arbitrary sequence from the first to the last.
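As a rough illustration of the difference, the sketch below takes a flat, unstructured list of points and groups it into cubic cells so that one region can be retrieved without scanning every record. The function name group_by_cell and the 1 m cell size are assumptions made for this example; real structured formats such as E57 use their own grouping schemes.

```python
from collections import defaultdict

def group_by_cell(points, cell_size):
    """Group (x, y, z) points into cubic cells keyed by integer cell index."""
    cells = defaultdict(list)
    for x, y, z in points:
        key = (int(x // cell_size), int(y // cell_size), int(z // cell_size))
        cells[key].append((x, y, z))
    return dict(cells)

# Unstructured: a flat list of points in arbitrary order.
points = [(0.2, 0.1, 0.0), (5.4, 0.3, 0.1), (0.7, 0.9, 0.2), (5.1, 0.8, 0.0)]

# Structured: the same points grouped by 1 m cell.
cells = group_by_cell(points, cell_size=1.0)

# Only the points in the cell at the origin are touched here.
near_origin = cells[(0, 0, 0)]
```

The trade-off is the one described above: the grouped form costs extra bookkeeping but makes loading a known subsection cheap.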

Registered v unregistered

To register point clouds is to find the spatial transformation that brings two or more point clouds into the same reference system. Georeferencing is a similar process: ground control points or other known values are used to align the point clouds to the Earth. An unregistered point cloud remains in the local coordinate frame of the scan.
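Once a registration transform has been found, applying it is straightforward. The sketch below assumes the transform is already known and, to keep the example short, restricts it to a rotation about the vertical axis plus a translation. The function name apply_rigid_transform and the numbers are illustrative only; estimating the transform itself (for example from ground control points) is a separate problem not shown here.

```python
import math

def apply_rigid_transform(points, yaw_rad, translation):
    """Rotate points about the Z axis by yaw_rad, then translate them."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    tx, ty, tz = translation
    return [
        (c * x - s * y + tx, s * x + c * y + ty, z + tz)
        for x, y, z in points
    ]

# A scan in its own local frame (hypothetical values).
local_scan = [(1.0, 0.0, 0.0), (0.0, 2.0, 1.0)]

# Bring it into the shared reference frame: 90 degree yaw, then a shift.
registered = apply_rigid_transform(local_scan, math.pi / 2, (10.0, 20.0, 5.0))
```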

Ordered v unordered

Whether a point cloud is ordered depends on whether the data is sorted in a particular sequence, and is independent of the file type or format. Kaarta point clouds, like all point clouds generated through continuous motion, are unordered. An unordered point cloud could be sorted in 2D or 3D, but there is no obvious threshold at which it would be considered ordered.

Primary point cloud file formats

The formats described below are the most widely used. For comparison, the simplest and most compact storage would represent each point as an X, Y, Z record in a flat list of 32-bit binary values, but such a format offers little beyond being compact and accessible.
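A minimal sketch of such a flat layout, using only the Python standard library, is shown below. It assumes little-endian 32-bit floats and no header of any kind; the helper names pack_points and unpack_points are invented for this example and do not correspond to any named format.

```python
import struct

# One record: three little-endian 32-bit floats (x, y, z), 12 bytes in total.
RECORD = struct.Struct("<3f")

def pack_points(points):
    """Serialise (x, y, z) tuples into a flat run of 12-byte records."""
    return b"".join(RECORD.pack(x, y, z) for x, y, z in points)

def unpack_points(data):
    """Read the flat records back into a list of (x, y, z) tuples."""
    return list(RECORD.iter_unpack(data))

points = [(1.0, 2.0, 3.0), (-4.5, 0.25, 8.0)]
data = pack_points(points)
restored = unpack_points(data)
```

Nothing in the byte stream records units, coordinate system, colour or point count, which is why such a layout is of limited use on its own.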

LAS format

The most common “open” file format in use today is the LAS file format.

“The LAS (LASer) format is a file format designed for the interchange and archiving of LiDAR point cloud data. It is an open, binary format specified by the American Society for Photogrammetry and Remote Sensing (ASPRS). The format is widely used and regarded as an industry standard for LiDAR data”

However, the LAS format is not particularly compact and, because it is not a structured format, it does not provide an easy way to load a meaningful subset of the cloud.

Kaarta supports the LAS format through an export function. LAS clouds are often up to twice as large as the corresponding binary PLY files because LAS stores additional data beyond the usual XYZ and RGB information. LAS handles large UTM coordinate values more smoothly than PLY. A point cloud in PLY format is simply a list of the points and associated scalar fields.
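One likely reason for the difference with UTM values is that LAS stores each coordinate as a 32-bit integer that is multiplied by a scale factor and added to an offset, both kept in the file header, whereas PLY files commonly store coordinates as 32-bit floats. The sketch below compares the two approaches for a single coordinate. The northing value, scale and offset are hypothetical, and the helper names are invented for this example.

```python
import struct

def as_float32(value):
    """Round a Python float to the nearest 32-bit float and back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

def encode_scaled(value, scale, offset):
    """Scaled-integer encoding of the kind LAS uses for coordinates."""
    return round((value - offset) / scale)

def decode_scaled(stored, scale, offset):
    return stored * scale + offset

northing = 3621456.789  # hypothetical UTM northing in metres

# At this magnitude adjacent 32-bit floats are 0.25 m apart,
# so the stored value is off by a few centimetres.
float32_error = abs(as_float32(northing) - northing)

# Millimetre scale with an offset near the data keeps the integer small.
scale, offset = 0.001, 3621000.0
stored = encode_scaled(northing, scale, offset)
scaled_error = abs(decode_scaled(stored, scale, offset) - northing)
```

A common workaround when float coordinates are required is to subtract a local origin before writing the points and to record that origin separately.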

E57 format

A second commonly used format is E57. According to the libE57 website:

“The E57 file format is a compact, vendor-neutral format for storing point clouds, images, and metadata produced by 3D imaging systems, such as laser scanners. The file format is specified by the ASTM, an international standards organisation, and it is documented in the ASTM E2807 standard.” 

E57 is a structured format, which allows for rapid loading of sub-clouds and has provision for recording the location of the LiDAR sensor for each cloud record. Knowing the sensor location makes it possible to orient the estimated normals for each sub-cloud, which many cloud analysis steps require.
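The reason the sensor location matters is that a normal estimated from neighbouring points has an ambiguous sign: it may point into or out of the surface. A common convention, sketched below, is to flip any normal that points away from the sensor that observed the point. The function name orient_normal is invented for this example and is not part of any E57 library.

```python
def orient_normal(point, normal, sensor):
    """Return the normal, flipped if needed so that it faces the sensor."""
    to_sensor = tuple(s - p for s, p in zip(sensor, point))
    dot = sum(n * v for n, v in zip(normal, to_sensor))
    return normal if dot >= 0 else tuple(-n for n in normal)

# A point on the floor, seen by a sensor two metres above it.
point = (0.0, 0.0, 0.0)
sensor = (0.0, 0.0, 2.0)

# The estimated normal points down, away from the sensor, so it is flipped.
oriented = orient_normal(point, (0.0, 0.0, -1.0), sensor)

# A normal that already faces the sensor is returned unchanged.
unchanged = orient_normal(point, (0.0, 0.0, 1.0), sensor)
```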

Kaarta's support for the E57 structured file format is complicated because Kaarta data is collected continuously in a time sequence rather than organised spatially, as LiDAR data collected from discrete locations is. Each frame of roughly 30,000 to 130,000 points has an associated trajectory point, but the volume occupied by those time-correlated points may contain neighbouring points that were observed anywhere from 100 ms to tens of minutes later and so are far apart in the original data file sequence. Kaarta has ideas on how to overcome this characteristic of its point clouds and associated trajectories in order to create a meaningful E57 structure.

PLY format

A third format, and the one that Kaarta primarily uses, is the Stanford Triangle Format, or PLY format. It is an unstructured point record format that can be compact and simple but is less commonly used in the industry than LAS. PLY has a file header that is easy to understand and interpret, and it can store data in either ASCII or binary form, the latter in both big-endian and little-endian byte order. These properties make it easy to develop tools that manipulate or make calculations on the point cloud.
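To show how readable the header is, the sketch below writes a minimal ASCII PLY file containing only vertex positions and then reads it back. It is a simplified illustration that assumes a single vertex element with float x, y, z properties; the function names are invented for this example, and a real reader would also need to handle binary data, other elements and comments.

```python
def write_ascii_ply(points):
    """Build the text of a minimal ASCII PLY file from (x, y, z) tuples."""
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
        "end_header",
    ]
    body = [f"{x} {y} {z}" for x, y, z in points]
    return "\n".join(header + body) + "\n"

def read_ascii_ply(text):
    """Parse the header, then return (property names, list of points)."""
    lines = text.splitlines()
    if not lines or lines[0] != "ply":
        raise ValueError("not a PLY file")
    end = lines.index("end_header")
    count, properties = 0, []
    for line in lines[1:end]:
        parts = line.split()
        if not parts:
            continue
        if parts[:2] == ["element", "vertex"]:
            count = int(parts[2])
        elif parts[0] == "property":
            properties.append(parts[-1])  # last token is the property name
    rows = lines[end + 1 : end + 1 + count]
    return properties, [tuple(float(v) for v in row.split()) for row in rows]

points = [(1.0, 2.0, 3.0), (4.5, 5.5, 6.5)]
ply_text = write_ascii_ply(points)
properties, restored = read_ascii_ply(ply_text)
```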

Conclusion

There are many file formats, and each has particular strengths and weaknesses depending on the method of capture, the software in use, the computing hardware and more. In short, portable mapping devices have characteristics that lend themselves well to the PLY format, but the data can be converted to other formats for many applications.