MCAP example for OSI tracefiles
Describe the feature
One or more OSI tracefiles should be packaged within an MCAP container (see the MCAP repository). MCAP is now used for ROS 2 and is MIT licensed. The requirement to sign a CLA with Foxglove Inc. was recently lifted, so broad usage is now unproblematic.
Describe the solution you would like
We analyse the metadata used and proposed in the MCAP project and map it to the metadata used in the OSI project. The OSI tracefile writer Python script is extended to also write these MCAP traces. The OSI documentation on traces is extended with the example and usage guidelines for MCAP. A good use case is, e.g., traffic participant models that have multiple time-synchronized OSI trace outputs and could profit from storing them in a single container.
Describe alternatives you have considered
Consider mentioning the possibility of using ASAM MDF as a container. Remark: MCAP is already used broadly in simulation contexts.
Describe the backwards compatibility
No issues. We should name the tracefiles contained within the MCAP container according to the existing naming conventions.
@timmruppert I have invited you to the official team.
Here is a current draft from Pierre for metadata definitions for OSI tracefiles: https://github.com/GAIA-X4PLC-AAD/ontology-management-base/pull/80
I explored the possibilities of using MCAP, conducted a quick proof of concept (POC), and discussed it with the Persival team.
General overview:
- MCAP supports Protobuf natively.
- There is an official C++ (pseudo) header-only library for reading and writing.
- There is an official Python package for reading and writing.
- A CLI tool, mcap, is available, which can be used to display information, among other things. Example:
$ mcap info 20240830T101515Z_sv_370_244_10000_mcap.mcap
library: python mcap-protobuf-support 0.5.1; mcap 1.1.1
profile:
messages: 10000
duration: 16m39.9s
start: 0.000000000
end: 999.900000000
compression:
lz4: [6/6 chunks] [5.43 MiB/520.69 KiB (90.64%)] [533.00 B/sec]
channels:
(1) sensor_view 10000 msgs (10.00 Hz) : osi3.SensorView [protobuf]
attachments: 0
metadata: 2
In the POC, I created an .mcap file consisting of OSI3::SensorView messages using the Python package. I then read this .mcap file using an extended version of the OpenMSL trace-file-player, which fed the data into a co-simulation using openmcx. This combination of Python and C++ worked smoothly.
Now, let's dive into the details:
Timestamps are (more or less) required
- If omitted, the MCAP writer will automatically use the time when the message is added.
- It’s most practical to use OSI timestamps, which are required in each top-level message since version 3.7.
A topic is required
- Each message must be stored under a topic.
- We propose using the name of the top-level message class, e.g., SensorView.
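The timestamp and topic conventions above could be sketched as follows. This is a minimal illustration and not taken from the POC: the helper names `osi_log_time_ns` and `osi_topic` are hypothetical, and the two small dataclasses only stand in for the generated `osi3` protobuf classes (whose `timestamp` field carries `seconds` and `nanos`) so that the snippet runs without OSI installed.

```python
from dataclasses import dataclass, field

NANOS_PER_SECOND = 1_000_000_000


@dataclass
class Timestamp:
    """Stand-in for osi3.Timestamp (seconds plus nanoseconds)."""
    seconds: int = 0
    nanos: int = 0


@dataclass
class SensorView:
    """Stand-in for the generated osi3.SensorView class."""
    timestamp: Timestamp = field(default_factory=Timestamp)


def osi_log_time_ns(message) -> int:
    """Convert the OSI timestamp of a message to integer nanoseconds."""
    ts = message.timestamp
    return ts.seconds * NANOS_PER_SECOND + ts.nanos


def osi_topic(message) -> str:
    """Use the name of the top-level message class as the topic."""
    return type(message).__name__


msg = SensorView(timestamp=Timestamp(seconds=999, nanos=900_000_000))
print(osi_topic(msg), osi_log_time_ns(msg))
```

The nanosecond value is what an MCAP writer would receive as the log time of the message instead of the wall-clock time at which the message is added.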
MCAP can handle compression
- MCAP supports data compression, using zstd by default.
- Other options include lz4 and no compression.
- Further testing is needed to determine if zstd is fast enough for large SensorData messages.
- Here's a comparison of file sizes for a short file containing a few SensorView messages:
5,2M 20240830T100943Z_sv_370_244_10000_mcap.osi # corresponding trace file
5,7M 20240830T100943Z_sv_370_244_10000_mcap.mcap # no compression
509K 20240830T101032Z_sv_370_244_10000_mcap.mcap # zstd (default)
737K 20240830T101032Z_sv_370_244_10000_mcap.mcap # lz4
Metadata
- MCAP files don’t require specific metadata, but you can store arbitrary metadata.
- Each metadata entry has a name and key-value pairs that can store multiple pieces of information.
- We propose using the following metadata structure:
  - name: time_of_first_message
    - key: timestamp, value: string in the format 20210818T150542Z
  - name: versions
    - key: osi, value: string in the format x.y.z
    - key: protobuf, value: string in the format x.y.z
  - name: creation_tool
    - key: name, value: arbitrary string
- There is no longer a need to store metadata in the filename itself.
- The creation_tool metadata entry would be a new addition.
- I would omit the fields of https://github.com/GAIA-X4PLC-AAD/ontology-management-base/pull/80 for the initial MCAP support. They can still be used, though; nobody is stopped from adding more metadata to their files.
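As a rough sketch, the proposed metadata structure could be assembled like this before it is handed to a writer. The function name `build_osi_metadata` and the version and tool strings in the usage part are placeholders chosen for illustration, not values from the POC.

```python
from datetime import datetime, timezone


def build_osi_metadata(first_message_time, osi_version, protobuf_version, tool_name):
    """Return the proposed metadata entries as {name: {key: value}}."""
    # All values are kept as strings, matching the key-value layout above.
    utc_time = first_message_time.astimezone(timezone.utc)
    return {
        "time_of_first_message": {"timestamp": utc_time.strftime("%Y%m%dT%H%M%SZ")},
        "versions": {"osi": osi_version, "protobuf": protobuf_version},
        "creation_tool": {"name": tool_name},
    }


entries = build_osi_metadata(
    first_message_time=datetime(2021, 8, 18, 15, 5, 42, tzinfo=timezone.utc),
    osi_version="3.7.0",        # placeholder version
    protobuf_version="4.25.1",  # placeholder version
    tool_name="example_tool",   # placeholder tool name
)
for name, data in entries.items():
    print(name, data)
```

Each name and its key-value mapping would then be written as one metadata record, for example with `add_metadata(name, data)` of the writer in the mcap Python package.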
Specification and Implementation
While specifying the required metadata and formats would be sufficient, we strongly recommend providing implementations of the MCAP writer and reader, similar to the existing osi3trace/osi_trace.py script.
The benefits of providing such implementations, which are essentially thin wrappers, include:
- Easier adoption of the MCAP format.
- Could be integrated into OSI.
- Provides out-of-the-box support.
- Enhances user experience and usability.
- Automatic selection of topics based on message type (ensuring consistency).
- Automatic setting of timestamps to OSI timestamps (ensuring consistency).
- Automated handling of metadata.
- Avoids reinventing the wheel for tools like the OpenMSL trace file player and writer.
- Prevents errors when trying to store non-top-level messages.
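To illustrate how thin such a wrapper could be, here is a minimal sketch under stated assumptions. The class `OsiTraceWriter`, the `sink` callback, and the set `TOP_LEVEL_MESSAGES` (an illustrative subset only) are hypothetical, and the message classes are stand-ins for the generated `osi3` classes. A real implementation would pass the values on to an MCAP writer instead of a callback.

```python
from types import SimpleNamespace
from typing import Callable

# Illustrative subset only; a real list must follow the OSI release in use.
TOP_LEVEL_MESSAGES = {"GroundTruth", "SensorView", "SensorData"}


class OsiTraceWriter:
    """Thin wrapper that derives topic and log time and rejects other types."""

    def __init__(self, sink: Callable[[str, int, object], None]):
        self._sink = sink  # receives (topic, log_time_ns, message)

    def write(self, message) -> None:
        topic = type(message).__name__
        if topic not in TOP_LEVEL_MESSAGES:
            raise TypeError(f"{topic} is not a top-level OSI message")
        ts = message.timestamp
        self._sink(topic, ts.seconds * 1_000_000_000 + ts.nanos, message)


class SensorView:
    """Stand-in for osi3.SensorView."""

    def __init__(self, seconds: int, nanos: int):
        self.timestamp = SimpleNamespace(seconds=seconds, nanos=nanos)


class MovingObject:
    """Stand-in for a message type that is not top-level."""


written = []
writer = OsiTraceWriter(lambda topic, log_time, msg: written.append((topic, log_time)))
writer.write(SensorView(1, 500_000_000))

rejected = False
try:
    writer.write(MovingObject())
except TypeError as err:
    rejected = True
    print("rejected:", err)
print(written)
```

Keeping the topic choice, timestamp handling, and type check in one place is what would give all tools built on the wrapper consistent files.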
I will be out of the office for the next three weeks, but Clemens will be available to discuss this further. Upon my return, I plan to create a branch to provide such an implementation.
@jdsika let's discuss this next week.
Specification
- It shall be possible to put multiple channels of the same top-level message type into one MCAP file, e.g. two SensorData channels
- It shall be possible to also add non-osi data to the same MCAP
- File names shall be freely chosen
- The specification shall link to an example file
Library
- A utility library for reading and writing OSI trace files should be provided
- This library should be put into a separate repo to enable more flexible development
- The library should be able to process current .osi binary traces as well as "new" OSI MCAP trace files
- The library should be implemented in both Python and C++
- The library should be extendable with other features for OSI
@thomassedlmayer could you add the link to the example tracefiles here please? Are they MPL-2.0 licensed? I actually need the example quite quickly in order to test it with a visualization tool.
@jdsika https://github.com/GAIA-X4PLC-AAD/sensordata/tree/main/ars548/sampledata/119_pmsf_adc_cutout
License information is also given there. (Yes, it's MPL 2.)
Should we just copy the files? The repo is private.
Where do you want to copy the files to?
I'm not sure if we can make the repo public because there is some sensor specific code etc. in there. But I'll bring this up for discussion next week because I agree that the sample files should be publicly available.
But since the files are licensed as MPL 2.0 you can of course copy them (I suppose you have access.)
I think we should use the trace as a test case for the reader/writer utility