
Add orientation indicator to CameraSensorViewConfiguration (flip/rotate)

Open RIFJo opened this issue 3 years ago • 7 comments

Describe the feature

The raw image data provided by some sensors needs to be treated as mirrored and/or rotated before further processing or display on a screen. Real-world cameras use the EXIF Orientation tag to record this information (https://sirv.com/help/articles/rotate-photos-to-be-upright/#exif-orientation-values). This hint cannot currently be represented in OSI, so generators of CameraSensorView messages are forced to reorder the pixel values in CameraSensorView::image_data to match the sensor coordinate system, so that receivers are not left guessing how to interpret the image data. Instead of forcing this byte-reordering step, CameraSensorViewConfiguration should provide a way to record the orientation hint, indicating how the data in CameraSensorView::image_data should be interpreted.

Describe the solution you would like

Add an integer field "orientation" (required, default 1, range 1..8) to CameraSensorViewConfiguration to indicate how the image data should be interpreted (or how a sensor / sensor simulator should fill the image_data field). The integer values should correspond to the orientation values of the EXIF data (e.g. 1 = "image is in correct orientation", 4 = "image has been flipped back-to-front and is upside down", etc.).
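To make the proposed semantics concrete, here is a sketch of how a consumer might act on such an "orientation" value. The mapping follows the eight EXIF Orientation tag values; the function itself is hypothetical and not part of OSI or this proposal.

```python
def reorient(pixels, orientation):
    """Return `pixels` (a row-major list of rows) displayed upright,
    given an EXIF-style orientation value in 1..8."""
    def flip_h(p):   # mirror left-right
        return [list(reversed(row)) for row in p]

    def flip_v(p):   # mirror top-bottom
        return [list(row) for row in reversed(p)]

    def rot90cw(p):  # rotate 90 degrees clockwise
        return [list(row) for row in zip(*reversed(p))]

    def transpose(p):
        return [list(row) for row in zip(*p)]

    ops = {
        1: lambda p: [list(row) for row in p],  # already upright
        2: flip_h,                              # mirrored left-right
        3: lambda p: flip_h(flip_v(p)),         # rotated 180 degrees
        4: flip_v,                              # mirrored top-bottom
        5: transpose,                           # mirrored + rotated 270 CW
        6: rot90cw,                             # rotated 90 degrees CW
        7: lambda p: flip_v(rot90cw(p)),        # mirrored + rotated 90 CW
        8: lambda p: flip_v(transpose(p)),      # rotated 270 degrees CW
    }
    return ops[orientation](pixels)
```

With the default orientation = 1, the data passes through unchanged, which is what keeps existing consumers working.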

Describe alternatives you have considered

  • Using a separate boolean field for "mirrored" plus a rotation enum (ROTATE_0, ROTATE_90, ROTATE_180, ROTATE_270)
  • Using negative values in field_of_view_vertical and field_of_view_horizontal to indicate flipped data (this cannot express rotation)
  • Adding new channel formats to indicate flipped/rotated data

Describe the backwards compatibility

By choosing the default "orientation = 1", backwards compatibility is maintained.

Additional context

EXIF specification https://web.archive.org/web/20180921145139if_/http://www.cipa.jp:80/std/documents/e/DC-010-2017_E.pdf

Examples for flipped image data:

  • OpenGL image coordinates (0,0 is bottom left) vs "standard" image coordinates (0, 0 is top left)
  • Cameras used as mirror replacements
  • model-specific technical details of the camera/sensor chip
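As a small illustration of the first bullet: converting between OpenGL-style (origin bottom-left) and top-left-origin image data is just a vertical flip, i.e. reversing the row order of the flat buffer. A sketch, assuming a tightly packed row-major byte layout (the width/height/bytes-per-pixel parameters are illustrative, not OSI field names):

```python
def flip_rows(image_data: bytes, width: int, height: int,
              bytes_per_pixel: int) -> bytes:
    """Reverse the row order of a tightly packed, row-major image buffer,
    e.g. to convert between bottom-left and top-left origin conventions."""
    stride = width * bytes_per_pixel
    rows = [image_data[i * stride:(i + 1) * stride] for i in range(height)]
    return b"".join(reversed(rows))
```

This is exactly the byte-reordering step the orientation hint would let generators skip.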

RIFJo avatar Mar 16 '22 11:03 RIFJo

@kmeids I have successfully opened a pull request that passes the DCO signoff. The first one had the wrong author name in it.

RIFJo avatar Apr 28 '22 15:04 RIFJo

CCB 2022-05-02: The use case for this in regards to OSI SensorView (rather than native camera output formats, which are not OSI-based) is not very clear yet, and should be elaborated.

pmai avatar May 02 '22 07:05 pmai

> CCB 2022-05-02: The use case for this in regards to OSI SensorView (rather than native camera output formats, which are not OSI-based) is not very clear yet, and should be elaborated.

I guess it is a question of whether OSI SensorView wants to represent the output as it comes from the sensor, or to enforce some kind of restriction/reformatting for the message.

From my point of view, this orientation hint has nothing to do with native camera formats; it is just another piece of information about how image data can be formatted. The ChannelFormat enum has RGGB_32 and BGGR_32 - if you object to the orientation hint, I could, with the same argument, question the existence of this redundancy (and of other similar entries in this enum). The same goes for *_U32_LIN vs *_F32_LIN: in theory, only one of each pair is necessary, as the other can be produced by conversion or byte reordering.

The argument for the orientation hint is the same one that could be used to argue for the "redundant" entries in the ChannelFormat enum: make it easy and fast for generators of SensorView messages to fill the "CameraSensorView::image_data" field, without forcing them to reorder or recalculate a byte array.

RIFJo avatar May 02 '22 09:05 RIFJo

OSI SensorView is not expected to represent the output as it is coming from the sensor. It is expected to provide enough data from the environment simulation so that a sensor model can then model whatever would be coming from the sensor (and do other further processing, but that is not relevant here). So the expectation is not that SensorView already is the data from the sensor.

Similarly, the channel format is under control of the sensor model, i.e. it is requesting data in the format it wants, so this is not about making the job of the generator easier (every option is making this job harder), rather it is making the job of the model potentially easier.

As you point out, we currently probably even have too many options in the channel format (the RGGB vs BGGR redundancy) already; but that is probably not in itself a good reason to make the problem worse, if there is no other rationale.

That does not mean that there is not a good use case for this extension, but right now at least to me it is not yet very clear what the intended use case is:

  • For simulator image generators it is trivially easy to generate the data in the mandated formats (no reformatting is ever needed on that end).
  • For recorded data replay usually various conversions are needed to prepare the data to be suitable for a sensor model, hence realignment is usually not expensive, and usually is not done at simulation time.
  • For raw data replay into a sensor model, it is unclear whether OSI is even a preferred option, since you likely want to use the exact format (including encodings) that you would get directly from the chip, so using OSI SensorView for this seems unhelpful.

So really the major reason to have orientation (and then likely mostly reflection rather than rotation, practically speaking) configurable by the sensor model would be to enable sensor models to use code that is closer to series code when working with an imager chip that provides data in e.g. reflected format, without having to realign the data internally.

So if that is the use case, this should probably be made more explicit; and then it probably would make more sense to have a more generic solution (i.e. an extensible enum), since other realignments (e.g. interlaced or striped image formats) might be wanted in the future.

The point by the CCB was that this probably needs to come with more of the rationale being made explicit, e.g. through discussion in the sensor modelling working group, before the PR becomes ReadyForCCB.

pmai avatar May 02 '22 13:05 pmai

Thanks, I get what you are saying, but still think I can explain why I have a valid use case here. If you invite me, I will join one of the next working group discussions to make my case.

RIFJo avatar May 03 '22 07:05 RIFJo

@kmeids : Please pay attention to that topic and maybe invite @RIFJo .

ThomasNaderBMW avatar May 16 '22 07:05 ThomasNaderBMW

We have concluded in the sensor modelling discussions that adding mirroring (without rotations) to the pixel layout is useful. I will prepare a new pull request that incorporates all participants' input.

RIFJo avatar Dec 15 '22 14:12 RIFJo

Can be closed as completed.

jdsika avatar Apr 25 '24 11:04 jdsika