Add a handler that appends a static date field to all outbound messages
Some feeds only provide timestamps as duration past midnight, assuming prior context of what day the processed messages are from. Since the messages themselves don't have this context, it would be useful to be able to append a statically specified date, convert that to Unix milliseconds, and append it as a manufactured int field to outbound messages. This would reduce friction for downstream SQL analytics on messages ingested from these feeds.
e.g. --message_handlers AppendDateHandler:date=20291102
would turn that date into its UNIX seconds equivalent, and append that as a column to all outbound messages.
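A rough sketch of what the conversion could look like, not an implementation against the transcoder's actual handler API. The function name `append_static_date` and the field name `date_epoch_seconds` are made up for illustration, and the date is assumed to mean midnight UTC (a feed may really mean exchange-local midnight):

```python
from datetime import datetime, timezone


def append_static_date(message: dict, date: str,
                       field: str = "date_epoch_seconds") -> dict:
    """Return a copy of `message` with the static date appended as UNIX seconds.

    `date` is a YYYYMMDD string, e.g. the value passed as date=20291102.
    Assumption: the date is interpreted as midnight UTC.
    """
    midnight = datetime.strptime(date, "%Y%m%d").replace(tzinfo=timezone.utc)
    out = dict(message)  # leave the inbound message untouched
    out[field] = int(midnight.timestamp())
    return out


# Hypothetical message; only the appended field matters here.
print(append_static_date({"symbol": "ABC", "price": 101}, "20291102"))
```

An int field keeps the value directly usable in downstream SQL, e.g. with a seconds-to-timestamp conversion function.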
This could be an additional option to the timestamp pull forward handler.
https://github.com/GoogleCloudPlatform/market-data-transcoder/blob/main/transcoder/message/handler/TimestampPullForwardHandler.py
Good point. Now that handlers are somewhat configurable, this could just become a TimestampHandler with several modes, e.g. pull forward from Seconds message, append static date, manufacture a single timestamp column from a nanos timestamp + date, etc.
Algorithmically, normalizing dates from low-context streams (e.g. messages providing only nanos past midnight) might look something like:
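from datetime import datetime, timezone
# YYYYMMDD from MessageHandler params; fromisoformat() only accepts this compact form on Python 3.11+
day = datetime.strptime('20191230', '%Y%m%d').replace(tzinfo=timezone.utc)
day_epoch = int(day.timestamp())  # UNIX seconds at midnight (UTC assumed, not the machine's local zone)
midnight_in_nanos = day_epoch * 1_000_000_000  # int math, no float rounding
epoch_nanos = midnight_in_nanos + msg['nanos_past_midnight']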
Then the handler can manufacture any combination of those values as fields, or unify them into a single field (like ts_epoch_nanos).
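To make the "any combination or a single unified field" idea concrete, here is a hedged sketch of a helper a handler could call per message. The function name, the `unify` flag, and the field names other than `ts_epoch_nanos` and `nanos_past_midnight` are hypothetical, and midnight UTC is assumed again:

```python
from datetime import datetime, timezone

NANOS_PER_SECOND = 1_000_000_000


def manufacture_timestamp_fields(msg: dict, date: str, unify: bool = True) -> dict:
    """Return a copy of `msg` with manufactured timestamp fields appended.

    `date` is YYYYMMDD from the handler params (assumed midnight UTC).
    `msg` must carry an integer 'nanos_past_midnight' field.
    """
    midnight = datetime.strptime(date, "%Y%m%d").replace(tzinfo=timezone.utc)
    day_epoch = int(midnight.timestamp())
    epoch_nanos = day_epoch * NANOS_PER_SECOND + msg["nanos_past_midnight"]

    out = dict(msg)
    out["ts_epoch_nanos"] = epoch_nanos
    if not unify:
        # Also expose the intermediate values as separate columns.
        out["date_epoch_seconds"] = day_epoch
        out["midnight_epoch_nanos"] = day_epoch * NANOS_PER_SECOND
    return out


# 09:30:00 past midnight on 2019-12-30, expressed in nanoseconds.
sample = {"nanos_past_midnight": 34_200 * NANOS_PER_SECOND}
print(manufacture_timestamp_fields(sample, "20191230"))
```

Keeping everything as Python ints avoids the precision loss that a float seconds value would pick up once multiplied out to nanoseconds.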