feat: Utilities to assist in retrieving data from tables
Adds utilities to make it easier to pull data from a table.
There is a KeyedRecordAdapter that leverages the ToMapListener to allow pulling data by key (e.g. USym):
final KeyedRecordAdapter<String, MyTradeHolder> keyedRecordAdapter = KeyedRecordAdapter.makeKeyedRecordAdapter(
        tradesTable,
        myTradeHolderRecordAdapterDescriptor,
        "USym",
        String.class
);
final MyTradeHolder lastAaplTrade = keyedRecordAdapter.getRecord("AAPL");
final Map<String, MyTradeHolder> lastTradesBySymbol = keyedRecordAdapter.getRecords("AAPL", "CAT", "SPY");
And a TableToRecordListener that lets you listen to updates as objects instead of just row keys:
final Consumer<MyTradeHolder> tradeConsumer = trade -> System.out.println(trade.toString());
TableToRecordListener.create(tradesTable, myTradeHolderRecordAdapterDescriptor, tradeConsumer);
The mapping of columns to fields in the target object is defined by a RecordAdapterDescriptor:
RecordAdapterDescriptor<MyTradeHolder> myTradeHolderRecordAdapterDescriptor = RecordAdapterDescriptorBuilder
        .create(MyTradeHolder::new)
        .addColumnAdapter("Sym", RecordUpdater.getStringUpdater(MyTradeHolder::setSym))
        .addColumnAdapter("Price", RecordUpdater.getDoubleUpdater(MyTradeHolder::setPrice))
        .addColumnAdapter("Size", RecordUpdater.getIntUpdater(MyTradeHolder::setSize))
        .addColumnAdapter("Timestamp", RecordUpdater.getReferenceTypeUpdater(DBDateTime.class, MyTradeHolder::setTimestamp))
        .build();
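To illustrate the idea of a descriptor (a record factory plus one setter per column), here is a minimal standalone sketch. It is not the PR's implementation: `SimpleDescriptor`, `toRecord`, and the shape of `MyTradeHolder` are assumptions made for illustration, and it passes boxed values through a `Map` where the real `RecordUpdater` variants appear to be typed per primitive.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Supplier;

public class DescriptorSketch {
    /** Assumed shape of the record class; the PR description does not show MyTradeHolder. */
    static final class MyTradeHolder {
        private String sym;
        private double price;
        private int size;

        void setSym(String sym) { this.sym = sym; }
        void setPrice(double price) { this.price = price; }
        void setSize(int size) { this.size = size; }
        String getSym() { return sym; }
        double getPrice() { return price; }
        int getSize() { return size; }
    }

    /** Hypothetical, simplified descriptor: a record factory plus one setter per column. */
    static final class SimpleDescriptor<R> {
        private final Supplier<R> factory;
        private final Map<String, BiConsumer<R, Object>> adapters = new LinkedHashMap<>();

        SimpleDescriptor(Supplier<R> factory) {
            this.factory = factory;
        }

        <T> SimpleDescriptor<R> addColumnAdapter(String column, Class<T> type, BiConsumer<R, T> setter) {
            // Cast the raw cell value to the declared column type, then hand it to the setter.
            // A null cell would fail for primitive setters; a real adapter needs a null policy.
            adapters.put(column, (rec, value) -> setter.accept(rec, type.cast(value)));
            return this;
        }

        /** Creates an empty record and applies every column adapter to one row of values. */
        R toRecord(Map<String, Object> row) {
            final R rec = factory.get();
            adapters.forEach((column, adapter) -> adapter.accept(rec, row.get(column)));
            return rec;
        }
    }

    static SimpleDescriptor<MyTradeHolder> buildDescriptor() {
        return new SimpleDescriptor<MyTradeHolder>(MyTradeHolder::new)
                .addColumnAdapter("Sym", String.class, MyTradeHolder::setSym)
                .addColumnAdapter("Price", Double.class, MyTradeHolder::setPrice)
                .addColumnAdapter("Size", Integer.class, MyTradeHolder::setSize);
    }

    public static void main(String[] args) {
        final Map<String, Object> row = Map.of("Sym", "AAPL", "Price", 1.1, "Size", 100);
        final MyTradeHolder trade = buildDescriptor().toRecord(row);
        System.out.println(trade.getSym() + " " + trade.getPrice() + " " + trade.getSize());
    }
}
```

Keeping the mapping as data (column name to setter) is what lets the same descriptor drive both a single-row and a multi-row adapter.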
The RecordAdapterDescriptor can be used to create a SingleRowRecordAdapter or a MultiRowRecordAdapter. A single-row adapter creates an object (MyTradeHolder or JsonNode or whatever) and reads data from the columns directly into that object. A multi-row adapter has more steps but is intended to be efficient and chunk-oriented. It will:
- Create arrays to hold data for all of the columns.
- Read the data for all columns into those arrays (by chunk).
- Create an array of records (e.g. MyTradeHolder[], JsonNode[], whatever) and fill it with new empty records.
- Populate all of the records with the data from the arrays, one column at a time.
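The four steps above can be sketched with plain Java arrays. This is only an illustration under stated assumptions: the "table" is three hard-coded column arrays, `Trade`, `readRecords`, and `chunkSize` are hypothetical names, and `System.arraycopy` stands in for the chunked column reads that the real MultiRowRecordAdapter performs against table column sources.

```java
public class ChunkedRecordFillSketch {
    /** Hypothetical stand-in for the record type (not the PR's MyTradeHolder). */
    static final class Trade {
        String sym;
        double price;
        int size;
    }

    // Hypothetical stand-in for a table: one array per column.
    static final String[] SYM_COL = {"AAPL", "CAT", "SPY", "AAPL", "SPY"};
    static final double[] PRICE_COL = {1.1, 2.2, 3.3, 1.2, 3.4};
    static final int[] SIZE_COL = {100, 200, 300, 150, 350};

    static Trade[] readRecords(int firstRow, int nRows, int chunkSize) {
        // Step 1: create arrays to hold the data for all of the columns.
        final String[] syms = new String[nRows];
        final double[] prices = new double[nRows];
        final int[] sizes = new int[nRows];

        // Step 2: read every column into those arrays, one chunk at a time.
        for (int offset = 0; offset < nRows; offset += chunkSize) {
            final int len = Math.min(chunkSize, nRows - offset);
            System.arraycopy(SYM_COL, firstRow + offset, syms, offset, len);
            System.arraycopy(PRICE_COL, firstRow + offset, prices, offset, len);
            System.arraycopy(SIZE_COL, firstRow + offset, sizes, offset, len);
        }

        // Step 3: create the array of records and fill it with new empty records.
        final Trade[] records = new Trade[nRows];
        for (int i = 0; i < nRows; i++) {
            records[i] = new Trade();
        }

        // Step 4: populate the records from the arrays, one column at a time.
        for (int i = 0; i < nRows; i++) {
            records[i].sym = syms[i];
        }
        for (int i = 0; i < nRows; i++) {
            records[i].price = prices[i];
        }
        for (int i = 0; i < nRows; i++) {
            records[i].size = sizes[i];
        }
        return records;
    }

    public static void main(String[] args) {
        // Read rows 1..3 of the stand-in table in chunks of two rows.
        for (Trade t : readRecords(1, 3, 2)) {
            System.out.println(t.sym + " " + t.price + " " + t.size);
        }
    }
}
```

Separating the column reads (steps 1 and 2) from record population (steps 3 and 4) keeps the table access short and column-oriented; the row-oriented objects are built afterwards from the already-copied arrays.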
A record adapter instance can be generated based on the descriptor. For example, com.illumon.iris.db.util.dataadapter.rec.json.JsonRecordAdapterUtil#createJsonRecordAdapterDescriptor(t, "Col1", "Col2", "Col3").createMultiRowRecordAdapter(t) would figure out the types of "Col1"/"Col2"/"Col3" and generate a class that populates ObjectNodes with those fields.
There is also a draft Python version that lets you create data structures by passing data to either a function or straight to some type's constructor:
class StockTrade:
    sym: str
    price: float
    size: int
    exch: str

    def __init__(self, sym: str, price: float, size: int, exch: str):
        self.sym = sym
        self.price = price
        self.size = size
        self.exch = exch

keyed_record_adapter: KeyedRecordAdapter[str, StockTrade] = KeyedRecordAdapter(
    source,
    StockTrade,
    'Sym',
    ["Price", "Size", "Exch"]
)
records = keyed_record_adapter.get_records(["AAPL", None])
aapl_trade = records['AAPL']
self.assertEqual('AAPL', aapl_trade.sym)
self.assertEqual(1.1, aapl_trade.price)
self.assertEqual(10_000_000_000, aapl_trade.size)
self.assertEqual('ARCA', aapl_trade.exch)
Random notes:
- This is a generic version of a utility we previously used for latency monitoring (that's where the "create arrays to hold data, consistently read chunks from columns and copy the chunks to the arrays, then copy the data from the arrays back into row-oriented data structures" part comes from)
- The MultiRowRecordAdapter is responsible for the real work of pulling data out of tables; both KeyedRecordAdapter and TableToRecordListener use that. (KeyedRecordAdapter also uses SingleRowRecordAdapter when fetching single keys.)
- The Python version (keyed_record_adapter.py/PyKeyedRecordAdapter.java) only supports strings as keys and has no listening component.
- (The Java unit tests pass. The Python unit tests fail on primitive keys.)
https://github.com/deephaven/deephaven.io/issues/1715
I have not looked at this PR recently, so I don't remember the contents. DHC has added functionality to help users extract data. I'm noting it here in case it makes this PR no longer needed.
https://github.com/deephaven/deephaven-core/pull/5595
https://deephaven.io/core/docs/how-to-guides/iterate-table-data/