fsttable
An interface to fast on-disk data tables stored with the fst format
Seems like a great package for handling large datasets.
I would like to rbind two fsttable objects, or a single fsttable with a data.frame. What would be the preferred method? ``` library(fsttable) library(data.table) ft1
For example, we can have a `fst_remote` package that implements the _fst_ format as a _remote table_. That structure could be easily modified to have the `fst_remote` package running on...
After some tweaking, the design is now as follows:
* The _data_table_interface_ class (defined [here](../blob/develop/R/data_table_interface.R)) acts as a wrapper around the controller object of class _table_proxy_ (defined [here](../blob/develop/R/table_proxy.R)).
* The...
Some operations might require temporary files. For example, a large _slice map_ or a newly created column can be stored in an additional `fst` file. When these potentially large temp...
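One possible way to manage such temporary files, sketched in base R only. The helpers `new_temp_handle()` and `release_temp()` are hypothetical and not part of fsttable; the idea is to tie the file to an environment so that it is removed when that environment is garbage collected, while still allowing explicit cleanup.

```r
# Hypothetical helper: create a temp file tied to the lifetime of a handle.
new_temp_handle <- function(ext = ".fst") {
  handle <- new.env()
  handle$path <- tempfile(fileext = ext)
  file.create(handle$path)
  # Delete the file when the handle is garbage collected or when R exits.
  reg.finalizer(handle, function(e) unlink(e$path), onexit = TRUE)
  handle
}

# Explicit cleanup for callers that do not want to wait for the collector.
release_temp <- function(handle) {
  unlink(handle$path)
  invisible(!file.exists(handle$path))
}

h <- new_temp_handle()
existed <- file.exists(h$path)   # the file is on disk while the handle lives
released <- release_temp(h)      # remove it as soon as it is no longer needed
```

The finalizer is a safety net; for potentially large files, explicit release is probably the better default because garbage collection timing is not predictable.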
The table object should determine when the file it points to is deleted. At a minimum, this should happen when a query arrives that requires reading from disk, so that...
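Reading this as "detect that the source file has been deleted", a minimal guard could look like the sketch below. `check_source()` is a hypothetical name, not an fsttable function; it would be called before any query that has to read from disk.

```r
# Hypothetical guard run before any query that has to touch the disk.
check_source <- function(path) {
  if (!file.exists(path)) {
    stop("source file no longer exists: ", path, call. = FALSE)
  }
  invisible(path)
}

# Demonstration with a throwaway file standing in for an fst file.
path <- tempfile(fileext = ".fst")
file.create(path)
ok <- identical(check_source(path), path)   # file present: guard passes

unlink(path)                                # simulate deletion by the user
msg <- tryCatch(check_source(path), error = function(e) conditionMessage(e))
```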
But it will not be _evaluated_ as in `data.table`. That's because the _datatableinterface_ object doesn't really contain any data, only references to on-disk data. So we can't actually evaluate...
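To illustrate the idea of keeping an expression unevaluated, here is a small base R sketch. The class `lazy_interface` and the constructor `make_interface()` are made up for this example and do not reflect the actual fsttable code; the point is only that `substitute()` captures the row expression as a language object instead of evaluating it.

```r
# Hypothetical stand-in for a table interface: it holds only metadata,
# never the column data itself.
make_interface <- function(path, colnames) {
  structure(list(path = path, colnames = colnames, filter = NULL),
            class = "lazy_interface")
}

# Capture the row-selection expression unevaluated and store it for later.
`[.lazy_interface` <- function(x, i, ...) {
  if (!missing(i)) {
    x$filter <- substitute(i)   # a language object, not a logical vector
  }
  x
}

ft <- make_interface("data.fst", c("A", "B"))
res <- ft[A > 3]                # no column A in scope, yet no error is raised

# Only once data has actually been read can the expression be evaluated.
chunk <- data.frame(A = 1:5, B = letters[1:5])
keep <- eval(res$filter, chunk)
```

Deferring evaluation this way means the stored expression can be inspected or combined with later operations before anything is read from disk.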
A _table proxy_ has a specific state. That state reflects the current table that it represents. But no actual data is read until necessary, so the state is a collection...
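A rough sketch of what such a state could look like, under the assumption that it tracks column names, a row count and a slice map. All names here (`new_proxy_state()`, `select_cols()`, `select_rows()`) are hypothetical; the real `table_proxy` may be organized differently.

```r
# Hypothetical proxy state: metadata only, no column data.
new_proxy_state <- function(path, colnames, nrow) {
  list(path = path, colnames = colnames, nrow = nrow, slice = NULL)
}

# Selecting columns only updates the recorded column names.
select_cols <- function(state, cols) {
  stopifnot(all(cols %in% state$colnames))
  state$colnames <- cols
  state
}

# Selecting rows composes the new index with any existing slice map,
# so the stored indices always refer to rows of the on-disk table.
select_rows <- function(state, idx) {
  stopifnot(all(idx >= 1L), all(idx <= state$nrow))
  current <- if (is.null(state$slice)) seq_len(state$nrow) else state$slice
  state$slice <- current[idx]
  state$nrow <- length(idx)
  state
}

s <- new_proxy_state("data.fst", c("A", "B", "C"), nrow = 10L)
s <- select_rows(s, 6:10)        # rows 6..10 of the file
s <- select_rows(s, c(1L, 3L))   # first and third of those: file rows 6 and 8
s <- select_cols(s, "B")
```

Nothing is read from `data.fst` at any point; the state alone describes which rows and columns a later read would need.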
Currently the `datatableinterface` inherits from `data.table`, and stores its `table_proxy` object in a `data.table` cell. This allows the `table_proxy` object to be updated in-place. I believe this functionality should be...