Data explorer does not render the timestamp datatype
I've tried the Data explorer on a bunch of test files, and it looks great. There is a small issue with the rendering of timestamp (the native date-time datatype in Parquet): it is rendered as a numeric value, not as a date time. I'm attaching a test file (Parquet) and screenshots from both the Data explorer and DuckDB.
Hey @trygu,
Thanks for the feedback!
Seems like an easy enough fix!
@garronej I don't want to dampen your spirits, but it's not that simple. Clients (pyarrow, R arrow, Spark...) manage datetime differently. So, the content of a parquet file will depend on the client used to create it. This is a problem that @pengfei99 had documented here. Don't you remember, Jo? 😉
I think the problem is how a date-time field should be represented. My suggestion is to render it in a fixed ISO 8601 format, just like DuckDB does (and maybe, at a later stage, to make the date-time formatting a configurable option).
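To illustrate the fixed ISO 8601 rendering suggested above, here is a minimal sketch in Python. It assumes the raw cell value is an integer count since the Unix epoch and that the unit (seconds, milliseconds, microseconds or nanoseconds) is already known, for instance from the file's schema. The function name `epoch_to_iso8601` is hypothetical, and treating every value as UTC is an assumption, since a Parquet timestamp may or may not be UTC-adjusted.

```python
from datetime import datetime, timedelta, timezone

# Number of ticks per second for each supported unit.
_UNITS_PER_SECOND = {"s": 1, "ms": 1_000, "us": 1_000_000, "ns": 1_000_000_000}
_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_to_iso8601(value: int, unit: str = "us") -> str:
    """Render an integer epoch offset as an ISO 8601 string (assumed UTC)."""
    per_second = _UNITS_PER_SECOND[unit]
    # Integer arithmetic avoids float rounding on large values.
    seconds, remainder = divmod(value, per_second)
    # Python datetimes have microsecond resolution, so nanoseconds are truncated.
    micros = remainder * 1_000_000 // per_second
    return (_EPOCH + timedelta(seconds=seconds, microseconds=micros)).isoformat()


print(epoch_to_iso8601(1_700_000_000_000, "ms"))  # 2023-11-14T22:13:20+00:00
```

The same idea carries over to whatever language the grid is written in; the important part is that the unit comes from the schema rather than from the value itself.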
@garronej The Parquet format is the easiest one to work with. For the CSV format, you will have more trouble. For example, if a file is encoded with windows-1252 and you try to open it as UTF-8, all the special characters will be wrongly interpreted. You will likely encounter this issue eventually.
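The encoding mismatch described above can be reproduced in a few lines of Python. This is only an illustration with a made-up sample string, not a description of how the Data explorer reads CSV files.

```python
# A made-up sample value containing a non-ASCII character.
original = "Café"

# Bytes as they would be stored in a windows-1252 (cp1252) encoded CSV.
raw = original.encode("cp1252")

# Decoding those bytes as UTF-8 fails on the accented character;
# errors="replace" substitutes U+FFFD instead of raising an exception.
misread = raw.decode("utf-8", errors="replace")

# Decoding with the encoding that was actually used restores the text.
correct = raw.decode("cp1252")

print(misread == original)  # False
print(correct == original)  # True
```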
I agree, and I'm not sure it's that important to fix this for CSVs. As you say, they can be encoded (and not encoded) in a lot of different ways, but a date time is seldom stored in them as a large integer.
I think the most pressing issue is that the date-time type in Parquet is displayed as a raw epoch value. There are several strategies to GUESS that a value is a date time, but the best option by far would be to get this information from the metadata of the Parquet file when it is read, since the grid itself does not have this information. I also noticed that there is a milestone related to the File explorer. I would love this issue to be part of that milestone as well (nudge @fcomte).
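For reference, one of the guessing strategies mentioned above could look like the sketch below: check whether every integer in a column falls in a plausible date range for some unit. The function name `guess_epoch_unit` and the 1990 to 2100 window are arbitrary assumptions, and this is only a fallback idea; reading the type from the Parquet metadata remains the reliable route.

```python
from datetime import datetime, timezone

# Plausibility window in seconds since the Unix epoch (arbitrary assumption).
_LOW = int(datetime(1990, 1, 1, tzinfo=timezone.utc).timestamp())
_HIGH = int(datetime(2100, 1, 1, tzinfo=timezone.utc).timestamp())
_UNITS = {"s": 1, "ms": 1_000, "us": 1_000_000, "ns": 1_000_000_000}


def guess_epoch_unit(values):
    """Return the unit for which all values look like dates, or None."""
    values = list(values)  # accept any iterable, including generators
    if not values:
        return None
    for unit, per_second in _UNITS.items():
        if all(_LOW <= v // per_second < _HIGH for v in values):
            return unit
    return None


print(guess_epoch_unit([1_700_000_000_000, 1_700_000_060_000]))  # ms
print(guess_epoch_unit([42, 43]))  # None
```

A heuristic like this misfires on legitimate large integers (identifiers, counters) and on dates outside the window, which is why the schema should take precedence whenever it is available.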