
Binary incompatibility. Expected 96 from C header, got 88 from PyObject

Open agricolab opened this issue 3 years ago • 0 comments

Following the examples, I ran GLOG_alsologtostderr=1 river-ingester -h 127.0.0.1 -o river_streams to ingest the data. After that, ls -R river_streams ran fine, too. I then tried python -c 'import pandas as pd; print(pd.read_parquet("river_streams/<your stream name>/data.parquet"))' (replacing <your stream name> with an existing stream name as read from the ls output). I had to conda install pandas and conda install fastparquet into the clean environment. After that, running the snippet raised

Traceback (most recent call last):
  File "<privatefolder>/anaconda3/envs/river/lib/python3.10/site-packages/fastparquet/api.py", line 135, in _parse_header
    fmd = read_thrift(f, parquet_thrift.FileMetaData)
  File "<privatefolder>/anaconda3/envs/river/lib/python3.10/site-packages/fastparquet/thrift_structures.py", line 25, in read_thrift
    obj.read(pin)
  File "<privatefolder>/anaconda3/envs/river/lib/python3.10/site-packages/fastparquet/parquet_thrift/parquet/ttypes.py", line 1929, in read
    iprot._fast_decode(self, iprot, [self.__class__, self.thrift_spec])
SystemError: PY_SSIZE_T_CLEAN macro must be defined for '#' formats

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<privatefolder>/anaconda3/envs/river/lib/python3.10/site-packages/pandas/io/parquet.py", line 493, in read_parquet
    return impl.read(
  File "<privatefolder>/anaconda3/envs/river/lib/python3.10/site-packages/pandas/io/parquet.py", line 345, in read
    parquet_file = self.api.ParquetFile(path, **parquet_kwargs)
  File "<privatefolder>/anaconda3/envs/river/lib/python3.10/site-packages/fastparquet/api.py", line 100, in __init__
    self._parse_header(fn, verify)
  File "<privatefolder>/anaconda3/envs/river/lib/python3.10/site-packages/fastparquet/api.py", line 138, in _parse_header
    self.fn)
AttributeError: 'ParquetFile' object has no attribute 'fn'

Coincidentally, when I then tried rerunning the writer example, it threw

Traceback (most recent call last):
  File "<privatefolder>/JOSS/river/writer.py", line 1, in <module>
    import river
  File "river.pyx", line 1, in init river
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

This might be linked?

Anyway, it seems my installation is broken now. Neither restarting redis-server nor reinstalling numpy with conda install numpy --force-reinstall fixed the issue.
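
For anyone trying to reproduce this, the versions in the environment are probably the most useful detail, since a "numpy.ndarray size changed" error usually points to a compiled extension built against a different NumPy version than the one currently installed. Below is a minimal sketch for collecting them. The helper name installed_versions is hypothetical, and the distribution names (in particular "river") are assumptions that may not match how the packages are registered in a given environment.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version or None} without importing the packages.

    Reading metadata avoids triggering the import-time ValueError
    that a broken compiled extension would raise.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed under this name
    return versions


if __name__ == "__main__":
    # Distribution names are assumptions; adjust to the environment.
    names = ["numpy", "pandas", "fastparquet", "river"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```

Because the sketch only reads package metadata, it should still run in an environment where import river itself fails.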

agricolab, Aug 11 '22 08:08