traildb-python Use CFFI instead of ctypes

CFFI would let us use traildb with PyPy, which should improve performance in general. CFFI is also faster than ctypes even on CPython, and it releases the GIL around C calls, which allows true parallelism, as one would expect from a C library that performs I/O. On CPython, the only thing faster than CFFI is a C extension. We could provide both a C extension and a CFFI version, or keep only the CFFI version.

May 26 '16 20:05 thedrow

@thedrow yes, CFFI would make sense. The only reason it uses ctypes now is that ctypes is in the standard library and CFFI is not.

I wonder how hard it would be to support both: use CFFI if it is available, otherwise fall back to ctypes.
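
Roughly, the fallback could be an import-time check like the sketch below. The `BACKEND` name is only illustrative and is not part of traildb-python. Each backend would still need its own binding code behind it.

```python
# Hypothetical backend selection: prefer CFFI when it is installed,
# otherwise fall back to ctypes from the standard library.
try:
    import cffi  # third-party package, may be missing
    BACKEND = "cffi"
except ImportError:
    import ctypes  # always available on CPython
    BACKEND = "ctypes"

print("using backend:", BACKEND)
```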

May 26 '16 20:05 tuulos

It's not hard, just unnecessary. It would double the implementation time for new features and bug fixes. Having CFFI as a requirement is not that bad. Many projects do it, and CFFI is installed automatically when you specify it as a dependency in setup.py.

May 27 '16 19:05 thedrow

Right. It would be interesting to benchmark CFFI vs. ctypes. If there's a noticeable difference, it makes sense to use CFFI instead of ctypes.

The Wikipedia example from the tutorial, https://github.com/traildb/traildb-python/blob/master/examples/tutorial_wikipedia_sessions.py would be an interesting test case. With the current binding it takes hours to run over all Wikipedia data. It uses only a small subset of the API, so the benchmark could be run without having to port the whole API.
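
As a cheap first step before porting the tutorial, a call-overhead microbenchmark might look like the sketch below. It is an assumption-laden stand-in, not the Wikipedia benchmark: it times libc's `abs` rather than any traildb function, assumes a POSIX system, and skips CFFI when it is not installed.

```python
import ctypes
import ctypes.util
import timeit

# ctypes: load the C library (falls back to the current process on POSIX
# when find_library returns None) and declare the signature of abs().
libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int
candidates = {"ctypes": libc.abs}

# cffi (ABI mode): optional, only measured when the package is importable.
try:
    import cffi
    ffi = cffi.FFI()
    ffi.cdef("int abs(int);")
    cffi_libc = ffi.dlopen(None)  # standard C library on POSIX
    candidates["cffi"] = cffi_libc.abs
except ImportError:
    pass

def time_calls(func, number=100000):
    """Return the seconds spent calling func(-7) `number` times."""
    return timeit.timeit(lambda: func(-7), number=number)

results = {name: time_calls(func) for name, func in candidates.items()}
for name, seconds in sorted(results.items()):
    print("%s: %.3f s" % (name, seconds))
```

This only measures per-call overhead. The tutorial script would show whether that overhead matters for a real workload.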

Please submit a PR :)

May 29 '16 03:05 tuulos

@tuulos The reasons I'd like to see CFFI are:

Code that is independent of Python 2 vs. Python 3
Independence from the library version and ABI
A pure-Python library that can run on any OS and against several versions of the underlying library

You can see an example in the https://github.com/zeromq/pyzmq project, which has a pure-Python wheel that works on any OS (*BSD, OS X, Linux, Windows) and with different versions of the ZMQ library.

Jun 11 '16 15:06 eirnym