
Add cache functionality

Open pdvass opened this issue 3 years ago • 2 comments

The proposed changes are based on Crossref's REST API documentation recommendation of caching the responses.

NOTE: The cache that unittest creates is persistent. To tackle that, I have written .bat and .sh files to run develop and test from setup.py, then delete the cache created. If needed, I can provide them too.

pdvass avatar Mar 21 '23 18:03 pdvass

Hello @pdvass

Thanks for the commit.

From my point of view, what we should cache is the HTTP requests made to the Crossref API.

So I think any cache implementation should live in the HTTPRequest class, specifically in the do_http_request method.

Besides that, I'm not sure about the benefit of including a caching implementation in this library, since it is mostly used for data harvesting and the chance of reusing or benefiting from a cache is almost null, but I could be wrong. Even so, I am fine with including a cache layer on the terms described above.
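
To make the suggestion concrete, here is a minimal sketch of caching at the HTTP request level. It is not crossrefapi code: `CachingRequester`, `cache_key`, and the `fetch` callable are hypothetical names, and the `(method, url, params)` signature is an assumption rather than the real signature of `do_http_request`.

```python
import hashlib
import json
from typing import Any, Callable, Dict, Optional


def cache_key(method: str, url: str, params: Optional[Dict[str, Any]] = None) -> str:
    """Build a stable key from the parts of a request that determine its response."""
    # sort_keys makes the key independent of parameter ordering.
    payload = json.dumps([method.upper(), url, params or {}], sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class CachingRequester:
    """Hypothetical wrapper around an HTTP fetch function (not part of crossrefapi)."""

    def __init__(self, fetch: Callable[[str, str, Dict[str, Any]], str]):
        self._fetch = fetch                 # performs the real HTTP call
        self._store: Dict[str, str] = {}    # in-memory cache: key -> response body
        self.hits = 0
        self.misses = 0

    def do_http_request(self, method: str, url: str,
                        params: Optional[Dict[str, Any]] = None) -> str:
        # Assumed signature, used only for illustration.
        key = cache_key(method, url, params)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        body = self._fetch(method, url, params or {})
        self._store[key] = body
        return body
```

Because the cache sits around the single place where requests are issued, every endpoint class built on top of it would get caching without further changes. A real version would probably cache only successful GET responses.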

fabiobatalha avatar Apr 13 '23 00:04 fabiobatalha

Hello @fabiobatalha

Thank you for your feedback.

The main motivation behind this PR is that I had to create a dataset whose final size I didn't know in advance. My approach was to wait for the responses, convert them, and then save them, but that conversion was expensive for a larger dataset. With caching, I was able to save the responses and later add more, without repeating the ones I already had, when needed. I also find it easier to transfer the cache from device to device, so I don't have to rerun everything, which saves time.

As for the implementation details, this is how I managed to get it working, because I needed a way to choose a backend.
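
For readers following along, a rough sketch of what "choosing a backend" could look like is below. This is not the code from the PR: `MemoryBackend`, `SQLiteBackend`, `make_backend`, and the default file name are all hypothetical, and only the standard library is used. The SQLite variant stores everything in a single file, which is what makes the cache easy to copy between devices.

```python
import sqlite3
from typing import Dict, Optional


class MemoryBackend:
    """Cache that lives only for the lifetime of the process."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


class SQLiteBackend:
    """Cache persisted to a single file that can be moved between machines."""

    def __init__(self, path: str) -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS responses "
            "(key TEXT PRIMARY KEY, body TEXT NOT NULL)"
        )
        self._conn.commit()

    def get(self, key: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT body FROM responses WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None

    def set(self, key: str, value: str) -> None:
        # Overwrite any earlier response stored under the same key.
        self._conn.execute(
            "INSERT OR REPLACE INTO responses (key, body) VALUES (?, ?)",
            (key, value),
        )
        self._conn.commit()

    def close(self) -> None:
        self._conn.close()


def make_backend(name: str, path: str = "crossref_cache.sqlite"):
    """Pick a backend by name; the names and default path are illustrative."""
    if name == "memory":
        return MemoryBackend()
    if name == "sqlite":
        return SQLiteBackend(path)
    raise ValueError("unknown cache backend: %r" % name)
```

With the file-based backend, a harvest can be stopped, extended later, or resumed on another machine by copying the cache file, and only requests that are not already stored need to hit the API again.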

pdvass avatar Apr 13 '23 09:04 pdvass