AI (AutoML) feature not working on recent builds - hash mismatch
Recent changes seem to have 'broken' the AI feature. Regular ML algorithms can be run, but the "AI" button in the upper right corner of each database on the "Databases" dashboard seems to be permanently inactive.
I have tested this on both a macOS laptop and a Raspberry Pi 400. Interestingly, although the AI feature doesn't work on either of them, an informative error message is only given on the Raspberry Pi.
Excerpt from the Raspberry Pi logs:
...
lab_1 | 1|ai | surprise_recommenders: INFO: setting training data...
lab_1 | 1|ai | base: INFO: updating hash_2_param...
lab_1 | 1|ai | base: INFO: storing parameter hash...
lab_1 | 1|ai | surprise_recommenders: INFO: append and drop dupes
lab_1 | 1|ai | surprise_recommenders: INFO: load_from_df
lab_1 | 1|ai | surprise_recommenders: ERROR: the results_df hash from the pickle is different
lab_1 | 1|ai | Traceback (most recent call last):
lab_1 | 1|ai | File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
lab_1 | 1|ai | "__main__", mod_spec)
lab_1 | 1|ai | File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
lab_1 | 1|ai | exec(code, run_globals)
lab_1 | 1|ai | File "/appsrc/ai/ai.py", line 662, in <module>
lab_1 | 1|ai | main()
lab_1 | 1|ai | File "/appsrc/ai/ai.py", line 631, in main
lab_1 | 1|ai | term_condition=args.TERM_COND, max_time=args.MAX_TIME)
lab_1 | 1|ai | File "/appsrc/ai/ai.py", line 186, in __init__
lab_1 | 1|ai | self.initialize_recommenders(rec_class) # set self.rec_engines
lab_1 | 1|ai | File "/appsrc/ai/ai.py", line 247, in initialize_recommenders
lab_1 | 1|ai | self.rec_engines[pred_type] = rec_class(**recArgs)
lab_1 | 1|ai | File "/appsrc/ai/recommender/surprise_recommenders.py", line 126, in __init__
lab_1 | 1|ai | random_state=random_state)
lab_1 | 1|ai | File "/appsrc/ai/recommender/base.py", line 165, in __init__
lab_1 | 1|ai | serialized_rec_filename)
lab_1 | 1|ai | File "/appsrc/ai/recommender/base.py", line 195, in _train_empty_rec
lab_1 | 1|ai | self.load(self.serialized_rec_path, knowledgebase_results)
lab_1 | 1|ai | File "/appsrc/ai/recommender/surprise_recommenders.py", line 175, in load
lab_1 | 1|ai | source='knowledgebase')
lab_1 | 1|ai | File "/appsrc/ai/recommender/surprise_recommenders.py", line 162, in _reconstruct_training_data
lab_1 | 1|ai | raise ValueError(error_msg)
lab_1 | 1|ai | ValueError: the results_df hash from the pickle is different
lab_1 | PM2 | App [ai:1] exited with code [1] via signal [SIGINT]
lab_1 | PM2 | App [ai:1] starting in -fork mode-
lab_1 | PM2 | App [ai:1] online
lab_1 | 1|ai | ======= Penn AI =======
lab_1 | 0|lab | POST /api/projects 200 - - 4.529 ms
lab_1 | 0|lab | serverSocket.emitEvent('recommenderStatusUpdated', '[object Object]')
lab_1 | 0|lab | {}
lab_1 | 0|lab | =socketServer:recommenderStatusUpdated(initializing)
lab_1 | 0|lab | POST /api/recommender/status 200 54 - 4.591 ms
lab_1 | 1|ai | ai: INFO: loading pmlb knowledgebase
lab_1 | 1|ai | knowledgebase_utils: INFO: load_default_knowledgebases('True', 'data/knowledgebases/user/results', 'data/knowledgebases/user/metafeatures'
lab_1 | 1|ai | knowledgebase_utils: INFO: load_knowledgebase('['data/knowledgebases/sklearn-benchmark-data-knowledgebase-r6.tsv.gz', 'data/knowledgebases/pmlb_regression_results.pkl.gz']', ['data/knowledgebases/pmlb_classification_metafeatures.csv.gz', 'data/knowledgebases/pmlb_regression_metafeatures.csv.gz']', '')
lab_1 | 1|ai | knowledgebase_utils: INFO: _load_results_from_file(data/knowledgebases/sklearn-benchmark-data-knowledgebase-r6.tsv.gz)
lab_1 | 1|ai | knowledgebase_utils: INFO: returning 52249 results from data/knowledgebases/sklearn-benchmark-data-knowledgebase-r6.tsv.gz
lab_1 | 1|ai | knowledgebase_utils: INFO: _load_results_from_file(data/knowledgebases/pmlb_regression_results.pkl.gz)
lab_1 | 1|ai | knowledgebase_utils: INFO: concatenating results....
lab_1 | 1|ai | knowledgebase_utils: INFO: load metafeatures...
lab_1 | 1|ai | knowledgebase_utils: INFO: Loading metadata from file 'data/knowledgebases/pmlb_classification_metafeatures.csv.gz
lab_1 | 1|ai | knowledgebase_utils: INFO: Loading metadata from file 'data/knowledgebases/pmlb_regression_metafeatures.csv.gz
lab_1 | 1|ai | ai: INFO: updating AI with classification knowledgebase (52249 results)
lab_1 | 1|ai | ai: INFO: pmlb classification knowledgebase loaded
...
And similarly, from macOS:
...
lab_1 | 1|ai | base: WARNING: algo changing from <surprise.prediction... to <surprise.prediction...
lab_1 | 1|ai | base: WARNING: first_fit changing from True... to False...
lab_1 | 1|ai | base: WARNING: reader changing from <surprise.reader.Rea... to <surprise.reader.Rea...
lab_1 | 1|ai | base: WARNING: hash_2_param changing from {'c65edfb84911c2647a... to {'c65edfb84911c2647a...
lab_1 | 1|ai | base: WARNING: adding trainset=<surprise.trainset.T...
lab_1 | 1|ai | base: WARNING: adding results_df_hash=9213096e6869a9a4d9ea...
lab_1 | 1|ai | base: WARNING: adding ml_p_hash=31fa2d17c46be017c19f...
lab_1 | 1|ai | base: INFO: updating internal state
lab_1 | 1|ai | base: INFO: ml_p hashes match
lab_1 | 1|ai | surprise_recommenders: INFO: setting training data...
lab_1 | 1|ai | base: INFO: updating hash_2_param...
lab_1 | PM2 | App [ai:1] exited with code [0] via signal [SIGKILL]
lab_1 | PM2 | App [ai:1] starting in -fork mode-
lab_1 | PM2 | App [ai:1] online
lab_1 | 1|ai | ======= Penn AI =======
lab_1 | 0|lab | POST /api/projects 200 - - 23.081 ms
lab_1 | 0|lab | serverSocket.emitEvent('recommenderStatusUpdated', '[object Object]')
lab_1 | 0|lab | {}
lab_1 | 0|lab | =socketServer:recommenderStatusUpdated(initializing)
lab_1 | 0|lab | POST /api/recommender/status 200 54 - 15.614 ms
lab_1 | 1|ai | ai: INFO: loading pmlb knowledgebase
lab_1 | 1|ai | knowledgebase_utils: INFO: load_default_knowledgebases('True', 'data/knowledgebases/user/results', 'data/knowledgebases/user/metafeatures'
lab_1 | 1|ai | knowledgebase_utils: INFO: load_knowledgebase('['data/knowledgebases/sklearn-benchmark-data-knowledgebase-r6.tsv.gz', 'data/knowledgebases/pmlb_regression_results.pkl.gz']', ['data/knowledgebases/pmlb_classification_metafeatures.csv.gz', 'data/knowledgebases/pmlb_regression_metafeatures.csv.gz']', '')
lab_1 | 1|ai | knowledgebase_utils: INFO: _load_results_from_file(data/knowledgebases/sklearn-benchmark-data-knowledgebase-r6.tsv.gz)
lab_1 | 0|lab | results:
lab_1 | 0|lab | [ { _id: 5fe3e870e2c61a175b7b7928,
lab_1 | 0|lab | type: 'recommender',
lab_1 | 0|lab | status: 'initializing' } ]
lab_1 | 0|lab | GET /api/recommender 201 79 - 4.873 ms
lab_1 | 1|ai | knowledgebase_utils: INFO: returning 52249 results from data/knowledgebases/sklearn-benchmark-data-knowledgebase-r6.tsv.gz
lab_1 | 1|ai | knowledgebase_utils: INFO: _load_results_from_file(data/knowledgebases/pmlb_regression_results.pkl.gz)
lab_1 | 1|ai | knowledgebase_utils: INFO: concatenating results....
lab_1 | 1|ai | knowledgebase_utils: INFO: load metafeatures...
lab_1 | 1|ai | knowledgebase_utils: INFO: Loading metadata from file 'data/knowledgebases/pmlb_classification_metafeatures.csv.gz
lab_1 | 1|ai | knowledgebase_utils: INFO: Loading metadata from file 'data/knowledgebases/pmlb_regression_metafeatures.csv.gz
...
The Raspberry Pi logs suggest there is an issue loading the knowledge base: the results_df hash stored in the pickled recommender doesn't match the hash of the freshly loaded knowledgebase results.
Possibly an issue introduced by running BFG Repo-Cleaner, or related to Git LFS?
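One quick way to test the Git LFS hypothesis is to check that the knowledgebase files inside the container are real data rather than LFS pointer stubs. A minimal sketch in Python, using one of the file paths from the log above (the check itself is generic and not part of pennai):

from pathlib import Path

def looks_like_lfs_pointer(path):
    # Git LFS pointer stubs are tiny text files that start with this spec line.
    with open(path, "rb") as f:
        return f.read(200).startswith(b"version https://git-lfs.github.com/spec")

kb = Path("data/knowledgebases/sklearn-benchmark-data-knowledgebase-r6.tsv.gz")
print(kb, kb.stat().st_size, "bytes, LFS pointer?", looks_like_lfs_pointer(kb))

If the file turns out to be a pointer stub, a plain git lfs pull in the repo before building the images would be the thing to try.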
Steps to reproduce:
$ git clone https://github.com/epistasislab/pennai
$ cd pennai
$ cp config/ai.env-template config/ai.env
$ docker-compose build
$ docker-compose up
What happens when you click the AI button on the Mac?
@lacava The AI button is grayed out, and in its place there is a spinning gray progress wheel. This is the case on both the Mac and the Raspberry Pi.
@JDRomano2, for the Mac, I think we need a little more information. Could you post a larger excerpt of the log?
Then could you:
- Try starting pennai, waiting a few minutes, and seeing if anything changes (if the SVD recommender couldn't be loaded and is being trained, that can take a few minutes)
- Try starting with a different recommender (in config/ai.env, change AI_RECOMMENDER to "random" and restart pennai) and see if the AI button becomes active (see the snippet after this list)
- Check your Docker runtime memory settings. What are they currently? We recommend at least 6 GB of memory.
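For reference, the recommender swap is a one-line edit to the env file created in the reproduction steps above (illustrative; the variable name is the one mentioned in the list):

# config/ai.env (copied from config/ai.env-template)
AI_RECOMMENDER=random

Restart pennai after editing the file so the new setting is picked up.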
From the logs, there might be a different issue with the Raspberry Pi. A good first step for that might be for us to get the unit tests running on the Pi to check that they all pass.
These three suggestions seemed to do the trick on the Mac, so the remaining problem must be isolated to images running on the Pi.
I'll close this issue and continue work on the raspberrypi branch to get this up and running. As recommended, I'll focus on the unit tests. Given the constrained resources on the Pi, this may require some creative tweaking to convince everything to work correctly.
Hi @JDRomano2, Excellent! Do you know which of these fixed it? Just to check, are you now able to run the SVD recommender on your Mac?
I suspect it was increasing the available RAM that did the trick. I just went in and re-enabled the SVD recommender and it still works correctly, so no problem there.
Great, thanks!
Encountered this same issue on arm64. Setting the recommender to svd displays an error when starting up Aliro. Steps to reproduce:
- On an arm64 machine, set RECOMMENDER=svd and run docker compose up
The following error is displayed:
aliro-lab-1 | 1|ai | newHash d617b188ab49492d3c37bb083a37bd31cbcf3acc077d7bd3ab697115196c617c
aliro-lab-1 | 1|ai | test_newHash: 031edd7d41651593c5fe5c006fa5752b37fddff7bc4e843aa6af0c950f4b9406
aliro-lab-1 | 1|ai | self.results_df_hash 5a246d759bb571dbd867344ef8f282ca7b0cce46347f6db58986ffec8985eb34
aliro-lab-1 | 1|ai | newHash == self.results_df_hash False
aliro-lab-1 | 1|ai | surprise_recommenders: ERROR: the results_df hash from the pickle is different
aliro-lab-1 | 1|ai | Traceback (most recent call last):
aliro-lab-1 | 1|ai | File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
aliro-lab-1 | 1|ai | "__main__", mod_spec)
aliro-lab-1 | 1|ai | File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
aliro-lab-1 | 1|ai | exec(code, run_globals)
aliro-lab-1 | 1|ai | File "/appsrc/ai/ai.py", line 658, in <module>
aliro-lab-1 | 1|ai | main()
aliro-lab-1 | 1|ai | File "/appsrc/ai/ai.py", line 627, in main
aliro-lab-1 | 1|ai | term_condition=args.TERM_COND, max_time=args.MAX_TIME)
aliro-lab-1 | 1|ai | File "/appsrc/ai/ai.py", line 182, in __init__
aliro-lab-1 | 1|ai | self.initialize_recommenders(rec_class) # set self.rec_engines
aliro-lab-1 | 1|ai | File "/appsrc/ai/ai.py", line 243, in initialize_recommenders
aliro-lab-1 | 1|ai | self.rec_engines[pred_type] = rec_class(**recArgs)
aliro-lab-1 | 1|ai | File "/appsrc/ai/recommender/surprise_recommenders.py", line 126, in __init__
aliro-lab-1 | 1|ai | random_state=random_state)
aliro-lab-1 | 1|ai | File "/appsrc/ai/recommender/base.py", line 165, in __init__
aliro-lab-1 | 1|ai | serialized_rec_filename)
aliro-lab-1 | 1|ai | File "/appsrc/ai/recommender/base.py", line 195, in _train_empty_rec
aliro-lab-1 | 1|ai | self.load(self.serialized_rec_path, knowledgebase_results)
aliro-lab-1 | 1|ai | File "/appsrc/ai/recommender/surprise_recommenders.py", line 212, in load
aliro-lab-1 | 1|ai | source='knowledgebase')
aliro-lab-1 | 1|ai | File "/appsrc/ai/recommender/surprise_recommenders.py", line 199, in _reconstruct_training_data
aliro-lab-1 | 1|ai | raise ValueError(error_msg)
aliro-lab-1 | 1|ai | ValueError: the results_df hash from the pickle is different
aliro-lab-1 | PM2 | App [ai:1] exited with code [1] via signal [SIGINT]
aliro-lab-1 | PM2 | App [ai:1] starting in -fork mode-
aliro-lab-1 | PM2 | App [ai:1] online
aliro-lab-1 | 1|ai | ======= Aliro =======
aliro-lab-1 | 0|lab | POST /api/projects 200 - - 2.737 ms
aliro-lab-1 | 0|lab | serverSocket.emitEvent('recommenderStatusUpdated', '[object Object]')
aliro-lab-1 | 0|lab | {}
aliro-lab-1 | 0|lab | POST /api/recommender/status 200 54 - 1.747 ms
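The printed hashes show the failure mode directly: the hash recomputed from the freshly loaded knowledgebase (newHash) does not match the results_df hash stored in the serialized recommender (self.results_df_hash), so the recommender refuses to load. As a rough illustration of this kind of content hash over a results DataFrame (a sketch assuming pandas; not necessarily the project's actual hashing scheme, and the file path is just one of the knowledgebase files from the logs):

import hashlib
import pandas as pd

def results_df_hash(df):
    # Deterministic SHA-256 over pandas' per-row hashes of the DataFrame.
    row_hashes = pd.util.hash_pandas_object(df, index=True).values
    return hashlib.sha256(row_hashes.tobytes()).hexdigest()

# Hypothetical diagnostic: recompute the hash of a knowledgebase file and
# compare it with the value the serialized recommender expects.
results = pd.read_pickle("data/knowledgebases/pmlb_regression_results.pkl.gz")
print(results_df_hash(results))

Anything that changes how the knowledgebase deserializes on a given platform (file contents, dtypes, row order) would change such a hash and trigger the ValueError seen above.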