AI failed on initial datasets after uploading a new dataset
AI worked fine on the initial datasets after starting up PennAI. But after uploading a new dataset via the +Add dataset button, the AI failed on those initial datasets while still working for the newly uploaded dataset. Please see the error message below.
1|ai | ai: INFO: 2019 07:50:22 PM UTC: checking requests...
0|lab | POST /api/datasets 200 - - 1.057 ms
0|lab | POST /api/datasets 200 - - 0.855 ms
0|lab | serverSocket.emitEvent('aiToggled', '[object Object]')
0|lab | PUT /api/userdatasets/5cd5d4e8a957c2003197e796/ai 200 53 - 2.144 ms
1|ai | ai: INFO: 2019 07:50:26 PM UTC: checking results...
0|lab | POST /api/experiments 200 - - 1.269 ms
1|ai | ai: INFO: 2019 07:50:26 PM UTC: checking requests...
0|lab | POST /api/datasets 200 - - 1.205 ms
1|ai | ai: INFO: 2019 07:50:26 PM UTC: new ai request for:allbp
0|lab | serverSocket.emitEvent('aiToggled', '[object Object]')
0|lab | PUT /api/userdatasets/5cd5d4e8a957c2003197e796/ai 200 53 - 3.727 ms
1|ai | request_manager: INFO: AiRequest initilized (allbp,5cd5d4e8a957c2003197e796)
1|ai | request_manager: INFO: AiRequest new_request (allbp,5cd5d4e8a957c2003197e796)
1|ai | request_manager: DEBUG: AiRequest adding recs (allbp,5cd5d4e8a957c2003197e796)
1|ai | ai: INFO: generate_recommendations(5cd5d4e8a957c2003197e796,10)
0|lab | POST /api/datasets 200 - - 0.888 ms
0|lab | GET /api/datasets/5cd5d4e8a957c2003197e796 200 - - 0.953 ms
1|ai | knn_meta_recommender: ERROR: error running self.best_model_prediction for5cd5d4e8a957c2003197e796
1|ai | ai: ERROR: Unhanded exception caught: <class 'KeyError'>
1|ai | ai: INFO: Shutting down AI engine...
1|ai | ai: INFO: ...Shutting down Request Manager...
1|ai | request_manager: INFO: AiRequest terminate_request (allbp,5cd5d4e8a957c2003197e796)
1|ai | request_manager: DEBUG: queue size: 0
1|ai | request_manager: DEBUG: Removed experiments from queue, isQueueEmpty()=True
1|ai | request_manager: DEBUG: queue size: 0
0|lab | serverSocket.emitEvent('aiToggled', '[object Object]')
1|ai | ai: INFO: Goodbye
1|ai | Traceback (most recent call last):
1|ai | File "/opt/conda/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2657, in get_loc
0|lab | PUT /api/userdatasets/5cd5d4e8a957c2003197e796/ai 200 53 - 1.389 ms
1|ai | return self._engine.get_loc(key)
1|ai | File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
1|ai | File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
1|ai | File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
1|ai | File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
1|ai | KeyError
1|ai | : 'iris_full.tsv'
1|ai | During handling of the above exception, another exception occurred:
1|ai | Traceback (most recent call last):
1|ai | File "/opt/conda/lib/python3.7/runpy.py", line 193, in _run_module_as_main
1|ai | "__main__", mod_spec)
1|ai | File "/opt/conda/lib/python3.7/runpy.py", line 85, in _run_code
1|ai | exec(code, run_globals)
1|ai | File "/appsrc/ai/ai.py", line 522, in <module>
1|ai | main()
1|ai | File "/appsrc/ai/ai.py", line 504, in main
1|ai | pennai.process_rec()
1|ai | File "/appsrc/ai/ai.py", line 324, in process_rec
1|ai | self.requestManager.process_requests()
1|ai | File "/appsrc/ai/request_manager.py", line 86, in process_requests
1|ai | req.process_request()
1|ai | File "/appsrc/ai/request_manager.py", line 184, in process_request
1|ai | self.recBatchSize)
1|ai | File "/appsrc/ai/ai.py", line 345, in generate_recommendations
1|ai | dataset_mf=metafeatures)
1|ai | File "/appsrc/ai/recommender/knn_meta_recommender.py", line 147, in recommend
1|ai | raise e
1|ai | File "/appsrc/ai/recommender/knn_meta_recommender.py", line 116, in recommend
1|ai | dataset_mf)
1|ai | File "/appsrc/ai/recommender/knn_meta_recommender.py", line 186, in best_model_prediction
1|ai | alg_params = (self.best_mlp.loc[d,'algorithm'] + '|' +
1|ai | File "/opt/conda/lib/python3.7/site-packages/pandas/core/indexing.py", line 1494, in __getitem__
1|ai | return self._getitem_tuple(key)
1|ai | File "/opt/conda/lib/python3.7/site-packages/pandas/core/indexing.py", line 868, in _getitem_tuple
1|ai | return self._getitem_lowerdim(tup)
1|ai | File "/opt/conda/lib/python3.7/site-packages/pandas/core/indexing.py", line 988, in _getitem_lowerdim
1|ai | section = self._getitem_axis(key, axis=i)
1|ai | File "/opt/conda/lib/python3.7/site-packages/pandas/core/indexing.py", line 1913, in _getitem_axis
1|ai | return self._get_label(key, axis=axis)
1|ai | File "/opt/conda/lib/python3.7/site-packages/pandas/core/indexing.py", line 141, in _get_label
1|ai | return self.obj._xs(label, axis=axis)
1|ai | File "/opt/conda/lib/python3.7/site-packages/pandas/core/generic.py", line 3585, in xs
1|ai | loc = self.index.get_loc(key)
1|ai | File "/opt/conda/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2659, in get_loc
1|ai | return self._engine.get_loc(self._maybe_cast_indexer(key))
1|ai | File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
1|ai | File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
1|ai | File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
1|ai | File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
1|ai | KeyError: 'iris_full.tsv'
PM2 | App [ai:1] exited with code [1] via signal [SIGINT]
PM2 | App [ai:1] starting in -fork mode-
PM2 | App [ai:1] online
I think the reason for this error is that `self.best_mlp` was not updated after running the AI on the newly added dataset (named "iris_full.tsv" in our case), while `df_mf` did contain this dataset's meta-features. The KNN recommender then cannot find the best result for this dataset. I pushed a patch to fix this issue (by adding the code below). But I think there should be a nicer way to ensure that `best_mlp` is updated after getting a good result. @hjwilli any idea?
if d not in self.best_mlp.index:
    continue
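To illustrate the failure mode, here is a minimal, self-contained sketch (not the PennAI code itself; `best_mlp` is built here as a stand-in for `self.best_mlp`, and the dataset names are taken from the log above). Looking up a label that is missing from a DataFrame index via `.loc` raises exactly the `KeyError` seen in the traceback, and the guard above skips such datasets:

```python
import pandas as pd

# Stand-in for self.best_mlp: best known results, indexed by dataset name.
# Assume only the initial dataset "allbp" has an entry; the new upload does not.
best_mlp = pd.DataFrame(
    {"algorithm": ["RandomForestClassifier"], "parameters": ["n_estimators=100"]},
    index=["allbp"],
)

# Datasets whose meta-features are available (df_mf in the report) may include
# a dataset for which best_mlp has no row yet.
datasets = ["allbp", "iris_full.tsv"]

recommended = []
for d in datasets:
    # Without this guard, best_mlp.loc[d, "algorithm"] raises
    # KeyError: 'iris_full.tsv', matching the traceback above.
    if d not in best_mlp.index:
        continue
    recommended.append(best_mlp.loc[d, "algorithm"])

print(recommended)  # ['RandomForestClassifier']
```

The guard avoids the crash, but as noted it only masks the underlying issue that `best_mlp` lags behind the meta-features table.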
Hmm... not able to reproduce this consistently, but I was able to get this error once without uploading any new datasets, by turning the AI on for several datasets simultaneously with the KNN recommender. @lacava it looks like this might be specific to the KNN recommender, but I'm not exactly sure what's going on.