VisualBehaviorOphysProjectCache.from_lims() method does not accurately reflect what is in lims

Open matchings opened this issue 4 years ago • 3 comments

Describe the bug

When I run the code below, the returned table does not include the complete list of ophys_experiment_ids for each ophys_container_id. The cache appears to require that a given ophys_experiment_id appears only once. In lims, however, an ophys_experiment_id can be associated with multiple ophys_container_ids (some that passed and some that failed). As a result, an ophys_experiment_id is sometimes assigned to a passed container and sometimes to a failed one, so a passing ophys_container_id can end up with an incomplete list of ophys_experiment_ids.
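To make the failure mode concrete, here is a minimal, pandas-free sketch of what happens when a table with one row per (container, experiment) pair is forced to hold each experiment ID only once. The container and experiment IDs are made up, and keeping the first row seen is an assumption for illustration; which row actually survives in the SDK depends on its own ordering.

```python
# Toy (container_id, experiment_id) rows; the IDs are made up for illustration.
rows = [
    (100, 1),  # failed container 100 holds experiments 1 and 2
    (100, 2),
    (200, 2),  # passed container 200 re-uses experiment 2 ...
    (200, 3),  # ... and adds experiment 3
]


def dedupe_on_experiment_id(rows):
    """Keep only the first row seen for each experiment ID (assumed policy)."""
    seen = set()
    kept = []
    for container_id, experiment_id in rows:
        if experiment_id not in seen:
            seen.add(experiment_id)
            kept.append((container_id, experiment_id))
    return kept


def experiments_in(rows, container_id):
    """Return the sorted experiment IDs listed under one container."""
    return sorted(e for c, e in rows if c == container_id)


deduped = dedupe_on_experiment_id(rows)
print(experiments_in(rows, 200))     # full membership of the passed container
print(experiments_in(deduped, 200))  # membership after de-duplication
```

In this toy case the passed container loses experiment 2 because that ID was already claimed by the failed container, which is the same pattern described above.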

To Reproduce

from allensdk.brain_observatory.behavior.behavior_project_cache import VisualBehaviorOphysProjectCache as bpc

cache = bpc.from_lims()
experiments = cache.get_ophys_experiment_table()

print('there are', len(experiments[experiments.ophys_container_id==1115959875]), 
      'experiments associated with container id', 1115959875, 'in the cache table')

Expected behavior

There are actually 6 experiments associated with that container ID in lims. This can be observed through direct lims queries, and by looking at the lims directory for that container:

[image: screenshot of the lims directory for that container]

Workaround

@djkapner identified that this part of the code in the SDK enforces that each ophys_experiment_id appears only once in the ophys_experiments table, and suggested the code block below as a workaround. This block gives me the correct list of ophys_experiment_ids in lims for a given ophys_container_id, including cases where an ophys_experiment_id is associated with multiple containers. This is the ground truth that we need access to in order to do QC and other tasks, such as identifying candidate experiments for release.

experiments = cache.fetch_api.get_ophys_experiment_table()

print('there are', len(experiments[experiments.ophys_container_id==1115959875]), 
      'experiments associated with container id', 1115959875, 'in the cache table')
print('yay this is correct')
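To check more than one container at a time, the two tables can be compared container by container. The sketch below is a hypothetical helper, not part of the SDK: it works on plain (ophys_container_id, ophys_experiment_id) pairs, and how those pairs are pulled out of each table (for example, whether ophys_experiment_id is a column or the index) is left to the caller because the thread does not say.

```python
from collections import defaultdict


def group_by_container(pairs):
    """Collect experiment IDs into a set per container ID."""
    groups = defaultdict(set)
    for container_id, experiment_id in pairs:
        groups[container_id].add(experiment_id)
    return groups


def missing_by_container(reference_pairs, cache_pairs):
    """Map container ID -> sorted experiment IDs that appear in the
    reference pairs but not in the cache pairs for that container."""
    reference = group_by_container(reference_pairs)
    cached = group_by_container(cache_pairs)
    missing = {}
    for container_id, experiment_ids in reference.items():
        diff = experiment_ids - cached.get(container_id, set())
        if diff:
            missing[container_id] = sorted(diff)
    return missing


# Toy usage with made-up IDs: the cache table lost experiment 2 from container 200.
reference = [(100, 1), (200, 2), (200, 3)]
cache_table = [(100, 1), (200, 3)]
print(missing_by_container(reference, cache_table))
```

Containers whose membership matches in both tables are left out of the result, so an empty dictionary means the two sources agree.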

Environment (please complete the following information):

  • AllenSDK version 2.11.0

matchings, Jul 31 '21 01:07

Deeper link for that explanation: https://github.com/AllenInstitute/AllenSDK/blob/0178688ccdfb4d3b6c311cc8879126f7d64e90a1/allensdk/brain_observatory/behavior/behavior_project_cache/tables/project_table.py#L32-L35

djkapner, Aug 09 '21 16:08

@matchings I think this branch

https://github.com/AllenInstitute/AllenSDK/tree/ticket/2187/dev

will fix the problem for you. Note the added passed_only kwarg in get_ophys_experiment_table that must be set to False to get everything. If you have any tests you want to run, please do so.

I used this branch to create an experiments table with from_lims. It matched your ground truth table, with the exception of 35 experiments that do not appear to be in the S3 bucket now anyway. We can dig into that if you want.

danielsf, Aug 13 '21 00:08

The 35 ophys_experiment_ids that were not returned by a naive passed_only=True query from LIMS were:

could not find  1012165655
could not find  1008408505
could not find  1011771129
could not find  1011771134
could not find  994082657
could not find  993891317
could not find  994278584
could not find  994278590
could not find  1101388665
could not find  1101564157
could not find  1012771669
could not find  1012771667
could not find  1012771670
could not find  1012771672
could not find  1012771673
could not find  1012771678
could not find  1010092818
could not find  1081870750
could not find  1081870759
could not find  1082434510
could not find  965267337
could not find  1078699503
could not find  1078699509
could not find  1058613919
could not find  1058613914
could not find  1058613924
could not find  1058835249
could not find  1059574070
could not find  1059792822
could not find  1059792832
could not find  1059792820
could not find  1060223938
could not find  1105664341
could not find  1106021634
could not find  1106021635
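A report in this format can be generated with a simple membership check. This is only a sketch of one way to do it, not necessarily how the list above was produced; report_missing is a hypothetical helper, and the IDs in the usage line are made up.

```python
def report_missing(expected_ids, returned_ids):
    """Print and return the expected IDs that are absent from returned_ids,
    preserving the order of expected_ids."""
    returned = set(returned_ids)
    missing = [i for i in expected_ids if i not in returned]
    for experiment_id in missing:
        print("could not find ", experiment_id)
    return missing


# Toy usage: IDs 1 and 3 are expected but were not returned by the query.
missing = report_missing([1, 2, 3], [2])
```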

danielsf, Aug 13 '21 00:08