micro-manager 2.0 gamma reader
Use Case
The open source microscopy control tool micro-manager supports file saving to single-page .tiffs and ome.tiffs. Additionally, with pycro-manager (a ZMQ-based Java-Python communication and data-transfer bridge), there is an even greater need for pythonic data loading and saving.
An extension to this repository to enable easy loading and saving of micro-manager data formats would both broaden the reach of this repo and accelerate micro-manager's integration into python.
Solution
We could build off the Reader ABC, or the TiffReader, or whatever approach makes sense. In particular, it would be useful if such a reader supported:
- A tiff sequence reader (not just ome.tiff). Micro-manager saves images as one of two types -- ome.tiff or many single page tiffs with metadata XML file. Even if they move to other formats in the future, the backwards compatibility is important.
- Extracting instrument/experiment metadata and associated image data. Having named channels associated with the indexed channels would be useful. tifffile has a micromanager_metadata attribute that returns a JSON embedded in the ome.tiff.
- Automatic discovery of master-ome.tiff. Micro-manager breaks up individual scenes into multiple files (as is the case in the google drive data above). tifffile can identify the master ome-tiff but does not provide an easy way to intercept this. Internally, we are combing all files in the folder to look for an ome.tiff with the most scenes, and assume this is the master.
Alternatives
An alternative is to enable this type of reading directly in TiffReader/OMETiffReader, rather than making a new reader. The key difference is the amount of accompanying metadata, and whether one should expand TiffReader to check for that.
A couple of thoughts:
An extension to this repository to enable easy loading and saving of micro-manager data formats would both broaden the reach of this repo and accelerate micro-manager's integration into python.
Totally agree!
- A tiff sequence reader (not just ome.tiff). Micro-manager saves images as one of two types -- ome.tiff or many single page tiffs with metadata XML file. Even if they move to other formats in the future, the backwards compatibility is important.
- Extracting instrument/experiment metadata and associated image data. Having named channels associated with the indexed channels would be useful. tifffile has a micromanager_metadata attribute that returns a JSON embedded in the ome.tiff.
Hmmmm we may want to have two variants of it then:
- One that inherits from my planned GlobReader
- One that inherits from the current OmeTiffReader and returns metadata as a named tuple for interaction like so:
from aicsimageio.readers import MicroManagerReader
r = MicroManagerReader("image.ome.tiff")
r.metadata.ome # returns OME from base OmeTiffReader
r.metadata.mm # returns `Dict` from micromanager metadata
This is a similar approach to how CZI wrote their MicromanagerReader except I would use a NamedTuple instead of a Dict for typing purposes.
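To make the NamedTuple idea a bit more concrete, here is a minimal sketch of what such a container could look like. The class name `MicroManagerMetadata`, the choice of a plain string for the OME part, and the sample values are all assumptions for illustration; nothing here is an existing aicsimageio API.

```python
from typing import Any, Dict, NamedTuple


class MicroManagerMetadata(NamedTuple):
    """Hypothetical container pairing the two metadata sources."""

    # OME-XML kept as a string in this sketch; a real reader may return a parsed object.
    ome: str
    # Micro-Manager metadata decoded from the JSON embedded in the file.
    mm: Dict[str, Any]


# Made-up values, for illustration only.
metadata = MicroManagerMetadata(
    ome="<OME></OME>",
    mm={"Summary": {"ChNames": ["DAPI", "GFP"]}},
)

# Fields are accessed by name, and a type checker knows what each one holds.
channel_names = metadata.mm["Summary"]["ChNames"]
```

Compared with a plain Dict, the field names are fixed and typed, so a typo like `metadata.omee` fails immediately instead of silently returning nothing.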
Sidenote: there is also an argument to bake "Glob" functionality into every reader instead of having it as its own thing. I just kind of like the simplicity of handling it as a layer above base readers. The other thing I haven't really thought about is how to handle metadata aggregation for a "glob" function. Do we return a list of all the read metadata(s)? Do we try to aggregate the metadata into a single struct?
- Automatic discovery of master-ome.tiff. Micro-manager breaks up individual scenes into multiple files (as is the case in the google drive data above). tifffile can identify the master ome-tiff but does not provide an easy way to intercept this. Internally, we are combing all files in the folder to look for an ome.tiff with the most scenes, and assume this is the master.
I think this would be the hardest part of supporting this. It would be great if there were a file naming convention for it, i.e. all files share the same name pattern except for the main file, which is tagged with ***.mm-main.ome.tiff. You might then have scene-1.ome.tiff, scene-2.ome.tiff, and all-scenes.mm-main.ome.tiff. But that's not really on this library; that's upstream.
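If a naming convention like that existed, discovery would reduce to a filename check. A rough sketch, assuming the hypothetical `.mm-main.ome.tiff` suffix from the comment above (Micro-Manager does not currently write such a suffix, and `find_main_file` is a made-up helper):

```python
from pathlib import Path
from typing import Iterable, Optional

# Hypothetical suffix; not something Micro-Manager writes today.
MAIN_SUFFIX = ".mm-main.ome.tiff"


def find_main_file(filenames: Iterable[str]) -> Optional[str]:
    """Return the one file tagged as main, or None if it is absent or ambiguous."""
    matches = [name for name in filenames if Path(name).name.endswith(MAIN_SUFFIX)]
    return matches[0] if len(matches) == 1 else None


files = ["scene-1.ome.tiff", "scene-2.ome.tiff", "all-scenes.mm-main.ome.tiff"]
main_file = find_main_file(files)
```

Returning None when more than one file matches keeps the helper from guessing in an ambiguous folder.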
This is a similar approach to how CZI wrote their MicromanagerReader except I would use a NamedTuple instead of a Dict for typing purposes.
That is my teammate (small distinction, we are part of CZBiohub not CZInitiative). He's the one who brought your repo to my attention :-). Our goals are to both develop a broadly useful Micromanager Reader and an extension of it for our group's specific use. The latter part is a discussion point and may not be necessary.
Sidenote: there is also an argument to bake "Glob" functionality into every reader instead of having it as its own thing. I just kind of like the simplicity of handling it as a layer above base readers. The other thing I haven't really thought about is how to handle metadata aggregation for a "glob" function. Do we return a list of all the read metadata(s)? Do we try to aggregate the metadata into a single struct?
Good question regarding metadata handling. There's already one complication: OME metadata is encoded as XML, while the micromanager metadata is JSON. Maybe this is not a big deal, but it suggests that aggregating metadata into a single struct could be a real pain. Are there robust tools to, say, translate XML and JSON into dicts?
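On the XML/JSON-to-dict question, the standard library already covers the basic case: `json.loads` handles the JSON side, and a small recursive walk over `xml.etree.ElementTree` handles the XML side. The sketch below is only one possible mapping (attributes prefixed with `@`, children grouped into lists by tag); the helper names and sample strings are made up, and third-party packages such as xmltodict offer a more complete conversion.

```python
import json
import xml.etree.ElementTree as ET
from typing import Any, Dict


def strip_namespace(tag: str) -> str:
    """Turn '{namespace}Tag' into 'Tag'."""
    return tag.rsplit("}", 1)[-1]


def element_to_dict(element: ET.Element) -> Dict[str, Any]:
    """Recursively convert an element into a dict of attributes, text, and children."""
    # Prefix attributes with '@' so they cannot collide with child tag names.
    node: Dict[str, Any] = {f"@{key}": value for key, value in element.attrib.items()}
    text = (element.text or "").strip()
    if text:
        node["#text"] = text
    for child in element:
        # Children with the same tag are collected into one list.
        node.setdefault(strip_namespace(child.tag), []).append(element_to_dict(child))
    return node


# Tiny made-up inputs standing in for the real metadata blobs.
ome_xml = (
    '<OME xmlns="http://www.openmicroscopy.org/Schemas/OME/2016-06">'
    '<Image ID="Image:0"><Pixels SizeC="2"/></Image>'
    "</OME>"
)
mm_json = '{"Summary": {"Channels": 2}}'

combined = {
    "ome": element_to_dict(ET.fromstring(ome_xml)),
    "mm": json.loads(mm_json),
}
```

Keeping the two sources under separate keys avoids having to reconcile their vocabularies, which is probably where the real pain of a single merged struct would come from.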
I think this would be the hardest part of supporting this. It would be great if there was a file naming convention for this.
Tifffile logs warnings when it identifies non-masters. I have a hard time catching the logs using the logging api, but it's theoretically possible. Here's how tifffile logs them:
if element.tag.endswith('BinaryOnly'):
    # TODO: load OME-XML from master or companion file
    log_warning('OME series: not an ome-tiff master file')
    break
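For what it's worth, catching that warning through the logging API could look roughly like the sketch below: attach an in-memory handler, run the read, then inspect the collected records. The logger name "tifffile" is an assumption (it depends on how the module names its logger), `RecordCollector` and `saw_non_master_warning` are made-up helpers, and the warning here is simulated rather than produced by opening a real file.

```python
import logging
from typing import List


class RecordCollector(logging.Handler):
    """Logging handler that stores records in memory instead of printing them."""

    def __init__(self) -> None:
        super().__init__(level=logging.WARNING)
        self.records: List[logging.LogRecord] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


def saw_non_master_warning(collector: RecordCollector) -> bool:
    """Check whether any collected record carries the non-master message."""
    return any(
        "not an ome-tiff master file" in record.getMessage()
        for record in collector.records
    )


# Assumption: tifffile logs under a logger named "tifffile" or a child of it.
logger = logging.getLogger("tifffile")
collector = RecordCollector()
logger.addHandler(collector)
try:
    # Stand-in for opening a file with tifffile; this only simulates the warning.
    logging.getLogger("tifffile.tifffile").warning(
        "OME series: not an ome-tiff master file"
    )
finally:
    # Always detach the handler so later reads are not affected.
    logger.removeHandler(collector)

is_non_master = saw_non_master_warning(collector)
```

Attaching the handler to the parent logger also catches records from child loggers, since records propagate upward by default.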
FWIW, regarding automatic ome-tiff master discovery, the above code snippet works well. Querying tiff.scenes was terribly inefficient because scenes combs through the whole file for the page locations, even if the file is not the ome-tiff master. That is not useful in the micro-manager cases where the data is split across multiple files.
Here's the exact snippet I used:
import os
from xml.etree import ElementTree as etree

from tifffile import TiffFile


def tag_search(root_, tag_name='BinaryOnly'):
    """
    returns True if tag_name is present among the direct children of root_
    """
    for element in root_:
        if element.tag.endswith(tag_name):
            print('OME series: not an ome-tiff master file')
            return True
    return False


folder_ = '.'  # placeholder: folder containing the micro-manager .ome.tif files
ome_master = None  # stays None if no master file is found

for file in os.listdir(folder_):
    if not file.endswith('.ome.tif'):
        continue
    with TiffFile(os.path.join(folder_, file)) as tiff:
        print(f"checking {file} for ome-master records")
        # get omexml from first page
        omexml = tiff.pages[0].description
    try:
        root = etree.fromstring(omexml)
    except etree.ParseError:
        try:
            # retry on bytes, dropping characters that cannot be encoded
            root = etree.fromstring(omexml.encode(errors='ignore'))
        except Exception as ex:
            print(f"Exception while parsing root from omexml: {ex}")
            continue  # no usable root for this file, so skip it
    # search for the tag that marks non-master ome-tiff files
    if not tag_search(root, "BinaryOnly"):
        ome_master = file
        break
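The BinaryOnly check itself can be tried without any TIFF files by running it on OME-XML strings directly. The two XML snippets below are hand-written stand-ins (the UUID and file name are made up), and `is_binary_only` is a hypothetical helper, so treat this as a sketch of the idea rather than of real Micro-Manager output.

```python
import xml.etree.ElementTree as ET

OME_NS = "http://www.openmicroscopy.org/Schemas/OME/2016-06"


def is_binary_only(omexml: str) -> bool:
    """Return True if the OME-XML root has a direct BinaryOnly child."""
    root = ET.fromstring(omexml)
    return any(child.tag.endswith("BinaryOnly") for child in root)


# Hand-written example: a master file carries the full metadata (here one Image).
master_xml = f'<OME xmlns="{OME_NS}"><Image ID="Image:0"/></OME>'

# Hand-written example: a non-master file only points at the master.
secondary_xml = (
    f'<OME xmlns="{OME_NS}">'
    '<BinaryOnly MetadataFile="master.ome.tif" UUID="urn:uuid:hypothetical"/>'
    "</OME>"
)

master_flag = is_binary_only(master_xml)
secondary_flag = is_binary_only(secondary_xml)
```

Since only the first page's description is parsed, the check stays cheap even when the dataset is split across many files.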