MorphIO icon indicating copy to clipboard operation
MorphIO copied to clipboard

Feature Request: Option to treat subsequent points with different SWC tags as fork points

Open Helveg opened this issue 2 years ago • 23 comments

Consider a morphology with an axon consisting of a single branch, but with 3 different SWC tags: 2 for axon, 6 for axon hillock, and 7 for axon initial segment. With the current MorphIO parser this gets reduced into 1 Section with 1 type, depending on the SWC tag of the first point of the branch.

I'd like to request a new option to split this branch into 3 Sections, wherever there is an SWC tag transition (taking into account the no_duplicates option)

Helveg avatar Jul 25 '23 11:07 Helveg

I think I noticed something similar just recently; let me create an example file, and we can check that we're on the same page.

mgeplf avatar Jul 25 '23 11:07 mgeplf

Conceptually a series of SWC samples with the following tags:

1-1-1-6-6-6-7-7-7

gets converted in MorphIO to a single Section with type 1:

|--------1-------|

more equivalent to this SWC:

1-1-1-1-1-1-1-1-1

I'd like an option to parse this as 3 Sections with their respective types:

|--1--|--6--|--7--|

Helveg avatar Jul 25 '23 12:07 Helveg

Yup, we're on the same page:

contents =('''1 1 0 4 0 3.0 -1
              2 6 0 0 2 0.5 1
              3 7 0 0 3 0.5 2
              4 8 0 0 4 0.5 3
              5 9 0 0 5 0.5 4''')
n = Morphology(contents, "swc")

should have len(n.sections) == 4

I will take a look when I have moment. I have a rewrite of the SWC parser in progress, that covers this case, afaik. I need to finish it though.

mgeplf avatar Jul 25 '23 12:07 mgeplf

@Helveg I have implemented a change that should fix this. However, it's in a completely new SWC reader - I'm passing all the tests that I have in NeuroM and nearly all of them in MorphIO (the ones that I'm not, I think we should re-evaluate what is right.

I'd really appreciate it if you could try with your code, because I think: a) it would fix this issue b) you may turn up other issues.

The branch w/ my new reader is: https://github.com/BlueBrain/MorphIO/compare/efficient-swc-build Let me know if you have any luck.

mgeplf avatar Jul 27 '23 11:07 mgeplf

Hmm, my version is MorphIO-3.3.6.dev54+g57e4e98.d20230811 from the latest commit on the efficient-swc-build branch and I get this error:

>       morpho = morphio.Morphology(os.fspath(file_like))
E       morphio._morphio.RawDataError: 
E       /mnt/c/Users/pwd06/git/arborize/tests/data/morphologies/multitagged.swc:1:error
E       Unable to parse this line

for this SWC file, which looks good:

1 1 0.0 0.0 0.0 2.9000000953674316 -1
2 1 0.0 5.622320175170898 0.0 2.9000000953674316 1
3 6 0.0 5.622320175170898 0.0 0.75 2
4 6 0.0 6.622320175170898 0.0 0.75 3
5 7 0.0 6.622320175170898 0.0 0.3499999940395355 4
6 7 0.0 16.6223201751709 0.0 0.3499999940395355 5
7 8 0.0 16.6223201751709 0.0 0.15000000596046448 6
8 8 0.0 29.222320556640625 0.0 0.15000000596046448 7
9 8 0.0 41.82231903076172 0.0 0.15000000596046448 8
10 8 0.0 54.42232131958008 0.0 0.15000000596046448 9
11 8 0.0 67.02232360839844 0.0 0.15000000596046448 10
12 8 0.0 79.62232208251953 0.0 0.15000000596046448 11
13 8 0.0 92.22232055664062 0.0 0.15000000596046448 12
14 8 0.0 104.82231903076172 0.0 0.15000000596046448 13
15 8 0.0 117.42231750488281 0.0 0.15000000596046448 14
16 8 0.0 130.02232360839844 0.0 0.15000000596046448 15
17 8 0.0 142.622314453125 0.0 0.15000000596046448 16

I'm on WSL2, my files might have weird line endings and other fun cross-platform freakiness

Helveg avatar Aug 11 '23 08:08 Helveg

Cool, thanks for the example; I will have a look.

mgeplf avatar Aug 22 '23 12:08 mgeplf

I tried loading the above snippet, and it seemed to work. I will try some variations on it though. I don't have a WSL2 machine at my disposal, though.

mgeplf avatar Aug 24 '23 09:08 mgeplf

It's the line endings, I believe. After running dos2unix on the file the parser works. I think Windows line endings are \r\n, maybe you determine the line ending in a platform specific manner?

>>> import morphio
>>> morphio.Morphology("git/arborize/tests/data/morphologies/multitagged.swc")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
morphio._morphio.RawDataError:
git/arborize/tests/data/morphologies/multitagged.swc:1:error
Unable to parse this line
(b4py9) robin@LAPTOP-JF6T3PSU:~$ dos2unix git/arborize/tests/data/morphologies/multitagged.swc
dos2unix: converting file git/arborize/tests/data/morphologies/multitagged.swc to Unix format...
>>> import morphio
>>> morphio.Morphology("git/arborize/tests/data/morphologies/multitagged.swc")
<morphio._morphio.Morphology object at 0x7f6ff26c2a70>

Helveg avatar Aug 25 '23 09:08 Helveg

It's the line endings, I believe. After running dos2unix on the file the parser works. I think Windows line endings are \r\n, maybe you determine the line ending in a platform specific manner?

Yup, thanks for catching that.

mgeplf avatar Aug 25 '23 11:08 mgeplf

Any status updates? :)

Helveg avatar Sep 13 '23 09:09 Helveg

Sorry, I was away, and haven't had a single stitch of time to look since I got back. I think it's a quick fix, though, so I can maybe find time today.

mgeplf avatar Sep 14 '23 07:09 mgeplf

I have added handling of windows end of line here: https://github.com/BlueBrain/MorphIO/commit/cad7bdd24b33c306078d9a20fb9467566d086332

Could you try it, and see if it fixes this problem?

mgeplf avatar Sep 15 '23 08:09 mgeplf

Parses into 1 soma and 3 sections! 🎉

Helveg avatar Sep 22 '23 12:09 Helveg

Parses into 1 soma and 3 sections!

I hope that's what you expected!

mgeplf avatar Sep 22 '23 12:09 mgeplf

It is, thanks :)

Helveg avatar Sep 22 '23 13:09 Helveg

When would this land in release? It's a blocker for me. If you can sneak it in in the next few days I would not have to fix a whole series of pesky bugs with my own SWC parser ;)

Helveg avatar Sep 28 '23 09:09 Helveg

I'm super busy with other things at the moment.

There are a couple tests that fail, and I think it might be a semantic decision that needs to be made. For that, I will need to discuss with others what the right decision is.

mgeplf avatar Sep 29 '23 08:09 mgeplf

@Helveg you're input would be helpful: do you think the following is a valid (but small) SWC file:

1 1 0 0 0 1 -1
1 3 1 0 0 1  1

mgeplf avatar Sep 29 '23 12:09 mgeplf

Nitpick: the sample identifiers (first column) need to be unique. I'd say since SWC points from the child to the parent, the inferred segment should have the child's properties all the way up to the parent. The resulting output of:

1 1 0 0 0 1 -1
2 3 1 0 0 1  1

should then be a disc shaped soma segment of length 0, and a dendritic segment of length 1, just like:

1 1 0  0 0 1 -1
2 1 1  0 0 1  1
3 3 2  1 0 1  2
4 3 2 -1 0 1  2

would intuitively create a soma and 2 dendrites, not a Y shaped soma.

Helveg avatar Sep 29 '23 12:09 Helveg

I'd say it's a design choice of MorphIO to allow for 0 length segments or not, but I don't see the harm in allowing it. I'd find it needlessly restrictive for valid edge cases, and would leave these kinds of quality checks to dedicated morphology auditing/repairing tools that one can build on top of MorphIO.

Helveg avatar Sep 29 '23 12:09 Helveg

Nitpick: the sample identifiers (first column) need to be unique.

Yeah, that was a typo; sorry.

I'd say it's a design choice of MorphIO to allow for 0 length segments or not, but I don't see the harm in allowing it.

Okay, makes sense.

Currently, we're also transforming:

1 1 0 0 0 1 -1
2 3 1 0 0 1  1
3 3 1 1 0 1  2
4 3 1 0 1 1  2

Into:

1 1  0  0  0  1 -1
2 3  1  0  0  1  1
3 3  1  1  0  1  2
4 3  1  0  0  1  1
5 3  1  0  1  1  4

Which seems wrong: ie now the soma has two children instead of one. Thoughts?

mgeplf avatar Sep 29 '23 12:09 mgeplf

I don't understand? Why add a point?

Helveg avatar Sep 29 '23 12:09 Helveg

I don't understand? Why add a point?

Yeah, that's my confusion too; I don't in the "new" parser, and that's what I have to work out. Thanks for your input.

mgeplf avatar Sep 29 '23 12:09 mgeplf