spikeinterface Channel maps for tetrode arrays

In the past, we have used Kilosort2 to do sorting. Because we are using 64 channel tetrodes, and the recommendations from Kilosort is to give each tetrode a different kcood and somewhat disregard the xcoods and ycoords, we have used this type of channel map.

# in kilosort, Wilson Lab has been using this channel map
n_tetrodes = 16
# x placement is codified as different for each wire of tetrode
xcoords = np.tile(np.arange(4) + 1, n_tetrodes)
# y placement is the same for each wire of each tetrode, different between tetrodes
ycoords = np.repeat(np.arange(n_tetrodes) + 1, 4)
# same kcoords for each tetrode
kcoords = ycoords
# chanMap
chanMap = np.arange(1,n_tetrodes*4 + 1)
chanMap0ind = chanMap - 1
# preview array (first 2 tetrodes)
print(np.array([xcoords, ycoords, kcoords]).T[:8, :])
[[1 1 1]
 [2 1 1]
 [3 1 1]
 [4 1 1]
 [1 2 2]
 [2 2 2]
 [3 2 2]
 [4 2 2]]

Following the Probe generator example in the probeinterface documentation here, I tried:

# Genearate our tetrodes
total_tetrodes = int(len(recording.get_channel_ids())/4)

probegroup = ProbeGroup()
for i in range(total_tetrodes):
    tetrode = generate_tetrode()
    tetrode.set_device_channel_indices(np.arange(4) + i * 4)
    probegroup.add_probe(tetrode)

These produce qualitatively similar channel map outputs. Although xy positions are different, I'm not sure if they are being used at all for the sorters (and we actually can't have certainty given tetrode array).

I am trying to understand what is going to happen with the xy and kcoords in the different sorters.

What I noticed is that when calling:

recording_tetrodes = recording_sub.set_probegroup(probegroup, group_mode='by_probe')
# slice the recordings by tetrodes
recordings = recording_tetrodes.split_by(property='group')

The recording will get split and passed with channel maps that are only 4 channels. This generates one folder per tetrode. With data subsamples, I have had Kilosort fail nastly (likely because of finding no spikes in particular tetrodes in short subsets), which ruins all other sorted tetrodes and also crushes the Jupyter notebook.

I was wondering if there are ways to pass the 64 x n matrix instead of 16 4 x n matrices to prevent these failures. I am also wondering if the different sorters (e.g., tridesclous, mountainsort4) would use the channel maps in similar ways.

Should I generate tetrodes using the ProbeGroup() and the suggested for loop or is there a way to use our previous channel maps, supplying them via np.array?.

By the way, Kilosort2 failures mentioned here were also found when using simulated data, for example, when trying to reproduce this example. If splitting by tetrode, I only managed to run it with Kilosort when increasing the number of units (to 50 units) and the duration (to at least 300 secs). Not sure this example applies to channel maps, because toy_example() provides a channel map with xy positions separated in a line (which would be not the reality with tetrodes).

Running the toy example in mountainsort4 with 50 units on the ground truth. The tetrode split finds 77 units, while the "all-together" version finds 70. I guess one would be able to manually curate this, but I am curious as to what is the way to go about this. I can provide a jupyter notebook that is basically an expansion of the one in the docs named plot_3_sorting_by_group.

Feb 07 '22 20:02 matiasandina

I am not totally sure what kcoords is in KS. I guess it is simply the grouping.

Almost all sorter use coordinates. Theses coordinate are mostly used for neightboors. There are supposed to be micrometer. If you have true wire tetrode, you can use generate_tetrode() which in fact make a small circle. You could also use manual linear coordinate like you do.

The grouping in spikeinterface by default is done on probe (and so tetrode here). Every group will be sorted independantly.

If you have planar silicon multi tetrode. Another possibility (which I would recommend) is to provide the full probe description with 64 channels. Depending the spacing, you will see that you can have some units that impact severalgroup between tetrode, so it make sens to sort globaly. Furthermore if you use one only unique probe, you will be able to use spikeinterface-gui and see in one shot the global output sorted at once. Then it will be quite easy so see if some units have impact one differents tetrode. If you do this, please use the true micrometers coordinate. If you have a cambridge neurotec probe, then your probe should be available with get_probe(). If it is a neuronexus one, you will need to make you own mapping because we don't have yet the probe library.

Feel free to send notebook for more discussion.

Feb 08 '22 07:02 samuelgarcia

I'm sorry if I wasn't clear. Check the notebook here. That's an extension of one of the tutorials in the documentation. We are trying to get reproducible results from different clusters, my main issue is understanding how to give information about our 64 channel tetrode array to the different sorters. If the positions are used differently, then the sorting will be different and will lead to avoidable problems.

Feb 08 '22 18:02 matiasandina

Hi @matiasandina

Sorry it took sometime to reply to this. One way to prevent issues of KS to run with few channels (e.g. tetrodes) would be to just run_kilosort without splitting in groups first. This way Kilosort completely ignores the group info. However, you can generate electrode locations so that channels within a tetrode are close to each other (say 1um distance), while tetrode-to-tetrode distance is much larger (e.g. 100um). For example for the first two tetrodes it would be (x,y):

[0, 0,
 0, 1,
 0, 2,
 0, 3,
 100, 0,
 100, 1,
 100, 2,
 100, 3, ...]

Once the sorting is done, you can compute the group of the unit by looking at the group of the maximum channel and set it as a property.

Let us know if this gives better results!

Feb 21 '22 07:02 alejoe91

Indeed, I wrote this function to troubleshoot things. Briefly, it would generate different channel maps. I was also experimenting with separating by shank (we have 2 brain locations with tetrode arrays). I also wanted to check if the regular positions that generate_tetrode() gives are in any way different from using made up coords (similar to your example).

from probeinterface import Probe, ProbeGroup
from probeinterface.plotting import plot_probe, plot_probe_group
from probeinterface import generate_tetrode

# this function can help generate probes the way we want it
def generate_custom_tetrodes(recording, shank_split=32, from_df = False, tetrode_split=True):
    
    total_tetrodes = int(len(recording.get_channel_ids())/4)
    
    if from_df:
        # We can generate this using probeinterface,
        import pandas as pd
        # in kilosort, Wilson Lab has been using this channel map,
        # x placement is codified as different for each wire of tetrode,
        xcoords = np.tile(np.arange(4) + 1, total_tetrodes)
        #y placement is the same for each wire of each tetrode, different between tetrodes,
        ycoords = np.repeat(np.arange(total_tetrodes) + 1, 4)
        # same kcoords for each tetrode
        kcoords = ycoords
        # chanMap,
        chanMap = np.arange(1,total_tetrodes*4 + 1)
        chanMap0ind = chanMap - 1,
        
        if shank_split is not None:
            # hard coded regions to 0 and 1 but whatever...
            shank_ids = [0] * total_tetrodes * 4
            # elements from shank_split onwards
            shank_ids[shank_split:] = [1] * (total_tetrodes * 4 - shank_split)
        else:
            shank_ids = ""
        
       
        # debug lengths
        #print(list(map(len, [probe_index, xcoords, ycoords, shank_ids])))
    
        # create the probe
        
        probe_df = pd.DataFrame({'x': xcoords, 
                                 'y': ycoords, 
                                 'contact_shapes': 'circle', 
                                 'radius': 6.0,
                                 # we can give IDs here!
                                 'shank_ids': shank_ids,
                                 'contact_ids': ''})
        if tetrode_split:
            probegroup = ProbeGroup()
            # if we need to split the tetrodes we need to add them one at a time
            for chann4 in range(0, total_tetrodes):
                tetrode = generate_tetrode()
                channels = tetrode.from_dataframe(probe_df.iloc[chann4:chann4+4, ])
                channels.set_device_channel_indices(np.arange(4) + chann4 * 4)
                probegroup.add_probe(channels)
        else:
            # generate from dataframe into one tetrode
            tetrode = generate_tetrode()
            channels = tetrode.from_dataframe(probe_df)
            channels.set_device_channel_indices(np.arange(total_tetrodes * 4))
            probegroup = ProbeGroup()
            probegroup.add_probe(channels)

    else:
        # Genearate default tetrodes
        probegroup = ProbeGroup()
        for i in range(total_tetrodes):
            tetrode = generate_tetrode()
            tetrode.set_device_channel_indices(np.arange(4) + i * 4)
            if shank_split is not None:
                if i * 4 < shank_split:
                    tetrode.set_shank_ids([0, 0, 0, 0])
                    #tetrode.set_channel_property(ch, property_name='brain_area', value='Hipp')
                else:
                    tetrode.set_shank_ids([1, 1, 1, 1])
                    #tetrode.set_channel_property(ch, property_name='brain_area', value='Hipp')
            probegroup.add_probe(tetrode)
    return probegroup

Using this, when I split by tetrode Kilosort fails just as before. I think the error is on the kilosort side, because it cannot handle tetrodes without enough spikes (error related to W matrix). Mountainsort4 handles this error by throwing a warning about matmult involving a divide by zero and moves on to the next tetrode. Shank splitting seems to not make a huge difference.

Beyond splitting by tetrode "as the correct thing to do", I wonder if it's possible sorters perform better (as in faster) due to not having to handle so much data, especially on the CPU only sorters.

Feb 21 '22 14:02 matiasandina

@matiasandina

Closing this for now, feel free to reopen or open another issue if you have other questions!

Oct 13 '22 11:10 alejoe91