spikeinterface noise caused by whitening

Dear Spikeinterface community

My recordings are from neurons with giant axons that have a high spontaneous firing rate (40 Hz, rising to 150 Hz in response to visual stimuli), so detecting their spikes is easy. However, when spike sorting, I ran into issues such as a waveform residual from one unit being clustered as a separate unit on other channels, or noise of all kinds being clustered as units. Here is an overview of spike sorting done by standalone kilosort (blue and red are waveform residuals from the same unit).

I read about preprocessing steps in https://github.com/SpikeInterface/spikeinterface/issues/2017 and #3483, and from my understanding whitening can decorrelate data but can also introduce noise (see this notebook for further info). Therefore, I was wondering whether optimising the whitening step can improve spike sorting performance, or whether whitening is suitable for this kind of neural signal.

I then compared the results of whitening for 4 different recording sessions at 4 different radii in local mode. Here are the code and the graphs:

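import spikeinterface.preprocessing as spre

r_range = [25, 50, 100, 150]  # whitening radii to compare, in um
for this_r in r_range:
    # recording_corrected is the motion-corrected recording from the earlier steps
    rec_w = spre.whiten(recording=recording_corrected, mode="local", radius_um=this_r, int_scale=200, dtype=float)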

In general, while some waveform residuals were removed, whitening appears to have introduced noise across channels. Could anyone suggest the best way to optimise the whitening parameters, or tell me whether whitening is suitable for my dataset?
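To see what the radius does in local mode, here is a minimal NumPy sketch of one way a local whitening matrix can be built: each channel is whitened using only the channels within `radius_um` of it. This is a simplified illustration, not necessarily spikeinterface's exact implementation. The helper `local_whitening_matrix`, the single-column 30 um geometry, and the synthetic noise are all assumptions made for the example.

```python
import numpy as np

def local_whitening_matrix(cov, positions, radius_um, eps=1e-8):
    """Hypothetical helper: whiten each channel using only neighbours within radius_um."""
    n = cov.shape[0]
    # Pairwise distances between channel positions (in um).
    distances = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=2)
    W = np.zeros((n, n))
    for c in range(n):
        inds = np.flatnonzero(distances[c] <= radius_um)
        cov_local = cov[np.ix_(inds, inds)]
        # ZCA whitening of the local covariance: cov_local^(-1/2).
        eigvals, eigvecs = np.linalg.eigh(cov_local)
        W_local = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
        # Keep only the row of the local matrix that belongs to channel c.
        W[inds, c] = W_local[inds == c].ravel()
    return W

# Assumed geometry: one column of 8 contacts with 30 um pitch (not the real H10 layout).
positions = np.column_stack([np.zeros(8), 30.0 * np.arange(8)])

# Synthetic correlated noise: one shared component plus private noise per channel.
rng = np.random.default_rng(0)
noise = 0.7 * rng.normal(size=(20_000, 1)) + rng.normal(size=(20_000, 8))
cov = np.cov(noise, rowvar=False)

matrices = {}
neighbours = {}
for radius_um in [25, 50, 100, 150]:
    matrices[radius_um] = local_whitening_matrix(cov, positions, radius_um)
    # Number of channels that contribute to the whitened version of channel 0.
    neighbours[radius_um] = int(np.count_nonzero(matrices[radius_um][:, 0]))

print(neighbours)
```

On this assumed geometry a 25 um radius is smaller than the channel pitch, so each channel has no neighbours and local whitening reduces to per-channel scaling. Whether the same happens on a real probe depends on its actual contact positions.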

Note1: setting int_scale=200 is meant to replicate the kilosort behaviour. I have tried removing that parameter, but the issue still exists.

Note2: the channels were sorted by their ID, not by depth or location.

Note3: these acute recordings (typically around 1 hour) were done with a CambridgeNeurotech H10 probe (32 channels per shank spanning 330 um in depth, with 18.5 um horizontal and 30 um vertical spacing) in head-fixed insects.

Note4: my preprocessing runs in this order: bandpass filter (600-6000 Hz), bad channel detection, common median reference, motion correction with DREDge, then whitening.

Oct 04 '25 11:10 chiyu1203

Not quite sure who I should punt this to. @samuelgarcia or @alejoe91, any ideas here? I would say most spike sorters these days have converged on the necessity of whitening for the reasons you listed. Picking up residuals as additional events is a common problem, and I don't think whitening alone should change that, although maybe Sam or Alessio think differently.

Oct 04 '25 18:10 zm711

@zm711 thank you so much for the feedback! The fundamental problem underlying my issue is that I do not know the best way to evaluate spike sorting performance, or whether there are still ways to improve it. "A waveform residual from one unit being clustered as a separate unit at other channels" is not a big problem in itself, but it increases the number of putative units for manual curation. At the moment, I have to go through 60-100 putative units for a region where I think the maximum number of units should be around 6 (normally 1 or 2 per session). In addition to the plots in my original post, I have plotted the covariance matrix across channels for the trace shown in the first dataset (first figure).

I guess applying whitening is not a bad idea, but I just do not know whether I should try tuning some of the whitening parameters to get the best spike sorting performance.

Oct 08 '25 12:10 chiyu1203

Frankly, I am not very fond of whitening as a preprocessing step. It can improve results, but it also has many traps.

And of course, as you said, whitening does not denoise; it decorrelates the noise.

Something important to know is that the snippets used to compute the covariance are randomly selected. So when you have a high spike rate, you will capture many spikes, and the covariance will capture not the "noise covariance" but the "spike covariance". In the latter case, the main effect is to concentrate the main signal of a spike onto one channel. Intuitively, we could lose spatial information, and that should be bad for spike sorting (this is my intuition).
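The effect of spikes on the covariance estimate can be illustrated with a small NumPy sketch. It builds a whitening matrix once from noise-only data and once from data that also contains frequent spikes, then compares how each matrix transforms the spike footprint. Everything here is synthetic and assumed for illustration: the helper `zca_whitening_matrix`, the 4-channel footprint, and the 10% spike occupancy are not taken from spikeinterface or from the recordings discussed in this thread.

```python
import numpy as np

def zca_whitening_matrix(cov, eps=1e-8):
    """Hypothetical helper: return cov^(-1/2) (ZCA whitening) for a channel covariance."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    return eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T

rng = np.random.default_rng(0)
n_samples, n_channels = 50_000, 4

# Synthetic correlated background noise: a shared component plus private noise.
shared = rng.normal(size=(n_samples, 1))
noise = 0.7 * shared + rng.normal(size=(n_samples, n_channels))

# Synthetic "spikes": a fixed spatial footprint present on about 10% of samples.
footprint = np.array([8.0, 4.0, 1.0, 0.0])  # assumed amplitude per channel
spike_mask = rng.random(n_samples) < 0.10
traces = noise + spike_mask[:, None] * footprint

# Covariance estimated without and with spikes in the data.
cov_noise = np.cov(noise, rowvar=False)
cov_traces = np.cov(traces, rowvar=False)

W_noise = zca_whitening_matrix(cov_noise)
W_traces = zca_whitening_matrix(cov_traces)

# Whitening with the noise-only estimate makes the noise covariance close to identity.
cov_after = np.cov(noise @ W_noise, rowvar=False)

# The same spike footprint seen through each whitening matrix.
footprint_noise_w = footprint @ W_noise
footprint_traces_w = footprint @ W_traces

print("footprint, noise-based whitening:", np.round(footprint_noise_w, 2))
print("footprint, spike-contaminated whitening:", np.round(footprint_traces_w, 2))
```

In this toy setup the footprint comes out smaller after whitening with the spike-contaminated covariance, because that whitening matrix treats the spike's own spatial pattern as correlation to be flattened. How strongly this happens on real data depends on the firing rate and on how the covariance chunks are chosen.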

But with @yger we have run many benchmarks of several sorting steps (peak detection, clustering, template matching), and whitening generally improves the method itself. For instance, template matching gives better results with whitening (except with the tdc-peeler...).

kilosort4 certainly works better with whitening (it is activated by default).

Also something important: you apparently have a lower SNR on the main channel after whitening, but keep in mind that all the information is now on one channel, and templates are no longer multi-channel after whitening.

Oct 10 '25 14:10 samuelgarcia

@samuelgarcia thanks for the feedback! That's exactly what I suspected. I was trying to sort a neuron with a high firing rate (40 Hz, rising to 150 Hz in response to visual stimuli). The covariance matrix I showed in my previous post comes from that 0.1-s snippet, so it was not randomly selected. In this case, how should I find a good radius range for the whitening step? And should I apply whitening through kilosort4 or spykingcircus2 themselves when running them from spikeinterface, rather than with the preprocessing module of spikeinterface?

And sorry, I don't understand the last part: is it a good thing for spike sorting that all the information is now on one channel and templates are no longer multi-channel after whitening?

Oct 10 '25 15:10 chiyu1203