Output files MATLAB
These are uploaded to the lab's datastore and saved in the folder of the specific recording.
Firings0 - This is for the dataset where all channels are whitened together and then tetrodes are sorted separately
Firings1 - This is for the dataset where tetrodes are whitened separately and then sorted separately
http://mountainsort.readthedocs.io/en/latest/first_sort.html?highlight=output%20file
These are MATLAB files, created by the post-processing MATLAB script.
datasave_all.mat
datasave_separate.mat
This script runs on a folder that may contain multiple recording folders, and saves its output to the top level of that folder. It uses the Firings0 and Firings1 files saved by the post-clustering script.
data_all.csv - this is based on tetrode data where all channels were whitened together
data_separate.csv - this is based on tetrode data that was whitened tetrode by tetrode
In these spreadsheets, each row represents a cluster. In the next section, I will define what the columns of this spreadsheet are.
animal - name of animal taken from the first part of the folder name, [M6]
day - date of the recording based on folder name, [12/03/2018]
tetrode - ID of tetrode where the cluster was recorded, [3]
cluster - ID of cluster on given tetrode and day [10]
nspikes - number of spikes that belong to the cluster from the whole session [278960]
coverage - percentage of the surface area of the open field arena that the animal covered during the exploration
The arena is binned to 2.5 cm * 2.5 cm squares, and then based on the position information, the percentage of bins visited by the animal is calculated.
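The coverage calculation described above can be sketched as follows. This is a minimal Python illustration, not the MATLAB script itself: the function name `coverage_percent`, its arguments, and the treatment of positions on the far wall are assumptions made for the example.

```python
def coverage_percent(xs, ys, arena_w_cm, arena_h_cm, bin_cm=2.5):
    """Percentage of arena bins that the animal visited at least once.

    xs, ys: position samples in cm, measured from one corner of the arena
    (hypothetical input format). bin_cm: bin edge length, 2.5 cm as in the text.
    """
    n_cols = int(arena_w_cm // bin_cm)
    n_rows = int(arena_h_cm // bin_cm)
    visited = set()
    for x, y in zip(xs, ys):
        # Samples exactly on the far wall are assigned to the last bin.
        col = min(int(x // bin_cm), n_cols - 1)
        row = min(int(y // bin_cm), n_rows - 1)
        # Samples outside the arena (negative coordinates) are ignored.
        if col >= 0 and row >= 0:
            visited.add((row, col))
    return 100.0 * len(visited) / (n_rows * n_cols)
```

For example, in a 5 cm x 5 cm arena (four bins), positions that fall in two different bins give a coverage of 50%.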
avgFR - average firing rate
The total number of spikes (within the cluster) is divided by the total recording time.
maxamplitude - This is the highest out of the 4 average amplitudes of the tetrode
maxchannel - the channel of the tetrode (1-4) that has the highest average amplitude
spikewidth - peak-to-trough width of the mean waveform, given in number of samples. The mean waveform is taken from the channel with the highest-amplitude spike.
function [frh_hd,meandir_hd,r_hd]=plothd(hd,spkhd,sampling_rate,subplots)
Parameters:
hd = a vector calculated in readBonsai for each unit of time as the angle (degrees) of the line between two beads attached to the headstage. It is an output from get_position_data, which extracts it from the raw data with GetPostSyncedCorr, which uses readBonsai to extract data. Bonsai records positions of two beads on the headstage of the animal.
spkhd = a vector of values of hd for times when the cell fired.
sampling_rate = the sampling rate.
subplots = indicates coordinates on the output figure where subplots should go.
frh_hd is calculated from frh, which is the polar histogram of the heading direction corresponding to firing of each spike normalized to the overall distribution of heading directions. It's the heading direction (degrees) at which the firing rate is maximal.
meandir_hd is the same as frh_hd.
r_hd is the radius of the preferred head direction.
These are calculated in plothd.m and polat_hist.m and definitely need double checking. These are the three values (Peak, PFD, |r|) that are written above the HD plot on the output figure.
HD_maxFR - maximum firing rate of the cell among the firing rates calculated for each angle the animal's head can face. This is calculated by dividing the polar histogram of spikes of the cluster by the polar histogram of how much time the animal spent looking each way (spike_hist/head_dir_hist). Firing rate is given in Hz (1/s). This is called frh_hd in plothd.m.
meanHD - the mean direction the animal was facing when the cell fired. This is the 'preferred direction'. It is given in degrees. This is called meandir_hd in plothd.m.
r_HD - radius of head-direction histogram when the cell fired at preferred direction. This represents how much the mouse was facing that way while the cell fired relative to the whole recording session. x and y components of the firing vectors in the polar plot are calculated (sin and cos alpha are used in a unit circle). All x and y components of the polar plot are added together normalized to firing rate, and then 'r' is calculated by applying the Pythagorean theorem. This number is between 0 and 1, the higher the more direction specific the firing.
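The three head-direction values can be illustrated with a short Python sketch. It is not a translation of plothd.m: the 6-degree bin width, the function name `hd_tuning`, and the choice to weight each bin's unit vector by its firing rate are assumptions based on the description above.

```python
import math

def hd_tuning(hd_deg, spk_hd_deg, sampling_rate, bin_deg=6):
    """Return (peak rate in Hz, mean direction in degrees, vector length r).

    hd_deg: head direction at every position sample (degrees).
    spk_hd_deg: head direction at every spike (degrees).
    bin_deg: assumed angular bin width; must divide 360.
    """
    n_bins = 360 // bin_deg
    dwell = [0] * n_bins
    spikes = [0] * n_bins
    for a in hd_deg:
        dwell[int(a % 360 // bin_deg) % n_bins] += 1
    for a in spk_hd_deg:
        spikes[int(a % 360 // bin_deg) % n_bins] += 1
    # Firing rate per bin: spike count divided by time spent facing that way.
    rate = [s / (d / sampling_rate) if d > 0 else 0.0
            for s, d in zip(spikes, dwell)]
    total = sum(rate)
    if total == 0:
        return 0.0, float("nan"), 0.0
    centres = [math.radians((i + 0.5) * bin_deg) for i in range(n_bins)]
    # Sum the x and y components of the rate-weighted unit vectors.
    x = sum(r * math.cos(c) for r, c in zip(rate, centres))
    y = sum(r * math.sin(c) for r, c in zip(rate, centres))
    r_len = math.hypot(x, y) / total            # between 0 and 1
    mean_dir = math.degrees(math.atan2(y, x)) % 360
    return max(rate), mean_dir, r_len
```

A cell that fires in a single angular bin gives r close to 1, while a cell that fires equally in all directions gives r close to 0.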
skaggs - according to Bri's thesis, this is the 'spatial information score' published by Skaggs, McNaughton and Markus (1993). It represents the amount of information about the location of the animal that is encoded in each spike.
$\sum_i p_i \frac{L_i}{L} \log_2 \frac{L_i}{L}$,
where $L_i$ is the avg firing rate of a unit in the i-th bin, $L$ is the overall avg firing rate, and $p_i$ is the probability of the animal being in the i-th bin (dwell time in i-th bin/total recording time).
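As a worked illustration of the formula above, here is a minimal Python sketch. The function name and the flat-list input format are assumptions. The overall rate L is computed as the dwell-weighted mean of the bin rates, which equals total spikes divided by total time when each bin rate is spikes divided by dwell time.

```python
import math

def skaggs_information(rate_map, dwell_time):
    """Spatial information in bits per spike.

    rate_map: average firing rate in each bin (Hz), as a flat list.
    dwell_time: time spent in each bin (s), same order as rate_map.
    """
    total_time = sum(dwell_time)
    p = [t / total_time for t in dwell_time]       # occupancy probability
    mean_rate = sum(pi * li for pi, li in zip(p, rate_map))
    if mean_rate == 0:
        return 0.0
    info = 0.0
    for pi, li in zip(p, rate_map):
        if li > 0:                                 # 0 * log2(0) is taken as 0
            ratio = li / mean_rate
            info += pi * ratio * math.log2(ratio)
    return info
```

A cell that fires at the same rate everywhere carries 0 bits per spike. A cell that fires in only one of two equally visited bins carries 1 bit per spike.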
sparsity - quantifies how spatially sparse the firing of the cell is (Jung et al. 1994, J Neurosci.). Note that the script says: 'Warning this may all be incorrect'.
spatialcoherence - spatial coherence is calculated as described in Kubie et al. 1990, J Neurosci. (http://www.jneurosci.org/content/jneuro/9/12/4101.full.pdf). The Fisher r-to-Z' transform is used to normalise the correlation; values of ±1.96 mark the 95% confidence interval. Spatial coherence is an estimate of firing pattern quality. The third estimate of orderliness of the spatial firing distribution is a first-order autocorrelation that will be called 'coherence'. Coherence is the z-transform of the correlation between a list of firing rates in each pixel and a corresponding list of firing rates averaged over the 8 nearest neighbors of each pixel. Coherence measures the extent to which the firing rate in a pixel is predicted by the rates of its neighbors, and therefore estimates the local orderliness of the spatial firing pattern.
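The coherence measure described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the lab script: pixels on the edge of the map simply use the neighbours that exist (fewer than 8), unvisited pixels are not treated specially, and the function name is hypothetical.

```python
import math

def spatial_coherence(rate_map):
    """Return (r, z): the Pearson correlation between each pixel's rate and
    the mean rate of its neighbours, and its Fisher z-transform.

    rate_map: 2D list of firing rates, at least 2 x 2 pixels.
    """
    rows, cols = len(rate_map), len(rate_map[0])
    centre, neigh = [], []
    for r in range(rows):
        for c in range(cols):
            vals = [rate_map[rr][cc]
                    for rr in range(max(r - 1, 0), min(r + 2, rows))
                    for cc in range(max(c - 1, 0), min(c + 2, cols))
                    if (rr, cc) != (r, c)]
            centre.append(rate_map[r][c])
            neigh.append(sum(vals) / len(vals))
    n = len(centre)
    mc, mn = sum(centre) / n, sum(neigh) / n
    cov = sum((a - mc) * (b - mn) for a, b in zip(centre, neigh))
    var_c = sum((a - mc) ** 2 for a in centre)
    var_n = sum((b - mn) ** 2 for b in neigh)
    if var_c == 0 or var_n == 0:
        return float("nan"), float("nan")     # flat map: correlation undefined
    r_val = cov / math.sqrt(var_c * var_n)
    # Fisher transform; atanh is undefined at exactly +1 or -1.
    z = math.atanh(r_val) if abs(r_val) < 1 else math.copysign(math.inf, r_val)
    return r_val, z
```

A smooth rate map gives a correlation close to 1, while a checkerboard-like map, where neighbours predict the opposite of each pixel, gives a negative value.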
maxFRspatial - firing rate is calculated for each bin of the open field, and the highest number is taken out of these
gridscore - we could also output grid spacing, field size, grid orientation, and ellipticity, but these are not saved now
A script which analyses grid cell autocorrelograms and outputs several commonly used grid field measures.
Input:
amap = the autocorrelation matrix for processing
binsize = the bin size (cm) used to create the original firing rate map
Output:
Grid score:
Defined by Krupic, Bauza, Burton, Barry, O'Keefe (2015) as the difference between the minimum correlation coefficient for autocorrelogram
rotations of 60 and 120 degrees and the maximum correlation coefficient for autocorrelogram rotations of 30, 90 and 150 degrees.
This score can vary between -2 and 2, although generally values below -1.5 or above 1.5 are uncommon
Grid spacing/wavelength:
Defined by Hafting, Fyhn, Molden, Moser, Moser (2005) as the distance from the central autocorrelogram peak to the vertices of the inner
hexagon in the autocorrelogram (the median of the six distances).
This should be in cm.
Field Size:
Defined by Wills, Barry, Cacucci (2012) as the square root of the area of the central peak of the autocorrelogram divided by pi.
This should be in cm2
Grid orientation:
Defined by Hafting, Fyhn, Molden, Moser, Moser (2005) as the angle between a camera-defined reference line (0 degrees or x axis)
and a vector to the nearest vertex of the inner hexagon in the counterclockwise direction
This is in degrees and can vary between 0 and 59 (after 59 a new field should emerge at 0 if it's a grid cell);
Ellipticity/eccentricity:
As measured by Krupic, Bauza, Burton, Barry, O'Keefe (2015) by fitting an ellipse to the six central peaks of the local spatial
autocorrelogram using a least squares method. Eccentricity e was used as a measure of ellipticity (with 0 indicating a perfect
circle): e = sqrt(1 - (b^2/a^2)) where a and b are the major and minor axis lengths respectively
This varies between 0 and 1; 0 is the ellipticity of a perfect circle, 1 is the ellipticity of a parabola
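Two of the measures defined above reduce to simple arithmetic once the rotated-autocorrelogram correlations and the fitted ellipse axes are available. The Python sketch below shows only that final step. Computing the rotations and fitting the ellipse are not shown, and the function names and the dictionary input format are assumptions.

```python
import math

def grid_score(corr_by_angle):
    """Grid score from correlations between the autocorrelogram and its
    rotated copies.

    corr_by_angle: dict mapping rotation angle in degrees (30, 60, 90, 120,
    150) to the correlation coefficient at that rotation.
    """
    on_peak = min(corr_by_angle[60], corr_by_angle[120])
    off_peak = max(corr_by_angle[30], corr_by_angle[90], corr_by_angle[150])
    return on_peak - off_peak

def eccentricity(a, b):
    """Eccentricity e = sqrt(1 - b^2/a^2) of an ellipse with axis lengths
    a and b; the longer of the two is treated as the major axis."""
    if b > a:
        a, b = b, a
    return math.sqrt(1 - (b ** 2) / (a ** 2))
```

For example, correlations of 0.8 and 0.7 at 60 and 120 degrees with a largest off-peak correlation of -0.1 give a grid score of 0.8, and an ellipse with axis lengths 5 and 3 has an eccentricity of 0.8.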
These values are calculated in the same way as above, but only from data recorded when the mouse was not stationary (running):
skaggsrun
sparsrun
spatialcoherencerun
maxFRspatialrun
gridscorerun
This analysis is based on Kvitsiani et al (2013) and uses SALT analysis
SALT Stimulus-associated spike latency test. [P I] = SALT(SPT_BASELINE,SPT_TEST,DT,WN) calculates a modified version of Jensen-Shannon divergence (see Endres and Schindelin, 2003) for spike latency histograms.
Input arguments:
SPT_BASELINE - Discretized spike raster for stimulus-free baseline
period. N x M binary matrix with N rows for trials and M
columns for spikes. Spike times have to be converted to a
binary matrix with a temporal resolution provided in DT. The
baseline segment has to exceed the window size (WN) multiple
times, as the length of the baseline segment divided by the
window size determines the sample size of the null
distribution (see below).
SPT_TEST - Discretized spike raster for test period, i.e. after
stimulus. N x M binary matrix with N rows for trials and M
columns for spikes. Spike times have to be converted to a
binary matrix with a temporal resolution provided in DT. The
test segment has to exceed the window size (WN). Spikes out of
the window are disregarded.
DT - Time resolution of the discretized spike rasters in seconds.
WN - Window size for baseline and test windows in seconds
(optional; default, 0.01 s).
Output arguments:
P - Resulting P value for the Stimulus-Associated spike Latency
Test.
I - Test statistic, difference between within baseline and
test-to-baseline information distance values.
Briefly, the baseline binned spike raster (SPT_BASELINE) is cut to non-overlapping epochs (window size determined by WN) and spike latency histograms for first spikes are computed within each epoch. A similar histogram is constructed for the test epoch (SPT_TEST). Pairwise information distance measures are calculated for the baseline histograms to form a null-hypothesis distribution of distances. The distances of the test histogram and all baseline histograms are calculated and the median of these values is tested against the null-hypothesis distribution, resulting in a p value (P).
Reference: Endres DM, Schindelin JE (2003) A new metric for probability distributions. IEEE Transactions on Information Theory 49:1858-1860.
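The two building blocks of this test, first-spike latency histograms and an information distance between them, can be sketched in Python. This is a simplified illustration and not the SALT implementation: it uses the standard Jensen-Shannon divergence rather than the modified version that SALT uses, it places trials without a spike in an extra final bin (an assumption), and the function names are hypothetical.

```python
import math

def first_spike_latency_hist(raster, n_bins):
    """Normalised histogram of first-spike latencies.

    raster: list of trials, each a list of 0/1 values at resolution DT.
    n_bins: window size in samples (WN / DT). Spikes after the window are
    disregarded; trials with no spike in the window are counted in an extra
    last bin (assumption of this sketch).
    """
    counts = [0] * (n_bins + 1)
    for trial in raster:
        window = trial[:n_bins]
        idx = window.index(1) if 1 in window else n_bins
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]

def js_divergence(p, q):
    """Standard Jensen-Shannon divergence (base 2) between two histograms.
    It is 0 for identical histograms and 1 for non-overlapping ones."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(u, v):
        return sum(a * math.log2(a / b) for a, b in zip(u, v) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

In the full test, distances like this are computed between all pairs of baseline histograms to form the null distribution, and the median distance between the test histogram and the baseline histograms is compared against it.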
lightscoreP - Resulting P value for the Stimulus-Associated spike Latency Test.
lightscoreI - Test statistic, difference between within baseline and test-to-baseline information distance values.
lightlatency - latency of responses
percentresponse - % of stimulation trials that had a response
lightscore_p2
lightscore_I2
lightlatency2
percentresponse2
lightscore_p3
lightscore_I3
lightlatency3
percentresponse3
lightscore_p4
lightscore_I4
lightlatency4
percentresponse4
cluster - cluster ID is repeated for readability
goodcluster - 1 if the cluster passed curation criteria, 0 if not
firing_rate - average firing rate repeated for readability
FRpass - 1 if the firing rate is high enough for analysis, 0 if not
isolation - The isolation metric quantifies how well separated (in feature space) the cluster is from other nearby clusters. Clusters that are not well separated from others would be expected to have high false-positive and false-negative rates due to mixing with overlapping clusters. This quantity is calculated in a nonparametric way based on nearest-neighbor classification.
isolationpass - 1 if cluster passed isolation criteria, 0 otherwise
noiseoverlap - Noise overlap estimates the fraction of “noise events” in a cluster, i.e., above-threshold events not associated with true firings of this or any of the other clustered units. A large noise overlap implies a high false-positive rate. The procedure first empirically computes the expected waveform shape for noise events that have by chance crossed the detection threshold. It assesses the extent of feature space overlap between the cluster and a set of randomly selected noise clips after correcting for this expected noise waveform shape.
The noise overlap and isolation metrics vary between 0 and 1, and in a sense, represent the fraction of points that overlap either with another cluster (isolation metric) or with the noise cluster (noise overlap metric). However, they should not be interpreted as a direct estimate of the misclassification rate but should rather be considered to be predictive of this quantity. Indeed, due to the way they are computed, these values will depend on factors such as the dimensionality of the feature space and the noise properties of the underlying data. Therefore, the annotation thresholds should be chosen to suit the application. With that said, in this study we used the same sorting parameters and annotation thresholds for all analyses.
noiseoverlappass - 1 if cluster passed noise overlap criteria, 0 otherwise
peakSNR - Depending on the nature of signal contamination in the dataset, some clusters may consist primarily of high-amplitude artifactual signals such as those that arise from movement, muscle, or other non-neural sources. In this case, the variation among event voltage clips will be large compared with clusters that correspond to neural units. To automatically exclude such clusters we compute cluster SNR, defined as the peak absolute amplitude of the average waveform divided by the peak SD. The latter is defined as the SD of the aligned clips in the cluster, taken at the channel and time sample where this quantity is largest.
peakSNRpass - 1 if cluster passed peak signal to noise criteria, 0 otherwise
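The peakSNR definition (peak absolute amplitude of the average waveform divided by the peak SD) can be illustrated with a short Python sketch. The input format, the function name, and the use of the sample standard deviation (n - 1 in the denominator) are assumptions; the sorting software may compute the SD differently.

```python
import math

def peak_snr(clips):
    """Cluster SNR from aligned event clips.

    clips: list of at least two clips, each a flat list of voltage samples
    with all channels concatenated, so that an index corresponds to one
    channel and time sample.
    """
    n = len(clips)
    n_samples = len(clips[0])
    mean_wave = [sum(c[i] for c in clips) / n for i in range(n_samples)]
    # SD across clips at every channel/time sample.
    sd = [math.sqrt(sum((c[i] - mean_wave[i]) ** 2 for c in clips) / (n - 1))
          for i in range(n_samples)]
    peak_amp = max(abs(v) for v in mean_wave)   # peak of the average waveform
    peak_sd = max(sd)                           # sample where SD is largest
    return peak_amp / peak_sd
```

Clusters made of highly variable artifact clips have a large peak SD relative to their mean waveform, so they score low on this measure.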
burstingparent - this may be the ID of another cluster that this cluster might be part of but couldn't be sorted because of bursting (??)
We should consider renaming several of these output files and moving them to a subfolder within the recording folder.