-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mismatch in event timestamp length #700
Comments
I just checked another session and it's also off by two:
|
@matchings are those timestamps inherited directly from the array that @ledochowitsch is saving to disk? |
@dougollerenshaw Yes. For now, don't use the timestamps in the events df. Use dataset.ophys_timestamps. Those are the ground truth from SDK. I am not sure what could be causing the timestamps in the event detection output to be off. |
More evidence that the extra two timestamps at the end are extraneous:
Plotting with last two timestamps trimmed off:
gives us two aligned traces: But trimming off the first two gives us misaligned traces:
|
i believe the timestamps Doug is talking about are the npz['ts'] ones, not the ones in the event_dict. |
Thanks @ledochowitsch. Sorry for being unclear about the underlying issue. I was initially struggling to understand it myself so this issue got a little muddied. But the fundamental issue is this: the timestamps associated with the events are two values longer than the events arrays themselves. For the same oeid I initially referenced in this issue, here's what happens when I go back to the cached events file:
Get the length of the timestamps array:
Get the length of an events trace:
Above you said:
When I go back to the directly to the SDK, I get this:
But you went on to say:
Checking that myself, I see:
So it'd seem that the |
Ah, I see... That’s frustrating:/.
The good news is that the contents of npz[‘events’] will be unaffected by this issue. However, the events time stamps will be off by the same two samples…
…-Peter
From: Doug Ollerenshaw <[email protected]>
Date: Tuesday, December 22, 2020 at 3:48 PM
To: AllenInstitute/visual_behavior_analysis <[email protected]>
Cc: Peter Ledochowitsch <[email protected]>, Mention <[email protected]>
Subject: Re: [AllenInstitute/visual_behavior_analysis] mismatch in event timestamp length (#700)
Thanks @ledochowitsch<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fledochowitsch&data=04%7C01%7C%7Cb7b765066ca84b5e4ff408d8a6d422fb%7C32669cd6737f4b398bddd6951120d3fc%7C0%7C0%7C637442777354408371%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=ZgLQwOfTjPxjyki3EGk1qXA591cQbt8SeZr%2FR2sEJbs%3D&reserved=0>. Sorry for being unclear about the underlying issue. I was initially struggling to understand it myself so this issue got a little muddied.
But the fundamental issue is this: the timestamps associated with the events are two values longer than the events arrays themselves.
For the same oeid I initially referenced in this issue, here's what happens when I go back to the cached events file:
import numpy as np
events_file = '//allen/programs/braintv/workgroups/nc-ophys/visual_behavior/event_detection/953443028.npz'
f = np.load(events_file, allow_pickle=True)
Get the length of the timestamps array:
len(f['ts'])
140014
Get the length of an events trace:
event_dict = f['event_dict'].item()
cell_roi_ids = list(event_dict.keys())
len(event_dict[cell_roi_ids[0]]['event_trace'])
140012
Above you said:
there is npz[‘ts’], which should be identical to what you get from the SDK
When I go back to the directly to the SDK, I get this:
from allensdk.brain_observatory.behavior.behavior_ophys_session import BehaviorOphysSession
oeid = 953443028
session = BehaviorOphysSession.from_lims(oeid)
len(session.ophys_timestamps)
140012
But you went on to say:
...because it’s just the result of
dataset = loading.get_ophys_dataset(eval(exp_id), include_invalid_rois=False)
ts = dataset.timestamps.ophys_frames.values[0]
Checking that myself, I see:
from visual_behavior.data_access import loading
dataset = loading.get_ophys_dataset(oeid, include_invalid_rois=False)
len(dataset.timestamps.ophys_frames['timestamps'])
140014
So it'd seem that the dataset.timestamps.ophys_frames['timestamps'] attribute is the source of the confusion here. @matchings<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fmatchings&data=04%7C01%7C%7Cb7b765066ca84b5e4ff408d8a6d422fb%7C32669cd6737f4b398bddd6951120d3fc%7C0%7C0%7C637442777354418365%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=ZSbaUfWGQq0FMMcvQ5sOoFVjLPKDxZ91ZwudNSCKJ3A%3D&reserved=0>, do you know where that attribute is coming from and why it would be two elements longer than ophys_timestamps?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FAllenInstitute%2Fvisual_behavior_analysis%2Fissues%2F700%23issuecomment-749839540&data=04%7C01%7C%7Cb7b765066ca84b5e4ff408d8a6d422fb%7C32669cd6737f4b398bddd6951120d3fc%7C0%7C0%7C637442777354418365%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=M7FILrl55SbMxpX3c%2Br4cmXGHEuGCGsAb4WS4ivFiZA%3D&reserved=0>, or unsubscribe<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FABVLHVWS6QH6SPUKRLN7AHDSWEV6HANCNFSM4VGEZ56A&data=04%7C01%7C%7Cb7b765066ca84b5e4ff408d8a6d422fb%7C32669cd6737f4b398bddd6951120d3fc%7C0%7C0%7C637442777354418365%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=ZGTdRXsyBEGGz0NnbTIaH%2FbL3wR%2FmQEQDE04O%2Fd32W0%3D&reserved=0>.
|
dataset.timestamps.ophys_frames['timestamps'] are computed directly from the sync file, and these are what is used to create dataset.ophys_timestamps for mesoscope experiments because the SDK does not yet do the proper time resampling for mesoscope (or at least it didnt in the version we are using). For scientifica, dataset.ophys_timestamps is pulled directly from the SDK. If the SDK is doing some truncation of frames, it could lead to a discrepancy between dataset.ophys_timestamps and dataset.timestamps.ophys_frames['timestamps']. But that should be specific to Scientifica, because mesoscope uses the same thing for both. I hope that makes sense... |
Thanks @matchings. It looks like you're correct that this is specific to scientifica sessions. Here's an example from mesoscope showing that both the SDK ophys_timestamps and the VBA dataset ophys_frames['timestamps'] vectors are the same length: And here's a different 2P3 (scientifica) session with the same off-by-two error as above: So does this mean that the problem is with the SDK? If so, we should submit an SDK issue to solve it. These discrepancies will undoubtedly confuse other users in the future. |
im guessing that the SDK truncates the timestamps to match the ophys traces, which is probably a desired behavior, otherwise we would have mismatches all over the place. i believe the scientificas are known to give out a few extra TTL pulses at the end of the session (or at least MPE says its at the end, its nice that you just validated that here), which we want to remove so that everything is aligned. it surprises me that you are always seeing an off by exactly 2 issue though, because i thought the number of those extra pulses at the end was variable. |
I'm always paranoid about convolutions. Should the mode be "valid" instead of "full" (default)? visual_behavior_analysis/visual_behavior/ophys/response_analysis/response_processing.py Line 422 in 0b07d46
https://numpy.org/doc/stable/reference/generated/numpy.convolve.html |
The length of the timestamp array in the dataset events dataframe does not match the length of the filtered_events array.
For example:
len(dataset.events.iloc[0]['filtered_events'])
Note that the length of the ophyst_timestamps attribute matches the length of the filtered_events attribute, so it would seem that the 'timestamps' attribute of the events dataframe is the outlier.
len(dataset.ophys_timestamps)
The text was updated successfully, but these errors were encountered: