It is impossible to see what device collected a measurement in the data stream #438

Open
bardram opened this issue Oct 31, 2024 · 4 comments
@bardram
Contributor

bardram commented Oct 31, 2024

Looking at the data stream, like:

  {
    "id": 5253,
    "data_stream_id": 639,
    "snapshot": {
      "syncPoint": {
        "synchronizedOn": "1970-01-01T00:00:00Z",
        "relativeClockSpeed": 1.0,
        "sensorTimestampAtSyncPoint": 0
      },
      "triggerIds": [
        10,
        14
      ],
      "measurements": [
        {
          "data": {
            "id": "f2307b9b-e401-4014-b2c7-68e05e2ca17c",
            "__type": "dk.cachet.carp.audio",
            "upload": true,
            "filename": "f2307b9b-e401-4014-b2c7-68e05e2ca17c.mp4",
            "metadata": {},
            "mediaType": "audio",
            "endRecordingTime": "2024-10-29T19:30:46.677579Z",
            "startRecordingTime": "2024-10-29T19:30:34.778037Z"
          },
          "sensorEndTime": 1730230246677579,
          "sensorStartTime": 1730230234778037
        },

there is no way to know what device collected the measurement or data item(s).

It is not part of the Trigger, since the protocol stores the device in the TaskControl.

How should we store this?

Maybe @Whathecode has thought about this?

bardram added the "help wanted" and "question" labels on Oct 31, 2024
@Whathecode
Member

Whathecode commented Oct 31, 2024

That's by design. To interpret the data, you need to look up the ids referenced (trigger ids) in the study protocol.
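
The lookup described here can be sketched roughly as follows. This is a minimal illustration, not CARP's own API: it assumes the protocol snapshot is available as parsed JSON with a `taskControls` list whose entries carry `triggerId` and `destinationDeviceRoleName`, and the role and task names are invented for the example.

```python
from typing import Any, Dict, List, Set


def device_roles_for_triggers(
    protocol: Dict[str, Any], trigger_ids: List[int]
) -> Dict[int, Set[str]]:
    """Map each trigger id to the device role names its task controls target."""
    roles: Dict[int, Set[str]] = {tid: set() for tid in trigger_ids}
    # Assumed field names: "taskControls", "triggerId", "destinationDeviceRoleName".
    for control in protocol.get("taskControls", []):
        tid = control.get("triggerId")
        if tid in roles:
            roles[tid].add(control["destinationDeviceRoleName"])
    return roles


# Hypothetical protocol excerpt; only the trigger ids 10 and 14 come from the sample.
protocol = {
    "taskControls": [
        {"triggerId": 10, "taskName": "Audio task",
         "destinationDeviceRoleName": "Primary Phone", "control": "Start"},
        {"triggerId": 14, "taskName": "Audio task",
         "destinationDeviceRoleName": "Primary Phone", "control": "Stop"},
    ]
}

print(device_roles_for_triggers(protocol, [10, 14]))
```

A trigger id that has no task control in the protocol maps to an empty set, which makes missing context easy to spot.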

@Whathecode
Member

Whathecode commented Nov 1, 2024

Actually, deviceRoleName should be there. The above comment is only true to get the full context (as in, know what type of device that is).

But, the data shown here is much different from e.g. a carp core data point: https://github.com/cph-cachet/carp.core-kotlin/blob/develop/carp.data.core%2Fsrc%2FcommonMain%2Fkotlin%2Fdk%2Fcachet%2Fcarp%2Fdata%2Fapplication%2FDataStreamPoint.kt

What am I looking at here?

The "snapshot" looks a bit like a DataStreamSequence. JSON schema: https://github.com/cph-cachet/carp.core-kotlin/blob/develop/rpc%2Fschemas%2Fdata%2FDataStreamSequence.json

That should have a dataStreamId: https://github.com/cph-cachet/carp.core-kotlin/blob/develop/rpc%2Fschemas%2Fdata%2FDataStreamId.json

Data stream id being an integer is definitely not carp core defined infrastructure, and neither is the "snapshot" property.
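
To make the mismatch concrete, here is a rough sketch of a check for whether a record carries the device role. It assumes that a core-style sequence has a `dataStream` object with `studyDeploymentId`, `deviceRoleName`, and `dataType`; those names should be verified against the linked schemas. The deployment id and role name below are placeholders.

```python
from typing import Any, Dict, Optional


def device_role_name(record: Dict[str, Any]) -> Optional[str]:
    """Return the device role name if the record (or its "snapshot") has one."""
    # Check the record itself, then the non-core "snapshot" wrapper from the export.
    for candidate in (record, record.get("snapshot", {})):
        data_stream = candidate.get("dataStream")
        if isinstance(data_stream, dict) and "deviceRoleName" in data_stream:
            return data_stream["deviceRoleName"]
    return None


# Shape of the exported record from this issue (measurements omitted).
exported = {
    "id": 5253,
    "data_stream_id": 639,
    "snapshot": {"triggerIds": [10, 14], "measurements": []},
}

# Assumed core-style shape; the id and role name are hypothetical placeholders.
core_style = {
    "dataStream": {
        "studyDeploymentId": "00000000-0000-0000-0000-000000000000",
        "deviceRoleName": "Primary Phone",
        "dataType": "dk.cachet.carp.audio",
    },
    "triggerIds": [10, 14],
    "measurements": [],
}

print(device_role_name(exported))
print(device_role_name(core_style))
```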

There are also minor things, like the start and end recording times of the audio, which are clear candidates for the default timestamps.
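
As a quick check of that overlap, the sketch below converts the sample's sensorStartTime and sensorEndTime to UTC, assuming they are microseconds since the Unix epoch. Under that assumption they reproduce startRecordingTime and endRecordingTime exactly.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def micros_to_utc(micros: int) -> datetime:
    """Convert microseconds since the Unix epoch to an aware UTC datetime."""
    # Integer timedelta arithmetic avoids float rounding at microsecond precision.
    return EPOCH + timedelta(microseconds=micros)


# Values copied from the sample measurement in this issue.
start = micros_to_utc(1730230234778037)
end = micros_to_utc(1730230246677579)

print(start.isoformat())  # 2024-10-29T19:30:34.778037+00:00
print(end.isoformat())    # 2024-10-29T19:30:46.677579+00:00
```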

And, as far as I know, __type needs to be the first defined field for polymorphic deserialization to work.

@bardram
Contributor Author

bardram commented Nov 15, 2024

> Actually, deviceRoleName should be there. The above comment is only true to get the full context (as in, know what type of device that is).
>
> But, the data shown here is much different from e.g. a carp core data point: https://github.com/cph-cachet/carp.core-kotlin/blob/develop/carp.data.core%2Fsrc%2FcommonMain%2Fkotlin%2Fdk%2Fcachet%2Fcarp%2Fdata%2Fapplication%2FDataStreamPoint.kt

What is this DataStreamPoint used for, @Whathecode? As far as I can see from the Kotlin code, this class is not used for anything in the domain model. We upload a list of DataStreamBatch, each consisting of a list of DataStreamSequence, each of which holds a list of Measurement. Nowhere is the DataStreamPoint used.

> What am I looking at here?

You are looking at a DataStreamSequence, which is part of a list in the DataStreamBatch.

> The "snapshot" looks a bit like a DataStreamSequence. JSON schema: https://github.com/cph-cachet/carp.core-kotlin/blob/develop/rpc%2Fschemas%2Fdata%2FDataStreamSequence.json
>
> That should have a dataStreamId: https://github.com/cph-cachet/carp.core-kotlin/blob/develop/rpc%2Fschemas%2Fdata%2FDataStreamId.json

Yes - this DataStreamId seems to be missing in the JSON.

> Data stream id being an integer is definitely not carp core defined infrastructure, and neither is the "snapshot" property.

Indeed - I will create an issue for CAWS for @yuanchen233 to look at - there is something very wrong here in the data export.

> There are also minor things, like the start and end recording times of the audio, which are clear candidates for the default timestamps.

Much of the data we collect on the phone gets its own UTC timestamp, which we save.

> And, as far as I know, __type needs to be the first defined field for polymorphic deserialization to work.

I hope not - the order of fields in JSON should not matter, and I can't control it in Dart.
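
For illustration only, here is a sketch of a decoder that does not depend on where __type appears: it parses the whole object first and then looks the discriminator up by key. The `DECODERS` registry and `decode` function are hypothetical and say nothing about how the Kotlin or Dart serializers actually behave; a streaming decoder that dispatches before reading the full object may well need the discriminator first.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical registry: maps a "__type" value to a function building the typed value.
DECODERS: Dict[str, Callable[[Dict[str, Any]], Any]] = {
    "dk.cachet.carp.audio": lambda fields: ("audio", fields["filename"]),
}


def decode(raw: str) -> Any:
    """Decode a polymorphic JSON object regardless of where "__type" appears."""
    obj = json.loads(raw)          # the whole object is parsed before dispatch
    type_name = obj.pop("__type")  # discriminator is found by key, not by position
    return DECODERS[type_name](obj)


type_first = (
    '{"__type": "dk.cachet.carp.audio", '
    '"filename": "f2307b9b-e401-4014-b2c7-68e05e2ca17c.mp4"}'
)
type_last = (
    '{"filename": "f2307b9b-e401-4014-b2c7-68e05e2ca17c.mp4", '
    '"__type": "dk.cachet.carp.audio"}'
)

print(decode(type_first) == decode(type_last))
```

The trade-off is that the full object must be held in memory before dispatch, which is usually acceptable for measurement-sized payloads.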
