Add metadata property to itk.Image and populate it from file metadata #394

Closed
thewtex opened this issue Feb 8, 2021 · 8 comments

Comments

@thewtex
Member

thewtex commented Feb 8, 2021

E.g. DICOM tags

@agirault
Collaborator

agirault commented Feb 8, 2021

Based on work in paraview-medical (cc: @floryst), and example in #252

@floryst
Member

floryst commented Feb 8, 2021

Some thoughts:

  • DICOM tags are associated with DICOM objects. When the objects are slices, certain tags (e.g. RescaleSlope and RescaleIntercept) can vary from slice to slice. We would want to allow requesting the DICOM tags for a particular object.
  • There can be a lot of tags, and I don't know if we want to incur the performance penalty (if significant; needs testing) of reading all tags up-front. We could make such a metadata property into a getDicomTag method, but we still need to choose what DICOM object to read from.
  • If we read a single slice, then metadata makes more sense than if we constructed a volume.

One idea is to have a function like so: readDICOMTags(dicomFile, [...tagNames]). This is likely the smallest useful primitive for reading DICOM tags. The drawback is that the DICOM header gets parsed twice when the user also wants to render that DICOM as part of a volume. I suppose that shouldn't be too bad, considering that ITK's DicomIO.ReadInformation (I think that's what it's called) only reads the DICOM header.

If we want a property off of itk.Image, maybe it can be something like image.getDICOMTag(slice#, [...tagNames]). If the itk.Image is a volume, then this assumes we have a mapping from volume (Z) slice to dicom object, unless we extract all the tags per slice during volume construction. If the itk.Image is a slice, then all we need is image.getDICOMTags(...tagNames).

The classic use-case: User wants to scroll through a DICOM stack and have the W/L automatically set from the RescaleSlope and Intercept of the current slice.

  • If we are slicing a 3D volume, how will we map from a slice # to the correct set of DICOM tags?
  • Is it more worthwhile to get the correct ordering of slices, then read each slice into its own image w/ tags? AFAIK DICOM viewers take this approach.
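To make the slice-to-tags question concrete, here is a minimal sketch of the scrolling use-case. It assumes we already hold one tag dictionary per Z slice, in the same order as the volume. The names `TagMap`, `rescaleForSlice`, and `toRescaledValue` are hypothetical, not itk.js APIs, and falling back to slope 1 / intercept 0 when a tag is missing is an assumption of this sketch.

```typescript
// Hypothetical sketch: none of these names come from itk.js.
type TagMap = Map<string, string>;

// Standard DICOM tags, written in the "group|element" form used in this thread.
const RESCALE_INTERCEPT = "0028|1052";
const RESCALE_SLOPE = "0028|1053";

interface Rescale {
  slope: number;
  intercept: number;
}

// tagsPerSlice[z] is assumed to hold the tags of the DICOM object that
// produced slice z of the volume (i.e. the sorting has already been done).
function rescaleForSlice(tagsPerSlice: TagMap[], slice: number): Rescale {
  const tags = tagsPerSlice[slice];
  if (tags === undefined) {
    throw new RangeError(`no DICOM tags recorded for slice ${slice}`);
  }
  // Assumption: treat missing tags as the identity transform.
  return {
    slope: Number(tags.get(RESCALE_SLOPE) ?? "1"),
    intercept: Number(tags.get(RESCALE_INTERCEPT) ?? "0"),
  };
}

// Maps a stored pixel value to its rescaled value for the current slice.
function toRescaledValue(stored: number, rescale: Rescale): number {
  return stored * rescale.slope + rescale.intercept;
}
```

A viewer would call `rescaleForSlice` whenever the current slice changes. The hard part raised above, building `tagsPerSlice` in the right order, is left out on purpose.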

@agirault
Collaborator

agirault commented Feb 8, 2021

I agree that important issues to take into account are:

  1. that some tags vary across single images/scans (ImagePosition, TriggerTime...)
  2. that we might not want to keep the whole metadata in memory
  3. another one to take into account: how do we handle nested datasets/metadata (sequences)

> image.getDICOMTag(slice#, [...tagNames])

The problem, as you said, is that we need to store all the DICOM metadata for all slices. I'd rather have the opposite: some data bank out of which we can extract a 3D Image / Volume given a set of files (or a series object), so the user can discard that data bank when they want. So "Parse -> Read what you want -> Create Image" instead of "Parse -> Create Image and store all metadata -> Read what you want later".

i.e: the metadata should only be retrievable within the reader's scope, the application can then choose to cache what they need. Does that make sense?
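A rough sketch of that "Parse -> Read what you want -> Create Image" flow, under the assumption that parsing has already produced one entry per slice with its tags and pixels. `ParsedSlice`, `SeriesDataBank`, and `loadSeries` are hypothetical names, and all slices are assumed to have the same number of pixels.

```typescript
// Hypothetical sketch: none of these names exist in itk.js.
interface ParsedSlice {
  tags: Map<string, string>;
  pixels: Float32Array;
}

class SeriesDataBank {
  private slices: ParsedSlice[];

  constructor(slices: ParsedSlice[]) {
    this.slices = slices;
  }

  // Read a single tag from a single slice while the bank is alive.
  readTag(slice: number, tag: string): string | undefined {
    const entry = this.slices[slice];
    return entry === undefined ? undefined : entry.tags.get(tag);
  }

  // Stack the slice pixels into one buffer; no metadata is kept on the result.
  createVolume(): Float32Array {
    const sliceLength = this.slices.length > 0 ? this.slices[0].pixels.length : 0;
    const volume = new Float32Array(sliceLength * this.slices.length);
    this.slices.forEach((s, z) => volume.set(s.pixels, z * sliceLength));
    return volume;
  }
}

function loadSeries(parsed: ParsedSlice[], wantedTag: string) {
  const bank = new SeriesDataBank(parsed); // Parse (stand-in)
  const cached = parsed.map((_, z) => bank.readTag(z, wantedTag)); // Read what you want
  const volume = bank.createVolume(); // Create Image
  return { volume, cached }; // the bank is not returned, so it can be collected
}
```

The application decides what to cache (here one tag per slice); everything else disappears with the bank.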

> Is it more worthwhile to get the correct ordering of slices

I'd hope that itk.js' GDCM reader already does that by sorting the scans into their proper order?

@floryst
Member

floryst commented Feb 9, 2021

> ...some data bank out of which we can extract a 3D Image / Volume given a set of files (or a series object), so the user can discard that data bank when they want. So "Parse -> Read what you want -> Create Image" instead of "Parse -> Create Image and store all metadata -> Read what you want later".

So, here are two ideas from this: we either support reading DICOM tags from the files at a later point (database model), or we support reading tags in-line with the image.

  • database model: we will need to create a small DICOM database context within itk.js, similar to what I have in paraview-medical
  • reading tags in-line: we extend readImageDICOMFileSeries to also take in a list of DICOM tags to read/return for every slice. This means that itk.Image.metadata will have a list of tag dictionaries, with each dictionary corresponding to a slice(?)
    • e.g. readImageDICOMFileSeries(fileList, { tags: ['0008|0020', ...] }).
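For the in-line option, the resulting metadata could look like the sketch below: one dictionary per slice, containing only the requested tags. `TagDictionary`, `SeriesReadOptions`, and `selectTagsPerSlice` are hypothetical names; the input stands in for whatever the series reader has parsed per slice.

```typescript
// Hypothetical sketch of the shape itk.Image.metadata could take for a series.
type TagDictionary = Record<string, string>;

interface SeriesReadOptions {
  tags: string[];
}

// Keeps only the requested tags for every slice, preserving slice order.
function selectTagsPerSlice(
  allTagsPerSlice: TagDictionary[],
  options: SeriesReadOptions
): TagDictionary[] {
  return allTagsPerSlice.map((sliceTags) => {
    const selected: TagDictionary = {};
    for (const tag of options.tags) {
      const value = sliceTags[tag];
      if (value !== undefined) {
        selected[tag] = value;
      }
    }
    return selected;
  });
}
```

With `{ tags: ['0008|0020'] }`, each entry of the returned list would hold at most that one key for the corresponding slice.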

> another one to take into account: how do we handle nested datasets/metadata (sequences)

I don't fully understand nested datasets. Does that mean DICOM can store an arbitrary hierarchy of distinct datasets, each with their own set of DICOM metadata?

> I'd hope that itk.js' GDCM reader already does that by sorting the scans into their proper order?

Yes. I was thinking of high-level operations; GDCM should be doing this one, not us.

@agirault
Collaborator

@floryst Maybe we should ignore nested metadata for now until use cases come up, and iterate on it then, unless some expert shows interest in helping (we could reach out on the ITK discourse). I'd be surprised if this isn't already handled in ITK C++ between GDCM and DCMTK.

@thewtex I just saw there was a metadata field on itk.Image in 5.2.rc2: https://blog.kitware.com/itk-5-2-release-candidate-2-available-for-testing/

Could you provide more details on how that works? Is that with MONAI only? What about the issue mentioned above of metadata that varies for each 2D scan in a single series? I suppose the solution we design in this thread should align with that latest addition in 5.2.rc2? Thanks!

@thewtex
Member Author

thewtex commented Feb 15, 2021

@agirault yes, as a starting point we can add the same metadata property in JavaScript as we have in Python. This is derived from the itk::MetaDataDictionary associated with an itk::Image.

As a next step, we could improve this for a DICOM series based on @floryst's and your work.

> What about the issue mentioned above of metadata that varies for each 2D scan in a single series?

In this case, the tags will correspond to a single slice.

@floryst
Member

floryst commented Mar 4, 2021

Here is one possible API for DICOM tags, depending on how itk.js webworker functionality works.

We can add a new API, readDICOMTags(webWorker, file, { selectTags: []|null }) -> { tagMap, webWorker }. This allows us to read DICOM tags from a given file, which should correspond to a DICOM object. If selectTags is falsy, then all tags are returned. Otherwise, only the listed tags are read and returned. I think this is a simple enough API, and ideally we will be able to reuse the web worker if, say, we have a list of files from which to extract DICOM tags.
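A sketch of how a caller might reuse the worker across a list of files with that signature. Everything here is a stand-in: `WorkerHandle`, `stubReadDICOMTags`, and `readTagsFromFiles` are hypothetical, the stub returns made-up tag values, and it assumes the call is promise-based.

```typescript
// Hypothetical stand-ins: none of these types or functions are itk.js APIs.
interface WorkerHandle {
  id: number;
}
interface ReadTagsOptions {
  selectTags: string[] | null;
}
interface ReadTagsResult {
  tagMap: Map<string, string>;
  webWorker: WorkerHandle;
}
type ReadDICOMTags = (
  webWorker: WorkerHandle | null,
  file: ArrayBuffer,
  options: ReadTagsOptions
) => Promise<ReadTagsResult>;

let workersCreated = 0;

// Stub reader: pretends every file carries the same two (made-up) tag values.
const stubReadDICOMTags: ReadDICOMTags = async (webWorker, _file, options) => {
  // Create a worker only when the caller did not pass one in.
  const worker = webWorker ?? { id: ++workersCreated };
  const all = new Map<string, string>([
    ["0028|1052", "-1024"],
    ["0028|1053", "1"],
  ]);
  if (!options.selectTags) {
    return { tagMap: all, webWorker: worker };
  }
  const tagMap = new Map<string, string>();
  for (const tag of options.selectTags) {
    const value = all.get(tag);
    if (value !== undefined) tagMap.set(tag, value);
  }
  return { tagMap, webWorker: worker };
};

// Reads tags from each file in turn, handing the returned worker to the next call.
async function readTagsFromFiles(
  read: ReadDICOMTags,
  files: ArrayBuffer[],
  selectTags: string[] | null
): Promise<Map<string, string>[]> {
  let webWorker: WorkerHandle | null = null;
  const tagMaps: Map<string, string>[] = [];
  for (const file of files) {
    const result: ReadTagsResult = await read(webWorker, file, { selectTags });
    webWorker = result.webWorker;
    tagMaps.push(result.tagMap);
  }
  return tagMaps;
}
```

One detail worth deciding: with a plain falsy check, an empty array means "select nothing" rather than "all tags", since [] is truthy in JavaScript.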

Now this might not be the most performant, since workflows will likely want to call both readImageDICOMFileSeries and readDICOMTags, and unless File objects are transferable, each call will incur a copy. Assuming that there is a (noticeable) performance penalty from doing so, we could consider providing a small DICOM database of sorts. An initial idea could be the following:

  • importDICOMFiles(fileList, dicomDB=null) -> DicomDB: imports files into a dicom db.
  • DicomDB.listInstanceUIDs() -> string[]: returns a list of processed instance UIDs (might also be able to handle nested datasets)
  • DicomDB.readTagsFor(instanceUID): reads the tags for a given instance UID
  • listSeries() -> seriesInstanceUIDs[]: list series
  • buildSeriesVolume(): Builds a series volume. Of course, a big problem is when series have multiple volumes....

The first API is a good one to add regardless, but if performance is necessary, then maybe the second is a good fit. However, I would like to keep it as simple as possible so that logic can be built on top, rather than having feature creep. In any case, the first approach is definitely the easiest, while the second approach is more complicated and prone to edge cases.
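For comparison, the database idea could start as small as the in-memory sketch below. It assumes the tags of each instance have already been parsed into a map, and it leaves out buildSeriesVolume entirely. `InMemoryDicomDB` and `importDICOMInstances` are hypothetical names modelled on the list above.

```typescript
// Hypothetical sketch: the names follow the proposal but are not itk.js APIs.
type TagMap = Map<string, string>;

// Standard DICOM tags in "group|element" form.
const SOP_INSTANCE_UID = "0008|0018";
const SERIES_INSTANCE_UID = "0020|000e";

class InMemoryDicomDB {
  private instances = new Map<string, TagMap>();

  addInstance(tags: TagMap): void {
    const uid = tags.get(SOP_INSTANCE_UID);
    if (uid === undefined) {
      throw new Error("instance has no SOPInstanceUID");
    }
    this.instances.set(uid, tags);
  }

  listInstanceUIDs(): string[] {
    return Array.from(this.instances.keys());
  }

  readTagsFor(instanceUID: string): TagMap {
    const tags = this.instances.get(instanceUID);
    if (tags === undefined) {
      throw new Error(`unknown instance UID: ${instanceUID}`);
    }
    return tags;
  }

  // Unique SeriesInstanceUIDs across all imported instances.
  listSeries(): string[] {
    const series = new Set<string>();
    this.instances.forEach((tags) => {
      const uid = tags.get(SERIES_INSTANCE_UID);
      if (uid !== undefined) series.add(uid);
    });
    return Array.from(series);
  }
}

// Imports into an existing database when one is given, otherwise creates one.
function importDICOMInstances(
  parsed: TagMap[],
  dicomDB: InMemoryDicomDB | null = null
): InMemoryDicomDB {
  const db = dicomDB ?? new InMemoryDicomDB();
  for (const tags of parsed) db.addInstance(tags);
  return db;
}
```

Even this much shows where the edge cases start: nested datasets and series containing multiple volumes are not modelled at all.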

@thewtex
Member Author

thewtex commented Mar 9, 2021

> We can add a new API, readDICOMTags(webWorker, file, { selectTags: []|null }) -> { tagMap, webWorker }.

👍 @floryst yes, this seems like a good next step, providing a usable piece of functionality but also a building block for future work. The webWorker API and per-file API is nice: it enables good performance via the itk/WorkerPool. Since recent itk.js supports SharedArrayBuffers when available, we will want file to be an ArrayBuffer / SharedArrayBuffer. It could also support File objects (but convert to SharedArrayBuffer internally). When SharedArrayBuffers are available, they do not need to be copied or transferred.
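A minimal sketch of the internal conversion, assuming the input is already an ArrayBuffer (a File could first be read with the standard Blob `arrayBuffer()` method). `toWorkerBuffer` is a hypothetical helper name, not something in itk.js.

```typescript
// Hypothetical helper: copies the bytes into a SharedArrayBuffer once, when the
// runtime provides one, so later hand-offs to workers need no copy or transfer.
function toWorkerBuffer(source: ArrayBuffer): ArrayBuffer | SharedArrayBuffer {
  if (typeof SharedArrayBuffer === "undefined") {
    // Not available (e.g. page is not cross-origin isolated): keep the ArrayBuffer.
    return source;
  }
  const shared = new SharedArrayBuffer(source.byteLength);
  new Uint8Array(shared).set(new Uint8Array(source));
  return shared;
}
```

The single up-front copy is the trade-off: it is paid once per file instead of once per call.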

@floryst floryst mentioned this issue Apr 1, 2021
5 tasks