Slicer DICOM Scalar volume plugin relies on (old) GDCM: why do we not use DCMTK?

I’m not arguing with you Andrey, and given that we now have significant
experience on the topic I’m sure that as Andras suggested and I agreed we
should try it out and fix any issues we run into.

You are right that the DICOMScalarVolumePlugin does a lot of consistency
checking in an attempt to be sure the Archetype reader has the best chance
of success. The reason I still used Archetype reader is so that it can
deal with the encodings, pixel representations, lookup tables, etc.

As I also mentioned, the GDCM-based code does not clearly document which
things, so I can’t give you a direct pointers to everything. But it
doesn’t take long looking at the ITK GDCM code to find all kinds of
heuristics and workaround for various non-standard data files (I pasted a
few quick examples below).

// There isn’t a definitive way to check for DICOM files;

// In general this should be relatively safe to assume

        // Stupid file: CT-MONO2-8-abdo.dcm
        // The spacing is something like that: [0.2\0\0.200000]
        // I would need to throw an expection that VM is not compatible

The archetype reader was useful because it allowed a saved scene to refer to existing DICOM files. However, this does not work anymore (since the change that forces the file name to be the same as node name on save).

I think we should also consider David Gobbi’s vtk-DICOM package. It is the most comprehensive open-soruce DICOM image reading/writing solution - it can read multiframe, cine, multidimensional, compressed, gantry-tilt images and it can also write several image IODs (see more info here and here). It’s small, clean, simple, supported and continuously improved by David.

vtkDICOM looks good to me as well.

The only downside I can see is that for some applications, like in dcmqi, we might end up duplicating some of the slice-to-volume conversion logic in something closer to native DCMTK without the VTK dependency.

I think to expedite things and help the user, I should add an alternative scalar volume reader to the multivolume importer plugin, label its output loadable as “DCMTK scalar volume”, and use it in the multivolume either as the primary, or as a fallback when the first-order scalar volume plugin fails.

1 Like

Sounds very reasonable.

I wonder if you could add a testing mode where both loading mechanisms are used and the user (developer) is notified if they give different results.

For what it’s worth:

  • this branch replaces GDCM with DCMTK image IO: https://github.com/fedorov/Slicer/tree/dcmtk-for-scalar-volume - feel free to test any “corner cases”
  • the build of the branch above works fine, and I can load the DCE series from ERROR with DCE MRI loading in DICOM Browser without problems
  • almost all of the tests are passing when I tested on linux (from the list below, most of the failing tests failed because I killed Slicer app manually, as the test was running for very long time, often without any indication of what it was doing, some of them keep doing something or showed non-responsive app window after 10 minutes and up)

The following tests FAILED:
440 - py_StandaloneEditorWidgetTest (Failed)
601 - py_AtlasTests (Failed)
615 - py_RSNAVisTutorial (Failed)
616 - py_RSNAQuantTutorial (Failed)
622 - py_RSNA2012ProstateDemo (Failed)
623 - py_JRC2013Vis (Failed)
640 - py_LandmarkRegistration (Failed)
Errors while running CTest

We thought about this a bit more and discussed the implementation plan further with @pieper.

The following issues related to the initially considered approach of introducing a new C++ CLI that would read a DICOM series using DCMTKImageIO have been identified:

  • there will be a need to re-implement the code that determines the scalar type from DICOM - this has to be done before the ITK reader is instantiated!
  • since in the general case we cannot assume that the directory containing DICOM files does not include files from other series, and there is a limit on the maximum length of a command line in Windows, we would need to work around this by either copying files into a temp location, or communicating the list of files via a file

Based on this, and further analysis of the archetype reder code, an alternative approach has been developed:

By using this approach, we will not change any of the behavior in the application or its reliance on GDCM, unless GDCM fails. As such, we will be able to introduce the fix addressing the user need relatively easily, and without waiting for the next release, and without confusing the user by introducing a new type of plugin for scalar volumes.

If you have any concerns or suggestions, please respond. We plan to proceed with the implementation of the proposed approach very soon.

done before the ITK reader is instantiated

May this could be added to ITK … in that case a list of readers could be returned. @thewtex Do you think that make sense or should Slicer team proceed as described ?

or communicating the list of files via a file

:thumbsup: This is a common practice. Within build system, such files are called “response” files

@jcfr thanks for the response. Do you have any specific concerns with the proposed plan?

I think the issue of improving ITK is an important, but a separate one. I would prefer not to wait for the ITK improvements to be developed, tested and integrated to resolve this specific issue, unless there are really good reasons.

I like the proposed solution of implementing this by making vtkITKArchetypeImageSeriesScalarReader smarter.

I have just a small comment: it would be a bit nicer to make the DICOM reader selector a string attribute (PreferredDICOMReader, PreferredDICOMIOMethod, PreferredIOClass, …), instead of a Boolean flag (UseGDCMImageIO), as in the future we will probably want to have more options (for example, different modes for using DCMTK or GDCM, auto-select between different readers, use other toolkits) without changing the interface.

It would be useful to have a way to load using DCMTK, even when GDCM returns something. For this, the DICOMScalarVolumePlugin could always offer two loadables for a series: a GDCM-based and a DCMTK-based. By default, if GDCM can interpret the image, the GDCM-based loadable would have higher confidence (so it would be selected by default).

Yes, I think it is a idea to have the DICOM image type code available in ITK without duplication, but it should not be a blocker to move forward. Could an issue please be created here:

GitHub - KitwareMedical/ITKDICOM: Better support for DICOM in ITK.

that points to the appropriate code as a reference?

The overall plan sounds good :thumbsup:

Worth noting, @pieper did some digging the other week and found these DICOM datasets from GDCM:

Grassroots DICOM / Gdcmdata / [a0bba5]

These are a good resource for regression testing.

Thanks for the review and feedback everyone.

I agree with @lasson that we should rename the methods for selecting which toolkit to use. I also like the idea of offering the two methods as alternative loadables so the user can pick which to use in Advanced mode of the DICOM browser.

I’m also thinking it could make sense to write a test that checks out Matheiu’s repository (beefed up with any additional examples) and then runs both readers and compares the results. I wouldn’t want to make this a Nightly test, but it would make sense to be able to run it when the code changes.

Since Andrey is traveling I went ahead and filed the issue on ITKDICOM:

https://github.com/KitwareMedical/ITKDICOM/issues/7

3 Likes

I just merged the topic to address this.

I believe it captures most (all?) of the agreed upon approaches discussed here.

TL;DR People should not notice anything different for now, since GDCM will still be the default reader, but if it fails to load correctly DCMTK will be used as a backup. There’s now a settings panel for DICOM where people can choose to use DCMTK if they want to experiment.

There is still this known issue outstanding,

but report if you see other issues.

3 Likes

Any deprecation schedule, to avoid the conversation in two years about “can’t remove GDCM OR DCMTK because it might break someone’s volume”? :smile:

By the way, when I looked at the original issue that prompted this, I noticed that some slices of the sample data did in fact have thickness set to 0 – so GDCM wasn’t lying. I believe it worked in DCMTK because they don’t look at that tag at all (and then Slicer recalculates the spacing from IPP anyway).

I think this is a long-term issue for the reasons I mentioned above - we don’t have a good way of knowing the impact of any particular change because we cannot test it with users’ actual data. For the self test I implemented code to load with both methods and compare the results. I was thinking to make that a selectable option so that we could start experimenting about when one fails but the other works.

Realistically what we need is a very large regression testing suite of data that somehow defines the data that we do support. This would allow changes to be made on at least some solid footing.

Unfortunately I don’t think there are any easy answers…

I think we should deliver a stable release as soon as possible (that’s why I’m slowly chewing through all the open issues in Mantis) and after that we can change the default to DCMTK.

We have to be careful about how to save the user preference for DCMTK/GDCM: we should not save the user preference now to be GDCM, as user preferences are not overwritten when releases are uninstalled/installed. A solution could be to offer Default, DCMTK, and GDCM options to the user. We can then change in the code at any time what toolkit we use if “Default” is selected.

1 Like

Sure, but the flipside is that we’ll never get regression reports and test data if we never break anything. A hypothetical deprecation schedule might look something like:

  • 2 months, GDCM stays default, DCMTK fallback with pop-up request to report
  • +4 months, switch to DCMTK, GDCM fallback with strongly-worded pop-up request to report
  • +6 months, DCMTK default, GDCM annoying pop-up over-ride available with desperate plea to report
  • +1 year, GDCM importer available as an extension (or perhaps via emscripten)

Every dependency adds build complication and potential failures, maintenance burden, etc. and we are now carrying two JSON libraries, a bunch of Python packages, etc. too.

Given the quality of DCMTK ImageIO code, severely limited availability of DICOM test data and DICOM testing functionality in general, and other issues mentioned by @pieper, I vote against deprecating GDCM IO. It is good to have both options for the situations when one approach does not work.

Yes, but we introduced 0 new dependencies by the new feature introduced. Both GDCM and DCMTK are already build by Slicer/ITK.

I like the idea from @lassoan of adding a ‘Default’ reader option that we can change as needed. I hadn’t thought of that.

Of course we can also add a version number to future settings keys if we want to override old behavior and ignore the old setting value.

This is where the selection settings are handled now in case we want to change before too many people start using it.