SQLite3 query in Python returns less data than what appears in the Slicer DICOM module

smsmt · June 15, 2023, 6:29pm

Hello everyone,

I am currently working on constructing a database of DICOM tags using 3D Slicer. I have imported DICOM image data for approximately 700 patients into Slicer, which seems to have processed all the files successfully, as the DICOM tags within the Slicer UI show data for all the patients.

To extract this data, I used the “ctkDICOM.sql” file that Slicer generated. This file is around 2GB in size, with an additional 20GB cache SQL file. When I attempt to parse the “ctkDICOM.sql” file using the sqlite3 module in Python, however, I find that data for around 200 patients seems to be missing.

Despite this, there doesn’t appear to be an issue with the original DICOM data, as all patient data is correctly displayed in the Slicer UI. I have double-checked the DICOM files, and they don’t seem to be the problem.

I was wondering if anyone could provide some guidance on this. Specifically, my goal is to generate a single SQL file using Slicer that contains all the DICOM tag information for these patients.

Any help or insights would be greatly appreciated!

Thank you in advance.

pieper · June 15, 2023, 8:26pm

The DICOM database is managed by the CTK code, including the mapping from what’s in the database to what’s displayed on the screen, as generally described here:

github.com/commontk/CTK/blob/master/Libs/DICOM/Core/ctkDICOMDisplayedFieldGenerator.h#L34-L49

          /// \ingroup DICOM_Core
          ///
          /// \brief Generates displayable data fields from DICOM tags
          /// 
          /// The \sa updateDisplayedFieldsForInstance function is called from the DICOM database when update of the
          /// displayed fields is needed.
          /// 
          /// Displayed fields are determined by the rules, subclasses of ctkDICOMDisplayedFieldGeneratorAbstractRule.
          /// The rules need to be registered to take part of the generation. When updating the displayed fields,
          /// every rule defines the fields it is responsible for using the cached DICOM tags in the database.
          /// Tags can be requested to be cached in the rules from the getRequiredDICOMTags function. After the fields
          /// are defined in each rule, the results are merged together. The merging rules are also defined in the
          /// rule classes. Each field can requested to be merged with "expect same value", which uses the only
          /// non-empty value and throws a warning if conflicting values are encountered, or with "concatenate",
          /// which simply concatenates the displayed field values together.
          ///

That should explain how the fields are generated. But there’s no reason 200 patients should be missing from the database if they are displayed in the GUI (the two should be the same). Maybe try reproducing the issue, perhaps with public data; if you can let us know the steps, hopefully someone can help you troubleshoot.

lassoan · June 16, 2023, 1:32pm

SQLite3 sets default limits for query results. It seems that with your select query you reached the default limit of 500 records. You can use the LIMIT clause in your query to get all the records.
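
A quick way to test whether a row cap is involved is to compare COUNT(*) with the number of rows a plain SELECT actually returns. The sketch below does that with Python's sqlite3 module on a throwaway in-memory table; the helper name count_vs_fetched and the Demo table are made up for illustration and are not part of Slicer or CTK.

```python
import sqlite3

def count_vs_fetched(connection, table):
    """Return (COUNT(*), number of rows actually fetched) for one table."""
    # Table names cannot be bound as parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    total = connection.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
    fetched = len(connection.execute(f"SELECT * FROM {quoted}").fetchall())
    return total, fetched

# Self-contained demonstration on a hypothetical in-memory table.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE Demo (id INTEGER PRIMARY KEY, name TEXT)")
demo.executemany("INSERT INTO Demo (name) VALUES (?)",
                 [(f"row{i}",) for i in range(1200)])
total, fetched = count_vs_fetched(demo, "Demo")
print(total, fetched)
```

If the two numbers also match for a table in the real database file, the query is not being truncated and the missing rows need another explanation.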

smsmt · June 16, 2023, 6:16pm

Thanks Andras and Pieper for your thoughts. I’ve tried a few things already:

I used a “LIMIT” command to see if there’s an issue with sqlite3, but that didn’t help.
I also opened the ctkDICOM.sql file using a tool called DB browser, but I still saw the same problem.

I noticed that 200 patients are missing, but they’re not just at the end of the list. They’re missing at random positions when I look at the data using the DB browser or Python, yet I can see them when I use the Slicer UI.
Also, some values show as ‘None’ when I look at them in the DB browser or Python, but they are there when I use the Slicer UI.
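
Values that come back as None in Python are NULL in the SQLite file, so one way to see how widespread this is would be to count NULLs per column. The sketch below assumes nothing about the real schema: it reads the column names with PRAGMA table_info and runs on a made-up in-memory table (null_counts, Demo, and its columns are hypothetical names).

```python
import sqlite3

def null_counts(connection, table):
    """Return {column name: number of NULL values} for one table."""
    quoted_table = '"' + table.replace('"', '""') + '"'
    # Second field of each PRAGMA table_info row is the column name.
    columns = [row[1] for row in
               connection.execute(f"PRAGMA table_info({quoted_table})")]
    result = {}
    for column in columns:
        quoted_column = '"' + column.replace('"', '""') + '"'
        query = (f"SELECT COUNT(*) FROM {quoted_table} "
                 f"WHERE {quoted_column} IS NULL")
        result[column] = connection.execute(query).fetchone()[0]
    return result

# Hypothetical table with a few missing values.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE Demo (id INTEGER PRIMARY KEY, name TEXT, birth TEXT)")
demo.executemany("INSERT INTO Demo (name, birth) VALUES (?, ?)",
                 [("a", "1970"), (None, None), ("c", None)])
demo_nulls = null_counts(demo, "Demo")
print(demo_nulls)
```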

So, my guess is that Slicer might be using a cache file called ctkDICOMTagCache.sql to fill in the missing data and patients. That might be how Slicer manages to show all the data.

Now, I’m going to try moving the data to a new computer. I want to see whether the ctkDICOM.sql file generated by Slicer there still has the same problems and missing data.

lassoan · June 16, 2023, 6:58pm

You can ignore the tag cache file; it just stores some fields for faster access. Only the content of ctkDICOM.sql matters. You can find the list of patients in the Patients table.
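
As a rough sketch of how one might check this from Python, the function below opens a SQLite file read-only and reports the row count of every table it finds, so the count for Patients can be compared with what the DICOM browser shows. The function name table_row_counts is hypothetical, and the demo table and its columns are invented; only the Patients table name comes from this thread.

```python
import os
import pathlib
import sqlite3
import tempfile

def table_row_counts(db_path):
    """Map each table name in a SQLite file to its row count."""
    # mode=ro opens the file read-only so it cannot be modified by accident.
    uri = pathlib.Path(db_path).resolve().as_uri() + "?mode=ro"
    connection = sqlite3.connect(uri, uri=True)
    try:
        names = [row[0] for row in connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        counts = {}
        for name in names:
            quoted = '"' + name.replace('"', '""') + '"'
            query = f"SELECT COUNT(*) FROM {quoted}"
            counts[name] = connection.execute(query).fetchone()[0]
        return counts
    finally:
        connection.close()

# Demonstration on a temporary file with invented columns.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demo.sql")
    demo = sqlite3.connect(path)
    demo.execute("CREATE TABLE Patients (UID INTEGER PRIMARY KEY, Name TEXT)")
    demo.executemany("INSERT INTO Patients (Name) VALUES (?)",
                     [("A",), ("B",), ("C",)])
    demo.commit()
    demo.close()
    demo_counts = table_row_counts(path)
print(demo_counts)
```

For the real database, the path to ctkDICOM.sql would be passed instead of the temporary file.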

Note that none of these files are part of the public Slicer API. If you can find a way to extract useful information from the sqlite files, that is fine with us, but we do not support it (because that would impose many limitations on how we can evolve the internal design in the future). The public API for the Slicer DICOM database is the ctkDICOMDatabase object, which is accessible in the Slicer Python environment as slicer.dicomDatabase.
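
A minimal sketch of reading tags through that object instead of the sqlite files might look like the following. It assumes the ctkDICOMDatabase methods patients, studiesForPatient, seriesForStudy, filesForSeries and fileValue, as used in the Slicer script repository; check them against your Slicer version. The function name collect_tag_values and the FakeDatabase stand-in are hypothetical and exist only so the sketch can run outside Slicer.

```python
def collect_tag_values(db, tag):
    """Return {patient: set of values of `tag`}, walking patient > study > series."""
    values = {}
    for patient in db.patients():
        found = set()
        for study in db.studiesForPatient(patient):
            for series in db.seriesForStudy(study):
                files = db.filesForSeries(series)
                if files:
                    # Assumption: the tag is the same for every file in a
                    # series, so only the first file is read.
                    found.add(db.fileValue(files[0], tag))
        values[patient] = found
    return values

class FakeDatabase:
    """Hypothetical stand-in mimicking the five methods used above."""
    def patients(self):
        return ["1", "2"]
    def studiesForPatient(self, patient):
        return {"1": ["st1"], "2": []}[patient]
    def seriesForStudy(self, study):
        return ["se1", "se2"]
    def filesForSeries(self, series):
        return {"se1": ["a.dcm"], "se2": []}[series]
    def fileValue(self, path, tag):
        return "ID-001"

demo_values = collect_tag_values(FakeDatabase(), "0010,0020")
print(demo_values)
```

In the Slicer Python console, slicer.dicomDatabase would be passed in place of FakeDatabase(), with a tag string such as "0010,0020" (Patient ID).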
