Operating system: Windows 10
Slicer version: 4.13
Expected behavior: equal image load times
Actual behavior: one series takes 30 seconds, the other takes 10 minutes
@pieper @lassoan @jamesobutler @jcfr,
I’m trying to create DCMs from memory for my Slicer module to load and display. Fetching the original files from the server was too slow for my module to use, so I decided to create the files from images already in memory. The saved files are identical to the original files (according to the DICOM Validator Compare utility) and all DCMs are the same size. Yet one series takes 30 seconds to load into Slicer while the other takes 10 minutes.
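For context, the general approach looks something like this (a minimal sketch using pydicom, not our actual writer; the SOP class, UIDs, and attributes are placeholders for illustration):
import numpy as np
import pydicom
from pydicom.dataset import FileDataset, FileMetaDataset
from pydicom.uid import ExplicitVRLittleEndian, generate_uid

def saveSliceAsDicom(pixels, outPath):
    # File meta information (group 0002); Secondary Capture SOP class as a placeholder
    meta = FileMetaDataset()
    meta.MediaStorageSOPClassUID = "1.2.840.10008.5.1.4.1.1.7"  # Secondary Capture Image Storage
    meta.MediaStorageSOPInstanceUID = generate_uid()
    meta.TransferSyntaxUID = ExplicitVRLittleEndian

    # Main dataset, written as Explicit VR Little Endian
    ds = FileDataset(outPath, {}, file_meta=meta, preamble=b"\x00" * 128,
                     is_implicit_VR=False, is_little_endian=True)
    ds.SOPClassUID = meta.MediaStorageSOPClassUID
    ds.SOPInstanceUID = meta.MediaStorageSOPInstanceUID
    ds.Rows, ds.Columns = pixels.shape
    ds.SamplesPerPixel = 1
    ds.PhotometricInterpretation = "MONOCHROME2"
    ds.BitsAllocated = 16
    ds.BitsStored = 16
    ds.HighBit = 15
    ds.PixelRepresentation = 0
    ds.PixelData = np.ascontiguousarray(pixels, dtype=np.uint16).tobytes()
    ds.save_as(outPath, write_like_original=False)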
The first series here was created as a result of saving DICOM files from images in memory:
https://www.dropbox.com/s/op4jaeqynhu8u42/FromHeap.zip?dl=0
It takes 10 to 20 times longer for Slicer to load than the original files…
https://www.dropbox.com/s/pcdxdbpr2jiv4jy/FromStackFile.zip?dl=0
How does this happen? This is interesting to say the least.
Thanks
I could not reproduce the slow loading using the latest Slicer Preview Release. Using DCMTK, GDCM, and the archetype reader, import+loading time was between 48 and 75 seconds (sometimes one, sometimes the other was slightly faster).
>>> import time
>>> def loadDICOM(dicomDataDir):
...     loadedNodeIDs = []  # this list will contain the list of all loaded node IDs
...     from DICOMLib import DICOMUtils
...     with DICOMUtils.TemporaryDICOMDatabase() as db:
...         DICOMUtils.importDicom(dicomDataDir, db)
...         patientUIDs = db.patients()
...         for patientUID in patientUIDs:
...             loadedNodeIDs.extend(DICOMUtils.loadPatientByUID(patientUID))
...     return loadedNodeIDs
...
>>>
>>>
>>>
>>> startTime = time.time()
>>> loadDICOM(r"c:\tmp\20210727-dicom-slow-load\UNZ\heap")
Switching to temporary DICOM database: C:/Users/andra/AppData/Local/Temp/Slicer/20210728_040253_TempDICOMDatabase
Loading with imageIOName: DCMTK
['vtkMRMLScalarVolumeNode1']
>>> print(f"Elapsed: {time.time()-startTime}")
Elapsed: 75.61358332633972
>>>
>>> startTime = time.time()
>>> loadDICOM(r"c:\tmp\20210727-dicom-slow-load\UNZ\stack")
Switching to temporary DICOM database: C:/Users/andra/AppData/Local/Temp/Slicer/20210728_040408_TempDICOMDatabase
Loading with imageIOName: DCMTK
['vtkMRMLScalarVolumeNode2']
>>> print(f"Elapsed: {time.time()-startTime}")
Elapsed: 66.1774833202362
>>>
>>>
>>>
>>> startTime = time.time()
>>> loadDICOM(r"c:\tmp\20210727-dicom-slow-load\UNZ\heap")
Switching to temporary DICOM database: C:/Users/andra/AppData/Local/Temp/Slicer/20210728_040550_TempDICOMDatabase
Loading with imageIOName: GDCM
['vtkMRMLScalarVolumeNode3']
>>> print(f"Elapsed: {time.time()-startTime}")
Elapsed: 52.05962634086609
>>>
>>> startTime = time.time()
>>> loadDICOM(r"c:\tmp\20210727-dicom-slow-load\UNZ\stack")
Switching to temporary DICOM database: C:/Users/andra/AppData/Local/Temp/Slicer/20210728_040642_TempDICOMDatabase
Loading with imageIOName: GDCM
['vtkMRMLScalarVolumeNode4']
>>> print(f"Elapsed: {time.time()-startTime}")
Elapsed: 55.64105939865112
>>>
>>>
>>>
>>> startTime = time.time()
>>> loadDICOM(r"c:\tmp\20210727-dicom-slow-load\UNZ\heap")
Switching to temporary DICOM database: C:/Users/andra/AppData/Local/Temp/Slicer/20210728_041449_TempDICOMDatabase
['vtkMRMLScalarVolumeNode5']
>>> print(f"Elapsed: {time.time()-startTime}")
Elapsed: 48.70469260215759
>>>
>>> startTime = time.time()
>>> loadDICOM(r"c:\tmp\20210727-dicom-slow-load\UNZ\stack")
Switching to temporary DICOM database: C:/Users/andra/AppData/Local/Temp/Slicer/20210728_041538_TempDICOMDatabase
['vtkMRMLScalarVolumeNode6']
>>> print(f"Elapsed: {time.time()-startTime}")
Elapsed: 55.29209494590759
>>>
>>>
I’ll check this out. Could it be in the way I’m launching my module? Here’s the code I’m using…
dicomDataDir = "C:/Stack-vs-HeapMPRLoadingTimes/Stack"  # input folder with DICOM files
loadedNodeIDs = []  # this list will contain the list of all loaded node IDs
from DICOMLib import DICOMUtils
with DICOMUtils.TemporaryDICOMDatabase() as db:
    DICOMUtils.importDicom(dicomDataDir, db)
    patientUIDs = db.patients()
    for patientUID in patientUIDs:
        loadedNodeIDs.extend(DICOMUtils.loadPatientByUID(patientUID))
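To get numbers comparable to the transcript above, the same timing pattern can be wrapped around this block:
import time
startTime = time.time()
# ... run the import/load code above ...
print(f"Elapsed: {time.time() - startTime}")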
Thanks
If you use the latest Slicer preview and run that code in the python interactor, do the results change?
It could be a change between your May 15th Slicer build and the latest Slicer Preview. Download the Slicer Preview from https://download.slicer.org/
ok, thanks I will do that.
But isn’t it strange? There has to be something different about the Heap-generated files. I don’t know if it would be to the Slicer community’s advantage to know what causes this phenomenon, but I am perplexed by it.
Thanks
Same results
Attach VerySleepy (a free profiler) and see where Slicer spends the most time in both cases.
Also, try DCMTK, GDCM, and the archetype DICOM reader methods.
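If attaching an external profiler is inconvenient, Python’s built-in cProfile, run from the Slicer Python interactor, can give a rough picture of where the time goes. This is only a sketch and assumes the loadDICOM helper defined earlier in this thread:
import cProfile
import pstats

profiler = cProfile.Profile()
profiler.enable()
loadDICOM(r"c:\tmp\20210727-dicom-slow-load\UNZ\heap")  # path from the run above
profiler.disable()

# Print the 20 functions with the largest cumulative time
pstats.Stats(profiler).sort_stats("cumulative").print_stats(20)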
I used Slicer 4.13.0-2021-07-27 with the same profiling code that @lassoan posted earlier, and I got results similar to his:
Each time the load code was run I started Slicer fresh.
FromStackFile: Elapsed: 39.98049068450928
FromHeap: Elapsed: 34.227794885635376
I did observe about 27,000 log messages during the loading of each, so I would suspect that if something took longer to load it would be due to logging more messages. Review the log in Help → Report a Bug.
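To compare the two runs quantitatively, one option is to save each log from Help → Report a Bug and count the DCMTK warnings with a few lines of Python (a sketch; the log path below is a placeholder):
from collections import Counter

def countDcmtkWarnings(logPath):
    """Tally the distinct DCMTK warning messages in a saved Slicer log file."""
    counts = Counter()
    with open(logPath, errors="ignore") as f:
        for line in f:
            if "DcmMetaInfo" in line or "DcmItem" in line:
                # Keep only the message text after the "- W:" marker
                counts[line.rsplit("- W:", 1)[-1].strip()] += 1
    return counts

for message, n in countDcmtkWarnings(r"C:\path\to\saved_slicer_log.txt").most_common(5):
    print(n, message)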
Good idea about the log messages @jamesobutler. I’ve seen slowdowns when the log gets long, so @spycolyf, try generating the timings from a fresh restart of Slicer in each case. If the log is a bottleneck we can definitely address that.
So, yes, I think you hit the nail right on the head: 929 pairs of these in the Stack log file…
[CRITICAL][Stream] 27.07.2021 18:23:44 [] (unknown:0) - W: DcmMetaInfo: No Group Length available in Meta Information Header
[CRITICAL][Stream] 27.07.2021 18:23:44 [] (unknown:0) - W: DcmItem: Length of element (7fe0,0010) is not a multiple of 2 (VR=OW)
… and 929 of these in the Heap log file…
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmMetaInfo: No Group Length available in Meta Information Header
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Dataset not in ascending tag order, at element (0018,0015)
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Dataset not in ascending tag order, at element (0018,0050)
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Dataset not in ascending tag order, at element (0018,0060)
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Dataset not in ascending tag order, at element (0018,0090)
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Dataset not in ascending tag order, at element (0018,1000)
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Dataset not in ascending tag order, at element (0018,1020)
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Dataset not in ascending tag order, at element (0018,1030)
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Dataset not in ascending tag order, at element (0018,1100)
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Dataset not in ascending tag order, at element (0018,1110)
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Dataset not in ascending tag order, at element (0018,1111)
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Dataset not in ascending tag order, at element (0018,1120)
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Dataset not in ascending tag order, at element (0018,1130)
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Dataset not in ascending tag order, at element (0018,1140)
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Dataset not in ascending tag order, at element (0018,1150)
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Dataset not in ascending tag order, at element (0018,1151)
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Dataset not in ascending tag order, at element (0018,1152)
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Dataset not in ascending tag order, at element (0018,1160)
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Dataset not in ascending tag order, at element (0018,1170)
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Dataset not in ascending tag order, at element (0018,1190)
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Dataset not in ascending tag order, at element (0018,1200)
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Dataset not in ascending tag order, at element (0018,1201)
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Dataset not in ascending tag order, at element (0018,1210)
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Dataset not in ascending tag order, at element (0018,5100)
[CRITICAL][Stream] 27.07.2021 18:10:32 [] (unknown:0) - W: DcmItem: Length of element (7fe0,0010) is not a multiple of 2 (VR=OW)
So we need to correct things on our end.
Question: Is the bulk of the time spent writing to the log file? Or is the time spent in detecting and analyzing the issues? In other words, if there are no problems detected, would this reduce the load times for both the stack and the heap versions significantly?
Thanks
I’m pretty sure it’s the log file and the log display dialog. There is code to make it thread safe and I’ve noticed that sometimes output text is significantly delayed for some reason.
I think you are correct, though, that fixing the DICOMs to generate fewer messages is the best path in any case.
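One way to separate DICOM parsing time from logging and GUI overhead (a sketch, assuming pydicom is available; the folder path is taken from the snippet earlier in the thread) is to time a bare read of the same files outside Slicer’s loading pipeline:
import glob
import os
import time
import pydicom

folder = r"C:/Stack-vs-HeapMPRLoadingTimes/Stack"
files = [f for f in glob.glob(os.path.join(folder, "*")) if os.path.isfile(f)]

start = time.time()
for f in files:
    pydicom.dcmread(f)  # parse only; no Slicer logging or display involved
print(f"Parsed {len(files)} files in {time.time() - start:.1f} s")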
I’m in a meeting now about this issue and they want to know …
- Is there a way to have a rolling log? Or should we manage that ourselves so that we don’t run out of disk space on any of our 50K+ systems?
- Is there a way to turn logging off so that it’s bypassed completely?
- When producing the DCMs, would it be OK to exclude the Sequences?
Thanks.
A rolling log sounds like a good solution. I don’t think you want to turn off messages completely. Probably you should manage that on your side or propose a fix to the Slicer core. It sounds like an isolated feature that wouldn’t require a lot of changes to the internals of the program.
Best would be to fix the DICOM files so that fewer messages are printed.
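As a rough illustration of that (a sketch, assuming pydicom; paths are placeholders, and this is not necessarily how you’d want to do it in production), simply re-saving each file in standard file format should address the ordering and missing group-length warnings, since pydicom writes data elements in ascending tag order and regenerates the File Meta Information group:
import pydicom

def resaveConformant(inPath, outPath):
    # Re-write the dataset; elements come out in ascending tag order and the
    # file meta group (including its group length) is regenerated.
    ds = pydicom.dcmread(inPath)
    ds.save_as(outPath, write_like_original=False)

resaveConformant(r"C:\path\to\FromHeap\original.dcm", r"C:\path\to\FromHeap\fixed.dcm")
The odd-length (7fe0,0010) warning may need separate handling, since that depends on the pixel data bytes themselves.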
Yes, but is there a Python way to turn off logging? We can always turn it back on when we are troubleshooting. Also, I would like to see what the timing differences would be.
Thanks
That’s an idea, but I don’t know if it’s possible.
OK, we will try to eliminate the messages. We are currently looking into the out-of-order tags. The question is which messages we can eliminate in the stack file logs (common to both heap and stack, actually). So, does Slicer normally generate the following log info for all encapsulated-format Pixel Data?
[CRITICAL][Stream] 27.07.2021 18:23:44 [] (unknown:0) - W: DcmMetaInfo: No Group Length available in Meta Information Header
[CRITICAL][Stream] 27.07.2021 18:23:44 [] (unknown:0) - W: DcmItem: Length of element (7fe0,0010) is not a multiple of 2 (VR=OW)
Here are our (7FE0,0010) Pixel Data tags for Implicit VR Little Endian encapsulated pixel data…
00008a66 ( 35430) (7FE0,0010) - Pixel Data [OW][UNDEFN][ 1]:
00008a72 ( 35442) (FFFE,E000) - Item [][ 4][ 1]:
00008a7e ( 35454) (FFFE,E000) - Item [][183848][ 0]:
000358ae ( 219310) |--->(FFFE,E0DD) - Sequence Delimitation Item [][ 0][ 0]:
Is the “UNDEFN” the problem?
It’s hard for me to say exactly how to fix these, but those messages give you some ideas where to look. You can compare the structure of your data with samples that don’t generate these messages.
Also this tool can help flag non-conformities:
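Alongside whatever validator is used, a quick programmatic comparison with pydicom (a sketch; paths are placeholders) can also show which top-level elements differ between a problematic file and a clean sample:
import pydicom

def elementSummary(path):
    # Set of (tag, VR) pairs for the top-level data elements (pydicom iterates in tag order)
    return {(elem.tag, elem.VR) for elem in pydicom.dcmread(path)}

heapElems = elementSummary(r"C:\path\to\FromHeap\slice001.dcm")
stackElems = elementSummary(r"C:\path\to\FromStackFile\slice001.dcm")
print("Only in heap file:", sorted(heapElems - stackElems))
print("Only in stack file:", sorted(stackElems - heapElems))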