Hi, I first want to say how impressed I am with the persistent and comprehensive support provided by the slicer team on these forums!
Now for the boring part…
I’m looking to programmatically convert a particular contour in an RTstruct file to an nrrd binary mask via python, using the slicer python interpreter. I have attempted running the batch processing module in SlicerRT via command line, but it a) did not operate as intended and b) appears to convert all of the available segmentations rather than just the one I need. For these reasons I would prefer to code the iterative conversion myself. Unfortunately, I’m a little lost with respect to which functions to use in order to achieve this goal. I was having similar troubles with converting a dicom series to nrrd until I found that @lassoan had made an excellent post describing a procedure that helped me a lot. Something similar to the linked post would be ideal, but any guidance would be much appreciated!
It is always better both for you and the community to fix something existing instead of implementing it again from scratch, especially if the purpose of the tool you decide not to use is exactly what you need. Please describe why the BatchProcessing module in SlicerRT does not work for you. We are happy to help you adapt it.
I suppose you’re correct. Let’s get into it then. I’m working with (as a start) the MAASTRO Lung1 dataset available on TCIA, which you may be familiar with. I downloaded it via the NBIA retriever. Within the dataset, each patient has a corresponding folder containing 3 folders: one for the dicom series, one for the RTstruct and one for the segmentation.
My understanding was that running the script included in the “readme” in the command line was meant to convert all of the RT structs (each segmentation within) present in the entire dataset (containing all of the patient folders) into binary mask nrrds that would be stored in the output folder. In short, I used the following as guidance (from the readme):
[path/]Slicer.exe --no-main-window --python-script [path/]BatchStructureSetConversion.py --input-folder input/folder/path --output-folder output/folder/path
(Optionally use -i and -o instead of the long argument names)
--ref-dicom-folder (-r): Folder containing reference anatomy DICOM image series, if stored outside the input study
--use-ref-image (-u): Use anatomy image as reference when converting structure set to labelmap
--exist-db (-x): Process an existing database instead of importing data in a new one (in this case --input-folder is a database and not a folder containing DICOM data)
--export-images (-m): Export all image data alongside the labelmaps to NRRD
Because the folder housing all the patient folders contains both the structs and dicom series I did not use a dicom reference folder or reference image. I did use -m because I figured it may be more convenient for future implementation if it worked.
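For anyone who wants to drive the same invocation from a script, here is a minimal sketch that assembles the command from the flags quoted above and keeps the console output in a log file. The function names and all paths are hypothetical placeholders, and whether Slicer's messages actually end up in the log may depend on the platform and launcher, so treat this as a starting point rather than a tested recipe.

```python
import subprocess


def build_conversion_command(slicer_exe, script, input_folder, output_folder,
                             export_images=False):
    """Assemble the argument list for a headless batch conversion run.

    Only flags quoted from the readme are used; all paths are supplied by the caller.
    """
    command = [
        str(slicer_exe),
        "--no-main-window",
        "--python-script", str(script),
        "--input-folder", str(input_folder),
        "--output-folder", str(output_folder),
    ]
    if export_images:
        # Long form of -m: also export the image data to NRRD.
        command.append("--export-images")
    return command


def run_conversion(command, log_path):
    """Run the command, redirecting stdout and stderr to a log file."""
    with open(log_path, "w") as log_file:
        completed = subprocess.run(command, stdout=log_file,
                                   stderr=subprocess.STDOUT)
    return completed.returncode


# Example call with placeholder paths (assumptions, adjust to your setup):
# cmd = build_conversion_command("Slicer.exe", "BatchStructureSetConversion.py",
#                                "input/folder/path", "output/folder/path",
#                                export_images=True)
# exit_code = run_conversion(cmd, "conversion.log")
```

Passing the arguments as a list avoids quoting problems when folder names contain spaces.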
Upon running the script I found that only the segmentations from what appears to be the first patient (“LUNG-001”) appeared in the output folder, rather than nrrds for the RT struct of every patient. It’s possible that there was an error, but I found it difficult to troubleshoot without error messages. I will say that upon examining the generated output for this single patient, I was very happy with the quality of conversion!
Beyond the error, the file names produced by the script are not easy to interpret, and every contour from the RT struct is saved as a separate file, whereas I only want to extract the gross tumour volume (GTV). Granted, this is a very minor issue, as I could filter out the unnecessary files afterwards and computational cost is not a significant concern for me in this case.
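The after-the-fact filtering mentioned above could look roughly like the sketch below. It assumes that the exported file names contain the structure name (for example something with "GTV" in it), which I have not verified for every case; the function name and the keyword are hypothetical.

```python
from pathlib import Path


def find_structure_files(output_folder, keyword="gtv"):
    """Return the .nrrd files whose file name contains the keyword.

    The match is case-insensitive and the search descends into subfolders.
    Assumption: the structure name is part of the exported file name.
    """
    keyword = keyword.lower()
    matches = [
        path
        for path in Path(output_folder).rglob("*.nrrd")
        if keyword in path.name.lower()
    ]
    return sorted(matches)


# Example call with a placeholder path:
# for path in find_structure_files("output/folder/path"):
#     print(path)
```

Matching on the file name only (not the full path) keeps a folder that happens to contain the keyword from pulling in every file below it.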
Let me know what further information I can provide or if I’ve misinterpreted the use-case of the script that I employed as I am somewhat of a neophyte when it comes to this stuff.
System: Windows 10 pro
Slicer version: 4.11.20200930
Are you sure the image and the RT struct are in the same DICOM study? You can check this by importing the folder into Slicer’s DICOM database: when you select one study (middle table), you should see both series in the bottom table.
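The same check can also be scripted outside Slicer. This is only a sketch: it assumes the third-party pydicom package is installed for reading the headers, that the image series is CT, and the helper names are made up for illustration. The grouping itself works on plain (study UID, modality) pairs.

```python
from collections import defaultdict


def read_series_info(dicom_path):
    """Read (StudyInstanceUID, Modality) from one DICOM file using pydicom."""
    import pydicom  # third-party; imported here so the helpers below run without it

    dataset = pydicom.dcmread(dicom_path, stop_before_pixels=True)
    return str(dataset.StudyInstanceUID), str(dataset.Modality)


def modalities_by_study(series_info):
    """Group modalities by study UID from (study_uid, modality) pairs."""
    studies = defaultdict(set)
    for study_uid, modality in series_info:
        studies[study_uid].add(modality)
    return dict(studies)


def studies_missing_image(studies, image_modalities=("CT",)):
    """Return the study UIDs that hold an RTSTRUCT but no image series."""
    return [
        uid
        for uid, modalities in studies.items()
        if "RTSTRUCT" in modalities and not modalities.intersection(image_modalities)
    ]


# Example call with placeholder file paths (one file per series is enough):
# info = [read_series_info(p) for p in ["ct/1.dcm", "rtstruct/1.dcm"]]
# print(studies_missing_image(modalities_by_study(info)))
```

An empty result means every structure set shares a study with at least one image series.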
I’m not sure I understand this. What do you mean by segmentation here? The RT struct is a segmentation. You wrote above that there is a third object in the folders, which is a segmentation. Are you talking about that? If so, what format is it?
This is a feature that could be added potentially.
The RT struct and dicom series for each patient are attributed to the same study, and I believe that they map onto each other correctly, as the results for the first patient are great. I’ve attached an image from slicer for your reference in case I’m missing something. Note that there appears to be no study ID/description (this is the case for all patients); I’m not sure whether that may be causing the issue in loading the subsequent patients. Although, if that’s the case, it’s strange that it wasn’t an issue for the first patient.
Apologies for inconsistent terminology. To clarify, my issue is that upon running the script I only found nrrd masks corresponding to the RT struct segmentations (and corresponding volume nrrd with command ‘-m’) for the first patient, rather than all patients.
The third object in the folders, which I was referring to as a segmentation, was a .dcm segmentation object. You’re welcome to disregard this detail, as I have since removed those files from my database and run the script again with the same result (conversion of the RT struct and dicom series to nrrd files for the first patient only).
OK then can you please share at least two of the patients (you said only the first patient is converted so I guess two will do it)? If not then please try to reproduce it with two freely accessible patients (for example from TCIA). And also your command line command. Thanks!
Just wondering if you’ve had a chance to look at this and whether you think a fix would be straightforward to implement. Otherwise, I would appreciate a reference to some functions in the slicer API that would suit my application in the meantime.
I took a look at it and ran the script on a folder containing four of the patients in the large 33GB dataset you referenced.
Based on the console output and the code, it seems that the current version of the script only supports processing the first patient if you do not use an existing database (--exist-db (-x) argument). I don’t see any reason for this limitation and actually it seems easy to fix.
With the fix, your command line should work unchanged: instead of only processing the first patient, it will process all patients and put the output of each patient in its own folder named after the patient’s database ID (i.e. incremental numbers). You can try it in tomorrow’s preview version.