mpReviewPreprocessor code change for automated DICOM to NRRD conversion

Dear Dr. Fedorov and 3DSlicer support,

For an academic project, we are building a database from anonymized DICOM files (only the name of the DICOM directory carries information about age and sex; the metadata is not usable), and as a first step we would like to convert them into NRRD files, which are easier to handle.
The idea is to loop over all the DICOM files with mpReviewPreprocessor in batch mode, in preparation for further batch processing with registration and segmentation.

I would like to change the code of the mpReviewPreprocessor script so that it uses the name of the input directory to name the NRRD file, with something like:

(filepath, filename) = os.path.split(inputDir)
nrrdName = os.path.join(dirName, filename + ".nrrd")

But not being a software developer, I still have difficulties with classes, methods and attributes. Even just inserting a string called tempName into the convertData() function to change the directory, XML and NRRD names (which could be renamed by another script later) makes the code crash.
I also tried adding one more variable called inputDir to the convertData function and to all its parent functions, but that also made the code unusable.
I have added a picture of the part of the code that I think needs to be changed for my problem, and one of my non-working solution.
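A minimal sketch of the naming idea on its own, outside of mpReviewPreprocessor. The function name nrrd_path_for and the example paths are hypothetical. One detail worth noting: if inputDir ends with a path separator, os.path.split returns an empty last component, so the sketch normalizes the path first.

```python
import os


def nrrd_path_for(input_dir, output_dir):
    """Return <output_dir>/<name of input_dir>.nrrd (hypothetical helper)."""
    # normpath drops a trailing separator; without it, splitting
    # "/data/dicom/case01/" gives an empty name and the file becomes ".nrrd".
    case_name = os.path.basename(os.path.normpath(input_dir))
    return os.path.join(output_dir, case_name + ".nrrd")


# Example paths are made up for illustration.
print(nrrd_path_for("/data/dicom/case01/", "/data/nrrd"))
```

Keeping the naming in a small function like this means it can be tested separately before it is wired into convertData().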

image

image
Adding temporary file names seemed the easiest way to change the data tree, with the files renamed after conversion, but even after disabling everything involved in creating the XML file, the code is unable to create the NRRD.
Here is the error message:
image
image

I know I made a lot of mistakes trying to modify the script, and did not respect the art of writing good code, but I would be really grateful if you could help me out on this issue.

I thank you in advance,

Kristof S.

Kristof, does it work for your data if you do not change anything at all in the code?

There are multiple issues with using Slicer for generating volume-reconstructed files in batch mode: indexing the database and running DICOM plugins is slow, and with the current code, if one series fails, it is not easy to track down what happened. I ran into those issues again in the latest project where I was using this code, and I am now thinking of switching to dcm2niix for volume reconstruction (at least for scalar volumes, if not multivolumes). The idea is to leave mpReviewPreprocessor as is, but add an option to use dcm2niix in place of the DICOM scalar volume plugin (and perhaps avoid indexing the data into the database, hopefully eventually cutting the dependency on the Slicer application for this task).
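As a rough sketch of what calling dcm2niix per series could look like, with failures recorded per series so they are easy to track down: the function names, the folder layout and the injectable run parameter are assumptions for illustration, not part of mpReviewPreprocessor. The dcm2niix flags used are -o (output directory), -f (output file name) and -z n (no compression).

```python
import subprocess
from pathlib import Path


def build_dcm2niix_command(series_dir, output_dir, name):
    # -z n: no gzip, -f: output file name, -o: output directory
    return ["dcm2niix", "-z", "n", "-f", name, "-o", str(output_dir), str(series_dir)]


def convert_all(series_dirs, output_root, run=subprocess.run):
    """Convert each series separately; return {series_dir: message} for failures."""
    failures = {}
    for series_dir in map(Path, series_dirs):
        out_dir = Path(output_root) / series_dir.name
        out_dir.mkdir(parents=True, exist_ok=True)
        cmd = build_dcm2niix_command(series_dir, out_dir, series_dir.name)
        try:
            result = run(cmd, capture_output=True, text=True)
        except OSError as exc:  # e.g. dcm2niix is not installed or not on PATH
            failures[str(series_dir)] = str(exc)
            continue
        if result.returncode != 0:
            # Keep whatever the tool printed so the failing series can be inspected.
            failures[str(series_dir)] = result.stderr.strip() or result.stdout.strip()
    return failures
```

Because every series is converted in its own process, one bad series does not stop the batch, and the returned dictionary says which ones to look at.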

I have multiple deadlines this week, and I am traveling next week, but I will look into this hopefully sometime soon.

If you want to explore the use of dcm2niix in this script for generating the reconstructed volumes, your contributions are most welcome!

Dr. Fedorov, thank you very much for your answer.

Yes, the code works when I leave it as it was. I also tried it again after cleaning up the code involved in creating the XML file and fixing some of my syntax errors, and it seems to generate working NRRD files even with the error message.
But as you warned me about the potential issues with mpReviewPreprocessor and its relative slowness, I will follow your advice and explore dcm2niix.

This project will last a few more months, so there will be time to explore, understand and maybe contribute some code if the project is viable.


@Kramer84 I added the mpReviewPreprocessor2 script, which assumes the data was first sorted using dicomsort into the hierarchy used by mpReview, and then uses dcm2niix for generating volume reconstructions in NIfTI format. I also updated mpReview to recognize NIfTI files as acceptable input. This new preprocessor should be faster, a lot easier to use, and a lot easier to understand and modify.
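To illustrate the "sort first, then reconstruct" flow, here is a small sketch that walks a sorted tree and pairs each series' DICOM folder with the folder where its reconstruction would go. It assumes every series folder contains a subfolder named DICOM and that reconstructions belong in a sibling folder named Reconstructions; both names and the function find_series are assumptions for illustration and should be checked against the actual sorted data and the mpReviewPreprocessor2 source.

```python
from pathlib import Path


def find_series(root):
    """Yield (dicom_dir, reconstructions_dir) for every folder named DICOM below root."""
    for dicom_dir in sorted(Path(root).rglob("DICOM")):
        if dicom_dir.is_dir():
            # Assumed layout: <series>/DICOM next to <series>/Reconstructions
            yield dicom_dir, dicom_dir.parent / "Reconstructions"
```

Each pair can then be handed to a converter such as dcm2niix, with the first element as input and the second as output directory.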

One feature that is missing from the new preprocessor at this point is that it does not process multivolumes. Slicer's default volume reader is not able to read 4D NIfTI images, so they will require special handling.
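One way to spot the files that would need this special handling is to read the dimension field of the NIfTI-1 header directly. The sketch below uses only the standard library and relies on the NIfTI-1 layout (sizeof_hdr equal to 348 in the first 4 bytes, the dim array of eight 16-bit integers at byte offset 40). The function names are hypothetical, and NIfTI-2 files are not handled.

```python
import gzip
import struct


def nifti_dims(path):
    """Return the dimension sizes stored in a NIfTI-1 header as a tuple."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as f:
        header = f.read(348)
    if len(header) < 348:
        raise ValueError("file too short for a NIfTI-1 header")
    # sizeof_hdr must be 348; trying both byte orders also reveals the endianness.
    for order in ("<", ">"):
        if struct.unpack(order + "i", header[:4])[0] == 348:
            break
    else:
        raise ValueError("not a NIfTI-1 file")
    dim = struct.unpack(order + "8h", header[40:56])
    return dim[1:1 + dim[0]]  # dim[0] holds the number of dimensions


def is_multivolume(path):
    """True if the file has a 4th dimension with more than one frame."""
    dims = nifti_dims(path)
    return len(dims) >= 4 and dims[3] > 1
```

Files flagged by is_multivolume could be skipped or routed to a separate loader, while the remaining 3D volumes go through the default reader.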

I committed this new code to the repo - it would be great if you could take a look, test and give feedback if it works for your data.


@fedorov I have taken a look at the different parts of your script and the manual, and have only just understood how mpReview works. We are actually working in the field of cranio-maxillofacial surgery, with head MRIs where only the skull has to be extracted, and we neither knew about nor bothered with multi-parametric review. So at first I was surprised to see so much variation in the number of NIfTI files in the Reconstructions folders. But after checking, we do indeed have variations in the X-ray exposure for some MRIs.

Does it make a difference in the process of extracting the bones to have MRIs with different exposure? Or would it make segmentation or metallic artifact reduction easier?

For the testing of mpReviewPreprocessor2 on our files:

When running dicomsort, as all the DICOM files are anonymized, the resulting file paths tend to be rather unreliable, because they are built from empty or encrypted metadata. But mpReviewPreprocessor2 does convert the files into the Reconstructions folder in the chosen output directory, and produces readable NIfTI files.

Here is the path organized by dicomsort and the Reconstructions folder made by mpReviewPreprocessor2:
image

While walking through the files, I saw that some of the head MRIs were indeed multiparametric, with a separate NIfTI file for each parameter:
image
The way your script works is interesting because it lets us process DICOMs in bulk, and so can be a first step towards automating the extraction of the skulls and metallic artifacts, so that we can eliminate human input in our data and keep it consistent.

But I also tried using dcm2niix to make something more problem-specific, as it is possible to batch convert whole DICOM directories with the dcm2niibatch batch_config.yaml command and to choose the file name in the YAML file.

I tried to write a function that creates the batch_config.yaml file. It takes as inputs the directory containing the DICOM files and the empty directory where we want the NIfTI files to be created, and gives each NIfTI file the name of the directory containing its DICOM images.
The function also creates the new directories that will contain the NIfTI files, and organizes the batch_config.yaml file.

After some tests, the function produces a batch_config.yaml file and creates the empty directories for the NIfTI files, but the YAML file itself still has some issues: the Files and Options dictionaries are inverted, and the indentation is bad.
Empty directories and non-conforming YAML file:
image
image
Here is the code involved in the creation of the YAML file:

#!/usr/bin/python3
import os

import ruamel.yaml as ry


def batch_creator(inputDir, outputDir):
    # Conversion options; for now they have to be changed here by hand.
    options = {'isGz': False, 'isFlipY': False, 'isVerbose': False,
               'isCreateBIDS': False, 'isOnlySingleFile': False}

    # inputDir holds one folder per DICOM data set, outputDir starts out empty.
    # For testing purposes the yaml file is also written to outputDir.
    yamlDir = outputDir

    # One entry per DICOM folder; each one gets its own output folder in outputDir.
    files = []
    for tempName in sorted(os.listdir(inputDir)):
        inPathName = os.path.join(inputDir, tempName)
        if not os.path.isdir(inPathName):
            continue    # skip stray files lying next to the DICOM folders
        outName = tempName + "_dcm2niix"
        outPathName = os.path.join(outputDir, outName)
        os.makedirs(outPathName, exist_ok=True)    # unlike os.mkdir, does not fail on a second run
        files.append({'in_dir': inPathName, 'out_dir': outPathName, 'filename': outName})

    # Build the dictionary directly instead of assembling a string and parsing it
    # with literal_eval, which breaks as soon as a folder name contains a quote.
    # The order of 'Options' and 'Files' in the output does not change the meaning
    # of the YAML mapping.
    dict_batchFile = {'Options': options, 'Files': files}

    yaml = ry.YAML()
    yaml.default_flow_style = False
    yaml.indent(mapping=2, sequence=4, offset=2)    # indent the list items under 'Files'
    with open(os.path.join(yamlDir, "batch_config.yaml"), "w") as batch_config:
        yaml.dump(dict_batchFile, batch_config)

I will try to find the mistakes that I made in the function and the YAML syntax, and update it here later. The conversion itself should then be relatively easy.
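Since the batch file is small and regular, one alternative is to write it as plain text, which gives full control over the order of the Options and Files sections and over the indentation. The sketch below uses only the standard library; the function name batch_config_text is hypothetical, and the option names are the ones from the function above. It relies on the fact that JSON-style double-quoted strings are also valid YAML scalars, which is an assumption worth verifying with the YAML parser that actually reads the file.

```python
import json
import os

OPTIONS = {"isGz": False, "isFlipY": False, "isVerbose": False,
           "isCreateBIDS": False, "isOnlySingleFile": False}


def batch_config_text(input_dir, output_dir, options=OPTIONS):
    """Build the batch config text: Options first, then one Files entry per folder."""
    lines = ["Options:"]
    for key, value in options.items():
        lines.append(f"  {key}: {str(value).lower()}")  # YAML booleans: true / false
    lines.append("Files:")
    for name in sorted(os.listdir(input_dir)):
        in_dir = os.path.join(input_dir, name)
        if not os.path.isdir(in_dir):
            continue  # only folders are treated as DICOM data sets
        out_name = name + "_dcm2niix"
        # json.dumps produces double-quoted, escaped strings
        lines.append("  - in_dir: " + json.dumps(in_dir))
        lines.append("    out_dir: " + json.dumps(os.path.join(output_dir, out_name)))
        lines.append("    filename: " + json.dumps(out_name))
    return "\n".join(lines) + "\n"
```

The returned text can be written to batch_config.yaml and passed to the batch converter. This sketch only builds the text; the output folders still need to be created separately, as the function above does.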

In the end, the database should be (this may be utopian) consistent enough to train the convolutional neural network based metal artifact reduction algorithm presented here, or to perform statistical analysis of the maxillofacial framework.

@Kramer84 how is your progress? Do you have any specific questions about this?