Getting started with MONAI Label

Dear Slicer Community,

I have been working with Slicer for a while now and have recently started using the MONAI Label module in Slicer. However, I'm having trouble getting the output that I'm looking for.

I've managed to train the segmentation model on an initial dataset that I had previously segmented using TotalSegmentator. However, when testing the model, there is a large offset in the output, as seen below.

I tried to recreate this by training the segmentation model on a subset of the publicly available TotalSegmentator dataset and encountered different errors. It seems that MONAI is struggling to apply a transform (see below):

[2024-09-24 12:20:25,830] [17576] [MainThread] [INFO] (ignite.engine.engine.SupervisedEvaluator:876) - Engine run resuming from iteration 0, epoch 0 until 1 epochs
[2024-09-24 12:32:03,933] [17576] [MainThread] [ERROR] (ignite.engine.engine.SupervisedEvaluator:1086) - Current run is terminating due to exception: applying transform <monai.transforms.compose.Compose object at 0x000001D6A6078CD0>
2024-09-24 12:32:03,933 - ERROR - Exception: applying transform <monai.transforms.compose.Compose object at 0x000001D6A6078CD0>

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 3.16 GiB. GPU

RuntimeError: applying transform <monai.transforms.post.dictionary.AsDiscreted object at 0x000001D6A60781C0>

What kind of transform is MONAI trying to perform?
Could it be a driver issue that CUDA runs out of memory when trying to allocate only 3.16 GiB? The GPU in use is an Nvidia RTX 4090 (24 GiB).
Do I need to manually clear the GPU cache?
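In case it matters, this is my understanding of how to inspect and clear the cache from Python (a minimal PyTorch sketch, not something MONAI Label exposes as far as I know):

```python
import torch

# How much memory PyTorch currently holds on the default GPU.
print(f"allocated: {torch.cuda.memory_allocated() / 1024**3:.2f} GiB")
print(f"reserved:  {torch.cuda.memory_reserved() / 1024**3:.2f} GiB")

# Release cached, currently unused blocks back to the driver. This does not
# free tensors that are still referenced, so it rarely fixes a genuine OOM.
torch.cuda.empty_cache()
```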

When I afterwards trained on a different subset of my own data, it took about a minute after every epoch for the engine run to resume (see below).

[2024-09-24 13:45:26,169] [16008] [MainThread] [INFO] (ignite.engine.engine.SupervisedEvaluator:876) - Engine run resuming from iteration 0, epoch 0 until 1 epochs
[2024-09-24 13:46:12,469] [16008] [MainThread] [INFO] (ignite.engine.engine.SupervisedEvaluator:259) - Got new best metric of val_mean_dice: 0.0
2024-09-24 13:46:12,469 - INFO - Epoch[1] Metrics – val_mean_dice: 0.0000 val_skeletal_muscle_mean_dice: 0.0000 val_subcutaneous_fat_mean_dice: 0.0000 val_torso_fat_mean_dice: 0.0000

During the first training run, the one that produced the offset, training was a lot quicker. The offset eventually reappeared regardless of whether I trained and tested on left shoulder scans only or on mixed unilateral scans.

Finally, is there a way to compare the outputs of different models? Auto Segmentation/Models only shows “segmentation” as an option, even though the different models (all producing the offset) have been saved with unique names. Could it be that I'm testing on the wrong model?

I would highly appreciate any input. Thank you!

Best regards,
Dennis

Hello again, I’d like to provide an update on the situation, give some more background info, reorganize the issues that I’ve encountered and pose more concise questions.

I have recently started using MONAI for a research project that involves automating segmentation of the scapula to extract bone density values that could potentially be used for surgical planning. My background is in biomechanics, and I have some, but limited, programming skills. However, I have been using 3D Slicer for almost a year now.

I have a dataset of shoulder CT scans that were cropped such that the crop axis is not aligned with the scanning axis; when loading them into 3D Slicer, some scans appear rotated. I used TotalSegmentator to segment the scapula and other structures of interest, and the different orientations of the scans did not seem to be a problem there.
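To illustrate what I mean by “rotated”, this is how I checked the on-disk geometry of the scans (a minimal SimpleITK sketch; the filename is a placeholder):

```python
import SimpleITK as sitk

# Read one of the cropped shoulder CTs and print its geometry. A direction
# matrix far from identity means the volume axes are tilted against the
# patient axes, which is what shows up as a rotation in Slicer.
img = sitk.ReadImage("shoulder_ct_001.nii.gz")  # placeholder filename
print("size:     ", img.GetSize())
print("spacing:  ", img.GetSpacing())
print("origin:   ", img.GetOrigin())
print("direction:", img.GetDirection())
```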

Since MONAI seems to preprocess the data with transforms, I assume that the differently oriented scans should not cause any issues when training a segmentation model in MONAI. Is that correct?

To familiarize myself with how MONAI works, I tested segmenting the spleen dataset with the pretrained spleen model, and that worked fine.

As a first test on my own data, I trained the segmentation model on a subset of 7 scans (average dimensions: 1000×500×150 voxels, average spacing: 0.25×0.25×1 mm) with 3 segments each (I adjusted segmentation.py in configs accordingly) for 500 epochs on an Nvidia RTX 4090 with 24 GiB. Training completed successfully in about an hour.
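For completeness, my adjustment to configs/segmentation.py essentially swapped in my own label dictionary, roughly like this (a sketch; in the radiology sample app this dict is assigned to self.labels in the task's __init__, and the names match the per-label Dice metrics in the logs below):

```python
# The three segments I trained on; the indices are the label values in my
# TotalSegmentator-derived label maps.
labels = {
    "skeletal_muscle": 1,
    "subcutaneous_fat": 2,
    "torso_fat": 3,
}
```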

Does this training duration fall into the expected time range?

I then wanted to test the performance of the trained model using the Auto Segmentation tab. The dropdown menu in that tab had “segmentation” as the only option. I loaded the next sample, which didn't have any ground-truth segmentations, and hit run. The resulting segmentations had roughly the desired shape, but they were offset so far that they lay outside the body (see image below).

Am I testing on my own model?

How can I specify which model I want to test on?

Are these results to be expected, or should that amount of training already yield better results?

Do I need to increase the number of scans and epochs to get better results?

After restarting the server in the same app with a different dataset loaded, I trained another segmentation model, giving it a different name in the Options tab. After successful training, there is still only the same single model (“segmentation”) available in the Auto Segmentation tab.

Which model am I testing on if I load the next unlabeled scan and hit run?

How can I choose which model I want to test on?

Is it even possible to have multiple segmentation models in the same app or do I have to create a new app for every model?

Is there a way to visualize the segmentation outputs from the validation scans when val_split > 0, i.e., when some scans are used for validation?

To check whether there is something wrong with my dataset, I tried to reproduce the offset results. I used a subset (16 chest CT scans) of the TotalSegmentator dataset and ran TotalSegmentator on those scans to get the needed segmentations. Then I loaded the data into the same app and started training, again with the segmentation model. After the first epoch, the run failed with the following error message:

[2024-09-24 12:20:11,585] [17576] [MainThread] [INFO] (monailabel.tasks.train.basic_train:264) - 0 - Records for Training: 16
[2024-09-24 12:20:11,588] [17576] [MainThread] [INFO] (ignite.engine.engine.SupervisedTrainer:876) - Engine run resuming from iteration 0, epoch 0 until 100 epochs
2024-09-24 12:20:13,821 - INFO - Epoch: 1/100, Iter: 1/16 – train_loss: 2.7914
2024-09-24 12:20:14,936 - INFO - Epoch: 1/100, Iter: 2/16 – train_loss: 2.2761
2024-09-24 12:20:15,729 - INFO - Epoch: 1/100, Iter: 3/16 – train_loss: 1.9255
2024-09-24 12:20:16,620 - INFO - Epoch: 1/100, Iter: 4/16 – train_loss: 1.9023
2024-09-24 12:20:17,308 - INFO - Epoch: 1/100, Iter: 5/16 – train_loss: 1.9119
2024-09-24 12:20:18,056 - INFO - Epoch: 1/100, Iter: 6/16 – train_loss: 2.3988
2024-09-24 12:20:18,838 - INFO - Epoch: 1/100, Iter: 7/16 – train_loss: 1.9958
2024-09-24 12:20:19,639 - INFO - Epoch: 1/100, Iter: 8/16 – train_loss: 1.9923
2024-09-24 12:20:20,427 - INFO - Epoch: 1/100, Iter: 9/16 – train_loss: 2.3822
2024-09-24 12:20:21,259 - INFO - Epoch: 1/100, Iter: 10/16 – train_loss: 2.1640
2024-09-24 12:20:22,024 - INFO - Epoch: 1/100, Iter: 11/16 – train_loss: 1.8334
2024-09-24 12:20:22,963 - INFO - Epoch: 1/100, Iter: 12/16 – train_loss: 1.8153
2024-09-24 12:20:23,869 - INFO - Epoch: 1/100, Iter: 13/16 – train_loss: 1.9521
2024-09-24 12:20:24,715 - INFO - Epoch: 1/100, Iter: 14/16 – train_loss: 1.7093
2024-09-24 12:20:25,162 - INFO - Epoch: 1/100, Iter: 15/16 – train_loss: 2.0465
2024-09-24 12:20:25,813 - INFO - Epoch: 1/100, Iter: 16/16 – train_loss: 2.4063
[2024-09-24 12:20:25,819] [17576] [MainThread] [INFO] (ignite.engine.engine.SupervisedTrainer:259) - Got new best metric of train_mean_dice: 0.04801511764526367
2024-09-24 12:20:25,819 - INFO - Epoch[1] Metrics – train_mean_dice: 0.0480 train_skeletal_muscle_mean_dice: 0.1105 train_subcutaneous_fat_mean_dice: 0.0133 train_torso_fat_mean_dice: 0.0000
2024-09-24 12:20:25,819 - INFO - Key metric: train_mean_dice best value: 0.04801511764526367 at epoch: 1
[2024-09-24 12:20:25,830] [17576] [MainThread] [INFO] (ignite.engine.engine.SupervisedEvaluator:876) - Engine run resuming from iteration 0, epoch 0 until 1 epochs
[2024-09-24 12:32:03,933] [17576] [MainThread] [ERROR] (ignite.engine.engine.SupervisedEvaluator:1086) - Current run is terminating due to exception: applying transform <monai.transforms.compose.Compose object at 0x000001D6A6078CD0>
2024-09-24 12:32:03,933 - ERROR - Exception: applying transform <monai.transforms.compose.Compose object at 0x000001D6A6078CD0>

I don’t understand this error.

Is MONAI struggling to transform the scans?

Or is the GPU struggling to process that amount of data?

How much data can a single Nvidia RTX 4090 (24GiB) handle?

Is there a buildup of cache when training multiple models in the same app?
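To put a rough number on the memory question: my back-of-envelope guess is that the failing allocation is a one-hot tensor over a full chest CT (which is what AsDiscreted produces when to_onehot is set), and for plausible dimensions that lands right around the 3.16 GiB from the error (a sketch under those assumptions; the volume size is a placeholder):

```python
# Assumed chest CT dimensions, in the TotalSegmentator ballpark (placeholder).
voxels = 512 * 512 * 800        # one full volume, ~2.1e8 voxels
channels = 4                    # background + my 3 labels, one-hot encoded
bytes_per_float32 = 4

gib = voxels * channels * bytes_per_float32 / 1024**3
print(f"{gib:.2f} GiB")         # -> 3.12 GiB for a single one-hot tensor
```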

With this comparison having failed, I set out to train another model on a different subset of my own data. This time I loaded 13 scans with roughly the same dimensions and spacing as the original dataset and trained again. Training took a lot longer this time: after every epoch, training seemed to pause for about one minute at this point (the same point where the error occurred previously):

[2024-09-24 13:45:24,528] [16008] [MainThread] [INFO] (monailabel.tasks.train.basic_train:264) - 0 - Records for Training: 13
[2024-09-24 13:45:24,530] [16008] [MainThread] [INFO] (ignite.engine.engine.SupervisedTrainer:876) - Engine run resuming from iteration 0, epoch 0 until 100 epochs
2024-09-24 13:45:24,786 - INFO - Epoch: 1/100, Iter: 1/13 – train_loss: 2.4122
2024-09-24 13:45:24,878 - INFO - Epoch: 1/100, Iter: 2/13 – train_loss: 2.3242
2024-09-24 13:45:24,992 - INFO - Epoch: 1/100, Iter: 3/13 – train_loss: 2.0153
2024-09-24 13:45:25,109 - INFO - Epoch: 1/100, Iter: 4/13 – train_loss: 1.6996
2024-09-24 13:45:25,225 - INFO - Epoch: 1/100, Iter: 5/13 – train_loss: 2.2290
2024-09-24 13:45:25,340 - INFO - Epoch: 1/100, Iter: 6/13 – train_loss: 1.4649
2024-09-24 13:45:25,456 - INFO - Epoch: 1/100, Iter: 7/13 – train_loss: 1.4484
2024-09-24 13:45:25,570 - INFO - Epoch: 1/100, Iter: 8/13 – train_loss: 1.4148
2024-09-24 13:45:25,686 - INFO - Epoch: 1/100, Iter: 9/13 – train_loss: 1.3817
2024-09-24 13:45:25,802 - INFO - Epoch: 1/100, Iter: 10/13 – train_loss: 2.7751
2024-09-24 13:45:25,925 - INFO - Epoch: 1/100, Iter: 11/13 – train_loss: 2.1342
2024-09-24 13:45:26,040 - INFO - Epoch: 1/100, Iter: 12/13 – train_loss: 1.7436
2024-09-24 13:45:26,153 - INFO - Epoch: 1/100, Iter: 13/13 – train_loss: 1.3803
[2024-09-24 13:45:26,157] [16008] [MainThread] [INFO] (ignite.engine.engine.SupervisedTrainer:259) - Got new best metric of train_mean_dice: 0.0
2024-09-24 13:45:26,157 - INFO - Epoch[1] Metrics – train_mean_dice: 0.0000 train_skeletal_muscle_mean_dice: 0.0000 train_subcutaneous_fat_mean_dice: 0.0000 train_torso_fat_mean_dice: 0.0000
2024-09-24 13:45:26,157 - INFO - Key metric: train_mean_dice best value: 0.0 at epoch: 1
[2024-09-24 13:45:26,169] [16008] [MainThread] [INFO] (ignite.engine.engine.SupervisedEvaluator:876) - Engine run resuming from iteration 0, epoch 0 until 1 epochs
[2024-09-24 13:46:12,469] [16008] [MainThread] [INFO] (ignite.engine.engine.SupervisedEvaluator:259) - Got new best metric of val_mean_dice: 0.0
2024-09-24 13:46:12,469 - INFO - Epoch[1] Metrics – val_mean_dice: 0.0000 val_skeletal_muscle_mean_dice: 0.0000 val_subcutaneous_fat_mean_dice: 0.0000 val_torso_fat_mean_dice: 0.0000
2024-09-24 13:46:12,469 - INFO - Key metric: val_mean_dice best value: 0.0 at epoch: 1
[2024-09-24 13:46:12,553] [16008] [MainThread] [INFO] (monailabel.tasks.train.handler:86) - New Model published: C:\Users\Eva\radiologyBoneDensity\model\segmentation\test_rem_pat\model.pt => C:\Users\Eva\radiologyBoneDensity\model\segmentation.pt
[2024-09-24 13:46:12,554] [16008] [MainThread] [INFO] (ignite.engine.engine.SupervisedEvaluator:972) - Epoch[1] Complete. Time taken: 00:00:46.368
[2024-09-24 13:46:12,555] [16008] [MainThread] [INFO] (ignite.engine.engine.SupervisedEvaluator:988) - Engine run complete. Time taken: 00:00:46.386
[2024-09-24 13:46:12,607] [16008] [MainThread] [INFO] (ignite.engine.engine.SupervisedTrainer:972) - Epoch[1] Complete. Time taken: 00:00:48.035
2024-09-24 13:46:12,835 - INFO - Epoch: 2/100, Iter: 1/13 – train_loss: 1.7215
2024-09-24 13:46:13,029 - INFO - Epoch: 2/100, Iter: 2/13 – train_loss: 1.3821
2024-09-24 13:46:13,217 - INFO - Epoch: 2/100, Iter: 3/13 – train_loss: 1.6698
2024-09-24 13:46:13,416 - INFO - Epoch: 2/100, Iter: 4/13 – train_loss: 1.7367
2024-09-24 13:46:13,609 - INFO - Epoch: 2/100, Iter: 5/13 – train_loss: 1.7043
2024-09-24 13:46:13,824 - INFO - Epoch: 2/100, Iter: 6/13 – train_loss: 1.6097
2024-09-24 13:46:14,019 - INFO - Epoch: 2/100, Iter: 7/13 – train_loss: 2.4585
2024-09-24 13:46:14,211 - INFO - Epoch: 2/100, Iter: 8/13 – train_loss: 1.6047
2024-09-24 13:46:14,419 - INFO - Epoch: 2/100, Iter: 9/13 – train_loss: 1.8360
2024-09-24 13:46:14,642 - INFO - Epoch: 2/100, Iter: 10/13 – train_loss: 1.3524
2024-09-24 13:46:14,860 - INFO - Epoch: 2/100, Iter: 11/13 – train_loss: 1.7037
2024-09-24 13:46:15,045 - INFO - Epoch: 2/100, Iter: 12/13 – train_loss: 1.4732
2024-09-24 13:46:15,241 - INFO - Epoch: 2/100, Iter: 13/13 – train_loss: 1.3968
[2024-09-24 13:46:15,256] [16008] [MainThread] [INFO] (ignite.engine.engine.SupervisedTrainer:259) - Got new best metric of train_mean_dice: 0.04763128608465195
2024-09-24 13:46:15,257 - INFO - Epoch[2] Metrics – train_mean_dice: 0.0476 train_skeletal_muscle_mean_dice: 0.0953 train_subcutaneous_fat_mean_dice: 0.0000 train_torso_fat_mean_dice: 0.0000
2024-09-24 13:46:15,257 - INFO - Key metric: train_mean_dice best value: 0.04763128608465195 at epoch: 2
[2024-09-24 13:46:15,259] [16008] [MainThread] [INFO] (ignite.engine.engine.SupervisedEvaluator:876) - Engine run resuming from iteration 0, epoch 1 until 2 epochs
[2024-09-24 13:47:01,627] [16008] [MainThread] [INFO] (ignite.engine.engine.SupervisedEvaluator:259) - Got new best metric of val_mean_dice: 0.024709107354283333
2024-09-24 13:47:01,627 - INFO - Epoch[2] Metrics – val_mean_dice: 0.0247 val_skeletal_muscle_mean_dice: 0.0741 val_subcutaneous_fat_mean_dice: 0.0000 val_torso_fat_mean_dice: 0.0000

I expected training to take longer with more scans, but I didn't expect such a big difference (around 5 seconds per epoch in the first training with 7 samples vs. about 1 minute per epoch with 13 samples).

What is happening in that step?

Is this increase in training time expected?

Any input would be greatly appreciated.

I would guess not; you will likely need to resample the volumes and labelmaps to the same voxel spacing for MONAI.
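Something along these lines, for example (a minimal MONAI sketch; the dictionary keys and the target spacing are placeholders you would adapt to your data):

```python
from monai.transforms import (
    Compose, EnsureChannelFirstd, LoadImaged, Orientationd, Spacingd
)

# Reorient everything to one axis convention and resample image and label
# to a common voxel spacing before training. Use nearest-neighbour
# interpolation for the labelmap so the label values stay discrete.
pre = Compose([
    LoadImaged(keys=["image", "label"]),
    EnsureChannelFirstd(keys=["image", "label"]),
    Orientationd(keys=["image", "label"], axcodes="RAS"),
    Spacingd(keys=["image", "label"], pixdim=(1.0, 1.0, 1.0),
             mode=("bilinear", "nearest")),
])
```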

As for your other questions, they may get resolved after you get past this first issue. But if not, maybe try asking one small question per post.