MONAI Auto3DSeg – inconsistent performance for vertebral body segmentation

Hi everyone,

I’m currently working with Prof. Ron Alkalay on a spine CT segmentation project using MONAI Auto3DSeg and would appreciate any advice from the community.

My task is to segment only vertebral bodies. I reformulated it as a binary segmentation task (VB vs. non-VB), which initially improved the validation Dice to >70%.

However, when I ran a second round of training, the performance dropped noticeably and did not recover, even though the training process itself ran normally.

Between the two runs, I changed the following parameters:

  • Resample resolution
    From: (0.3125, 0.3125, 0.5)
    To: (0.3125, 0.3125, 1.0)

  • ROI size
    From: (128, 128, 64)
    To: (128, 128, 96)

Other than these changes, the setup and data split were kept the same.
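
For reference, here is what the two settings mean in physical terms, assuming the ROI is cropped after resampling to the target spacing (which is my understanding, but I have not verified it in the Auto3DSeg code). The helper name below is just for illustration; it is not part of MONAI.

```python
def roi_extent_mm(spacing, roi_size):
    """Physical size of one training patch: voxel spacing (mm) times voxel count, per axis."""
    return tuple(s * n for s, n in zip(spacing, roi_size))


# Run 1: finer through-plane spacing, shallower ROI
run1 = roi_extent_mm((0.3125, 0.3125, 0.5), (128, 128, 64))
# Run 2: coarser through-plane spacing, deeper ROI
run2 = roi_extent_mm((0.3125, 0.3125, 1.0), (128, 128, 96))

print(run1)  # (40.0, 40.0, 32.0)
print(run2)  # (40.0, 40.0, 96.0)

# Voxel anisotropy (through-plane spacing / in-plane spacing)
print(0.5 / 0.3125, 1.0 / 0.3125)  # 1.6 3.2
```

So each patch still covers only 40 x 40 mm in-plane, while its through-plane coverage tripled from 32 mm to 96 mm and the voxel anisotropy doubled. I don't know whether this explains the drop, but it shows the two runs differ in more than one way at once.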

In both experiments, I used Auto3DSeg’s Quick training mode.

Dataset details

  • imagesTr: ~60 GB, 257 CT volumes (.nii.gz)

  • labelsTr_bin: ~385 MB, 257 binary segmentation masks (.nii.gz)

Some additional details:

  • Tool: MONAI Auto3DSeg - segresnet_0

  • Task: Vertebral body vs. background (binary)

  • Modality: CT

  • Tried adjusting: ROI size, AMP, batch/auto-scaling

I’m trying to understand whether this performance drop is likely related to the coarser through-plane resolution, the larger ROI depth, Quick mode limitations, or some interaction with Auto3DSeg’s preprocessing and model selection.

If anyone has experience using Auto3DSeg for similar anatomical segmentation tasks, I would really appreciate any guidance on what to sanity-check first, or best practices for tuning these parameters.

I’m happy to share configs or logs if helpful.

Thank you very much in advance!

I have tried Auto3DSeg for training, but not extensively. The results were good, but not noticeably better than nnU-Net, so I’ve standardized on nnU-Net for the last few years and have generally found that it behaves predictably as more training data is added. I don’t do any resampling or cropping myself; I just provide the segmentations and the native-resolution CT data.

The details change for each dataset, but the basic format for me is still as demonstrated in this nnmouse repo.

The format of the input directories is a little different between Auto3DSeg and nnU-Net, but they are close enough that you can accomplish the renaming just with symbolic links, so you don’t need to duplicate the data.
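
A rough sketch of the symlink approach, in case it helps. It assumes each image and its label share the same file name in the source folders, and that the target uses the nnU-Net convention of a `_0000` channel suffix on image names (single-channel CT) with no suffix on labels. The function name and folder layout are made up for illustration, and you would still need to write nnU-Net's `dataset.json` separately.

```python
import os
from pathlib import Path


def link_for_nnunet(src_images, src_labels, dst_root, channel="0000"):
    """Symlink image/label pairs into an nnU-Net-style imagesTr/labelsTr layout.

    Assumes (hypothetically) that image and label files have identical names.
    Returns the list of case identifiers found in both source folders.
    """
    dst_root = Path(dst_root)
    img_dst = dst_root / "imagesTr"
    lbl_dst = dst_root / "labelsTr"
    img_dst.mkdir(parents=True, exist_ok=True)
    lbl_dst.mkdir(parents=True, exist_ok=True)

    suffix = ".nii.gz"
    cases = []
    for img in sorted(Path(src_images).glob("*" + suffix)):
        lbl = Path(src_labels) / img.name
        if not lbl.exists():
            continue  # skip images that have no matching label
        case = img.name[: -len(suffix)]
        pairs = [
            (img, img_dst / f"{case}_{channel}{suffix}"),
            (lbl, lbl_dst / f"{case}{suffix}"),
        ]
        for src, dst in pairs:
            if not os.path.lexists(dst):  # leave existing links alone
                os.symlink(src.resolve(), dst)
        cases.append(case)
    return cases
```

You would call it with something like `link_for_nnunet("imagesTr", "labelsTr_bin", "nnUNet_raw/Dataset001_Spine")`, where the dataset folder name is only a placeholder. Note that creating symlinks on Windows may require extra permissions.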

Was there a particular reason you wanted to use Auto3DSeg instead of nnU-Net?