We performed a CT segmentation for which we placed multiple labels into one dataset.
Now, when we extract the features using either the command line or a Jupyter notebook, the image and mask data are loaded again for each label, even though they come from the same image data. Is there a way to speed up the extraction so that the extractor loads the image and mask once and then goes through all the labels? Loading the image and mask into a variable and then passing it to the extractor sped things up a little…
If you do this from a Jupyter notebook, you can load the image and mask as SimpleITK image objects and pass those instead of the paths to the corresponding files. The image and mask are then loaded only once. All other processing steps are repeated for each extraction, as they are ROI-dependent (e.g. resampling only resamples the ROI and a limited region around it, for computational and memory efficiency).
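A minimal sketch of that approach, assuming a default-configured `RadiomicsFeatureExtractor` and a mask with an integer pixel type where 0 is background. The helper names `extract_per_label` and `extract_from_files` are made up for illustration, and the label discovery via `LabelShapeStatisticsImageFilter` is just one possible way to list the labels:

```python
def extract_per_label(extractor, image, mask, labels):
    """Call extractor.execute once per label, reusing the loaded image and mask."""
    results = {}
    for label in labels:
        # The image and mask objects are passed instead of file paths,
        # so nothing is read from disk inside this loop.
        results[label] = extractor.execute(image, mask, label=label)
    return results


def extract_from_files(image_path, mask_path):
    """Hypothetical wrapper: read both files once, then extract every label."""
    import SimpleITK as sitk
    from radiomics import featureextractor

    image = sitk.ReadImage(image_path)  # read from disk once
    mask = sitk.ReadImage(mask_path)    # read from disk once

    # List the labels present in the mask (assumes background value 0).
    stats = sitk.LabelShapeStatisticsImageFilter()
    stats.Execute(mask)
    labels = stats.GetLabels()

    # Default settings here; pass a parameter file or settings as needed.
    extractor = featureextractor.RadiomicsFeatureExtractor()
    return extract_per_label(extractor, image, mask, labels)
```

Calling `extract_from_files("image.nrrd", "mask.nrrd")` (placeholder file names) would return a dictionary mapping each label value to its feature set. Only the disk reads are saved; the ROI-dependent preprocessing still runs once per label.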
On the command line there is no option to prevent reloading the image, but you can enable parallel processing (--jobs) to speed things up a bit. This parallelizes the process at the case level (i.e. each thread processes a single case, and multiple threads run in parallel), which is especially handy for large datasets.
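For the command-line route, one way to set this up is to write a batch file with one row (case) per label and let --jobs spread the rows over workers. The sketch below assumes the batch CSV uses the columns Image, Mask and Label and that the output options are -o and -f; please check both against the documentation of your PyRadiomics version. The helper names and file names are made up for illustration:

```python
import csv


def write_batch_csv(csv_path, image_path, mask_path, labels):
    """Write one row per label, so each label becomes its own case."""
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        # Column names are an assumption about the batch input format.
        writer.writerow(["Image", "Mask", "Label"])
        for label in labels:
            writer.writerow([image_path, mask_path, label])


def build_command(csv_path, output_path, jobs):
    """Build the pyradiomics call that processes the cases in parallel."""
    return ["pyradiomics", csv_path,
            "-o", output_path, "-f", "csv",
            "--jobs", str(jobs)]
```

The resulting list can be handed to `subprocess.run(..., check=True)`. Each row still reloads the image and mask, so the gain comes only from running several cases at the same time.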