SAM 2 with video support

Meta’s Segment Anything Model 2 was announced last week. Are there any thoughts about its applicability to 3D images?

Maybe this is of interest:

Also this: [2408.03322] Segment Anything in Medical Images and Videos: Benchmark and Deployment

With a Slicer module! GitHub - bowang-lab/MedSAMSlicer at SAM2

I was aware of MedSAMSlicer, but I didn’t notice they have a SAM2 branch :smile: Thank you for pointing it out.

They uploaded their preprint five days after the SAM 2 preprint.

Many people celebrate the performance of SAM when they see demo videos, but they don’t realize that they are looking at trivial segmentation tasks (e.g., kidney segmentation on a single CT slice where no similar structures are nearby), and they don’t notice when the segmentation fails spectacularly on moderately difficult tasks (such as missing half of the liver). See for example @alireza’s recent experiments with SAM2 for 3D medical image segmentation, with quite poor results: Alireza Sedghi on LinkedIn: SAM 2 Released: 3D Medical Image Segmentation Solved? I got a screen… | 18 comments

SAM’s performance is remarkable when used interactively on 2D images of everyday objects, the well-engineered web demos make SAM easily accessible to the crowds, and open-sourcing the model is exemplary. It is also nice that it is a general-purpose tool that could be made to work on any imaging modality to interactively segment any structure. However, 99% of users will not want any interactivity in the segmentation and don’t want to work slice by slice; they just want fast, fully automatic 3D segmentation for free, and they can already get it via TotalSegmentator, MONAIAuto3DSeg, DentalSegmentator, etc. So, my overall impression is that, considering their clinical relevance and impact, SAM-based models may not deserve as much attention as they are getting.

TotalSegmentator, MONAIAuto3DSeg, DentalSegmentator, etc.

Ground truth for training them needs to come from somewhere. SAM is a great interactive segmentation tool.

Yes, and this is its main limitation: it requires interaction. If the user needs to interact with the image for several (potentially tens of) seconds then the value is questionable, because the clinician can make the standard, well-proven measurements in about the same time, without 3D segmentation. Having a 3D segmentation may have some extra value, but the cost of required extra time of the clinician usually makes this a tough sell.

In contrast, fully automatic segmentation is clinically useful and impactful, because it both reduces the clinician’s effort and can provide richer results. Interactive segmentation tools can never come close to automatic methods in routine clinical use.

Interactive methods may still play a role in research, in generating ground-truth training data, or in helping with very difficult segmentations. But this is a very small arena, crowded with various tools, and SAM-based tools do not seem to stand out in any way: they are not particularly well positioned for solving very hard problems, for doing 3D segmentations, or for providing robust, bias-free, consistent, or anatomically correct segmentations.

SAM/SAM2/MedSAM/SAM-Med3D/etc. might find a niche where they are useful in medical imaging, and maybe in the future they can even carve out a larger area where they are successful. I guess I just don’t understand the excitement about them when neither the current performance nor the future prospects look so great. Anyway, if anybody can set up a nice SAM-based 3D segmentation tool in Slicer, let me know; I’ll try it, and if it works very well I’m ready to change my mind and will happily advertise it.

SAM is foreign to me, so I won’t comment on it.

This means that clinicians want already-processed data, i.e., results only. For decades, radiologists have provided that and are still doing so.

‘AI’ tools are trained on normal data and certainly provide ‘good enough’ results on new, near-normal data, though only on a well-defined ‘major organs’ target. ‘Good enough’ because they are far from being an immediate and unquestionable reference for discrete measurements.

Clinicians deal with pathological situations that do not correspond to normal inputs. The spectrum of anomalous anatomies is very wide, nearly infinite. Feeding those ‘AI’ algorithms with such diverse anatomies would take more than infinite time… and funding that clinicians are not ready to provide.

To me, the current situation is that clinicians have excessive expectations of ‘auto-tools’. Digital tools are not yet ready for effortless, one-click consumption. Manual segmentation will prevail for the coming decade or more. The best bet for clinicians remains the radiologist, if they find it too hard to invest in understanding and segmenting themselves.

As for technologists, they should not over-sell an easy life to clinicians either.

I agree with this comment. My experience with SAM has been less than ideal for real segmentation tasks. However, I can see two specific use cases that might be useful for the SlicerMorph community:

  1. We often work with organisms and scans for which there is no standard anatomy, orientation, or calibration, and no automated tools (specifically ML tools) will guide us. In those situations, users often end up doing slice-by-slice segmentation with a lot of interaction anyway. So, if SAM can be integrated into Slicer as a Segment Editor effect and used that way, I am willing to give it a try.

  2. We do have a very specific use case in which we need to remove the background (which may or may not be very uniform) from 2D photographs of specimens taken at various angles, to prepare them for 3D photogrammetry. Most of the time, algorithms like SIFT do that OK programmatically, but they sometimes “eat” into the specimen if the contrast is not high. If a SAM-like tool can do this better with user guidance, then I am happy to try it.

These are not sufficiently wide use cases, and alternative solutions exist for them, hence I am not motivated to spend time and resources on the integration. But if someone wants to do it, and do it in a robust way (I couldn’t even get started with the previous extensions; the installation steps were not trivial), I am of course willing to try, use, and promote it if it works.

For awareness, here’s a new preprint by NVIDIA discussing this topic: A Short Review and Evaluation of SAM2’s Performance in 3D CT Image Segmentation | Abstract (arxiv.org)