Stable Diffusion 2.1 trained on medical images?

It would be very nice to see a model trained on NCI-IDC (medical images), such as the open-source Stable Diffusion 2.1, for text-to-image generation.



Depending on the equipment you have available (especially the GPU) and your willingness to invest some time, this is something you could likely accomplish yourself by fine-tuning Stable Diffusion (i.e., you start from a Stable Diffusion model and then extend its training on a set of images you curate and label). If you don't have a sufficiently powerful GPU on a local machine, you could rent enough time on a cloud service to do this (probably for well under $50 USD).

I'd be happy to point you to some resources and suggest the approach I think would make sense, but first I wonder what you would hope to accomplish if you succeeded. The likely result is a model that generates images which, at first glance, look like the kinds of medical images you trained on, but which generally look wrong on closer inspection: bones connecting to the wrong other bones, features that don't make sense anatomically, the wrong number of ribs, mixtures of kinds of images, and so on. (Stable Diffusion famously still struggles to produce images in which hands are not obviously deformed, despite being trained on literally millions of images containing hands.) I'm not sure what use these sorts of images would be, but perhaps you have some good ideas.
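For context on what "extend the training" means mechanically: fine-tuning keeps the same denoising objective Stable Diffusion was pretrained with — noise a clean image to a random timestep, then train the network to predict the added noise. Here is a toy NumPy sketch of that setup; the schedule constants and array sizes are illustrative only, and real fine-tuning would go through a library such as Hugging Face diffusers with a pretrained UNet rather than this stand-in:

```python
import numpy as np

# Toy DDPM-style forward-noising schedule (illustrative values, not
# Stable Diffusion's actual configuration).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)  # cumulative signal fraction, decreasing in t

def add_noise(x0, t, rng):
    """Noise a clean latent x0 to timestep t; returns (noisy latent, noise)."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return xt, eps

rng = np.random.default_rng(0)
x0 = rng.standard_normal((8, 8))      # stand-in for an encoded image latent
xt, eps = add_noise(x0, t=500, rng=rng)

# A fine-tuning step would feed (xt, t, text embedding) to the UNet and
# minimize the MSE between its prediction and eps; here we just show the
# loss against a trivial "predict zero" baseline.
baseline_loss = float(np.mean(eps ** 2))
```

The point of the sketch is only that the training loop itself is ordinary supervised regression on the noise, which is why it transfers straightforwardly to a new image domain like medical scans.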


I just had a reference to this pop up in a news feed: some Stanford researchers did a pretty comprehensive-looking version of what I was about to suggest you try on a small scale. I haven't had a chance to dig into the paper yet, but you might find it very relevant.


Nice. I'll take a look.

An update from the same group: