Running a Module in Parallel

Hello,

I am currently using a Python scripted module to run my vertebra segmentation program. The program uses a for loop to segment at multiple fiducial points. This process takes a long time to run, and I think running the iterations in parallel would be faster than going through them one at a time in a loop. How can I run my program in parallel using Python?

@cpinter @Sunderlandkyl @lassoan

Your most likely route is to call into a C++ library from Python.

Python scripted module -> C++ (loadable) module with no GUI -> set flag when finished processing

Another option is to start an independent PythonSlicer process and pass over the data it needs. The SlicerProcess module does this using pickle and stdio, making it pretty efficient. The nice thing is that you get a complete slicer python environment with all the same libraries but independent of mrml and the GUI.
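
A minimal sketch of the pickle + stdio idea, just to show the pattern (this is not the actual SlicerProcesses API; the script name, dictionary keys, and the threshold step are placeholders):

```python
# worker.py - executed by the PythonSlicer launcher that ships with Slicer
import sys
import pickle

inputs = pickle.loads(sys.stdin.buffer.read())   # e.g. {"volume": ndarray, "seed": [r, a, s]}
result = inputs["volume"] > 300                  # placeholder for the real processing step
sys.stdout.buffer.write(pickle.dumps(result))
```

On the parent side, the scripted module pickles the inputs, launches the worker, and unpickles the result:

```python
# parent side, run from Slicer's Python environment
import pickle
import subprocess

import numpy as np

volume = np.random.rand(64, 64, 64)                      # stand-in for the real image data
proc = subprocess.Popen(["PythonSlicer", "worker.py"],   # launcher may need a full path
                        stdin=subprocess.PIPE, stdout=subprocess.PIPE)
out, _ = proc.communicate(pickle.dumps({"volume": volume, "seed": [10, 20, 30]}))
result = pickle.loads(out)                               # numpy bool array in this toy example
```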

Does anybody know how to do this?

So I have opened up a scripted CLI module and I have the function that I would like to copy into it (the onApplybutton function from the Slicer scripted module). How do I effectively transfer this code to my scripted CLI module and then run it from my scripted module?

Hello,

I have a local threshold segmentation code in a scripted Python module which iterates through fiducial points and runs the local threshold function. Iterating through the points takes a while, and to speed it up I would like to run it in parallel using a scripted CLI Python module. I have set a list of parameters, and I have this line of code:

slicer.cli.runSync(slicer.modules.climodulecode, None, param, True, True)

I am not sure what to do in terms of adding code to the scripted CLI module and how to use the parameters from there. I have looked at some examples, but I am still a little unclear.

Thanks

param = {"inputVolume": masterVolumeNode.GetID(), "MinimumThreshold": 265, "MaximumThreshold": 1009, "MinimumDiameterMm": 9, "Seed": fidList.GetID()}

These are my parameters. Any ideas? @lassoan @pieper
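
For reference, a rough sketch of what the receiving end of a Python scripted CLI could look like for parameters like these. Everything here is an assumption: the parameter names have to match the CLI’s XML description, Slicer passes the input volume as a temporary file path, and the Seed point is left out because its command-line form depends on how the point parameter is declared in the XML.

```python
#!/usr/bin/env python-real
# Hypothetical body of the scripted CLI; argparse flags mirror the dictionary above.
import argparse


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--inputVolume")                   # path to a temporary image file
    parser.add_argument("--MinimumThreshold", type=float)
    parser.add_argument("--MaximumThreshold", type=float)
    parser.add_argument("--MinimumDiameterMm", type=float)
    args, _ = parser.parse_known_args()                    # ignore parameters not modelled here

    # ... load args.inputVolume (e.g. with SimpleITK), run the thresholding,
    # and write the result to the output path Slicer provides ...


if __name__ == "__main__":
    main()
```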

slicer.cli.runSync blocks execution until processing is complete. slicer.cli.run(..., wait_for_completion=False) is not much better either, as Slicer always runs only one CLI at a time (the only advantage is that you can still use Slicer while the computation is running in the background). For parallel execution, I would recommend using @pieper’s SlicerProcesses extension.

Another approach is to keep a single process but use multiple seeds. LocalThreshold effect uses only a single input point, but you could modify it to take all your input points at once.
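
For example, you could gather every control point of the fiducial list (fidList from the post above) up front and hand them all to a single, modified processing call; runLocalThresholdOnAllSeeds below is hypothetical:

```python
# Collect all control points from the markups fiducial node, then process
# them in one batched call instead of invoking the effect once per point.
points = []
for i in range(fidList.GetNumberOfControlPoints()):
    p = [0.0, 0.0, 0.0]
    fidList.GetNthControlPointPositionWorld(i, p)
    points.append(list(p))

runLocalThresholdOnAllSeeds(points)   # hypothetical batched variant of the effect
```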

However, before you start trying any of these, the most important thing is to profile your existing implementation. You need to know which line(s) of code take most of the time and focus only on those. There are Python profilers that you can configure, or you can measure approximate execution time by adding log messages.
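
For example, a quick way to see where the time goes with cProfile (the method name is just a placeholder for your own entry point):

```python
import cProfile
import pstats

# Profile one full run of the segmentation loop and print the 20 most expensive calls.
cProfile.run("logic.segmentAllFiducials()", "segment_profile")
pstats.Stats("segment_profile").sort_stats("cumulative").print_stats(20)
```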

I would be interested in creating a CLI module (written in C++) for saving selected nodes or the whole scene (just what the save dialog achieves).
I think this would be useful since it would allow the user to autosave by periodically executing a timer callback, and (if I understood correctly) the Slicer GUI would not freeze; furthermore, all features of Slicer (processing and visualization) would remain available as normal.

Would this idea work? Would this idea have a positive impact if it’s implemented?

Thank you

If it’s a CLI running as a separate process (the default), then Slicer would communicate with it via files and there would be no real time saved. If the saving is done in a separate thread, there could be a problem if, for example, the data is deleted in the main thread during the save. You could implement a threaded version that copies all the data to private memory in the thread and then does the disk IO while the main thread goes on to other tasks. In fact, you can use multiple threads, say one for each data file, and that could speed things up, for example for compression. I tried this once for reading and got about a 6x performance improvement for a scene with lots of files.
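
A minimal sketch of that “copy first, write later” idea (numpy’s save is only a stand-in for the real writer; nothing here is an existing Slicer API):

```python
import threading

import numpy as np


def save_in_background(array, path):
    snapshot = array.copy()              # main thread makes a private copy first
    def _write():
        np.save(path, snapshot)          # disk IO (and any compression) happens off the main thread
    worker = threading.Thread(target=_write, daemon=True)
    worker.start()
    return worker                        # caller can join() later if it needs to wait
```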

So memory cannot be shared between processes even in a read-only mode?
Maybe you could flag/lock the nodes that are being read so they cannot be modified during the save.
Could two Slicer instances share RAM and through it share node references, so one saves the nodes in the background while the other handles their visualization in the foreground? Maybe that’s possible in a virtualized environment?

There’s nothing that locks memory in the scene so sharing it between processes would be unsafe in general (modules can modify the scene contents). Copying in memory is usually a very efficient operation compared to IO so it’s probably the best way to go. It should be easy to try some timing experiments.
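
For example, a throwaway timing experiment along those lines (the array size and file name are arbitrary):

```python
import time

import numpy as np

data = np.random.rand(512, 512, 300)   # roughly 600 MB of float64

t0 = time.time()
copied = data.copy()                   # in-memory copy
t1 = time.time()
np.save("timing_test.npy", data)       # uncompressed disk write
t2 = time.time()

print(f"in-memory copy: {t1 - t0:.2f} s, disk write: {t2 - t1:.2f} s")
```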

Documenting here comments made during the IGT session that took place at the 37th NA-MIC project week, related to SlicerParallelProcessing:

from @jcfr

Why not look into doing a scripted CLI module running in the background?

As well as improving the way such a module can communicate feedback back to the application.

from @cpinter

Not being able to run algorithms in an actual parallel process in Python was a big limitation. Steve’s module solves this issue, but that doesn’t mean the other options are no longer available.
CLI is super flexible in that you only need to specify the command line, and under the hood it can be anything, even Python.

From @lassoan

The only current limitation of CLIs is that Slicer runs them on a single background thread, so if you start multiple CLIs they are all executed one after the other on that thread. On most computers you have 8 or more cores, so allowing 5-10 CLIs to run in parallel could make things faster (as demonstrated by the ParallelProcessing extension).

cc: @ungi

current limitation of CLIs is that Slicer runs them on a single background thread, so if you start multiple CLIs they are all executed one after the other on that background thread

To address this, I started a topic Commits · jcfr/Slicer · GitHub

Apparently, the MRML scene cannot be directly delegated to another thread from the main thread. Therefore, I believe the main thread may still be used for this copy operation to make the scene/data available to other threads.

Yes, while copying the inputs and final outputs from/to the scene the main thread must be blocked (or the main thread must copy the data). Copying can be done by just replacing a few pointers, so the main thread is blocked for just microseconds.
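
As an illustration (plain VTK, leaving MRML and thread-safety aside): a shallow copy only re-points references to the existing buffers, while a deep copy duplicates the voxel data:

```python
import vtk

source = vtk.vtkImageData()
source.SetDimensions(512, 512, 300)
source.AllocateScalars(vtk.VTK_SHORT, 1)

shallow = vtk.vtkImageData()
shallow.ShallowCopy(source)   # copies a few pointers, so it is essentially free

deep = vtk.vtkImageData()
deep.DeepCopy(source)         # duplicates the whole voxel buffer; cost grows with input size
```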

I’d like to know if the usage of these ‘few pointers’ is dependent on input size.

What I am considering is deep copying the nodes on the main thread (segmentation nodes, transformation nodes, color table nodes, etc., except for volume nodes, which might be larger), handing the copies to another thread, and then performing the autosave operation there. The downside of this approach is that the memory overhead will increase as the size of the nodes grows.