Slicer crashes when calling multiprocessing

I want to use Python multiprocessing, but it makes Slicer crash.

Traceback (most recent call last):
  File "C:\Program Files\Slicer 4.7.0-2017-08-16\lib\Python\Lib\multiprocessing\process.py", line 249, in _bootstrap
    sys.stdin.close()
AttributeError: 'PythonQtStdInRedirect' object has no attribute 'close'

Does Slicer plan to fully support Python3 and PyQt5?

You can already build Slicer with Qt5; we’ll probably update to Qt5 in a couple of weeks.

We plan to move to Python3 when all Slicer’s dependencies support Python3.

PyQt uses the GPL license, which is incompatible with Slicer’s more permissive BSD-type license (we use PythonQt, which is LGPL).

I found that PythonQt has too many bugs. Why not replace it with PySide? It’s also LGPL.

PySide seems to be only for Qt4 (https://wiki.qt.io/PySide). Is PySide2 complete and stable, supporting Python3 and latest Qt5?

We don’t experience any issues with Qt wrapping using PythonQt, so unless we run into some major problems in the future, it is not likely Slicer would switch.

Do you know what some advantages of PySide2 over PythonQt would be? What “bugs” have you experienced with PythonQt? Are you sure you are using it correctly? Have you reported the errors to PythonQt?

  1. multiprocessing does not work
  2. Overriding QAbstractItemModel createIndex raises an error
  3. setItemDelegateForColumn crashes
  4. Delegates do not work normally

The code runs fine on PyQt

The issues that you describe do not seem to be difficult to address, except the last one, which I don’t understand. I suggest submitting your questions (with a more specific description of the problem) to https://sourceforge.net/p/pythonqt/discussion/.

There are some differences compared to PySide (see http://pythonqt.sourceforge.net/Features.html), but there doesn’t seem to be any major limitation in PythonQt. The beauty of PythonQt is that it is so small and simple that it is easy to develop and maintain, even for a small team. PySide has some advantages, mainly the larger user community, but I don’t think that would justify the huge amount of work required to migrate Slicer and all extensions. It is probably a better investment to spend that time fixing issues or adding more features to Slicer.

The code runs fine on PyQt

PyQt is irrelevant for us due to its restrictive license.

I have already submitted a question on sourceforge.
Model/View/Delegate some problems!

To increase the chance that your questions will be answered on sourceforge, post only one question per topic and provide much more detail, including a complete example that reproduces the problem.

You can work around the Slicer PythonQtStdInRedirect multiprocessing crash by temporarily replacing sys.stdin. For example, the following multiprocessing code works in Slicer:

from multiprocessing import Pool
import sys, os
import math

original_stdin = sys.stdin
sys.stdin = open(os.devnull)
try:
  p = Pool(5)
  print(p.map(math.sqrt, [4, 8, 16]))
finally:
  sys.stdin.close()
  sys.stdin = original_stdin
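The same workaround can be wrapped in a context manager so that sys.stdin is always restored. This is only a minimal sketch: the helper name devnull_stdin is made up for illustration, and it swaps sys.stdin the same way the recipe above does (inside Slicer, the object being replaced would be the PythonQtStdInRedirect instance).

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def devnull_stdin():
    """Temporarily replace sys.stdin with a real file object (hypothetical helper)."""
    original_stdin = sys.stdin
    replacement = open(os.devnull)
    sys.stdin = replacement
    try:
        yield replacement
    finally:
        # Restore first, then close, so sys.stdin never points at a closed file
        sys.stdin = original_stdin
        replacement.close()
```

The Pool creation and the map call from the recipe would then go inside a `with devnull_stdin():` block.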

Thanks @John_DiMatteo for sending the Pool recipe. Did you also get it working for multiprocessing.Process? I’m looking for something that will work robustly on all platforms and I’m thinking I’ll use subprocess with PythonSlicer.

This trick does not work on Windows and never will, although it works perfectly on Mac and Linux.

Thanks for the info @Alex_Vergara. Yes, I did some testing and now I’m thinking I’ll use QProcess since it has a very clean API and is already integrated with the signals/slots event loop. Turns out it’s pretty straightforward to start a PythonSlicer process and send pickled data back and forth through stdin/stdout. Will share some sample code if I get something working well on all platforms.
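The idea of sending pickled data back and forth through stdin/stdout can be sketched outside Slicer as follows. This is an illustration under stated assumptions, not the code from this thread: it uses the standard subprocess module and sys.executable as stand-ins for QProcess and PythonSlicer, and the worker source and the run_worker name are hypothetical.

```python
import pickle
import subprocess
import sys

# Hypothetical worker: read a pickled list from stdin, write pickled squares to stdout
WORKER_SOURCE = """
import pickle, sys
data = pickle.loads(sys.stdin.buffer.read())
result = [x * x for x in data]
sys.stdout.buffer.write(pickle.dumps(result))
"""


def run_worker(payload):
    """Send a pickled payload to a child interpreter and unpickle its reply."""
    completed = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE],
        input=pickle.dumps(payload),   # bytes written to the child's stdin
        stdout=subprocess.PIPE,        # capture the pickled reply
        stderr=subprocess.PIPE,        # keep error text out of the reply stream
        check=True,
    )
    return pickle.loads(completed.stdout)
```

Both sides use the binary buffer of the standard streams, so the bytes are not altered by text-mode newline translation.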

You will end up facing the same problem on Windows:

There is simply no solution available on Windows. The multiprocessing must be done in a separate script guarded by the if __name__ == "__main__": idiom.
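A standalone script with that guard could look like the sketch below. The compute_roots function and its workers parameter are made up for illustration; only Pool and the guard itself come from the discussion.

```python
import math
from multiprocessing import Pool


def compute_roots(values, workers=2):
    """Return the square root of each value, using a process pool when workers > 1."""
    if workers <= 1:
        # Serial fallback: no child processes are started
        return [math.sqrt(v) for v in values]
    with Pool(workers) as pool:
        return pool.map(math.sqrt, values)


if __name__ == "__main__":
    # Only the process that runs this file directly reaches this block.
    # Spawned children re-import the module under a different name and skip it.
    print(compute_roots([4.0, 9.0, 16.0]))
```

Without the guard, each spawned child would execute the module-level pool creation again when it re-imports the script.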

In this last case the script itself must be run as the main module; otherwise the multiprocessing spawn will just create several non-functioning processes. This is a real flaw in the Windows design. The best explanation is this

Agreed, Windows and Unix took very different approaches years ago and it’s caused a lot of complexity.

For my particular use case the QProcess approach is working well on both Windows and Mac. We have some computationally intensive single-threaded Python code that we want to apply to a set of data already loaded in Slicer, so launching PythonSlicer instances and sending them the data will work. I don’t exactly want to fork the whole Slicer process anyway, and may at some point want to run these processes on another machine, e.g. via ssh. I’m still working on complete examples and then I’ll push some code.

Here’s the example using QProcess:

I have successfully translated your code into mine with some caveats:

  • You need to decode the stdout to the correct data type yourself; it is not enough to ask pickle to do it
  • You need to deepcopy your input data
  • You can add a progress bar
import qt
import json
import pickle
import slicer
import copy
import numpy as np

from Logic import logging, utils
from slicer.ScriptedLoadableModule import *

#
# ProcessesLogic
#

class ProcessesLogic(ScriptedLoadableModuleLogic):

    def __init__(self, parent = None, maximumRunningProcesses=None, completedCallback=lambda : None):
        super().__init__(parent)
        self.results = {}
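        if not maximumRunningProcesses:
            # Default to the number of CPU cores reported by the project helper
            cpu_cores, cpu_processors, lsystem = utils.getSystemCoresInfo()
            maximumRunningProcesses = cpu_cores
        # Always set the attribute, including when the caller passed a value
        self.maximumRunningProcesses = maximumRunningProcesses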
        self.completedCallback = completedCallback

        self.QProcessStates = {0: 'NotRunning', 1: 'Starting', 2: 'Running',}

        self.processStates = ["Pending", "Running", "Completed"]
        self.processLists = {}
        for processState in self.processStates:
            self.processLists[processState] = []

        self.canceled = False
        self.ProgressDialog = slicer.util.createProgressDialog(
                parent=None, value=0, maximum=100)
        labelText = "Processing ..."
        self.ProgressDialog.labelText = labelText
        self.ProgressDialog.show()
        slicer.app.processEvents()

    def setMaximumRunningProcesses(self, value):
        self.maximumRunningProcesses = value

    def saveState(self):
        state = {}
        for processState in self.processStates:
            state[processState] = [process.name for process in self.processLists[processState]]
        self.getParameterNode().SetAttribute("state", json.dumps(state))

    def state(self):
        return json.loads(self.getParameterNode().GetAttribute("state"))

    def addProcess(self, process):
        self.processLists["Pending"].append(process)
        self.ProgressDialog.maximum = len(self.processLists["Pending"])

    def run(self):
        while len(self.processLists["Pending"]) > 0:
            if len(self.processLists["Running"]) >= self.maximumRunningProcesses:
                break
            process = self.processLists["Pending"].pop()
            process.run(self)
            self.processLists["Running"].append(process)
            self.saveState()

    def onProcessFinished(self,process):
        self.processLists["Running"].remove(process)
        self.processLists["Completed"].append(process)
        self.ProgressDialog.value += 1
        slicer.app.processEvents()
        self.saveState()
        if len(self.processLists["Running"]) == 0 and len(self.processLists["Pending"]) == 0:
            for process in self.processLists["Completed"]:
                k, v = process.result[0], process.result[1]
                self.results[k] = v
            self.ProgressDialog.close()
            self.completedCallback()
        if self.ProgressDialog.wasCanceled:
            self.canceled = True
        else:
            self.run()

class Process(qt.QProcess):
    """QProcess subclass that runs a PythonSlicer script and exchanges data with it over stdin/stdout"""

    def __init__(self, scriptPath):
        super().__init__()
        self.name = "Process"
        self.processState = "Pending"
        self.scriptPath = scriptPath
        self.debug = False
        self.logger = logging.getLogger("Dosimetry4D.qprocess")

    def run(self, logic):
        self.connect('stateChanged(QProcess::ProcessState)', self.onStateChanged)
        self.connect('started()', self.onStarted)
        finishedSlot = lambda exitCode, exitStatus : self.onFinished(logic, exitCode, exitStatus)
        self.connect('finished(int,QProcess::ExitStatus)', finishedSlot)
        self.start("PythonSlicer", [self.scriptPath,])

    def onStateChanged(self, newState):
        self.logger.info('-'*40)
        self.logger.info(f'qprocess state code is: {self.state()}')
        self.logger.info(f'qprocess error code is: {self.error()}')

    def onStarted(self):
        self.logger.info("writing")
        if self.debug:
            # Dump the pickled input for inspection; binary mode because pickle produces bytes
            with open("/tmp/pickledInput", "wb") as fp:
                fp.write(self.pickledInput())
        self.write(self.pickledInput())
        self.closeWriteChannel()

    def onFinished(self, logic, exitCode, exitStatus):
        self.logger.info(f'finished, code {exitCode}, status {exitStatus}')
        stdout = self.readAllStandardOutput()
        self.usePickledOutput(stdout.data())
        logic.onProcessFinished(self)

class ConvolutionProcess(Process):
    """This is an example of running a process to operate on model data"""

    def __init__(self, scriptPath, initDict, iteration):
        super().__init__(scriptPath)
        self.initDict = copy.deepcopy(initDict)
        self.iteration = iteration
        self.name = f"Iteration {iteration}"

    def pickledInput(self):
        return pickle.dumps(self.initDict)

    def usePickledOutput(self, pickledOutput):
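        # np.frombuffer reinterprets the raw stdout bytes as float64 values, so this
        # assumes the worker writes the array's raw buffer rather than a pickle
        output = np.frombuffer(pickledOutput, dtype=float)
        #output = pickle.loads(pickledOutput)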
        self.result = [self.iteration, output]
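Stripped of Qt, the Pending/Running/Completed bookkeeping in ProcessesLogic can be sketched as below. This is a simplified illustration, not the module's code: FakeTask and Scheduler are hypothetical names, tasks are finished by calling on_finished by hand instead of through a QProcess signal, and pending tasks are started in insertion order, whereas the class above pops from the end of the list.

```python
class FakeTask:
    """Hypothetical stand-in for a QProcess-based task."""

    def __init__(self, name):
        self.name = name
        self.result = None


class Scheduler:
    """Keep at most max_running tasks in the running list at any time."""

    def __init__(self, max_running):
        self.max_running = max_running
        self.pending, self.running, self.completed = [], [], []

    def add(self, task):
        self.pending.append(task)

    def start_available(self):
        # Move tasks from pending to running until the limit is reached
        while self.pending and len(self.running) < self.max_running:
            self.running.append(self.pending.pop(0))

    def on_finished(self, task, result):
        # Called when a task ends: record its result and refill the free slot
        self.running.remove(task)
        task.result = result
        self.completed.append(task)
        self.start_available()

    @property
    def done(self):
        return not self.pending and not self.running
```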

One remaining question: have you tested this on Windows?

It should be possible to define custom pickling methods for VTK and MRML classes. It would be nice if you could look into this. It may be more elegant than implementing some custom serialization solution for each project.
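As a rough illustration of what a custom pickling method could look like, the standard copyreg module lets you register a reducer for a class you do not control. The sketch below uses a hypothetical FakeImage class as a stand-in; it is not a VTK or MRML class, and how the real classes would expose their data is not covered here.

```python
import copyreg
import pickle
import threading


class FakeImage:
    """Hypothetical stand-in for a wrapped native object; not a VTK or MRML class."""

    def __init__(self, dimensions, voxels):
        self.dimensions = tuple(dimensions)
        self.voxels = list(voxels)
        # A lock cannot be pickled, so default pickling of this class fails
        self._lock = threading.Lock()


def _rebuild_fake_image(dimensions, voxels):
    return FakeImage(dimensions, voxels)


def _reduce_fake_image(image):
    # Return a callable plus the arguments needed to recreate the object
    return _rebuild_fake_image, (image.dimensions, image.voxels)


# Register the reducer so pickle.dumps(FakeImage(...)) uses it
copyreg.pickle(FakeImage, _reduce_fake_image)
```

With the reducer registered once, any code that pickles such an object goes through it, so individual projects would not need their own serialization.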

In my case I just output a numpy array from the CPU-intensive calculation. But you are right, standard pickling methods would be great, especially for generic intensive computations.

This algorithm can be made generic for any process that is basically a class accepting a dictionary as input (any kind of data wrapped into a dictionary). See my convolution class:

import numpy as np
import pickle
import sys
import Logic.logging as logging

class Convolution:
    '''Convolution class to be run either sequentially or in parallel.
    For parallel execution a wrapper function must be implemented as:
        def multi_run_wrapper(args):
            convolution = Convolution(*args) 
            return convolution.convolute()
    where args must be a tuple with
        args = (indexes, p0, DVK, distance_kernel, dens_array, act_array, num, boundary)
    '''
    def __init__(self, indexes, p0, DVK, distance_kernel, dens_array, act_array, num, boundary):
        self.logger = logging.getLogger('Dosimetry4D.convolution')
        self.indexes=indexes
        self.p0=p0
        self.DVK=DVK
        self.distance_kernel=distance_kernel
        self.dens_array=dens_array
        self.act_array=act_array
        self.num=num
        self.boundary=boundary

    def convolute(self):
        ''' Non homogeneous convolution in a list of voxel (vectorized)
            Variable density correction by distance
        '''

        res = np.zeros_like(self.indexes, dtype=float)
        # .....    Computation     .....

        return res

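try:
    # Read the pickled keyword arguments sent by the parent process on stdin
    pickledInput = sys.stdin.buffer.read()
    input = pickle.loads(pickledInput)

    convolution = Convolution(**input) # Build the convolution from the unpickled dictionary
    output = convolution.convolute() # Runs in this separate process, outside of Slicer

    sys.stdout.buffer.write(pickle.dumps(output))
except Exception as e:
    # Report errors on stderr so they do not end up in the stdout data stream
    print(e, file=sys.stderr)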

And the calling method is

                    logic = ProcessesLogic()
                    scriptPath = self.script_path / "Logic" / "Convolution.py"

                    chunks = 100 # target number of chunks, to limit per-process overhead
                    iteration = 0
                    args1 = {  # wrap arguments for the worker processes
                            'p0':p0, 
                            'DVK':self.DVK, 
                            'distance_kernel':self.distanceKernel, 
                            'dens_array':self.densityArray, 
                            'act_array':self.activityArray,
                            'num': num, 
                            'boundary':'repeat'
                    } 
                    while acc < length1:
                        stepsize = max(1, min(int(length1/chunks), length1-acc))
                        lindex = copy.deepcopy(args1)
                        lindex['indexes'] = np.array(indexes[acc:acc + stepsize])
                        convolutionProcess = ConvolutionProcess(scriptPath, lindex, iteration)
                        logic.addProcess(convolutionProcess)
                        acc += stepsize
                        iteration += 1
                    
                    results=[]
                    self.logger.info(f"Launching convolution in {self.CpuCores} CPU cores")
                    slicer.app.processEvents()

                    try:
                        logic.run()
                        canceled = logic.canceled
                    except Exception as e:
                        print(e)
                        return False
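The chunking rule in the loop above can be isolated as a small generator. This is a sketch that reuses the same stepsize expression; the function name iter_chunk_bounds is made up for illustration.

```python
def iter_chunk_bounds(length, chunks):
    """Yield (start, stop) index pairs covering range(length), using the stepsize rule above."""
    acc = 0
    while acc < length:
        # At least one element per chunk, never past the end of the data
        stepsize = max(1, min(int(length / chunks), length - acc))
        yield acc, acc + stepsize
        acc += stepsize
```

Because the step is rounded down, the number of chunks produced can be larger than `chunks`: a length of 10 with 3 chunks gives four slices, the last one holding the remainder.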

For reference, when you use the pickle method to decode the output you will usually get the following error:

  File "/Users/alexvergaragil/Documents/GIT/dosimetry4d/Dosimetry4D/Logic/Process.py", line 131, in usePickledOutput
    output = pickle.loads(pickledOutput)
_pickle.UnpicklingError: could not find MARK

This only means that pickle could not parse the bytes it received as a valid pickle stream.
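One possible cause, which is an assumption and not something confirmed in this thread, is that text other than the pickled payload reached stdout, for example a print or a log message written before the result. If that is the case, a delimiter in front of the payload lets the parent skip the stray text. The MARKER value and both function names below are hypothetical.

```python
import pickle

MARKER = b"\n===PICKLE-PAYLOAD===\n"  # hypothetical delimiter, not part of the original code


def frame_payload(obj):
    """Worker side: prefix the pickled object with the marker."""
    return MARKER + pickle.dumps(obj)


def extract_payload(raw_stdout):
    """Parent side: ignore anything written before the marker, then unpickle."""
    _, found, payload = raw_stdout.partition(MARKER)
    if not found:
        raise ValueError("marker not found in worker output")
    return pickle.loads(payload)
```

The worker would write frame_payload(output) to sys.stdout.buffer, and the parent would pass the bytes read from the process to extract_payload.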