Volume rendering slow in latest nightlies for MacOS

Hey guys,

I’m pretty sure I saw something about this come across on the Development list in the last week or so but a quick search didn’t come up with anything…

Volume rendering is significantly slower with the latest 4.9 nightlies. I have one from January 3 that I’ve been using and it’s fine. I’m assuming this probably has to do with the transition to VTK9?

(My main reason for asking is I’d really like to use vtkMRMLPlotSeriesNode and it doesn’t appear to be available in the Jan 3 nightly, but the more recent nightlies are unusable.)

This is on MacOS 10.12.6.

Thanks!

-Hollister

Thanks for the report.

@hherhold Could you provide more details about the settings selected in the volume rendering module, and the details of your GPU?

We really need to pinpoint what is happening here. There are already two issues related to slower rendering:

@lassoan @Sankhesh_Jhaveri It would be great to work together so that we can clearly understand the bottleneck:

  • Video driver issue
  • Problem integrating the vtk GPU volume rendering mapper in Slicer following the transition to VTK9
  • Problem within the vtk GPU volume rendering mapper

This is on a Mid 2015 MacBook Pro with AMD Radeon R9 M370X graphics (2G).

I’m rebuilding the latest master from scratch right now to see if I can help debug.

Basically, I just turn on volume rendering and rotate around. It’s significantly slower, and changing the transfer function is really slow.

I’ll update when my build finishes and give you more specifics.

Thanks!!!

What is the GPU memory size setting? Is the quality setting at maximum or adaptive? Any warnings or errors in the log? What is the volume size and scalar type? Is it slower with Slicer sample data sets? How does the speed compare to CPU-based rendering?

Oops, all excellent questions that I should have answered in my previous post.

Memory size = 2G
Quality = Adaptive

My data set:
Volume dimensions = 385 x 413 x 996
Volume type = 16 bit scalar
No errors on console, but a couple of what look like unrelated warnings:

Switch to module:  "VolumeRendering"
Warning: In /Volumes/Dashboards/Nightly/Slicer-0/Libs/MRML/Core/vtkObserverManager.cxx, line 131
vtkObserverManager (0x7f97c1bae510): The same object is already observed with the same priority. The observation is kept as is.

Warning: In /Volumes/Dashboards/Nightly/Slicer-0/Libs/MRML/Core/vtkObserverManager.cxx, line 131
vtkObserverManager (0x7f97c1bae510): The same object is already observed with the same priority. The observation is kept as is.

ctkDoubleSlider::setSingleStep( 200 ) is outside of valid bounds.
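
As a rough sanity check on the numbers above, the raw size of the data set can be compared with the 2G GPU memory setting. This is only a sketch: it assumes 2 bytes per voxel for the 16-bit scalar type and ignores any extra textures (gradients, transfer functions) the mapper may allocate.

```python
# Raw voxel storage for the reported data set versus the GPU memory setting.
# Assumptions: 2 bytes per voxel (16-bit scalars), "2G" taken as 2 GiB,
# and no allowance for additional textures used by the renderer.
dimensions = (385, 413, 996)   # volume dimensions from the post
bytes_per_voxel = 2            # 16-bit scalar
gpu_memory_bytes = 2 * 1024**3

voxel_count = dimensions[0] * dimensions[1] * dimensions[2]
volume_bytes = voxel_count * bytes_per_voxel
volume_mib = volume_bytes / 1024**2
fraction_of_gpu = volume_bytes / gpu_memory_bytes

print(f"{voxel_count} voxels, {volume_mib:.1f} MiB, "
      f"{fraction_of_gpu:.1%} of the GPU memory setting")
```

Under these assumptions the raw volume takes only a small part of the configured GPU memory, so the volume size alone does not obviously explain the slowdown.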

Slicer sample data set is better (smaller) but it's still a bit slow.

My dataset actually crashed the OS at one point. Sounds like a driver/video card problem. Any idea where I should look in system logs?

CPU rendering worked with sample set (MRBrainTumor1). Will try with my dataset shortly - I expect it will be slow.

Build completed on my machine with no problems. I do get these, however, on a number of libraries, but I expect they’re unrelated to performance issues:

dlopen(/Users/hherhold/Development/slicer/build-qt5/Slicer-build/lib/Slicer-4.9/qt-loadable-modules/vtkSlicerVolumeRenderingModuleMRMLDisplayableManagerPython.so, 2): no suitable image found.  Did find:
/Users/hherhold/Development/slicer/build-qt5/Slicer-build/lib/Slicer-4.9/qt-loadable-modules/./vtkSlicerVolumeRenderingModuleMRMLDisplayableManagerPython.so: malformed mach-o: load commands size (32888) > 32768
/Users/hherhold/Development/slicer/build-qt5/Slicer-build/lib/Slicer-4.9/qt-loadable-modules/vtkSlicerVolumeRenderingModuleMRMLDisplayableManagerPython.so: malformed mach-o: load commands size (32888) > 32768
/Users/hherhold/Development/slicer/build-qt5/Slicer-build/lib/Slicer-4.9/qt-loadable-modules/vtkSlicerVolumeRenderingModuleMRMLDisplayableManagerPython.so: malformed mach-o: load commands size (32888) > 32768
/Users/hherhold/Development/slicer/build-qt5/Slicer-build/lib/Slicer-4.9/qt-loadable-modules/vtkSlicerVolumeRenderingModuleMRMLDisplayableManagerPython.so: malformed mach-o: load commands size (32888) > 32768

CPU rendering is functional but very slow with this dataset.

I’m trying to run Slicer using Instruments in MacOS while changing the opacity mapping transfer function to see what’s eating up time. No conclusions yet - I haven’t used Instruments in a long time so there’s a little learning curve.

I’m not sure if this is a clue or not, but click-dragging in a slice view to change the window level is noticeably slower than a January 3 nightly build - performance issues do not appear to be just volume rendering.

Probably unrelated, but we have noticed this, too. See VTK OpenGL2 Backend: Reslicing speed can be very slow on Windows · Issue #4496 · Slicer/Slicer · GitHub.

I’m not sure if this helps or not, but some observations:

  • OpenGL performance with viewing models is fast - to my eye, faster than my “benchmark” January 3 nightly. In fact, nearly everything seems faster except volume rendering.
  • When turning on volume rendering, the volume appears in a reasonable amount of time, and when it is relatively small in the view, rotating around is tolerably quick.
  • Zooming in slows everything down, but it’s still useable.
  • Changing the opacity mapping is extremely slow to respond.
  • Once the opacity map is changed, rotating the volume is extremely slow, almost to the point of hanging the program.
  • Changing the interactive frame rate slider in volume rendering has no effect.

I get the same results on 16 or 8 bit data.

I’m happy to help debug.

-Hollister

Andras - I tried volume rendering in the latest nightly of ParaView, but I think it does not support GPU volume rendering - does this sound correct?

I noticed that even on Windows, volume rendering is much slower with OpenGL2. Using the same image and default settings, the frame rate in 4.8.1 is 2-5 times faster than in the nightly (apparently depending on the image).

Is it possible that the defaults of the new vtkGPUVolumeRayCastMapper are set to an unnecessarily high setting? For example, the fixed sampling distance of minSpacing/10 may be too small, and enabling LockSampleDistanceToInputSpacing would help?

I tried it and there is a huge improvement in performance using LockSampleDistanceToInputSpacing, while I cannot notice any degradation in quality. I’ll send a PR shortly with other minor changes (such as the option to use jittering to reduce the wood-grain effect).
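
To give a feel for why the sample distance matters, here is a small back-of-the-envelope sketch (plain Python, no VTK required). It compares how many samples a ray needs along the volume diagonal with a fixed distance of minSpacing/10 versus a distance tied to the input spacing. Two things here are assumptions, not facts from this thread: the 1 mm isotropic spacing, and the "half of the average spacing" rule, which is my understanding of what LockSampleDistanceToInputSpacing does for large volumes.

```python
import math

def samples_per_ray(dimensions, spacing, sample_distance):
    """Approximate sample count along the longest ray (the volume diagonal)."""
    extent = [n * s for n, s in zip(dimensions, spacing)]
    diagonal = math.sqrt(sum(e * e for e in extent))
    return math.ceil(diagonal / sample_distance)

dimensions = (385, 413, 996)   # volume dimensions reported earlier in the thread
spacing = (1.0, 1.0, 1.0)      # ASSUMPTION: isotropic 1 mm voxels

# Fixed sampling distance mentioned above: minSpacing / 10.
fixed_distance = min(spacing) / 10.0

# ASSUMPTION: spacing-locked distance approximated as half the average spacing.
locked_distance = sum(spacing) / 3.0 / 2.0

fixed_samples = samples_per_ray(dimensions, spacing, fixed_distance)
locked_samples = samples_per_ray(dimensions, spacing, locked_distance)
ratio = fixed_samples / locked_samples

print(f"fixed: {fixed_samples} samples, locked: {locked_samples} samples, "
      f"ratio: {ratio:.1f}x")
```

Under these assumptions each ray takes several times fewer samples with the spacing-locked distance, which would be consistent with a large speedup and little visible quality loss. The real mapper may adjust the distance further, so treat this only as an illustration.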

This is great news - thank you very much!

I tried a few options, and although the results are promising, they are not conclusive. I issued a pull request, the fate of which will be decided by the core developers. Please stand by.

OK, I took a look at the comments in the pull request. Is there a git command for me to incorporate those changes into my fork to test before they’re merged into master? I’m forked off master, and then I have a local repository of that fork, and I do the usual “git fetch upstream”, “git merge upstream/master”, and “git push” to keep it up to date.

Thanks!

-Hollister

You need to add my Slicer fork as a remote, then you can cherry-pick that commit from the branch to your local repository.

OK, thanks.

I’m not very savvy in git - I did an add remote of your fork, then a fetch, then a merge of your volume-rendering-performance-options branch. I couldn’t find the right incantation for cherry-pick - hopefully this will do the right thing.

I don’t have any changes in my fork, so if I’ve completely screwed it up and have to blow it away and start over, that’s fine.

It’s building now - I will let you know.

Thanks!

I suggest you install hub ( http://hub.github.com/ ). You can then alias git to the hub executable, which will streamline your process.

You would then simply do:

git remote add cpinter
git fetch cpinter
git checkout -b name-of-topic cpinter/name-of-topic

It also simplifies creating pull requests and forking repositories. Reading the associated documentation is worth it.

Will do. Thanks!!

-Hollister