How to capture a screenshot and turn it into a NumPy array with Python code?

Slicer version 4.11
I can capture a screenshot here:
image
or I can capture a screenshot in Script Repository here

However, both methods save the image as a .png file. If I want to see the pixel values, I have to load that file.
Is there a way to capture a screenshot in Python and get it as a numpy array without saving it to disk?

Thank you in advance for your help and advice!

To get a slice of a volume as a numpy array you can use some of the examples such as

https://slicer.readthedocs.io/en/latest/developer_guide/script_repository.html#get-axial-slice-as-numpy-array

Thank you james, but this is not what I need.
I have to use the screenshot image because it is the result after ray casting.

# Capture RGBA image
renderWindow = view.renderWindow()
renderWindow.SetAlphaBitPlanes(1)
wti = vtk.vtkWindowToImageFilter()
wti.SetInputBufferTypeToRGBA()
wti.SetInput(renderWindow)
writer = vtk.vtkPNGWriter()
writer.SetFileName("c:/tmp/screenshot.png")
writer.SetInputConnection(wti.GetOutputPort())
writer.Write()

I think there must be an interface to get the image from the buffer, but I am not sure how to get the data.

This example, vtkImageDataToPNG, may help. It’ll give you an array of the PNG-compressed image, but you can get something similar if you just use the vtkWindowToImageFilter output directly.


Thanks Steve! You really gave me some insight! I think I am close to the solution but maybe need a little more help!
I saw the example and did it in three steps:

  1. Get the render window
view = slicer.app.layoutManager().threeDWidget(0).threeDView()
renderWindow = view.renderWindow()
renderWindow.SetAlphaBitPlanes(1)
  2. Use the filter to get vtk data
wti = vtk.vtkWindowToImageFilter()
wti.SetInputBufferTypeToRGBA()
wti.SetInput(renderWindow)
# wti.ReadFrontBufferOff()
wti.Update()
vtk_data = wti.GetOutput()
  3. Use the vtkImageDataToPNG function to get pngArray
def vtkImageDataToPNG(imageData):
    """
    Return pngArray using the data from the vtkImageData.
    """
    writer = vtk.vtkPNGWriter()
    writer.SetWriteToMemory(True)
    writer.SetInputData(imageData)
    # use compression 0 since data transfer is faster than compressing
    writer.SetCompressionLevel(0)
    writer.Write()
    result = writer.GetResult()
    pngArray = vtk.util.numpy_support.vtk_to_numpy(result)

    return pngArray


pngArray = vtkImageDataToPNG(vtk_data)

After the above three steps, I checked the type and shape of pngArray as follows:

type(pngArray)
--> <class 'numpy.ndarray'>

pngArray.shape
--> (3721789,)

I have two questions about the above results:

  1. It seems the array has been flattened, so my next question is how to reshape the array to the render window size; in other words, how do I get the render window size (height, width)?

  2. I compared the dimensions of pngArray with the screenshot produced by the GUI:
    image
    In theory, the PNG image should have the same dimensions as pngArray, so I read the image and checked its dimensions:

import matplotlib.image as mpimg
img = mpimg.imread('test.png')
img.shape
--> (925, 1004, 4)

925 * 1004 * 4
--> 3714800

It is strange that the two images are not the same size; in other words, the number of flattened values differs (3714800 vs. 3721789). Did something go wrong while producing the pngArray?

Thank you for your continued attention and help again!


Hi @user4 -

Yes, you are very close - the example I pointed to was for getting the compressed png data as an array (e.g. the data in a .png file). Instead you need to get the array from the vtkImageData that comes from the vtkWindowToImageFilter.
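
A quick way to tell the two kinds of array apart: an in-memory PNG is an encoded file, so its bytes start with the fixed eight-byte PNG signature, and its length includes chunk headers and checksums on top of the pixel data. That is why it need not equal height × width × 4. Below is a minimal sketch using only numpy; `PNG_SIGNATURE` and `looks_like_png` are hypothetical names made up for illustration, not part of VTK or Slicer.

```python
import numpy as np

# The eight bytes that every PNG file starts with (fixed by the PNG format).
PNG_SIGNATURE = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])


def looks_like_png(byte_array):
    """Return True if a uint8 array begins with the PNG file signature.

    Hypothetical helper for illustration only.
    """
    data = np.asarray(byte_array).ravel()
    if data.dtype != np.uint8:
        # Encoded file bytes are always 8-bit; anything else is not a PNG stream.
        return False
    return data[:8].tobytes() == PNG_SIGNATURE


# Stand-in for encoded file bytes: the signature followed by arbitrary payload.
encoded = np.frombuffer(PNG_SIGNATURE + b"payload", dtype=np.uint8)
# Stand-in for raw pixel data: plain values with no file header.
pixels = np.zeros(16, dtype=np.uint8)

encoded_is_png = looks_like_png(encoded)
pixels_is_png = looks_like_png(pixels)
```

In the Slicer console, calling `looks_like_png(pngArray)` on the array from vtkImageDataToPNG would be expected to return True, which is a hint that it cannot simply be reshaped into an image.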

Here’s the snippet:

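import vtk.util.numpy_support  # make sure the numpy helpers are loaded

view = slicer.app.layoutManager().threeDWidget(0).threeDView()
renderWindow = view.renderWindow()
renderWindow.SetAlphaBitPlanes(1)
# Grab the rendered RGBA buffer as vtkImageData
wti = vtk.vtkWindowToImageFilter()
wti.SetInputBufferTypeToRGBA()
wti.SetInput(renderWindow)
wti.Update()
vtk_data = wti.GetOutput()
# Convert the pixel scalars to numpy and restore the 2D layout
image_array = vtk.util.numpy_support.vtk_to_numpy(vtk_data.GetPointData().GetScalars())
image_array = image_array.reshape((view.height, view.width, 4))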

You may need to swap width and height - I didn’t check that for sure.
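
To take the guesswork out of the width/height order, the shape can be read from the image data itself. vtkImageData.GetDimensions() reports (width, height, depth), and the scalars are stored with x varying fastest, so the numpy shape should be (height, width, components). Here is a small sketch with stand-in data so it runs without Slicer; `scalars_to_image` is a hypothetical helper name, not a VTK or Slicer function.

```python
import numpy as np


def scalars_to_image(scalars, dimensions):
    """Reshape flat VTK-style scalars to (height, width, components).

    Hypothetical helper for illustration only.

    scalars: array of shape (width * height, components), which is what
        vtk_to_numpy returns for multi-component point scalars.
    dimensions: (width, height, depth), in the order reported by
        vtkImageData.GetDimensions().
    """
    width, height, depth = dimensions
    if depth != 1:
        raise ValueError("expected a single 2D image")
    scalars = np.asarray(scalars)
    components = scalars.shape[-1] if scalars.ndim > 1 else 1
    # x varies fastest in VTK memory, so rows (y) come first in the numpy shape.
    return scalars.reshape(height, width, components)


# Stand-in for a 3-pixel-wide, 2-pixel-high RGBA capture.
flat = np.arange(3 * 2 * 4, dtype=np.uint8).reshape(-1, 4)
image = scalars_to_image(flat, (3, 2, 1))
```

In Slicer this could be called with the converted scalars and `vtk_data.GetDimensions()`, assuming `vtk_data` is the output of the vtkWindowToImageFilter.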


Thanks Steve! You are quite right. With your code above I got the image array:

image_array.shape
--> (925, 1136, 4)

I also took a screenshot with the GUI and saved the PNG image file (marked in red):
image

from skimage import io
img = io.imread('test.png')
img.shape

--> (925, 1136, 4)

Yes, it seems okay because the dimensions of both images are the same.
However, when I check the array values, they are not the same. Am I missing something, or did I forget to set some parameters?

(img == image_array).all()
--> False

P.S.
This image is produced from the image array with your help and code:
image

This image is produced by the screenshot capture GUI:
image

Apparently, the two images are different from each other. Do you think I am missing some parameters?

Ah yes, I forgot that detail. It would be easier to tell if you load some data, but you can see it in the gradient too: the images are mirrored vertically from each other. This is because VTK image rows go from bottom to top (like xy coordinates of a graph), while most image formats stack the rows from top to bottom (like scan lines of an old TV). So you need to rearrange the rows in memory to make them match.
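
Rearranging the rows can be done with a single numpy call. This is a minimal sketch with stand-in arrays rather than a real capture; `flip_rows` is a hypothetical helper name, and the two small arrays only imitate the PNG read from disk and the array taken from the render window.

```python
import numpy as np


def flip_rows(image_array):
    """Return a copy of the image with its row order reversed.

    Hypothetical helper for illustration only.
    """
    # np.flipud reverses axis 0 (the rows) and leaves columns and channels alone.
    return np.flipud(image_array).copy()


# Stand-in for the PNG read from disk: rows stored top to bottom.
top_down = np.arange(2 * 3 * 4, dtype=np.uint8).reshape(2, 3, 4)
# Stand-in for the captured array: the same rows stored bottom to top.
bottom_up = top_down[::-1]

# The arrays differ element-wise until the row order is reversed.
matches_before = np.array_equal(top_down, bottom_up)
matches_after = np.array_equal(top_down, flip_rows(bottom_up))
```

Here `matches_before` is False and `matches_after` is True. With a real capture the same check would be `np.array_equal(img, flip_rows(image_array))`, assuming both arrays come from the same view at the same size.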


Thanks a lot Steve! :+1: I solved the problem by just reversing the image array.
I could not have done it without your help, and I am really grateful again!