Converting view coordinates to image coordinates

I am writing a custom scripted loadable module where one of the features lets the user click on a button, then click & drag to select a region of interest for further analysis. I have written the mouse event observers like so:

import vtk
import slicer
from slicer.ScriptedLoadableModule import *
from slicer.util import VTKObservationMixin


class MyModuleWidget(ScriptedLoadableModuleWidget, VTKObservationMixin):
    """
    Uses ScriptedLoadableModuleWidget base class, available at:
    https://github.com/Slicer/Slicer/blob/master/Base/Python/slicer/ScriptedLoadableModule.py
    """

    def __init__(self, parent=None):
        ...

    def setup(self):
        ...
        self.ui.myButton.connect("clicked(bool)", self.onButtonClick)
        ...

    def onButtonClick(self):
        # Start with no selection so the release handler can test for one safely
        self.startPos = None
        self.endPos = None

        # Observe left-button press/release on the Red slice view
        sliceWidget = slicer.app.layoutManager().sliceWidget("Red")
        renderWindowInteractor = sliceWidget.sliceView().renderWindow().GetInteractor()
        self.leftButtonPressObserverID = renderWindowInteractor.AddObserver(vtk.vtkCommand.LeftButtonPressEvent, self.onLeftButtonPress)
        self.leftButtonReleaseObserverID = renderWindowInteractor.AddObserver(vtk.vtkCommand.LeftButtonReleaseEvent, self.onLeftButtonRelease)

    def onLeftButtonPress(self, obj, event):
        interactor = obj
        eventPosition = interactor.GetEventPosition()
        if event == "LeftButtonPressEvent":
            self.startPos = eventPosition

    def onLeftButtonRelease(self, obj, event):
        interactor = obj
        eventPosition = interactor.GetEventPosition()

        if event == "LeftButtonReleaseEvent":
            self.endPos = eventPosition

            # Perform analysis on the selected region
            if self.startPos and self.endPos:
                # TODO: convert to image coordinates and perform analysis
                pass

            # Reset the start and end positions
            self.startPos = None
            self.endPos = None

            # Remove the observers
            sliceWidget = slicer.app.layoutManager().sliceWidget("Red")
            renderWindowInteractor = sliceWidget.sliceView().renderWindow().GetInteractor()
            renderWindowInteractor.RemoveObserver(self.leftButtonPressObserverID)
            renderWindowInteractor.RemoveObserver(self.leftButtonReleaseObserverID)

The start and end positions are captured correctly. However, they are view coordinates, not image coordinates. For example, selecting the top corner of the image should give:

startPos = (0, 0)
endPos = (30, 30)

But if I select this region, the coordinates might instead be set to:

startPos = (30, 30)
endPos = (60, 60)

How can I convert these view coordinates to image coordinates?

You can look into the implementation of the Segment Editor effects that take mouse clicks as input. However, I would recommend using a markups plane or ROI for selecting a region.
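For reference, the conversion those effects perform is, as far as I understand it, a chain of two 4x4 homogeneous transforms: view XY to RAS, then RAS to voxel IJK. Below is a minimal sketch of that chain in plain Python, so it can be tried outside Slicer. The helper names (`apply_matrix`, `xy_to_ijk`, `ijk_bounds`) and the toy matrices are made up for illustration. The Slicer calls named in the comments are where I would expect the real matrices to come from, so please check them against your Slicer version. The sketch also assumes the volume has no parent transform.

```python
def apply_matrix(matrix, point):
    """Multiply a 4x4 matrix (nested lists, row-major) by a homogeneous 4-vector."""
    return [sum(matrix[r][c] * point[c] for c in range(4)) for r in range(4)]


def xy_to_ijk(xy, xy_to_ras, ras_to_ijk):
    """Map a slice-view (x, y) position to integer voxel (i, j, k) indices."""
    # View XY -> RAS (z = 0 selects the displayed slice plane)
    ras = apply_matrix(xy_to_ras, [xy[0], xy[1], 0.0, 1.0])
    # RAS -> continuous IJK, then round to the nearest voxel
    ijk = apply_matrix(ras_to_ijk, ras)
    return tuple(int(round(v)) for v in ijk[:3])


def ijk_bounds(start_ijk, end_ijk):
    """Return (min_corner, max_corner) regardless of the drag direction."""
    lo = tuple(min(a, b) for a, b in zip(start_ijk, end_ijk))
    hi = tuple(max(a, b) for a, b in zip(start_ijk, end_ijk))
    return lo, hi


# Inside Slicer, I would expect the matrices to be obtained roughly like this
# (assumption, not verified here):
#   sliceNode = sliceWidget.mrmlSliceNode()
#   xyToRas = sliceNode.GetXYToRAS()                       # vtkMatrix4x4
#   volumeNode = sliceWidget.sliceLogic().GetBackgroundLayer().GetVolumeNode()
#   rasToIjk = vtk.vtkMatrix4x4()
#   volumeNode.GetRASToIJKMatrix(rasToIjk)
#   nested = [[m.GetElement(r, c) for c in range(4)] for r in range(4)]

# Toy matrices (hypothetical): the image is drawn 30 pixels in from the view
# origin at 1 pixel per voxel, and RAS coincides with IJK.
identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
xy_to_ras = [
    [1.0, 0.0, 0.0, -30.0],
    [0.0, 1.0, 0.0, -30.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

start_ijk = xy_to_ijk((30, 30), xy_to_ras, identity)
end_ijk = xy_to_ijk((60, 60), xy_to_ras, identity)
print(start_ijk, end_ijk)
```

With these toy matrices the view positions (30, 30) and (60, 60) from the question map to voxel corners (0, 0) and (30, 30) on slice 0. `ijk_bounds` is there because the user may drag from bottom-right to top-left, in which case the raw start and end corners need sorting before indexing into the image array.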

I see. Is there a way to implement it so that the user places the ROI after clicking the button, and that ROI is then used for further analysis? Something like this:

[Image: Slicer question]