Rpy2 pip installation fails

Following the thread, Pip in nightly build not working, I got pip working as demonstrated.

But this fails:
>>> pipmain(['install', 'Rpy2'])

Collecting Rpy2

Using cached https://files.pythonhosted.org/packages/f1/98/c7652cc9d7fc0afce74d2c30a52b9c9ac391713a63d037e4ab8feb56c530/rpy2-2.9.4.tar.gz

Could not install packages due to an EnvironmentError: [Errno 2] No such file or directory: 'c:\users\murat\appdata\local\temp\pip-req-tracker-vuah0s\a3317999c50976d8efecf0ed8dfd36602d75111dfb34525631802c24'

Any suggestions on how to fix this?

I think this is the same issue as this:

Yes, from briefly looking at the Rpy2 documentation there are some precompiled binaries. If you’re using Windows, you’re not going to be able to use this package in Slicer’s environment at this time. I recommend using pure python packages when using Slicer on Windows.

From the other thread that you linked above, the important information is in JC’s comment.

Second, on linux (and most likely macOS), it is indeed possible to pip install official packages (even those including binaries, like scipy, tensorflow, …).

On windows, it will currently work only for pure python wheels because the official packages for python 2.7 are built with a different compiler than the one used to build Slicer’s Python. Note that this will change as soon as we standardize on Visual Studio 2015 and switch to python >= 3.5.

I believe the work towards python3 is scheduled to follow a 4.10.1 release, so likely early next year.

Note that you only need to use Slicer’s built-in Python interpreter to directly access GUI and scene data. You can run Python scripts in any external environment with any Python packages installed as described here.

@lassoan @jamesobutler

I am trying to get Rpy2 working within Slicer so that we can do some computations in R and visualize the results in Slicer using the extension we are building. So, if this is the use case, should I try to get it working with the internal python or use the external environment?

Thanks.

If you need Rpy2, use the external environment method as suggested by @lassoan.

Can you describe the computation that you are doing in R? There could be another way of doing this without using R.

Thanks @jamesobutler.

Landmark-based shape analysis can be used to investigate many different biological questions (phylogenetics, ecology, evolutionary trajectories, etc.) through many different types of analyses. Most of these are already implemented in R as different packages (and, surprisingly, almost none in python, AFAIK). We just don’t want to reimplement them in Slicer (and also take on the responsibility of maintaining them). Instead, our goal is to have a standard way of visualizing results by pushing and pulling data between Slicer and these R packages (or at least that’s the idea).

I also noticed that there is another package called pyRserve, which makes a remote connection to its sister package Rserve. That model may work better for us than Rpy2.
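For reference, here is a minimal sketch of what the pyRserve route could look like from Python. It assumes an Rserve instance is already running and reachable (localhost and port 6311 are assumptions here, as is having pyRserve installed in the Python environment). The helper name r_mean and the `connect` parameter are hypothetical and exist only so the function can be tried with a stand-in connection when no R server is available.

```python
def r_mean(values, connect=None, host="localhost", port=6311):
    """Ask a running Rserve instance to compute mean(values) and return it.

    `connect` is a hypothetical hook for passing in a stand-in connection
    factory; by default the third-party pyRserve package is used.
    """
    if connect is None:
        import pyRserve  # third-party; assumed to be installed separately
        connect = pyRserve.connect

    # Build a plain R expression such as "mean(c(1.0, 2.0, 3.0))".
    expression = "mean(c(%s))" % ", ".join(repr(float(v)) for v in values)

    conn = connect(host=host, port=port)
    try:
        # The expression is evaluated inside the remote R process.
        return conn.eval(expression)
    finally:
        conn.close()
```

With a real server, `r_mean([1, 2, 3])` would send the expression to R and hand the result back to Python. Any R package the user has installed on the R side can be called the same way through a string expression.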

We do landmark-based shape analysis using VTK and are also experimenting with SlicerSALT; both work quite well.

What analysis do you plan to do? What R packages would you like to use? I’ve done a quick search and found morpho and Rvcg packages, but their features seem to be very limited compared to what VTK can do.

Try geomorph as well. It provides the basic Procrustes superimposition and its typical outputs (consensus shape, Procrustes residuals, Riemann distances, centroid sizes, etc.). From there you can do an eigendecomposition on your resultant coordinates and take your data to do phylogenetic analysis (phylocurve), or developmental trajectory studies, or do a partial least squares to assess overlaps in VCV matrices between different components of your data (aka morphological integration).
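To make a couple of these terms concrete, here is a small pure-Python sketch of centroid size and of the first two steps of a Procrustes superimposition (centering and scaling to unit centroid size). It is only an illustration of the definitions, not geomorph’s implementation. The rotation step and the iteration over multiple specimens are left out, and the function names are made up for this example.

```python
import math

def centroid(landmarks):
    """Mean position of a landmark configuration (list of equal-length points)."""
    n = len(landmarks)
    dims = len(landmarks[0])
    return [sum(p[d] for p in landmarks) / n for d in range(dims)]

def centroid_size(landmarks):
    """Square root of the summed squared distances of landmarks to their centroid."""
    c = centroid(landmarks)
    return math.sqrt(
        sum((p[d] - c[d]) ** 2 for p in landmarks for d in range(len(c)))
    )

def center_and_scale(landmarks):
    """Translate to the origin and scale to unit centroid size (no rotation)."""
    c = centroid(landmarks)
    size = centroid_size(landmarks)
    return [[(p[d] - c[d]) / size for d in range(len(c))] for p in landmarks]

# A unit square as a toy 2D configuration; its centroid size is sqrt(2).
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(centroid_size(square))
print(center_and_scale(square))
```

After `center_and_scale`, every configuration has centroid size 1 and is centered on the origin, so the remaining differences between specimens are due to rotation and shape only.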

These types of analyses are somewhat different (and harder to generalize) than the typical two group analyses (control vs patient) that are so common in biomedical contexts.

But perhaps more importantly (for the users), the statistical foundations of most of those methods are well understood and validated, and published in statistical (or field-specific) journals, which is important for the target audience of our project.

If pyRserve works and is fast enough for your needs, then that sounds like a good option; network-based, de-coupled solutions are generally easier to maintain. If it turns out not to be fast enough, you may be able to build Rpy2 from source as part of your extension. This would assume it only links at runtime (e.g. ctypes) against a separate, configurable R library. However, if it requires R headers and libraries to build, then that is probably prohibitively complicated.

Thanks, yes, indeed geomorph seems to have a few features that are not available in VTK. Also, if your community uses geomorph as the reference implementation then you may have little choice.

I’m surprised that geomorph uses the GPL license. It is too bad, because I cannot afford to spend time on learning (and testing, reporting errors, contributing to) libraries that I may not be allowed to use in the future for some projects. Is this a trend in the R community?

I can’t generalize, but I think there are a lot of packages out there with GPL. You need to understand that the people behind these packages are not teams, but academics who are spending a lot of time on a very specific topic.

Though, I am not sure I understand your reluctance. GPL only has implications if you are going to redistribute the software. If you’re only using, testing, and reporting errors, then why does it matter? I am assuming you are using Linux in your lab in one way or another, so your use case for geomorph is not much different from that.

I’ve had the impression that GPL is preferred in the R community, and apparently it dominates by a wide margin:

http://adolfoalvarez.cl/the-free-open-and-proprietary-flavors-of-r/

If I invest my time in learning and contributing to GPL-licensed software then my time may be wasted, because for some projects I or my collaborators cannot use software with a restrictive license.

@muratmaga to echo what Andras is saying, the problem with GPL code is that we do redistribute software in the form of Slicer and Slicer Extensions. The terms of the GPL are such that if we include even one line of GPL licensed code in our distribution, then the copyright holder of that one line could insist that all other code in the same distribution needs to be GPL. This would prevent people from using Slicer as the basis for non-public extensions (e.g. if making a medical device based on Slicer or even releasing binaries of pre-publication work in progress).

That’s why Andras and I and others (including the ITK community) are careful not to rely on GPL code. In actual practice it would become very restrictive and counter-productive. There’s a large literature on this so I won’t repeat it here, but the Linux kernel has specific clarifications about how it can be used and redistributed, and that’s one reason it’s so popular in products.

Anyone writing software that incorporates GPL code should carefully read up on the terms and understand what obligations come with it. The code may be so good that accepting those obligations is the best solution, but other times it’s best to find workarounds that are compatible with the licensing.

For us, the workaround of using R and other GPL’d code as an executable or network server sounds like the cleanest way to address this for now. It will allow us to make progress without raising license issues that may be hard to resolve.
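As a rough illustration of the “R as an executable” variant, the sketch below writes landmarks to a CSV file and launches a user-installed R through Rscript as a separate process. It assumes Python 3, an Rscript binary on the PATH, and a hypothetical user-supplied script (called analysis.R here) that takes an input and an output CSV path as arguments. None of these names come from Slicer or from any R package.

```python
import csv
import shutil
import subprocess

def write_landmarks_csv(landmarks, path):
    """Write one landmark per row (x, y, z) so that R can load it with read.csv."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["x", "y", "z"])
        writer.writerows(landmarks)

def build_rscript_command(script_path, input_csv, output_csv, rscript="Rscript"):
    """Command line for running a (hypothetical) R script on the exported data."""
    return [rscript, "--vanilla", script_path, input_csv, output_csv]

def run_r_analysis(script_path, input_csv, output_csv, rscript="Rscript"):
    """Run the R script in its own process; R itself is installed by the user."""
    if shutil.which(rscript) is None:
        raise RuntimeError("Rscript was not found on PATH; install R separately")
    command = build_rscript_command(script_path, input_csv, output_csv, rscript)
    # check=True raises CalledProcessError if the R script exits with an error.
    subprocess.run(command, check=True)
```

A typical call would be `write_landmarks_csv(points, "in.csv")` followed by `run_r_analysis("analysis.R", "in.csv", "out.csv")`, after which the results file can be read back and visualized. Only data files cross the boundary between the two programs, and R is never bundled with the Python side.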

Once we have the python 3 issues sorted out then users will be able to install R and R packages independently, and that will sidestep the distribution issues since it won’t be us distributing the software but the user installing it. If Slicer only provides interfaces that are compatible with R then we don’t have the license issues.

I wish this weren’t so complicated but it’s the situation we have. Fortunately the processes we have in place have worked well. As long as we are careful there are no technical or legal hurdles that will get in our way.

Thanks @pieper. Yes, I think the pyRserve/Rserve combination is the best solution, because users will indeed use packages that they specifically installed in their own R environment (wherever that might be). Glad to hear that this will not cause a license issue.
