Using R remotely: some options and tips

2017-10-26



Why would you need to do this? Say, for instance, you are dealing with sensitive data that should not leave a specific system. Or, quite simply, you are away on a work retreat and your laptop is far less powerful than the work desktop you left behind, so you want to keep using the desktop from a distance. For such reasons, I’ve been looking into the options available for logging in remotely to a machine and running R there for some analysis. Below are some of the alternatives I’ve come across:

Remote desktop applications
“Plain” ssh
ssh with X11 forwarding
R used with Vim and tmux
Executing R code remotely from a local interface

Remote desktop applications (RDAs)

Remote desktop applications (such as VNC or Remmina) allow you to visualise another machine’s desktop environment from afar. For example, if one day you are working from home, you can use an RDA from your home machine to log into, and use, your work machine. This would let you interact with its environment as though you were in front of the work machine’s screen. I usually save this option for when I need access to a range of things on a remote system and could use a graphical interface, but since 99% of the time I only need access to R on a remote machine (and nothing else), I’ve personally drifted away from RDAs.

ssh

This one refers to using a secure shell to log into the remote machine. However, all you get here by default is just that… the shell. You don’t see anything other than a command line - so this can be a great option for those who are comfortable with that. Otherwise it might be easier to resort to an RDA.

I think ssh-ing is a lot more streamlined than dealing with a whole windowed desktop environment (as you would with the previous method). That said, there are downsides too - particularly if, like me, you’re used to working in an IDE like RStudio. The “basic” things that I appreciate about RStudio will be absent in a shell - such as having a pane for R scripts (to save and reuse) which is separate from the R console, or having a built-in graphics pane.

Regardless, in its simplest form, ssh-ing into a system requires this command in a Linux terminal:

$ ssh yourUserName@remoteAddress
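If all you need is to run an existing script rather than work interactively, ssh can also pass a local file straight to R on the remote machine. Below is a minimal sketch of that idea. The function name run_remote_script and the file name analysis.R are made up for illustration, and the sketch assumes R is installed and on the PATH of the remote machine.

```shell
# run_remote_script REMOTE SCRIPT
# Feeds a local R script to an R process on the remote machine over ssh.
# (Hypothetical helper; REMOTE is e.g. yourUserName@remoteAddress.)
run_remote_script() {
    # Refuse to continue if the local script does not exist.
    if [ ! -f "$2" ]; then
        echo "No such script: $2" >&2
        return 1
    fi
    # ssh forwards our standard input to the remote command, so R reads
    # the script from stdin; --vanilla skips saved workspaces and profiles.
    ssh "$1" R --vanilla --quiet < "$2"
}

# Usage (assumed file name):
# run_remote_script yourUserName@remoteAddress analysis.R
```

The output of the script is printed in your local terminal, while the data and the computation stay on the remote machine.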

ssh with X11 forwarding

Not to worry though. There is one trick to get around ssh being too stripped down. I’ll stick with the RStudio example (since that’s easiest for me): assuming you have RStudio installed on the remote system, you can start it up on the remote and have its window forwarded onto your own local machine. This would simply show up as any other program you’ve got open locally. To achieve this, you’d need to supply one extra option in your ssh call (read here for more details):

$ ssh -X yourUserName@remoteAddress
$ rstudio

Pretty neat, right? This way you can avoid running something like VNC when all you actually need is an instance of a single IDE. But there is a catch: depending on various parameters (e.g., your network, the type of cipher used to establish the secure connection), the forwarded window may be extremely slow to respond. In my case, it took forever to even scroll through a script in RStudio, so this could not be a long-term solution. Thankfully though, there are additional options we can add to the call to try and fix this:

$ ssh -XC yourUserName@remoteAddress

As before, the code above enables X11 forwarding, but at the same time ensures that the visual feedback is compressed, and therefore travels faster between the remote and the local machine. It was surprising to me how efficiently this reduced the lag for my session: in terms of user experience, RStudio was now behaving just as though it had been launched locally.

If this is not enough, you can also try to switch to a different cipher - which is still secure (though opinions vary somewhat here), but faster than the default AES (according to this source). To do this, you’d need to type:

$ ssh -XC -c blowfish-cbc yourUserName@remoteAddress
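Which ciphers you can choose from depends on the OpenSSH version installed, and newer releases have dropped some older ciphers, Blowfish among them. Before relying on the call above, it may be worth asking your local client what it supports. Here is a small sketch: cipher_supported is a name I made up, while ssh -Q cipher is the OpenSSH query that lists the ciphers the client knows. Note that it says nothing about what the remote server accepts.

```shell
# cipher_supported NAME
# Succeeds if the local ssh client lists NAME among its ciphers.
# (Hypothetical helper built on the OpenSSH query "ssh -Q cipher".)
cipher_supported() {
    # -x matches whole lines only, -F treats NAME as a literal string.
    ssh -Q cipher | grep -qxF -- "$1"
}

# Usage:
# if cipher_supported blowfish-cbc; then
#     ssh -XC -c blowfish-cbc yourUserName@remoteAddress
# else
#     echo "blowfish-cbc is not available; run 'ssh -Q cipher' for alternatives"
# fi
```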

Teaming up R, Vim and tmux

But what if you don’t want X11 forwarding enabled, for instance if the remote machine is not one you trust unreservedly (see here for risks)? Or perhaps if the X11 option above is still too laggy despite compressing the image stream and changing the cipher? Well, in that case… I might have another solution for you: running R via Vim (with the Nvim-R plugin) within your ssh shell - and all this with a helping hand from tmux too.

What does this mean? After ssh-ing in, running tmux lets you split your Terminal window into separate panes - one of which I like to keep as an actual Terminal (just in case I end up needing to perform some file operations etc.). The other one can be used to start up Vim (a powerful text / script editor) and display your R script in it. Typing \rf in Vim then splits the view further to make room for a new pane - one in which an R console has started. So you’d end up with three panes this way, all in the same window: a Terminal, a script editor, and an R console. This may not be the same as running RStudio, but it comes pretty close. I’ve only just started exploring this option, but am really enjoying it so far!
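The layout described above can also be set up in one go once you are logged in on the remote machine. This is only a sketch of how I would script it: start_r_workspace, the session name rwork and the file name analysis.R are placeholders of my own, and it assumes tmux and Vim (with Nvim-R) are already installed there.

```shell
# start_r_workspace SESSION SCRIPT
# Builds a two-pane tmux window: a plain terminal plus Vim with an R script.
# (Hypothetical helper; run it on the remote machine after ssh-ing in.)
start_r_workspace() {
    # Create a detached session; its first pane stays a plain terminal.
    tmux new-session -d -s "$1" || return 1
    # Split the window side by side; the new pane becomes the active one.
    tmux split-window -h -t "$1"
    # Open the R script in Vim in the new pane (C-m sends Enter).
    tmux send-keys -t "$1" "vim $2" C-m
    # Attach so the layout shows up in the current terminal.
    tmux attach-session -t "$1"
}

# Usage (assumed names):
# start_r_workspace rwork analysis.R
```

From there, typing \rf in Vim opens the third pane with R, as described above. A handy side effect of tmux is that the session keeps running on the remote machine if your ssh connection drops, so you can log back in and reattach to it.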

Viewing RStudio locally, but sending code remotely for execution

This is an option I am aware of, but which I have not yet used myself. I’ll just mention it here in case it is useful for you. The poster child for this would probably be RStudio Server, which is installed on the remote machine / server and accessed through your browser: the interface shows up locally, but your code actually gets executed remotely. It comes in open source and commercial versions.
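If RStudio Server is running on the remote machine but its port cannot be reached directly, one common approach is to tunnel it through ssh. I have not tested this setup myself, so treat the following as a sketch: open_rstudio_tunnel is a made-up name, and it assumes the server listens on its default port, 8787.

```shell
# open_rstudio_tunnel REMOTE [LOCAL_PORT]
# Forwards a local port to RStudio Server's port on the remote machine.
# (Hypothetical helper; assumes the server uses its default port 8787.)
open_rstudio_tunnel() {
    # Use the same port number locally unless another one is given.
    port="${2:-8787}"
    # -N: run no remote command, only forward the port.
    # -L: connect local $port to localhost:8787 as seen from the remote.
    ssh -N -L "${port}:localhost:8787" "$1"
}

# Usage: keep this running in one terminal, then point your browser
# at http://localhost:8787
# open_rstudio_tunnel yourUserName@remoteAddress
```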

A possible alternative is the remoter package, which allows you to control a remote R session from a local one. The local R session can run in a terminal, GUI, or IDE such as RStudio.

Disclaimer: This material is not meant as a user guide; rather, it is a summary of my own attempts at learning how to run R sessions remotely. Readers are advised to do their own research and decide for themselves which option(s) provide(s) the best compromise in terms of both security and performance.

This content was first published on The Data Team @ The Data Lab blog.