Rscript not working with packaged R for AWS Lambda

I'm trying to run an R script on the command line of an AWS EC2 instance using packaged R binaries and libraries (without installation) -- the point is to test the script for deployment to AWS Lambda. I followed these instructions, which describe packaging all the R binaries and libraries in a zip file and moving everything to an Amazon EC2 instance for testing. I unzipped everything on the new machine, ran 'sudo yum update', and set R's environment variables to point to the proper location:

export R_HOME=$HOME
export LD_LIBRARY_PATH=$HOME/lib

NOTE: $HOME is equal to /home/ec2-user.

I created this hello_world.R file to test:

#!/home/ec2-user/bin/Rscript
print ("Hello World!")

But when I ran this:

ec2-user$ Rscript hello_world.R

I got the following error:

Rscript execution error: No such file or directory

So I checked the path, but everything checks out:

ec2-user$ whereis Rscript
  Rscript: /home/ec2-user/bin/Rscript

ec2-user$ whereis R
  R: /home/ec2-user/bin/R /home/ec2-user/R

But when I tried to evaluate an expression using Rscript at the command line, I got this:

ec2-user$ Rscript -e "" --verbose
  running
    '/usr/lib64/R/bin/R --slave --no-restore -e '

  Rscript execution error: No such file or directory

It seems Rscript is still looking for R in the default location '/usr/lib64/R/bin/R' even though my R_HOME variable is set to '/home/ec2-user':

ec2-user$ echo $R_HOME
  /home/ec2-user
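
One way to confirm the mismatch is to test both locations for an executable bin/R. This is only a minimal sketch: check_r_home is a hypothetical helper written for this post, not something that ships with R, and it assumes the launcher lives at bin/R under the directory being tested.

```shell
# check_r_home is a hypothetical helper (not part of R).
# It reports whether an executable R launcher exists under a candidate directory.
check_r_home() {
  candidate="$1"
  if [ -x "$candidate/bin/R" ]; then
    echo "ok: $candidate/bin/R"
  else
    echo "missing: $candidate/bin/R"
    return 1
  fi
}

# The path Rscript reported with --verbose, then the unpacked location.
check_r_home /usr/lib64/R || true
check_r_home "${R_HOME:-$HOME}" || true
```

If the first check reports "missing" and the second reports "ok", Rscript is trying to start a launcher that does not exist on this machine.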

I've found sprinkles of support, but I can't find anything that addresses my specific issue. Some people have suggested reinstalling R, but my understanding is, for the purposes of Lambda, everything needs to be self-contained so I installed R on a separate EC2 instance, then packaged it up. I should mention that everything runs fine on the machine where R was installed with the package manager.

SOLUTION: I posted my solution in the answers.

Answers


I think it is staring at you right there:

ec2-user$ whereis R
  R: /home/ec2-user/bin/R /home/ec2-user/R

is where you put R. However, it was built for, and expects, this:

ec2-user$ Rscript -e "" --verbose
  running
    '/usr/lib64/R/bin/R --slave --no-restore -e '

These paths are not the same. The real error may be your assumption that you could just relocate the built and configured R installation to a different directory. You can't.

You could build R for the new (known) path and install that. On a system where the configured-for and installed-at path are the same, all is good:

$ Rscript -e "q()" --verbose
running
  '/usr/lib/R/bin/R --slave --no-restore -e q()'

$ 

This blog post walks through a similar problem and offers a potential solution. I also had to implement part of the solution from this post.

I changed the first assignment in R's shell wrapper script from this:

#!/bin/sh
# Shell wrapper for R executable.

R_HOME_DIR=${R_ROOT_DIR}/lib64${R_ROOT_DIR}

To this:

R_HOME_DIR=${RHOME}/lib64${R_ROOT_DIR}

I'll explain why below.

NOTE -- The rest of the code is:

if test "${R_HOME_DIR}" = "${R_ROOT_DIR}/lib64${R_ROOT_DIR}"; then
   case "linux-gnu" in
   linux*)
     run_arch=`uname -m`
     case "$run_arch" in
        x86_64|mips64|ppc64|powerpc64|sparc64|s390x)
          libnn=lib64
          libnn_fallback=lib
        ;;
        *)
          libnn=lib
          libnn_fallback=lib64
        ;;
     esac
     if [ -x "${R_ROOT_DIR}/${libnn}${R_ROOT_DIR}/bin/exec${R_ROOT_DIR}" ]; then
        R_HOME_DIR="${R_ROOT_DIR}/${libnn}${R_ROOT_DIR}"
     elif [ -x "${R_ROOT_DIR}/${libnn_fallback}${R_ROOT_DIR}/bin/exec${R_ROOT_DIR}" ]; then
        R_HOME_DIR="${R_ROOT_DIR}/${libnn_fallback}${R_ROOT_DIR}"
     ## else -- leave alone (might be a sub-arch)
     fi
     ;;
  esac
fi

if test -n "${R_HOME}" && \
   test "${R_HOME}" != "${R_HOME_DIR}"; then
  echo "WARNING: ignoring environment value of R_HOME"
fi
R_HOME="${R_HOME_DIR}"
export R_HOME

At the bottom, the script sets R_HOME equal to R_HOME_DIR, which it originally assigned based on R_ROOT_DIR, and it prints a warning if the R_HOME from the environment differs.

So no matter what you set the R_HOME_DIR or R_HOME variable to, the wrapper resets everything using the R_ROOT_DIR variable.
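
To make that override concrete, here is a small sketch that copies only the last few lines of the wrapper into a function. resolve_r_home is a hypothetical name used for illustration; the first argument stands in for the value the wrapper computed, and the second for whatever R_HOME was exported in the shell.

```shell
# resolve_r_home is a hypothetical helper that mimics the tail of the wrapper.
# $1 = R_HOME_DIR as computed inside the wrapper
# $2 = R_HOME as exported in the caller's environment
# The body runs in a subshell so it does not touch the real R_HOME.
resolve_r_home() (
  R_HOME_DIR="$1"
  R_HOME="$2"
  if test -n "${R_HOME}" && \
     test "${R_HOME}" != "${R_HOME_DIR}"; then
    echo "WARNING: ignoring environment value of R_HOME" >&2
  fi
  R_HOME="${R_HOME_DIR}"
  echo "${R_HOME}"
)

# Prints /usr/lib64/R (the warning goes to stderr).
resolve_r_home /usr/lib64/R /home/ec2-user
```

The exported value never wins: it is only compared against R_HOME_DIR and then replaced.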

With the change, I can set all my environment variables:

export RHOME=$PWD/R  #/home/ec2-user/R
export R_HOME=$PWD/R #/home/ec2-user/R
export R_ROOT_DIR=/R #/R

I set RHOME to my working directory where the R package sits. RHOME basically acts as a prefix; in my case, it's /home/ec2-user/.
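
As a sanity check, this is what the patched assignment expands to with the values from the comments above. The literal paths are just my example values, assuming $PWD is /home/ec2-user; nothing here calls R.

```shell
# Example values taken from the comments on the export lines (assumes PWD=/home/ec2-user).
RHOME=/home/ec2-user/R
R_ROOT_DIR=/R

# The patched line from the wrapper script.
R_HOME_DIR=${RHOME}/lib64${R_ROOT_DIR}

echo "$R_HOME_DIR"   # prints /home/ec2-user/R/lib64/R
```

The directory it prints has to exist inside the unpacked bundle, so adjust RHOME if your layout differs.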

Also, Rscript appends /R/bin to whatever RHOME is, so now I can properly run...

Rscript hello_world.R

...on the command line. Rscript knows where to find R, which knows where to find all its stuff.

I feel like packaging up R to run in a portable self-contained folder, without using Docker or something, should be easier than this, so if anyone has a better way of doing this, I'd really appreciate it.

