Installing numpy on Amazon EC2

I am having trouble installing numpy on an Amazon EC2 server. I have tried easy_install, pip, pip inside a virtualenv, and pip inside another virtualenv using Python 2.7.

Every time I try, it fails with the error gcc: internal compiler error: Killed (program cc1), followed further down by a bunch of Python errors. With easy_install I get ImportError: No module named numpy.distutils, and with pip I get UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 72: ordinal not in range(128).

The EC2 instance is running kernel 3.4.43-43.43.amzn1.x86_64. Has anybody solved this problem? Numpy has always been hard for me to install, but I can usually figure it out. At this point I don't care whether it is in its own virtualenv; I just want to get it installed.

Answers


Requirements for installing Numpy

  • C compiler (gcc)
  • Fortran compiler (gfortran)
  • Python header files (2.4.x - 3.2.x)
  • BLAS or LAPACK (strongly recommended)
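
As a quick sanity check before building, something like the sketch below can report which of these prerequisites are visible. It is only an illustration: the function name check_build_prerequisites is made up here, it assumes a Python 3 interpreter (for shutil.which and pathlib), and it does not try to detect BLAS or LAPACK.

```python
import shutil
import sysconfig
from pathlib import Path


def check_build_prerequisites(tools=("gcc", "gfortran")):
    """Return a dict mapping each prerequisite name to True or False."""
    # A tool counts as present if an executable with that name is on PATH.
    results = {tool: shutil.which(tool) is not None for tool in tools}

    # Python.h sits in the interpreter's include directory when the
    # development headers (for example python27-devel) are installed.
    include_dir = sysconfig.get_paths().get("include", "")
    header = Path(include_dir) / "Python.h"
    results["Python.h"] = bool(include_dir) and header.is_file()
    return results


if __name__ == "__main__":
    for name, found in check_build_prerequisites().items():
        print("%-10s %s" % (name, "found" if found else "MISSING"))
```

Anything reported as MISSING would need to be installed through the package manager before pip can compile numpy from source.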

I wrote a script to install virtualenv and scikit-learn along with all the dependencies. You can follow it up to the numpy install, which is pretty straightforward. I have copied the relevant commands below.

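# Compiler, Python 2.7 headers, and ATLAS/LAPACK development packages
sudo yum -y install gcc-c++ python27-devel atlas-sse3-devel lapack-devel
# Fetch and unpack virtualenv
wget https://pypi.python.org/packages/source/v/virtualenv/virtualenv-1.11.2.tar.gz
tar xzf virtualenv-1.11.2.tar.gz
# Create and activate a virtualenv named sk-learn
python27 virtualenv-1.11.2/virtualenv.py sk-learn
. sk-learn/bin/activate
# Build and install numpy inside the virtualenv
pip install numpy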

Just copy/paste, hit enter, (get a cup of coffee) and you're ready to go with virtualenv and numpy on EC2.

If you want to verify that numpy found the optimized linear algebra libraries, run:

(sk-learn)[ec2-user@ip-10-99-17-223 ~]$ python -c "import numpy; numpy.show_config()"

If you see something similar to the following, you're all set.

atlas_threads_info:
    libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
    library_dirs = ['/usr/lib64/atlas-sse3']
    define_macros = [('ATLAS_INFO', '"\\"3.8.4\\""')]
    language = f77
    include_dirs = ['/usr/include']
blas_opt_info:
    libraries = ['ptf77blas', 'ptcblas', 'atlas']
    library_dirs = ['/usr/lib64/atlas-sse3']
    define_macros = [('ATLAS_INFO', '"\\"3.8.4\\""')]
    language = c
    include_dirs = ['/usr/include']
atlas_blas_threads_info:
    libraries = ['ptf77blas', 'ptcblas', 'atlas']
    library_dirs = ['/usr/lib64/atlas-sse3']
    define_macros = [('ATLAS_INFO', '"\\"3.8.4\\""')]
    language = c
    include_dirs = ['/usr/include']
lapack_opt_info:
    libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
    library_dirs = ['/usr/lib64/atlas-sse3']
    define_macros = [('ATLAS_INFO', '"\\"3.8.4\\""')]
    language = f77
    include_dirs = ['/usr/include']
lapack_mkl_info:
  NOT AVAILABLE
blas_mkl_info:
  NOT AVAILABLE
mkl_info:
  NOT AVAILABLE
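
If you would rather not eyeball that output, a small parser along these lines can tell you which sections were found. This is a rough sketch that assumes the text layout shown above (a section name ending in a colon, then either indented settings or NOT AVAILABLE); other numpy versions may print the configuration differently. parse_show_config and SAMPLE are names invented for this example.

```python
def parse_show_config(text):
    """Map each section name to False if it is NOT AVAILABLE, else True."""
    sections = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        # An unindented line ending in ':' starts a new section.
        if not line[0].isspace() and line.rstrip().endswith(":"):
            current = line.strip()[:-1]
            sections[current] = True
        elif current is not None and line.strip() == "NOT AVAILABLE":
            sections[current] = False
    return sections


# Shortened copy of the output above, used only as test input.
SAMPLE = """\
blas_opt_info:
    libraries = ['ptf77blas', 'ptcblas', 'atlas']
    library_dirs = ['/usr/lib64/atlas-sse3']
lapack_opt_info:
    libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
mkl_info:
  NOT AVAILABLE
"""

if __name__ == "__main__":
    status = parse_show_config(SAMPLE)
    for name in ("blas_opt_info", "lapack_opt_info"):
        print(name, "OK" if status.get(name) else "missing")
```

The sections that matter for an ATLAS build are blas_opt_info and lapack_opt_info; the mkl ones being NOT AVAILABLE is expected here.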

For a more detailed explanation, you can read installing-scikit-learn-on-amazon-ec2. I wrote the blog post specifically to remember the installation steps and have a short how-to guide. I try to keep the post and the install script up to date.


I ended up just installing numpy through yum: sudo yum install numpy. I guess this is the best I can do for now. When I need numpy inside a virtualenv, I will tell the virtualenv to use the system site packages.

Thanks for the suggestion @Robert.
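
For anyone wanting to try the site-packages route, here is a minimal sketch of an environment that can see system-wide packages such as a yum-installed numpy. It swaps in Python 3's built-in venv module rather than the virtualenv tool used in this thread (with virtualenv the equivalent is the --system-site-packages flag). The function name and the numpy-env directory are placeholders.

```python
import tempfile
import venv
from pathlib import Path


def create_env_with_site_packages(env_dir):
    """Create a virtual environment that can also import system packages."""
    # with_pip=False keeps the example fast and avoids needing ensurepip.
    builder = venv.EnvBuilder(system_site_packages=True, with_pip=False)
    builder.create(str(env_dir))
    # pyvenv.cfg records the setting the new interpreter will honour.
    return (Path(env_dir) / "pyvenv.cfg").read_text()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(create_env_with_site_packages(Path(tmp) / "numpy-env"))
```

The printed pyvenv.cfg should contain a line saying include-system-site-packages = true, which is what lets the environment fall back to the system numpy.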


Just for the archive: if you are running an Ubuntu EC2 instance and have already installed pip, you can do something like this.

For Python 2:

pip install numpy --user

For Python 3:

pip3 install numpy --user

The key is the --user flag.
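
To see where a --user install ends up, the standard library's site module can tell you. This is a small sketch (the function name user_install_location is made up for the example); it only reports the directory and whether the interpreter has per-user site-packages enabled, which is not always the case, for instance inside some virtual environments.

```python
import site


def user_install_location():
    """Return (per-user site-packages path, whether user site is enabled)."""
    user_site = site.getusersitepackages()
    # ENABLE_USER_SITE is False or None when the user site directory is
    # disabled, so normalise it to a plain boolean.
    return user_site, bool(site.ENABLE_USER_SITE)


if __name__ == "__main__":
    path, enabled = user_install_location()
    print("user site-packages:", path)
    print("enabled:", enabled)
```

Because that directory lives under your home directory, pip does not need root permissions to write there.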


You might try the Anaconda Python distribution from https://www.continuum.io, which uses conda as its Python version and package manager. I have found this distro to be well-configured and convenient for scientific computing work.

I was able to download and install it on an EC2 instance using wget and the Linux download link from their Downloads webpage. For example, for Python 2:

$ wget https://3230d63b5fc54e62148e-c95ac804525aac4b6dba79b00b39d1d3.ssl.cf1.rackcdn.com/Anaconda2-2.5.0-Linux-x86_64.sh

...

$ chmod a+x Anaconda2-2.5.0-Linux-x86_64.sh

$ ./Anaconda2-2.5.0-Linux-x86_64.sh

...

$ source .bashrc

$ conda create --name myEnvName biopython

$ source activate myEnvName

$ python -c 'import numpy; print(numpy.version.version)'

1.10.4
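
If you want the same check in a script that does not blow up when numpy is absent, something like this sketch would do. The helper name installed_version is hypothetical, and it assumes Python 3 for importlib.util.find_spec.

```python
import importlib
import importlib.util


def installed_version(package="numpy"):
    """Return the package's __version__ string, or None if it is not found."""
    # find_spec looks the package up without importing it.
    if importlib.util.find_spec(package) is None:
        return None
    module = importlib.import_module(package)
    return getattr(module, "__version__", None)


if __name__ == "__main__":
    version = installed_version("numpy")
    print("numpy", version if version else "is not installed in this environment")
```

Run it with the environment activated, since the result depends on which interpreter executes it.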

