hdf5 in notebook cluster

Is there a way to get HDF5 installed on the notebook clusters without using sudo? It would be really great to not need a startup script.

I need it to read loom/h5ad files for single-cell analysis with Scanpy and Seurat.

Comments

  • Comment author
    Sushma Chaluvadi

    #!/usr/bin/env bash

    # Refresh the package index, then install the HDF5 headers and libraries.
    apt-get update
    apt-get install -yq libhdf5-dev

     

    The startup script above successfully created a cluster. Could you test it and let us know whether it works for you? Thanks!
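
    One way to test it, once the cluster is up, is a quick check from a notebook terminal. This is only a sketch: the function name `hdf5_status` is made up for illustration, and it assumes a Debian/Ubuntu image where libhdf5-dev provides `h5cc` and registers its libraries with `ldconfig`.

```bash
#!/usr/bin/env bash

# hdf5_status (hypothetical helper): print "found" if an HDF5 install is
# visible to this shell, "missing" otherwise.
hdf5_status() {
    # Assumption: libhdf5-dev puts h5cc on PATH, or its libraries show up
    # in the dynamic loader cache listed by `ldconfig -p`.
    if command -v h5cc >/dev/null 2>&1 \
        || ldconfig -p 2>/dev/null | grep -q 'libhdf5'; then
        echo "found"
    else
        echo "missing"
    fi
}

echo "HDF5: $(hdf5_status)"
```

    If this prints "missing" on a cluster created with the startup script, the script log would be the next place to look.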

  • Comment author
    Simon Baumgart

    Dear all,

    I had the same issue here.

    I am trying the startup script now, but I agree with Benjamin Doran: it would be nice if HDF5 were pre-installed. The Seurat package in R needs hdf5r, which in turn requires HDF5. Seurat is becoming more important, I think, so more people are likely to run into this issue.

    Could you please install this package? Thank you in advance.

    Best wishes,

    Simon

  • Comment author
    Sushma Chaluvadi

    Hey Benjamin,

     

    Will check on this and get back to you.

  • Comment author
    Benjamin Doran

    This seems to work for building and installing HDF5 from a notebook:

    %%bash
    mkdir -p ~/.hdf5
    cd ~/.hdf5
    gsutil cp gs://fc-fc0d5126-8840-4d53-801e-71f6c3ca07d6/hdf5-1.10.5.tar.gz .
    tar -xzvf hdf5-1.10.5.tar.gz
    cd hdf5-1.10.5
    ./configure
    make
    make check
    make install
    ln -sf ~/.hdf5/hdf5-1.10.5/hdf5/bin/* /home/jupyter-user/.local/bin

    However, installing hdf5r in R then fails with a "package or namespace load failed" error, even though the compilation itself succeeds:

    !R -e 'install.packages("hdf5r", configure.args=c("--with-hdf5=/home/jupyter-user/.local/bin/h5cc"))'
    **testing if installed package can be loaded
    Error: package or namespace load failed for ‘hdf5r’ in dyn.load(file, DLLpath = DLLpath, ...):
     unable to load shared object '/home/jupyter-user/.rpackages/hdf5r/libs/hdf5r.so':
      libhdf5_hl.so.100: cannot open shared object file: No such file or directory
    Error: loading failed
    Execution halted
    ERROR: loading failed
    * removing ‘/home/jupyter-user/.rpackages/hdf5r’
    
    The downloaded source packages are in
    	‘/tmp/Rtmp7sYg8a/downloaded_packages’
    Warning message:
    In install.packages("hdf5r", configure.args = c("--with-hdf5=/home/jupyter-user/.local/bin/h5cc")) :
      installation of package ‘hdf5r’ had non-zero exit status


    Any ideas on why it fails?
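
    One possible cause, offered as a guess rather than a confirmed diagnosis: the error names `libhdf5_hl.so.100`, and only the `bin` directory was symlinked above, so the dynamic loader may not know where the freshly built shared libraries live. A minimal sketch of a workaround, assuming the build installed into `~/.hdf5/hdf5-1.10.5/hdf5` (the path used in the `ln` command) and using a made-up helper name `prepend_lib_path`:

```bash
#!/usr/bin/env bash

# Assumption: install prefix taken from the ln command in the build cell.
HDF5_PREFIX="${HDF5_PREFIX:-$HOME/.hdf5/hdf5-1.10.5/hdf5}"

# prepend_lib_path DIR (hypothetical helper): put DIR at the front of
# LD_LIBRARY_PATH unless it is already listed there.
prepend_lib_path() {
    local dir="$1"
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*) ;;  # already present, nothing to do
        *) LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

prepend_lib_path "$HDF5_PREFIX/lib"
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
```

    The variable only affects processes started from the same shell, so the R install command would need to run in the same `%%bash` cell, after these lines.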
  • Comment author
    Sushma Chaluvadi

    Hi Benjamin,

    The team was able to run `! pip install h5py` and then run a couple examples from here: https://portal.hdfgroup.org/display/HDF5/Examples+from+Learning+the+Basics.

    I will send the error messages above to the team as well.

  • Comment author
    Benjamin Doran
    • Edited

    It seems h5py works because it ships precompiled. The main issue is that I am trying to get this data into Seurat, which needs the hdf5r package.

    I tried a startup script with the content below, but I just get an "internal error" message with no error file:

    #!/bin/bash
    apt-get install -y libc++1-dev libhdf5-dev

     

  • Comment author
    Benjamin Doran

    Tested and confirmed! hdf5r can be installed in the resulting cluster.

    It would still be useful to not need the startup script, but this should work for now.

  • Comment author
    Sushma Chaluvadi

    Thanks Benjamin,

    I've passed on the request!

     

  • Comment author
    Sushma Chaluvadi

    Hi Simon,

    Thank you for the feedback, I will be sure to pass it along to the Notebook team!

