Error from copying a large file

Post author
Seung Hoan Choi

I am trying to copy a file of about 900 MB to a Notebook, but I get the error message below.

Even after installing crcmod with the command "pip install --no-cache-dir -U crcmod", I got the same error message.

Do you know how I can fix this?

Thanks,

Seung Hoan Choi

Copying gs://fc-306d0fc4-2f1d-4ea1-ae29-ea5b8fe0cb22/genotype/freeze8_gds/freeze.8.chr21.pass_only.phased.gds...
==> NOTE: You are downloading one or more large file(s), which would
run significantly faster if you enabled sliced object downloads. This
feature is enabled by default but requires that compiled crcmod be
installed (see "gsutil help crcmod").

CommandException: 
Downloading this composite object requires integrity checking with CRC32c,
but your crcmod installation isn't using the module's C extension, so the
hash computation will likely throttle download performance. For help
installing the extension, please see "gsutil help crcmod".

To download regardless of crcmod performance or to skip slow integrity
checks, see the "check_hashes" option in your boto config file.

NOTE: It is strongly recommended that you not disable integrity checks. Doing so
could allow data corruption to go undetected during uploading/downloading.
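
The message says crcmod is falling back to its pure-Python implementation. A minimal sketch for confirming which implementation gsutil actually sees after a reinstall, assuming gsutil is on the PATH. The helper name crcmod_status is hypothetical; it only filters the "compiled crcmod" line that "gsutil version -l" prints.

```shell
#!/bin/sh
# Extract the value of the "compiled crcmod" line from
# `gsutil version -l` output supplied on stdin.
crcmod_status() {
  awk -F': *' '/^compiled crcmod/ { print $2 }'
}

# Only call gsutil when it is actually installed.
if command -v gsutil >/dev/null 2>&1; then
  gsutil version -l | crcmod_status
fi
```

If this still reports False after the pip install, the C extension was probably not built, for example because a compiler or the Python headers were missing; "gsutil help crcmod" lists the platform-specific steps.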


Comments

9 comments

  • Comment author
    Adelaide Rhodes
    • Edited

    Hello Seung Hoan -

    What is the configuration for your notebook cluster? I believe that by default it creates a cluster with 500 MB of storage. You might want to recreate the cluster with more storage.

    My apologies, I was confounding disk size and storage size.

    0
  • Comment author
    Seung Hoan Choi

    Hi Adelaide,

    I found that I was using 500 GB of storage. I tried with 1 TB, but it still does not work.

    Thanks, 

    0
  • Comment author
    Adelaide Rhodes

    Hi Seung Hoan Choi - What task are you trying to complete in the notebook? I am wondering whether this large file could be processed to make it smaller.


    Otherwise, it may be possible to resize the notebook cluster using the Swagger API, but I will have to check with the workbench team to see what the upper limit is for our notebooks.

    0
  • Comment author
    Adelaide Rhodes

    Hi Seung Hoan Choi - I was wondering what values you see in your cluster setup?

    0
  • Comment author
    Seung Hoan Choi

    Hi Adelaide,

    I used 4 CPUs, 1000 GB disk size, and 15 GB memory. It is not clear to me whether this is a memory issue.

    Thanks,

     

    0
  • Comment author
    Adelaide Rhodes

    Hello Seung Hoan -

    Your download should work. Could you please post the command that you used for the download?

    Adelaide

    0
  • Comment author
    Seung Hoan Choi
    • Edited

    Hi Adelaide

    Below are the command and the response from the terminal.

    Thanks,

    Seung Hoan

    jupyter-user@saturn-d498b7e3-cefa-43e6-ba42-e9cb768e44e4-m:~$ gsutil cp gs://fc-306d0fc4-2f1d-4ea1-ae29-ea5b8fe0cb22/genotype/freeze8_gds/freeze.8.chr21.pass_only.phased.gds .
    Copying gs://fc-306d0fc4-2f1d-4ea1-ae29-ea5b8fe0cb22/genotype/freeze8_gds/freeze.8.chr21.pass_only.phased.gds...
    ==> NOTE: You are downloading one or more large file(s), which would
    run significantly faster if you enabled sliced object downloads. This
    feature is enabled by default but requires that compiled crcmod be
    installed (see "gsutil help crcmod").

    CommandException:
    Downloading this composite object requires integrity checking with CRC32c,
    but your crcmod installation isn't using the module's C extension, so the
    hash computation will likely throttle download performance. For help
    installing the extension, please see "gsutil help crcmod".

    To download regardless of crcmod performance or to skip slow integrity
    checks, see the "check_hashes" option in your boto config file.

    NOTE: It is strongly recommended that you not disable integrity checks. Doing so
    could allow data corruption to go undetected during uploading/downloading.
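
The "check_hashes" option named in this output can also be set for a single command with gsutil's top-level -o flag, instead of editing the boto config file. A hedged sketch, assuming the goal is to keep the CRC32c check and accept a slow download: the value "always" does that, whereas "never" and "if_fast_else_skip" would skip the check that the NOTE warns about. The wrapper name checked_cp is hypothetical.

```shell
#!/bin/sh
# Copy one object while forcing integrity checks, even when crcmod
# has no compiled extension (slower, but nothing is skipped).
checked_cp() {
  # $1: source gs:// URL, $2: local destination
  gsutil -o "GSUtil:check_hashes=always" cp "$1" "$2"
}

# Example call, using the object from this thread:
# checked_cp gs://fc-306d0fc4-2f1d-4ea1-ae29-ea5b8fe0cb22/genotype/freeze8_gds/freeze.8.chr21.pass_only.phased.gds .
```

This is a workaround rather than a fix; installing the compiled crcmod extension remains the route the message itself recommends.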

    0
  • Comment author
    Seung Hoan Choi

    Hi Adelaide,

    I tried a different file with a larger file size. It worked! I think there is something wrong with this file.

    I will create a new version of this file and try it again.

    Thanks for your help.

    Seung Hoan

     

    0
  • Comment author
    Adelaide Rhodes

    Great! I will close the ticket for now.

    0
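
The thread does not establish why a larger file copied while this one failed. One possible explanation, consistent with the error text, is that only this object is a composite object: composite objects carry a CRC32c hash but no MD5, so gsutil needs crcmod to verify them. A minimal sketch for checking this, assuming that "gsutil ls -L" prints a Component-Count line for composite objects. The helper name is_composite and the OBJECT_URL variable are hypothetical.

```shell
#!/bin/sh
# Succeeds (exit status 0) if `gsutil ls -L` output on stdin
# describes a composite object.
is_composite() {
  grep -q 'Component-Count:'
}

# Usage (OBJECT_URL is a placeholder for the gs:// path in question):
# gsutil ls -L "$OBJECT_URL" | is_composite && echo "composite object"
```

If the failing file is composite and the file that worked is not, that would account for the difference without the file itself being corrupt.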
