Terra Cloud Environment Error with Hail

Post author
Stephanie Hao

I tried to create a cloud environment with Hail and these specifications, but I get the following error when the environment tries to spin up.

Failed to initialize node saturn-877eec3d-fc4d-4f1e-9b62-d502832f0f44-m: Component hive-server2 failed to activate See output in: gs://leostaging-saturn-877eec3d-8841c9e6-fa30-4bbf-9d75-812398fc8f49/google-cloud-dataproc-metainfo/5a9cdd1d-c2eb-4df1-82df-0a98fc8d5f5f/saturn-877eec3d-fc4d-4f1e-9b62-d502832f0f44-m/dataproc-startup-script_output
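
The startup log referenced in the error can be read straight from the bucket; a minimal sketch, assuming the account running it has read access to the leostaging bucket:

# print the Dataproc startup-script output named in the error message
gsutil cat gs://leostaging-saturn-877eec3d-8841c9e6-fa30-4bbf-9d75-812398fc8f49/google-cloud-dataproc-metainfo/5a9cdd1d-c2eb-4df1-82df-0a98fc8d5f5f/saturn-877eec3d-fc4d-4f1e-9b62-d502832f0f44-m/dataproc-startup-script_output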

Comments

4 comments

  • Comment author
    Samantha (she/her)

    Hi Stephanie Hao,

    Thanks for writing in. We'll take a look at your issue and get back to you as soon as we can.

    Best,

    Samantha

  • Comment author
    Samantha (she/her)

    Hi Stephanie Hao,

    Sorry for the delayed response. Can you try deleting your current cloud environment and creating a new Hail environment? I was able to create a Hail environment without issue using the same configuration.

    If you still receive an error, please provide the following details so we can take a closer look:

    • Terra account email address

    • Google Project ID for the workspace, found on the right side of the workspace dashboard

    • Cluster ID (visit https://app.terra.bio/#clusters, click Details, and see the Name field); a gcloud sketch for listing clusters follows this list
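
    If it is easier, the clusters in the workspace's Google project can also be listed from a terminal; a minimal sketch, where <google-project-id> is a placeholder and us-central1 is an assumed region:

    # list Dataproc clusters in the given project and region
    gcloud dataproc clusters list --project=<google-project-id> --region=us-central1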

    Best,

    Samantha

  • Comment author
    Stephanie Hao

    Hi Samantha,

    I retried it with a different Google project ID and was able to create an environment successfully. Not sure what happened above, but thank you for the help!

    Best,

    Stephanie

  • Comment author
    Stephanie Hao
    • Edited

    Hi Samantha,

    I get an error when trying to start a Hail Dataproc cluster. I'm not sure who to ask to grant me dataproc.clusters.use access, since I thought I was the owner of the project, but I cannot give myself this permission.

    jupyter@saturn-a7b713a4-212d-4487-8de3-72c23eb50b46-m:~$ hailctl dataproc start mssng-test
    gcloud dataproc clusters create mssng-test \
        --image-version=1.4-debian9 \
        --properties=^|||^spark:spark.task.maxFailures=20|||spark:spark.driver.extraJavaOptions=-Xss4M|||spark:spark.executor.extraJavaOptions=-Xss4M|||spark:spark.speculation=true|||hdfs:dfs.replication=1|||dataproc:dataproc.logging.stackdriver.enable=false|||dataproc:dataproc.monitoring.stackdriver.enable=false|||spark:spark.driver.memory=41g \
        --initialization-actions=gs://hail-common/hailctl/dataproc/0.2.62/init_notebook.py \
        --metadata=^|||^WHEEL=gs://hail-common/hailctl/dataproc/0.2.62/hail-0.2.62-py3-none-any.whl|||PKGS=aiohttp>=3.6,<3.7|aiohttp_session>=2.7,<2.8|asyncinit>=0.2.4,<0.3|bokeh>1.3,<2.0|decorator<5|Deprecated>=1.2.10,<1.3|dill>=0.3.1.1,<0.4|gcsfs==0.2.2|humanize==1.0.0|hurry.filesize==0.9|nest_asyncio|numpy<2|pandas>=1.1.0,<1.1.5|parsimonious<0.9|PyJWT|python-json-logger==0.1.11|requests==2.22.0|scipy>1.2,<1.7|tabulate==0.8.3|tqdm==4.42.1|google-cloud-storage==1.25.* \
        --master-machine-type=n1-highmem-8 \
        --master-boot-disk-size=100GB \
        --num-master-local-ssds=0 \
        --num-secondary-workers=0 \
        --num-worker-local-ssds=0 \
        --num-workers=2 \
        --secondary-worker-boot-disk-size=40GB \
        --worker-boot-disk-size=40GB \
        --worker-machine-type=n1-standard-8 \
        --initialization-action-timeout=20m \
        --labels=creator=pet-2596641334362028ce99d_terra-2d3a4d00_iam_gserviceaccount_co
    Starting cluster 'mssng-test'...
    ERROR: (gcloud.dataproc.clusters.create) PERMISSION_DENIED: Not authorized to requested resource.
    Traceback (most recent call last):
      File "/opt/conda/bin/hailctl", line 8, in <module>
        sys.exit(main())
      File "/opt/conda/lib/python3.7/site-packages/hailtop/hailctl/__main__.py", line 100, in main
        cli.main(args)
      File "/opt/conda/lib/python3.7/site-packages/hailtop/hailctl/dataproc/cli.py", line 122, in main
        jmp[args.module].main(args, pass_through_args)
      File "/opt/conda/lib/python3.7/site-packages/hailtop/hailctl/dataproc/start.py", line 369, in main
        gcloud.run(cmd[1:])
      File "/opt/conda/lib/python3.7/site-packages/hailtop/hailctl/dataproc/gcloud.py", line 9, in run
        return subprocess.check_call(["gcloud"] + command)
      File "/opt/conda/lib/python3.7/subprocess.py", line 363, in check_call
        raise CalledProcessError(retcode, cmd)

     

    The associated email address is shao@broadinstitute.org and the Google project ID is terra-2d3a4d00.
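
    For reference, here is one way the missing permission could be checked and, if an owner or admin account is available, granted from a terminal. This is a minimal sketch: <member> is a placeholder (for example user:shao@broadinstitute.org or the workspace pet serviceAccount), roles/dataproc.editor is just one role that contains dataproc.clusters.use, and the caller needs IAM permissions on terra-2d3a4d00.

    # list the roles a given account currently holds on the project
    gcloud projects get-iam-policy terra-2d3a4d00 \
        --flatten="bindings[].members" \
        --filter="bindings.members:<member>" \
        --format="table(bindings.role)"

    # grant a role that includes dataproc.clusters.use to that account
    gcloud projects add-iam-policy-binding terra-2d3a4d00 \
        --member="<member>" \
        --role="roles/dataproc.editor"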

