Is there a way to find which workspace owns a bucket?

Post author
pmontgom

I'm trying to access data from a different group. They've shared a workspace with me; however, the samples reference BAM files that are physically stored in a different workspace. (I believe the bucket is attached to a Terra workspace because the path has the form gs://fc-xxxxxx..)

I can see the bucket name in the BAM file path, and the other person has access to the bucket. However, we're having trouble figuring out which Terra workspace he needs to share with me to grant me access to that bucket.

I know you can go through one workspace at a time and view the bucket, but after you've accumulated a lot of workspaces, this is tedious.

Is there a way to look up a workspace given a bucket name?

Comments

2 comments

  • Comment author
    pmontgom

    I realized I could extract this information from the API, so I wrote a script to dump every workspace and bucket that I can see. I still have some logistics to sort out, as I'll need my collaborator to run this with their account, but I'm hopeful this will unblock us.

    It would be very convenient if Terra used Google's bucket labels to record the owning workspace on each bucket. Absent something like that, I think this is the best solution I can come up with.

    I'm posting my code here in case it's helpful for someone else faced with the same challenge in the future:

    import subprocess

    import pandas as pd
    import requests

    # Get an access token for the currently logged-in gcloud account.
    token = subprocess.check_output(["gcloud", "auth", "print-access-token"]).decode("utf8").strip()

    # List every workspace this account can see.
    response = requests.get(
        "https://api.firecloud.org/api/workspaces",
        headers={"accept": "application/json", "Authorization": "Bearer " + token},
    )
    response.raise_for_status()  # fail loudly on an expired token or API error

    # Skip public workspaces and record each remaining workspace with its bucket.
    workspaces = response.json()
    private_workspaces = [w for w in workspaces if not w["public"]]
    terra_workspaces = pd.DataFrame({
        "workspace": [w["workspace"]["namespace"] + "/" + w["workspace"]["name"] for w in private_workspaces],
        "bucketName": [w["workspace"]["bucketName"] for w in private_workspaces],
    })
    terra_workspaces.to_csv("terra_workspaces.csv", index=False)
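
To answer the original question directly (workspace for a given bucket), the list returned by the API can be searched without going through the CSV. The following is only a minimal sketch: the helper names `bucket_from_path` and `find_workspaces` are made up for illustration, the sample data is hypothetical, and it assumes each entry has the same `workspace` / `bucketName` / `namespace` / `name` fields used in the script above.

```python
from urllib.parse import urlparse


def bucket_from_path(path):
    """Return the bucket name from a gs:// path, or from a bare 'bucket/...' string."""
    if path.startswith("gs://"):
        return urlparse(path).netloc
    return path.split("/", 1)[0]


def find_workspaces(workspaces, path_or_bucket):
    """Return 'namespace/name' for each listed workspace whose bucket matches."""
    bucket = bucket_from_path(path_or_bucket)
    return [
        w["workspace"]["namespace"] + "/" + w["workspace"]["name"]
        for w in workspaces
        if w["workspace"]["bucketName"] == bucket
    ]


# Hypothetical sample data shaped like the entries used in the script above.
sample = [
    {"public": False,
     "workspace": {"namespace": "team-a", "name": "rnaseq", "bucketName": "fc-example-1"}},
    {"public": False,
     "workspace": {"namespace": "team-b", "name": "wgs", "bucketName": "fc-example-2"}},
]

print(find_workspaces(sample, "gs://fc-example-2/sample1/sample1.bam"))
```

With real data, passing the `workspaces` list from the script above in place of `sample` should work the same way. An empty result would suggest that the account running the script cannot see the owning workspace.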
  • Comment author
    Samantha (she/her)

    Hi pmontgom,

    I'm glad you were able to extract the information you were looking for with your script. I have created a ticket with your suggestion for our engineering team to look into.

    Best,

    Samantha

