Service Incident - September 9, 2019 - TCGA datasets (hg19 version) unavailable

Geraldine Van der Auwera

Summary

Starting Monday 09 September 2019 at 5 PM EDT, the data listed in the older TCGA workspaces (aligned to hg19) became unavailable in Terra/FireCloud. See the Timeline section for the latest troubleshooting and resolution updates and the Impact section to understand how this could affect your use of the system. 

Timeline

September 09, 5:00 PM EDT - Issue started

Impact

Users who have workspaces that include pointers to the affected data will still be able to view results for any analyses they ran previously, but they will no longer be able to run new analyses using the original file paths.

For more information

Please follow this article to get the most up-to-date information on this incident. If you would like to be notified of all service incidents or upcoming scheduled maintenance, click Follow on the section page.


Comments

10 comments

  • Comment author
    dannykwells

    Hey Geraldine Van der Auwera - is there any update here? We are blocked on a number of projects until this is resolved. Thank you!

  • Comment author
    Tiffany Miller

    Hi dannykwells, we are actively working on this and should have a better answer next week as to when it will be available. Thanks for following up. I will comment again on this thread so that you are in the loop.

  • Comment author
    Brendan Reardon

    Hi Tiffany Miller, are there any updates yet?

  • Comment author
    Jake Conway

    Tiffany Miller any update?

  • Comment author
    Cara Mason

    Jake Conway and Brendan Reardon 

    Thanks so much for reaching out on this! We have been working with our collaborators to complete final testing, as restoration of access to the hg19 data was contingent on it, and we are on track to have access restored within the next two weeks. We have been sending updates via email to the users who had been accessing the hg19 data, and we will add your names to the list to ensure that you're updated. Please let me know if there are any further questions that I can answer for you!

  • Comment author
    dannykwells

    Hi Cara Mason - any updates? It has been 10 days since the above. We are still blocked on using hg19. Will it be up this week? This is very important to us to get resolved. 

    Also, could you add me to the list of folks to update so we can stay on top of this? 

    Thank you!!

    Danny

  • Comment author
    Brendan Reardon

    Cara Mason following up as well. Are there any updates? 

  • Comment author
    Cara Mason

    Hello, Brendan Reardon and dannykwells, thank you both for following up on this! Our last phase of internal testing, before releasing to the public, is currently underway. The internal testing is being done by those who use the hg19 data, so the results are a litmus test for us to open it up to all users. There have been a few hiccups in this last phase of testing, which we believe we have resolved. With that said, we are still looking towards release in the next few days. I will ensure that you both are on the email list to get the notice when it is available, and please feel free to keep reaching out!

    -Cara

  • Comment author
    Brendan Reardon

    Thank you, Cara. We're looking forward to having this back up and running. 

  • Comment author
    Tiffany Miller

    You all should have received an email with this information, so I am posting here to notify everyone else who may see this thread.

    The data is available again!

    As you begin using the workspaces again, you’ll notice that most of the URLs are now DRS URLs (they begin with drs://). DRS is a system that allows the NCI’s Genomic Data Commons (GDC) to relocate physical data without changing the URLs to those data. The remaining URLs are Google bucket URLs, which point to external buckets that we have direct access to. As a result of the changes with DRS, there are a few important things to note:

    • In order to access the TCGA data, you will need to link via eRA Commons in your profile (the same way you previously did) as well as with the “DCF Framework Services by University of Chicago” option. By linking in both places, you gain access to the files that are maintained by GDC via DRS URLs, as well as the files that are located in the Google buckets.

    • The hg19-aligned data is considered legacy, so not every file path is present in the GDC. Although we have coverage for the vast majority of the data in the GDC, you may notice that a very small portion of files is missing. If you do, please submit a ticket using our Contact Us button in the app, and we will ask NCI to make the data available within the GDC. However, we cannot guarantee that the GDC will make this missing data available, so we encourage users to switch to hg38 where possible.

    • Streaming files with NIO, which GATK supports, is not currently supported with DRS URLs. We are planning how tools like GATK will work with DRS URLs.
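
To check which file paths in a workspace are affected by the points above, it may help to group them by URL scheme. The following is only a minimal sketch: the function name `split_by_scheme` and the example paths are hypothetical, and it assumes that the Google bucket URLs use the standard `gs://` prefix.

```python
from urllib.parse import urlparse


def split_by_scheme(urls):
    """Group file URLs into DRS URLs, Google bucket URLs, and anything else."""
    groups = {"drs": [], "gs": [], "other": []}
    for url in urls:
        scheme = urlparse(url).scheme.lower()
        # Only drs:// and gs:// are recognized; everything else is "other".
        key = scheme if scheme in ("drs", "gs") else "other"
        groups[key].append(url)
    return groups


# Hypothetical example paths, not real TCGA file locations.
example_urls = [
    "drs://example.org/0000-aaaa",
    "gs://example-bucket/sample.bam",
]
groups = split_by_scheme(example_urls)
print(groups["drs"])  # paths that cannot currently be streamed with NIO
print(groups["gs"])   # paths that live in Google buckets
```

Paths that land in the `drs` group are the ones to watch when a tool relies on NIO streaming.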

