Resources for RECOVER

Anton Kovalsky

The National Institutes of Health (NIH) created the RECOVER Initiative to learn about the long-term effects of COVID. Its goal is to rapidly improve our understanding of PASC (post-acute sequelae of SARS-CoV-2), including Long COVID, and our ability to predict, treat, and prevent it. The Consortium collaborates with patients, caregivers, and community representatives across all levels of the initiative, including in national leadership roles and within local communities in study locations.

Terra is a portal to a vast ecosystem of tools and data. This article guides users associated with RECOVER to relevant resources. These resources fall into two categories: data access and platform tutorials.

Data Access

The resources in this section explain how to take advantage of the data ecosystem that Terra gives you access to.

  • Terra supports multiple scientific partnerships by staging tools and data, creating secure links to protected data sets, and in some cases hosting portals for projects such as AnVIL and BioData Catalyst. To better understand how scientific partnerships fit into the Terra ecosystem, check out our article on Terra's support for scientific projects.
  • Researchers can store and search data hosted by consortia like NHLBI BioData Catalyst or AnVIL using Gen3, an open-source platform for building cloud-based data commons and ecosystems. To learn about using Gen3 data, start with this article on how to access Gen3 data.
  • Participants of certain projects may have access to credits in a pre-loaded Google Billing account through a program called STRIDES. If you are eligible for such credits, see this article on how to access STRIDES credits.

Terra Tutorials

Check out the workspaces highlighted in this section to get you going with Terra!

Working with RECOVER Fitbit data in Terra

The Adult RECOVER release phs003463.v5.p4, "NIH RECOVER: A Multi-Site Observational Study of Post-Acute Sequelae of SARS-CoV-2 Infection in Adults," published on November 19, 2025, includes participant-level Apache Parquet data. As described in the release notes, these PARQUET datasets contain the raw Fitbit sensor streams for each participant in a columnar, compressed format. There are 6,569 unique participants with PARQUET files. This dataset follows the Gen3 data model.

We identified that several PARQUET files from the Adult RECOVER release on November 19, 2025 share identical filenames despite representing distinct datasets. This could cause issues if multiple files are read or downloaded simultaneously, as files with the same name may overwrite one another.

To prevent this, we have developed a generalizable workaround. Even if the file_name variable is repeated across datasets, the bucket_path and ga4gh_drs_uri variables are unique to each file and are always available according to the Gen3 data model.
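To see which filenames are affected before downloading anything, you can group your file records by file_name. The following is a minimal sketch, assuming the records are already loaded as Python dictionaries with the file_name and bucket_path fields described above; the function name and the sample values are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def find_filename_collisions(records):
    """Return {file_name: [bucket_path, ...]} for names used by more than one file.

    `records` is assumed to be an iterable of dicts that each carry the
    `file_name` and `bucket_path` fields from the Gen3 data model.
    """
    paths_by_name = defaultdict(list)
    for record in records:
        paths_by_name[record["file_name"]].append(record["bucket_path"])
    # Keep only the names that map to two or more distinct files.
    return {name: paths for name, paths in paths_by_name.items() if len(paths) > 1}


# Hypothetical sample records, not taken from the actual release.
sample_records = [
    {"file_name": "participant.parquet", "bucket_path": "s3://path/to/activity/participant.parquet"},
    {"file_name": "participant.parquet", "bucket_path": "s3://path/to/sleep/participant.parquet"},
    {"file_name": "summary.parquet", "bucket_path": "s3://path/to/summary.parquet"},
]
collisions = find_filename_collisions(sample_records)
```

Any name that appears in the result needs a distinct download location for each of its bucket paths.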

When using TNU commands, the tnu drs copy command allows users to download a file from its DRS URI to a specified destination. We recommend specifying a download directory that mirrors the file’s bucket path to ensure unique, non-colliding file locations.

Example:

ga4gh_drs_uri: drs://my-drs-id

bucket_path: s3://path/to/activity/participant.parquet

Assuming the directory ~/activity/participant/ already exists:

tnu drs copy drs://my-drs-id ~/activity/participant/

This command copies the file into ~/activity/participant/ using its original filename (participant.parquet), preserving both the directory structure and filename uniqueness.
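When there are many files, the destination directory can be derived from bucket_path rather than typed by hand. The sketch below shows one possible convention, not a prescribed one: it mirrors the bucket name and the key's parent folders under a local root of your choosing, then builds the tnu drs copy argument list shown above. The helper names (local_dir_for, copy_command) and the /data root are assumptions made for illustration.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse


def local_dir_for(bucket_path, root):
    """Map a bucket path to a local directory that mirrors its folder structure."""
    parsed = urlparse(bucket_path)
    # netloc is the bucket name; the remaining path is the object key.
    key_parent = PurePosixPath(parsed.path.lstrip("/")).parent
    return Path(root) / parsed.netloc / str(key_parent)


def copy_command(drs_uri, bucket_path, root):
    """Build the argument list for `tnu drs copy <drs_uri> <destination dir>/`."""
    dest = local_dir_for(bucket_path, root)
    # The trailing slash marks the destination as a directory, as in the example above.
    return ["tnu", "drs", "copy", drs_uri, str(dest) + "/"]


# Values from the example above; "/data" is a hypothetical local root.
dest_dir = local_dir_for("s3://path/to/activity/participant.parquet", "/data")
command = copy_command("drs://my-drs-id", "s3://path/to/activity/participant.parquet", "/data")
```

Because the example assumes the destination already exists, create it first, for instance with dest_dir.mkdir(parents=True, exist_ok=True), and then pass the argument list to subprocess.run in an environment where tnu is installed.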

 

 
