Creating a list file of reads for input to a workflow

Allie Hajian

Raw genomics data comes off the sequencer as many reads, typically stored in many separate data files. Since it would be messy and time-consuming to type in the location of every one of these files as input for a WDL, workflows often take a single 'list' file as input instead.

This file is just a list of all the data, where each row is a link to an unmapped BAM file in the cloud. It is the expected input to 1_Processing-For-Variant-Discovery, for example.

If you open a list file in a text editor, you will see one `gs://` path per line and nothing else.
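
For illustration, a list file with three entries might look like the following. The bucket name and file names here are hypothetical placeholders, not real data:

```text
gs://your_data_Google_bucket_id/sample1.unmapped.bam
gs://your_data_Google_bucket_id/sample2.unmapped.bam
gs://your_data_Google_bucket_id/sample3.unmapped.bam
```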

How to make a list file of data in your Google bucket using gsutil 

1. Open a terminal configured to run gsutil. 


2. Output a list of the BAM files in your Google bucket to a local file.

To write the list to a file named `ubams.list`, use the following command:

gsutil ls gs://your_data_Google_bucket_id/ > ubams.list

Note that you will need to replace `your_data_Google_bucket_id` with the path to your workspace Google bucket (or wherever your data are). You can copy your workspace bucket path to your clipboard by clicking the clipboard icon at the far right of your dashboard tab under `Google bucket`. 

To save the list under a different file name, replace `ubams.list` in the command above with the file name of your choice. Just remember to use that file name in the commands below.
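
Because `gsutil ls` on a bucket path lists everything at that level, the resulting file can include entries that are not BAM files (other file types, or subfolder paths ending in `/`). The sketch below shows one way to keep only the BAM paths. The helper name `filter_bams`, the intermediate file `raw.list`, and the sample paths are assumptions made for illustration; in practice `raw.list` would be the output of the `gsutil ls` command above.

```shell
# Hypothetical helper: print only the lines that look like gs:// paths ending in .bam
filter_bams() {
  grep -E '^gs://.+\.bam$' "$1"
}

# Made-up listing standing in for the output of `gsutil ls`
# (bucket and file names are placeholders)
printf '%s\n' \
  'gs://your_data_Google_bucket_id/sample1.unmapped.bam' \
  'gs://your_data_Google_bucket_id/notes.txt' \
  'gs://your_data_Google_bucket_id/subfolder/' \
  'gs://your_data_Google_bucket_id/sample2.unmapped.bam' > raw.list

# Write the filtered paths to the list file and display them
filter_bams raw.list > ubams.list
cat ubams.list
```

Filtering locally keeps the original listing available in `raw.list` in case you need to check what was dropped.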

3. Copy ubams.list to your workspace Google bucket from your local machine. 

gsutil cp ubams.list gs://your_data_Google_bucket_id/ 

You can verify that the list file is in your workspace bucket by opening your Google bucket in a browser from the dashboard page (right column). 
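
If you prefer to check from the terminal instead, one possible approach is to compare the uploaded copy against your local file. The sketch below assumes `gsutil` is on your path; the helper name `verify_upload` is hypothetical, and the bucket path in the usage comment is the same placeholder used above.

```shell
# Hypothetical helper: succeeds only if the local list file is non-empty
# and the copy in the bucket has identical contents.
verify_upload() {
  local_file="$1"     # e.g. ubams.list
  remote_path="$2"    # e.g. gs://your_data_Google_bucket_id/ubams.list
  [ -s "$local_file" ] || return 1
  # gsutil cat streams the remote object to stdout; cmp reads it from stdin (-)
  gsutil cat "$remote_path" | cmp -s - "$local_file"
}

# Usage (replace the placeholder bucket path with your own):
# verify_upload ubams.list gs://your_data_Google_bucket_id/ubams.list && echo "upload verified"
```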




  • Comment author

    I believe the first command in section 2 should read:

    gsutil ls gs://your_data_Google_bucket_id/ > ubams.list

    as opposed to "gs:/your_data_Google_bucket_id" written above.

  • Comment author
    Allie Hajian

    Thanks for the catch, STEVEN GILHOOL! We on the User Ed team really appreciate when users help us identify errors big and small in our documentation. I updated the article with the correct command. 

