How to convert common genomics file formats

Allie Hajian

To convert sequencing files into the analysis-ready input formats that GATK workflows expect, use the Sequence Format Conversion workspace. You'll find it in the Featured Workspaces section of Terra's Showcase library.

What's in the workspace

This curated workspace has tools and instructions for converting the following formats so you can use your data in GATK analysis workflows on Terra. 

File conversion WDLs in the workspace

  1. Interleaved FASTQ to paired FASTQ   
  2. Paired FASTQ to unmapped BAM (uBAM) 
  3. BAM to unmapped BAM  
  4. CRAM to BAM 
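To make the first conversion concrete, here is a minimal Python sketch of what splitting an interleaved FASTQ involves. It is an illustration only, not the WDL shipped in the workspace. It assumes four-line records with no blank lines and strict R1, R2 alternation; the function names `read_fastq_records` and `deinterleave` are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

# One FASTQ record: header, sequence, separator, base qualities.
Record = Tuple[str, str, str, str]


def read_fastq_records(lines: Iterable[str]) -> Iterator[Record]:
    """Yield four-line FASTQ records from an iterable of text lines."""
    stripped = (line.rstrip("\n") for line in lines)
    while True:
        chunk = list(islice(stripped, 4))
        if not chunk:
            return
        if (
            len(chunk) != 4
            or not chunk[0].startswith("@")
            or not chunk[2].startswith("+")
        ):
            raise ValueError(f"Malformed FASTQ record near: {chunk[:1]}")
        yield (chunk[0], chunk[1], chunk[2], chunk[3])


def deinterleave(lines: Iterable[str]) -> Tuple[List[Record], List[Record]]:
    """Split interleaved records (R1, R2, R1, R2, ...) into two lists.

    Assumption: mates strictly alternate, so even-indexed records are
    first-of-pair and odd-indexed records are second-of-pair.
    """
    records = list(read_fastq_records(lines))
    if len(records) % 2 != 0:
        raise ValueError("Interleaved FASTQ must contain an even number of records")
    return records[0::2], records[1::2]
```

For real data, pass an open file handle as `lines` and write each returned list to its own output file. This sketch holds every record in memory, so it suits small test files rather than full sequencing runs.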

The workspace also includes a Validate BAM workflow, which confirms that SAM or BAM files are properly formatted.
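Before uploading a file, a quick local check can catch the most basic problem: a file that is not a BAM at all. The sketch below is not the Validate BAM workflow and does not replace it. It relies only on the fact that a BAM file is BGZF-compressed (readable with Python's `gzip` module) and that its decompressed content begins with the four magic bytes `BAM\1`. The function name `looks_like_bam` is hypothetical.

```python
import gzip

# Per the SAM/BAM specification, decompressed BAM data starts with these bytes.
BAM_MAGIC = b"BAM\x01"


def looks_like_bam(path: str) -> bool:
    """Return True if the file is gzip/BGZF data that starts with the BAM magic.

    This checks only the first four decompressed bytes. It says nothing
    about headers, read groups, or record-level validity.
    """
    try:
        with gzip.open(path, "rb") as handle:
            return handle.read(4) == BAM_MAGIC
    except (OSError, EOFError):
        # Not gzip-compressed, unreadable, or truncated.
        return False
```

A result of `True` means only that the file is worth passing to a full validator; a result of `False` usually points to a mislabeled or corrupted file.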

How to find the workspace

1. From the main navigation menu (top left), expand the Library section and select Featured Workspaces.
Screenshot of main menu with 'Library' section expanded and 'Featured Datasets' highlighted

2. In the far-left column, under Utilities, click the Format Conversion filter (you may need to scroll down), then select Sequence-Format-Conversion from the list.

Screenshot of Featured workspace page highlighting the 'Format conversion' filter - under Utilities in the far left column.

