[2023 Feb] Uncovering structural variation with the GATK-SV pipeline

Derek Caetano-Anolles

On February 13, 2023, the Broad Institute's Data Sciences Platform presented a 4-hour workshop, Uncovering structural variation with the GATK-SV pipeline, in which we walked through the GATK-SV pipeline on Terra and showed how to use it on genome sequencing data from a variety of large-scale sequencing projects. Attendees had an opportunity to learn the pipeline straight from the developers, as well as to use the tools for themselves in a safe demo environment.


The slides and Terra registration materials are at http://broad.io/GATKSVworkshop2023. Additionally, you can try the accompanying GATK-SV workspace.


1:00 PM ET


1:05 PM

Introduction to the workshop (and Terra)


Introduction to the workshop, the plan for the day, and a brief intro to the Terra platform that we will be using to demo and use the pipeline. We will briefly go over what Terra looks like and explain how to prepare your own workspace for the upcoming demo.

1:30 PM

Structural variants and how to find them


The GATK-SV pipeline is used for discovering, genotyping, and annotating structural variants in Illumina short-read whole-genome sequencing (WGS) data. But how does it work? What is it looking for? And what can we get out of it? This overview will discuss each stage of the pipeline so we can have a better understanding of what will happen to our input data.

2:30 PM

Getting started with the GATK-SV pipeline


Now that we have our own cloned copies of the GATK-SV workspace waiting for us in Terra, we will go through how to set up our workspaces and run the pipeline. Participants will get an opportunity to use the tool to produce their own demo data, which they can access after the workshop is over.

3:30 PM

How to interpret SV-VCFs (because they're different)


Since the resulting SV-VCFs produced by the GATK-SV pipeline are a bit different from the VCFs we may be accustomed to looking at, this presentation will provide us with a better understanding of what we should be looking for when we make our own output data.
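
As a rough illustration of one such difference, the sketch below parses a single SV-style VCF record in plain Python. It is not taken from the workshop materials: the record is invented, and `parse_sv_record` is a hypothetical helper. It assumes only the symbolic ALT allele (`<DEL>`) and the `SVTYPE` and `END` INFO keys defined in the VCF specification; the exact fields written by GATK-SV may differ.

```python
# Hypothetical SV-VCF data line; every value here is invented for illustration.
RECORD = "chr1\t10000\texample_DEL_1\tN\t<DEL>\t999\tPASS\tEND=15000;SVTYPE=DEL"


def parse_sv_record(line):
    """Split one VCF data line into the fields most relevant to an SV call."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, var_id, ref, alt, qual, filt, info = fields[:8]

    # INFO is a semicolon-separated list of KEY=VALUE pairs or bare flags.
    info_map = {}
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        info_map[key] = value if value else True

    return {
        "chrom": chrom,
        "start": int(pos),
        # Unlike a SNV, an SV spans an interval, so END matters as much as POS.
        "end": int(info_map["END"]),
        "id": var_id,
        # The ALT column holds a symbolic allele rather than literal bases.
        "alt": alt,
        "svtype": info_map.get("SVTYPE"),
        "filter": filt,
    }


sv = parse_sv_record(RECORD)
print(sv["svtype"], sv["chrom"], sv["start"], sv["end"], sv["end"] - sv["start"])
```

For real files, a dedicated VCF library is usually a better choice than hand-rolled string splitting; the point here is only that the variant's type and extent are read from the ALT and INFO columns rather than from REF and ALT bases.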

4:00 PM

How to filter our variants


While our output demo data is being processed, let’s take a look at what the final file should look like. We will discuss how to filter our variants, as well as how to perform QC on them.

4:30 PM

Take-home messages, and additional resources


The workshop is over, but it’s not the end! In this wrap-up presentation, we will provide you with a collection of resources that you can use to troubleshoot the GATK-SV pipeline when you get back to work.



