The Workflow Description Language (WDL, pronounced 'widdle') is a community-driven, human-readable language for data processing workflows. Whether you want to use WDL in Terra or develop your own WDL workflows, we have the resources to get you started!
WDL: A community-driven data processing language
WDL is a human-readable and writable language originally developed for the Broad Institute's genomic analysis pipelines. Since its inception, it has been widely adopted and developed by a global community of researchers, programmers, engineers and analysts.
Read more about the WDL Community and how you can participate at the OpenWDL website. WDL is open-source, so anyone can contribute ideas and improvements to the language, as well as to its tutorial resources.
WDL and Terra: Partners in pipelining
WDL workflows allow you to run whole genomic pipelines in Terra. The Terra platform is specifically designed to integrate a WDL workflow's input and output information directly into the platform. This integration is what allows you to readily configure a workflow from your Terra workspace.
To get started importing and running WDL workflows on Terra, read about Pipelining.
Developing WDL workflows: "Hello, learn-wdl!"
If you're an experienced developer looking to get started with WDL, we highly recommend the learn-wdl tutorial on GitHub. This community resource is open-source and uses a series of video guides developed by Lynn Langit to walk you through a variety of WDL script examples.
When you arrive at the learn-wdl GitHub page, start with the README and the Introductory Video.
Navigating the learn-wdl GitHub
The learn-wdl GitHub repository has a series of folders containing the workflow examples used in the learn-wdl videos. Read on for details about the contents of each folder.
Find steps for setting up your Google Compute Engine Virtual Machine (GCE VM) and running your first "hello-world" scripts. This folder also includes examples of the different language patterns you may use when writing a WDL script.
Read a brief overview of pipeline patterns and try the accompanying pipeline scripts as you watch the video tutorials.
Try commonly used workflows for genomic analyses. This folder includes the WDLs, input configuration files (JSONs), and sample data needed for each analysis.
Check out example data divided by data type (BAM, VCF, FASTA, etc.). You will use this data as you work through the video tutorials.
Learn more about WDL with reference material covering:
- Language concepts, including workflow keywords, style guides, and templates
- Setting up your Dev environment
- External WDL resources, including the WDL 1.0 spec and a Nextflow vs. WDL Quickstart Guide
- Courses for additional Cloud Resources (e.g. Google Cloud Platform, Galaxy)
Contribute to learn-wdl
WDL resources are always expanding and improving. Because learn-wdl is open-source, any WDL user can contribute ideas for improvements, file issues, or add resource material.
New to computing? No problem! Try these high-level overviews
For anyone interested in learning more about WDL workflows, the WDL community has curated several guides that provide a high-level overview of the basic WDL script components. These guides are available on the wdl-docs website.
Writing a WDL
First, we'll introduce the building blocks of WDL and how they fit together to form the base structure of a WDL script. Then we'll show how to add variables so that input files and parameters can be specified outside of the script itself, which will allow you to use the same script for different runs without modification. Next, we'll cover how to add plumbing, i.e. how you can chain together the components that perform units of work in different ways to form sophisticated pipelines. Finally, we'll look at how to validate syntax, which sounds boring but is really helpful since it will tell you quickly whether your WDL script is runnable or not. After all, nobody likes starting a run only to see it fail because it's missing a curly brace somewhere.
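To make those pieces concrete, here is a minimal sketch of what such a script might look like in WDL 1.0. It is not taken from the wdl-docs guides: the names HelloPipeline, say_hello, shout, and the output file names are hypothetical and chosen only for illustration. It shows two tasks (the units of work), a workflow-level input variable, and one piece of plumbing where the output of the first task feeds the second.

```wdl
version 1.0

# Illustrative sketch only: all task, workflow, and file names here are
# hypothetical and do not come from the wdl-docs guides.

# A task wraps one shell command and declares its inputs and outputs.
task say_hello {
  input {
    String name
  }

  command <<<
    echo "Hello, ~{name}!" > greeting.txt
  >>>

  output {
    File greeting = "greeting.txt"
  }
}

# A second task that takes a file and writes an upper-cased copy.
task shout {
  input {
    File text_file
  }

  command <<<
    tr '[:lower:]' '[:upper:]' < ~{text_file} > shouted.txt
  >>>

  output {
    File shouted = "shouted.txt"
  }
}

# The workflow declares the variables supplied from outside the script
# and chains the tasks together.
workflow HelloPipeline {
  input {
    String name
  }

  call say_hello { input: name = name }

  # Plumbing: the output of say_hello becomes the input of shout.
  call shout { input: text_file = say_hello.greeting }

  output {
    File greeting = say_hello.greeting
    File shouted = shout.shouted
  }
}
```

Because `name` is declared in the workflow's input block rather than hard-coded, the same script can be reused across runs by changing only the inputs file, where the value would be keyed as `HelloPipeline.name`. On a cloud backend, each task would typically also need a runtime block naming a Docker image. That is omitted here to keep the sketch short.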
Running a WDL
This is going to be short and sweet. We'll show you how to generate a JSON template for specifying inputs (spoiler: it's super easy) and fill it out. Then we'll present the main options for executing your WDL script, focusing on the execution engine we use for Terra, which is called Cromwell. If you had asked us two years ago if we'd ever have an execution engine that we could use both in development and in production, locally and on the cloud, we would have said when pigs fly...
Check out the following resources that can be useful for WDL scripting:
- The WDL 1.0 Spec. Terra's Cromwell is compatible with this version of WDL, so it's a useful reference when developing workflows for Terra.
- If you're using Terra and Google Cloud, check out the WDL Best Practices from the Broad Pipeline Development team.
- If you want to see some WDL code snippets that might match your use-case, check out the WDL cookbook section of the wdl-docs site.
- Theiagen Genomics workflow management workshop.