Best way to use data from public FTP in workflow?
There are BAM/FASTQ files on a public FTP server that I would like to use in a workflow (e.g. alignment, variant calling). Do I need to upload them to my workspace files before I can use them? Or could I pass the URL for each file as an input, `curl` it within the workflow, and then treat the download as an intermediate file?
Comments
Hi Juniper,
Thanks for writing in! I believe that both of the methods you've suggested are possible for working with external files in Terra. To move the files to the workspace, please take a look at this document, which walks you through the process: How to move data to/from a Google bucket (workspace or external)
As for having the workflow access the external files directly, this depends on the workflow itself, as the WDL will need to include code that downloads the files into the Docker container. Is there an existing workflow you'd like to use, or are you creating one yourself? Please let me know and I can take a look if you'd like.
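For illustration, here is a minimal sketch of what the download step could look like if you write it yourself. It is not taken from an existing Terra workflow: the task and workflow names, the `disk_gb` default, and the `buildpack-deps:curl` image are all assumptions, and any image that provides `bash` and `curl` should work in its place. The URL is declared as a `String` so that the download is done by the `curl` command rather than by file localization.

```wdl
version 1.0

# Hypothetical task: downloads one file from a public URL inside the container.
task fetch_from_url {
  input {
    String url         # declared as String so the command below does the download
    Int disk_gb = 50   # assumed default; size it to the largest file you expect
  }

  # Name the local copy after the last path component of the URL.
  String filename = basename(url)

  command <<<
    set -euo pipefail
    # --fail makes curl exit non-zero on server errors so the task fails visibly.
    curl --fail --location --retry 3 --output "~{filename}" "~{url}"
  >>>

  output {
    File downloaded = "~{filename}"
  }

  runtime {
    docker: "buildpack-deps:curl"   # assumed image; needs bash and curl
    disks: "local-disk ~{disk_gb} HDD"
  }
}

# Hypothetical wrapper workflow showing how the task would be called.
workflow fetch_public_file {
  input {
    String url
  }

  call fetch_from_url { input: url = url }

  output {
    File fetched_file = fetch_from_url.downloaded
  }
}
```

In a larger workflow, `fetch_from_url.downloaded` would be passed as the `File` input to the alignment or variant-calling task, so the download behaves like any other intermediate file.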
Let me know if you have any questions.
Best,
Josh.