Is it possible to have a task directly modify an input file?

July 29, 2021 16:48
19 comments

While working on converting UWGAC workflows from CWL into WDL as part of BioData Catalyst, I came across one that caused a permissions issue. For weeks I thought this was due to the fact I was using symlinks, but I now believe the issue is that the workflow calls an Rscript that attempts to modify the input file directly.

According to my tests, Cromwell, even in "local mode," localizes input files with rw-r--r-- permissions. This is the case whether they come from a gs:// URI or from another task. So, when Cromwell is run as root, the files can be edited directly. But on Terra, you (quite understandably!) do not have root permissions, meaning that you cannot edit input files directly. At least, that's my theory.

I created a simple Python script in a WDL that demonstrates this a little more simply than the UWGAC workflow I'm converting. The Python WDL passes when run locally, presumably because I am running as root, but it gives a permissions error and fails on Terra. https://github.com/aofarrel/upon-thine-inputs

Two questions:

1. Is my hypothesis correct? I have searched the Cromwell and openWDL spec repos, but thus far have not been able to find documentation on this.

2. For the UWGAC workflow I am converting, I cannot modify the Rscripts; we want to use the exact same Docker image that the CWLs use and the Rscript is in that image. It seems that the only possible workaround is to cp the input file into the execution directory, and point the Rscript to that new copy (which presumably has wider permissions). Terra (again, quite understandably) does not grant me mv or chmod permissions, so it seems this is the only possible workaround. But is there another way that doesn't involve duplication?
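For illustration, the copy workaround described in question 2 might look like the sketch below. It is written in Python only to show the idea; `copy_for_editing` and its arguments are hypothetical names, not part of Cromwell, WDL, or the UWGAC image. It assumes the execution directory is writable by the current user and that any same-named entry there is a symlink to the input, as in the directory listings later in this thread.

```python
import os
import shutil


def copy_for_editing(input_path, workdir="."):
    """Copy an input file into workdir and return the path of the copy.

    shutil.copyfile copies only the file contents, so the copy is a new
    file owned by whoever runs this code, whoever owns the original.
    """
    target = os.path.join(workdir, os.path.basename(input_path))
    if os.path.islink(target):
        # Assumption: a symlink with the input's name may already exist in
        # the execution directory. Remove it so we do not write through it.
        os.remove(target)
    shutil.copyfile(input_path, target)
    return target
```

The script that needs write access would then be pointed at the returned path instead of the localized input. The cost is the duplication the question is trying to avoid.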

Comments


Jason Cerrato
- August 02, 2021 14:57
Hi Aisling,

Thanks for writing in and detailing your issue. We'll take a closer look at this and get back to you as soon as we can!

Kind regards,

Jason

Jason Cerrato
- August 05, 2021 13:23
Hi Aisling,

Apologies for the wait here, we've been experiencing a higher-than-usual number of support requests. I will work to get you answers to your questions before end of day.

Kind regards,

Jason

Jason Cerrato
- August 05, 2021 18:27
Hi Aisling,

For question 1, when you run a workflow, Cromwell makes local copies of your input files on the Google VM it spins up, and you should be able to edit and move these files as needed. I see in your GitHub repo that you're getting the permission denied error. One of our engineers ran a test using the same commands in their WDL and was able to successfully edit that file.

They were able to do this successfully when using the python:latest Docker image for the task. When they tried to use your Docker image, they ran into the same error you did. They ran a whoami command and saw that the command was executed as topmed rather than root.

You should be able to get around this by either changing the Docker image you use or by editing the image so it runs commands as root instead of as topmed.

For question 2, if your script requires the file to be located at a particular location you can definitely mv the file. Here is an example of running a mv command on a file within a task.

Moving the files to the execution directory will keep the same permissions -rw-r--r--, so you should be able to chmod the file as well if needed for your Rscript (assuming your Docker image is set up to run as root).

I hope this helps! If you have any questions please let us know.

Kind regards,

Jason

Ash O'Farrell
- August 05, 2021 18:58
Thank you for your response and testing; it's helpful to know this has something to do with the container permissions. However, it seems to raise even more questions. When I run the WDL locally via Cromwell in the topmed container, I can modify the file directly. Although Cromwell in local mode ignores most runtime attributes, it has to be using a Docker container; otherwise none of my local tests of the pipeline I'm converting would work (as they call a script in the image). Even weirder, if I add a whoami to my WDL and run it locally, it says that I'm running as topmed, which matches what I'm running as on Terra. And yet, it seems that when run locally, the topmed user suddenly has root permissions? I'm not sure if this is even a quirk of Cromwell, as the original pipeline where I ran into this error uses the exact same topmed Docker image in its CWL, and the CWL version of the pipeline is able to modify files directly.

It doesn't seem to make sense for a CWL, a local WDL, and a Terra WDL that all use the same Docker image as the same topmed user to end up with different file permissions.

The only idea I have is that the topmed user might have write permissions, but perhaps Terra sees that it isn't root and limits those write permissions, while allowing a root user to write, as demonstrated by the Python Docker image being able to edit the file on Terra...?

Jason Cerrato
- August 11, 2021 15:04
Hi Aisling,

We are investigating this further and will get back to you ASAP!

Kind regards,

Jason

Jason Cerrato
- August 11, 2021 15:42
Hi Aisling,

Can you add an ls -l to your command to list the input permissions and run the workflow on your local Cromwell instance so we know what permissions are required to edit the input when run locally?

Can you also confirm whether topmed has root or escalated privileges on your local system? It's possible that you are able to edit the input files locally only because topmed has escalated permissions on your system, but it doesn't on the Terra-created VM instance.

Kind regards,

Jason
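The checks requested above (who the task runs as, and who owns the inputs) could also be collected from inside a Python step. The following is a minimal sketch using only the standard library; `describe` is a hypothetical helper name, and the result keys are illustrative rather than any Cromwell or Terra output format.

```python
import grp
import os
import pwd
import stat


def _name(lookup, ident):
    """Resolve a numeric ID to a name, falling back to the number itself."""
    try:
        return lookup(ident)[0]
    except KeyError:
        return str(ident)


def describe(path):
    """Report owner, group, mode, and writability of path for this process."""
    st = os.stat(path)  # follows symlinks, so a link reports its target
    return {
        "path": path,
        "mode": stat.filemode(st.st_mode),        # same form as ls -l
        "owner": _name(pwd.getpwuid, st.st_uid),
        "group": _name(grp.getgrgid, st.st_gid),
        "running_as": _name(pwd.getpwuid, os.getuid()),
        "writable": os.access(path, os.W_OK),     # checked for the real UID
    }
```

Printing `describe(p)` for each input in both environments would put the owner and the effective answer to "can I write this?" side by side.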

Ash O'Farrell
- August 11, 2021 21:05
I made some edits to my workflow to print this information:

https://dockstore.org/workflows/github.com/DataBiosphere/analysis_pipeline_WDL/vcf-to-gds-wdl:debug-permissions?tab=files

Two quick notes:

* The host system I run locally is monouser with root; the topmed user only exists in the container.

* My folder names locally start with "bark-bark" because I modified my Cromwell config to do that in order to ensure my more important Cromwell config edits were taking place, and because I like dogs.

When I run that locally, here's what I get:

Who am I?
topmed
What's in the group database?
root:x:0:
daemon:x:1:
bin:x:2:
sys:x:3:
adm:x:4:
tty:x:5:
disk:x:6:
lp:x:7:
mail:x:8:
news:x:9:
uucp:x:10:
man:x:12:
proxy:x:13:
kmem:x:15:
dialout:x:20:
fax:x:21:
voice:x:22:
cdrom:x:24:
floppy:x:25:
tape:x:26:
sudo:x:27:topmed,ubuntu,analyst
audio:x:29:pulse
dip:x:30:
www-data:x:33:
backup:x:34:
operator:x:37:
list:x:38:
irc:x:39:
src:x:40:
gnats:x:41:
shadow:x:42:
utmp:x:43:
video:x:44:
sasl:x:45:
plugdev:x:46:
staff:x:50:
games:x:60:
users:x:100:
nogroup:x:65534:
systemd-journal:x:101:
systemd-network:x:102:
systemd-resolve:x:103:
input:x:104:
messagebus:x:105:
netdev:x:106:
rtkit:x:107:
ssh:x:108:
pulse:x:109:
pulse-access:x:110:
avahi:x:111:
geoclue:x:112:
rdma:x:113:
topmed:x:2049:
What are the permissions of the input files?
-rw-r--r-- 2 topmed topmed 71K Aug 11 13:55 /bark-bark/vcftogds/b05343c6-90e6-4284-8198-9ed8f0dbf160/call-unique_variant_id/inputs/1221252165/1KG_phase3_subset_chr1.gds
-rw-r--r-- 2 topmed topmed 72K Aug 11 13:55 /bark-bark/vcftogds/b05343c6-90e6-4284-8198-9ed8f0dbf160/call-unique_variant_id/inputs/-575699194/1KG_phase3_subset_chr2.gds
-rw-r--r-- 2 topmed topmed 72K Aug 11 13:55 /bark-bark/vcftogds/b05343c6-90e6-4284-8198-9ed8f0dbf160/call-unique_variant_id/inputs/1922316743/1KG_phase3_subset_chr3.gds
-rw-r--r-- 2 topmed topmed 73K Aug 11 13:55 /bark-bark/vcftogds/b05343c6-90e6-4284-8198-9ed8f0dbf160/call-unique_variant_id/inputs/125365384/1KG_phase3_subset_chr20.gds
-rw-r--r-- 2 topmed topmed 58K Aug 11 13:55 /bark-bark/vcftogds/b05343c6-90e6-4284-8198-9ed8f0dbf160/call-unique_variant_id/inputs/-1671585975/1KG_phase3_subset_chrX.gds
What are the permissions in the execution directory?
total 36K
drwxr-xrwx 15 topmed topmed 480 Aug 11 13:55 .
drwxr-xrwx 5 topmed topmed 160 Aug 11 13:55 ..
lrwxr-xr-x 1 topmed topmed 124 Aug 11 13:55 1KG_phase3_subset_chr1.gds -> /bark-bark/vcftogds/b05343c6-90e6-4284-8198-9ed8f0dbf160/call-unique_variant_id/inputs/1221252165/1KG_phase3_subset_chr1.gds
lrwxr-xr-x 1 topmed topmed 124 Aug 11 13:55 1KG_phase3_subset_chr2.gds -> /bark-bark/vcftogds/b05343c6-90e6-4284-8198-9ed8f0dbf160/call-unique_variant_id/inputs/-575699194/1KG_phase3_subset_chr2.gds
lrwxr-xr-x 1 topmed topmed 124 Aug 11 13:55 1KG_phase3_subset_chr20.gds -> /bark-bark/vcftogds/b05343c6-90e6-4284-8198-9ed8f0dbf160/call-unique_variant_id/inputs/125365384/1KG_phase3_subset_chr20.gds
lrwxr-xr-x 1 topmed topmed 124 Aug 11 13:55 1KG_phase3_subset_chr3.gds -> /bark-bark/vcftogds/b05343c6-90e6-4284-8198-9ed8f0dbf160/call-unique_variant_id/inputs/1922316743/1KG_phase3_subset_chr3.gds
lrwxr-xr-x 1 topmed topmed 125 Aug 11 13:55 1KG_phase3_subset_chrX.gds -> /bark-bark/vcftogds/b05343c6-90e6-4284-8198-9ed8f0dbf160/call-unique_variant_id/inputs/-1671585975/1KG_phase3_subset_chrX.gds
-rw-r--r-- 1 topmed topmed 1.7K Aug 11 13:55 debug.txt
-rw-r--r-- 1 topmed topmed 6.6K Aug 11 13:55 script
-rw-r--r-- 1 topmed topmed 461 Aug 11 13:55 script.background
-rw-r--r-- 1 topmed topmed 499 Aug 11 13:55 script.submit
-rw-r--r-- 1 topmed topmed 2.9K Aug 11 13:55 stderr
-rw-r--r-- 1 topmed topmed 2.9K Aug 11 13:55 stderr.background
-rw-r--r-- 1 topmed topmed 1.7K Aug 11 13:55 stdout
-rw-r--r-- 1 topmed topmed 1.7K Aug 11 13:55 stdout.background


Ash O'Farrell
- August 11, 2021 21:10
Interestingly, when I run this modified workflow on Terra, whoami still resolves to the topmed user, and the topmed user still is in the sudo group.

Who am I?
topmed
What's in the group database?
root:x:0:
daemon:x:1:
bin:x:2:
sys:x:3:
adm:x:4:
tty:x:5:
disk:x:6:
lp:x:7:
mail:x:8:
news:x:9:
uucp:x:10:
man:x:12:
proxy:x:13:
kmem:x:15:
dialout:x:20:
fax:x:21:
voice:x:22:
cdrom:x:24:
floppy:x:25:
tape:x:26:
sudo:x:27:topmed,ubuntu,analyst
audio:x:29:pulse
dip:x:30:
www-data:x:33:
backup:x:34:
operator:x:37:
list:x:38:
irc:x:39:
src:x:40:
gnats:x:41:
shadow:x:42:
utmp:x:43:
video:x:44:
sasl:x:45:
plugdev:x:46:
staff:x:50:
games:x:60:
users:x:100:
nogroup:x:65534:
systemd-journal:x:101:
systemd-network:x:102:
systemd-resolve:x:103:
input:x:104:
messagebus:x:105:
netdev:x:106:
rtkit:x:107:
ssh:x:108:
pulse:x:109:
pulse-access:x:110:
avahi:x:111:
geoclue:x:112:
rdma:x:113:
topmed:x:2049:
What are the permissions of the input files?
-rw-r--r-- 1 root root 71K Aug 11 14:08 /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-0/cacheCopy/1KG_phase3_subset_chr1.gds
-rw-r--r-- 1 root root 72K Aug 11 14:08 /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-1/cacheCopy/1KG_phase3_subset_chr2.gds
-rw-r--r-- 1 root root 72K Aug 11 14:08 /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-2/cacheCopy/1KG_phase3_subset_chr3.gds
-rw-r--r-- 1 root root 75K Aug 11 14:08 /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-3/cacheCopy/1KG_phase3_subset_chr4.gds
-rw-r--r-- 1 root root 69K Aug 11 14:08 /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-4/cacheCopy/1KG_phase3_subset_chr5.gds
-rw-r--r-- 1 root root 77K Aug 11 14:08 /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-5/cacheCopy/1KG_phase3_subset_chr6.gds
-rw-r--r-- 1 root root 67K Aug 11 14:08 /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-6/cacheCopy/1KG_phase3_subset_chr7.gds
-rw-r--r-- 1 root root 70K Aug 11 14:08 /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-7/cacheCopy/1KG_phase3_subset_chr8.gds
-rw-r--r-- 1 root root 77K Aug 11 14:08 /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-8/cacheCopy/1KG_phase3_subset_chr9.gds
-rw-r--r-- 1 root root 75K Aug 11 14:08 /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-9/cacheCopy/1KG_phase3_subset_chr10.gds
-rw-r--r-- 1 root root 74K Aug 11 14:08 /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-10/cacheCopy/1KG_phase3_subset_chr11.gds
-rw-r--r-- 1 root root 73K Aug 11 14:08 /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-11/cacheCopy/1KG_phase3_subset_chr12.gds
-rw-r--r-- 1 root root 75K Aug 11 14:08 /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-12/cacheCopy/1KG_phase3_subset_chr13.gds
-rw-r--r-- 1 root root 69K Aug 11 14:08 /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-13/cacheCopy/1KG_phase3_subset_chr14.gds
-rw-r--r-- 1 root root 71K Aug 11 14:08 /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-14/cacheCopy/1KG_phase3_subset_chr15.gds
-rw-r--r-- 1 root root 71K Aug 11 14:08 /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-15/cacheCopy/1KG_phase3_subset_chr16.gds
-rw-r--r-- 1 root root 68K Aug 11 14:08 /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-16/cacheCopy/1KG_phase3_subset_chr17.gds
-rw-r--r-- 1 root root 72K Aug 11 14:08 /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-17/cacheCopy/1KG_phase3_subset_chr18.gds
-rw-r--r-- 1 root root 72K Aug 11 14:08 /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-18/cacheCopy/1KG_phase3_subset_chr19.gds
-rw-r--r-- 1 root root 73K Aug 11 14:08 /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-19/cacheCopy/1KG_phase3_subset_chr20.gds
-rw-r--r-- 1 root root 74K Aug 11 14:08 /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-20/cacheCopy/1KG_phase3_subset_chr21.gds
-rw-r--r-- 1 root root 69K Aug 11 14:08 /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-21/cacheCopy/1KG_phase3_subset_chr22.gds
-rw-r--r-- 1 root root 58K Aug 11 14:08 /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-22/cacheCopy/1KG_phase3_subset_chrX.gds
What are the permissions in the execution directory?
total 220K
drwxrwxrwx 5 root   root   4.0K Aug 11 14:08 .
drwxr-xr-x 1 root   root   4.0K Aug 11 14:08 ..
lrwxrwxrwx 1 topmed topmed  195 Aug 11 14:08 1KG_phase3_subset_chr1.gds -> /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-0/cacheCopy/1KG_phase3_subset_chr1.gds
lrwxrwxrwx 1 topmed topmed  196 Aug 11 14:08 1KG_phase3_subset_chr10.gds -> /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-9/cacheCopy/1KG_phase3_subset_chr10.gds
lrwxrwxrwx 1 topmed topmed  197 Aug 11 14:08 1KG_phase3_subset_chr11.gds -> /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-10/cacheCopy/1KG_phase3_subset_chr11.gds
lrwxrwxrwx 1 topmed topmed  197 Aug 11 14:08 1KG_phase3_subset_chr12.gds -> /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-11/cacheCopy/1KG_phase3_subset_chr12.gds
lrwxrwxrwx 1 topmed topmed  197 Aug 11 14:08 1KG_phase3_subset_chr13.gds -> /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-12/cacheCopy/1KG_phase3_subset_chr13.gds
lrwxrwxrwx 1 topmed topmed  197 Aug 11 14:08 1KG_phase3_subset_chr14.gds -> /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-13/cacheCopy/1KG_phase3_subset_chr14.gds
lrwxrwxrwx 1 topmed topmed  197 Aug 11 14:08 1KG_phase3_subset_chr15.gds -> /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-14/cacheCopy/1KG_phase3_subset_chr15.gds
lrwxrwxrwx 1 topmed topmed  197 Aug 11 14:08 1KG_phase3_subset_chr16.gds -> /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-15/cacheCopy/1KG_phase3_subset_chr16.gds
lrwxrwxrwx 1 topmed topmed  197 Aug 11 14:08 1KG_phase3_subset_chr17.gds -> /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-16/cacheCopy/1KG_phase3_subset_chr17.gds
lrwxrwxrwx 1 topmed topmed  197 Aug 11 14:08 1KG_phase3_subset_chr18.gds -> /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-17/cacheCopy/1KG_phase3_subset_chr18.gds
lrwxrwxrwx 1 topmed topmed  197 Aug 11 14:08 1KG_phase3_subset_chr19.gds -> /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-18/cacheCopy/1KG_phase3_subset_chr19.gds
lrwxrwxrwx 1 topmed topmed  195 Aug 11 14:08 1KG_phase3_subset_chr2.gds -> /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-1/cacheCopy/1KG_phase3_subset_chr2.gds
lrwxrwxrwx 1 topmed topmed  197 Aug 11 14:08 1KG_phase3_subset_chr20.gds -> /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-19/cacheCopy/1KG_phase3_subset_chr20.gds
lrwxrwxrwx 1 topmed topmed  197 Aug 11 14:08 1KG_phase3_subset_chr21.gds -> /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-20/cacheCopy/1KG_phase3_subset_chr21.gds
lrwxrwxrwx 1 topmed topmed  197 Aug 11 14:08 1KG_phase3_subset_chr22.gds -> /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-21/cacheCopy/1KG_phase3_subset_chr22.gds
lrwxrwxrwx 1 topmed topmed  195 Aug 11 14:08 1KG_phase3_subset_chr3.gds -> /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-2/cacheCopy/1KG_phase3_subset_chr3.gds
lrwxrwxrwx 1 topmed topmed  195 Aug 11 14:08 1KG_phase3_subset_chr4.gds -> /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-3/cacheCopy/1KG_phase3_subset_chr4.gds
lrwxrwxrwx 1 topmed topmed  195 Aug 11 14:08 1KG_phase3_subset_chr5.gds -> /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-4/cacheCopy/1KG_phase3_subset_chr5.gds
lrwxrwxrwx 1 topmed topmed  195 Aug 11 14:08 1KG_phase3_subset_chr6.gds -> /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-5/cacheCopy/1KG_phase3_subset_chr6.gds
lrwxrwxrwx 1 topmed topmed  195 Aug 11 14:08 1KG_phase3_subset_chr7.gds -> /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-6/cacheCopy/1KG_phase3_subset_chr7.gds
lrwxrwxrwx 1 topmed topmed  195 Aug 11 14:08 1KG_phase3_subset_chr8.gds -> /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-7/cacheCopy/1KG_phase3_subset_chr8.gds
lrwxrwxrwx 1 topmed topmed  195 Aug 11 14:08 1KG_phase3_subset_chr9.gds -> /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-8/cacheCopy/1KG_phase3_subset_chr9.gds
lrwxrwxrwx 1 topmed topmed  196 Aug 11 14:08 1KG_phase3_subset_chrX.gds -> /cromwell_root/fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e/24f12529-2aca-44ad-b4dd-3aa657889011/vcftogds/6dfc4817-d593-40a8-990c-c19ab88c48f8/call-vcf2gds/shard-22/cacheCopy/1KG_phase3_subset_chrX.gds
-rw-r--r-- 1 topmed topmed 6.2K Aug 11 14:08 debug.txt
drwxr-xr-x 3 root   root   4.0K Aug 11 14:08 fc-b2a9460a-74a9-4fae-8aaf-c79fc933d21e
-rw-r--r-- 1 root   root   2.2K Aug 11 14:08 gcs_delocalization.sh
-rw-r--r-- 1 root   root    22K Aug 11 14:08 gcs_localization.sh
-rw-r--r-- 1 root   root    14K Aug 11 14:08 gcs_transfer.sh
drwxrwxrwx 2 root   root    16K Aug 11 14:04 lost+found
-rw-r--r-- 1 root   root    13K Aug 11 14:08 script
-rw-r--r-- 1 topmed topmed  17K Aug 11 14:08 stderr
-rw-r--r-- 1 topmed topmed 6.2K Aug 11 14:08 stdout
drwxrwxrwx 2 topmed topmed 4.0K Aug 11 14:08 tmp.789f3b6c

Ash O'Farrell
- August 11, 2021 21:30
I might be barking up the wrong tree here, but it seems that when Terra localizes files, those files are owned by root instead of the topmed user (which is the owner locally and, presumably, on Seven Bridges when running the CWL counterpart of this). That would explain why earlier you were able to modify those files in the Python-based image; that image runs as root and root owns those files. But in any Docker image that doesn't run as root, it looks like modifying input files isn't possible on Terra?

Assuming that's correct, it's not clear to me why local-Cromwell would localize files as a different owner than Terra-Cromwell. I ran this pipeline twice locally, once pointing to local inputs, and once pointing to the same gs URIs used by Terra (the Dockstore CLI has a plugin that allows for this on local runs). In both instances the results were the same.
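The reasoning above follows the classic Unix permission check, which can be sketched as a small pure function. This is a simplification that ignores ACLs and capabilities. The group IDs 2049 (topmed) and 27 (sudo) come from the group database printed earlier; the topmed user ID is not shown in the thread, so `TOPMED_UID` is an assumed value for illustration.

```python
import stat

TOPMED_UID = 2049          # assumption: the thread only shows the group ID
TOPMED_GROUPS = {2049, 27}  # topmed and sudo, from the group database above


def can_write(mode, owner_uid, owner_gid, uid, gids):
    """Classic Unix write check: exactly one permission class applies."""
    if uid == 0:
        return True                        # root bypasses the mode bits
    if uid == owner_uid:
        return bool(mode & stat.S_IWUSR)   # owner bits only
    if owner_gid in gids:
        return bool(mode & stat.S_IWGRP)   # group bits only
    return bool(mode & stat.S_IWOTH)       # everyone else


if __name__ == "__main__":
    mode = 0o644  # -rw-r--r--, as in both listings
    print(can_write(mode, 0, 0, TOPMED_UID, TOPMED_GROUPS))                    # root-owned input
    print(can_write(mode, TOPMED_UID, 2049, TOPMED_UID, TOPMED_GROUPS))        # topmed-owned input
```

Under this model, membership in the sudo group does not by itself change the outcome; it only matters if a command is actually run through sudo.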

Jason Cerrato
- August 12, 2021 18:58
Hi Aisling,

Thanks for following up with those details. We're running a couple more tests and we'll get back to you as soon as we can with more information.

Kind regards,

Jason

Jason Cerrato
- August 16, 2021 14:00
Hi Aisling,

We believe your assessment is correct. You are able to edit the input files when the workflow is run locally because, in a local run, they are localized under the topmed user rather than root. In a Terra workflow, they are localized as the root user.

Since you mentioned that the topmed user has sudo access, we were wondering if you would be able to run a sudo command to change file ownership to topmed in your workflow. Once the permissions are changed, the python script should be able to run normally on the files, and effectively run the same way it does locally. If you decide to give this a test, let us know.

I'll look to get you an answer about the difference in behavior between local Cromwell and Terra Cromwell.

Kind regards,

Jason

Jason Cerrato
- August 18, 2021 18:25
Hi Aisling,

The team informed me of a workaround you can use for your situation. In your WDL, you can run the command sudo su - root prior to the commands that modify your input files. This should allow you to make your necessary changes without more dramatically changing your code.

Kind regards,

Jason

Ash O'Farrell
- September 01, 2021 01:41
Am I understanding the suggestion correctly? Is it this?
```
sudo su - root
Rscript /usr/local/analysis_pipeline/R/unique_variant_ids.R unique_variant_ids.config
```
This results in this appearing in the logs:
```
mesg: ttyname failed: Inappropriate ioctl for device
```
...and the task failing with the same error.
Jason Cerrato
- September 01, 2021 15:28
Hi Aisling,

Hmm, do you know at what point in the script it's failing with this message? Do you get the same result if you run a command to change the localized files to be owned by topmed prior to running your script?

Kind regards,

Jason

Ash O'Farrell
- September 01, 2021 19:40
The R script I am running is using openfn to open a gds file. Normally this function opens in read-only mode, but the R script specifically disables this, thereby attempting to open the file in a way that grants write permissions. Terra, which localizes the files in the way it does, blocks this as the files do not have write access with regard to the topmed user. I cannot edit the Rscript nor the docker image, as the whole point of this WDL is to be 1:1 to the CWL version, which uses a particular docker image with the Rscript inside of it. However, as I mentioned before, this *does* work on local Cromwell, so clearly something is being handled differently between platforms even though the user is topmed in both cases and the files are read-only with regard to topmed in both cases.

As far as I'm aware there is no way for me to run a command to change the ownership of the localized files on Terra. Everything I've tried either gives the ttyname error or operation not permitted.

Jason Cerrato
- September 08, 2021 14:02
Hi Aisling,

A member of our team was able to replicate the error. It seems sudo su - works if you run the command interactively (thread), but runs into trouble when run as a command in a Docker container in non-interactive mode.

Instead of trying to switch to root, you can change the permissions of your input files to be accessible by anyone using chmod. So it would go into your command like this:
```
set -eux -o pipefail
        
sudo chmod 777 ~{vcf}

echo "Generating config file"  
```
This worked on their end for changing the input vcf permissions. set -eux -o pipefail isn't required, but most people use it so that if any command in the command block fails, the task stops and errors out instead of just moving on to the next line.

Would this type of solution work for your needs?

Kind regards,

Jason
Ash O'Farrell
- September 17, 2021 20:20
I have already re-written the task in question to avoid this by simply duplicating the input and running the Rscript on the duplicate, but my new WDL (exact same Docker image so permissions still apply) is not so easily fixed due to how I glob the output, so I'm revisiting this. I've tried chmod 777 before and it didn't work, but this time I tried it exactly as written here to include the pipefail. Nevertheless it still isn't working.

I am wondering if it has to do with how the files are getting modified. In the example I gave earlier, it was an Rscript modifying the files. In my new code, I am using os.rename() via inline Python. As before, this works perfectly fine locally but gives a permission denied error on Terra, even though both cases run as the topmed user. Is it possible that execution of scripts has different permissions?

Here is my current task. The error is OSError: [Errno 13] Permission denied. The relevant stdout is in the screenshot below the task screenshot. Note that the permissions do seem to have been changed after the chmod, but Terra seems to be ignoring that change somehow. I do know the problem is not that Terra cannot edit *any* files, as it can modify anything that isn't an input just fine. Duplicates of inputs are no problem. But I need to glob the output as File renamed_variants = glob("*.gds")[0], I cannot set a non-input variable as an output in Terra, and these permission errors would likely apply to trying to delete the original input too, so it seems this bug(?) is a hard blocker.
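One general point that may be relevant here: on POSIX systems, renaming a file modifies the directory entries, so os.rename() needs write and search permission on the containing directory, and the mode of the file itself does not decide the outcome. Whether that is what happens in this task is not established at this point in the thread. The sketch below uses a hypothetical helper, `rename_blockers`, to list the directories that would stand in the way.

```python
import os


def rename_blockers(src, dst):
    """Return the parent directories that would block os.rename(src, dst).

    Renaming needs write and search (execute) permission on the directory
    holding src and on the directory that will hold dst. The permission
    bits on the file itself are not part of this check.
    """
    parents = {os.path.dirname(os.path.abspath(p)) for p in (src, dst)}
    return sorted(d for d in parents if not os.access(d, os.W_OK | os.X_OK))
```

An empty list means the directories themselves are not the obstacle; a non-empty list names the directories whose permissions would need to change, which a chmod on the file alone would not affect.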

Jason Cerrato
- September 20, 2021 20:38
Hi Aisling,

Thanks for the update. We'll take a look and get back to you as soon as we can.

Kind regards,

Jason

Jason Cerrato
- September 23, 2021 16:46
Hi Aisling,

One of our engineers did a test with a modified version of your workflow. They added a ls -lha . step, which resulted in:
```
check cwd permissions
total 72K
drwxrwxrwx 5 root root 4.0K Sep 22 13:07 .
drwxr-xr-x 1 root root 4.0K Sep 22 13:07 ..
-rw-r--r-- 1 topmed topmed 217 Sep 22 13:07 debug-terra.txt
drwxr-xr-x 3 root root 4.0K Sep 22 13:07 fc-8bf3be10-9439-4686-b9c4-53b7ef59c956
```
This shows /cromwell_root/ is owned by root and the permissions are automatically set for it such that anyone can read/write/execute. However, for the input files that are nested in the localized directory (seen above as fc-8bf3be10-9439-4686-b9c4-53b7ef59c956), these are owned by root and only root can write to them. When they tried creating a file in that dir using python they got the same os permissions error you did.
```
IOError: [Errno 13] Permission denied: './fc-8bf3be10-9439-4686-b9c4-53b7ef59c956/test.txt'
```
So they added find . -type d -exec sudo chmod -R 777 {} + near the beginning of their command block. This command looks for any directories in the current directory (cromwell_root) and changes their permissions so that anyone can read and write. (This also leaves files in the cromwell_root alone, so stderr and stdout files still keep their original permissions).
```
check cwd permissions again
total 72K
drwxrwxrwx 5 root root 4.0K Sep 22 13:16 .
drwxr-xr-x 1 root root 4.0K Sep 22 13:16 ..
-rw-r--r-- 1 topmed topmed 217 Sep 22 13:16 debug-terra.txt
drwxrwxrwx 3 root root 4.0K Sep 22 13:16 fc-8bf3be10-9439-4686-b9c4-53b7ef59c956
```
After that, they didn’t have any issues using python to create files within that directory. This might be a good workaround for your case. Can you let us know if this works for you?

Kind regards,

Jason
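For readers who prefer to do the directory fix from Python, here is a rough analogue of the idea above. It is not the same as the quoted find command: it changes only directories (no recursive -R on their contents), does not touch the starting directory itself, and does not use sudo, so it assumes the process is already allowed to chmod those directories (as their owner or as root). `open_up_directories` is a hypothetical name.

```python
import os


def open_up_directories(root, mode=0o777):
    """Set mode on every directory below root, leaving files untouched.

    Returns the list of directories that were changed.
    """
    changed = []
    for dirpath, dirnames, _filenames in os.walk(root):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # do not chmod through symlinks
            # os.walk lists a directory before descending into it, so it is
            # opened up before its own contents are visited.
            os.chmod(path, mode)
            changed.append(path)
    return changed
```

Calling `open_up_directories(".")` near the start of a task would mirror the intent of the workaround, but on a root-owned tree it will raise PermissionError unless the task itself runs with sufficient privileges.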
