
Task run request has exceeded the maximum PAPI request size

Comments

5 comments

  • Sushma Chaluvadi

    Hello Walt,

    Does this task perhaps require a localization step in which it has to read in a long list of files?

  • Walt Shands

    Yes, the error shows up between two tasks. I received a suggestion to localize the files with gsutil cp, but gsutil cp needs a Google bucket path like 'gs://', which I wouldn't have, and it also wouldn't work when running the workflow locally with Cromwell. So maybe the best thing to do is to tar-gzip the output of the previous task, use the tarball as the input to the following task, and un-tar it there. Would that get around the PAPI error? The input would then be a single file instead of thousands, and it would be smaller, but I don't really understand whether the PAPI error is about the number of files or their total size; large CRAM files seem to be localized with no problem.
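
    Something like this is what I have in mind, as a minimal WDL sketch (the task names are placeholders and the real tool commands are elided):

    ```wdl
    version 1.0

    task produce_shards {
      command <<<
        mkdir shards
        # ...the real tool writes its many output files into shards/ here...
        # bundle everything so the task delocalizes one file instead of thousands
        tar -czf shards.tar.gz -C shards .
      >>>
      output {
        File shard_bundle = "shards.tar.gz"
      }
    }

    task consume_shards {
      input {
        File shard_bundle
      }
      command <<<
        mkdir shards
        tar -xzf ~{shard_bundle} -C shards
        # ...run the downstream tool over shards/* here...
      >>>
    }
    ```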

  • Sushma Chaluvadi

    Hi Walt,

    I think the error is saying that the actual command is too long, since it looks like the input is thousands of files/filenames. This assumes that all of these inputs are read in at a single time, perhaps as an array, rather than one at a time.

    You mentioned that large CRAM files are localized with no problem, but are there as many CRAM files (on the order of thousands) as there are inputs to the failing task? My guess is that if you modified the workflow so that only a handful of inputs were passed instead of the thousands, it would work; or, as you suggested, you could create a single tar.gz.
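
    To make that concrete, the failing shape is presumably something like the sketch below (hypothetical task and variable names): each element of an Array[File] input contributes its own localization entry to the PAPI request, so thousands of elements make the request itself very large.

    ```wdl
    version 1.0

    task downstream {
      input {
        # on the order of 1000s of elements: each one adds a localization
        # entry to the PAPI request, which is what exceeds the size limit
        Array[File] shards
      }
      command <<<
        my_tool --inputs ~{sep=" " shards}
      >>>
    }
    ```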

    Sushma

  • Walt Shands

    Thanks. There is an Array[File] input that contains around 1800 files. I can create a tar.gz file as the input instead, so there is only one file input. I hope that might solve the problem.

  • Sushma Chaluvadi

    Hi Walt,

    Another option is to use a file of file names (FoFn). All this means is writing the paths of all the thousands of inputs to a text file, which is output from the first task and passed as input to the second task; the second task then reads the file paths from the FoFn. I'm not sure which would be easier in your case, but it's another option!
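
    A minimal sketch of the FoFn idea (again with hypothetical names; note this assumes the second task can actually reach the listed paths at runtime, e.g. via a shared filesystem when running Cromwell locally, or by staging the files itself inside the command on the cloud):

    ```wdl
    version 1.0

    task list_shards {
      command <<<
        mkdir shards
        # ...the real tool writes its many output files into shards/ here...
        # record the paths instead of declaring thousands of File outputs
        find "$PWD"/shards -type f > shard_paths.txt
      >>>
      output {
        File shard_fofn = "shard_paths.txt"
      }
    }

    task use_shards {
      input {
        File shard_fofn
      }
      command <<<
        # read each path from the FoFn rather than taking an Array[File] input
        while read -r shard; do
          echo "processing ${shard}"  # ...real per-file work goes here...
        done < ~{shard_fofn}
      >>>
    }
    ```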
