ContinueWhilePossible not default?
What is "Will not start job CollectCoverageGCNV.CollectCounts:1:1 when workflow state is 'WorkflowExecutionFailingState' and when 'restarting'=false" ? I've never seen this one before. It sounds like one of the samples failed and the rest won't launch, but it's impossible to tell which one was a real failure amid thousands of "won't starts".
-
Hey Laura,
Our engineers have identified this to be the root cause of the submission failure:
cromwell.engine.workflow.lifecycle.execution.job.preparation.JobPreparationActor$$anonfun$1$$anon$1: Call input and runtime attributes evaluation failed for CollectCounts: : [Attempted 1 time(s)] - FileNotFoundException: gs://fc-eb97544d-4f02-4356-8ed0-45493163bbf6/CMG_Broad_VCGS_OrphanDisease_WES_Closed_Nov2019/RP-1307/Exome/VCGS_FAM149_464_D1/v2/VCGS_FAM149_464_D1.cram File not found gs://fc-eb97544d-4f02-4356-8ed0-45493163bbf6/CMG_Broad_VCGS_OrphanDisease_WES_Closed_Nov2019/RP-1307/Exome/VCGS_FAM149_464_D1/v2/VCGS_FAM149_464_D1.cram
    at cromwell.engine.workflow.lifecycle.execution.job.preparation.JobPreparationActor$$anonfun$1.applyOrElse(JobPreparationActor.scala:81)
    at cromwell.engine.workflow.lifecycle.execution.job.preparation.JobPreparationActor$$anonfun$1.applyOrElse(JobPreparationActor.scala:74)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:38)
    at akka.actor.FSM.processEvent(FSM.scala:707)
    at akka.actor.FSM.processEvent$(FSM.scala:704)
    at cromwell.engine.workflow.lifecycle.execution.job.preparation.JobPreparationActor.processEvent(JobPreparationActor.scala:46)
    at akka.actor.FSM.akka$actor$FSM$$processMsg(FSM.scala:701)
    at akka.actor.FSM$$anonfun$receive$1.applyOrElse(FSM.scala:695)
    at akka.actor.Actor.aroundReceive(Actor.scala:539)
    at akka.actor.Actor.aroundReceive$(Actor.scala:537)
    at cromwell.engine.workflow.lifecycle.execution.job.preparation.JobPreparationActor.aroundReceive(JobPreparationActor.scala:46)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:614)
    at akka.actor.ActorCell.invoke(ActorCell.scala:583)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:268)
    at akka.dispatch.Mailbox.run(Mailbox.scala:229)
    at akka.dispatch.Mailbox.exec(Mailbox.scala:241)
    at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
The most relevant portion seems to be this:
FileNotFoundException: gs://fc-eb97544d-4f02-4356-8ed0-45493163bbf6/CMG_Broad_VCGS_OrphanDisease_WES_Closed_Nov2019/RP-1307/Exome/VCGS_FAM149_464_D1/v2/VCGS_FAM149_464_D1.cram File not found gs://fc-eb97544d-4f02-4356-8ed0-45493163bbf6/CMG_Broad_VCGS_OrphanDisease_WES_Closed_Nov2019/RP-1307/Exome/VCGS_FAM149_464_D1/v2/VCGS_FAM149_464_D1.cram
The error messages you see in Job Manager mean that one of the shards of the scatter failed while Cromwell was still submitting the other shards. Once one shard had failed, there was no longer any reason to submit the remaining ones, so Cromwell reported this message for each of them instead of launching them.
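On the question in the title: the behavior described above matches Cromwell's `workflow_failure_mode` workflow option, which takes the values `NoNewCalls` (the default unless the server configuration overrides it) and `ContinueWhilePossible`. Below is a minimal sketch in Python that builds a workflow options document with that key set. The function names are hypothetical, and it assumes you are able to supply a workflow options JSON with your submission; whether and how Terra exposes this option is not confirmed in this thread.

```python
import json


def build_workflow_options(continue_while_possible=True):
    """Build a workflow options dict (hypothetical helper, for illustration).

    "workflow_failure_mode" is a Cromwell workflow option:
      - "NoNewCalls": after a failure, no new calls are started.
      - "ContinueWhilePossible": calls that do not depend on the failed
        call are still started.
    """
    mode = "ContinueWhilePossible" if continue_while_possible else "NoNewCalls"
    return {"workflow_failure_mode": mode}


def render_options(continue_while_possible=True):
    """Serialize the options as JSON text, ready to save as an options file."""
    return json.dumps(build_workflow_options(continue_while_possible), indent=2)


if __name__ == "__main__":
    print(render_options())
```

With `ContinueWhilePossible`, the shard with the missing file would still fail, but the independent shards would be allowed to launch.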
We recommend checking on that missing file and re-running if things look good.
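If the real failure is buried among thousands of "Will not start job" notices, a small filter can help surface it. The following is a rough sketch in Python, assuming you have already copied the failure messages into a list of strings (how you export them is up to you). The function names and the example bucket path are hypothetical; only the "Will not start job" prefix and the `FileNotFoundException` text come from the messages quoted in this thread.

```python
import re

# Prefix of the notice Cromwell reports for shards it declined to launch,
# as quoted in this thread.
SKIPPED_PREFIX = "Will not start job"

# Matches a gs:// path up to the next whitespace character.
GCS_PATH = re.compile(r"gs://\S+")


def real_failures(messages):
    """Return the messages that are not 'Will not start job' notices."""
    return [m for m in messages if not m.strip().startswith(SKIPPED_PREFIX)]


def missing_files(messages):
    """Collect the distinct gs:// paths named in FileNotFoundException messages."""
    paths = []
    for message in real_failures(messages):
        if "FileNotFoundException" in message:
            for path in GCS_PATH.findall(message):
                if path not in paths:
                    paths.append(path)
    return paths


if __name__ == "__main__":
    # Hypothetical messages for illustration only.
    example = [
        "Will not start job CollectCoverageGCNV.CollectCounts:1:1 when workflow "
        "state is 'WorkflowExecutionFailingState' and when 'restarting'=false",
        "Call input and runtime attributes evaluation failed for CollectCounts: "
        "FileNotFoundException: gs://example-bucket/sample.cram File not found "
        "gs://example-bucket/sample.cram",
    ]
    for path in missing_files(example):
        print(path)
```

Each path it prints is a file to check before re-running.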
Kind regards,
Jason