Pysam and SSL certificate problem

Hi All, 
I've been trying to get this to work on Terra and I'm a little frustrated because, it hasn't been easy.
I have a python program that uses pysam which is a Samtools wrapper. Samtools helps read, write and manipulate bams and sam files. I need to use pysam to read RNA seq reads aligned to given positions in various bam files and process them. Therefore, I wanted to avoid locating the bam files to run the task. Instead, I created a docker based on google/cloud-sdk, installed samtools and then pysam and in the command section of the task in the WDL I used:

export GCS_OAUTH_TOKEN=$(gcloud auth application-default print-access-token)
samtools view -h $bam

This happily works. I can see the header of the bam file and the bam file hasn't been localize, which saves time and disk space.
However, attempting to use Pysam to read the same BAM file results in an error related to SSL certificate verification (Problem with the SSL CA cert).:  

[E::easy_errno] Libcurl reported error 77 (Problem with the SSL CA cert (path? access rights?))
[E::hts_open_format] Failed to open file "gs://fc-secure-29fcb143-0827-4430-b92f-3fc3cdc76cb7/Samples/SRR7300571/Aligned.sortedByCoord.out.bam" : Input/output error
gs://fc-secure-29fcb143-0827-4430-b92f-3fc3cdc76cb7/Samples/SRR7300571/Aligned.sortedByCoord.out.bam
Traceback (most recent call last):
  File "/Scripts/Read_bam_file_pysam.py", line 37, in <module>
    main(sys.argv[1:])
  File "/Scripts/Read_bam_file_pysam.py", line 30, in main
    bam_file_ref = pysam.AlignmentFile(bam_file, "rb")
  File "pysam/libcalignmentfile.pyx", line 748, in pysam.libcalignmentfile.AlignmentFile.__cinit__
  File "pysam/libcalignmentfile.pyx", line 947, in pysam.libcalignmentfile.AlignmentFile._open
OSError: [Errno 5] could not open alignment file `gs://fc-secure-29fcb143-0827-4430-b92f-3fc3cdc76cb7/Samples/SRR7300571/Aligned.sortedByCoord.out.bam`: Input/output error

So, the samtools view -H over the bam file: gs://fc-secure-29fcb143-0827-4430-b92f-3fc3cdc76cb7/Samples/SRR7300571/Aligned.sortedByCoord.out.bam works, 
but the same using the pysam throws the error : Problem with the SSL CA cert (path? access rights?)

Does anyone knows how to solve this problem, in order to be able to use pysam seamlessly as samtools?
Thank you in advance, 
MaVi

 

Comments

16 comments

  • Comment author
    Samantha (she/her)

    Hi MaVi,

    Thank you for writing in about this issue. Can you share the workspace where you are seeing this issue with Terra Support by clicking the Share button in your workspace? The Share option is in the three-dots menu at the top-right.

    1. Toggle the "Share with support" button to "Yes"
    2. Click Save

    Please provide us with

    1. A link to your workspace
    2. The relevant submission ID
    3. The relevant workflow ID

    If possible, can you also share the contents of your Dockerfile?

    We’ll be happy to take a closer look as soon as we can!

    Kind regards,

    Samantha

    0
  • Comment author
    Maria Virginia Ruiz Cuevas

    Hi Samantha, 


    Thank you so much for your reply!

    I have shared the workspace: WDLs_Tests

    Here the other information:
    1. https://app.terra.bio/#workspaces/broad-firecloud-cptac/WDLs_Tests
    2. Submission ID: d0cd7a2f-6187-4a39-8e92-af40e450b83e
    3. Workflow ID: 29fcb143-0827-4430-b92f-3fc3cdc76cb7 (Read_Bam_File)

    The Dockerfile is:

    FROM google/cloud-sdk

    RUN apt-get update -y && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y\
    tree\
    build-essential\
    python3.9\
    python3.9-distutils\
    python3.9-dev\
    python3-pip\
    bedtools\
    wget\
    r-base\
    zlib1g-dev

    RUN set -ex \
    && apt-get update && apt-get install -y --no-install -recommends ca-certificates\
    autoconf\
    automake\
    gcc\
    perl\
    bzip2\
    zlib1g-dev\
    libbz2-dev\
    liblzma-dev\
    libcurl4-gnutls-dev\
    && rm -rf /var/lib/apt/lists/*\
    && curl -L -O https://github.com/samtools/samtools/releases/download/1.10/samtools-1.10.tar.bz2\
    && tar -xjvf samtools-1.10.tar.bz2\
    && rm samtools-1.10.tar.bz2\
    && cd samtools-1.10\
    && ./configure --enable-libcurl\
    && make all all-htslib\
    && make install install-htslib\
    && cd..\
    && rm -r samtools-1.10

    RUN ln -s /usr/bin/python3.9 /usr/local/bin/python
    ENV HTSLIB_LIBRARY_DIR=/usr/local/lib
    ENV HTSLIB_INCLUDE_DIR=/usr/local/include
    ENV HTSLIB_CONFIGURE_OPTIONS="--enable-gcs"
    RUN python3.9 -m pip install pysam

    WORKDIR .

    And I have a second one that calls the first one and copies the python script:

    FROM proteogenomics/test_docker_main:latest

    COPY . .

    Once again, thank you so much !!!

    MaVi

    0
  • Comment author
    Samantha (she/her)

    Hi MaVi,

    Are you able to run the that pysam python script in your local terminal? In other words, is pysam able to access that file when you run it outside of Terra?

    Best,

    Samantha

    0
  • Comment author
    Maria Virginia Ruiz Cuevas

    Hi Samantha, 
    Thank you for taking the time to investigate this issue. 

    Outside Terra, yes! Pysam is able to access the file. 

    (base) -bash:login02:/prot/proteomics/LabMembers/MaVi/Projects/CLL/BamQuery/CLL_samples_BAM_directories_mTECs_and_DCS_25 1011 $ python /prot/proteomics/LabMembers/MaVi/Coding/MaVi_Broad/scripts/WDLs_Tests_workspaces/Read_bam_files/Scripts/Read_bam_file_pysam.py /prot/proteomics/LabMembers/MaVi/Tests/test_BamQuery_modular/Unit_test_modules/testing_reading_files/BAM_directories_3.tsv
    Reading header of bam file :  gs://fc-secure-29fcb143-0827-4430-b92f-3fc3cdc76cb7/Samples/star/SRR7300571/Aligned.sortedByCoord.out.bam
    ('chr1', 'chr2', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chr10', 'chr11', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17', 'chr18', 'chr19', 'chr20', 'chr21', 'chr22', 'chrX', 'chrY', 'chrM', 'GL000008.2', 'GL000009.2', 'GL000194.1', 'GL000195.1', 'GL000205.2', 'GL000208.1', 'GL000213.1', 'GL000214.1', 'GL000216.2', 'GL000218.1', 'GL000219.1', 'GL000220.1', 'GL000221.1', 'GL000224.1', 'GL000225.1', 'GL000226.1', 'KI270302.1', 'KI270303.1', 'KI270304.1', 'KI270305.1', 'KI270310.1', 'KI270311.1', 'KI270312.1', 'KI270315.1', 'KI270316.1', 'KI270317.1', 'KI270320.1', 'KI270322.1', 'KI270329.1', 'KI270330.1', 'KI270333.1', 'KI270334.1', 'KI270335.1', 'KI270336.1', 'KI270337.1', 'KI270338.1', 'KI270340.1', 'KI270362.1', 'KI270363.1', 'KI270364.1', 'KI270366.1', 'KI270371.1', 'KI270372.1', 'KI270373.1', 'KI270374.1', 'KI270375.1', 'KI270376.1', 'KI270378.1', 'KI270379.1', 'KI270381.1', 'KI270382.1', 'KI270383.1', 'KI270384.1', 'KI270385.1', 'KI270386.1', 'KI270387.1', 'KI270388.1', 'KI270389.1', 'KI270390.1', 'KI270391.1', 'KI270392.1', 'KI270393.1', 'KI270394.1', 'KI270395.1', 'KI270396.1', 'KI270411.1', 'KI270412.1', 'KI270414.1', 'KI270417.1', 'KI270418.1', 'KI270419.1', 'KI270420.1', 'KI270422.1', 'KI270423.1', 'KI270424.1', 'KI270425.1', 'KI270429.1', 'KI270435.1', 'KI270438.1', 'KI270442.1', 'KI270448.1', 'KI270465.1', 'KI270466.1', 'KI270467.1', 'KI270468.1', 'KI270507.1', 'KI270508.1', 'KI270509.1', 'KI270510.1', 'KI270511.1', 'KI270512.1', 'KI270515.1', 'KI270516.1', 'KI270517.1', 'KI270518.1', 'KI270519.1', 'KI270521.1', 'KI270522.1', 'KI270528.1', 'KI270529.1', 'KI270530.1', 'KI270538.1', 'KI270539.1', 'KI270544.1', 'KI270548.1', 'KI270579.1', 'KI270580.1', 'KI270581.1', 'KI270582.1', 'KI270583.1', 'KI270584.1', 'KI270587.1', 'KI270588.1', 'KI270589.1', 'KI270590.1', 'KI270591.1', 'KI270593.1', 'KI270706.1', 'KI270707.1', 'KI270708.1', 'KI270709.1', 'KI270710.1', 'KI270711.1', 'KI270712.1', 'KI270713.1', 'KI270714.1', 'KI270715.1', 'KI270716.1', 'KI270717.1', 'KI270718.1', 'KI270719.1', 'KI270720.1', 'KI270721.1', 'KI270722.1', 'KI270723.1', 'KI270724.1', 'KI270725.1', 'KI270726.1', 'KI270727.1', 'KI270728.1', 'KI270729.1', 'KI270730.1', 'KI270731.1', 'KI270732.1', 'KI270733.1', 'KI270734.1', 'KI270735.1', 'KI270736.1', 'KI270737.1', 'KI270738.1', 'KI270739.1', 'KI270740.1', 'KI270741.1', 'KI270742.1', 'KI270743.1', 'KI270744.1', 'KI270745.1', 'KI270746.1', 'KI270747.1', 'KI270748.1', 'KI270749.1', 'KI270750.1', 'KI270751.1', 'KI270752.1', 'KI270753.1', 'KI270754.1', 'KI270755.1', 'KI270756.1', 'KI270757.1')
    Reading header of bam file :  gs://fc-secure-29fcb143-0827-4430-b92f-3fc3cdc76cb7/Samples/star/SRR7300573/Aligned.sortedByCoord.out.bam
    ('chr1', 'chr2', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chr10', 'chr11', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17', 'chr18', 'chr19', 'chr20', 'chr21', 'chr22', 'chrX', 'chrY', 'chrM', 'GL000008.2', 'GL000009.2', 'GL000194.1', 'GL000195.1', 'GL000205.2', 'GL000208.1', 'GL000213.1', 'GL000214.1', 'GL000216.2', 'GL000218.1', 'GL000219.1', 'GL000220.1', 'GL000221.1', 'GL000224.1', 'GL000225.1', 'GL000226.1', 'KI270302.1', 'KI270303.1', 'KI270304.1', 'KI270305.1', 'KI270310.1', 'KI270311.1', 'KI270312.1', 'KI270315.1', 'KI270316.1', 'KI270317.1', 'KI270320.1', 'KI270322.1', 'KI270329.1', 'KI270330.1', 'KI270333.1', 'KI270334.1', 'KI270335.1', 'KI270336.1', 'KI270337.1', 'KI270338.1', 'KI270340.1', 'KI270362.1', 'KI270363.1', 'KI270364.1', 'KI270366.1', 'KI270371.1', 'KI270372.1', 'KI270373.1', 'KI270374.1', 'KI270375.1', 'KI270376.1', 'KI270378.1', 'KI270379.1', 'KI270381.1', 'KI270382.1', 'KI270383.1', 'KI270384.1', 'KI270385.1', 'KI270386.1', 'KI270387.1', 'KI270388.1', 'KI270389.1', 'KI270390.1', 'KI270391.1', 'KI270392.1', 'KI270393.1', 'KI270394.1', 'KI270395.1', 'KI270396.1', 'KI270411.1', 'KI270412.1', 'KI270414.1', 'KI270417.1', 'KI270418.1', 'KI270419.1', 'KI270420.1', 'KI270422.1', 'KI270423.1', 'KI270424.1', 'KI270425.1', 'KI270429.1', 'KI270435.1', 'KI270438.1', 'KI270442.1', 'KI270448.1', 'KI270465.1', 'KI270466.1', 'KI270467.1', 'KI270468.1', 'KI270507.1', 'KI270508.1', 'KI270509.1', 'KI270510.1', 'KI270511.1', 'KI270512.1', 'KI270515.1', 'KI270516.1', 'KI270517.1', 'KI270518.1', 'KI270519.1', 'KI270521.1', 'KI270522.1', 'KI270528.1', 'KI270529.1', 'KI270530.1', 'KI270538.1', 'KI270539.1', 'KI270544.1', 'KI270548.1', 'KI270579.1', 'KI270580.1', 'KI270581.1', 'KI270582.1', 'KI270583.1', 'KI270584.1', 'KI270587.1', 'KI270588.1', 'KI270589.1', 'KI270590.1', 'KI270591.1', 'KI270593.1', 'KI270706.1', 'KI270707.1', 'KI270708.1', 'KI270709.1', 'KI270710.1', 'KI270711.1', 'KI270712.1', 'KI270713.1', 'KI270714.1', 'KI270715.1', 'KI270716.1', 'KI270717.1', 'KI270718.1', 'KI270719.1', 'KI270720.1', 'KI270721.1', 'KI270722.1', 'KI270723.1', 'KI270724.1', 'KI270725.1', 'KI270726.1', 'KI270727.1', 'KI270728.1', 'KI270729.1', 'KI270730.1', 'KI270731.1', 'KI270732.1', 'KI270733.1', 'KI270734.1', 'KI270735.1', 'KI270736.1', 'KI270737.1', 'KI270738.1', 'KI270739.1', 'KI270740.1', 'KI270741.1', 'KI270742.1', 'KI270743.1', 'KI270744.1', 'KI270745.1', 'KI270746.1', 'KI270747.1', 'KI270748.1', 'KI270749.1', 'KI270750.1', 'KI270751.1', 'KI270752.1', 'KI270753.1', 'KI270754.1', 'KI270755.1', 'KI270756.1', 'KI270757.1')
    Reading header of bam file :  gs://fc-secure-29fcb143-0827-4430-b92f-3fc3cdc76cb7/Samples/star/SRR7300593/Aligned.sortedByCoord.out.bam
    ('chr1', 'chr2', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chr10', 'chr11', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17', 'chr18', 'chr19', 'chr20', 'chr21', 'chr22', 'chrX', 'chrY', 'chrM', 'GL000008.2', 'GL000009.2', 'GL000194.1', 'GL000195.1', 'GL000205.2', 'GL000208.1', 'GL000213.1', 'GL000214.1', 'GL000216.2', 'GL000218.1', 'GL000219.1', 'GL000220.1', 'GL000221.1', 'GL000224.1', 'GL000225.1', 'GL000226.1', 'KI270302.1', 'KI270303.1', 'KI270304.1', 'KI270305.1', 'KI270310.1', 'KI270311.1', 'KI270312.1', 'KI270315.1', 'KI270316.1', 'KI270317.1', 'KI270320.1', 'KI270322.1', 'KI270329.1', 'KI270330.1', 'KI270333.1', 'KI270334.1', 'KI270335.1', 'KI270336.1', 'KI270337.1', 'KI270338.1', 'KI270340.1', 'KI270362.1', 'KI270363.1', 'KI270364.1', 'KI270366.1', 'KI270371.1', 'KI270372.1', 'KI270373.1', 'KI270374.1', 'KI270375.1', 'KI270376.1', 'KI270378.1', 'KI270379.1', 'KI270381.1', 'KI270382.1', 'KI270383.1', 'KI270384.1', 'KI270385.1', 'KI270386.1', 'KI270387.1', 'KI270388.1', 'KI270389.1', 'KI270390.1', 'KI270391.1', 'KI270392.1', 'KI270393.1', 'KI270394.1', 'KI270395.1', 'KI270396.1', 'KI270411.1', 'KI270412.1', 'KI270414.1', 'KI270417.1', 'KI270418.1', 'KI270419.1', 'KI270420.1', 'KI270422.1', 'KI270423.1', 'KI270424.1', 'KI270425.1', 'KI270429.1', 'KI270435.1', 'KI270438.1', 'KI270442.1', 'KI270448.1', 'KI270465.1', 'KI270466.1', 'KI270467.1', 'KI270468.1', 'KI270507.1', 'KI270508.1', 'KI270509.1', 'KI270510.1', 'KI270511.1', 'KI270512.1', 'KI270515.1', 'KI270516.1', 'KI270517.1', 'KI270518.1', 'KI270519.1', 'KI270521.1', 'KI270522.1', 'KI270528.1', 'KI270529.1', 'KI270530.1', 'KI270538.1', 'KI270539.1', 'KI270544.1', 'KI270548.1', 'KI270579.1', 'KI270580.1', 'KI270581.1', 'KI270582.1', 'KI270583.1', 'KI270584.1', 'KI270587.1', 'KI270588.1', 'KI270589.1', 'KI270590.1', 'KI270591.1', 'KI270593.1', 'KI270706.1', 'KI270707.1', 'KI270708.1', 'KI270709.1', 'KI270710.1', 'KI270711.1', 'KI270712.1', 'KI270713.1', 'KI270714.1', 'KI270715.1', 'KI270716.1', 'KI270717.1', 'KI270718.1', 'KI270719.1', 'KI270720.1', 'KI270721.1', 'KI270722.1', 'KI270723.1', 'KI270724.1', 'KI270725.1', 'KI270726.1', 'KI270727.1', 'KI270728.1', 'KI270729.1', 'KI270730.1', 'KI270731.1', 'KI270732.1', 'KI270733.1', 'KI270734.1', 'KI270735.1', 'KI270736.1', 'KI270737.1', 'KI270738.1', 'KI270739.1', 'KI270740.1', 'KI270741.1', 'KI270742.1', 'KI270743.1', 'KI270744.1', 'KI270745.1', 'KI270746.1', 'KI270747.1', 'KI270748.1', 'KI270749.1', 'KI270750.1', 'KI270751.1', 'KI270752.1', 'KI270753.1', 'KI270754.1', 'KI270755.1', 'KI270756.1', 'KI270757.1')
    Reading header of bam file :  gs://fc-secure-29fcb143-0827-4430-b92f-3fc3cdc76cb7/Samples/star/SRR7300595/Aligned.sortedByCoord.out.bam
    ('chr1', 'chr2', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chr10', 'chr11', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17', 'chr18', 'chr19', 'chr20', 'chr21', 'chr22', 'chrX', 'chrY', 'chrM', 'GL000008.2', 'GL000009.2', 'GL000194.1', 'GL000195.1', 'GL000205.2', 'GL000208.1', 'GL000213.1', 'GL000214.1', 'GL000216.2', 'GL000218.1', 'GL000219.1', 'GL000220.1', 'GL000221.1', 'GL000224.1', 'GL000225.1', 'GL000226.1', 'KI270302.1', 'KI270303.1', 'KI270304.1', 'KI270305.1', 'KI270310.1', 'KI270311.1', 'KI270312.1', 'KI270315.1', 'KI270316.1', 'KI270317.1', 'KI270320.1', 'KI270322.1', 'KI270329.1', 'KI270330.1', 'KI270333.1', 'KI270334.1', 'KI270335.1', 'KI270336.1', 'KI270337.1', 'KI270338.1', 'KI270340.1', 'KI270362.1', 'KI270363.1', 'KI270364.1', 'KI270366.1', 'KI270371.1', 'KI270372.1', 'KI270373.1', 'KI270374.1', 'KI270375.1', 'KI270376.1', 'KI270378.1', 'KI270379.1', 'KI270381.1', 'KI270382.1', 'KI270383.1', 'KI270384.1', 'KI270385.1', 'KI270386.1', 'KI270387.1', 'KI270388.1', 'KI270389.1', 'KI270390.1', 'KI270391.1', 'KI270392.1', 'KI270393.1', 'KI270394.1', 'KI270395.1', 'KI270396.1', 'KI270411.1', 'KI270412.1', 'KI270414.1', 'KI270417.1', 'KI270418.1', 'KI270419.1', 'KI270420.1', 'KI270422.1', 'KI270423.1', 'KI270424.1', 'KI270425.1', 'KI270429.1', 'KI270435.1', 'KI270438.1', 'KI270442.1', 'KI270448.1', 'KI270465.1', 'KI270466.1', 'KI270467.1', 'KI270468.1', 'KI270507.1', 'KI270508.1', 'KI270509.1', 'KI270510.1', 'KI270511.1', 'KI270512.1', 'KI270515.1', 'KI270516.1', 'KI270517.1', 'KI270518.1', 'KI270519.1', 'KI270521.1', 'KI270522.1', 'KI270528.1', 'KI270529.1', 'KI270530.1', 'KI270538.1', 'KI270539.1', 'KI270544.1', 'KI270548.1', 'KI270579.1', 'KI270580.1', 'KI270581.1', 'KI270582.1', 'KI270583.1', 'KI270584.1', 'KI270587.1', 'KI270588.1', 'KI270589.1', 'KI270590.1', 'KI270591.1', 'KI270593.1', 'KI270706.1', 'KI270707.1', 'KI270708.1', 'KI270709.1', 'KI270710.1', 'KI270711.1', 'KI270712.1', 'KI270713.1', 'KI270714.1', 'KI270715.1', 'KI270716.1', 'KI270717.1', 'KI270718.1', 'KI270719.1', 'KI270720.1', 'KI270721.1', 'KI270722.1', 'KI270723.1', 'KI270724.1', 'KI270725.1', 'KI270726.1', 'KI270727.1', 'KI270728.1', 'KI270729.1', 'KI270730.1', 'KI270731.1', 'KI270732.1', 'KI270733.1', 'KI270734.1', 'KI270735.1', 'KI270736.1', 'KI270737.1', 'KI270738.1', 'KI270739.1', 'KI270740.1', 'KI270741.1', 'KI270742.1', 'KI270743.1', 'KI270744.1', 'KI270745.1', 'KI270746.1', 'KI270747.1', 'KI270748.1', 'KI270749.1', 'KI270750.1', 'KI270751.1', 'KI270752.1', 'KI270753.1', 'KI270754.1', 'KI270755.1', 'KI270756.1', 'KI270757.1')
    Reading header of bam file :  gs://fc-secure-29fcb143-0827-4430-b92f-3fc3cdc76cb7/Samples/star/SRR7300613/Aligned.sortedByCoord.out.bam
    ('chr1', 'chr2', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chr10', 'chr11', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17', 'chr18', 'chr19', 'chr20', 'chr21', 'chr22', 'chrX', 'chrY', 'chrM', 'GL000008.2', 'GL000009.2', 'GL000194.1', 'GL000195.1', 'GL000205.2', 'GL000208.1', 'GL000213.1', 'GL000214.1', 'GL000216.2', 'GL000218.1', 'GL000219.1', 'GL000220.1', 'GL000221.1', 'GL000224.1', 'GL000225.1', 'GL000226.1', 'KI270302.1', 'KI270303.1', 'KI270304.1', 'KI270305.1', 'KI270310.1', 'KI270311.1', 'KI270312.1', 'KI270315.1', 'KI270316.1', 'KI270317.1', 'KI270320.1', 'KI270322.1', 'KI270329.1', 'KI270330.1', 'KI270333.1', 'KI270334.1', 'KI270335.1', 'KI270336.1', 'KI270337.1', 'KI270338.1', 'KI270340.1', 'KI270362.1', 'KI270363.1', 'KI270364.1', 'KI270366.1', 'KI270371.1', 'KI270372.1', 'KI270373.1', 'KI270374.1', 'KI270375.1', 'KI270376.1', 'KI270378.1', 'KI270379.1', 'KI270381.1', 'KI270382.1', 'KI270383.1', 'KI270384.1', 'KI270385.1', 'KI270386.1', 'KI270387.1', 'KI270388.1', 'KI270389.1', 'KI270390.1', 'KI270391.1', 'KI270392.1', 'KI270393.1', 'KI270394.1', 'KI270395.1', 'KI270396.1', 'KI270411.1', 'KI270412.1', 'KI270414.1', 'KI270417.1', 'KI270418.1', 'KI270419.1', 'KI270420.1', 'KI270422.1', 'KI270423.1', 'KI270424.1', 'KI270425.1', 'KI270429.1', 'KI270435.1', 'KI270438.1', 'KI270442.1', 'KI270448.1', 'KI270465.1', 'KI270466.1', 'KI270467.1', 'KI270468.1', 'KI270507.1', 'KI270508.1', 'KI270509.1', 'KI270510.1', 'KI270511.1', 'KI270512.1', 'KI270515.1', 'KI270516.1', 'KI270517.1', 'KI270518.1', 'KI270519.1', 'KI270521.1', 'KI270522.1', 'KI270528.1', 'KI270529.1', 'KI270530.1', 'KI270538.1', 'KI270539.1', 'KI270544.1', 'KI270548.1', 'KI270579.1', 'KI270580.1', 'KI270581.1', 'KI270582.1', 'KI270583.1', 'KI270584.1', 'KI270587.1', 'KI270588.1', 'KI270589.1', 'KI270590.1', 'KI270591.1', 'KI270593.1', 'KI270706.1', 'KI270707.1', 'KI270708.1', 'KI270709.1', 'KI270710.1', 'KI270711.1', 'KI270712.1', 'KI270713.1', 'KI270714.1', 'KI270715.1', 'KI270716.1', 'KI270717.1', 'KI270718.1', 'KI270719.1', 'KI270720.1', 'KI270721.1', 'KI270722.1', 'KI270723.1', 'KI270724.1', 'KI270725.1', 'KI270726.1', 'KI270727.1', 'KI270728.1', 'KI270729.1', 'KI270730.1', 'KI270731.1', 'KI270732.1', 'KI270733.1', 'KI270734.1', 'KI270735.1', 'KI270736.1', 'KI270737.1', 'KI270738.1', 'KI270739.1', 'KI270740.1', 'KI270741.1', 'KI270742.1', 'KI270743.1', 'KI270744.1', 'KI270745.1', 'KI270746.1', 'KI270747.1', 'KI270748.1', 'KI270749.1', 'KI270750.1', 'KI270751.1', 'KI270752.1', 'KI270753.1', 'KI270754.1', 'KI270755.1', 'KI270756.1', 'KI270757.1')
    Reading header of bam file :  gs://fc-secure-29fcb143-0827-4430-b92f-3fc3cdc76cb7/Samples/star/SRR7300651/Aligned.sortedByCoord.out.bam
    ('chr1', 'chr2', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chr10', 'chr11', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17', 'chr18', 'chr19', 'chr20', 'chr21', 'chr22', 'chrX', 'chrY', 'chrM', 'GL000008.2', 'GL000009.2', 'GL000194.1', 'GL000195.1', 'GL000205.2', 'GL000208.1', 'GL000213.1', 'GL000214.1', 'GL000216.2', 'GL000218.1', 'GL000219.1', 'GL000220.1', 'GL000221.1', 'GL000224.1', 'GL000225.1', 'GL000226.1', 'KI270302.1', 'KI270303.1', 'KI270304.1', 'KI270305.1', 'KI270310.1', 'KI270311.1', 'KI270312.1', 'KI270315.1', 'KI270316.1', 'KI270317.1', 'KI270320.1', 'KI270322.1', 'KI270329.1', 'KI270330.1', 'KI270333.1', 'KI270334.1', 'KI270335.1', 'KI270336.1', 'KI270337.1', 'KI270338.1', 'KI270340.1', 'KI270362.1', 'KI270363.1', 'KI270364.1', 'KI270366.1', 'KI270371.1', 'KI270372.1', 'KI270373.1', 'KI270374.1', 'KI270375.1', 'KI270376.1', 'KI270378.1', 'KI270379.1', 'KI270381.1', 'KI270382.1', 'KI270383.1', 'KI270384.1', 'KI270385.1', 'KI270386.1', 'KI270387.1', 'KI270388.1', 'KI270389.1', 'KI270390.1', 'KI270391.1', 'KI270392.1', 'KI270393.1', 'KI270394.1', 'KI270395.1', 'KI270396.1', 'KI270411.1', 'KI270412.1', 'KI270414.1', 'KI270417.1', 'KI270418.1', 'KI270419.1', 'KI270420.1', 'KI270422.1', 'KI270423.1', 'KI270424.1', 'KI270425.1', 'KI270429.1', 'KI270435.1', 'KI270438.1', 'KI270442.1', 'KI270448.1', 'KI270465.1', 'KI270466.1', 'KI270467.1', 'KI270468.1', 'KI270507.1', 'KI270508.1', 'KI270509.1', 'KI270510.1', 'KI270511.1', 'KI270512.1', 'KI270515.1', 'KI270516.1', 'KI270517.1', 'KI270518.1', 'KI270519.1', 'KI270521.1', 'KI270522.1', 'KI270528.1', 'KI270529.1', 'KI270530.1', 'KI270538.1', 'KI270539.1', 'KI270544.1', 'KI270548.1', 'KI270579.1', 'KI270580.1', 'KI270581.1', 'KI270582.1', 'KI270583.1', 'KI270584.1', 'KI270587.1', 'KI270588.1', 'KI270589.1', 'KI270590.1', 'KI270591.1', 'KI270593.1', 'KI270706.1', 'KI270707.1', 'KI270708.1', 'KI270709.1', 'KI270710.1', 'KI270711.1', 'KI270712.1', 'KI270713.1', 'KI270714.1', 'KI270715.1', 'KI270716.1', 'KI270717.1', 'KI270718.1', 'KI270719.1', 'KI270720.1', 'KI270721.1', 'KI270722.1', 'KI270723.1', 'KI270724.1', 'KI270725.1', 'KI270726.1', 'KI270727.1', 'KI270728.1', 'KI270729.1', 'KI270730.1', 'KI270731.1', 'KI270732.1', 'KI270733.1', 'KI270734.1', 'KI270735.1', 'KI270736.1', 'KI270737.1', 'KI270738.1', 'KI270739.1', 'KI270740.1', 'KI270741.1', 'KI270742.1', 'KI270743.1', 'KI270744.1', 'KI270745.1', 'KI270746.1', 'KI270747.1', 'KI270748.1', 'KI270749.1', 'KI270750.1', 'KI270751.1', 'KI270752.1', 'KI270753.1', 'KI270754.1', 'KI270755.1', 'KI270756.1', 'KI270757.1')
    Reading header of bam file :  gs://fc-secure-29fcb143-0827-4430-b92f-3fc3cdc76cb7/Samples/star/SRR7300653/Aligned.sortedByCoord.out.bam
    ('chr1', 'chr2', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chr10', 'chr11', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17', 'chr18', 'chr19', 'chr20', 'chr21', 'chr22', 'chrX', 'chrY', 'chrM', 'GL000008.2', 'GL000009.2', 'GL000194.1', 'GL000195.1', 'GL000205.2', 'GL000208.1', 'GL000213.1', 'GL000214.1', 'GL000216.2', 'GL000218.1', 'GL000219.1', 'GL000220.1', 'GL000221.1', 'GL000224.1', 'GL000225.1', 'GL000226.1', 'KI270302.1', 'KI270303.1', 'KI270304.1', 'KI270305.1', 'KI270310.1', 'KI270311.1', 'KI270312.1', 'KI270315.1', 'KI270316.1', 'KI270317.1', 'KI270320.1', 'KI270322.1', 'KI270329.1', 'KI270330.1', 'KI270333.1', 'KI270334.1', 'KI270335.1', 'KI270336.1', 'KI270337.1', 'KI270338.1', 'KI270340.1', 'KI270362.1', 'KI270363.1', 'KI270364.1', 'KI270366.1', 'KI270371.1', 'KI270372.1', 'KI270373.1', 'KI270374.1', 'KI270375.1', 'KI270376.1', 'KI270378.1', 'KI270379.1', 'KI270381.1', 'KI270382.1', 'KI270383.1', 'KI270384.1', 'KI270385.1', 'KI270386.1', 'KI270387.1', 'KI270388.1', 'KI270389.1', 'KI270390.1', 'KI270391.1', 'KI270392.1', 'KI270393.1', 'KI270394.1', 'KI270395.1', 'KI270396.1', 'KI270411.1', 'KI270412.1', 'KI270414.1', 'KI270417.1', 'KI270418.1', 'KI270419.1', 'KI270420.1', 'KI270422.1', 'KI270423.1', 'KI270424.1', 'KI270425.1', 'KI270429.1', 'KI270435.1', 'KI270438.1', 'KI270442.1', 'KI270448.1', 'KI270465.1', 'KI270466.1', 'KI270467.1', 'KI270468.1', 'KI270507.1', 'KI270508.1', 'KI270509.1', 'KI270510.1', 'KI270511.1', 'KI270512.1', 'KI270515.1', 'KI270516.1', 'KI270517.1', 'KI270518.1', 'KI270519.1', 'KI270521.1', 'KI270522.1', 'KI270528.1', 'KI270529.1', 'KI270530.1', 'KI270538.1', 'KI270539.1', 'KI270544.1', 'KI270548.1', 'KI270579.1', 'KI270580.1', 'KI270581.1', 'KI270582.1', 'KI270583.1', 'KI270584.1', 'KI270587.1', 'KI270588.1', 'KI270589.1', 'KI270590.1', 'KI270591.1', 'KI270593.1', 'KI270706.1', 'KI270707.1', 'KI270708.1', 'KI270709.1', 'KI270710.1', 'KI270711.1', 'KI270712.1', 'KI270713.1', 'KI270714.1', 'KI270715.1', 'KI270716.1', 'KI270717.1', 'KI270718.1', 'KI270719.1', 'KI270720.1', 'KI270721.1', 'KI270722.1', 'KI270723.1', 'KI270724.1', 'KI270725.1', 'KI270726.1', 'KI270727.1', 'KI270728.1', 'KI270729.1', 'KI270730.1', 'KI270731.1', 'KI270732.1', 'KI270733.1', 'KI270734.1', 'KI270735.1', 'KI270736.1', 'KI270737.1', 'KI270738.1', 'KI270739.1', 'KI270740.1', 'KI270741.1', 'KI270742.1', 'KI270743.1', 'KI270744.1', 'KI270745.1', 'KI270746.1', 'KI270747.1', 'KI270748.1', 'KI270749.1', 'KI270750.1', 'KI270751.1', 'KI270752.1', 'KI270753.1', 'KI270754.1', 'KI270755.1', 'KI270756.1', 'KI270757.1')
    (base) -bash:login02:/prot/proteomics/LabMembers/MaVi/Projects/CLL/BamQuery/CLL_samples_BAM_directories_mTECs_and_DCS_25 1011 $ 

    Best, 
    MaVi

    0
  • Comment author
    Maria Virginia Ruiz Cuevas

    Hi Samantha, 

    I wanted to tell you that I managed to create a new docker that integrates samtools and pysam. Now I can do queries with both to bam files in the google bucket without locating them. I can now investigate a bam file in few locations in the genome and it works correctly.
    However, I have run into a new problem and I think this time Terra is the culprit. It turns out that when I have thousands of locations to investigate (while using multiprocessing to speed up the search), I get the following error:

    [E::hts_open_format] Failed to open file "gs://fc-secure-f6a121b3-85b6-47a2-90b2-feb8246d2441/Samples/mTEC/bc6f6c4edf3efa798051c90229682e97/star/S15_mTECs/Aligned.sortedByCoord.out.bam" Software caused connection abort.

    My hypothesis is that there is a firewall or something in Terra, which closes the connection when there are many request to a google bucket. Please let me know if you have an idea how to prevent this new problem from happening, so that I can finally get my program working on Terra and therefore I can have a happier holidays.

    Thank you so much in advance!!
    Best, 
    MaVi

    0
  • Comment author
    Samantha (she/her)

    Hi Maria,

    Sorry for the delayed response. Regarding your new issue, it does seem like it could be due to the amount of requests being made. Can you try setting your max_retries value to 1 or higher so the task can get retried if it fails? I see that the value is currently set to 0 by default in your WDL.

    Best,

    Samantha

    0
  • Comment author
    Maria Virginia Ruiz Cuevas

    Hi Samantha, 

    I hope you had great holidays!

    I came back to this post as I still facing the same problem. I tried the max_retrites, however the same error appears.
    So I was wondering if my problem is not related to a quota https://cloud.google.com/apis/docs/capping-api-usage  that is reached as several connections are requesting information from the google bucket.
    Please let me know if this could be the problem, and how to upgrade the number of connections that can be request!

    Thank you so much un advance, 
    MaVi

    0
  • Comment author
    Samantha (she/her)

    Hi MaVi,

    Can you please provide the submission ID for the recently failed jobs so we can take a closer look?

    Best,

    Samantha

    0
  • Comment author
    Maria Virginia Ruiz Cuevas

    Hi Samantha, 

    Sure, here is the submission ID: 09feba12-4fd3-43a4-a3da-92ba1210c22b

    You can see that in the last task that was executed call-module_2_counting_and_normalization/

    Shard 2 got the message:

    [E::hts_open_format] Failed to open file "gs://fc-secure-f6a121b3-85b6-47a2-90b2-feb8246d2441/Samples/mTEC/bc6f6c4edf3efa798051c90229682e97/star/S16_mTECs/Aligned.sortedByCoord.out.bam" : Software caused connection abort

    And after that, two of the other shards got the message:

    [E::hts_open_format] Failed to open file "gs://fc-secure-f6a121b3-85b6-47a2-90b2-feb8246d2441/Samples/Blood_DC/GSE76511/star/SRR3084027/Aligned.sortedByCoord.out.bam" : Operation not permitted

    Probably, because the connection was lost at some point also.

    Thank you so much for looking into this.

    Best, 

    MaVi

    0
  • Comment author
    Samantha (she/her)

    Hi MaVi,

    Are you running the workflow in a different workspace? I'm not able to find that submission ID in the workspace you previously shared, https://app.terra.bio/#workspaces/broad-firecloud-cptac/WDLs_Tests. 

    If you are using a different workspace, can you please share it with Terra Support and provide the workspace name?

    Thanks,

    Samantha

    0
  • Comment author
    Maria Virginia Ruiz Cuevas

    Hi Samantha, 

    Yes, indeed. Below I share the details of this workspace, I am sorry for the confusion.

    Here the other information:
    1. https://app.terra.bio/#workspaces/broad-firecloud-cptac/BamQuery
    2. Submission ID: 09feba12-4fd3-43a4-a3da-92ba1210c22b
    3. Workflow ID: 5ee6127a-9b01-4547-b385-74b62d455f79

    Thanks so much again!

    MaVi

    0
  • Comment author
    Julia Kodysh

    Just chiming in here - I ran into the same issue with pysam (though outside of terra), and it looks like this started happening with pysam 0.22.0. I'm not seeing the same SSL cert issue with pysam 0.21.0.

    0
  • Comment author
    Maria Virginia Ruiz Cuevas

    Hi Julia Kodysh

    Thanks for your reply!
    I just want to be sure that I understood correctly your message. 

    Did you encounter the same problem (SSL cert) with pysam 0.22.0? Or any of the other problems that I have posted here? :'( 

     

     

    0
  • Comment author
    Maria Virginia Ruiz Cuevas

    Hi Samantha-she-her
    I wanted to know if you have any news regarding the last post about the connection closure.

    Thanks!

    MaVi

    0
  • Comment author
    Li Tai Fang

    Yeah we encountered the exact same error with pysam 0.22.0 that went away when we downgraded to 0.21.0. 

    0
  • Comment author
    Samantha (she/her)

    Hi Maria Virginia Ruiz Cuevas,

    I believe the connection error might be coming from the python script you're running in your task, rather than Terra itself. Would it be possible to share the contents of that script to see exactly what commands are being run?

    Best,
    Samantha

    0

Please sign in to leave a comment.