Primary & Secondary analyses pipeline
Version 1
Genomic raw data are processed using Dragen hardware.
This pipeline tests each WGS sample independently. A sample that fails this testing is excluded from the aggregation and is neither released nor reported in this document. The tests detect sample swaps, cross-individual contamination, and sample preparation or sequencing errors. The specific QC processes are listed in the table below:
Samples that PASS the pre-aggregation QC pipeline (per-sample QC) are aggregated into one multi-sample VCF file. Samples are aggregated using the Dragen Iterative gVCF Genotyper. To accelerate the aggregation and the downstream analyses through parallel processing, the genome is split into 100 shards.
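The sharding idea can be illustrated with a short Python sketch. The source does not state how the 100 shard boundaries are chosen, so this is only one possible scheme, assuming shards of near-equal total length that follow contig order and may span contig boundaries. The function name make_shards, the Region tuple, and the toy contig lengths are hypothetical and are not part of the Dragen tooling.

```python
from typing import Dict, List, Tuple

# (contig, start, end) with 1-based inclusive coordinates; hypothetical layout
Region = Tuple[str, int, int]


def make_shards(contig_lengths: Dict[str, int], n_shards: int = 100) -> List[List[Region]]:
    """Split contigs, in the given order, into n_shards groups of near-equal length.

    Assumption: equal-length sharding; the real pipeline may use another scheme.
    """
    total = sum(contig_lengths.values())
    if n_shards < 1 or total < n_shards:
        raise ValueError("need at least one base per shard")

    # Global cut points on the concatenated genome (0-based, end-exclusive).
    boundaries = [total * i // n_shards for i in range(n_shards + 1)]
    shards: List[List[Region]] = [[] for _ in range(n_shards)]

    idx = 0      # shard currently being filled
    offset = 0   # global coordinate at which the current contig starts
    for contig, length in contig_lengths.items():
        pos = 0  # bases of this contig already assigned to a shard
        while pos < length:
            shard_end = boundaries[idx + 1]
            take = min(length - pos, shard_end - (offset + pos))
            shards[idx].append((contig, pos + 1, pos + take))
            pos += take
            # Move on once the current shard is full (the last shard never advances).
            if offset + pos == shard_end and idx < n_shards - 1:
                idx += 1
        offset += length
    return shards


# Toy contig lengths, for illustration only (not real chromosome sizes).
toy_genome = {"chrA": 1000, "chrB": 500, "chrC": 250}
toy_shards = make_shards(toy_genome, n_shards=100)
```

Each shard is a list of regions that can be processed independently, which is what makes parallel aggregation possible.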
The output of this pipeline is a multi-sample VCF. One multi-sample VCF is generated per shard, in addition to the global multi-sample VCF file, which is the concatenation of all shards.
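The relationship between the per-shard files and the global file can be sketched in Python. This is a simplified illustration, not the pipeline's actual implementation: it assumes uncompressed VCF text, shards supplied in genomic order, and identical headers and sample columns in every shard. The function name concat_vcf_shards is hypothetical; in practice a dedicated tool such as bcftools concat is commonly used for this step.

```python
from typing import Iterable, List


def concat_vcf_shards(shard_texts: Iterable[str]) -> str:
    """Concatenate per-shard VCF texts into one VCF text.

    Assumptions: shards arrive in genomic order and share the same header,
    so the header of the first shard is kept and later headers are dropped.
    """
    out: List[str] = []
    for i, text in enumerate(shard_texts):
        for line in text.splitlines():
            if line.startswith("#"):
                if i == 0:          # keep header lines from the first shard only
                    out.append(line)
            elif line:              # keep every non-empty variant record
                out.append(line)
    return "\n".join(out) + "\n"


# Two tiny made-up shards with two samples each, for illustration only.
header = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n"
)
shard_1 = header + "chrA\t10\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\t0/0\n"
shard_2 = header + "chrA\t900\t.\tC\tT\t.\tPASS\t.\tGT\t0/0\t1/1\n"

merged = concat_vcf_shards([shard_1, shard_2])
```

The merged text contains a single header followed by the records of both shards in their original order.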
Belqes Alsadi, Saleh Musleh, Hamada R. H. Al-Absi, Mahmoud Refaee, Rizwan Qureshi, Nady El Hajj & Tanvir Alam
Alexandra E. Butler, Steven C. Hunt, Eric S. Kilpatrick
Mahboubeh R. Rostami, Philip L. Leopold, Jenifer M. Vasquez, Miguel de Mulder Rougvie, Alya Al Shakaki, Ali Ait Hssain, Amal Robay, Neil R. Hackett, Jason G. Mezey, Ronald G. Crystal
Zeyaul Islam, Abdoulaye Diane, Namat Khattab, Mohammed Dehbi, Paul Thornalley, Prasanna R. Kolatkar
Mohammad Tariqul Islam; Hesham Zaky; Tanvir Alam
Hamada R. H. Al-Absi, Anant Pai, Usman Naeem, Fatma Kassem Mohamed, Saket Arya, Rami Abu Sbeit, Mohammed Bashir, Maha Mohammed El Shafei, Nady El Hajj & Tanvir Alam
Fatima Qafoud, Khalid Kunji, Mohamed Elshrif, Asma Althani, Amar Salam, Jassim Al Suwaidi, Dawood Darbar, Nidal Asaad, Mohamad Saad