The primary objective of the first quarter was to perform massively parallel sequencing of the transcriptome (whole-transcriptome sequencing) of several cases of infant acute myeloid leukemia (infant AML) with normal karyotype, in order to identify possible recurrent mutations that may play an important role in leukemogenesis and, consequently, in defining the prognostic and therapeutic profile. In total, the transcriptome of bone marrow blasts from 13 infant AML cases was sequenced. Of these, 8 had a normal karyotype and carried none of the molecular markers known to be associated with acute myeloid leukemia. The remaining patients, while also having a normal karyotype, carried some of the most frequent genetic alterations of AML and were analyzed as a positive control for the entire sequencing process.
The messenger RNA (mRNA) libraries were generated with the TruSeq RNA Sample Prep Kit v2 (Illumina) according to the manufacturer's instructions, and massively parallel sequencing was performed on a next-generation HiScanSQ sequencer (Illumina) in 2x75 bp paired-end runs (with the TruSeq SBS Kit v3-HS, Illumina). The sequencing data were then analyzed and filtered with a purpose-built pipeline to identify all possible genetic alterations (single nucleotide variations, gene fusions, frameshift mutations, etc.) in the patient cohort.
A total of 80.48 billion bases (Gb) were sequenced, for a total yield of 1.1 billion reads. The average haploid coverage was 35X. Alignment of the sequences against build 37 of the human genome, available from the National Center for Biotechnology Information (NCBI), followed by processing of the data with the SNVMix2 software, identified a total of 1,868,155 single nucleotide variations (SNVs). Of these, 630,764 are new single nucleotide changes that have not previously been reported. We divided the new SNVs into exonic and non-exonic and, using a predictor (SNPs&GO), identified 186 new SNVs potentially associated with the disease. The genes found to be mutated in the cases analyzed include both genes known to be involved in various human tumors (KRAS, NRAS, JAK3, CREBBP, GATA1, MCL1) and genes that do not appear to have a direct connection with cellular transformation (FECH, UROD, UROS, GCAT, etc.). A careful assessment of the possible biological role of these single nucleotide mutations will be needed to determine which of them may underlie leukemogenesis and may have therapeutic and prognostic relevance.
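The filtering cascade described above (all SNVs, then new SNVs, then the exonic versus non-exonic split, then predictor-flagged candidates) can be sketched as follows. This is only an illustration and not the pipeline actually used: the `SNV` record, its three boolean flags, and the `triage` function are hypothetical names, the toy records are invented, and it assumes the predictor is applied to the new exonic SNVs only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SNV:
    chrom: str
    pos: int
    ref: str
    alt: str
    known: bool         # hypothetical flag: already reported in a variant database
    exonic: bool        # hypothetical flag: falls inside an annotated exon
    disease_like: bool  # hypothetical flag: predictor calls it disease-associated


def triage(snvs):
    """Partition SNV calls into the groups named in the report."""
    novel = [s for s in snvs if not s.known]
    novel_exonic = [s for s in novel if s.exonic]
    novel_non_exonic = [s for s in novel if not s.exonic]
    # Assumption: only new exonic SNVs are passed to the predictor.
    candidates = [s for s in novel_exonic if s.disease_like]
    return {
        "total": len(snvs),
        "novel": len(novel),
        "novel_exonic": len(novel_exonic),
        "novel_non_exonic": len(novel_non_exonic),
        "candidates": len(candidates),
    }


# Invented toy records, for illustration only (not data from the study).
toy_calls = [
    SNV("chr1", 100, "A", "G", known=True, exonic=True, disease_like=False),
    SNV("chr1", 250, "C", "T", known=False, exonic=True, disease_like=True),
    SNV("chr2", 300, "G", "A", known=False, exonic=True, disease_like=False),
    SNV("chr2", 410, "T", "C", known=False, exonic=False, disease_like=False),
    SNV("chr3", 520, "A", "C", known=True, exonic=False, disease_like=False),
]

summary = triage(toy_calls)
print(summary)
```

In the study, the corresponding counts are 1,868,155 SNVs in total, 630,764 new SNVs, and 186 predictor-flagged candidates.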
To identify potential new fusion transcripts, the massively parallel sequencing data were analyzed with two different predictors of chimeric transcripts, ChimeraScan and deFuse. Intersecting the candidate fusion transcripts called by the two predictors yielded 33 putative fusion transcripts. Filtering on the number of reads that match the fusion junction exactly (split reads) and of reads that match the regions flanking it (spanning reads) reduced these further to 11 new putative fusion transcripts. Notably, these putative fusion transcripts included chimeric transcripts that we already knew to be present in some of the samples analyzed. This finding is an important internal positive control for the whole process, from massively parallel sequencing to the subsequent bioinformatic analysis, and supports both the correctness of the prediction and filtering method and the reliability of the new putative gene fusions identified. Among the new putative gene fusions, CBFA2T3-GLIS2 was the first to be investigated in more detail, given its recurrence in 3 of the normal-karyotype infant AML cases sequenced. The fusion derives from a paracentric inversion of chromosome 16 and involves the CBFA2T3 gene, a member of the ETO family of transcription factors, and the GLIS2 gene, which encodes a transcription factor closely related to the members of the GLI family that mediate the transcriptional response to activation of the Hedgehog signaling pathway. The biological effect of this chimeric transcript and of the other putative fusion transcripts identified will be validated in the near future, taking into account the structural and functional characteristics of the chimeric protein encoded by each gene fusion.
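The two-step selection described above (intersect the calls of the two predictors, then filter on supporting reads) can be sketched as follows. This is a minimal illustration under stated assumptions, not the filtering actually applied: the function name `consensus_fusions`, the dictionary layout, the thresholds `min_split` and `min_span`, and the rule of taking the lower read count reported by the two callers are all hypothetical, and the gene names and read counts are invented placeholders.

```python
def consensus_fusions(calls_a, calls_b, min_split=2, min_span=2):
    """Keep fusions called by both predictors with enough supporting reads.

    calls_a, calls_b: dict mapping (5' gene, 3' gene) to a tuple
    (split_reads, spanning_reads). Thresholds are illustrative assumptions.
    """
    # Step 1: intersection of the gene pairs reported by both callers.
    shared = calls_a.keys() & calls_b.keys()

    kept = []
    for pair in sorted(shared):
        # Step 2 (assumed rule): use the lower count from the two callers,
        # so a fusion passes only if both callers support it well enough.
        split = min(calls_a[pair][0], calls_b[pair][0])
        span = min(calls_a[pair][1], calls_b[pair][1])
        if split >= min_split and span >= min_span:
            kept.append(pair)
    return kept


# Invented placeholder calls, for illustration only (not data from the study).
caller_a_calls = {
    ("GENE_A", "GENE_B"): (12, 8),
    ("GENE_C", "GENE_D"): (1, 5),
    ("GENE_E", "GENE_F"): (6, 4),
}
caller_b_calls = {
    ("GENE_A", "GENE_B"): (10, 9),
    ("GENE_C", "GENE_D"): (3, 6),
    ("GENE_G", "GENE_H"): (7, 7),
}

fusions = consensus_fusions(caller_a_calls, caller_b_calls)
print(fusions)
```

Here GENE_E-GENE_F and GENE_G-GENE_H are dropped at the intersection step because only one caller reports them, and GENE_C-GENE_D is dropped at the read-support step because one caller reports a single split read.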
The study will continue with further validation of the genetic alterations identified (SNVs and fusion transcripts) and with an expanded series of normal-karyotype AML cases (infant and pediatric), in order to assess the actual frequency of these new mutations.