Advances in next-generation sequencing (NGS) have transformed diagnostics and treatment for chronic myeloid leukemia (CML) and chronic lymphocytic leukemia (CLL). Despite their robustness, Illumina platforms require substantial initial infrastructure and large sample volumes. In contrast, Oxford Nanopore Technologies (ONT) provides a cost-effective alternative, albeit with some accuracy and throughput challenges. In our study, we integrated publicly available Illumina RNA sequencing data with machine learning methods to classify CML and CLL samples into transcriptomic subtypes using nanopore RNA sequencing data. Reference data were sourced from the Gene Expression Omnibus and cBioPortal and processed with the oposSOM package to create self-organizing map (SOM) transcriptome maps and identify gene modules. Samples from 25 patients were sequenced on the MinION platform, and the resulting data were projected onto these maps using support vector regression for classification.
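The projection step can be illustrated with a minimal sketch on synthetic data. It is not the study's pipeline. It assumes that the reference SOM is summarized as a metagene-by-sample matrix, that a linear support vector regression is fitted with genes as training instances, and that a subtype is assigned by correlation with subtype centroids. All function names, parameter values, and data below are hypothetical.

```python
import numpy as np
from sklearn.svm import SVR


def project_sample(ref_expr, metagenes, new_expr):
    """Project one new sample into metagene space (assumed scheme).

    ref_expr  : genes x reference-samples expression matrix
    metagenes : metagenes x reference-samples matrix from the trained SOM
    new_expr  : expression vector of the new sample (one value per gene)
    """
    # Learn how the new sample's expression relates to the reference samples,
    # using genes as training instances.
    model = SVR(kernel="linear", C=1.0, epsilon=0.1)
    model.fit(ref_expr, new_expr)
    # Apply the same relation to the metagene profiles.
    return model.predict(metagenes)


def assign_subtype(profile, centroids):
    """Return the subtype whose centroid correlates best with the profile."""
    scores = {name: np.corrcoef(profile, c)[0, 1] for name, c in centroids.items()}
    return max(scores, key=scores.get), scores


# Synthetic reference: 16 metagenes, 20 samples, two hypothetical subtypes A and B.
rng = np.random.default_rng(0)
n_genes, n_ref, n_meta = 500, 20, 16
metagenes = rng.normal(size=(n_meta, n_ref))
metagenes[:8, :10] += 2.0   # module block up in subtype A samples
metagenes[8:, 10:] += 2.0   # module block up in subtype B samples

# Each gene follows one metagene plus noise.
gene_to_meta = rng.integers(0, n_meta, size=n_genes)
ref_expr = metagenes[gene_to_meta] + rng.normal(scale=0.3, size=(n_genes, n_ref))

centroids = {
    "A": metagenes[:, :10].mean(axis=1),
    "B": metagenes[:, 10:].mean(axis=1),
}

# A noisy new sample resembling subtype A, standing in for a shallow ONT run.
new_expr = ref_expr[:, :10].mean(axis=1) + rng.normal(scale=0.3, size=n_genes)

projected = project_sample(ref_expr, metagenes, new_expr)
label, scores = assign_subtype(projected, centroids)
print(label, scores)
```

Fitting over genes rather than samples keeps the regression well determined even when only a few reference samples are available, which is one plausible reason to use this design for low-depth data.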
For CML, gene modules related to metabolism and proliferation were upregulated in the chronic phase, while angiogenesis and inflammation pathways were prominent in the blast crisis phase. Projection of the ONT-sequenced samples onto the CML SOM space showed that these samples are transitioning from the chronic to the blast crisis phase.
CLL samples were classified into eight distinct transcriptomic subtypes, each associated with its own marker gene signature. Projection of the ONT CLL samples onto this SOM space indicated that they belonged to a transcriptomic subtype characterized by low expression of ZAP70 and CD38 and a favorable prognosis.
Integrating shallow ONT sequencing with public datasets and machine learning algorithms can effectively subtype CML and CLL.
