

Transcriptome profiling by combined machine learning and statistical R analysis identifies TMEM236 as a potential novel diagnostic biomarker for colorectal cancer

Maurya, Neha Shree and Kushwaha, Sandeep and Chawade, Aakash and Mani, Ashutosh (2021). Transcriptome profiling by combined machine learning and statistical R analysis identifies TMEM236 as a potential novel diagnostic biomarker for colorectal cancer. Scientific Reports. 11, 14304
[Research article]


Abstract

Colorectal cancer (CRC) is a common cause of cancer-related deaths worldwide. The CRC mRNA gene expression dataset from The Cancer Genome Atlas (TCGA), containing 644 CRC tumor and 51 normal samples, was pre-processed to identify significant differentially expressed genes (DEGs). Two feature selection techniques, least absolute shrinkage and selection operator (LASSO) and Relief, were used together with class balancing to obtain features (genes) of high importance. The CRC dataset was then classified with three machine learning (ML) algorithms: random forest (RF), K-nearest neighbour (KNN), and artificial neural networks (ANN). There were 2933 significant DEGs, of which 1832 were upregulated and 1101 downregulated. The CRC gene expression dataset had 23,186 features. LASSO performed better than Relief for classifying tumor and normal samples: with LASSO-selected features, RF, KNN, and ANN each reached an accuracy of 100%, whereas with Relief they reached 79.5%, 85.05%, and 100%, respectively. LASSO and the DEGs shared 38 features, of which only 5 genes (VSTM2A, NR5A2, TMEM236, GDLN, and ETFDH) gave statistically significant results in survival analysis. Functional review and analysis of the selected genes narrowed these 5 genes down to 2, VSTM2A and TMEM236. Expression of TMEM236 was significantly and markedly reduced in the dataset, which warrants its assessment as a novel biomarker for CRC diagnosis.
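
The step in which significant DEGs are split into upregulated and downregulated genes and then intersected with the features chosen by a selector such as LASSO can be illustrated with a minimal Python sketch. It is not the authors' code (the title indicates the analysis was done in R). The cutoffs (|log2 fold change| >= 1, adjusted p < 0.05), the GeneStat type, the function names, and the toy gene names and values are all assumptions made for illustration; none of them come from the article.

```python
from typing import Dict, List, NamedTuple, Set, Tuple


class GeneStat(NamedTuple):
    """Per-gene differential expression summary (hypothetical structure)."""
    log2_fold_change: float  # tumor vs. normal; negative means lower in tumor
    adjusted_p_value: float


def significant_degs(
    stats: Dict[str, GeneStat],
    lfc_cutoff: float = 1.0,   # assumed cutoff, not taken from the article
    p_cutoff: float = 0.05,    # assumed cutoff, not taken from the article
) -> Tuple[Set[str], Set[str]]:
    """Return (upregulated, downregulated) gene sets passing both cutoffs."""
    up: Set[str] = set()
    down: Set[str] = set()
    for gene, stat in stats.items():
        if stat.adjusted_p_value >= p_cutoff:
            continue  # not statistically significant
        if abs(stat.log2_fold_change) < lfc_cutoff:
            continue  # effect size too small
        if stat.log2_fold_change > 0:
            up.add(gene)
        else:
            down.add(gene)
    return up, down


def common_features(selected: Set[str], up: Set[str], down: Set[str]) -> List[str]:
    """Genes that are both selected features and significant DEGs, sorted."""
    return sorted(selected & (up | down))


# Toy, made-up values with placeholder gene names (illustration only).
toy_stats = {
    "GENE_A": GeneStat(2.3, 0.001),
    "GENE_B": GeneStat(-1.8, 0.004),
    "GENE_C": GeneStat(0.4, 0.0005),   # significant but small fold change
    "GENE_D": GeneStat(-2.5, 0.20),    # large fold change but not significant
    "GENE_E": GeneStat(-3.1, 0.0001),
}
# Placeholder for the output of a feature selector such as LASSO.
toy_selected = {"GENE_B", "GENE_C", "GENE_E", "GENE_F"}

up, down = significant_degs(toy_stats)
common = common_features(toy_selected, up, down)
print("upregulated:", sorted(up))
print("downregulated:", sorted(down))
print("common with selected features:", common)
```

In the study itself the corresponding counts were 1832 upregulated and 1101 downregulated DEGs, with 38 features shared between LASSO and the DEGs; the sketch only shows the shape of that filtering and intersection.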

Authors/Creators:Maurya, Neha Shree and Kushwaha, Sandeep and Chawade, Aakash and Mani, Ashutosh
Title:Transcriptome profiling by combined machine learning and statistical R analysis identifies TMEM236 as a potential novel diagnostic biomarker for colorectal cancer
Series Name/Journal:Scientific Reports
Year of publishing:2021
Volume:11
Article number:14304
Number of Pages:11
Publisher:NATURE RESEARCH
ISSN:2045-2322
Language:English
Publication Type:Research article
Article category:Scientific peer reviewed
Version:Published version
Copyright:Creative Commons: Attribution 4.0
Full Text Status:Public
Subjects:(A) Swedish standard research categories 2011 > 3 Medical and Health Sciences > 301 Basic Medicine > Medical Genetics
URN:NBN:urn:nbn:se:slu:epsilon-p-113234
Permanent URL:http://urn.kb.se/resolve?urn=urn:nbn:se:slu:epsilon-p-113234
Additional ID:
DOI:10.1038/s41598-021-92692-0
Web of Science (WoS):000677493500001
ID Code:25124
Faculty:LTV - Fakulteten för landskapsarkitektur, trädgårds- och växtproduktionsvetenskap
Department:(LTJ, LTV) > Department of Plant Breeding (from 130101)
Deposited By: SLUpub Connector
Deposited On:31 Aug 2021 06:25
Metadata Last Modified:31 Aug 2021 06:31
