Classification of Gene Expression Data using Gaussian Restricted Boltzmann Machine (GRBM) – An Application on Human Lung Adenocarcinoma data

Jit Gupta, Indranil Pradhan, Anupam Ghosh

Abstract

This article addresses the classification of gene expression data with a Gaussian Restricted Boltzmann Machine (GRBM), a neural-network-based machine learning model. An RBM is a generative stochastic artificial neural network that consists of a single layer of visible units and a single layer of hidden units. RBMs are usually trained with the contrastive divergence method to reconstruct or classify image data; in this work, we apply one to a binary classification problem: deciding from a person's gene expression values whether that person is affected by lung adenocarcinoma. To tackle the class imbalance problem, the safe-level SMOTE algorithm was used to oversample the minority class, and a Random Forest was used as a gene selector. Comparing the results produced by the RBM with those of a k-NN classifier and a decision tree classifier, we found that the k-NN classifier overfit the data, while the decision tree produced results comparable with the RBM, which indicates that our model learns the data efficiently and accurately. These results suggest that RBMs can be used in future classification tasks that deal with gene expression values, irrespective of the number of data points and the number of genes.
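To make the contrastive divergence step concrete, the following is a minimal NumPy sketch of a Gaussian-Bernoulli RBM trained with one-step contrastive divergence (CD-1). It is not the authors' implementation: the unit-variance visible units, the class name `GaussianRBM`, the learning rate, the layer sizes, and the synthetic stand-in for a standardized expression matrix are all assumptions made for illustration. Safe-level SMOTE oversampling and Random Forest gene selection are not shown.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class GaussianRBM:
    """Illustrative Gaussian-Bernoulli RBM; visible units assumed unit-variance."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases

    def hidden_probs(self, v):
        # P(h_j = 1 | v) for binary hidden units
        return sigmoid(v @ self.W + self.c)

    def visible_mean(self, h):
        # Mean of the Gaussian visible units given the hidden states
        return h @ self.W.T + self.b

    def cd1_step(self, v0, lr=0.01):
        """One full-batch CD-1 update; returns the mean squared reconstruction error."""
        h0 = self.hidden_probs(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        v1 = self.visible_mean(h0_sample)  # mean-field reconstruction
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        # Positive phase minus negative phase
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b += lr * (v0 - v1).mean(axis=0)
        self.c += lr * (h0 - h1).mean(axis=0)
        return float(np.mean((v0 - v1) ** 2))


# Synthetic stand-in for gene expression data: 40 samples, 20 selected genes.
# The two classes differ by a shift in the first five genes (assumption).
data_rng = np.random.default_rng(42)
X = data_rng.standard_normal((40, 20))
y = np.repeat([0, 1], 20)
X[y == 1, :5] += 1.5
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each gene

rbm = GaussianRBM(n_visible=20, n_hidden=8)
errors = [rbm.cd1_step(X, lr=0.01) for _ in range(50)]

# Hidden activations can serve as features for a downstream binary classifier.
features = rbm.hidden_probs(X)
```

In this sketch the RBM acts only as a feature extractor; a classifier such as logistic regression would still need to be fitted on `features` and the labels `y` to obtain class predictions.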

Article Details

How to Cite
Gupta, J., Pradhan, I., & Ghosh, A. (2017). Classification of Gene Expression Data using Gaussian Restricted Boltzmann Machine (GRBM) – An Application on Human Lung Adenocarcinoma data. International Journal on Recent and Innovation Trends in Computing and Communication, 5(6), 56–61. https://doi.org/10.17762/ijritcc.v5i6.719
Section
Articles