%0 Journal Article
%A Roozbeh, Mahdi
%A Maanavi, Monireh
%T Mammalian Eye Gene Expression Using Support Vector Regression to Evaluate a Strategy for Detecting Human Eye Disease
%J Iranian Journal Of Health Sciences
%V 10
%N 2
%U http://jhs.mazums.ac.ir/article-1-792-en.html
%R 10.18502/jhs.v10i2.9763
%D 2022
%K High-dimensional data set, Ordinary least square method, Outliers, Robust regression
%X Background and purpose: Machine learning is a class of modern, powerful tools that can solve many important problems people face today. Support vector regression (SVR), a member of the machine learning family, is a method for building regression models and has proven effective for real-valued function estimation. As a supervised-learning approach, SVR is trained with a symmetric loss function, which penalizes overestimates and underestimates equally. High-dimensional datasets are currently among the most challenging problems in data analysis. The main difficulties with high-dimensional data are estimating the coefficients and interpreting the fitted model. Classical methods are not applicable to such problems because of the large number of predictor variables. SVR is an excellent alternative for analyzing such datasets. One of its main advantages is that its computational complexity does not depend on the dimensionality of the input space. It also has excellent generalization capability and high prediction accuracy. Methods: SVR is one of the best methods for analyzing high-dimensional datasets. It is a reliable and robust approach that gives a good fit with high accuracy. SVR uses the same principles as the support vector machine for classification, with only a few minor differences. Results: Techniques for analyzing high-dimensional datasets are important because such datasets arise frequently in medical science and gene expression studies. They are difficult to analyze because classical methods cannot be used for estimation or interpretation, so alternative methods are required. SVR is one of the best methods that can be applied.
In this research, SVR was applied to a real high-dimensional dataset on gene expression in eye disease and compared with two well-known methods: LASSO and sparse least trimmed squares (sparse LTS). Based on the numerical results, SVR and sparse LTS performed better than LASSO because the real dataset contained outliers (observations with large residuals). Conclusions: SVR was the best method for modeling and predicting the high-dimensional mammalian eye dataset, because it was not affected by the corruptive impact of the outliers and it had the smallest MSE (mean squared error), MAE (mean absolute error), and RMSE (root mean squared error) fitting criteria in comparison with classical methods such as LASSO and sparse LTS. Sparse LTS, in turn, performed better than LASSO. Moreover, stabilization of the data and freedom from having to obtain the regularization parameter by running a complicated algorithm, which decreased the computational cost dramatically, were valuable advantages of this technique over the classical methods.
%> http://jhs.mazums.ac.ir/article-1-792-en.pdf
%P 14-28
%& 14
%! Mammalian eye gene expression using support vector regression
%9 Original Article
%L A-10-741-3
%+ Semnan University
%G eng
%[ 2022