<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Bioeng. Biotechnol.</journal-id>
<journal-title>Frontiers in Bioengineering and Biotechnology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Bioeng. Biotechnol.</abbrev-journal-title>
<issn pub-type="epub">2296-4185</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fbioe.2020.00254</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Bioengineering and Biotechnology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Early Diagnosis of Hepatocellular Carcinoma Using Machine Learning Method</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Zhang</surname> <given-names>Zi-Mei</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/879992/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Tan</surname> <given-names>Jiu-Xin</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/933647/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Wang</surname> <given-names>Fang</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/933657/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Dao</surname> <given-names>Fu-Ying</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/640027/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Zhang</surname> <given-names>Zhao-Yue</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/933731/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Lin</surname> <given-names>Hao</given-names></name>
<xref ref-type="corresp" rid="c001"><sup>&#x002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/182351/overview"/>
</contrib>
</contrib-group>
<aff><institution>Key Laboratory for Neuro-Information of Ministry of Education, School of Life Sciences and Technology, Center for Informational Biology, University of Electronic Science and Technology of China</institution>, <addr-line>Chengdu</addr-line>, <country>China</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Meng Zhou, Wenzhou Medical University, China</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Lei Deng, Central South University, China; Shaherin Basith, Ewha Womans University, South Korea</p></fn>
<corresp id="c001">&#x002A;Correspondence: Hao Lin, <email>hlin@uestc.edu.cn</email></corresp>
<fn fn-type="other" id="fn004"><p>This article was submitted to Bioinformatics and Computational Biology, a section of the journal Frontiers in Bioengineering and Biotechnology</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>27</day>
<month>03</month>
<year>2020</year>
</pub-date>
<pub-date pub-type="collection">
<year>2020</year>
</pub-date>
<volume>8</volume>
<elocation-id>254</elocation-id>
<history>
<date date-type="received">
<day>02</day>
<month>01</month>
<year>2020</year>
</date>
<date date-type="accepted">
<day>12</day>
<month>03</month>
<year>2020</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2020 Zhang, Tan, Wang, Dao, Zhang and Lin.</copyright-statement>
<copyright-year>2020</copyright-year>
<copyright-holder>Zhang, Tan, Wang, Dao, Zhang and Lin</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>Hepatocellular carcinoma (HCC) is a serious cancer that ranks fourth among causes of cancer-related death worldwide. More accurate diagnostic models are therefore urgently needed to aid early HCC diagnosis in clinical settings and thus improve HCC treatment and survival. Several conventional methods have been used to discriminate HCC from cirrhosis tissues in patients without HCC (CwoHCC). However, their recognition success rates are still far from satisfactory. In this study, we applied a machine learning-based computational approach to a set of microarray data generated from 1091 HCC samples and 242 CwoHCC samples. The within-sample relative expression orderings (REOs) method was used to extract numerical descriptors from the gene expression profile datasets. After removing irrelevant features using minimum redundancy maximum relevance (mRMR) with incremental feature selection, we obtained an &#x201C;11-gene-pair&#x201D; signature that produced outstanding results. We further investigated the discriminative capability of the &#x201C;11-gene-pair&#x201D; signature for HCC recognition on several independent datasets. High accuracies were obtained, demonstrating that the selected gene pairs can serve as a signature for HCC. The proposed computational model can discriminate HCC and adjacent non-cancerous tissues from CwoHCC even for minimal biopsy specimens and inaccurately sampled specimens, which makes it practical and effective for aiding early HCC diagnosis at the individual level.</p>
</abstract>
<kwd-group>
<kwd>hepatocellular carcinoma</kwd>
<kwd>early diagnosis</kwd>
<kwd>cirrhosis</kwd>
<kwd>REOs</kwd>
<kwd>mRMR</kwd>
<kwd>support vector machine</kwd>
</kwd-group>
<counts>
<fig-count count="3"/>
<table-count count="3"/>
<equation-count count="3"/>
<ref-count count="77"/>
<page-count count="9"/>
<word-count count="0"/>
</counts>
</article-meta>
</front>
<body>
<sec id="S1">
<title>Introduction</title>
<p>Liver cancer is the fourth leading cause of death among patients with malignant tumors (<xref ref-type="bibr" rid="B32">Indhumathy et al., 2018</xref>; <xref ref-type="bibr" rid="B60">Villanueva, 2019</xref>). Hepatocellular carcinoma (HCC), which accounts for approximately 90% of all liver cancer cases, is frequently diagnosed at a late stage and has a poor prognosis. Thus, early HCC diagnosis is important for improving the prognosis and survival of patients (<xref ref-type="bibr" rid="B4">Asia-Pacific Working Party on Prevention of Hepatocellular Carcinoma, 2010</xref>). At present, diagnosis of HCC is based on laboratory investigations and imaging techniques (<xref ref-type="bibr" rid="B23">El-Serag, 2011</xref>; <xref ref-type="bibr" rid="B30">Hartke et al., 2017</xref>). Nevertheless, for HCC, and especially for early HCC, current serum biomarkers and tools, such as &#x03B1;-fetoprotein (AFP) and imaging techniques, display poor diagnostic sensitivity and specificity (<xref ref-type="bibr" rid="B53">Sun et al., 2015</xref>). Liver biopsy is regarded as a good diagnostic choice in clinical practice only when imaging techniques cannot provide accurate identification of HCC (<xref ref-type="bibr" rid="B50">Russo et al., 2018</xref>). However, the biopsy location is usually inaccurate, which might result in inaccurate sampling and thus decrease the diagnostic success rate (<xref ref-type="bibr" rid="B24">Forner et al., 2008</xref>). Therefore, it is necessary to design new methods or discover new diagnostic signatures to assist pathologists in the identification of early HCC using biopsy specimens, even inaccurately sampled biopsy specimens. It is likely that the adjacent non-cancerous tissues (cirrhosis tissues in patients with HCC or normal tissues in patients with HCC) are affected by cancerous tissues, so that they may acquire some molecular characteristics similar to those of cancerous tissues (<xref ref-type="bibr" rid="B9">Budhu et al., 2006</xref>; <xref ref-type="bibr" rid="B64">Wei et al., 2014</xref>).</p>
<p>Existing diagnostic signatures are mainly based on risk scores derived from the expression of signature genes (<xref ref-type="bibr" rid="B65">Wurmbach et al., 2007</xref>; <xref ref-type="bibr" rid="B3">Archer et al., 2009</xref>; <xref ref-type="bibr" rid="B74">Zhou et al., 2015</xref>, <xref ref-type="bibr" rid="B75">2017</xref>; <xref ref-type="bibr" rid="B49">Qu et al., 2019</xref>), which are highly sensitive to measurement batch effects (<xref ref-type="bibr" rid="B28">Guan et al., 2018</xref>) and are therefore difficult to apply in clinical settings. By contrast, the relative expression orderings (REO)-based strategy (<xref ref-type="bibr" rid="B70">Zhang et al., 2013</xref>; <xref ref-type="bibr" rid="B76">Zhou et al., 2013</xref>; <xref ref-type="bibr" rid="B61">Wang et al., 2015</xref>; <xref ref-type="bibr" rid="B34">Li et al., 2016</xref>), which was first proposed by <xref ref-type="bibr" rid="B21">Eddy et al. (2010)</xref>, is highly robust against experimental batch effects (<xref ref-type="bibr" rid="B10">Cai et al., 2015</xref>; <xref ref-type="bibr" rid="B1">Ao et al., 2016</xref>; <xref ref-type="bibr" rid="B73">Zhao et al., 2016</xref>), platform differences (<xref ref-type="bibr" rid="B27">Guan et al., 2016</xref>), partial RNA degradation (<xref ref-type="bibr" rid="B16">Chen et al., 2017</xref>; <xref ref-type="bibr" rid="B36">Liao et al., 2017</xref>, <xref ref-type="bibr" rid="B35">2018</xref>; <xref ref-type="bibr" rid="B56">Tang et al., 2018</xref>), and uncertain sampling sites within the same cancer tissue (<xref ref-type="bibr" rid="B17">Cheng et al., 2017</xref>). REOs have thus been used in the early diagnosis of HCC (<xref ref-type="bibr" rid="B2">Ao et al., 2018</xref>), gastric cancer (<xref ref-type="bibr" rid="B66">Yan et al., 2019</xref>), and colorectal cancer (<xref ref-type="bibr" rid="B29">Guan et al., 2019</xref>). <xref ref-type="bibr" rid="B2">Ao et al. (2018)</xref> obtained 19 gene pairs by using within-sample REOs. These gene pairs could improve early HCC diagnosis using biopsy specimens, even inaccurately sampled biopsy specimens. However, the REO-based rule for identifying HCC is so simple that some intrinsic relationships among these genes are not revealed. Moreover, the accuracy of HCC diagnosis can still be improved.</p>
<p>Machine learning is a good choice for uncovering underlying patterns (<xref ref-type="bibr" rid="B51">Stephenson et al., 2019</xref>). It has been widely employed in bioinformatics (<xref ref-type="bibr" rid="B11">Cao et al., 2017</xref>; <xref ref-type="bibr" rid="B5">Bao et al., 2019</xref>; <xref ref-type="bibr" rid="B19">Conover et al., 2019</xref>; <xref ref-type="bibr" rid="B47">Moritz et al., 2019</xref>; <xref ref-type="bibr" rid="B51">Stephenson et al., 2019</xref>; <xref ref-type="bibr" rid="B77">Zou and Ma, 2019</xref>; <xref ref-type="bibr" rid="B52">Sun et al., 2020</xref>). The current work aims to develop a machine learning-based method to diagnose HCC from within-sample REOs. By removing redundant REOs using minimum redundancy maximum relevance (mRMR), a diagnostic signature consisting of 11 gene pairs was obtained. The signature was also applied to several independent datasets to examine the performance of these gene pairs for HCC identification. High accuracies were obtained, suggesting that the 11-gene-pair signature based on mRMR is better than the existing 19-gene-pair signature (<xref ref-type="bibr" rid="B2">Ao et al., 2018</xref>).</p>
</sec>
<sec id="S2" sec-type="materials|methods">
<title>Materials and Methods</title>
<sec id="S2.SS1">
<title>Data Collection and Preprocessing</title>
<p>The gene expression profile datasets were obtained from the freely available GEO (<xref ref-type="bibr" rid="B6">Barrett et al., 2005</xref>) and TCGA (<xref ref-type="bibr" rid="B57">Tomczak et al., 2015</xref>) databases. Firstly, according to the type and sampling method of the samples, the data were derived from four types of samples: biopsy samples of HCC (D1), surgery samples of HCC (D2), biopsy samples of CwoHCC (D3), and surgery samples of CwoHCC (D4). To objectively evaluate the model, we separated the samples of each type (D1, D2, D3, and D4) into two data subsets: training datasets (80% of the samples of each type) and testing datasets (20% of the samples of each type). Finally, the training datasets contained 1091 HCC samples (112 biopsy samples of HCC and 979 surgery samples of HCC) and 242 CwoHCC samples (70 biopsy samples of CwoHCC and 172 surgery samples of CwoHCC). The testing datasets contained 73 biopsy samples (29 HCC samples and 44 CwoHCC samples) and 263 surgery samples (245 HCC samples and 18 CwoHCC samples). The independent datasets, which comprised surgical resection samples and biopsy samples, were used to evaluate the performance of the signature. We used the R package TCGAbiolinks (<xref ref-type="bibr" rid="B18">Colaprico et al., 2016</xref>) to download gene expression data for 371 HCC tissues and 50 normal tissues from patients in the TCGA data resource<sup><xref ref-type="fn" rid="footnote1">1</xref></sup> (up to October 19, 2019). The details are listed in <xref ref-type="supplementary-material" rid="TS1">Supplementary Table S1</xref>.</p>
<p>For the raw data (.CEL files) generated on the Affymetrix platform, the RMA (Robust Multi-array Average) algorithm was used for background adjustment. If a gene was matched to multiple probes, the arithmetic mean of their expression values was used as the gene expression level. For the datasets generated on Illumina platforms, we directly used the processed expression data.</p>
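<p>The probe-to-gene summarization described above can be illustrated with a minimal sketch. It assumes that probe-level values are already background-adjusted; the function name <monospace>collapse_probes</monospace>, the probe identifiers and the values are hypothetical and are not taken from the study.</p>
<preformat><![CDATA[
```python
from collections import defaultdict
from typing import Dict


def collapse_probes(probe_values: Dict[str, float],
                    probe_to_gene: Dict[str, str]) -> Dict[str, float]:
    """Average the values of all probes that map to the same gene."""
    by_gene = defaultdict(list)
    for probe, value in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene is not None:  # probes without a gene annotation are skipped
            by_gene[gene].append(value)
    # Arithmetic mean per gene, as described in the text.
    return {gene: sum(vals) / len(vals) for gene, vals in by_gene.items()}


# Hypothetical probe identifiers and values, for illustration only.
values = {"probe_1": 2.0, "probe_2": 4.0, "probe_3": 7.0, "probe_4": 1.0}
annotation = {"probe_1": "GENE_X", "probe_2": "GENE_X", "probe_3": "GENE_Y"}
print(collapse_probes(values, annotation))
```
]]></preformat>
<p>Skipping unannotated probes is a design choice of this sketch; the article does not state how such probes were handled.</p>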
</sec>
<sec id="S2.SS2">
<title>The Within&#x2212;Sample Relative Expression Orderings</title>
<p>Within a sample, the REO of two genes (<italic>a</italic> and <italic>b</italic>) is expressed as <italic>Ea</italic> &#x003E; <italic>Eb</italic> (or <italic>Ea</italic> &#x003C; <italic>Eb</italic>) if gene <italic>a</italic> has a higher (or lower) expression level than gene <italic>b</italic>. The REO pattern of a gene pair is regarded as stable if the same ordering is kept in at least 95% of the samples. A reversal gene pair is a gene pair with stable REOs in both cirrhosis tissues in patients without HCC (CwoHCC) samples and HCC samples, but with the REO pattern in one group reversed in the other (<italic>Ea</italic> &#x003C; <italic>Eb</italic> in CwoHCC samples but <italic>Ea</italic> &#x003E; <italic>Eb</italic> in HCC samples, or vice versa). Here, the reversal gene pairs were selected as the candidate REO signature for the identification of HCC. We then obtained the genes common to the training and validation datasets and their corresponding gene expression profiles. Subsequently, based on the gene expression profiles and the reversal gene pairs, we generated a new profile in which 1, 0, and &#x2212;1 represent <italic>Ea</italic> &#x003E; <italic>Eb</italic>, <italic>Ea</italic> &#x003C; <italic>Eb</italic>, and other cases (<italic>Ea</italic> or <italic>Eb</italic> does not exist), respectively.</p>
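<p>The encoding and the reversal-pair rule described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors&#x2019; implementation: the function names and toy expression values are hypothetical, and treating exact ties as &#x2212;1 is an assumption, since the text only assigns &#x2212;1 to missing genes.</p>
<preformat><![CDATA[
```python
from typing import Dict, List, Optional

Sample = Dict[str, float]  # gene symbol -> expression level within one sample


def reo_code(sample: Sample, gene_a: str, gene_b: str) -> int:
    """Return 1 if Ea > Eb, 0 if Ea < Eb, and -1 otherwise."""
    ea, eb = sample.get(gene_a), sample.get(gene_b)
    if ea is None or eb is None:
        return -1  # at least one gene is not measured in this sample
    if ea > eb:
        return 1
    if ea < eb:
        return 0
    return -1  # assumption: exact ties are grouped with the "other cases"


def stable_direction(samples: List[Sample], gene_a: str, gene_b: str,
                     threshold: float = 0.95) -> Optional[int]:
    """Return the ordering (1 or 0) kept in at least `threshold` of samples."""
    if not samples:
        return None
    codes = [reo_code(s, gene_a, gene_b) for s in samples]
    for direction in (1, 0):
        if codes.count(direction) / len(codes) >= threshold:
            return direction
    return None  # no ordering is stable in this group


def is_reversal_pair(hcc: List[Sample], cwohcc: List[Sample],
                     gene_a: str, gene_b: str) -> bool:
    """True when the pair is stable in both groups with opposite orderings."""
    d_hcc = stable_direction(hcc, gene_a, gene_b)
    d_cwo = stable_direction(cwohcc, gene_a, gene_b)
    return d_hcc is not None and d_cwo is not None and d_hcc != d_cwo


# Toy data, invented for illustration (not taken from the study).
hcc_samples = [{"GENE_A": 8.1, "GENE_B": 5.0}, {"GENE_A": 7.4, "GENE_B": 6.9}]
cwohcc_samples = [{"GENE_A": 4.2, "GENE_B": 6.3}, {"GENE_A": 3.9, "GENE_B": 5.5}]
print(is_reversal_pair(hcc_samples, cwohcc_samples, "GENE_A", "GENE_B"))  # True
print([reo_code(s, "GENE_A", "GENE_C") for s in hcc_samples])  # [-1, -1]
```
]]></preformat>
<p>Because only the within-sample ordering is used, the code is unchanged by any monotonic rescaling of a sample&#x2019;s expression values.</p>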
</sec>
<sec id="S2.SS3">
<title>Feature Selection Through mRMR and IFS Methods</title>
<p>Based on the new profiles, mRMR (minimum Redundancy Maximum Relevance) (<xref ref-type="bibr" rid="B48">Peng et al., 2005</xref>) was applied to rank the gene pairs according to maximum relevance to the disease type and minimum redundancy with other gene pairs.</p>
<p>Here, &#x03A9; represents all 857 gene pairs, <italic>gi</italic> is a gene pair from the 857 gene pairs and <italic>T</italic> is the disease type. The mutual information (<italic>I</italic>) can be formulated as:</p>
<disp-formula id="S2.E1">
<label>(1)</label>
<mml:math id="M1">
<mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mi>I</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mi>T</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo largeop="true" symmetric="true">&#x222B;</mml:mo>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mi>T</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mpadded width="+1.7pt">
<mml:mi>ln</mml:mi>
</mml:mpadded>
<mml:mo>&#x2061;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mi>T</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>T</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo mathvariant="italic" rspace="0pt">d</mml:mo>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo mathvariant="italic" rspace="0pt">d</mml:mo>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<p>The mRMR function is defined as:</p>
<disp-formula id="S2.E2">
<label>(2)</label>
<mml:math id="M2">
<mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>R</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>M</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>R</mml:mi>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mi mathvariant="normal">&#x03A9;</mml:mi>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mfrac>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:munder>
<mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:mi mathvariant="normal">&#x03A9;</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>I</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mi>T</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:msup>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mi mathvariant="normal">&#x03A9;</mml:mi>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mfrac>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:munder>
<mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>&#x2208;</mml:mo>
<mml:mi mathvariant="normal">&#x03A9;</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>I</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>I(gi, T)</italic> is the mutual information between the gene pair <italic>gi</italic> and the disease type <italic>T</italic>, and <italic>I(gi, gj)</italic> is the mutual information between <italic>gi</italic> and <italic>gj</italic>. We then used the incremental feature selection (IFS) method (<xref ref-type="bibr" rid="B54">Tan et al., 2019</xref>; <xref ref-type="bibr" rid="B67">Yang et al., 2019</xref>) to select the optimal gene pairs from the 857 mRMR-ranked gene pairs as the diagnostic signature. Details of IFS can be found in <xref ref-type="bibr" rid="B20">Dao et al. (2019)</xref>.</p>
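<p>The ranking and selection steps can be sketched as below. Several points are assumptions rather than details confirmed by the article: because the REO codes are discrete (1, 0, &#x2212;1), the integral of Eq. 1 is replaced by a sum over observed values; the ranking uses a common greedy form (relevance minus mean redundancy with already selected pairs), which may differ from the implementation actually used; and <monospace>evaluate</monospace> is a hypothetical callback standing in for a cross-validated classifier score.</p>
<preformat><![CDATA[
```python
from collections import Counter
from math import log
from typing import Callable, Dict, List, Sequence, Tuple


def mutual_information(x: Sequence[int], y: Sequence[int]) -> float:
    """Discrete mutual information (natural log) between two code vectors."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # p(a,b) * ln(p(a,b) / (p(a) p(b))), written with raw counts.
    return sum((c / n) * log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def mrmr_rank(features: Dict[str, List[int]], labels: List[int]) -> List[str]:
    """Greedy ranking: maximise relevance minus mean redundancy."""
    relevance = {name: mutual_information(col, labels)
                 for name, col in features.items()}
    remaining, ranked = list(features), []
    while remaining:
        best_name, best_score = None, float("-inf")
        for name in remaining:
            redundancy = 0.0
            if ranked:
                redundancy = sum(mutual_information(features[name], features[s])
                                 for s in ranked) / len(ranked)
            score = relevance[name] - redundancy
            if score > best_score:
                best_name, best_score = name, score
        ranked.append(best_name)
        remaining.remove(best_name)
    return ranked


def incremental_feature_selection(
        ranked: List[str],
        evaluate: Callable[[List[str]], float]) -> Tuple[List[str], float]:
    """Score the top-k prefixes of the ranking and keep the smallest best one."""
    best_k, best_score = 0, float("-inf")
    for k in range(1, len(ranked) + 1):
        score = evaluate(ranked[:k])  # e.g. cross-validated accuracy
        if score > best_score:
            best_k, best_score = k, score
    return ranked[:best_k], best_score


# Toy REO codes for three hypothetical gene pairs (not data from the study).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy = {"pair_1": [1, 1, 1, 0, 0, 0, 0, 0],
       "pair_2": [1, 1, 1, 0, 0, 0, 0, 0],   # exact copy of pair_1
       "pair_3": [1, 1, 0, 1, 1, 0, 0, 0]}
print(mrmr_rank(toy, labels))
```
]]></preformat>
<p>In the toy example the duplicated pair is ranked last even though it is as relevant as the first pair, which is the behaviour the redundancy term is meant to produce.</p>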
</sec>
<sec id="S2.SS4">
<title>Classification Through SVM</title>
<p>Support Vector Machine (SVM) is a powerful classification method that has been used extensively in biological data mining (<xref ref-type="bibr" rid="B12">Cao et al., 2014</xref>; <xref ref-type="bibr" rid="B43">Manavalan and Lee, 2017</xref>; <xref ref-type="bibr" rid="B38">Manavalan et al., 2017</xref>, <xref ref-type="bibr" rid="B45">2018b</xref>,<xref ref-type="bibr" rid="B41">2019c</xref>,<xref ref-type="bibr" rid="B42">d</xref>; <xref ref-type="bibr" rid="B55">Tang et al., 2017</xref>; <xref ref-type="bibr" rid="B8">Bu et al., 2018</xref>; <xref ref-type="bibr" rid="B71">Zhang et al., 2018</xref>; <xref ref-type="bibr" rid="B14">Chao et al., 2019a</xref>, <xref ref-type="bibr" rid="B15">b</xref>; <xref ref-type="bibr" rid="B63">Wang et al., 2019</xref>). Here, the free package LibSVM (version 3.23) (<xref ref-type="bibr" rid="B13">Chang and Lin, 2011</xref>) was used to implement SVM. Because of its good performance on non-linear problems, the RBF (radial basis function) kernel was used. The values of the two SVM parameters <italic>C</italic> and &#x03B3; were determined by grid search with fivefold cross-validation. In the present work, the optimal values are <italic>C</italic> = 0.125 and &#x03B3; = 0.5.</p>
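<p>A grid search with fivefold cross-validation can be sketched in plain Python as follows. This sketch does not call LibSVM: <monospace>fit_and_score</monospace> is a hypothetical callback that would train an RBF-kernel SVM on the training indices and return the accuracy on the held-out indices. The power-of-two exponent ranges are an assumption, since the article does not report the grid that was searched; they are chosen here only so that the reported optimum (<italic>C</italic> = 0.125, &#x03B3; = 0.5) lies on the grid.</p>
<preformat><![CDATA[
```python
from itertools import product
from typing import Callable, Iterable, List, Tuple

Fold = Tuple[List[int], List[int]]  # (training indices, held-out indices)


def power_of_two_grid(log2_c: Iterable[int],
                      log2_gamma: Iterable[int]) -> List[Tuple[float, float]]:
    """All (C, gamma) combinations for the given base-2 exponents."""
    return [(2.0 ** c, 2.0 ** g) for c, g in product(log2_c, log2_gamma)]


def kfold_indices(n_samples: int, k: int = 5) -> List[Fold]:
    """Split range(n_samples) into k contiguous, near-equal held-out folds.

    Simplification: real data should be shuffled (ideally stratified by
    class) before splitting.
    """
    folds, start = [], 0
    for i in range(k):
        size = n_samples // k + (1 if i < n_samples % k else 0)
        held_out = list(range(start, start + size))
        train = [j for j in range(n_samples) if j < start or j >= start + size]
        folds.append((train, held_out))
        start += size
    return folds


def grid_search(grid: List[Tuple[float, float]], folds: List[Fold],
                fit_and_score: Callable[[float, float, List[int], List[int]], float]
                ) -> Tuple[float, float, float]:
    """Return (mean accuracy, C, gamma) for the best grid point."""
    best = None
    for c_value, gamma in grid:
        mean_acc = sum(fit_and_score(c_value, gamma, tr, te)
                       for tr, te in folds) / len(folds)
        if best is None or mean_acc > best[0]:
            best = (mean_acc, c_value, gamma)
    return best


# Assumed exponent ranges (not stated in the article).
grid = power_of_two_grid(range(-5, 16, 2), range(3, -16, -2))
folds = kfold_indices(n_samples=10, k=5)
print(len(grid), (0.125, 0.5) in grid)
```
]]></preformat>
<p>Passing the classifier as a callback keeps the search logic independent of any particular SVM library.</p>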
</sec>
<sec id="S2.SS5">
<title>Performance Metrics</title>
<p>Sensitivity, specificity, and accuracy (<xref ref-type="bibr" rid="B7">Basith et al., 2019</xref>; <xref ref-type="bibr" rid="B44">Manavalan et al., 2018a</xref>, <xref ref-type="bibr" rid="B46">c</xref>, <xref ref-type="bibr" rid="B39">2019a</xref>,<xref ref-type="bibr" rid="B40">b</xref>) were used to evaluate the performance of the prediction methods. Here, HCC samples were regarded as positive samples and CwoHCC samples as negative samples. These measures are calculated as:</p>
<disp-formula id="S2.E3">
<label>(3)</label>
<mml:math id="M3">
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mtable displaystyle="true" rowspacing="0pt">
<mml:mtr>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mtext>Sensitivity</mml:mtext>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mtext>TP</mml:mtext>
<mml:mrow>
<mml:mpadded width="+3.3pt">
<mml:mtext>TP</mml:mtext>
</mml:mpadded>
<mml:mo rspace="5.8pt">+</mml:mo>
<mml:mtext>FN</mml:mtext>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mtext>Specificity</mml:mtext>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mtext>TN</mml:mtext>
<mml:mrow>
<mml:mpadded width="+3.3pt">
<mml:mtext>TN</mml:mtext>
</mml:mpadded>
<mml:mo rspace="5.8pt">+</mml:mo>
<mml:mtext>FP</mml:mtext>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mtext>Accuracy</mml:mtext>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mpadded width="+3.3pt">
<mml:mtext>TP</mml:mtext>
</mml:mpadded>
<mml:mo rspace="5.8pt">+</mml:mo>
<mml:mtext>TN</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mpadded width="+3.3pt">
<mml:mtext>TP</mml:mtext>
</mml:mpadded>
<mml:mo rspace="5.8pt">+</mml:mo>
<mml:mpadded width="+3.3pt">
<mml:mtext>FP</mml:mtext>
</mml:mpadded>
<mml:mo rspace="5.8pt">+</mml:mo>
<mml:mpadded width="+3.3pt">
<mml:mtext>TN</mml:mtext>
</mml:mpadded>
<mml:mo rspace="5.8pt">+</mml:mo>
<mml:mtext>FN</mml:mtext>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
<mml:mi/>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where TP, FN, TN, and FP denote the numbers of true positives, false negatives, true negatives, and false positives, respectively. Additionally, the ROC curve and the AUC were used to assess the trade-off between the true positive rate and the false positive rate.</p>
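<p>As a worked illustration of Eq. 3, the sketch below computes the three measures from confusion counts. The function names are hypothetical. The counts in the usage example are pooled here from the per-dataset results in <xref ref-type="table" rid="T2">Table 2</xref>, which is an assumption about how the 1190 HCC and 62 CwoHCC validation samples are aggregated.</p>
<preformat><![CDATA[
```python
from typing import Dict


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Sensitivity, specificity and accuracy as defined in Eq. 3."""
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
    }


# Pooled counts (assumed aggregation of Table 2); HCC is the positive class.
pooled = diagnostic_metrics(tp=1094, fn=96, tn=62, fp=0)
print({name: round(100 * value, 2) for name, value in pooled.items()})
```
]]></preformat>
<p>With these pooled counts, the sensitivity evaluates to 1094/1190, i.e., 91.93%, and the specificity to 100%.</p>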
</sec>
</sec>
<sec id="S3">
<title>Results</title>
<sec id="S3.SS1">
<title>Identification of the Diagnostic Signature</title>
<p>The flow diagram for identifying and validating the diagnostic signature is shown in <xref ref-type="fig" rid="F1">Figure 1</xref>. Firstly, a total of 13,586,043 stable gene pairs, each with an identical REO in at least 95% of the 1091 HCC samples, were identified. Similarly, we identified 14,475,509 stable gene pairs with an identical REO in at least 95% of the 242 CwoHCC samples. Then, we obtained 857 reversal gene pairs between the HCC samples and CwoHCC samples in the training data (see section &#x201C;Materials and Methods&#x201D;). Based on the new profiles (see section &#x201C;Materials and Methods&#x201D;), the 11 gene pairs shown in <xref ref-type="table" rid="T1">Table 1</xref> were selected by using mRMR with SVM and regarded as the diagnostic signature. The 11-gene-pair signature achieved an accuracy of 100% on the training data for HCC identification. <xref ref-type="fig" rid="F2">Figure 2</xref> shows the IFS process (blue curve).</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption><p>Flowchart presenting the process of developing and validating the HCC diagnostic signature.</p></caption>
<graphic xlink:href="fbioe-08-00254-g001.tif"/>
</fig>
<table-wrap position="float" id="T1">
<label>TABLE 1</label>
<caption><p>The 11&#x2212;gene&#x2212;pair signature for early diagnosis of HCC.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Signature</td>
<td valign="top" align="left">Gene <italic>a</italic></td>
<td valign="top" align="left">Gene <italic>b</italic></td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">pair1</td>
<td valign="top" align="left">TRMT112</td>
<td valign="top" align="left">SF3B1</td>
</tr>
<tr>
<td valign="top" align="left">pair2</td>
<td valign="top" align="left">MFSD5</td>
<td valign="top" align="left">COLEC10</td>
</tr>
<tr>
<td valign="top" align="left">pair3</td>
<td valign="top" align="left">FDXR</td>
<td valign="top" align="left">APC2</td>
</tr>
<tr>
<td valign="top" align="left">pair4</td>
<td valign="top" align="left">LAMC1</td>
<td valign="top" align="left">CHST4</td>
</tr>
<tr>
<td valign="top" align="left">pair5</td>
<td valign="top" align="left">UBE4B</td>
<td valign="top" align="left">HGF</td>
</tr>
<tr>
<td valign="top" align="left">pair6</td>
<td valign="top" align="left">NCAPH2</td>
<td valign="top" align="left">APC2</td>
</tr>
<tr>
<td valign="top" align="left">pair7</td>
<td valign="top" align="left">HSPH1</td>
<td valign="top" align="left">MTHFD2</td>
</tr>
<tr>
<td valign="top" align="left">pair8</td>
<td valign="top" align="left">TMEM38B</td>
<td valign="top" align="left">AGO3</td>
</tr>
<tr>
<td valign="top" align="left">pair9</td>
<td valign="top" align="left">PLGRKT</td>
<td valign="top" align="left">COLEC10</td>
</tr>
<tr>
<td valign="top" align="left">pair10</td>
<td valign="top" align="left">HNF1A</td>
<td valign="top" align="left">APC2</td>
</tr>
<tr>
<td valign="top" align="left">pair11</td>
<td valign="top" align="left">ARPC2</td>
<td valign="top" align="left">SF3B1</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<attrib><italic>Gene <italic>a</italic> has a higher expression level than Gene <italic>b</italic> in HCC samples, whereas the ordering is reversed in CwoHCC samples.</italic></attrib>
</table-wrap-foot>
</table-wrap>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption><p>A plot showing the IFS procedure for identifying HCC. When the top 857 features optimized by mRMR were used to perform prediction, the overall success rate reached an IFS peak of 100% in fivefold cross-validation. The solid line represents the ROC curve. The dotted line represents random guessing.</p></caption>
<graphic xlink:href="fbioe-08-00254-g002.tif"/>
</fig>
</sec>
<sec id="S3.SS2">
<title>Examination of the Diagnostic Signature on Independent Datasets</title>
<p>Subsequently, we used biopsy and surgically resected samples to estimate the performance of the 11-gene-pair signature (see <xref ref-type="table" rid="T2">Table 2</xref>). For the 73 biopsy samples in the testing datasets, it yielded an accuracy of 100%, a sensitivity of 100%, and a specificity of 100%. For the 263 surgically resected samples in the testing datasets, its accuracy, sensitivity, and specificity were also all 100%. In the dataset GSE121248, all (100.0%) of the 70 HCC samples were correctly recognized as HCC. For surgically resected samples, 79.79% of the 475 HCC samples from 3 datasets (GSE109211, GSE112790, and GSE102079) were correctly classified. Moreover, the 11-gene-pair based model correctly identified the 371 HCC and the 50 normal tissues in patients with HCC (NwHCC) samples measured by RNA-seq, although no RNA-seq data were included in model training (<xref ref-type="table" rid="T2">Table 2</xref>). These results demonstrate that the 11-gene-pair signature can distinguish HCC from non-cancerous liver tissues and that the signature is robust to clinicopathological variations. For the 1190 HCC samples and 62 CwoHCC samples, the sensitivity, specificity, and AUC were 91.93%, 100%, and 0.9597 [95% confidence interval (CI), 0.9519&#x2013;0.9674; see <xref ref-type="fig" rid="F3">Figure 3</xref>], respectively.</p>
<table-wrap position="float" id="T2">
<label>TABLE 2</label>
<caption><p>The performance of the signature in the validation datasets.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Datasets</td>
<td valign="top" align="center">NSnHCC</td>
<td valign="top" align="center">NSpCwoHCC</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Testing datasets (biopsy)</td>
<td valign="top" align="center">100% (29/29)</td>
<td valign="top" align="center">100% (44/44)</td>
</tr>
<tr>
<td valign="top" align="left">Testing datasets (surgery)</td>
<td valign="top" align="center">100% (245/245)</td>
<td valign="top" align="center">100% (18/18)</td>
</tr>
<tr>
<td valign="top" align="left">GSE109211</td>
<td valign="top" align="center">31.43% (44/140)</td>
<td valign="top" align="center">&#x2013;</td>
</tr>
<tr>
<td valign="top" align="left">GSE112790</td>
<td valign="top" align="center">100% (183/183)</td>
<td valign="top" align="center">&#x2013;</td>
</tr>
<tr>
<td valign="top" align="left">GSE102079</td>
<td valign="top" align="center">100% (152/152)</td>
<td valign="top" align="center">&#x2013;</td>
</tr>
<tr>
<td valign="top" align="left">GSE121248</td>
<td valign="top" align="center">100% (70/70)</td>
<td valign="top" align="center">&#x2013;</td>
</tr>
<tr>
<td valign="top" align="left">TCGA</td>
<td valign="top" align="center">100% (371/371)</td>
<td valign="top" align="center">&#x2013;</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<attrib><italic>NSnHCC, number (sensitivity) of HCC samples; NSpCwoHCC, number (specificity) of CwoHCC samples.</italic></attrib>
</table-wrap-foot>
</table-wrap>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption><p>Area under the receiver operating characteristic curve (AUC) for the validation data from public databases of biopsy and surgically resected HCC and CwoHCC samples. The solid line represents the ROC curve. The dotted line represents random guessing.</p></caption>
<graphic xlink:href="fbioe-08-00254-g003.tif"/>
</fig>
<p>For biopsy samples, all 80 cirrhosis tissues from patients with HCC (CwHCC) in GSE54236 and all 97 NwHCC biopsy tissues from 2 datasets (GSE64041 and GSE121248) were correctly classified as HCC. These results again indicate that the 11-gene-pair signature performs well: most adjacent non-cancerous tissues from patients with HCC (CwHCC and NwHCC) were correctly recognized, even in inaccurately sampled biopsy specimens. For surgically resected samples, 93.7% of the 254 CwHCC samples and 100% of the 644 NwHCC samples were correctly identified (see <xref ref-type="table" rid="T3">Table 3</xref>). Together, these results show again that the 11-gene-pair signature can be regarded as a key biological signature for diagnosing HCC patients.</p>
<table-wrap position="float" id="T3">
<label>TABLE 3</label>
<caption><p>Comparison of 11 gene pairs with existing methods on independent datasets.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Dataset</td>
<td valign="top" align="center" colspan="3">11-gene-pair</td>
<td valign="top" align="center" colspan="3">19-gene-pair</td>
</tr>
<tr>
<td valign="top" align="center"></td>
<td valign="top" align="center" colspan="3"><hr/></td>
<td valign="top" align="center" colspan="3"><hr/></td>
</tr>
<tr>
<td/>
<td valign="top" align="center">NSnHCC</td>
<td valign="top" align="center">NACwHCC</td>
<td valign="top" align="center">NANwHCC</td>
<td valign="top" align="center">NSnHCC</td>
<td valign="top" align="center">NACwHCC</td>
<td valign="top" align="center">NANwHCC</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" colspan="7"><bold>Datasets from surgical resection</bold></td>
</tr>
<tr>
<td valign="top" align="left">GSE6764</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">10/10(100.0%)</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">10/10(100.0%)</td>
<td valign="top" align="center">&#x2212;</td>
</tr>
<tr>
<td valign="top" align="left">GSE17548</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">18/20(90.0%)</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">18/20(90.0%)</td>
<td valign="top" align="center">&#x2212;</td>
</tr>
<tr>
<td valign="top" align="left">GSE17967</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">16/16(100.0%)</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">8/16(50.0%)</td>
<td valign="top" align="center">&#x2212;</td>
</tr>
<tr>
<td valign="top" align="left">GSE63898</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">168/168(100.0%)</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">168/168(100.0%)</td>
<td valign="top" align="center">&#x2212;</td>
</tr>
<tr>
<td valign="top" align="left">GSE25097</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">40/40(100.0%)</td>
<td valign="top" align="center">243/243(100.0%)</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">40/40(100.0%)</td>
<td valign="top" align="center">243/243(100.0%)</td>
</tr>
<tr>
<td valign="top" align="left">GSE62232</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">10/10(100.0%)</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">10/10(100.0%)</td>
</tr>
<tr>
<td valign="top" align="left">GSE36376</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">193/193(100.0%)</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">172/193(89.1%)</td>
</tr>
<tr>
<td valign="top" align="left">GSE39791</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">72/72(100.0%)</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">71/72(98.6%)</td>
</tr>
<tr>
<td valign="top" align="left">GSE41804</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">20/20(100.0%)</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">20/20(100.0%)</td>
</tr>
<tr>
<td valign="top" align="left">GSE112790</td>
<td valign="top" align="center">183/183(100.0%)</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">15/15(100.0%)</td>
<td valign="top" align="center">183/183(100.0%)</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">15/15(100.0%)</td>
</tr>
<tr>
<td valign="top" align="left">GSE102079</td>
<td valign="top" align="center">152/152(100.0%)</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">91/91(100.0%)</td>
<td valign="top" align="center">152/152(100.0%)</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">91/91(100.0%)</td>
</tr>
<tr>
<td valign="top" align="left">GSE109211</td>
<td valign="top" align="center">44/140(31.4%)</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">37/140(26.4%)</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">&#x2212;</td>
</tr>
<tr>
<td valign="top" align="left">Total</td>
<td valign="top" align="center">379/475(79.8%)</td>
<td valign="top" align="center">238/254(93.7%)</td>
<td valign="top" align="center">644/644(100.0%)</td>
<td valign="top" align="center">372/475(79.3%)</td>
<td valign="top" align="center">244/254(96.1%)</td>
<td valign="top" align="center">622/644(96.6%)</td>
</tr>
<tr>
<td valign="top" align="left" colspan="7"><bold>Datasets from biopsy</bold></td>
</tr>
<tr>
<td valign="top" align="left">GSE121248</td>
<td valign="top" align="center">70/70(100.0%)</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">37/37(100.0%)</td>
<td valign="top" align="center">70/70(100.0%)</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">37/37(100.0%)</td>
</tr>
<tr>
<td valign="top" align="left">GSE64041</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">60/60(100.0%)</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">60/60(100.0%)</td>
</tr>
<tr>
<td valign="top" align="left">GSE54236</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">80/80(100.0%)</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">&#x2212;</td>
<td valign="top" align="center">62/80(77.5%)</td>
<td valign="top" align="center">&#x2212;</td>
</tr>
<tr>
<td valign="top" align="left">Total</td>
<td valign="top" align="center">70/70(100.0%)</td>
<td valign="top" align="center">80/80(100.0%)</td>
<td valign="top" align="center">97/97(100.0%)</td>
<td valign="top" align="center">70/70(100.0%)</td>
<td valign="top" align="center">62/80(77.5%)</td>
<td valign="top" align="center">97/97(100.0%)</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<attrib><italic>NSnHCC, number (sensitivity) of HCC samples; NACwHCC, number (accuracy) of cirrhosis tissues from patients with HCC classified as HCC; NANwHCC, number (accuracy) of normal tissues from patients with HCC classified as HCC.</italic></attrib>
</table-wrap-foot>
</table-wrap>
</sec>
<sec id="S3.SS3">
<title>Comparison With Existing Methods</title>
<p>To further demonstrate the performance of our proposed signature, we compared our method with the 19-gene-pair-based model and recorded the results in <xref ref-type="table" rid="T3">Table 3</xref>. In earlier work, <xref ref-type="bibr" rid="B2">Ao et al. (2018)</xref> found that a 19-gene-pair signature can serve as a diagnostic signature to discriminate HCC and adjacent non-cancerous tissues (cirrhosis or normal) from CwoHCC. Their model achieved an accuracy of 99.69%, which is lower than that of our 11-gene-pair-based model.</p>
<p>For biopsy samples, our proposed model correctly identified the 70 HCC samples in GSE121248 and the 97 NwHCC biopsy tissues from 2 datasets (GSE64041 and GSE121248), an accuracy of 100%. Moreover, all 80 CwHCC samples in GSE54236 were predicted as HCC, which raises the accuracy from 77.5% for the 19-gene-pair-based model to 100% for the 11-gene-pair-based model.</p>
<p>For surgically resected samples, the 11-gene-pair predictor correctly classified as HCC 79.8% of the 475 HCC samples from 3 datasets (GSE109211, GSE112790, and GSE102079) and 93.7% of the 254 CwHCC samples from 5 datasets (GSE6764, GSE17548, GSE25097, GSE17967, and GSE63898). Moreover, the model accurately predicted all 644 NwHCC tissues pooled from 7 datasets (GSE25097, GSE62232, GSE36376, GSE39791, GSE41804, GSE112790, and GSE102079). The sensitivity for HCC samples increased to 79.8% (19-gene-pair: 79.3%), and the accuracy of classifying NwHCC samples as HCC increased to 100% (19-gene-pair: 96.6%). It can be seen from <xref ref-type="table" rid="T3">Table 3</xref> that, in identifying both HCC and adjacent non-cancerous tissues (CwHCC and NwHCC) from CwoHCC in surgically resected samples, the 11-gene-pair-based model displayed better performance than the 19-gene-pair-based model, which suggests that the 11-gene-pair-based model is promising for generating reliable results in early HCC diagnosis.</p>
<p>The above results show that the proposed 11-gene-pair-based model is powerful on both training datasets and independent datasets. This achievement can be attributed to the use of within-sample REOs and SVM.</p>
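<p>To illustrate why within-sample REOs are insensitive to platform-specific scaling, the Python sketch below encodes each gene pair as a binary feature that records only which gene of the pair is more highly expressed in a given sample. It is a minimal illustration under stated assumptions, not the authors' pipeline: the pairings and expression values are invented for the example (only the gene symbols are taken from the signature), and the SVM step is omitted.</p>
<preformat>
```python
import math

# Hypothetical pairings of genes named in the signature; the actual 11 pairs
# are not listed in this section, so these are placeholders.
GENE_PAIRS = [("LAMC1", "UBE4B"), ("HSPH1", "HNF1A"), ("SF3B1", "HGF")]

# Invented expression values for one sample (arbitrary units).
sample = {"LAMC1": 8.2, "UBE4B": 6.9, "HSPH1": 5.1,
          "HNF1A": 7.4, "SF3B1": 9.0, "HGF": 3.3}


def reo_features(expression, pairs):
    """Return 1 for each pair (a, b) where a is expressed above b, else 0."""
    return [1 if expression[a] > expression[b] else 0 for a, b in pairs]


features = reo_features(sample, GENE_PAIRS)

# A strictly increasing transform (here a log followed by a rescaling)
# changes every value but leaves the within-sample ordering untouched.
rescaled = {gene: 3.0 * math.log2(value) + 1.0 for gene, value in sample.items()}
features_rescaled = reo_features(rescaled, GENE_PAIRS)

print(features)
print(features == features_rescaled)
```
</preformat>
<p>Because the feature vector depends only on the ordering of the two expression values within one sample, any strictly increasing transformation of that sample's measurements yields the same features, which is the property the REO-based approach relies on.</p>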
</sec>
</sec>
<sec id="S4">
<title>Discussion</title>
<p>Clinical practice has demonstrated that diagnosing tumors at an early stage is key to improving patient survival. Although pathology is the gold standard for HCC diagnosis, histological analysis of an HCC biopsy specimen is influenced by the sampling location and the amount of tissue. In the present work, we identified a diagnostic signature of 11 gene pairs comprising 18 genes, which can be used to discriminate HCC and adjacent non-cancerous tissues (CwHCC and NwHCC) from CwoHCC individuals for early HCC diagnosis.</p>
<p>Ten genes in the signature set, namely LAMC1, UBE4B, HSPH1, HNF1A, SF3B1, APC2, CHST4, HGF, MTHFD2, and AGO3, might play vital roles in hepatocarcinogenesis and are key genes for cancer. For instance, LAMC1 mRNA can promote the development of HCC by competing with miR-124 and supporting the excretion of CD151 (<xref ref-type="bibr" rid="B69">Yang et al., 2017</xref>). UBE4B can be used as a potential prognostic marker for HCC treatment because of its carcinogenic effect in human primary HCC (<xref ref-type="bibr" rid="B72">Zhang et al., 2016</xref>). Additionally, HNF1A is closely associated with HCC because the level of HNF1A increases when non-cancerous liver develops into highly differentiated HCC (<xref ref-type="bibr" rid="B62">Wang et al., 1998</xref>). SF3B1 is an evolutionarily highly conserved spliceosomal protein (<xref ref-type="bibr" rid="B22">Eilbracht and Schmidt-Zachmann, 2001</xref>), and its expression increases significantly in HCC tissues. Serum anti-SF3B1 autoantibody is a potential diagnostic marker for HCC patients (<xref ref-type="bibr" rid="B31">Hwang et al., 2018</xref>). HSPH1 (<xref ref-type="bibr" rid="B68">Yang et al., 2015</xref>), APC2 (<xref ref-type="bibr" rid="B26">Ghosh et al., 2016</xref>), CHST4 (<xref ref-type="bibr" rid="B25">Gao et al., 2015</xref>), HGF (<xref ref-type="bibr" rid="B59">Unic et al., 2018</xref>), MTHFD2 (<xref ref-type="bibr" rid="B37">Liu et al., 2016</xref>), and AGO3 (<xref ref-type="bibr" rid="B33">Kitagawa et al., 2013</xref>) have also been reported to be closely related to HCC.</p>
<p>Subsequently, the 18 genes (11-gene-pair) were subjected to functional enrichment analysis with Metascape<sup><xref ref-type="fn" rid="footnote2">2</xref></sup> (<xref ref-type="bibr" rid="B58">Tripathi et al., 2015</xref>) on the KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways and GO (Gene Ontology) terms. To determine the significant terms, a <italic>p</italic>-value &#x003C; 0.05 and a number of enriched genes &#x2265;3 were used as the significance criteria. The 18 genes were significantly enriched in the &#x201C;ribonucleoprotein complex biogenesis,&#x201D; &#x201C;positive regulation of cellular component biogenesis,&#x201D; &#x201C;lymphocyte activation,&#x201D; and &#x201C;chemotaxis&#x201D; terms based on GO analysis, as well as &#x201C;Pathways in cancer&#x201D; according to KEGG analysis. This analysis suggests that the genes of the 11-gene-pair might have vital roles in the development and progression of HCC.</p>
<p>In the current study, we showed that 11 gene pairs can be applied to accurately diagnose tumors found in the liver. As a next step, we will try to establish a user-friendly web server for the proposed &#x201C;11-gene-pair&#x201D; model. In the future, we will apply other feature selection techniques and algorithms to further improve the diagnosis of cancers.</p>
</sec>
<sec id="S5">
<title>Data Availability Statement</title>
<p>The datasets used in this study can be freely downloaded from the GEO (<ext-link ext-link-type="uri" xlink:href="https://www.ncbi.nlm.nih.gov/geo/">https://www.ncbi.nlm.nih.gov/geo/</ext-link>) and TCGA (<ext-link ext-link-type="uri" xlink:href="https://portal.gdc.cancer.gov/repository">https://portal.gdc.cancer.gov/repository</ext-link>) repositories.</p>
</sec>
<sec id="S6">
<title>Author Contributions</title>
<p>HL designed the study and revised the manuscript. Z-MZ carried out all the data collection and drafted the manuscript. Z-MZ, J-XT, FW, F-YD, and Z-YZ performed the data analysis. All authors approved the final manuscript.</p>
</sec>
<sec id="conf1">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</body>
<back>
<fn-group>
<fn fn-type="financial-disclosure">
<p><bold>Funding.</bold> This work was supported by the National Natural Science Foundation of China (61772119) and the Sichuan Provincial Science Fund for Distinguished Young Scholars (20JCQN0262).</p>
</fn>
</fn-group>
<sec id="S8" sec-type="supplementary material"><title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fbioe.2020.00254/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fbioe.2020.00254/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Table_1.DOCX" id="TS1" mimetype="application/vnd.openxmlformats-officedocument.wordprocessingml.document" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ao</surname> <given-names>L.</given-names></name> <name><surname>Song</surname> <given-names>X.</given-names></name> <name><surname>Li</surname> <given-names>X.</given-names></name> <name><surname>Tong</surname> <given-names>M.</given-names></name> <name><surname>Guo</surname> <given-names>Y.</given-names></name> <name><surname>Li</surname> <given-names>J.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>An individualized prognostic signature and multiomics distinction for early stage hepatocellular carcinoma patients with surgical resection.</article-title> <source><italic>Oncotarget</italic></source> <volume>7</volume> <fpage>24097</fpage>&#x2013;<lpage>24110</lpage>. <pub-id pub-id-type="doi">10.18632/oncotarget.8212</pub-id> <pub-id pub-id-type="pmid">27006471</pub-id></citation></ref>
<ref id="B2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ao</surname> <given-names>L.</given-names></name> <name><surname>Zhang</surname> <given-names>Z.</given-names></name> <name><surname>Guan</surname> <given-names>Q.</given-names></name> <name><surname>Guo</surname> <given-names>Y.</given-names></name> <name><surname>Guo</surname> <given-names>Y.</given-names></name> <name><surname>Zhang</surname> <given-names>J.</given-names></name><etal/></person-group> (<year>2018</year>). <article-title>A qualitative signature for early diagnosis of hepatocellular carcinoma based on relative expression orderings.</article-title> <source><italic>Liver Int.</italic></source> <volume>38</volume> <fpage>1812</fpage>&#x2013;<lpage>1819</lpage>. <pub-id pub-id-type="doi">10.1111/liv.13864</pub-id> <pub-id pub-id-type="pmid">29682909</pub-id></citation></ref>
<ref id="B3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Archer</surname> <given-names>K. J.</given-names></name> <name><surname>Mas</surname> <given-names>V. R.</given-names></name> <name><surname>David</surname> <given-names>K.</given-names></name> <name><surname>Maluf</surname> <given-names>D. G.</given-names></name> <name><surname>Bornstein</surname> <given-names>K.</given-names></name> <name><surname>Fisher</surname> <given-names>R. A.</given-names></name></person-group> (<year>2009</year>). <article-title>Identifying genes for establishing a multigenic test for hepatocellular carcinoma surveillance in hepatitis C virus-positive cirrhotic patients.</article-title> <source><italic>Cancer Epidemiol. Biomarkers Prev.</italic></source> <volume>18</volume> <fpage>2929</fpage>&#x2013;<lpage>2932</lpage>. <pub-id pub-id-type="doi">10.1158/1055-9965.EPI-09-0767</pub-id> <pub-id pub-id-type="pmid">19861515</pub-id></citation></ref>
<ref id="B4"><citation citation-type="journal"><collab>Asia-Pacific Working Party on Prevention of Hepatocellular Carcinoma</collab> (<year>2010</year>). <article-title>Prevention of hepatocellular carcinoma in the Asia-Pacific region: consensus statements.</article-title> <source><italic>J. Gastroenterol. Hepatol.</italic></source> <volume>25</volume> <fpage>657</fpage>&#x2013;<lpage>663</lpage>. <pub-id pub-id-type="doi">10.1111/j.1440-1746.2009.06167.x</pub-id> <pub-id pub-id-type="pmid">20492323</pub-id></citation></ref>
<ref id="B5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bao</surname> <given-names>S.</given-names></name> <name><surname>Zhao</surname> <given-names>H.</given-names></name> <name><surname>Yuan</surname> <given-names>J.</given-names></name> <name><surname>Fan</surname> <given-names>D.</given-names></name> <name><surname>Zhang</surname> <given-names>Z.</given-names></name> <name><surname>Su</surname> <given-names>J.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>Computational identification of mutator-derived lncRNA signatures of genome instability for improving the clinical outcome of cancers: a case study in breast cancer.</article-title> <source><italic>Brief. Bioinform.</italic></source> <pub-id pub-id-type="doi">10.1093/bib/bbz118</pub-id> <comment>[Epub ahead of print]</comment>. <pub-id pub-id-type="pmid">31665214</pub-id></citation></ref>
<ref id="B6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Barrett</surname> <given-names>T.</given-names></name> <name><surname>Suzek</surname> <given-names>T. O.</given-names></name> <name><surname>Troup</surname> <given-names>D. B.</given-names></name> <name><surname>Wilhite</surname> <given-names>S. E.</given-names></name> <name><surname>Ngau</surname> <given-names>W. C.</given-names></name> <name><surname>Ledoux</surname> <given-names>P.</given-names></name><etal/></person-group> (<year>2005</year>). <article-title>NCBI GEO: mining millions of expression profiles&#x2013;database and tools.</article-title> <source><italic>Nucleic Acids Res.</italic></source> <volume>33</volume> <fpage>D562</fpage>&#x2013;<lpage>D566</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gki022</pub-id> <pub-id pub-id-type="pmid">15608262</pub-id></citation></ref>
<ref id="B7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Basith</surname> <given-names>S.</given-names></name> <name><surname>Manavalan</surname> <given-names>B.</given-names></name> <name><surname>Shin</surname> <given-names>T. H.</given-names></name> <name><surname>Lee</surname> <given-names>G.</given-names></name></person-group> (<year>2019</year>). <article-title>SDM6A: a web-based integrative machine-learning framework for predicting 6mA sites in the rice genome.</article-title> <source><italic>Mol. Ther. Nucleic Acids</italic></source> <volume>18</volume> <fpage>131</fpage>&#x2013;<lpage>141</lpage>. <pub-id pub-id-type="doi">10.1016/j.omtn.2019.08.011</pub-id> <pub-id pub-id-type="pmid">31542696</pub-id></citation></ref>
<ref id="B8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bu</surname> <given-names>H. D.</given-names></name> <name><surname>Hao</surname> <given-names>J. Q.</given-names></name> <name><surname>Guan</surname> <given-names>J. H.</given-names></name> <name><surname>Zhou</surname> <given-names>S. G.</given-names></name></person-group> (<year>2018</year>). <article-title>Predicting enhancers from multiple cell lines and tissues across different developmental stages based on SVM method.</article-title> <source><italic>Curr. Bioinform.</italic></source> <volume>13</volume> <fpage>655</fpage>&#x2013;<lpage>660</lpage>. <pub-id pub-id-type="doi">10.2174/1574893613666180726163429</pub-id></citation></ref>
<ref id="B9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Budhu</surname> <given-names>A.</given-names></name> <name><surname>Forgues</surname> <given-names>M.</given-names></name> <name><surname>Ye</surname> <given-names>Q. H.</given-names></name> <name><surname>Jia</surname> <given-names>H. L.</given-names></name> <name><surname>He</surname> <given-names>P.</given-names></name> <name><surname>Zanetti</surname> <given-names>K. A.</given-names></name><etal/></person-group> (<year>2006</year>). <article-title>Prediction of venous metastases, recurrence, and prognosis in hepatocellular carcinoma based on a unique immune response signature of the liver microenvironment.</article-title> <source><italic>Cancer Cell</italic></source> <volume>10</volume> <fpage>99</fpage>&#x2013;<lpage>111</lpage>. <pub-id pub-id-type="doi">10.1016/j.ccr.2006.06.016</pub-id> <pub-id pub-id-type="pmid">16904609</pub-id></citation></ref>
<ref id="B10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cai</surname> <given-names>H.</given-names></name> <name><surname>Li</surname> <given-names>X.</given-names></name> <name><surname>Li</surname> <given-names>J.</given-names></name> <name><surname>Ao</surname> <given-names>L.</given-names></name> <name><surname>Yan</surname> <given-names>H.</given-names></name> <name><surname>Tong</surname> <given-names>M.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>Tamoxifen therapy benefit predictive signature coupled with prognostic signature of post-operative recurrent risk for early stage ER+ breast cancer.</article-title> <source><italic>Oncotarget</italic></source> <volume>6</volume> <fpage>44593</fpage>&#x2013;<lpage>44608</lpage>. <pub-id pub-id-type="doi">10.18632/oncotarget.6260</pub-id> <pub-id pub-id-type="pmid">26527319</pub-id></citation></ref>
<ref id="B11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cao</surname> <given-names>R.</given-names></name> <name><surname>Freitas</surname> <given-names>C.</given-names></name> <name><surname>Chan</surname> <given-names>L.</given-names></name> <name><surname>Sun</surname> <given-names>M.</given-names></name> <name><surname>Jiang</surname> <given-names>H.</given-names></name> <name><surname>Chen</surname> <given-names>Z.</given-names></name></person-group> (<year>2017</year>). <article-title>ProLanGO: protein function prediction using neural machine translation based on a recurrent neural network.</article-title> <source><italic>Molecules</italic></source> <volume>22</volume>:<issue>1732</issue>. <pub-id pub-id-type="doi">10.3390/molecules22101732</pub-id> <pub-id pub-id-type="pmid">29039790</pub-id></citation></ref>
<ref id="B12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cao</surname> <given-names>R.</given-names></name> <name><surname>Wang</surname> <given-names>Z.</given-names></name> <name><surname>Wang</surname> <given-names>Y.</given-names></name> <name><surname>Cheng</surname> <given-names>J.</given-names></name></person-group> (<year>2014</year>). <article-title>SMOQ: a tool for predicting the absolute residue-specific quality of a single protein model with support vector machines.</article-title> <source><italic>BMC Bioinformatics</italic></source> <volume>15</volume>:<issue>120</issue>. <pub-id pub-id-type="doi">10.1186/1471-2105-15-120</pub-id> <pub-id pub-id-type="pmid">24776231</pub-id></citation></ref>
<ref id="B13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chang</surname> <given-names>C. C.</given-names></name> <name><surname>Lin</surname> <given-names>C. J.</given-names></name></person-group> (<year>2011</year>). <article-title>LIBSVM: a library for support vector machines.</article-title> <source><italic>ACM Trans. Intell. Syst. Technol.</italic></source> <volume>2</volume>:<issue>27</issue>.</citation></ref>
<ref id="B14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chao</surname> <given-names>L.</given-names></name> <name><surname>Jin</surname> <given-names>S.</given-names></name> <name><surname>Wang</surname> <given-names>L.</given-names></name> <name><surname>Guo</surname> <given-names>F.</given-names></name> <name><surname>Zou</surname> <given-names>Q.</given-names></name></person-group> (<year>2019a</year>). <article-title>AOPs-SVM: a sequence-based classifier of antioxidant proteins using a support vector machine.</article-title> <source><italic>Front. Bioeng. Biotechnol.</italic></source> <volume>7</volume>:<issue>224</issue>. <pub-id pub-id-type="doi">10.3389/fbioe.2019.00224</pub-id> <pub-id pub-id-type="pmid">31620433</pub-id></citation></ref>
<ref id="B15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chao</surname> <given-names>L.</given-names></name> <name><surname>Wei</surname> <given-names>L.</given-names></name> <name><surname>Zou</surname> <given-names>Q.</given-names></name></person-group> (<year>2019b</year>). <article-title>SecProMTB: a SVM-based classifier for secretory proteins of <italic>Mycobacterium tuberculosis</italic> with imbalanced data set.</article-title> <source><italic>Proteomics</italic></source> <volume>19</volume>:<issue>e1900007</issue>.</citation></ref>
<ref id="B16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>R.</given-names></name> <name><surname>Guan</surname> <given-names>Q.</given-names></name> <name><surname>Cheng</surname> <given-names>J.</given-names></name> <name><surname>He</surname> <given-names>J.</given-names></name> <name><surname>Liu</surname> <given-names>H.</given-names></name> <name><surname>Cai</surname> <given-names>H.</given-names></name><etal/></person-group> (<year>2017</year>). <article-title>Robust transcriptional tumor signatures applicable to both formalin-fixed paraffin-embedded and fresh-frozen samples.</article-title> <source><italic>Oncotarget</italic></source> <volume>8</volume> <fpage>6652</fpage>&#x2013;<lpage>6662</lpage>. <pub-id pub-id-type="doi">10.18632/oncotarget.14257</pub-id> <pub-id pub-id-type="pmid">28036264</pub-id></citation></ref>
<ref id="B17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cheng</surname> <given-names>J.</given-names></name> <name><surname>Guo</surname> <given-names>Y.</given-names></name> <name><surname>Gao</surname> <given-names>Q.</given-names></name> <name><surname>Li</surname> <given-names>H.</given-names></name> <name><surname>Yan</surname> <given-names>H.</given-names></name> <name><surname>Li</surname> <given-names>M.</given-names></name><etal/></person-group> (<year>2017</year>). <article-title>Circumvent the uncertainty in the applications of transcriptional signatures to tumor tissues sampled from different tumor sites.</article-title> <source><italic>Oncotarget</italic></source> <volume>8</volume> <fpage>30265</fpage>&#x2013;<lpage>30275</lpage>. <pub-id pub-id-type="doi">10.18632/oncotarget.15754</pub-id> <pub-id pub-id-type="pmid">28427173</pub-id></citation></ref>
<ref id="B18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Colaprico</surname> <given-names>A.</given-names></name> <name><surname>Silva</surname> <given-names>T. C.</given-names></name> <name><surname>Olsen</surname> <given-names>C.</given-names></name> <name><surname>Garofano</surname> <given-names>L.</given-names></name> <name><surname>Cava</surname> <given-names>C.</given-names></name> <name><surname>Garolini</surname> <given-names>D.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>TCGAbiolinks: an R/bioconductor package for integrative analysis of TCGA data.</article-title> <source><italic>Nucleic Acids Res.</italic></source> <volume>44</volume>:<issue>e71</issue>. <pub-id pub-id-type="doi">10.1093/nar/gkv1507</pub-id> <pub-id pub-id-type="pmid">26704973</pub-id></citation></ref>
<ref id="B19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Conover</surname> <given-names>M.</given-names></name> <name><surname>Staples</surname> <given-names>M.</given-names></name> <name><surname>Si</surname> <given-names>D.</given-names></name> <name><surname>Sun</surname> <given-names>M.</given-names></name> <name><surname>Cao</surname> <given-names>R.</given-names></name></person-group> (<year>2019</year>). <article-title>AngularQA: protein model quality assessment with LSTM networks.</article-title> <source><italic>Comput. Math. Biophys.</italic></source> <volume>7</volume> <fpage>1</fpage>&#x2013;<lpage>9</lpage>. <pub-id pub-id-type="doi">10.1515/cmb-2019-0001</pub-id></citation></ref>
<ref id="B20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dao</surname> <given-names>F. Y.</given-names></name> <name><surname>Lv</surname> <given-names>H.</given-names></name> <name><surname>Wang</surname> <given-names>F.</given-names></name> <name><surname>Feng</surname> <given-names>C. Q.</given-names></name> <name><surname>Ding</surname> <given-names>H.</given-names></name> <name><surname>Chen</surname> <given-names>W.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>Identify origin of replication in <italic>Saccharomyces cerevisiae</italic> using two-step feature selection technique.</article-title> <source><italic>Bioinformatics</italic></source> <volume>35</volume> <fpage>2075</fpage>&#x2013;<lpage>2083</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/bty943</pub-id> <pub-id pub-id-type="pmid">30428009</pub-id></citation></ref>
<ref id="B21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Eddy</surname> <given-names>J. A.</given-names></name> <name><surname>Sung</surname> <given-names>J.</given-names></name> <name><surname>Geman</surname> <given-names>D.</given-names></name> <name><surname>Price</surname> <given-names>N. D.</given-names></name></person-group> (<year>2010</year>). <article-title>Relative expression analysis for molecular cancer diagnosis and prognosis.</article-title> <source><italic>Technol. Cancer Res. Treat.</italic></source> <volume>9</volume> <fpage>149</fpage>&#x2013;<lpage>159</lpage>. <pub-id pub-id-type="doi">10.1177/153303461000900204</pub-id> <pub-id pub-id-type="pmid">20218737</pub-id></citation></ref>
<ref id="B22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Eilbracht</surname> <given-names>J.</given-names></name> <name><surname>Schmidt-Zachmann</surname> <given-names>M. S.</given-names></name></person-group> (<year>2001</year>). <article-title>Identification of a sequence element directing a protein to nuclear speckles.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>98</volume> <fpage>3849</fpage>&#x2013;<lpage>3854</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.071042298</pub-id> <pub-id pub-id-type="pmid">11274404</pub-id></citation></ref>
<ref id="B23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>El-Serag</surname> <given-names>H. B.</given-names></name></person-group> (<year>2011</year>). <article-title>Hepatocellular carcinoma.</article-title> <source><italic>N. Engl. J. Med.</italic></source> <volume>365</volume> <fpage>1118</fpage>&#x2013;<lpage>1127</lpage>. <pub-id pub-id-type="doi">10.1056/NEJMra1001683</pub-id> <pub-id pub-id-type="pmid">21992124</pub-id></citation></ref>
<ref id="B24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Forner</surname> <given-names>A.</given-names></name> <name><surname>Vilana</surname> <given-names>R.</given-names></name> <name><surname>Ayuso</surname> <given-names>C.</given-names></name> <name><surname>Bianchi</surname> <given-names>L.</given-names></name> <name><surname>Sole</surname> <given-names>M.</given-names></name> <name><surname>Ayuso</surname> <given-names>J. R.</given-names></name><etal/></person-group> (<year>2008</year>). <article-title>Diagnosis of hepatic nodules 20 mm or smaller in cirrhosis: prospective validation of the noninvasive diagnostic criteria for hepatocellular carcinoma.</article-title> <source><italic>Hepatology</italic></source> <volume>47</volume> <fpage>97</fpage>&#x2013;<lpage>104</lpage>. <pub-id pub-id-type="doi">10.1002/hep.21966</pub-id> <pub-id pub-id-type="pmid">18069697</pub-id></citation></ref>
<ref id="B25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gao</surname> <given-names>F.</given-names></name> <name><surname>Liang</surname> <given-names>H.</given-names></name> <name><surname>Lu</surname> <given-names>H.</given-names></name> <name><surname>Wang</surname> <given-names>J.</given-names></name> <name><surname>Xia</surname> <given-names>M.</given-names></name> <name><surname>Yuan</surname> <given-names>Z.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>Global analysis of DNA methylation in hepatocellular carcinoma by a liquid hybridization capture-based bisulfite sequencing approach.</article-title> <source><italic>Clin. Epigenetics</italic></source> <volume>7</volume>:<issue>86</issue>. <pub-id pub-id-type="doi">10.1186/s13148-015-0121-1</pub-id> <pub-id pub-id-type="pmid">26300991</pub-id></citation></ref>
<ref id="B26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ghosh</surname> <given-names>A.</given-names></name> <name><surname>Ghosh</surname> <given-names>A.</given-names></name> <name><surname>Datta</surname> <given-names>S.</given-names></name> <name><surname>Dasgupta</surname> <given-names>D.</given-names></name> <name><surname>Das</surname> <given-names>S.</given-names></name> <name><surname>Ray</surname> <given-names>S.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>Hepatic miR-126 is a potential plasma biomarker for detection of hepatitis B virus infected hepatocellular carcinoma.</article-title> <source><italic>Int. J. Cancer</italic></source> <volume>138</volume> <fpage>2732</fpage>&#x2013;<lpage>2744</lpage>. <pub-id pub-id-type="doi">10.1002/ijc.29999</pub-id> <pub-id pub-id-type="pmid">26756996</pub-id></citation></ref>
<ref id="B27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guan</surname> <given-names>Q.</given-names></name> <name><surname>Chen</surname> <given-names>R.</given-names></name> <name><surname>Yan</surname> <given-names>H.</given-names></name> <name><surname>Cai</surname> <given-names>H.</given-names></name> <name><surname>Guo</surname> <given-names>Y.</given-names></name> <name><surname>Li</surname> <given-names>M.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>Differential expression analysis for individual cancer samples based on robust within-sample relative gene expression orderings across multiple profiling platforms.</article-title> <source><italic>Oncotarget</italic></source> <volume>7</volume> <fpage>68909</fpage>&#x2013;<lpage>68920</lpage>. <pub-id pub-id-type="doi">10.18632/oncotarget.11996</pub-id> <pub-id pub-id-type="pmid">27634898</pub-id></citation></ref>
<ref id="B28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guan</surname> <given-names>Q.</given-names></name> <name><surname>Yan</surname> <given-names>H.</given-names></name> <name><surname>Chen</surname> <given-names>Y.</given-names></name> <name><surname>Zheng</surname> <given-names>B.</given-names></name> <name><surname>Cai</surname> <given-names>H.</given-names></name> <name><surname>He</surname> <given-names>J.</given-names></name><etal/></person-group> (<year>2018</year>). <article-title>Quantitative or qualitative transcriptional diagnostic signatures? A case study for colorectal cancer.</article-title> <source><italic>BMC Genomics</italic></source> <volume>19</volume>:<issue>99</issue>. <pub-id pub-id-type="doi">10.1186/s12864-018-4446-y</pub-id> <pub-id pub-id-type="pmid">29378509</pub-id></citation></ref>
<ref id="B29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guan</surname> <given-names>Q.</given-names></name> <name><surname>Zeng</surname> <given-names>Q.</given-names></name> <name><surname>Yan</surname> <given-names>H.</given-names></name> <name><surname>Xie</surname> <given-names>J.</given-names></name> <name><surname>Cheng</surname> <given-names>J.</given-names></name> <name><surname>Ao</surname> <given-names>L.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>A qualitative transcriptional signature for the early diagnosis of colorectal cancer.</article-title> <source><italic>Cancer Sci.</italic></source> <volume>110</volume> <fpage>3225</fpage>&#x2013;<lpage>3234</lpage>. <pub-id pub-id-type="doi">10.1111/cas.14137</pub-id> <pub-id pub-id-type="pmid">31335996</pub-id></citation></ref>
<ref id="B30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hartke</surname> <given-names>J.</given-names></name> <name><surname>Johnson</surname> <given-names>M.</given-names></name> <name><surname>Ghabril</surname> <given-names>M.</given-names></name></person-group> (<year>2017</year>). <article-title>The diagnosis and treatment of hepatocellular carcinoma.</article-title> <source><italic>Semin. Diagn. Pathol.</italic></source> <volume>34</volume> <fpage>153</fpage>&#x2013;<lpage>159</lpage>. <pub-id pub-id-type="doi">10.1053/j.semdp.2016.12.011</pub-id> <pub-id pub-id-type="pmid">28108047</pub-id></citation></ref>
<ref id="B31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hwang</surname> <given-names>H. M.</given-names></name> <name><surname>Heo</surname> <given-names>C. K.</given-names></name> <name><surname>Lee</surname> <given-names>H. J.</given-names></name> <name><surname>Kwak</surname> <given-names>S. S.</given-names></name> <name><surname>Lim</surname> <given-names>W. H.</given-names></name> <name><surname>Yoo</surname> <given-names>J. S.</given-names></name><etal/></person-group> (<year>2018</year>). <article-title>Identification of anti-SF3B1 autoantibody as a diagnostic marker in patients with hepatocellular carcinoma.</article-title> <source><italic>J. Transl. Med.</italic></source> <volume>16</volume>:<issue>177</issue>. <pub-id pub-id-type="doi">10.1186/s12967-018-1546-z</pub-id> <pub-id pub-id-type="pmid">29954402</pub-id></citation></ref>
<ref id="B32"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Indhumathy</surname> <given-names>M.</given-names></name> <name><surname>Nabhan</surname> <given-names>A. R.</given-names></name> <name><surname>Arumugam</surname> <given-names>S.</given-names></name></person-group> (<year>2018</year>). <article-title>A weighted association rule mining method for predicting HCV-human protein interactions.</article-title> <source><italic>Curr. Bioinform.</italic></source> <volume>13</volume> <fpage>73</fpage>&#x2013;<lpage>84</lpage>. <pub-id pub-id-type="doi">10.2174/1574893611666161123142425</pub-id></citation></ref>
<ref id="B33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kitagawa</surname> <given-names>N.</given-names></name> <name><surname>Ojima</surname> <given-names>H.</given-names></name> <name><surname>Shirakihara</surname> <given-names>T.</given-names></name> <name><surname>Shimizu</surname> <given-names>H.</given-names></name> <name><surname>Kokubu</surname> <given-names>A.</given-names></name> <name><surname>Urushidate</surname> <given-names>T.</given-names></name><etal/></person-group> (<year>2013</year>). <article-title>Downregulation of the microRNA biogenesis components and its association with poor prognosis in hepatocellular carcinoma.</article-title> <source><italic>Cancer Sci.</italic></source> <volume>104</volume> <fpage>543</fpage>&#x2013;<lpage>551</lpage>. <pub-id pub-id-type="doi">10.1111/cas.12126</pub-id> <pub-id pub-id-type="pmid">23398123</pub-id></citation></ref>
<ref id="B34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>X.</given-names></name> <name><surname>Cai</surname> <given-names>H.</given-names></name> <name><surname>Zheng</surname> <given-names>W.</given-names></name> <name><surname>Tong</surname> <given-names>M.</given-names></name> <name><surname>Li</surname> <given-names>H.</given-names></name> <name><surname>Ao</surname> <given-names>L.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>An individualized prognostic signature for gastric cancer patients treated with 5-Fluorouracil-based chemotherapy and distinct multi-omics characteristics of prognostic groups.</article-title> <source><italic>Oncotarget</italic></source> <volume>7</volume> <fpage>8743</fpage>&#x2013;<lpage>8755</lpage>. <pub-id pub-id-type="doi">10.18632/oncotarget.7087</pub-id> <pub-id pub-id-type="pmid">26840027</pub-id></citation></ref>
<ref id="B35"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liao</surname> <given-names>Z.</given-names></name> <name><surname>Li</surname> <given-names>D.</given-names></name> <name><surname>Wang</surname> <given-names>X.</given-names></name> <name><surname>Li</surname> <given-names>L.</given-names></name> <name><surname>Zou</surname> <given-names>Q.</given-names></name></person-group> (<year>2018</year>). <article-title>Cancer diagnosis from isomiR expression with machine learning method.</article-title> <source><italic>Curr. Bioinform.</italic></source> <volume>13</volume> <fpage>57</fpage>&#x2013;<lpage>63</lpage>. <pub-id pub-id-type="doi">10.2174/1574893611666160609081155</pub-id></citation></ref>
<ref id="B36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liao</surname> <given-names>Z.</given-names></name> <name><surname>Wang</surname> <given-names>X.</given-names></name> <name><surname>Lin</surname> <given-names>D.</given-names></name> <name><surname>Zou</surname> <given-names>Q.</given-names></name></person-group> (<year>2017</year>). <article-title>Construction and identification of the RNAi recombinant lentiviral vector targeting human DEPDC7 gene.</article-title> <source><italic>Interdiscip. Sci.</italic></source> <volume>9</volume> <fpage>350</fpage>&#x2013;<lpage>356</lpage>. <pub-id pub-id-type="doi">10.1007/s12539-016-0162-y</pub-id> <pub-id pub-id-type="pmid">27016254</pub-id></citation></ref>
<ref id="B37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>X.</given-names></name> <name><surname>Huang</surname> <given-names>Y.</given-names></name> <name><surname>Jiang</surname> <given-names>C.</given-names></name> <name><surname>Ou</surname> <given-names>H.</given-names></name> <name><surname>Guo</surname> <given-names>B.</given-names></name> <name><surname>Liao</surname> <given-names>H.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>Methylenetetrahydrofolate dehydrogenase 2 overexpression is associated with tumor aggressiveness and poor prognosis in hepatocellular carcinoma.</article-title> <source><italic>Dig. Liver Dis.</italic></source> <volume>48</volume> <fpage>953</fpage>&#x2013;<lpage>960</lpage>. <pub-id pub-id-type="doi">10.1016/j.dld.2016.04.015</pub-id> <pub-id pub-id-type="pmid">27257051</pub-id></citation></ref>
<ref id="B38"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Manavalan</surname> <given-names>B.</given-names></name> <name><surname>Basith</surname> <given-names>S.</given-names></name> <name><surname>Shin</surname> <given-names>T. H.</given-names></name> <name><surname>Choi</surname> <given-names>S.</given-names></name> <name><surname>Kim</surname> <given-names>M. O.</given-names></name> <name><surname>Lee</surname> <given-names>G.</given-names></name></person-group> (<year>2017</year>). <article-title>MLACP: machine-learning-based prediction of anticancer peptides.</article-title> <source><italic>Oncotarget</italic></source> <volume>8</volume> <fpage>77121</fpage>&#x2013;<lpage>77136</lpage>. <pub-id pub-id-type="doi">10.18632/oncotarget.20365</pub-id> <pub-id pub-id-type="pmid">29100375</pub-id></citation></ref>
<ref id="B39"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Manavalan</surname> <given-names>B.</given-names></name> <name><surname>Basith</surname> <given-names>S.</given-names></name> <name><surname>Shin</surname> <given-names>T. H.</given-names></name> <name><surname>Lee</surname> <given-names>D. Y.</given-names></name> <name><surname>Wei</surname> <given-names>L.</given-names></name> <name><surname>Lee</surname> <given-names>G.</given-names></name></person-group> (<year>2019a</year>). <article-title>4mCpred-EL: an ensemble learning framework for identification of DNA <italic>N</italic><sup>4</sup>-methylcytosine sites in the mouse genome.</article-title> <source><italic>Cells</italic></source> <volume>8</volume>:<issue>1332</issue>. <pub-id pub-id-type="doi">10.3390/cells8111332</pub-id> <pub-id pub-id-type="pmid">31661923</pub-id></citation></ref>
<ref id="B40"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Manavalan</surname> <given-names>B.</given-names></name> <name><surname>Basith</surname> <given-names>S.</given-names></name> <name><surname>Shin</surname> <given-names>T. H.</given-names></name> <name><surname>Wei</surname> <given-names>L.</given-names></name> <name><surname>Lee</surname> <given-names>G.</given-names></name></person-group> (<year>2019b</year>). <article-title>AtbPpred: a robust sequence-based prediction of anti-tubercular peptides using extremely randomized trees.</article-title> <source><italic>Comput. Struct. Biotechnol. J.</italic></source> <volume>17</volume> <fpage>972</fpage>&#x2013;<lpage>981</lpage>. <pub-id pub-id-type="doi">10.1016/j.csbj.2019.06.024</pub-id> <pub-id pub-id-type="pmid">31372196</pub-id></citation></ref>
<ref id="B41"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Manavalan</surname> <given-names>B.</given-names></name> <name><surname>Basith</surname> <given-names>S.</given-names></name> <name><surname>Shin</surname> <given-names>T. H.</given-names></name> <name><surname>Wei</surname> <given-names>L.</given-names></name> <name><surname>Lee</surname> <given-names>G.</given-names></name></person-group> (<year>2019c</year>). <article-title>mAHTPred: a sequence-based meta-predictor for improving the prediction of anti-hypertensive peptides using effective feature representation.</article-title> <source><italic>Bioinformatics</italic></source> <volume>35</volume> <fpage>2757</fpage>&#x2013;<lpage>2765</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/bty1047</pub-id> <pub-id pub-id-type="pmid">30590410</pub-id></citation></ref>
<ref id="B42"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Manavalan</surname> <given-names>B.</given-names></name> <name><surname>Basith</surname> <given-names>S.</given-names></name> <name><surname>Shin</surname> <given-names>T. H.</given-names></name> <name><surname>Wei</surname> <given-names>L.</given-names></name> <name><surname>Lee</surname> <given-names>G.</given-names></name></person-group> (<year>2019d</year>). <article-title>Meta-4mCpred: a sequence-based meta-predictor for accurate DNA 4mC site prediction using effective feature representation.</article-title> <source><italic>Mol. Ther. Nucleic Acids</italic></source> <volume>16</volume> <fpage>733</fpage>&#x2013;<lpage>744</lpage>. <pub-id pub-id-type="doi">10.1016/j.omtn.2019.04.019</pub-id> <pub-id pub-id-type="pmid">31146255</pub-id></citation></ref>
<ref id="B43"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Manavalan</surname> <given-names>B.</given-names></name> <name><surname>Lee</surname> <given-names>J.</given-names></name></person-group> (<year>2017</year>). <article-title>SVMQA: support-vector-machine-based protein single-model quality assessment.</article-title> <source><italic>Bioinformatics</italic></source> <volume>33</volume> <fpage>2496</fpage>&#x2013;<lpage>2503</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btx222</pub-id> <pub-id pub-id-type="pmid">28419290</pub-id></citation></ref>
<ref id="B44"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Manavalan</surname> <given-names>B.</given-names></name> <name><surname>Shin</surname> <given-names>T. H.</given-names></name> <name><surname>Lee</surname> <given-names>G.</given-names></name></person-group> (<year>2018a</year>). <article-title>DHSpred: support-vector-machine-based human DNase I hypersensitive sites prediction using the optimal features selected by random forest.</article-title> <source><italic>Oncotarget</italic></source> <volume>9</volume> <fpage>1944</fpage>&#x2013;<lpage>1956</lpage>. <pub-id pub-id-type="doi">10.18632/oncotarget.23099</pub-id> <pub-id pub-id-type="pmid">29416743</pub-id></citation></ref>
<ref id="B45"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Manavalan</surname> <given-names>B.</given-names></name> <name><surname>Shin</surname> <given-names>T. H.</given-names></name> <name><surname>Lee</surname> <given-names>G.</given-names></name></person-group> (<year>2018b</year>). <article-title>PVP-SVM: sequence-based prediction of phage virion proteins using a support vector machine.</article-title> <source><italic>Front. Microbiol.</italic></source> <volume>9</volume>:<issue>476</issue>. <pub-id pub-id-type="doi">10.3389/fmicb.2018.00476</pub-id> <pub-id pub-id-type="pmid">29616000</pub-id></citation></ref>
<ref id="B46"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Manavalan</surname> <given-names>B.</given-names></name> <name><surname>Subramaniyam</surname> <given-names>S.</given-names></name> <name><surname>Shin</surname> <given-names>T. H.</given-names></name> <name><surname>Kim</surname> <given-names>M. O.</given-names></name> <name><surname>Lee</surname> <given-names>G.</given-names></name></person-group> (<year>2018c</year>). <article-title>Machine-learning-based prediction of cell-penetrating peptides and their uptake efficiency with improved accuracy.</article-title> <source><italic>J. Proteome Res.</italic></source> <volume>17</volume> <fpage>2715</fpage>&#x2013;<lpage>2726</lpage>. <pub-id pub-id-type="doi">10.1021/acs.jproteome.8b00148</pub-id> <pub-id pub-id-type="pmid">29893128</pub-id></citation></ref>
<ref id="B47"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Moritz</surname> <given-names>S.</given-names></name> <name><surname>Pfab</surname> <given-names>J.</given-names></name> <name><surname>Wu</surname> <given-names>T.</given-names></name> <name><surname>Hou</surname> <given-names>J.</given-names></name> <name><surname>Cheng</surname> <given-names>J.</given-names></name> <name><surname>Cao</surname> <given-names>R.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>Cascaded-CNN: deep learning to predict protein backbone structure from high-resolution cryo-EM density maps.</article-title> <source><italic>bioRxiv [Preprint]</italic></source> <pub-id pub-id-type="doi">10.1038/s41598-020-60598-y</pub-id> <pub-id pub-id-type="pmid">32152330</pub-id></citation></ref>
<ref id="B48"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Peng</surname> <given-names>H.</given-names></name> <name><surname>Long</surname> <given-names>F.</given-names></name> <name><surname>Ding</surname> <given-names>C.</given-names></name></person-group> (<year>2005</year>). <article-title>Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy.</article-title> <source><italic>IEEE Trans. Pattern Anal. Mach. Intell.</italic></source> <volume>27</volume> <fpage>1226</fpage>&#x2013;<lpage>1238</lpage>. <pub-id pub-id-type="doi">10.1109/TPAMI.2005.159</pub-id> <pub-id pub-id-type="pmid">16119262</pub-id></citation></ref>
<ref id="B49"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Qu</surname> <given-names>K. Y.</given-names></name> <name><surname>Gao</surname> <given-names>F.</given-names></name> <name><surname>Guo</surname> <given-names>F.</given-names></name> <name><surname>Zou</surname> <given-names>Q.</given-names></name></person-group> (<year>2019</year>). <article-title>Taxonomy dimension reduction for colorectal cancer prediction.</article-title> <source><italic>Comput. Biol. Chem.</italic></source> <volume>83</volume>:<issue>107160</issue>. <pub-id pub-id-type="doi">10.1016/j.compbiolchem.2019.107160</pub-id> <pub-id pub-id-type="pmid">31743831</pub-id></citation></ref>
<ref id="B50"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Russo</surname> <given-names>F. P.</given-names></name> <name><surname>Imondi</surname> <given-names>A.</given-names></name> <name><surname>Lynch</surname> <given-names>E. N.</given-names></name> <name><surname>Farinati</surname> <given-names>F.</given-names></name></person-group> (<year>2018</year>). <article-title>When and how should we perform a biopsy for HCC in patients with liver cirrhosis in 2018? A review.</article-title> <source><italic>Dig. Liver Dis.</italic></source> <volume>50</volume> <fpage>640</fpage>&#x2013;<lpage>646</lpage>. <pub-id pub-id-type="doi">10.1016/j.dld.2018.03.014</pub-id> <pub-id pub-id-type="pmid">29636240</pub-id></citation></ref>
<ref id="B51"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stephenson</surname> <given-names>N.</given-names></name> <name><surname>Shane</surname> <given-names>E.</given-names></name> <name><surname>Chase</surname> <given-names>J.</given-names></name> <name><surname>Rowland</surname> <given-names>J.</given-names></name> <name><surname>Ries</surname> <given-names>D.</given-names></name> <name><surname>Justice</surname> <given-names>N.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>Survey of machine learning techniques in drug discovery.</article-title> <source><italic>Curr. Drug Metab.</italic></source> <volume>20</volume> <fpage>185</fpage>&#x2013;<lpage>193</lpage>. <pub-id pub-id-type="doi">10.2174/1389200219666180820112457</pub-id> <pub-id pub-id-type="pmid">30124147</pub-id></citation></ref>
<ref id="B52"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sun</surname> <given-names>J.</given-names></name> <name><surname>Zhang</surname> <given-names>Z.</given-names></name> <name><surname>Bao</surname> <given-names>S.</given-names></name> <name><surname>Yan</surname> <given-names>C.</given-names></name> <name><surname>Hou</surname> <given-names>P.</given-names></name> <name><surname>Wu</surname> <given-names>N.</given-names></name><etal/></person-group> (<year>2020</year>). <article-title>Identification of tumor immune infiltration-associated lncRNAs for improving prognosis and immunotherapy response of patients with non-small cell lung cancer.</article-title> <source><italic>J. Immunother. Cancer</italic></source> <volume>8</volume>:<issue>e000110</issue>. <pub-id pub-id-type="doi">10.1136/jitc-2019-000110</pub-id> <pub-id pub-id-type="pmid">32041817</pub-id></citation></ref>
<ref id="B53"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sun</surname> <given-names>W.</given-names></name> <name><surname>Liu</surname> <given-names>Y.</given-names></name> <name><surname>Shou</surname> <given-names>D.</given-names></name> <name><surname>Sun</surname> <given-names>Q.</given-names></name> <name><surname>Shi</surname> <given-names>J.</given-names></name> <name><surname>Chen</surname> <given-names>L.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>AFP (alpha fetoprotein): who are you in gastrology?</article-title> <source><italic>Cancer Lett.</italic></source> <volume>357</volume> <fpage>43</fpage>&#x2013;<lpage>46</lpage>. <pub-id pub-id-type="doi">10.1016/j.canlet.2014.11.018</pub-id> <pub-id pub-id-type="pmid">25462859</pub-id></citation></ref>
<ref id="B54"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tan</surname> <given-names>J. X.</given-names></name> <name><surname>Li</surname> <given-names>S. H.</given-names></name> <name><surname>Zhang</surname> <given-names>Z. M.</given-names></name> <name><surname>Chen</surname> <given-names>C. X.</given-names></name> <name><surname>Chen</surname> <given-names>W.</given-names></name> <name><surname>Tang</surname> <given-names>H.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>Identification of hormone binding proteins based on machine learning methods.</article-title> <source><italic>Math. Biosci. Eng.</italic></source> <volume>16</volume> <fpage>2466</fpage>&#x2013;<lpage>2480</lpage>. <pub-id pub-id-type="doi">10.3934/mbe.2019123</pub-id> <pub-id pub-id-type="pmid">31137222</pub-id></citation></ref>
<ref id="B55"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tang</surname> <given-names>H.</given-names></name> <name><surname>Cao</surname> <given-names>R. Z.</given-names></name> <name><surname>Wang</surname> <given-names>W.</given-names></name> <name><surname>Liu</surname> <given-names>T. S.</given-names></name> <name><surname>Wang</surname> <given-names>L. M.</given-names></name> <name><surname>He</surname> <given-names>C. M.</given-names></name></person-group> (<year>2017</year>). <article-title>A two-step discriminated method to identify thermophilic proteins.</article-title> <source><italic>Int. J. Biomath.</italic></source> <volume>10</volume>:<issue>1750050</issue>. <pub-id pub-id-type="doi">10.1142/s1793524517500504</pub-id></citation></ref>
<ref id="B56"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tang</surname> <given-names>W.</given-names></name> <name><surname>Wan</surname> <given-names>S.</given-names></name> <name><surname>Yang</surname> <given-names>Z.</given-names></name> <name><surname>Teschendorff</surname> <given-names>A. E.</given-names></name> <name><surname>Zou</surname> <given-names>Q.</given-names></name></person-group> (<year>2018</year>). <article-title>Tumor origin detection with tissue-specific miRNA and DNA methylation markers.</article-title> <source><italic>Bioinformatics</italic></source> <volume>34</volume> <fpage>398</fpage>&#x2013;<lpage>406</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btx622</pub-id> <pub-id pub-id-type="pmid">29028927</pub-id></citation></ref>
<ref id="B57"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tomczak</surname> <given-names>K.</given-names></name> <name><surname>Czerwinska</surname> <given-names>P.</given-names></name> <name><surname>Wiznerowicz</surname> <given-names>M.</given-names></name></person-group> (<year>2015</year>). <article-title>The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge.</article-title> <source><italic>Contemp. Oncol.</italic></source> <volume>19</volume> <fpage>A68</fpage>&#x2013;<lpage>A77</lpage>. <pub-id pub-id-type="doi">10.5114/wo.2014.47136</pub-id> <pub-id pub-id-type="pmid">25691825</pub-id></citation></ref>
<ref id="B58"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tripathi</surname> <given-names>S.</given-names></name> <name><surname>Pohl</surname> <given-names>M. O.</given-names></name> <name><surname>Zhou</surname> <given-names>Y.</given-names></name> <name><surname>Rodriguez-Frandsen</surname> <given-names>A.</given-names></name> <name><surname>Wang</surname> <given-names>G.</given-names></name> <name><surname>Stein</surname> <given-names>D. A.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>Meta- and orthogonal integration of influenza &#x201C;OMICs&#x201D; data defines a role for UBR4 in virus budding.</article-title> <source><italic>Cell Host Microbe</italic></source> <volume>18</volume> <fpage>723</fpage>&#x2013;<lpage>735</lpage>. <pub-id pub-id-type="doi">10.1016/j.chom.2015.11.002</pub-id> <pub-id pub-id-type="pmid">26651948</pub-id></citation></ref>
<ref id="B59"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Unic</surname> <given-names>A.</given-names></name> <name><surname>Derek</surname> <given-names>L.</given-names></name> <name><surname>Duvnjak</surname> <given-names>M.</given-names></name> <name><surname>Patrlj</surname> <given-names>L.</given-names></name> <name><surname>Rakic</surname> <given-names>M.</given-names></name> <name><surname>Kujundzic</surname> <given-names>M.</given-names></name><etal/></person-group> (<year>2018</year>). <article-title>Diagnostic specificity and sensitivity of PIVKAII, GP3, CSTB, SCCA1 and HGF for the diagnosis of hepatocellular carcinoma in patients with alcoholic liver cirrhosis.</article-title> <source><italic>Ann. Clin. Biochem.</italic></source> <volume>55</volume> <fpage>355</fpage>&#x2013;<lpage>362</lpage>. <pub-id pub-id-type="doi">10.1177/0004563217726808</pub-id> <pub-id pub-id-type="pmid">28766361</pub-id></citation></ref>
<ref id="B60"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Villanueva</surname> <given-names>A.</given-names></name></person-group> (<year>2019</year>). <article-title>Hepatocellular carcinoma.</article-title> <source><italic>N. Engl. J. Med.</italic></source> <volume>380</volume> <fpage>1450</fpage>&#x2013;<lpage>1462</lpage>. <pub-id pub-id-type="doi">10.1056/NEJMra1713263</pub-id> <pub-id pub-id-type="pmid">30970190</pub-id></citation></ref>
<ref id="B61"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>H.</given-names></name> <name><surname>Sun</surname> <given-names>Q.</given-names></name> <name><surname>Zhao</surname> <given-names>W.</given-names></name> <name><surname>Qi</surname> <given-names>L.</given-names></name> <name><surname>Gu</surname> <given-names>Y.</given-names></name> <name><surname>Li</surname> <given-names>P.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>Individual-level analysis of differential expression of genes and pathways for personalized medicine.</article-title> <source><italic>Bioinformatics</italic></source> <volume>31</volume> <fpage>62</fpage>&#x2013;<lpage>68</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btu522</pub-id> <pub-id pub-id-type="pmid">25165092</pub-id></citation></ref>
<ref id="B62"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>W.</given-names></name> <name><surname>Hayashi</surname> <given-names>Y.</given-names></name> <name><surname>Ninomiya</surname> <given-names>T.</given-names></name> <name><surname>Ohta</surname> <given-names>K.</given-names></name> <name><surname>Nakabayashi</surname> <given-names>H.</given-names></name> <name><surname>Tamaoki</surname> <given-names>T.</given-names></name><etal/></person-group> (<year>1998</year>). <article-title>Expression of HNF-1 alpha and HNF-1 beta in various histological differentiations of hepatocellular carcinoma.</article-title> <source><italic>J. Pathol.</italic></source> <volume>184</volume> <fpage>272</fpage>&#x2013;<lpage>278</lpage>. <pub-id pub-id-type="doi">10.1002/(sici)1096-9896(199803)184:3&#x003C;272::aid-path4&#x003E;3.0.co;2-k</pub-id> <pub-id pub-id-type="pmid">9614379</pub-id></citation></ref>
<ref id="B63"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>Y.</given-names></name> <name><surname>Shi</surname> <given-names>F. Q.</given-names></name> <name><surname>Cao</surname> <given-names>L. Y.</given-names></name> <name><surname>Dey</surname> <given-names>N.</given-names></name> <name><surname>Wu</surname> <given-names>Q.</given-names></name> <name><surname>Ashour</surname> <given-names>A. S.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>Morphological segmentation analysis and texture-based support vector machines classification on mice liver fibrosis microscopic images.</article-title> <source><italic>Curr. Bioinform.</italic></source> <volume>14</volume> <fpage>282</fpage>&#x2013;<lpage>294</lpage>. <pub-id pub-id-type="doi">10.2174/1574893614666190304125221</pub-id></citation></ref>
<ref id="B64"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wei</surname> <given-names>L.</given-names></name> <name><surname>Lian</surname> <given-names>B.</given-names></name> <name><surname>Zhang</surname> <given-names>Y.</given-names></name> <name><surname>Li</surname> <given-names>W.</given-names></name> <name><surname>Gu</surname> <given-names>J.</given-names></name> <name><surname>He</surname> <given-names>X.</given-names></name><etal/></person-group> (<year>2014</year>). <article-title>Application of microRNA and mRNA expression profiling on prognostic biomarker discovery for hepatocellular carcinoma.</article-title> <source><italic>BMC Genomics</italic></source> <volume>15</volume>(<issue>Suppl. 1</issue>):S13. <pub-id pub-id-type="doi">10.1186/1471-2164-15-S1-S13</pub-id> <pub-id pub-id-type="pmid">24564407</pub-id></citation></ref>
<ref id="B65"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wurmbach</surname> <given-names>E.</given-names></name> <name><surname>Chen</surname> <given-names>Y. B.</given-names></name> <name><surname>Khitrov</surname> <given-names>G.</given-names></name> <name><surname>Zhang</surname> <given-names>W.</given-names></name> <name><surname>Roayaie</surname> <given-names>S.</given-names></name> <name><surname>Schwartz</surname> <given-names>M.</given-names></name><etal/></person-group> (<year>2007</year>). <article-title>Genome-wide molecular profiles of HCV-induced dysplasia and hepatocellular carcinoma.</article-title> <source><italic>Hepatology</italic></source> <volume>45</volume> <fpage>938</fpage>&#x2013;<lpage>947</lpage>. <pub-id pub-id-type="doi">10.1002/hep.21622</pub-id> <pub-id pub-id-type="pmid">17393520</pub-id></citation></ref>
<ref id="B66"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yan</surname> <given-names>H.</given-names></name> <name><surname>Li</surname> <given-names>M.</given-names></name> <name><surname>Cao</surname> <given-names>L.</given-names></name> <name><surname>Chen</surname> <given-names>H.</given-names></name> <name><surname>Lai</surname> <given-names>H.</given-names></name> <name><surname>Guan</surname> <given-names>Q.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>A robust qualitative transcriptional signature for the correct pathological diagnosis of gastric cancer.</article-title> <source><italic>J. Transl. Med.</italic></source> <volume>17</volume>:<issue>63</issue>. <pub-id pub-id-type="doi">10.1186/s12967-019-1816-4</pub-id> <pub-id pub-id-type="pmid">30819200</pub-id></citation></ref>
<ref id="B67"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>W.</given-names></name> <name><surname>Zhu</surname> <given-names>X. J.</given-names></name> <name><surname>Huang</surname> <given-names>J.</given-names></name> <name><surname>Ding</surname> <given-names>H.</given-names></name> <name><surname>Lin</surname> <given-names>H.</given-names></name></person-group> (<year>2019</year>). <article-title>A brief survey of machine learning methods in protein sub-Golgi localization.</article-title> <source><italic>Curr. Bioinform.</italic></source> <volume>14</volume> <fpage>234</fpage>&#x2013;<lpage>240</lpage>. <pub-id pub-id-type="doi">10.2174/1574893613666181113131415</pub-id></citation></ref>
<ref id="B68"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>Z.</given-names></name> <name><surname>Zhuang</surname> <given-names>L.</given-names></name> <name><surname>Szatmary</surname> <given-names>P.</given-names></name> <name><surname>Wen</surname> <given-names>L.</given-names></name> <name><surname>Sun</surname> <given-names>H.</given-names></name> <name><surname>Lu</surname> <given-names>Y.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>Upregulation of heat shock proteins (HSPA12A, HSP90B1, HSPA4, HSPA5 and HSPA6) in tumour tissues is associated with poor outcomes from HBV-related early-stage hepatocellular carcinoma.</article-title> <source><italic>Int. J. Med. Sci.</italic></source> <volume>12</volume> <fpage>256</fpage>&#x2013;<lpage>263</lpage>. <pub-id pub-id-type="doi">10.7150/ijms.10735</pub-id> <pub-id pub-id-type="pmid">25798051</pub-id></citation></ref>
<ref id="B69"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>Z. P.</given-names></name> <name><surname>Ma</surname> <given-names>H. S.</given-names></name> <name><surname>Wang</surname> <given-names>S. S.</given-names></name> <name><surname>Wang</surname> <given-names>L.</given-names></name> <name><surname>Liu</surname> <given-names>T.</given-names></name></person-group> (<year>2017</year>). <article-title>LAMC1 mRNA promotes malignancy of hepatocellular carcinoma cells by competing for MicroRNA-124 binding with CD151.</article-title> <source><italic>IUBMB Life</italic></source> <volume>69</volume> <fpage>595</fpage>&#x2013;<lpage>605</lpage>. <pub-id pub-id-type="doi">10.1002/iub.1642</pub-id> <pub-id pub-id-type="pmid">28524360</pub-id></citation></ref>
<ref id="B70"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>L.</given-names></name> <name><surname>Hao</surname> <given-names>C.</given-names></name> <name><surname>Shen</surname> <given-names>X.</given-names></name> <name><surname>Hong</surname> <given-names>G.</given-names></name> <name><surname>Li</surname> <given-names>H.</given-names></name> <name><surname>Zhou</surname> <given-names>X.</given-names></name><etal/></person-group> (<year>2013</year>). <article-title>Rank-based predictors for response and prognosis of neoadjuvant taxane-anthracycline-based chemotherapy in breast cancer.</article-title> <source><italic>Breast Cancer Res. Treat.</italic></source> <volume>139</volume> <fpage>361</fpage>&#x2013;<lpage>369</lpage>. <pub-id pub-id-type="doi">10.1007/s10549-013-2566-2</pub-id> <pub-id pub-id-type="pmid">23695655</pub-id></citation></ref>
<ref id="B71"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>N.</given-names></name> <name><surname>Yu</surname> <given-names>S.</given-names></name> <name><surname>Guo</surname> <given-names>Y.</given-names></name> <name><surname>Wang</surname> <given-names>L.</given-names></name> <name><surname>Wang</surname> <given-names>P.</given-names></name> <name><surname>Feng</surname> <given-names>Y.</given-names></name></person-group> (<year>2018</year>). <article-title>Discriminating Ramos and Jurkat cells with image textures from diffraction imaging flow cytometry based on a support vector machine.</article-title> <source><italic>Curr. Bioinform.</italic></source> <volume>13</volume> <fpage>50</fpage>&#x2013;<lpage>56</lpage>. <pub-id pub-id-type="doi">10.2174/1574893611666160608102537</pub-id></citation></ref>
<ref id="B72"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>X. F.</given-names></name> <name><surname>Pan</surname> <given-names>Q. Z.</given-names></name> <name><surname>Pan</surname> <given-names>K.</given-names></name> <name><surname>Weng</surname> <given-names>D. S.</given-names></name> <name><surname>Wang</surname> <given-names>Q. J.</given-names></name> <name><surname>Zhao</surname> <given-names>J. J.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>Expression and prognostic role of ubiquitination factor E4B in primary hepatocellular carcinoma.</article-title> <source><italic>Mol. Carcinog.</italic></source> <volume>55</volume> <fpage>64</fpage>&#x2013;<lpage>76</lpage>. <pub-id pub-id-type="doi">10.1002/mc.22259</pub-id> <pub-id pub-id-type="pmid">25557723</pub-id></citation></ref>
<ref id="B73"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhao</surname> <given-names>W.</given-names></name> <name><surname>Chen</surname> <given-names>B.</given-names></name> <name><surname>Guo</surname> <given-names>X.</given-names></name> <name><surname>Wang</surname> <given-names>R.</given-names></name> <name><surname>Chang</surname> <given-names>Z.</given-names></name> <name><surname>Dong</surname> <given-names>Y.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>A rank-based transcriptional signature for predicting relapse risk of stage II colorectal cancer identified with proper data sources.</article-title> <source><italic>Oncotarget</italic></source> <volume>7</volume> <fpage>19060</fpage>&#x2013;<lpage>19071</lpage>. <pub-id pub-id-type="doi">10.18632/oncotarget.7956</pub-id> <pub-id pub-id-type="pmid">26967049</pub-id></citation></ref>
<ref id="B74"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>M.</given-names></name> <name><surname>Guo</surname> <given-names>M.</given-names></name> <name><surname>He</surname> <given-names>D.</given-names></name> <name><surname>Wang</surname> <given-names>X.</given-names></name> <name><surname>Cui</surname> <given-names>Y.</given-names></name> <name><surname>Yang</surname> <given-names>H.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>A potential signature of eight long non-coding RNAs predicts survival in patients with non-small cell lung cancer.</article-title> <source><italic>J. Transl. Med.</italic></source> <volume>13</volume>:<issue>231</issue>. <pub-id pub-id-type="doi">10.1186/s12967-015-0556-3</pub-id> <pub-id pub-id-type="pmid">26183581</pub-id></citation></ref>
<ref id="B75"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>M.</given-names></name> <name><surname>Zhao</surname> <given-names>H.</given-names></name> <name><surname>Xu</surname> <given-names>W.</given-names></name> <name><surname>Bao</surname> <given-names>S.</given-names></name> <name><surname>Cheng</surname> <given-names>L.</given-names></name> <name><surname>Sun</surname> <given-names>J.</given-names></name></person-group> (<year>2017</year>). <article-title>Discovery and validation of immune-associated long non-coding RNA biomarkers associated with clinically molecular subtype and prognosis in diffuse large B cell lymphoma.</article-title> <source><italic>Mol. Cancer</italic></source> <volume>16</volume>:<issue>16</issue>. <pub-id pub-id-type="doi">10.1186/s12943-017-0580-4</pub-id> <pub-id pub-id-type="pmid">28103885</pub-id></citation></ref>
<ref id="B76"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>X.</given-names></name> <name><surname>Li</surname> <given-names>B.</given-names></name> <name><surname>Zhang</surname> <given-names>Y.</given-names></name> <name><surname>Gu</surname> <given-names>Y.</given-names></name> <name><surname>Chen</surname> <given-names>B.</given-names></name> <name><surname>Shi</surname> <given-names>T.</given-names></name><etal/></person-group> (<year>2013</year>). <article-title>A relative ordering-based predictor for tamoxifen-treated estrogen receptor-positive breast cancer patients: multi-laboratory cohort validation.</article-title> <source><italic>Breast Cancer Res. Treat.</italic></source> <volume>142</volume> <fpage>505</fpage>&#x2013;<lpage>514</lpage>. <pub-id pub-id-type="doi">10.1007/s10549-013-2767-8</pub-id> <pub-id pub-id-type="pmid">24253811</pub-id></citation></ref>
<ref id="B77"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zou</surname> <given-names>Q.</given-names></name> <name><surname>Ma</surname> <given-names>Q.</given-names></name></person-group> (<year>2019</year>). <article-title>The application of machine learning to disease diagnosis and treatment.</article-title> <source><italic>Math. Biosci.</italic></source> <volume>320</volume>:<issue>108305</issue>. <pub-id pub-id-type="doi">10.1016/j.mbs.2019.108305</pub-id> <pub-id pub-id-type="pmid">1857093</pub-id></citation></ref>
</ref-list>
<glossary>
<title>Abbreviations</title>
<def-list id="DL1">
<def-item><term>CwHCC</term><def><p>cirrhosis tissues in patients with HCC</p></def></def-item>
<def-item><term>CwoHCC</term><def><p>cirrhosis tissues in patients without HCC</p></def></def-item>
<def-item><term>HCC</term><def><p>hepatocellular carcinoma</p></def></def-item>
<def-item><term>IFS</term><def><p>incremental feature selection</p></def></def-item>
<def-item><term>mRMR</term><def><p>minimum redundancy maximum relevance</p></def></def-item>
<def-item><term>REOs</term><def><p>relative expression orderings</p></def></def-item>
<def-item><term>SVM</term><def><p>support vector machine</p></def></def-item>
</def-list>
</glossary>
<fn-group>
<fn id="footnote1">
<label>1</label>
<p><ext-link ext-link-type="uri" xlink:href="https://portal.gdc.cancer.gov/repository">https://portal.gdc.cancer.gov/repository</ext-link></p></fn>
<fn id="footnote2">
<label>2</label>
<p><ext-link ext-link-type="uri" xlink:href="http://metascape.org">http://metascape.org</ext-link></p></fn>
</fn-group>
</back>
</article>