<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Genet.</journal-id>
<journal-title>Frontiers in Genetics</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Genet.</abbrev-journal-title>
<issn pub-type="epub">1664-8021</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fgene.2020.00269</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Genetics</subject>
<subj-group>
<subject>Methods</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>MASQC: Next Generation Sequencing Assists Third Generation Sequencing for Quality Control in N6-Methyladenine DNA Identification</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Yang</surname> <given-names>Siqian</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="author-notes" rid="fn002"><sup>&#x2020;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/831092/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Wang</surname> <given-names>Yaoxin</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="author-notes" rid="fn002"><sup>&#x2020;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/835078/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Chen</surname> <given-names>Ying</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/877221/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Dai</surname> <given-names>Qi</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="corresp" rid="c002"><sup>&#x002A;</sup></xref>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>College of Life Sciences and Medicine, Zhejiang Sci-Tech University</institution>, <addr-line>Hangzhou</addr-line>, <country>China</country></aff>
<aff id="aff2"><sup>2</sup><institution>State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University</institution>, <addr-line>Guangzhou</addr-line>, <country>China</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Xiaochen Bo, Academy of Military Medical Sciences, China</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Shengli Zhang, Xidian University, China; Yusen Zhang, Shandong University, China</p></fn>
<corresp id="c001">&#x002A;Correspondence: Ying Chen, <email>chenying2016@gmail.com</email></corresp>
<corresp id="c002">Qi Dai, <email>daiailiu04@yahoo.com</email></corresp>
<fn fn-type="other" id="fn002"><p><sup>&#x2020;</sup>These authors have contributed equally to this work and share first authorship</p></fn>
<fn fn-type="other" id="fn004"><p>This article was submitted to Genomic Assay Technology, a section of the journal Frontiers in Genetics</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>24</day>
<month>03</month>
<year>2020</year>
</pub-date>
<pub-date pub-type="collection">
<year>2020</year>
</pub-date>
<volume>11</volume>
<elocation-id>269</elocation-id>
<history>
<date date-type="received">
<day>25</day>
<month>10</month>
<year>2019</year>
</date>
<date date-type="accepted">
<day>05</day>
<month>03</month>
<year>2020</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2020 Yang, Wang, Chen and Dai.</copyright-statement>
<copyright-year>2020</copyright-year>
<copyright-holder>Yang, Wang, Chen and Dai</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>DNA N6-methyladenine (6mA) has been identified as the most prevalent DNA modification in prokaryotes and eukaryotes, and it is involved in gene expression, DNA replication and repair, and host-pathogen interactions. Single-molecule real-time sequencing (SMRT-seq) can detect 6mA events in prokaryotic and eukaryotic genomes at the single-nucleotide level. However, there are no strict and economical quality control methods for the high rate of false-positive 6mA events in eukaryotic genomes. Therefore, by analyzing the distribution of 6mA in eukaryotes and prokaryotes, we propose a method named MASQC (MeDIP-seq assists SMRT-seq for quality control in 6mA identification), which can identify 6mA events without whole genome amplification (WGA) sequencing. MASQC was assessed on two eukaryotic genomes and six bacterial genomes. Our results demonstrate that MASQC performs well in controlling false-positive 6mA identification for both eukaryotic and prokaryotic genomes.</p>
</abstract>
<kwd-group>
<kwd>DNA N6-methyladenine</kwd>
<kwd>MeDIP-seq</kwd>
<kwd>SMRT-seq</kwd>
<kwd>eukaryotes</kwd>
<kwd>prokaryotes</kwd>
</kwd-group>
<contract-num rid="cn001">61772028</contract-num>
<contract-num rid="cn001">91953122</contract-num>
<contract-num rid="cn001">31701146</contract-num>
<contract-sponsor id="cn001">National Natural Science Foundation of China<named-content content-type="fundref-id">10.13039/501100001809</named-content></contract-sponsor>
<counts>
<fig-count count="7"/>
<table-count count="0"/>
<equation-count count="9"/>
<ref-count count="32"/>
<page-count count="10"/>
<word-count count="0"/>
</counts>
</article-meta>
</front>
<body>
<sec id="S1">
<title>Introduction</title>
<p>Epigenetics is the study of changes in gene expression that are not caused by changes in the gene sequence. The epigenetic control of gene expression mainly includes DNA methylation, histone modification, chromatin remodeling and non-coding RNA regulation (<xref ref-type="bibr" rid="B9">Geiman and Robertson, 2002</xref>), among which DNA methylation plays an important role in regulating gene expression (<xref ref-type="bibr" rid="B3">Calicchio et al., 2014</xref>). It is well known that C5-methylcytosine (5mC) and N6-methyldeoxyadenosine (6mA) are the most abundant and predominant DNA methylation modifications and play a crucial role in both eukaryotic and prokaryotic life processes (<xref ref-type="bibr" rid="B20">Ratel et al., 2006</xref>; <xref ref-type="bibr" rid="B14">Liu et al., 2016</xref>).</p>
<p>The 5mC modification, which regulates diverse biological functions and life processes, has been well studied in prokaryotes and eukaryotes. In contrast, the 6mA modification is commonly associated with restriction-modification (RM) systems that defend hosts against invading foreign genomes (<xref ref-type="bibr" rid="B8">Fu et al., 2015</xref>), but in-depth research on it made little progress because of the limitations of earlier detection technologies. Subsequently, the development of specific antibodies and next generation sequencing technology made it possible to detect the conservative regions in which 6mA events occur. Based on these techniques, previous studies have reported the detection of 6mA events in <italic>C. elegans</italic> (<xref ref-type="bibr" rid="B10">Greer et al., 2015</xref>), <italic>D. melanogaster</italic> (<xref ref-type="bibr" rid="B28">Zhang et al., 2015</xref>), <italic>Homo sapiens</italic> (<xref ref-type="bibr" rid="B26">Xiao et al., 2018</xref>), <italic>S. cerevisiae</italic> (<xref ref-type="bibr" rid="B18">Mondo et al., 2017</xref>), and <italic>Chlamydomonas reinhardtii</italic> (<xref ref-type="bibr" rid="B8">Fu et al., 2015</xref>).</p>
<p>At present, a variety of methods have been proposed to detect 6mA events in eukaryotic and prokaryotic genomes, including bisulfite sequencing (<xref ref-type="bibr" rid="B23">Svadbina et al., 2004</xref>), methylated DNA immunoprecipitation sequencing (MeDIP-seq) (<xref ref-type="bibr" rid="B30">Zhao et al., 2014</xref>), restriction enzyme-based 6mA sequencing (RE-seq) (<xref ref-type="bibr" rid="B15">Luo et al., 2016</xref>), single-molecule real-time sequencing (SMRT-seq) (<xref ref-type="bibr" rid="B7">Flusberg et al., 2010</xref>) and Nanopore sequencing (ONT-seq) (<xref ref-type="bibr" rid="B2">Branton et al., 2008</xref>). Previously, genome-wide detection of DNA methylation relied mainly on bisulfite sequencing or next generation sequencing of immunoprecipitated methylated DNA (<xref ref-type="bibr" rid="B22">Shanmuganathan et al., 2013</xref>), but the short reads made it difficult to accurately identify methylation in genomic repeat regions. Although methylated DNA immunoprecipitation (MeDIP) can detect the genomic region in which a 6mA event occurs, it cannot identify the 6mA event at single-nucleotide resolution (<xref ref-type="bibr" rid="B31">Zhu et al., 2016</xref>; <xref ref-type="bibr" rid="B19">Rand et al., 2017</xref>).</p>
<p>Single-molecule real-time (SMRT) sequencing by Pacific Biosciences enables genome-wide mapping of the 6mA modification at single-nucleotide resolution, and even at the single-molecule level, by monitoring the fluorescence pulses of single-nucleotide events (<xref ref-type="bibr" rid="B11">Koren and Phillippy, 2015</xref>; <xref ref-type="bibr" rid="B24">VanBuren et al., 2015</xref>). The time between successive fluorescence pulses is termed the inter-pulse duration (IPD) (<xref ref-type="bibr" rid="B7">Flusberg et al., 2010</xref>; <xref ref-type="bibr" rid="B6">Feng et al., 2013</xref>). The IPD ratio is the ratio of the IPD observed at a reference position on each strand to the control IPD. Control IPDs are supplied either by an <italic>in silico</italic> computational model or by IPDs observed from unmodified &#x201C;control&#x201D; DNA. The IPD ratio reflects the deviation of the IPD distribution from the expected level, and these deviations are strongly related to modifications of neighboring nucleotides. With the help of the IPD ratio from SMRT sequencing, a host of 6mA events have been detected in hundreds of bacterial and archaeal genomes (<xref ref-type="bibr" rid="B21">Sanchez-Romero et al., 2015</xref>; <xref ref-type="bibr" rid="B1">Blow et al., 2016</xref>). Although SMRT sequencing has also been used to detect 6mA events in eukaryotes (<xref ref-type="bibr" rid="B10">Greer et al., 2015</xref>), its application still faces enormous challenges.</p>
<p>There are many differences between the 6mA events in eukaryotic and prokaryotic organisms. Firstly, the 6mA abundance (6mA/A) in eukaryotes is lower than that in prokaryotes (<xref ref-type="bibr" rid="B4">Casadesus and Low, 2006</xref>), and the detection of DNA methylation has a certain false positive rate (FPR). In eukaryotes, the lower the 6mA abundance, the higher the 6mA FPR, so the true 6mA events are overwhelmed by a large number of false positive events (<xref ref-type="bibr" rid="B5">Fang et al., 2012</xref>). Secondly, 6mA events in prokaryotes are highly sequence specific because of their participation in the RM system. Typically, 6mA events in a prokaryotic genome occur almost exclusively (&#x003E;95%) on several particular motifs. In contrast, 6mA events are only weakly motif driven in eukaryotes, probably because they participate in functional regulation rather than the RM system (<xref ref-type="bibr" rid="B25">Wu et al., 2016</xref>). For instance, only a small fraction (&#x003C;3%) of occurrences on motifs have been recognized as true 6mA events in <italic>C. reinhardtii</italic> and <italic>C. elegans</italic>. Lastly, other types of DNA modifications (DNA damage, 5mC and derivatives produced during demethylation) in adjacent bases may interfere with the IPD ratio of adenine sites, leading to a high FPR in 6mA detection. To reduce the FPR, whole genome amplified DNA (WGA DNA, which is unmethylated) must be sequenced as a control, but WGA SMRT sequencing is extremely expensive. There is a pressing need for an efficient and cost-effective computational method to reduce the FPR of 6mA identification.</p>
<p>With the above problems in mind, we propose a statistical method to control the FPR of 6mA identification with the help of MeDIP-seq datasets. Taking full advantage of the peak regions from MeDIP-seq datasets, we selected the 6mA events detected by SMRT sequencing within those regions and directly calculated an IPD ratio threshold to filter out a large number of false positive events. In addition, the proposed method makes no use of WGA data, which significantly lowers the cost of sequencing.</p>
</sec>
<sec id="S2" sec-type="materials|methods">
<title>Materials and Methods</title>
<sec id="S2.SS1">
<title>MeDIP Sequencing Data and SMRT Sequencing Data</title>
<p>The raw data files of SMRT-seq and MeDIP-seq used in this study were downloaded from the NCBI SRA database. They include MeDIP-seq raw data for <italic>C. elegans</italic> (<xref ref-type="bibr" rid="B10">Greer et al., 2015</xref>), the SMRT-seq dataset for <italic>C. elegans</italic> from the published results of Shi, Y. and colleagues (<xref ref-type="bibr" rid="B10">Greer et al., 2015</xref>), MeDIP-seq raw data for <italic>C. reinhardtii</italic> (<xref ref-type="bibr" rid="B8">Fu et al., 2015</xref>), SMRT-seq raw data and WGA raw data for <italic>C. reinhardtii</italic> (<xref ref-type="bibr" rid="B32">Zhu et al., 2018</xref>), and MeDIP-seq raw data and SMRT-seq raw data for six bacterial genomes (<italic>E. coli</italic>, <italic>B. subtilis</italic>, <italic>E. faecalis</italic>, <italic>S. aureus</italic>, <italic>S. enterica</italic>, and <italic>L. monocytogenes</italic>) (<xref ref-type="bibr" rid="B16">McIntyre et al., 2019</xref>). A detailed description of these raw data can be found in the <xref ref-type="supplementary-material" rid="DS1">Supplementary Material</xref>.</p>
</sec>
<sec id="S2.SS2">
<title>MeDIP-seq Assists SMRT-seq for 6mA Quality Control (MASQC) Framework</title>
<p>MASQC is a statistical method that combines MeDIP-seq with SMRT-seq. The input files of MASQC include a reference genome, h5-format files generated by PacBio RSII sequencers and MeDIP-seq data generated by Illumina sequencers. The output includes files of 6mA peak regions and datasets of 6mA sites before and after threshold filtering. MASQC consists of the steps shown in <xref ref-type="fig" rid="F1">Figure 1</xref>.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption><p>Overview of the MASQC method.</p></caption>
<graphic xlink:href="fgene-11-00269-g001.tif"/>
</fig>
<list list-type="simple">
<list-item>
<label>(1)</label>
<p>The input MeDIP-seq datasets consist of a reference genome and the Input and IP read files. Input and IP reads were aligned to their reference genomes using BWA-MEM (<xref ref-type="bibr" rid="B12">Li and Durbin, 2009</xref>), and the peaks were called using MACS2 (--nomodel) (<xref ref-type="bibr" rid="B29">Zhang et al., 2008</xref>). Peak regions are written to the output file whose name ends with &#x201C;narrowPeak.&#x201D;</p>
</list-item>
<list-item>
<label>(2)</label>
<p>PacBio SMRT Tools (version 2.3.0) was used to detect DNA 6mA modifications for each strain<sup><xref ref-type="fn" rid="footnote1">1</xref></sup>. In brief, an initial filtering step removed reads containing adapters, short reads and other low-quality reads using the cutoffs MapQV &#x2264; 240, read quality &#x2264; 0.75, read length &#x2264; 500 nt, and subread length &#x2264; 50 nt in eukaryotes (<xref ref-type="bibr" rid="B13">Liang et al., 2018</xref>), whereas default parameters were used in prokaryotes. The detailed analysis workflow was as follows. First, the clean reads were aligned to the corresponding reference genome of each strain with pbalign. Second, the polymerase kinetics information was loaded with loadChemistry.py and loadPulses. Finally, the aligned datasets were sorted with cmph5tools, and 6mA was identified with the ipdSummary.py script. To ensure reliable detection, 6mA events with less than 50-fold coverage per chromosome of each strain were excluded from further analysis.</p>
</list-item>
<list-item>
<label>(3)</label>
<p>MASQC uses peak regions to construct a new conservative dataset of 6mA events in the overlap regions, which contains key features from both MeDIP-seq and SMRT-seq. These features are extracted from the output modification files and peak files and include coverage, fraction, score, enrichment and &#x2212;log<sub>10</sub>(<italic>q</italic>-value), which are described below.</p>
</list-item>
</list>
<list list-type="simple">
<list-item>
<label>(i)</label>
<p>Coverage refers to the read coverage at a position called as a 6mA base; by default, the coverage at such a position is at least 10x.</p>
</list-item>
<list-item>
<label>(ii)</label>
<p>Fraction refers to the fraction of reads aligned to this position that carry a 6mA base.</p>
</list-item>
<list-item>
<label>(iii)</label>
<p>Score refers to the reliability of a 6mA call from the SMRT analysis; 20 is the minimum default threshold for the datasets and corresponds to a <italic>p</italic>-value of 0.01, and a score of 30 corresponds to a <italic>p</italic>-value of 0.001.</p>
</list-item>
<list-item>
<label>(iv)</label>
<p>Enrichment refers to the enrichment factor of peak (relative to random Poisson distribution with local lambda).</p>
</list-item>
<list-item>
<label>(v)</label>
<p>&#x2212;log<sub>10</sub>(<italic>q</italic>-value) evaluates the reliability of the peak [the default <italic>q</italic>-value &#x003C; 0.05 corresponds to &#x2212;log<sub>10</sub>(<italic>q</italic>-value) &#x003E; 1.3, and <italic>q</italic>-value &#x003C; 0.01 corresponds to &#x2212;log<sub>10</sub>(<italic>q</italic>-value) &#x003E; 2].</p>
</list-item>
</list>
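<p>The feature extraction in step (3) can be illustrated with a minimal Python sketch. It is not the MASQC implementation: the class names <monospace>Peak</monospace> and <monospace>Site</monospace> and all function names are hypothetical, the peak columns follow the standard MACS2 narrowPeak layout (column 7 as the enrichment, column 9 as the negative base-10 logarithm of the <italic>q</italic>-value), and site positions are assumed to have already been converted to the same 0-based coordinates as the peaks.</p>
<preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    chrom: str
    start: int           # 0-based, inclusive (BED-style)
    end: int             # exclusive
    enrichment: float    # narrowPeak column 7 (signalValue)
    neg_log10_q: float   # narrowPeak column 9, reported as -log10(q-value)


@dataclass(frozen=True)
class Site:
    chrom: str
    pos: int             # assumed to use the same 0-based coordinates as Peak
    coverage: int
    score: int
    fraction: float
    ipd_ratio: float


def parse_narrowpeak_line(line):
    """Build a Peak from one tab-separated narrowPeak record."""
    f = line.rstrip("\n").split("\t")
    return Peak(f[0], int(f[1]), int(f[2]), float(f[6]), float(f[8]))


def score_to_p_value(score):
    """Invert score = -10 * log10(p), so 20 maps to 0.01 and 30 to 0.001."""
    return 10 ** (-score / 10)


def sites_in_peaks(sites, peaks):
    """Return (site, peak) pairs for every site lying inside a peak."""
    peaks_by_chrom = {}
    for peak in peaks:
        peaks_by_chrom.setdefault(peak.chrom, []).append(peak)
    pairs = []
    for site in sites:
        for peak in peaks_by_chrom.get(site.chrom, []):
            if site.pos >= peak.start and peak.end > site.pos:
                pairs.append((site, peak))
                break  # one enclosing peak is enough to keep the site
    return pairs
```
</preformat>
<p>Each returned pair carries the five features listed above (coverage, fraction and score from the site; enrichment and peak significance from the peak), together with the IPD ratio used in the later steps. The linear scan over peaks is chosen for readability; an interval index would be preferable for large genomes.</p>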
<p>The IPD ratio is not stable because it can be influenced by various factors (background value, noise, etc.), whereas the peak regions of MeDIP-seq are conservative and reliable, so the sites that fall within peaks are more reliable. We calculated the mean of these reliable IPD ratios and used the confidence interval of the mean to select the most reliable sites from the raw data. According to the principles of SMRT sequencing and MeDIP-seq, the higher the probability of 6mA methylation events in the peak regions, the higher the detected 6mA fraction (0.7&#x223C;1). To obtain a dataset as close as possible to a fully true dataset, MASQC first applies stricter filtering to the peak regions [enrichment &#x2265; 1, &#x2212;log<sub>10</sub>(<italic>q</italic>-value) &#x2265; 2] and to the sites detected by SMRT analysis (coverage &#x2265; 50, score &#x2265; 30, fraction &#x2265; 0.7). The filtered dataset is very close to the expected fully true dataset. We assume that the IPD ratios of the expected fully true dataset follow a normal distribution; a sample is therefore drawn from the filtered dataset, and the population distribution is inferred from the sample distribution. The normal distribution equation is</p>
<disp-formula id="S2.E1">
<label>(1)</label>
<mml:math id="M1">
<mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>&#x2062;</mml:mo>
<mml:mi mathvariant="normal">&#x03C0;</mml:mi>
</mml:mrow>
</mml:msqrt>
<mml:mo>&#x2062;</mml:mo>
<mml:mi mathvariant="normal">&#x03C3;</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mo>&#x2062;</mml:mo>
<mml:mi mathvariant="italic">exp</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mfrac>
<mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>-</mml:mo>
<mml:mi mathvariant="normal">&#x03BC;</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>&#x2062;</mml:mo>
<mml:msup>
<mml:mi mathvariant="normal">&#x03C3;</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where &#x03BC; is the mean and &#x03C3; is the standard deviation of the distribution, both of which are estimated from the sample. If a random variable X follows a normal distribution with mean &#x03BC; and variance &#x03C3;<sup>2</sup>, it is denoted N (&#x03BC;, &#x03C3;<sup>2</sup>). The equation indicates that &#x03BC; determines the position of the normal distribution and that its standard deviation &#x03C3; determines the spread of the distribution. When &#x03BC; = 0 and &#x03C3; = 1, the normal distribution is the standard normal distribution. According to the central limit theorem, the mean and variance of the population can be estimated from the sample. Therefore, the 95% confidence interval of the overall IPD ratio can be inferred from the mean of the sample IPD ratio. MASQC obtains the 95% confidence interval using Student&#x2019;s <italic>t</italic>-distribution. When the variance &#x03C3;<sup>2</sup> of population X is unknown, the sample variance <italic>S</italic><sup>2</sup> is used in place of &#x03C3;<sup>2</sup>, so the 95% confidence interval of &#x03BC; is</p>
<disp-formula id="S2.E2">
<label>(2)</label>
<mml:math id="M2">
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>X</mml:mi>
<mml:mo>&#x00AF;</mml:mo>
</mml:mover>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mi>S</mml:mi>
<mml:msqrt>
<mml:mi>n</mml:mi>
</mml:msqrt>
</mml:mfrac>
<mml:mo>&#x2062;</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mfrac>
<mml:mi mathvariant="normal">&#x03B1;</mml:mi>
<mml:mn>2</mml:mn>
</mml:mfrac>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>X</mml:mi>
<mml:mo>&#x00AF;</mml:mo>
</mml:mover>
<mml:mo>+</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mi>S</mml:mi>
<mml:msqrt>
<mml:mi>n</mml:mi>
</mml:msqrt>
</mml:mfrac>
<mml:mo>&#x2062;</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mfrac>
<mml:mi mathvariant="normal">&#x03B1;</mml:mi>
<mml:mn>2</mml:mn>
</mml:mfrac>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where &#x03B1; = 0.05, the sample size <italic>n</italic> is 30, and <inline-formula><mml:math id="INEQ10"><mml:mrow><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mfrac><mml:mi mathvariant="normal">&#x03B1;</mml:mi><mml:mn>2</mml:mn></mml:mfrac></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mn>1.96</mml:mn></mml:mrow></mml:math></inline-formula> is used as the critical value (1.96 is the large-sample normal approximation; the tabulated <italic>t</italic> value for 29 degrees of freedom is approximately 2.045). MASQC takes the lower bound of the confidence interval, <inline-formula><mml:math id="INEQ12"><mml:mrow><mml:mover accent="true"><mml:mi>X</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mo>-</mml:mo><mml:mrow><mml:mfrac><mml:mi>S</mml:mi><mml:msqrt><mml:mi>n</mml:mi></mml:msqrt></mml:mfrac><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mfrac><mml:mi mathvariant="normal">&#x03B1;</mml:mi><mml:mn>2</mml:mn></mml:mfrac></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula>, as the threshold.</p>
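<p>The threshold calculation described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the MASQC code: the function names, the dictionary keys and the <monospace>seed</monospace> argument are hypothetical, and the critical value defaults to the 1.96 quoted in the text so that a different value can be supplied if desired.</p>
<preformat>
```python
import math
import random
import statistics


def strict_filter(records, min_enrichment=1.0, min_neg_log10_q=2.0,
                  min_coverage=50, min_score=30, min_fraction=0.7):
    """Keep records (dicts) that pass the stricter cutoffs given in the text."""
    return [
        r for r in records
        if r["enrichment"] >= min_enrichment
        and r["neg_log10_q"] >= min_neg_log10_q
        and r["coverage"] >= min_coverage
        and r["score"] >= min_score
        and r["fraction"] >= min_fraction
    ]


def ipd_ratio_threshold(ipd_ratios, n=30, critical_value=1.96, seed=None):
    """Lower bound of the confidence interval of the mean IPD ratio (Eq. 2)."""
    rng = random.Random(seed)
    sample = rng.sample(ipd_ratios, n)   # needs at least n filtered sites
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)         # sample standard deviation S
    return mean - critical_value * s / math.sqrt(n)
```
</preformat>
<p>Because the sample is drawn at random, repeated calls without a fixed seed give slightly different thresholds, which is the sampling effect examined in the Results.</p>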
<p>(4) Given the IPD ratio threshold, most false positive 6mA events can be filtered out.</p>
<disp-formula id="S2.E3">
<label>(3)</label>
<mml:math id="M3">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo rspace="5.3pt">=</mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2265;</mml:mo>
<mml:mi mathvariant="italic">thres</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>T</italic> denotes the 6mA events after quality control, <italic>N</italic> denotes the total 6mA events, <italic>i</italic> denotes the IPD ratio of an event and <italic>thres</italic> denotes the IPD ratio threshold.</p>
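<p>Equation 3 amounts to a single comparison per event. A minimal Python sketch is given below; the function name and the mapping from site identifiers to IPD ratios are hypothetical choices made for illustration.</p>
<preformat>
```python
def apply_threshold(ipd_ratio_by_site, thres):
    """Eq. 3: keep the events whose IPD ratio i satisfies i >= thres."""
    return {
        site: ratio
        for site, ratio in ipd_ratio_by_site.items()
        if ratio >= thres
    }


# Toy input: three candidate 6mA events keyed by (chromosome, position).
candidates = {("chrI", 150): 4.2, ("chrI", 900): 1.6, ("chrII", 42): 3.1}
retained = apply_threshold(candidates, thres=3.0)
```
</preformat>
<p>With the toy values above, the two events with IPD ratios of 4.2 and 3.1 are retained and the event with an IPD ratio of 1.6 is removed.</p>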
</sec>
<sec id="S2.SS3">
<title>Evaluation and Verification</title>
<p>We compared the number of published 6mA-containing motifs for each species before and after the threshold filtering performed by MASQC. Three tests were used to evaluate the performance of MASQC. We also analyzed the change in the proportion of published 6mA-containing motifs in peak regions before and after threshold filtering to verify MASQC. <italic>P</italic><sub>1</sub>, <italic>P</italic><sub>2</sub>, <italic>P</italic><sub>3</sub>, and <italic>P</italic><sub>4</sub> denote the proportions of a single motif in the states PacBio, PacBio + MeDIP, PacBio + threshold and PacBio + MeDIP + threshold, respectively. <italic>I</italic> and <italic>D</italic> are the increase and decrease in the proportions of total motifs before and after threshold filtering. <italic>N</italic> is the total number of 6mA events, <italic>m</italic> is the number of 6mA events on a single motif and <italic>M</italic> is the number of 6mA events on all motifs in each strain.</p>
<disp-formula id="S2.E4">
<label>(4)</label>
<mml:math id="M4">
<mml:mrow>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mi>m</mml:mi>
<mml:mpadded width="+2.8pt">
<mml:mi>N</mml:mi>
</mml:mpadded>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="S2.E5">
<label>(5)</label>
<mml:math id="M5">
<mml:mrow>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mpadded width="+2.8pt">
<mml:mfrac>
<mml:msub>
<mml:mi>m</mml:mi>
<mml:mi mathvariant="italic">peak</mml:mi>
</mml:msub>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi mathvariant="italic">peak</mml:mi>
</mml:msub>
</mml:mfrac>
</mml:mpadded>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="S2.E6">
<label>(6)</label>
<mml:math id="M6">
<mml:mrow>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mn>3</mml:mn>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>&#x2265;</mml:mo>
<mml:mi mathvariant="italic">thres</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>&#x2265;</mml:mo>
<mml:mi mathvariant="italic">thres</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="S2.E7">
<label>(7)</label>
<mml:math id="M7">
<mml:mrow>
<mml:mi>I</mml:mi>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>M</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>&#x2265;</mml:mo>
<mml:mi mathvariant="italic">thres</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>&#x2265;</mml:mo>
<mml:mi mathvariant="italic">thres</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
<mml:mo>-</mml:mo>
<mml:mfrac>
<mml:mi>M</mml:mi>
<mml:mi>N</mml:mi>
</mml:mfrac>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="S2.E8">
<label>(8)</label>
<mml:math id="M8">
<mml:mrow>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mn>4</mml:mn>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mi>m</mml:mi>
<mml:mi mathvariant="italic">peak</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>&#x2265;</mml:mo>
<mml:mi mathvariant="italic">thres</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi mathvariant="italic">peak</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>&#x2265;</mml:mo>
<mml:mi mathvariant="italic">thres</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="S2.E9">
<label>(9)</label>
<mml:math id="M9">
<mml:mrow>
<mml:mi>D</mml:mi>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>-</mml:mo>
<mml:mfrac>
<mml:mi>M</mml:mi>
<mml:mi>N</mml:mi>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>-</mml:mo>
<mml:mpadded width="+2.8pt">
<mml:mfrac>
<mml:mrow>
<mml:mi>N</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>&#x2265;</mml:mo>
<mml:mi mathvariant="italic">thres</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>-</mml:mo>
<mml:mi>M</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>&#x2265;</mml:mo>
<mml:mi mathvariant="italic">thres</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mi>N</mml:mi>
</mml:mfrac>
</mml:mpadded>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
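<p>Equations 4 to 9 can be computed directly from a list of annotated events, as in the following Python sketch. The event representation (dictionaries with the keys <monospace>ipd_ratio</monospace>, <monospace>in_peak</monospace> and <monospace>motif</monospace>) and the function names are assumptions made for illustration, and the reading of <italic>m</italic> and <italic>M</italic> as counts of events on a single motif and on any published motif is an interpretation of the definitions above. Equation 9 is implemented as written, with <italic>N</italic> in the denominator of the second term.</p>
<preformat>
```python
def ratio(numerator, denominator):
    """Safe division that returns 0.0 for an empty denominator."""
    return numerator / denominator if denominator else 0.0


def evaluate(events, motif, thres):
    """Compute P1-P4, I and D (Eqs. 4-9) for one motif and one threshold."""
    kept = [e for e in events if e["ipd_ratio"] >= thres]
    peak = [e for e in events if e["in_peak"]]
    kept_peak = [e for e in kept if e["in_peak"]]

    def m(subset):  # events on the single motif of interest
        return sum(1 for e in subset if e["motif"] == motif)

    def M(subset):  # events on any published motif
        return sum(1 for e in subset if e["motif"] is not None)

    N = len(events)
    return {
        "P1": ratio(m(events), N),
        "P2": ratio(m(peak), len(peak)),
        "P3": ratio(m(kept), len(kept)),
        "P4": ratio(m(kept_peak), len(kept_peak)),
        "I": ratio(M(kept), len(kept)) - ratio(M(events), N),
        "D": (1 - ratio(M(events), N)) - ratio(len(kept) - M(kept), N),
    }


# Toy data: two motif events with high IPD ratios, two non-motif events.
toy_events = [
    {"ipd_ratio": 5.0, "in_peak": True, "motif": "GATC"},
    {"ipd_ratio": 4.0, "in_peak": False, "motif": "GATC"},
    {"ipd_ratio": 1.5, "in_peak": True, "motif": None},
    {"ipd_ratio": 1.2, "in_peak": False, "motif": None},
]
toy_result = evaluate(toy_events, motif="GATC", thres=2.0)
```
</preformat>
<p>In this toy case the threshold removes both non-motif events, so the motif proportion rises from 0.5 (<italic>P</italic><sub>1</sub>) to 1.0 (<italic>P</italic><sub>3</sub>), and <italic>I</italic> and <italic>D</italic> are both 0.5.</p>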
</sec>
</sec>
<sec id="S3">
<title>Results</title>
<sec id="S3.SS1">
<title>Influence of the Thresholds</title>
<p>MASQC sets the threshold to the lower bound of the confidence interval inferred from the IPD ratios of the sample. However, it must be pointed out that the threshold changes between experiments because of sampling bias. To assess the stability of the thresholds generated by MASQC, we tested the datasets of eight species three times. As shown in <xref ref-type="fig" rid="F2">Figure 2</xref>, the deviations among the three thresholds for each species are very small, which indicates that the variation in the thresholds generated by MASQC has little effect on the final results after filtration (<xref ref-type="supplementary-material" rid="DS1">Supplementary Table 1</xref>).</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption><p>The IPD ratio thresholds across the datasets of eight species. The thresholds were generated by running MASQC three times for each of the eight species.</p></caption>
<graphic xlink:href="fgene-11-00269-g002.tif"/>
</fig>
</sec>
<sec id="S3.SS2">
<title>Comparative Analysis of Single Motif</title>
<p>To compare the proportions of 6mA-containing motifs per species before and after filtration, we selected 18 motifs from two eukaryotic and six bacterial genomes (<xref ref-type="bibr" rid="B16">McIntyre et al., 2019</xref>). AAGANNNNNCTC and GAGNNNNNTCTT in <italic>E. coli</italic>, GATCGVNY in <italic>S. aureus</italic>, BATGCATV in <italic>S. enterica</italic> and ANARAGTANYR in <italic>L. monocytogenes</italic> are few in number, resulting in a lower probability of containing 6mA events. As shown in <xref ref-type="fig" rid="F3">Figure 3</xref>, the proportions of 6mA-containing motifs after threshold filtering in <italic>C. elegans</italic>, <italic>C. reinhardtii</italic>, <italic>E. coli</italic>, <italic>S. aureus</italic>, <italic>B. subtilis</italic>, <italic>E. faecalis</italic>, <italic>S. enterica</italic>, and <italic>L. monocytogenes</italic> were significantly higher than those without threshold filtering, whereas in prokaryotes the proportions of 6mA-containing motifs in peak regions were stable before and after filtering. The result suggests that the threshold filters out a large number of non-motif events and only a few motif events that may contain true 6mA events. As for <italic>C. elegans</italic> and <italic>C. reinhardtii</italic>, threshold filtration did not significantly increase the proportions of 6mA-containing motifs, which is related to the fact that 6mA events in eukaryotes are weakly motif driven.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption><p>Proportions of identified 6mA-containing motifs for <italic>C. elegans</italic>, <italic>C. reinhardtii</italic>, <italic>E. coli</italic>, <italic>S. aureus</italic>, <italic>B. subtilis</italic>, <italic>E. faecalis</italic>, <italic>S. enterica</italic>, <italic>L. monocytogenes</italic>. The 6mA sites at different single motifs were identified as methylated by PacBio, PacBio + threshold, PacBio + MeDIP, and PacBio + MeDIP + threshold across the eight species.</p></caption>
<graphic xlink:href="fgene-11-00269-g003.tif"/>
</fig>
</sec>
<sec id="S3.SS3">
<title>Comparative Analysis of Filtered 6mA Events</title>
<p>To assess the quality of the 6mA events filtered through MASQC, we compared the proportions of motif and non-motif events with an IPD ratio below the threshold among all 6mA events (<xref ref-type="fig" rid="F4">Figure 4</xref>). Recent studies have found that events on motifs are more likely to be true 6mA events than those on non-motif sites. The proportions of filtered-out non-motif events are 98.0, 93.0, 51.3, 97.2, 97.5, 82.0, 52.4, and 97.8% for <italic>C. elegans</italic>, <italic>C. reinhardtii</italic>, <italic>E. coli</italic>, <italic>S. aureus</italic>, <italic>B. subtilis</italic>, <italic>E. faecalis</italic>, <italic>S. enterica</italic>, and <italic>L. monocytogenes</italic>, respectively, which are higher than those of filtered-out motifs. These results suggest that most of the 6mA events filtered out by the proposed threshold are likely to be false positives.</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption><p>Comparison of the proportions of filtered motifs and non-motifs. Green indicates the proportion of filtered motif sites and orange indicates the proportion of filtered non-motif sites.</p></caption>
<graphic xlink:href="fgene-11-00269-g004.tif"/>
</fig>
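<p>The comparison above can be sketched in a few lines of Python. The sketch assumes that the reported proportions are the motif and non-motif shares among the events removed by the threshold; that reading, the function name, the event fields and the toy values are illustrative assumptions rather than details of the MASQC scripts.</p>
<preformat>
```python
def filtered_out_proportions(events, threshold):
    """Motif and non-motif shares (in %) among events below the threshold."""
    removed = [e for e in events if threshold > e["ipd_ratio"]]
    if not removed:
        return {"motif": 0.0, "non_motif": 0.0}
    on_motif = sum(1 for e in removed if e["on_motif"])
    motif_pct = 100.0 * on_motif / len(removed)
    return {"motif": motif_pct, "non_motif": 100.0 - motif_pct}


# Hypothetical events: IPD ratio plus a flag for lying on a known motif.
events = [
    {"ipd_ratio": 2.0, "on_motif": False},
    {"ipd_ratio": 3.1, "on_motif": False},
    {"ipd_ratio": 3.8, "on_motif": True},
    {"ipd_ratio": 2.5, "on_motif": False},
    {"ipd_ratio": 5.2, "on_motif": True},
    {"ipd_ratio": 6.0, "on_motif": True},
]
shares = filtered_out_proportions(events, threshold=4.5)
print(shares)
```
</preformat>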
</sec>
<sec id="S3.SS4">
<title>Comparative Analysis of Total Motifs in Each Species</title>
<p>To analyze the distribution of total motifs, we compared their proportions before and after threshold filtration. As shown in <xref ref-type="fig" rid="F5">Figure 5</xref>, the proportions of total motifs increase slightly after the three threshold filtrations for <italic>C. elegans</italic> and <italic>C. reinhardtii</italic>. Because 6mA events in eukaryotes are weakly motif driven and the proportions of 6mA events on motifs are &#x003C;3%, the growth after threshold filtration is only 1.3% for <italic>C. elegans</italic> and 3.2% for <italic>Chlamydomonas</italic>. In contrast, 6mA methylation in bacteria is highly motif driven and the proportions of 6mA events on motifs are &#x003E;95%, so the proportions of total motifs improve far more than in eukaryotes. In detail, there is a growth of 24.8% for <italic>E. coli</italic>, 76.9% for <italic>S. aureus</italic>, 84.5% for <italic>B. subtilis</italic>, 74.1% for <italic>E. faecalis</italic>, 6.1% for <italic>S. enterica</italic>, and 73.7% for <italic>L. monocytogenes</italic>. The above results indicate that the proportions of total motifs increase after threshold filtration in both eukaryotes and prokaryotes.</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption><p>Comparison of proportions of total motifs before and after three threshold filtrations for eight species.</p></caption>
<graphic xlink:href="fgene-11-00269-g005.tif"/>
</fig>
</sec>
<sec id="S3.SS5">
<title>Comparative Analysis of Non-motif Events in Each Species</title>
<p>To determine the effectiveness of the proposed method, we further analyzed the distribution of non-motif events before and after threshold filtration. As shown in <xref ref-type="fig" rid="F6">Figure 6</xref>, the proportions of non-motif events decrease after the three threshold filtrations in all eight species. In detail, there is a decrease of 45.2% for <italic>C. elegans</italic>, 37.7% for <italic>C. reinhardtii</italic>, 25.5% for <italic>E. coli</italic>, 88.2% for <italic>S. aureus</italic>, 90.9% for <italic>B. subtilis</italic>, 74.1% for <italic>E. faecalis</italic>, 20.2% for <italic>S. enterica</italic> and 93.1% for <italic>L. monocytogenes</italic>. A comparative analysis of <xref ref-type="fig" rid="F5">Figures 5</xref>, <xref ref-type="fig" rid="F6">6</xref> shows that the proposed MASQC effectively filters out many false 6mA events on non-motif sites and only a few false 6mA events on motifs.</p>
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption><p>Comparison of proportions of non-motif sites before and after three threshold filtrations for eight species.</p></caption>
<graphic xlink:href="fgene-11-00269-g006.tif"/>
</fig>
</sec>
<sec id="S3.SS6">
<title>DNA N6-Methyladenine Identification in <italic>C. reinhardtii</italic></title>
<p><italic>Chlamydomonas</italic> is a classic eukaryotic model organism. Fu et al. identified the 6mA modification in 84% of genes in <italic>Chlamydomonas</italic> through MeDIP-seq, enzyme-treated DNA-seq, MNase-seq and RNA-seq (<xref ref-type="bibr" rid="B8">Fu et al., 2015</xref>). Fang et al. used WGA and PacBio SMRT sequencing to detect 6mA in <italic>C. reinhardtii</italic> at single-base resolution for the first time, which improved the accuracy of 6mA identification and reduced false positives in eukaryotes (<xref ref-type="bibr" rid="B32">Zhu et al., 2018</xref>). Similarly, we used the <italic>C. reinhardtii</italic> dataset to assess the proposed method MASQC.</p>
<p>Applying Fang&#x2019;s method to our data gave a threshold of IPD ratio &#x2265;4.3, which achieved 99.87% accuracy on <italic>C. reinhardtii</italic> motifs. Although this method performs well, whole genome SMRT sequencing is costly and requires WGA sequencing data as a control (<xref ref-type="bibr" rid="B32">Zhu et al., 2018</xref>). Herein, we calculated the IPD ratio threshold with MASQC, so that the 6mA events can be filtered by the threshold directly. <xref ref-type="fig" rid="F7">Figure 7</xref> compares the accuracies of the thresholds from MASQC and Fang&#x2019;s method in identifying 6mA events and motifs in <italic>C. reinhardtii</italic>.</p>
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption><p>Comparison of accuracies of the 6mA events and VATB (V = A/G/C, B = G/C/T) motifs in <italic>C. reinhardtii</italic> using the threshold from MASQC and Fang&#x2019;s method. The yellow ellipse is the proportion of the 6mA sites and motifs filtered by MASQC; the gray ellipse is the number of total motifs in <italic>C. reinhardtii</italic>; the green ellipse is the proportion of the 6mA sites and motifs filtered by Fang&#x2019;s method.</p></caption>
<graphic xlink:href="fgene-11-00269-g007.tif"/>
</fig>
<p>The threshold derived from the proposed method MASQC is about 4.5. When we used IPD ratio &#x2265;4.5 to filter the 6mA events in peak regions, 99.88% of the 6mA events passing the filter were on motifs, and they covered 56.3% of all VATB motifs (yellow ellipse) in <italic>C. reinhardtii</italic>. With IPD ratio &#x2265;4.3, 99.87% of the 6mA events passing the filter were on motifs, and they covered 61.1% of all VATB motifs (green ellipse) in <italic>C. reinhardtii</italic>. The comparison indicates that our method performs as well as Fang&#x2019;s method while not requiring WGA sequencing, which saves sequencing cost.</p>
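<p>To make the motif test concrete, the sketch below checks whether a candidate adenine lies on a VATB motif by expanding the IUPAC codes (V = A/G/C, B = G/C/T, as in the Figure 7 caption) into a regular expression. The function names, the assumption that the candidate adenine is the single A of the motif, and the example sequence are illustrative and do not come from the MASQC code.</p>
<preformat>
```python
import re

# IUPAC codes needed here; V and B follow the Figure 7 caption.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "V": "[ACG]", "B": "[CGT]", "N": "[ACGT]"}


def motif_to_regex(motif):
    """Translate an IUPAC motif such as VATB into a compiled regex."""
    return re.compile("".join(IUPAC[base] for base in motif))


def on_motif(sequence, a_index, motif="VATB", a_offset=1):
    """True if the adenine at a_index sits at position a_offset of the motif."""
    start = a_index - a_offset
    # Reject windows that would run off either end of the sequence.
    if start not in range(len(sequence) - len(motif) + 1):
        return False
    window = sequence[start:start + len(motif)]
    return motif_to_regex(motif).fullmatch(window) is not None


seq = "GCATGTTAAC"  # hypothetical sequence
print(on_motif(seq, 2))  # adenine inside CATG
print(on_motif(seq, 7))  # adenine inside TAAC
```
</preformat>
<p>Counting the events for which such a check succeeds, before and after applying an IPD ratio threshold, would give motif proportions of the kind compared here.</p>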
</sec>
</sec>
<sec id="S4">
<title>Discussion</title>
<p>DNA N6-methyladenine (6mA) mainly exists in prokaryotic genomes (<xref ref-type="bibr" rid="B20">Ratel et al., 2006</xref>). Recently, 6mA has also been discovered in eukaryotic genomes, which has opened up a new and promising direction for epigenetics research. With the development of specific antibodies and high-throughput sequencing technologies in the past 3 years, research on 6mA modification has made great breakthroughs in different species. In PacBio SMRT-seq, base modifications affect DNA polymerase kinetics and therefore produce different IPDs. SMRT-seq can thus detect not only 6mA events but also any form of DNA modification that significantly affects DNA polymerase kinetics, and hence the IPD (<xref ref-type="bibr" rid="B17">Michael et al., 2018</xref>). Different types of DNA modification (DNA damage, 5mC, and derivatives produced during demethylation) at or adjacent to the sites of interest may produce an IPD ratio similar to that of the adenine site, resulting in a high FPR of 6mA events (<xref ref-type="bibr" rid="B7">Flusberg et al., 2010</xref>). In bacterial genomes, DNA methylation is relatively limited in form (6mA, 5mC, 4mC) (<xref ref-type="bibr" rid="B27">Yu et al., 2015</xref>) and highly motif driven, which greatly reduces the difficulty of detecting 6mA events and distinguishing them from other DNA modifications. In contrast, 6mA events in eukaryotic genomes are much more abundant and weakly motif driven, which is why they may coexist with other forms of DNA modification. These differences between eukaryotic and bacterial methylomes need to be kept in mind when interpreting a putative 6mA call based on SMRT sequencing, to avoid misinterpreting false positive events.</p>
<p>This work aims to develop a general computational method for quality control of 6mA event identification from SMRT sequencing in both eukaryotic and prokaryotic genomes. Fang et al. proposed a method to identify 6mA methylation events in eukaryotes based on both native DNA and whole genome amplified DNA from the same sample, which lacks 6mA methylation (<xref ref-type="bibr" rid="B32">Zhu et al., 2018</xref>). Although it reached an accuracy of about 80% in Fang&#x2019;s paper, whole genome SMRT sequencing is extremely expensive. In this paper, the proposed MASQC controls the FPR of 6mA events with the help of MeDIP-seq datasets. Using the peak regions from the MeDIP-seq datasets, we filtered the 6mA events detected by SMRT sequencing and calculated the IPD ratio threshold directly to filter out a large number of false positive events. The results indicate that the accuracy of the proposed MASQC can reach about 99.88% in <italic>C. reinhardtii</italic>, which is comparable to the 99.87% obtained with Fang&#x2019;s method.</p>
<p>It is worth noting that the 6mA sites retained by the proposed MASQC may still contain a small number of false 6mA events, but these have little effect on subsequent epigenetic studies. Researchers can combine the parameter &#x201C;fraction &#x003E; 0.7&#x201D; with the threshold generated by MASQC to perform more rigorous filtration and obtain a more conservative set of true 6mA events.</p>
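<p>A minimal sketch of this stricter, combined filter is given below. The event field names (<monospace>ipd_ratio</monospace>, <monospace>fraction</monospace>), the function name and the toy values are hypothetical; only the two criteria themselves come from the text above.</p>
<preformat>
```python
def strict_filter(events, threshold, min_fraction=0.7):
    """Keep events passing both the IPD ratio threshold and the fraction cut-off."""
    return [
        e for e in events
        if e["ipd_ratio"] >= threshold and e["fraction"] > min_fraction
    ]


# Hypothetical events; field names and values are made up for this sketch.
events = [
    {"pos": 101, "ipd_ratio": 5.3, "fraction": 0.92},
    {"pos": 240, "ipd_ratio": 4.8, "fraction": 0.55},
    {"pos": 377, "ipd_ratio": 3.9, "fraction": 0.88},
]
conservative = strict_filter(events, threshold=4.5)
print([e["pos"] for e in conservative])
```
</preformat>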
</sec>
<sec id="S5">
<title>Data Availability Statement</title>
<p>All datasets generated for this study are included in the article/<xref ref-type="supplementary-material" rid="DS1">Supplementary Material</xref>. Scripts used for analysis and figure generation are available at <ext-link ext-link-type="uri" xlink:href="https://github.com/yang-siqian/MASQC">https://github.com/yang-siqian/MASQC</ext-link>.</p>
</sec>
<sec id="S6">
<title>Author Contributions</title>
<p>QD and YC conceived and designed the project. SY implemented the algorithms and provided theoretical analysis of the algorithms. SY and YW analyzed the data. QD, SY, and YW wrote the manuscript.</p>
</sec>
<sec id="conf1">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</body>
<back>
<fn-group>
<fn fn-type="financial-disclosure">
<p><bold>Funding.</bold> This study was supported in part by the National Natural Science Foundation of China (Grant Nos. 61772028, 91953122, and 31701146).</p>
</fn>
</fn-group>
<ack>
<p>We thank all members of the QD laboratory at Zhejiang Sci-Tech University and the YC laboratory at Zhongshan Ophthalmic Center, Sun Yat-sen University, for helpful discussions.</p>
</ack>
<sec id="S9" sec-type="supplementary material"><title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fgene.2020.00269/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fgene.2020.00269/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Data_Sheet_1.PDF" id="DS1" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Table_4.xlsx" id="TS4" mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Table_5.xlsx" id="TS5" mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Table_6.xlsx" id="TS6" mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Table_7.xlsx" id="TS7" mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Blow</surname> <given-names>M. J.</given-names></name> <name><surname>Clark</surname> <given-names>T. A.</given-names></name> <name><surname>Daum</surname> <given-names>C. G.</given-names></name> <name><surname>Deutschbauer</surname> <given-names>A. M.</given-names></name> <name><surname>Fomenkov</surname> <given-names>A.</given-names></name> <name><surname>Fries</surname> <given-names>R.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>The epigenomic landscape of prokaryotes.</article-title> <source><italic>PLoS Genet.</italic></source> <volume>12</volume>:<issue>e1005854</issue>. <pub-id pub-id-type="doi">10.1371/journal.pgen.1005854</pub-id> <pub-id pub-id-type="pmid">26870957</pub-id></citation></ref>
<ref id="B2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Branton</surname> <given-names>D.</given-names></name> <name><surname>Deamer</surname> <given-names>D. W.</given-names></name> <name><surname>Marziali</surname> <given-names>A.</given-names></name> <name><surname>Bayley</surname> <given-names>H.</given-names></name> <name><surname>Benner</surname> <given-names>S. A.</given-names></name> <name><surname>Butler</surname> <given-names>T.</given-names></name><etal/></person-group> (<year>2008</year>). <article-title>The potential and challenges of nanopore sequencing.</article-title> <source><italic>Nat. Biotechnol.</italic></source> <volume>26</volume> <fpage>1146</fpage>&#x2013;<lpage>1153</lpage>. <pub-id pub-id-type="doi">10.1038/nbt.1495</pub-id> <pub-id pub-id-type="pmid">18846088</pub-id></citation></ref>
<ref id="B3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Calicchio</surname> <given-names>R.</given-names></name> <name><surname>Doridot</surname> <given-names>L.</given-names></name> <name><surname>Miralles</surname> <given-names>F.</given-names></name> <name><surname>Mehats</surname> <given-names>C.</given-names></name> <name><surname>Vaiman</surname> <given-names>D.</given-names></name></person-group> (<year>2014</year>). <article-title>DNA methylation, an epigenetic mode of gene expression regulation in reproductive science.</article-title> <source><italic>Curr. Pharm. Des.</italic></source> <volume>20</volume> <fpage>1726</fpage>&#x2013;<lpage>1750</lpage>. <pub-id pub-id-type="doi">10.2174/13816128113199990517</pub-id> <pub-id pub-id-type="pmid">23888966</pub-id></citation></ref>
<ref id="B4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Casadesus</surname> <given-names>J.</given-names></name> <name><surname>Low</surname> <given-names>D.</given-names></name></person-group> (<year>2006</year>). <article-title>Epigenetic gene regulation in the bacterial world.</article-title> <source><italic>Microbiol. Mol. Biol. Rev.</italic></source> <volume>70</volume> <fpage>830</fpage>&#x2013;<lpage>856</lpage>. <pub-id pub-id-type="doi">10.1128/mmbr.00016-06</pub-id> <pub-id pub-id-type="pmid">16959970</pub-id></citation></ref>
<ref id="B5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fang</surname> <given-names>G.</given-names></name> <name><surname>Munera</surname> <given-names>D.</given-names></name> <name><surname>Friedman</surname> <given-names>D. I.</given-names></name> <name><surname>Mandlik</surname> <given-names>A.</given-names></name> <name><surname>Chao</surname> <given-names>M. C.</given-names></name> <name><surname>Banerjee</surname> <given-names>O.</given-names></name><etal/></person-group> (<year>2012</year>). <article-title>Genome-wide mapping of methylated adenine residues in pathogenic <italic>Escherichia coli</italic> using single-molecule real-time sequencing.</article-title> <source><italic>Nat. Biotechnol.</italic></source> <volume>30</volume> <fpage>1232</fpage>&#x2013;<lpage>1239</lpage>. <pub-id pub-id-type="doi">10.1038/nbt.2432</pub-id> <pub-id pub-id-type="pmid">23138224</pub-id></citation></ref>
<ref id="B6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Feng</surname> <given-names>Z.</given-names></name> <name><surname>Fang</surname> <given-names>G.</given-names></name> <name><surname>Korlach</surname> <given-names>J.</given-names></name> <name><surname>Clark</surname> <given-names>T.</given-names></name> <name><surname>Luong</surname> <given-names>K.</given-names></name> <name><surname>Zhang</surname> <given-names>X.</given-names></name><etal/></person-group> (<year>2013</year>). <article-title>Detecting DNA modifications from SMRT sequencing data by modeling sequence context dependence of polymerase kinetic.</article-title> <source><italic>PLoS Comput. Biol.</italic></source> <volume>9</volume>:<issue>e1002935</issue>. <pub-id pub-id-type="doi">10.1371/journal.pcbi.1002935</pub-id> <pub-id pub-id-type="pmid">23516341</pub-id></citation></ref>
<ref id="B7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Flusberg</surname> <given-names>B. A.</given-names></name> <name><surname>Webster</surname> <given-names>D. R.</given-names></name> <name><surname>Lee</surname> <given-names>J. H.</given-names></name> <name><surname>Travers</surname> <given-names>K. J.</given-names></name> <name><surname>Olivares</surname> <given-names>E. C.</given-names></name> <name><surname>Clark</surname> <given-names>T. A.</given-names></name><etal/></person-group> (<year>2010</year>). <article-title>Direct detection of DNA methylation during single-molecule, real-time sequencing.</article-title> <source><italic>Nat. Methods</italic></source> <volume>7</volume> <fpage>461</fpage>&#x2013;<lpage>465</lpage>. <pub-id pub-id-type="doi">10.1038/nmeth.1459</pub-id> <pub-id pub-id-type="pmid">20453866</pub-id></citation></ref>
<ref id="B8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fu</surname> <given-names>Y.</given-names></name> <name><surname>Luo</surname> <given-names>G. Z.</given-names></name> <name><surname>Chen</surname> <given-names>K.</given-names></name> <name><surname>Deng</surname> <given-names>X.</given-names></name> <name><surname>Yu</surname> <given-names>M.</given-names></name> <name><surname>Han</surname> <given-names>D.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>N6-methyldeoxyadenosine marks active transcription start sites in <italic>Chlamydomonas</italic>.</article-title> <source><italic>Cell</italic></source> <volume>161</volume> <fpage>879</fpage>&#x2013;<lpage>892</lpage>. <pub-id pub-id-type="doi">10.1016/j.cell.2015.04.010</pub-id> <pub-id pub-id-type="pmid">25936837</pub-id></citation></ref>
<ref id="B9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Geiman</surname> <given-names>T. M.</given-names></name> <name><surname>Robertson</surname> <given-names>K. D.</given-names></name></person-group> (<year>2002</year>). <article-title>Chromatin remodeling, histone modifications, and DNA methylation-how does it all fit together?</article-title> <source><italic>J. Cell. Biochem.</italic></source> <volume>87</volume> <fpage>117</fpage>&#x2013;<lpage>125</lpage>. <pub-id pub-id-type="doi">10.1002/jcb.10286</pub-id> <pub-id pub-id-type="pmid">12244565</pub-id></citation></ref>
<ref id="B10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Greer</surname> <given-names>E. L.</given-names></name> <name><surname>Blanco</surname> <given-names>M. A.</given-names></name> <name><surname>Gu</surname> <given-names>L.</given-names></name> <name><surname>Sendinc</surname> <given-names>E.</given-names></name> <name><surname>Liu</surname> <given-names>J.</given-names></name> <name><surname>Aristizabal-Corrales</surname> <given-names>D.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>DNA methylation on N6-adenine in <italic>C. elegans</italic>.</article-title> <source><italic>Cell</italic></source> <volume>161</volume> <fpage>868</fpage>&#x2013;<lpage>878</lpage>. <pub-id pub-id-type="doi">10.1016/j.cell.2015.04.005</pub-id> <pub-id pub-id-type="pmid">25936839</pub-id></citation></ref>
<ref id="B11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Koren</surname> <given-names>S.</given-names></name> <name><surname>Phillippy</surname> <given-names>A. M.</given-names></name></person-group> (<year>2015</year>). <article-title>One chromosome, one contig: complete microbial genomes from long-read sequencing and assembly.</article-title> <source><italic>Curr. Opin. Microbiol.</italic></source> <volume>23</volume> <fpage>110</fpage>&#x2013;<lpage>120</lpage>. <pub-id pub-id-type="doi">10.1016/j.mib.2014.11.014</pub-id> <pub-id pub-id-type="pmid">25461581</pub-id></citation></ref>
<ref id="B12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>H.</given-names></name> <name><surname>Durbin</surname> <given-names>R.</given-names></name></person-group> (<year>2009</year>). <article-title>Fast and accurate short read alignment with Burrows-Wheeler transform.</article-title> <source><italic>Bioinformatics</italic></source> <volume>25</volume> <fpage>1754</fpage>&#x2013;<lpage>1760</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btp324</pub-id> <pub-id pub-id-type="pmid">19451168</pub-id></citation></ref>
<ref id="B13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liang</surname> <given-names>Z.</given-names></name> <name><surname>Shen</surname> <given-names>L.</given-names></name> <name><surname>Cui</surname> <given-names>X.</given-names></name> <name><surname>Bao</surname> <given-names>S.</given-names></name> <name><surname>Geng</surname> <given-names>Y.</given-names></name> <name><surname>Yu</surname> <given-names>G.</given-names></name><etal/></person-group> (<year>2018</year>). <article-title>DNA N(6)-Adenine methylation in <italic>Arabidopsis thaliana</italic>.</article-title> <source><italic>Dev. Cell</italic></source> <volume>45</volume> <fpage>406.e3</fpage>&#x2013;<lpage>416.e3</lpage>. <pub-id pub-id-type="doi">10.1016/j.devcel.2018.03.012</pub-id> <pub-id pub-id-type="pmid">29656930</pub-id></citation></ref>
<ref id="B14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>J.</given-names></name> <name><surname>Zhu</surname> <given-names>Y.</given-names></name> <name><surname>Luo</surname> <given-names>G. Z.</given-names></name> <name><surname>Wang</surname> <given-names>X.</given-names></name> <name><surname>Yue</surname> <given-names>Y.</given-names></name> <name><surname>Wang</surname> <given-names>X.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>Abundant DNA 6mA methylation during early embryogenesis of zebrafish and pig.</article-title> <source><italic>Nat. Commun.</italic></source> <volume>7</volume>:<issue>13052</issue>. <pub-id pub-id-type="doi">10.1038/ncomms13052</pub-id> <pub-id pub-id-type="pmid">27713410</pub-id></citation></ref>
<ref id="B15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Luo</surname> <given-names>G. Z.</given-names></name> <name><surname>Wang</surname> <given-names>F.</given-names></name> <name><surname>Weng</surname> <given-names>X.</given-names></name> <name><surname>Chen</surname> <given-names>K.</given-names></name> <name><surname>Hao</surname> <given-names>Z.</given-names></name> <name><surname>Yu</surname> <given-names>M.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>Characterization of eukaryotic DNA N(6)-methyladenine by a highly sensitive restriction enzyme-assisted sequencing.</article-title> <source><italic>Nat. Commun.</italic></source> <volume>7</volume>:<issue>11301</issue>. <pub-id pub-id-type="doi">10.1038/ncomms11301</pub-id> <pub-id pub-id-type="pmid">27079427</pub-id></citation></ref>
<ref id="B16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>McIntyre</surname> <given-names>A. B. R.</given-names></name> <name><surname>Alexander</surname> <given-names>N.</given-names></name> <name><surname>Grigorev</surname> <given-names>K.</given-names></name> <name><surname>Bezdan</surname> <given-names>D.</given-names></name> <name><surname>Sichtig</surname> <given-names>H.</given-names></name> <name><surname>Chiu</surname> <given-names>C. Y.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>Single-molecule sequencing detection of N6-methyladenine in microbial reference materials.</article-title> <source><italic>Nat. Commun.</italic></source> <volume>10</volume>:<issue>579</issue>. <pub-id pub-id-type="doi">10.1038/s41467-019-08289-9</pub-id> <pub-id pub-id-type="pmid">30718479</pub-id></citation></ref>
<ref id="B17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Michael</surname> <given-names>T. P.</given-names></name> <name><surname>Jupe</surname> <given-names>F.</given-names></name> <name><surname>Bemm</surname> <given-names>F.</given-names></name> <name><surname>Motley</surname> <given-names>S. T.</given-names></name> <name><surname>Sandoval</surname> <given-names>J. P.</given-names></name> <name><surname>Lanz</surname> <given-names>C.</given-names></name><etal/></person-group> (<year>2018</year>). <article-title>High contiguity <italic>Arabidopsis thaliana</italic> genome assembly with a single nanopore flow cell.</article-title> <source><italic>Nat. Commun.</italic></source> <volume>9</volume>:<issue>541</issue>. <pub-id pub-id-type="doi">10.1038/s41467-018-03016-2</pub-id> <pub-id pub-id-type="pmid">29416032</pub-id></citation></ref>
<ref id="B18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mondo</surname> <given-names>S. J.</given-names></name> <name><surname>Dannebaum</surname> <given-names>R. O.</given-names></name> <name><surname>Kuo</surname> <given-names>R. C.</given-names></name> <name><surname>Louie</surname> <given-names>K. B.</given-names></name> <name><surname>Bewick</surname> <given-names>A. J.</given-names></name> <name><surname>LaButti</surname> <given-names>K.</given-names></name><etal/></person-group> (<year>2017</year>). <article-title>Widespread adenine N6-methylation of active genes in fungi.</article-title> <source><italic>Nat. Genet.</italic></source> <volume>49</volume> <fpage>964</fpage>&#x2013;<lpage>968</lpage>. <pub-id pub-id-type="doi">10.1038/ng.3859</pub-id> <pub-id pub-id-type="pmid">28481340</pub-id></citation></ref>
<ref id="B19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rand</surname> <given-names>A. C.</given-names></name> <name><surname>Jain</surname> <given-names>M.</given-names></name> <name><surname>Eizenga</surname> <given-names>J. M.</given-names></name> <name><surname>Musselman-Brown</surname> <given-names>A.</given-names></name> <name><surname>Olsen</surname> <given-names>H. E.</given-names></name> <name><surname>Akeson</surname> <given-names>M.</given-names></name><etal/></person-group> (<year>2017</year>). <article-title>Mapping DNA methylation with high-throughput nanopore sequencing.</article-title> <source><italic>Nat. Methods</italic></source> <volume>14</volume> <fpage>411</fpage>&#x2013;<lpage>413</lpage>. <pub-id pub-id-type="doi">10.1038/nmeth.4189</pub-id> <pub-id pub-id-type="pmid">28218897</pub-id></citation></ref>
<ref id="B20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ratel</surname> <given-names>D.</given-names></name> <name><surname>Ravanat</surname> <given-names>J. L.</given-names></name> <name><surname>Berger</surname> <given-names>F.</given-names></name> <name><surname>Wion</surname> <given-names>D.</given-names></name></person-group> (<year>2006</year>). <article-title>N6-methyladenine: the other methylated base of DNA.</article-title> <source><italic>Bioessays</italic></source> <volume>28</volume> <fpage>309</fpage>&#x2013;<lpage>315</lpage>. <pub-id pub-id-type="doi">10.1002/bies.20342</pub-id> <pub-id pub-id-type="pmid">16479578</pub-id></citation></ref>
<ref id="B21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sanchez-Romero</surname> <given-names>M. A.</given-names></name> <name><surname>Cota</surname> <given-names>I.</given-names></name> <name><surname>Casadesus</surname> <given-names>J.</given-names></name></person-group> (<year>2015</year>). <article-title>DNA methylation in bacteria: from the methyl group to the methylome.</article-title> <source><italic>Curr. Opin. Microbiol.</italic></source> <volume>25</volume> <fpage>9</fpage>&#x2013;<lpage>16</lpage>. <pub-id pub-id-type="doi">10.1016/j.mib.2015.03.004</pub-id> <pub-id pub-id-type="pmid">25818841</pub-id></citation></ref>
<ref id="B22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shanmuganathan</surname> <given-names>R.</given-names></name> <name><surname>Basheer</surname> <given-names>N. B.</given-names></name> <name><surname>Amirthalingam</surname> <given-names>L.</given-names></name> <name><surname>Muthukumar</surname> <given-names>H.</given-names></name> <name><surname>Kaliaperumal</surname> <given-names>R.</given-names></name> <name><surname>Shanmugam</surname> <given-names>K.</given-names></name></person-group> (<year>2013</year>). <article-title>Conventional and nanotechniques for DNA methylation profiling.</article-title> <source><italic>J. Mol. Diagn.</italic></source> <volume>15</volume> <fpage>17</fpage>&#x2013;<lpage>26</lpage>. <pub-id pub-id-type="doi">10.1016/j.jmoldx.2012.06.007</pub-id> <pub-id pub-id-type="pmid">23127612</pub-id></citation></ref>
<ref id="B23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Svadbina</surname> <given-names>I. V.</given-names></name> <name><surname>Zelinskaya</surname> <given-names>N. V.</given-names></name> <name><surname>Kovalevskaya</surname> <given-names>N. P.</given-names></name> <name><surname>Zheleznaya</surname> <given-names>L. A.</given-names></name> <name><surname>Matvienko</surname> <given-names>N. I.</given-names></name></person-group> (<year>2004</year>). <article-title>Isolation and characterization of site-specific DNA-methyltransferases from <italic>Bacillus coagulans</italic> K.</article-title> <source><italic>Biochemistry (Mosc)</italic></source> <volume>69</volume> <fpage>299</fpage>&#x2013;<lpage>305</lpage>. <pub-id pub-id-type="doi">10.1023/b:biry.0000022061.29918.8b</pub-id> <pub-id pub-id-type="pmid">15061697</pub-id></citation></ref>
<ref id="B24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>VanBuren</surname> <given-names>R.</given-names></name> <name><surname>Bryant</surname> <given-names>D.</given-names></name> <name><surname>Edger</surname> <given-names>P. P.</given-names></name> <name><surname>Tang</surname> <given-names>H.</given-names></name> <name><surname>Burgess</surname> <given-names>D.</given-names></name> <name><surname>Challabathula</surname> <given-names>D.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>Single-molecule sequencing of the desiccation-tolerant grass <italic>Oropetium thomaeum</italic>.</article-title> <source><italic>Nature</italic></source> <volume>527</volume> <fpage>508</fpage>&#x2013;<lpage>511</lpage>. <pub-id pub-id-type="doi">10.1038/nature15714</pub-id> <pub-id pub-id-type="pmid">26560029</pub-id></citation></ref>
<ref id="B25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>T. P.</given-names></name> <name><surname>Wang</surname> <given-names>T.</given-names></name> <name><surname>Seetin</surname> <given-names>M. G.</given-names></name> <name><surname>Lai</surname> <given-names>Y.</given-names></name> <name><surname>Zhu</surname> <given-names>S.</given-names></name> <name><surname>Lin</surname> <given-names>K.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>DNA methylation on N(6)-adenine in mammalian embryonic stem cells.</article-title> <source><italic>Nature</italic></source> <volume>532</volume> <fpage>329</fpage>&#x2013;<lpage>333</lpage>. <pub-id pub-id-type="doi">10.1038/nature17640</pub-id> <pub-id pub-id-type="pmid">27027282</pub-id></citation></ref>
<ref id="B26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xiao</surname> <given-names>C. L.</given-names></name> <name><surname>Zhu</surname> <given-names>S.</given-names></name> <name><surname>He</surname> <given-names>M.</given-names></name> <name><surname>Chen</surname> <given-names>D.</given-names></name> <name><surname>Zhang</surname> <given-names>Q.</given-names></name> <name><surname>Chen</surname> <given-names>Y.</given-names></name><etal/></person-group> (<year>2018</year>). <article-title>N(6)-methyladenine DNA modification in the human genome.</article-title> <source><italic>Mol. Cell</italic></source> <volume>71</volume> <fpage>306.e7</fpage>&#x2013;<lpage>318.e7</lpage>. <pub-id pub-id-type="doi">10.1016/j.molcel.2018.06.015</pub-id> <pub-id pub-id-type="pmid">30017583</pub-id></citation></ref>
<ref id="B27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yu</surname> <given-names>M.</given-names></name> <name><surname>Ji</surname> <given-names>L.</given-names></name> <name><surname>Neumann</surname> <given-names>D. A.</given-names></name> <name><surname>Chung</surname> <given-names>D. H.</given-names></name> <name><surname>Groom</surname> <given-names>J.</given-names></name> <name><surname>Westpheling</surname> <given-names>J.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>Base-resolution detection of N4-methylcytosine in genomic DNA using 4mC-Tet-assisted-bisulfite-sequencing.</article-title> <source><italic>Nucleic Acids Res.</italic></source> <volume>43</volume>:<issue>e148</issue>. <pub-id pub-id-type="doi">10.1093/nar/gkv738</pub-id> <pub-id pub-id-type="pmid">26184871</pub-id></citation></ref>
<ref id="B28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>G.</given-names></name> <name><surname>Huang</surname> <given-names>H.</given-names></name> <name><surname>Liu</surname> <given-names>D.</given-names></name> <name><surname>Cheng</surname> <given-names>Y.</given-names></name> <name><surname>Liu</surname> <given-names>X.</given-names></name> <name><surname>Zhang</surname> <given-names>W.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>N6-methyladenine DNA modification in <italic>Drosophila</italic>.</article-title> <source><italic>Cell</italic></source> <volume>161</volume> <fpage>893</fpage>&#x2013;<lpage>906</lpage>. <pub-id pub-id-type="doi">10.1016/j.cell.2015.04.018</pub-id> <pub-id pub-id-type="pmid">25936838</pub-id></citation></ref>
<ref id="B29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>Y.</given-names></name> <name><surname>Liu</surname> <given-names>T.</given-names></name> <name><surname>Meyer</surname> <given-names>C. A.</given-names></name> <name><surname>Eeckhoute</surname> <given-names>J.</given-names></name> <name><surname>Johnson</surname> <given-names>D. S.</given-names></name> <name><surname>Bernstein</surname> <given-names>B. E.</given-names></name><etal/></person-group> (<year>2008</year>). <article-title>Model-based analysis of ChIP-Seq (MACS).</article-title> <source><italic>Genome Biol.</italic></source> <volume>9</volume>:<issue>R137</issue>. <pub-id pub-id-type="doi">10.1186/gb-2008-9-9-r137</pub-id> <pub-id pub-id-type="pmid">18798982</pub-id></citation></ref>
<ref id="B30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhao</surname> <given-names>M. T.</given-names></name> <name><surname>Whyte</surname> <given-names>J. J.</given-names></name> <name><surname>Hopkins</surname> <given-names>G. M.</given-names></name> <name><surname>Kirk</surname> <given-names>M. D.</given-names></name> <name><surname>Prather</surname> <given-names>R. S.</given-names></name></person-group> (<year>2014</year>). <article-title>Methylated DNA immunoprecipitation and high-throughput sequencing (MeDIP-seq) using low amounts of genomic DNA.</article-title> <source><italic>Cell. Reprogram.</italic></source> <volume>16</volume> <fpage>175</fpage>&#x2013;<lpage>184</lpage>. <pub-id pub-id-type="doi">10.1089/cell.2014.0002</pub-id> <pub-id pub-id-type="pmid">24773292</pub-id></citation></ref>
<ref id="B31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhu</surname> <given-names>L.</given-names></name> <name><surname>Zhong</surname> <given-names>J.</given-names></name> <name><surname>Jia</surname> <given-names>X.</given-names></name> <name><surname>Liu</surname> <given-names>G.</given-names></name> <name><surname>Kang</surname> <given-names>Y.</given-names></name> <name><surname>Dong</surname> <given-names>M.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>Precision methylome characterization of <italic>Mycobacterium tuberculosis</italic> complex (MTBC) using PacBio single-molecule real-time (SMRT) technology.</article-title> <source><italic>Nucleic Acids Res.</italic></source> <volume>44</volume> <fpage>730</fpage>&#x2013;<lpage>743</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkv1498</pub-id> <pub-id pub-id-type="pmid">26704977</pub-id></citation></ref>
<ref id="B32"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhu</surname> <given-names>S.</given-names></name> <name><surname>Beaulaurier</surname> <given-names>J.</given-names></name> <name><surname>Deikus</surname> <given-names>G.</given-names></name> <name><surname>Wu</surname> <given-names>T. P.</given-names></name> <name><surname>Strahl</surname> <given-names>M.</given-names></name> <name><surname>Hao</surname> <given-names>Z.</given-names></name><etal/></person-group> (<year>2018</year>). <article-title>Mapping and characterizing N6-methyladenine in eukaryotic genomes using single-molecule real-time sequencing.</article-title> <source><italic>Genome Res.</italic></source> <volume>28</volume> <fpage>1067</fpage>&#x2013;<lpage>1078</lpage>. <pub-id pub-id-type="doi">10.1101/gr.231068.117</pub-id> <pub-id pub-id-type="pmid">29764913</pub-id></citation></ref>
</ref-list>
<fn-group>
<fn id="footnote1">
<label>1</label>
<p><ext-link ext-link-type="uri" xlink:href="https://www.pacb.com/products-and-services/analyticalsoftware/smrt-analysis/">https://www.pacb.com/products-and-services/analyticalsoftware/smrt-analysis/</ext-link></p></fn>
</fn-group>
</back>
</article>