<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="methods-article" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">Gates Open Res</journal-id>
            <journal-title-group>
                <journal-title>Gates Open Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2572-4754</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/gatesopenres.12891.1</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Method Article</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>Automated verbal autopsy classification: using one-against-all ensemble method and Na&#x00ef;ve Bayes classifier</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 1; peer review: 1 approved, 1 approved with reservations]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Murtaza</surname>
                        <given-names>Syed Shariyar</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-3330-4783</uri>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Kolpak</surname>
                        <given-names>Patrycja</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-0867-319X</uri>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Bener</surname>
                        <given-names>Ayse</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Jha</surname>
                        <given-names>Prabhat</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Funding Acquisition</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a2">2</xref>
                    <xref ref-type="aff" rid="a3">3</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Data Science Lab, Ryerson University, Toronto, Ontario, M5B 2K3, Canada</aff>
                <aff id="a2">
                    <label>2</label>Centre for Global Health Research, St. Michael's Hospital, Toronto, Ontario, Canada</aff>
                <aff id="a3">
                    <label>3</label>Dalla Lana School of Public Health, University of Toronto, Toronto, Canada</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:syed.shariyar@ryerson.ca">syed.shariyar@ryerson.ca</email>
                </corresp>
                <fn fn-type="conflict">
                    <p>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>28</day>
                <month>11</month>
                <year>2018</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2018</year>
            </pub-date>
            <volume>2</volume>
            <elocation-id>63</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>19</day>
                    <month>11</month>
                    <year>2018</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2018 Murtaza SS et al.</copyright-statement>
                <copyright-year>2018</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://gatesopenresearch.org/articles/2-63/pdf"/>
            <abstract>
                <p>Verbal autopsy (VA) uses post-mortem surveys about deaths, mostly in low- and middle-income countries where the majority of deaths occur at home rather than in a hospital, to retrospectively assign causes of death (COD) and subsequently support evidence-based health system strengthening. Automated algorithms for VA COD assignment have been developed and their performance has been assessed against physician and clinical diagnoses. Since the performance of automated classification methods remains low, we aimed to enhance the Na&#x00ef;ve Bayes Classifier (NBC) algorithm to produce better ranked COD classifications on 26,766 deaths from four globally diverse VA datasets compared to some of the leading VA classification methods, namely Tariff, InterVA-4, InSilicoVA and NBC. We used a different strategy, training multiple NBC algorithms using the one-against-all approach (OAA-NBC). To compare performance, we computed the cumulative cause-specific mortality fraction (CSMF) accuracies for population-level agreement from rank one to five COD classifications. To assess individual-level COD assignments, cumulative partially-chance corrected concordance (PCCC) and sensitivity were measured for up to five ranked classifications. Overall results show that, based on the cumulative CSMF accuracy, PCCC and sensitivity scores, OAA-NBC consistently assigns CODs that are the most similar to physician and clinical COD assignments compared to some of the leading algorithms. The results demonstrate that our approach improves the performance of classification (sensitivity) by 6% to 8% when compared against current leading VA classifiers. Population-level agreements for OAA-NBC and NBC were found to be similar to or higher than those of the other algorithms used in the experiments.
Although OAA-NBC still requires improvement for individual-level COD assignment, the one-against-all approach improved its ability to assign CODs that more closely resemble physician or clinical COD classifications compared to some of the other leading VA classifiers.</p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>COD classification</kwd>
                <kwd>VA algorithms</kwd>
                <kwd>CSMF Accuracy</kwd>
                <kwd>sensitivity</kwd>
                <kwd>performance assessment</kwd>
            </kwd-group>
            <funding-group>
                <award-group id="fund-1" xlink:href="http://dx.doi.org/10.13039/100000865">
                    <funding-source>Gates Foundation</funding-source>
                    <award-id>OPP51447</award-id>
                </award-group>
                <funding-statement>This work was supported by the Gates Foundation (OPP51447). </funding-statement>
                <funding-statement>
                    <italic>The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</italic>
                </funding-statement>
            </funding-group>
        </article-meta>
    </front>
    <body>
        <sec sec-type="intro">
            <title>Introduction</title>
            <p>Verbal autopsy (VA) is increasingly being used in developing countries where most deaths occur at home rather than in hospitals, and causes of death (COD) information remains unknown
                <sup>
                    <xref ref-type="bibr" rid="ref-1">1</xref>
                </sup>. This gap in information hinders the evidence-based healthcare programming and policy reform needed to reduce the global burden of disease
                <sup>
                    <xref ref-type="bibr" rid="ref-2">2</xref>
                </sup>. VA consists of a structured questionnaire to gather information on symptoms and risk factors leading up to death from family members of the deceased. Each completed survey is then typically reviewed independently by two physicians, and a COD diagnosis is assigned using World Health Organization (WHO) International Classification of Diseases (ICD) codes
                <sup>
                    <xref ref-type="bibr" rid="ref-3">3</xref>
                </sup>. If there is disagreement in diagnosis, then the VA undergoes further review by a senior physician
                <sup>
                    <xref ref-type="bibr" rid="ref-4">4</xref>,
                    <xref ref-type="bibr" rid="ref-5">5</xref>
                </sup>.</p>
            <p>In recent years, efforts have been made to automate VA COD diagnosis using various computational algorithms in an attempt to further standardize VA COD diagnosis and reduce physician time and costs
                <sup>
                    <xref ref-type="bibr" rid="ref-6">6</xref>&#x2013;
                    <xref ref-type="bibr" rid="ref-13">13</xref>
                </sup>. The current leading computational VA techniques include InterVA-4
                <sup>
                    <xref ref-type="bibr" rid="ref-7">7</xref>
                </sup>, Tariff
                <sup>
                    <xref ref-type="bibr" rid="ref-6">6</xref>
                </sup>, InSilicoVA
                <sup>
                    <xref ref-type="bibr" rid="ref-8">8</xref>
                </sup>, King-Lu
                <sup>
                    <xref ref-type="bibr" rid="ref-10">10</xref>
                </sup>, and Na&#x00ef;ve Bayes Classifier (NBC)
                <sup>
                    <xref ref-type="bibr" rid="ref-11">11</xref>
                </sup>. InterVA-4 employs medical-expert-defined static weights for symptoms and risk factors given a particular COD, and subsequently calculates the sum of these weights to determine the most likely COD
                <sup>
                    <xref ref-type="bibr" rid="ref-7">7</xref>
                </sup>. In contrast, Tariff is pre-trained on the Population Health Metrics Research Consortium (PHMRC) VA data to compute tariffs, which express the strength of association between symptoms and CODs; these tariffs are summed and ranked to determine a COD. The same procedure is applied to the test dataset, and the resulting summed and ranked tariff scores are compared against the pre-trained COD rankings
                <sup>
                    <xref ref-type="bibr" rid="ref-14">14</xref>
                </sup>. InSilicoVA assigns CODs by employing a hierarchical Bayesian framework with a na&#x00ef;ve Bayes calculation component; it also computes the uncertainty for individual CODs and population-level COD distributions
                <sup>
                    <xref ref-type="bibr" rid="ref-8">8</xref>
                </sup>. The King-Lu method measures the distribution of the COD and symptoms in the VA training dataset and uses these to predict CODs in the VA test dataset
                <sup>
                    <xref ref-type="bibr" rid="ref-10">10</xref>
                </sup>. Lastly, NBC predicts the COD by computing the conditional probabilities of observing a symptom for a given COD from the VA training dataset, and then applying Bayes' rule to these probabilities
                <sup>
                    <xref ref-type="bibr" rid="ref-11">11</xref>
                </sup>. These existing automated classification algorithms, however, achieve low predictive accuracy when compared against physician VA or hospital-based COD diagnoses
                <sup>
                    <xref ref-type="bibr" rid="ref-8">8</xref>,
                    <xref ref-type="bibr" rid="ref-11">11</xref>,
                    <xref ref-type="bibr" rid="ref-15">15</xref>,
                    <xref ref-type="bibr" rid="ref-16">16</xref>
                </sup>. Therefore, there is a need to improve automated classification techniques to enable wider and more reliable use in the field.</p>
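            <p>As an illustration of the NBC calculation described above, the following is a minimal Python sketch and not the implementation used by any of the cited tools. It assumes binary symptom indicators, add-one (Laplace) smoothing and a toy dataset; the function names <monospace>train_nbc</monospace> and <monospace>rank_causes</monospace>, the smoothing parameter <monospace>alpha</monospace> and the example records are hypothetical.</p>
            <code language="python" xml:space="preserve"><![CDATA[
```python
import math
from collections import defaultdict


def train_nbc(records, labels, alpha=1.0):
    """Estimate log-priors and smoothed P(symptom present | cause).

    records: list of binary symptom vectors (1 = symptom reported).
    labels:  list of cause-of-death labels, one per record.
    alpha:   add-one smoothing constant (an assumption of this sketch).
    """
    n_symptoms = len(records[0])
    deaths = defaultdict(int)                        # deaths per cause
    present = defaultdict(lambda: [0] * n_symptoms)  # symptom counts per cause
    for x, y in zip(records, labels):
        deaths[y] += 1
        for j, v in enumerate(x):
            present[y][j] += v
    model = {}
    for cause, n in deaths.items():
        log_prior = math.log(n / len(records))
        probs = [(present[cause][j] + alpha) / (n + 2 * alpha)
                 for j in range(n_symptoms)]
        model[cause] = (log_prior, probs)
    return model


def rank_causes(model, x):
    """Rank causes by log P(cause) + sum of log P(symptom value | cause)."""
    scores = {}
    for cause, (log_prior, probs) in model.items():
        score = log_prior
        for v, p in zip(x, probs):
            score += math.log(p if v else 1.0 - p)
        scores[cause] = score
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: three symptoms, two causes (illustrative only).
records = [[1, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 1]]
labels = ["respiratory", "respiratory", "diarrhoeal", "diarrhoeal"]
model = train_nbc(records, labels)
ranking = rank_causes(model, [1, 0, 0])
```
]]></code>
            <p>Summing log-probabilities instead of multiplying probabilities avoids numerical underflow when a questionnaire contains many symptoms, and returning a full ranking makes it possible to evaluate more than the top-ranked COD.</p>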
            <p>The aim of this research is to develop a classification method for predicting CODs using responses from structured questions in a VA survey. We used a different strategy by training multiple NBC algorithms
                <sup>
                    <xref ref-type="bibr" rid="ref-17">17</xref>
                </sup> using the one-against-all approach (OAA-NBC)
                <sup>
                    <xref ref-type="bibr" rid="ref-18">18</xref>,
                    <xref ref-type="bibr" rid="ref-19">19</xref>
                </sup> to generate ranked assignments of CODs for 26,766 deaths from four globally diverse VA datasets (one VA dataset was divided into four datasets; a total of seven datasets were used for analysis). We also compare our technique against the current leading algorithms Tariff
                <sup>
                    <xref ref-type="bibr" rid="ref-6">6</xref>
                </sup>, InterVA-4
                <sup>
                    <xref ref-type="bibr" rid="ref-7">7</xref>
                </sup>, NBC
                <sup>
                    <xref ref-type="bibr" rid="ref-11">11</xref>
                </sup> and InSilicoVA
                <sup>
                    <xref ref-type="bibr" rid="ref-8">8</xref>
                </sup> on the same deaths used for OAA-NBC.</p>
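            <p>The general idea of combining a one-against-all scheme with na&#x00ef;ve Bayes can be sketched as follows. This is an illustrative outline under stated assumptions and not the OAA-NBC implementation evaluated in this study: it assumes binary symptom indicators, add-one smoothing, and ranking of causes by the posterior probability of the target class in each binary model. The function names, the parameters <monospace>alpha</monospace> and <monospace>top_k</monospace>, and the toy records are hypothetical.</p>
            <code language="python" xml:space="preserve"><![CDATA[
```python
import math


def fit_binary_nb(records, is_target, alpha=1.0):
    """Fit a two-class (target cause vs. all other causes) naive Bayes model."""
    n_symptoms = len(records[0])
    model = {}
    for cls in (True, False):
        rows = [x for x, t in zip(records, is_target) if t == cls]
        # Smoothed class prior and smoothed P(symptom present | class).
        prior = (len(rows) + alpha) / (len(records) + 2 * alpha)
        probs = [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
                 for j in range(n_symptoms)]
        model[cls] = (prior, probs)
    return model


def posterior_target(model, x):
    """Return P(target cause | x) under one binary model."""
    log_joint = {}
    for cls, (prior, probs) in model.items():
        log_joint[cls] = math.log(prior) + sum(
            math.log(p if v else 1.0 - p) for v, p in zip(x, probs))
    top = max(log_joint.values())  # subtract the maximum for stability
    norm = sum(math.exp(v - top) for v in log_joint.values())
    return math.exp(log_joint[True] - top) / norm


def fit_oaa(records, labels):
    """Train one binary model per cause (that cause against all others)."""
    return {c: fit_binary_nb(records, [y == c for y in labels])
            for c in sorted(set(labels))}


def rank_oaa(models, x, top_k=5):
    """Rank causes by the target-class posterior of their binary model."""
    scores = {c: posterior_target(m, x) for c, m in models.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy data: three symptoms, three causes (illustrative only).
records = [[1, 0, 0], [1, 1, 0], [0, 0, 1], [0, 1, 1], [0, 1, 0], [0, 1, 0]]
labels = ["respiratory", "respiratory", "diarrhoeal", "diarrhoeal",
          "injury", "injury"]
models = fit_oaa(records, labels)
ranking = rank_oaa(models, [1, 0, 0])
```
]]></code>
            <p>Because each binary model produces its own score, the causes can be sorted to give the ranked list of up to five CODs that cumulative measures require.</p>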
        </sec>
        <sec sec-type="methods">
            <title>Methods</title>
            <sec>
                <title>Datasets</title>
                <p>In order to test the performance of the algorithms, we use four main datasets, containing information on a total of 26,766 deaths: three physician COD diagnosed VA datasets, namely the Indian Million Death Study (MDS)
                    <sup>
                        <xref ref-type="bibr" rid="ref-20">20</xref>
                    </sup>, South African Agincourt Demographic and Health Survey (DHS) dataset
                    <sup>
                        <xref ref-type="bibr" rid="ref-21">21</xref>
                    </sup>, and Bangladeshi Matlab DHS dataset
                    <sup>
                        <xref ref-type="bibr" rid="ref-22">22</xref>
                    </sup>, and one health facility diagnosed COD dataset, namely the PHMRC VA data collected from six sites in four countries (India, Mexico, the Philippines and Tanzania)
                    <sup>
                        <xref ref-type="bibr" rid="ref-23">23</xref>,
                        <xref ref-type="bibr" rid="ref-24">24</xref>
                    </sup>. We use four combinations of the PHMRC data by age group (adult and child) and by site (all versus India-only); this filtering was done to determine the effect on results when deaths are collected from the same geographical setting. A total of seven datasets were used and are summarized in 
                    <xref ref-type="table" rid="T1">Table 1</xref>. These datasets are publicly available, except for the MDS, and have been used in other studies
                    <sup>
                        <xref ref-type="bibr" rid="ref-11">11</xref>,
                        <xref ref-type="bibr" rid="ref-15">15</xref>,
                        <xref ref-type="bibr" rid="ref-24">24</xref>
                    </sup>.</p>
                <table-wrap id="T1" orientation="portrait" position="anchor">
                    <label>Table 1. </label>
                    <caption>
                        <title>Verbal autopsy (VA) datasets used in the study.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th colspan="1" rowspan="1"/>
                                <th align="left" colspan="1" rowspan="1" valign="top">MDS</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Agincourt</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Matlab</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">PHMRC-
                                    <break/>Adult
                                    <break/>(All Sites)</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">PHMRC-
                                    <break/>Child
                                    <break/>(All Sites)</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">PHMRC-
                                    <break/>Adult
                                    <break/>(India)</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">PHMRC-
                                    <break/>Child
                                    <break/>(India)</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td colspan="1" rowspan="1">Region</td>
                                <td colspan="1" rowspan="1">India</td>
                                <td colspan="1" rowspan="1">South Africa</td>
                                <td colspan="1" rowspan="1">Bangladesh</td>
                                <td colspan="1" rowspan="1">Multiple
                                    <sup>
                                        <xref ref-type="other" rid="FN1">1</xref>
                                    </sup>
</td>
                                <td colspan="1" rowspan="1">Multiple</td>
                                <td colspan="1" rowspan="1">Andhra
                                    <break/>Pradesh
                                    <break/>and Uttar
                                    <break/>Pradesh</td>
                                <td colspan="1" rowspan="1">Andhra
                                    <break/>Pradesh
                                    <break/>and Uttar
                                    <break/>Pradesh</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1"># of deaths</td>
                                <td colspan="1" rowspan="1">12,225</td>
                                <td colspan="1" rowspan="1">5,823</td>
                                <td colspan="1" rowspan="1">2,000</td>
                                <td colspan="1" rowspan="1">4,654</td>
                                <td colspan="1" rowspan="1">2,064</td>
                                <td colspan="1" rowspan="1">1,233</td>
                                <td colspan="1" rowspan="1">948</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Ages</td>
                                <td colspan="1" rowspan="1">1&#x2013;59 months</td>
                                <td colspan="1" rowspan="1">15&#x2013;64 years</td>
                                <td colspan="1" rowspan="1">20&#x2013;64 years</td>
                                <td colspan="1" rowspan="1">12&#x2013;69 years</td>
                                <td colspan="1" rowspan="1">28 days&#x2013;
                                    <break/>11 years</td>
                                <td colspan="1" rowspan="1">12&#x2013;69 years</td>
                                <td colspan="1" rowspan="1">28 days&#x2013;
                                    <break/>11 years</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1"># of grouped
                                    <break/>CODs</td>
                                <td colspan="1" rowspan="1">15</td>
                                <td colspan="1" rowspan="1">16</td>
                                <td colspan="1" rowspan="1">15</td>
                                <td colspan="1" rowspan="1">13</td>
                                <td colspan="1" rowspan="1">9</td>
                                <td colspan="1" rowspan="1">13</td>
                                <td colspan="1" rowspan="1">9</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1"># of
                                    <break/>Symptoms</td>
                                <td colspan="1" rowspan="1">90</td>
                                <td colspan="1" rowspan="1">88</td>
                                <td colspan="1" rowspan="1">214</td>
                                <td colspan="1" rowspan="1">224</td>
                                <td colspan="1" rowspan="1">133</td>
                                <td colspan="1" rowspan="1">224</td>
                                <td colspan="1" rowspan="1">133</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Physician
                                    <break/>Classification</td>
                                <td colspan="1" rowspan="1">Dual
                                    <break/>physician
                                    <break/>agreement</td>
                                <td colspan="1" rowspan="1">Dual
                                    <break/>physician
                                    <break/>agreement</td>
                                <td colspan="1" rowspan="1">Two level
                                    <break/>physician
                                    <break/>classification</td>
                                <td colspan="1" rowspan="1">Hospital
                                    <break/>certified
                                    <break/>cause of
                                    <break/>death,
                                    <break/>including
                                    <break/>clinical and
                                    <break/>diagnostic
                                    <break/>tests</td>
                                <td colspan="1" rowspan="1">Hospital
                                    <break/>certified
                                    <break/>cause of
                                    <break/>death,
                                    <break/>including
                                    <break/>clinical
                                    <break/>and
                                    <break/>diagnostic
                                    <break/>tests</td>
                                <td colspan="1" rowspan="1">Hospital
                                    <break/>certified
                                    <break/>cause of
                                    <break/>death,
                                    <break/>including
                                    <break/>clinical and
                                    <break/>diagnostic
                                    <break/>tests</td>
                                <td colspan="1" rowspan="1">Hospital
                                    <break/>certified
                                    <break/>cause of
                                    <break/>death,
                                    <break/>including
                                    <break/>clinical and
                                    <break/>diagnostic
                                    <break/>tests</td>
                            </tr>
                        </tbody>
                    </table>
                    <table-wrap-foot>
                        <fn id="FN1">
                            <p>

                                <sup>1</sup>Six sites in total: Andhra Pradesh and Uttar Pradesh (India), Distrito Federal (Mexico), Bohol (Philippines) and Dar es Salaam and Pemba (Tanzania); applicable to both adult and child age group specific datasets.</p>
                        </fn>
                    </table-wrap-foot>
                </table-wrap>
                <p>The MDS VA dataset used in this study contains information on 12,225 child deaths at ages 1 to 59 months. For each death, two trained physicians independently and anonymously assigned a WHO ICD version 10 code
                    <sup>
                        <xref ref-type="bibr" rid="ref-25">25</xref>
                    </sup>. In the cases where the two physicians did not initially agree or reconcile on a COD, a third senior physician adjudicated
                    <sup>
                        <xref ref-type="bibr" rid="ref-20">20</xref>
                    </sup>. Similarly, the Agincourt dataset
                    <sup>
                        <xref ref-type="bibr" rid="ref-21">21</xref>
                    </sup> underwent dual physician COD assignment on its 5,823 deaths for ages 15 to 64 years. COD assignment was slightly different for the Matlab dataset, which has 2,000 deaths for ages 20 to 64 years; a single physician assigned a COD, followed by review and verification by a second physician or an experienced paramedic
                    <sup>
                        <xref ref-type="bibr" rid="ref-22">22</xref>
                    </sup>. In contrast, the PHMRC dataset comprises 6,718 hospital deaths that were assigned a COD based on certain clinical diagnostic criteria, including laboratory, pathology, and medical imaging findings
                    <sup>
                        <xref ref-type="bibr" rid="ref-23">23</xref>,
                        <xref ref-type="bibr" rid="ref-24">24</xref>
                    </sup>. For each VA dataset, we grouped the physician-assigned CODs into 17 broad categories, listed in 
                    <xref ref-type="table" rid="T2">Table 2</xref>, which also shows the distribution of records across CODs for each of the seven datasets used in our study.</p>
                <table-wrap id="T2" orientation="portrait" position="anchor">
                    <label>Table 2. </label>
                    <caption>
                        <title>Cause list with absolute death counts by VA dataset.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1" valign="top">Groups</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Causes</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Agincourt</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Matlab</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">MDS</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">PHMRC
                                    <break/>All Sites
                                    <break/>Adult</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">PHMRC
                                    <break/>Indian
                                    <break/>Adults</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">PHMRC
                                    <break/>All Sites
                                    <break/>Children</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">PHMRC
                                    <break/>Indian
                                    <break/>Children</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Acute respiratory</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">110</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">11</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">3392</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">304</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">81</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">532</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">141</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">HIV/AIDS</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">2012</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">5</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Diarrhoeal</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">66</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">29</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">2711</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">101</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">41</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">256</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">112</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">4</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Pulmonary TB</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">690</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">43</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">78</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">177</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">21</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">5</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Other and
                                    <break/>unspecified infections</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">432</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">79</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">2514</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">622</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">174</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">376</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">187</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">6</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Neoplasms (cancer)</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">244</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">352</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">96</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">497</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">19</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">28</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">15</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">7</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Nutrition and
                                    <break/>endocrine</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">70</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">90</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">372</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">8</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Cardiovascular
                                    <break/>Diseases</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">381</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">714</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">18</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">928</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">242</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">76</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">25</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">9</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Chronic Respiratory</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">27</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">129</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">21</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">84</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">52</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">10</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Liver cirrhosis</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">89</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">100</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">112</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">234</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">59</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">11</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Other non-
                                    <break/>communicable
                                    <break/>diseases</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">221</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">244</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">1345</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">697</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">125</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">186</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">80</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">12</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Neonatal conditions</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">410</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">13</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Road and transport
                                    <break/>injuries</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">219</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">49</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">95</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">124</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">32</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">92</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">64</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">14</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Other injuries</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">366</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">68</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">659</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">471</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">218</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">324</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">259</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">15</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Ill-defined</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">711</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">35</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">397</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">194</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">65</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">16</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Suicide</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">125</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">34</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">70</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">33</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">17</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Maternal</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">60</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">23</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">345</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">136</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
            </sec>
            <sec>
                <title>One-against-all Na&#x00ef;ve Bayes (OAA-NBC) approach</title>
                <p>An overview of our approach is shown in 
                    <xref ref-type="fig" rid="f1">Figure 1</xref>. We transformed each VA dataset into a binary format in which the VA survey questions are the attributes (columns), the answers are the cell values (re-coded with &#x2018;Yes&#x2019; as 1 and &#x2018;No&#x2019; as 0), and the COD (the group number identifier listed in 
                    <xref ref-type="table" rid="T2">Table 2</xref>) is the last (or the first) column. In every VA dataset, each death is represented as one row (record).</p>
                <fig fig-type="figure" id="f1" orientation="portrait" position="float">
                    <label>Figure 1. </label>
                    <caption>
                        <title>Overview of one-against-all approach.</title>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://gatesopenresearch-files.f1000.com/manuscripts/13987/58fb9376-ce69-4a8f-be7d-15b9e9c21751_figure1.gif"/>
                </fig>
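                <p>The binary re-coding described above can be sketched as follows. This is only an illustration under stated assumptions: the question names, the example record, and the rule that a missing answer is coded as 0 are hypothetical and are not taken from the VA datasets.</p>
                <preformat preformat-type="code"><![CDATA[
```python
def encode_record(answers, questions, cod_group):
    """Return one binary row: a 0/1 value per question, then the COD group id.

    Assumption for this sketch: any answer other than 'Yes' (including a
    missing answer) is coded as 0.
    """
    row = [1 if answers.get(q) == "Yes" else 0 for q in questions]
    row.append(cod_group)  # COD group number placed in the last column
    return row


# Hypothetical survey questions and one hypothetical death record.
questions = ["fever", "cough", "chest_pain"]
record = {"fever": "Yes", "cough": "No", "chest_pain": "Yes"}

# COD group 1 corresponds to "Acute respiratory" in the table of COD groups.
encoded = encode_record(record, questions, cod_group=1)
```
]]></preformat>
                <p>Each death becomes one row of 0/1 symptom indicators followed by its COD group number, which is the layout the classifiers are trained on.</p>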
                <p>We divided each VA dataset into training and testing datasets. We trained multiple NBC models
                    <sup>
                        <xref ref-type="bibr" rid="ref-17">17</xref>
                    </sup> on the transformed training datasets using the one-against-all approach
                    <sup>
                        <xref ref-type="bibr" rid="ref-18">18</xref>,
                        <xref ref-type="bibr" rid="ref-19">19</xref>
                    </sup>. We chose NBC because it has shown better results on VA surveys in past work
                    <sup>
                        <xref ref-type="bibr" rid="ref-11">11</xref>
                    </sup>. The one-against-all approach was used because it improves classification accuracy on datasets whose dependent variable has several categories, as demonstrated in past literature
                    <sup>
                        <xref ref-type="bibr" rid="ref-18">18</xref>,
                        <xref ref-type="bibr" rid="ref-19">19</xref>
                    </sup>. This approach is explained in detail below. During testing, the trained NBC models assign CODs to each death in the testing dataset. The assigned causes are ordered by their probabilities, on the assumption that the top-ranked cause is most likely the true cause.</p>
                <p>
                    <bold>
                        <italic toggle="yes">Training Na&#x00ef;ve Bayes using one-against-all approach.</italic>
                    </bold> NBC uses a training dataset to learn the probabilities of symptoms and their CODs
                    <sup>
                        <xref ref-type="bibr" rid="ref-11">11</xref>,
                        <xref ref-type="bibr" rid="ref-17">17</xref>
                    </sup>. First, NBC estimates the probability of each COD, P(COD), in the training dataset. Second, it estimates the conditional probability of each symptom given a particular COD, P(Sym|COD). Third, it determines the probability of every COD given a VA record in the test set, i.e., P(COD|VA).</p>
                <p>
                    <disp-formula>
                        <mml:math display="block" id="math1">
                            <mml:mrow>
                                <mml:mtext>P(COD|</mml:mtext>
                                <mml:mspace width="0.1em"/>
                                <mml:mtext>VA)</mml:mtext>
                                <mml:mo>=</mml:mo>
                                <mml:mtext>P(COD)</mml:mtext>
                                <mml:mspace width="0.2em"/>
                                <mml:mstyle displaystyle="false">
                                    <mml:msub>
                                        <mml:mo>&#x220f;</mml:mo>
                                        <mml:mrow>
                                            <mml:mi>S</mml:mi>
                                            <mml:mi>y</mml:mi>
                                            <mml:mi>m</mml:mi>
                                            <mml:mspace width="0.2em"/>
                                            <mml:mo>&#x2208;</mml:mo>
                                            <mml:mspace width="0.2em"/>
                                            <mml:mi>V</mml:mi>
                                            <mml:mi>A</mml:mi>
                                        </mml:mrow>
                                    </mml:msub>
                                    <mml:mrow>
                                        <mml:mi>P</mml:mi>
                                        <mml:mo stretchy="false">(</mml:mo>
                                        <mml:mi>S</mml:mi>
                                        <mml:mi>y</mml:mi>
                                        <mml:mi>m</mml:mi>
                                        <mml:mo>|</mml:mo>
                                        <mml:mi>C</mml:mi>
                                        <mml:mi>O</mml:mi>
                                        <mml:mi>D</mml:mi>
                                        <mml:mo stretchy="false">)</mml:mo>
                                    </mml:mrow>
                                </mml:mstyle>
                            </mml:mrow>
                        </mml:math>  </disp-formula>
                </p>
                <p id="e1">
                    <bold>Equation 1.</bold> Conditional probability of COD given a VA record.</p>
                <p>P(COD|VA) is determined by taking the product of P(COD) and P(Sym|COD) for all symptoms in the VA record. The COD with the highest P(COD|VA) is taken as the predicted COD. In particular, we chose the multinomial Na&#x00ef;ve Bayes classification algorithm, which estimates probabilities by maximum likelihood and is readily available in data mining software applications such as Weka
                    <sup>
                        <xref ref-type="bibr" rid="ref-17">17</xref>,
                        <xref ref-type="bibr" rid="ref-18">18</xref>
                    </sup>.</p>
                <p>
                    <disp-formula>
                        <mml:math display="block" id="math2">
                            <mml:mrow>
                                <mml:msub>
                                    <mml:mrow>
                                        <mml:mtext>COD</mml:mtext>
                                    </mml:mrow>
                                    <mml:mrow>
                                        <mml:mtext>NBC</mml:mtext>
                                    </mml:mrow>
                                </mml:msub>
                                <mml:mtext>=</mml:mtext>
                                <mml:mi>a</mml:mi>
                                <mml:mi>r</mml:mi>
                                <mml:mi>g</mml:mi>
                                <mml:mi>m</mml:mi>
                                <mml:mi>a</mml:mi>
                                <mml:msub>
                                    <mml:mi>x</mml:mi>
                                    <mml:mrow>
                                        <mml:mi>C</mml:mi>
                                        <mml:mi>O</mml:mi>
                                        <mml:mi>D</mml:mi>
                                        <mml:mspace width="0.1em"/>
                                        <mml:mo>&#x2208;</mml:mo>
                                        <mml:mi> </mml:mi>
                                        <mml:mi>C</mml:mi>
                                        <mml:mi>O</mml:mi>
                                        <mml:mi>D</mml:mi>
                                        <mml:mi>s</mml:mi>
                                    </mml:mrow>
                                </mml:msub>
                                <mml:mtext>P</mml:mtext>
                                <mml:mo stretchy="false">(</mml:mo>
                                <mml:mi>COD</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mtext>VA</mml:mtext>
                                <mml:mo stretchy="false">)</mml:mo>
                            </mml:mrow>
                        </mml:math> </disp-formula>
                </p>
                <p id="e2">
                    <bold>Equation 2.</bold> Select the class with the maximum probability.</p>
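                <p>To make Equations 1 and 2 concrete, a minimal Python sketch is given below. It is not the Weka implementation used in this study. Two choices are assumptions of the sketch: add-one smoothing (the parameter alpha) is added so that unseen symptoms do not produce zero probabilities, and log-probabilities are summed instead of multiplying raw probabilities to avoid numerical underflow. The toy rows and labels are invented for illustration.</p>
                <preformat preformat-type="code"><![CDATA[
```python
import math
from collections import defaultdict


def train_nb(rows, labels, n_symptoms, alpha=1.0):
    """Estimate log P(COD) and log P(Sym|COD) from binary symptom rows.

    alpha is an add-one smoothing constant (an assumption of this sketch).
    """
    class_counts = defaultdict(int)
    sym_counts = defaultdict(lambda: [0] * n_symptoms)
    for row, cod in zip(rows, labels):
        class_counts[cod] += 1
        for j, value in enumerate(row):
            sym_counts[cod][j] += value
    total = len(rows)
    model = {}
    for cod, n in class_counts.items():
        denom = sum(sym_counts[cod]) + alpha * n_symptoms
        log_prior = math.log(n / total)
        log_lik = [math.log((c + alpha) / denom) for c in sym_counts[cod]]
        model[cod] = (log_prior, log_lik)
    return model


def score(model, row):
    """Log of Equation 1: log P(COD) plus the sum of log P(Sym|COD)
    over the symptoms present in the record."""
    return {
        cod: log_prior + sum(ll for ll, v in zip(log_lik, row) if v)
        for cod, (log_prior, log_lik) in model.items()
    }


def predict(model, row):
    """Equation 2: return the COD with the largest score."""
    scores = score(model, row)
    return max(scores, key=scores.get)


# Toy data: three hypothetical symptoms, two COD groups (1 and 13).
rows = [[1, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 1]]
labels = [1, 1, 13, 13]
model = train_nb(rows, labels, n_symptoms=3)
predicted = predict(model, [1, 1, 0])
```
]]></preformat>
                <p>Because the logarithm is monotonic, taking the maximum of the log scores selects the same COD as taking the maximum of the product in Equation 1.</p>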
                <p>In the one-against-all approach, we built an NBC model for each COD instead of one model for all CODs. In this approach, a dataset with M categories of CODs (the dependent variable) is decomposed into M datasets with binary categories. Each binary dataset D<sub>i</sub> (where i = 1 to M) has one COD C<sub>i</sub> labelled as positive and all other CODs labelled as negative, with no two datasets having the same COD labelled as positive. Finally, NBC is trained on each dataset D<sub>i</sub>, resulting in M Na&#x00ef;ve Bayes models, as shown in 
                    <xref ref-type="fig" rid="f2">Figure 2</xref>. Each model is then used to classify the records in the test dataset, producing a probability of classification. The cause C<sub>i</sub> with the highest probability is taken as the predicted classification.</p>
                <fig fig-type="figure" id="f2" orientation="portrait" position="float">
                    <label>Figure 2. </label>
                    <caption>
                        <title>One-against-all approach for ensemble learning.</title>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://gatesopenresearch-files.f1000.com/manuscripts/13987/58fb9376-ce69-4a8f-be7d-15b9e9c21751_figure2.gif"/>
                </fig>
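                <p>The decomposition into M binary datasets can be illustrated with a short sketch. The function name and the toy label list are hypothetical; only the labelling rule (one COD positive, all others negative) follows the description above.</p>
                <preformat preformat-type="code"><![CDATA[
```python
def one_against_all_labels(labels):
    """Build one binary label vector per COD.

    For COD Ci, a record is labelled 1 (positive) if its COD is Ci and
    0 (negative) otherwise. The symptom columns are shared by all M datasets.
    """
    cods = sorted(set(labels))
    return {cod: [1 if y == cod else 0 for y in labels] for cod in cods}


# Toy example: five deaths assigned to three COD groups, so M = 3.
labels = [1, 3, 1, 8, 3]
binary = one_against_all_labels(labels)
```
]]></preformat>
                <p>One binary classifier would then be trained per entry of the resulting dictionary, giving M models in total.</p>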
                <p>
                    <bold>
                        <italic toggle="yes">Testing OAA-NBC on new surveys.</italic>
                    </bold> During testing, each Na&#x00ef;ve Bayes model assigns a probability to its COD for each VA record in the test dataset, resulting in a list of candidate CODs for each record. We made a minor modification to the one-against-all approach: instead of selecting only the COD with the highest probability, we ranked the CODs in descending order of their probabilities for each VA record. We kept the ranked probabilities to generate cumulative performance measures, which are described in detail in the next section.</p>
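                <p>The ranking step can be sketched as below. The probabilities are made-up values, and the top-k check is only one plausible way to use a ranked list for cumulative measures; it is an assumption of this sketch, not a description of the study code.</p>
                <preformat preformat-type="code"><![CDATA[
```python
def rank_cods(positive_probs):
    """Return COD ids sorted by descending positive-class probability."""
    return sorted(positive_probs, key=positive_probs.get, reverse=True)


def in_top_k(ranked, true_cod, k):
    """True if the reference COD is among the k highest-ranked CODs."""
    return true_cod in ranked[:k]


# Hypothetical output of four binary models for a single VA record:
# COD group id -> probability assigned by that COD's model.
probs = {1: 0.62, 5: 0.21, 8: 0.09, 14: 0.08}
ranked = rank_cods(probs)
```
]]></preformat>
                <p>With k = 1 the check reduces to the standard one-against-all decision, because only the highest-ranked COD is considered.</p>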
            </sec>
            <sec>
                <title>Assessment methods</title>
                <p>A VA algorithm&#x2019;s performance is measured by quantifying the similarity between the algorithm&#x2019;s COD assignments and the physician review assignments (or clinical diagnoses in PHMRC). Since the community VA datasets included in this study come from countries with weak civil and death registration, physician review is the most practical, and in fact the only, relatively accurate option for assessing algorithm performance. Moreover, because these deaths are unattended, there is no &#x2018;gold standard&#x2019; for such community VA datasets. Nevertheless, we are confident in the robustness of dual physician review, as initial physician agreement (i.e., where two physicians agreed at the outset of COD coding) was relatively high; e.g., 79% for MDS and 77% for Agincourt.</p>
                <p>We measured and compared the individual- and population-level performance of all algorithms using the following metrics: sensitivity, partially chance-corrected concordance (PCCC) and cause-specific mortality fraction (CSMF) accuracy. These measures are commonly used in VA studies
                    <sup>
                        <xref ref-type="bibr" rid="ref-11">11</xref>,
                        <xref ref-type="bibr" rid="ref-15">15</xref>,
                        <xref ref-type="bibr" rid="ref-26">26</xref>
                    </sup>. They are shown in 
                    <xref ref-type="other" rid="e3">Equation 3</xref> &#x2013; 
                    <xref ref-type="other" rid="e5">Equation 5</xref>. They are helpful in objectively assessing the performance of VA algorithms, as they provide a robust strategy to assess an algorithm&#x2019;s classification ability for test datasets with widely varying COD distributions
                    <sup>
                        <xref ref-type="bibr" rid="ref-12">12</xref>,
                        <xref ref-type="bibr" rid="ref-26">26</xref>
                    </sup>.</p>
                <p>
                    <disp-formula>
                        <mml:math display="block" id="math3">
                            <mml:mrow>
                                <mml:mi>S</mml:mi>
                                <mml:mi>e</mml:mi>
                                <mml:mi>n</mml:mi>
                                <mml:mi>s</mml:mi>
                                <mml:mi>i</mml:mi>
                                <mml:mi>t</mml:mi>
                                <mml:mi>i</mml:mi>
                                <mml:mi>v</mml:mi>
                                <mml:mi>i</mml:mi>
                                <mml:mi>t</mml:mi>
                                <mml:mi>y</mml:mi>
                                <mml:mo>=</mml:mo>
                                <mml:mfrac>
                                    <mml:mrow>
                                        <mml:mi>T</mml:mi>
                                        <mml:mi>r</mml:mi>
                                        <mml:mi>u</mml:mi>
                                        <mml:mi>e</mml:mi>
                                        <mml:mspace width="0.3em"/>
                                        <mml:mi>p</mml:mi>
                                        <mml:mi>o</mml:mi>
                                        <mml:mi>s</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>t</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>v</mml:mi>
                                        <mml:mi>e</mml:mi>
                                    </mml:mrow>
                                    <mml:mrow>
                                        <mml:mi>T</mml:mi>
                                        <mml:mi>r</mml:mi>
                                        <mml:mi>u</mml:mi>
                                        <mml:mi>e</mml:mi>
                                        <mml:mspace width="0.3em"/>
                                        <mml:mi>p</mml:mi>
                                        <mml:mi>o</mml:mi>
                                        <mml:mi>s</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>t</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>v</mml:mi>
                                        <mml:mi>e</mml:mi>
                                        <mml:mo>+</mml:mo>
                                        <mml:mi>F</mml:mi>
                                        <mml:mi>a</mml:mi>
                                        <mml:mi>l</mml:mi>
                                        <mml:mi>s</mml:mi>
                                        <mml:mi>e</mml:mi>
                                        <mml:mspace width="0.3em"/>
                                        <mml:mi>n</mml:mi>
                                        <mml:mi>e</mml:mi>
                                        <mml:mi>g</mml:mi>
                                        <mml:mi>a</mml:mi>
                                        <mml:mi>t</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>v</mml:mi>
                                        <mml:mi>e</mml:mi>
                                    </mml:mrow>
                                </mml:mfrac>
                            </mml:mrow>
                        </mml:math>    </disp-formula>
                </p>
                <p id="e3">
                    <bold>Equation 3.</bold> Sensitivity of classification</p>
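                <p>As a worked illustration of Equation 3, the sketch below computes the sensitivity for one COD from paired reference and predicted labels. The label lists are invented, and returning None when a COD has no reference cases is a convention chosen for this sketch.</p>
                <preformat preformat-type="code"><![CDATA[
```python
def sensitivity(true_cods, predicted_cods, cod):
    """Equation 3 for one COD: TP / (TP + FN).

    Returns None when the COD has no reference cases (a sketch convention).
    """
    pairs = list(zip(true_cods, predicted_cods))
    tp = sum(1 for t, p in pairs if t == cod and p == cod)
    fn = sum(1 for t, p in pairs if t == cod and p != cod)
    if tp + fn == 0:
        return None
    return tp / (tp + fn)


# Toy example: reference (physician-assigned) and predicted COD groups.
true_cods = [1, 1, 1, 3, 3]
predicted_cods = [1, 1, 3, 3, 1]
sens_cod1 = sensitivity(true_cods, predicted_cods, 1)
sens_cod3 = sensitivity(true_cods, predicted_cods, 3)
```
]]></preformat>
                <p>In this toy example, two of the three deaths with reference COD 1 are predicted correctly, and one of the two deaths with reference COD 3 is predicted correctly.</p>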
                <p>
                    <disp-formula>
                        <mml:math display="block" id="math4">
                            <mml:mrow>
                                <mml:mi>P</mml:mi>
                                <mml:mi>C</mml:mi>
                                <mml:mi>C</mml:mi>
                                <mml:mi>C</mml:mi>
                                <mml:mo stretchy="false">(</mml:mo>
                                <mml:mi>k</mml:mi>
                                <mml:mo stretchy="false">)</mml:mo>
                                <mml:mo>=</mml:mo>
                                <mml:mfrac>
                                    <mml:mrow>
                                        <mml:mi>S</mml:mi>
                                        <mml:mo>&#x2212;</mml:mo>
                                        <mml:mfrac>
                                            <mml:mi>k</mml:mi>
                                            <mml:mi>n</mml:mi>
                                        </mml:mfrac>
                                    </mml:mrow>
                                    <mml:mrow>
                                        <mml:mn>1</mml:mn>
                                        <mml:mo>&#x2212;</mml:mo>
                                        <mml:mfrac>
                                            <mml:mi>k</mml:mi>
                                            <mml:mi>n</mml:mi>
                                        </mml:mfrac>
                                    </mml:mrow>
                                </mml:mfrac>
                            </mml:mrow>
                        </mml:math> </disp-formula>
                </p>
                <p>
                    <disp-formula>
                        <mml:math display="block" id="math5">
                            <mml:mrow>
                                <mml:mtext>Where</mml:mtext>
                                <mml:mspace width="0.3em"/>
                                <mml:mtext>S</mml:mtext>
                                <mml:mo>=</mml:mo>
                                <mml:mfrac>
                                    <mml:mrow>
                                        <mml:mi>T</mml:mi>
                                        <mml:mi>r</mml:mi>
                                        <mml:mi>u</mml:mi>
                                        <mml:mi>e</mml:mi>
                                        <mml:mspace width="0.3em"/>
                                        <mml:mi>p</mml:mi>
                                        <mml:mi>o</mml:mi>
                                        <mml:mi>s</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>t</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>v</mml:mi>
                                        <mml:mi>e</mml:mi>
                                    </mml:mrow>
                                    <mml:mrow>
                                        <mml:mi>T</mml:mi>
                                        <mml:mi>r</mml:mi>
                                        <mml:mi>u</mml:mi>
                                        <mml:mi>e</mml:mi>
                                        <mml:mspace width="0.3em"/>
                                        <mml:mi>p</mml:mi>
                                        <mml:mi>o</mml:mi>
                                        <mml:mi>s</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>t</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>v</mml:mi>
                                        <mml:mi>e</mml:mi>
                                        <mml:mo>+</mml:mo>
                                        <mml:mi>F</mml:mi>
                                        <mml:mi>a</mml:mi>
                                        <mml:mi>l</mml:mi>
                                        <mml:mi>s</mml:mi>
                                        <mml:mi>e</mml:mi>
                                        <mml:mspace width="0.3em"/>
                                        <mml:mi>N</mml:mi>
                                        <mml:mi>e</mml:mi>
                                        <mml:mi>g</mml:mi>
                                        <mml:mi>a</mml:mi>
                                        <mml:mi>t</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>v</mml:mi>
                                        <mml:mi>e</mml:mi>
                                    </mml:mrow>
                                </mml:mfrac>
                            </mml:mrow>
                        </mml:math>    </disp-formula>
                </p>
                <p>
                    <bold>Equation 4.</bold> Partially chance corrected concordance (PCCC) of classification: S is the fraction of records whose correct cause is among the top k assigned causes, out of a total of n causes.</p>
                <p>Sensitivity and PCCC are metrics that assess the performance of an algorithm for correctly classifying the CODs at the individual level. Sensitivity measures the proportion of death records that are correctly assigned for each COD
                    <sup>
                        <xref ref-type="bibr" rid="ref-12">12</xref>
                    </sup>. Similarly, PCCC measures how well a VA classification algorithm classifies the CODs at the individual level while also correcting for chance (the likelihood that a COD was assigned at random)
                    <sup>
                        <xref ref-type="bibr" rid="ref-8">8</xref>,
                        <xref ref-type="bibr" rid="ref-11">11</xref>,
                        <xref ref-type="bibr" rid="ref-12">12</xref>,
                        <xref ref-type="bibr" rid="ref-15">15</xref>
                    </sup>.</p>
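                <p>As an illustration of Equations 3 and 4, the following minimal Python sketch computes sensitivity and PCCC for the top k ranked causes. It is not the Java program used in this study; the function names, the toy records and the choice to pool all records into a single sensitivity value are assumptions made only for this example.</p>
                <code language="python" xml:space="preserve"><![CDATA[
```python
def top_k_sensitivity(true_causes, ranked_predictions, k):
    """Fraction of records whose true cause is among the first k ranked causes."""
    hits = sum(
        1
        for truth, ranked in zip(true_causes, ranked_predictions)
        if truth in ranked[:k]
    )
    return hits / len(true_causes)


def pccc(sensitivity, k, n_causes):
    """Partially chance corrected concordance: (S - k/n) / (1 - k/n)."""
    chance = k / n_causes
    return (sensitivity - chance) / (1 - chance)


# Hypothetical toy data: four death records and five possible causes.
true_causes = ["A", "B", "C", "A"]
ranked_predictions = [
    ["A", "B", "C"],  # correct at rank one
    ["C", "B", "A"],  # correct at rank two
    ["A", "B", "D"],  # true cause not among the top three
    ["B", "C", "A"],  # correct at rank three
]

for k in (1, 2, 3):
    s = top_k_sensitivity(true_causes, ranked_predictions, k)
    print(k, s, pccc(s, k, n_causes=5))
```
]]></code>
                <p>In this toy example the sensitivity rises from 0.25 at rank one to 0.50 at rank two and 0.75 at rank three, while the chance term k/n subtracted by PCCC grows with k.</p>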
                <p>
                    <disp-formula>
                        <mml:math display="block" id="math6">
                            <mml:mrow>
                                <mml:mi>C</mml:mi>
                                <mml:mi>S</mml:mi>
                                <mml:mi>M</mml:mi>
                                <mml:mi>F</mml:mi>
                                <mml:mspace width="0.3em"/>
                                <mml:mi>A</mml:mi>
                                <mml:mi>c</mml:mi>
                                <mml:mi>c</mml:mi>
                                <mml:mi>u</mml:mi>
                                <mml:mi>r</mml:mi>
                                <mml:mi>a</mml:mi>
                                <mml:mi>c</mml:mi>
                                <mml:mi>y</mml:mi>
                                <mml:mo>=</mml:mo>
                                <mml:mn>1</mml:mn>
                                <mml:mo>&#x2212;</mml:mo>
                                <mml:mfrac>
                                    <mml:mrow>
                                        <mml:mstyle displaystyle="false">
                                            <mml:msubsup>
                                                <mml:mo>&#x2211;</mml:mo>
                                                <mml:mrow>
                                                    <mml:mi>j</mml:mi>
                                                    <mml:mo>=</mml:mo>
                                                    <mml:mn>1</mml:mn>
                                                </mml:mrow>
                                                <mml:mi>n</mml:mi>
                                            </mml:msubsup>
                                            <mml:mrow>
                                                <mml:mrow>
                                                    <mml:mo>|</mml:mo>
                                                    <mml:mrow>
                                                        <mml:mi>C</mml:mi>
                                                        <mml:mi>S</mml:mi>
                                                        <mml:mi>M</mml:mi>
                                                        <mml:msubsup>
                                                            <mml:mi>F</mml:mi>
                                                            <mml:mi>j</mml:mi>
                                                            <mml:mrow>
                                                                <mml:mi>T</mml:mi>
                                                                <mml:mi>r</mml:mi>
                                                                <mml:mi>u</mml:mi>
                                                                <mml:mi>e</mml:mi>
                                                            </mml:mrow>
                                                        </mml:msubsup>
                                                        <mml:mo>&#x2212;</mml:mo>
                                                        <mml:mi>C</mml:mi>
                                                        <mml:mi>S</mml:mi>
                                                        <mml:mi>M</mml:mi>
                                                        <mml:msubsup>
                                                            <mml:mi>F</mml:mi>
                                                            <mml:mi>j</mml:mi>
                                                            <mml:mrow>
                                                                <mml:mi>P</mml:mi>
                                                                <mml:mi>r</mml:mi>
                                                                <mml:mi>e</mml:mi>
                                                                <mml:mi>d</mml:mi>
                                                            </mml:mrow>
                                                        </mml:msubsup>
                                                    </mml:mrow>
                                                    <mml:mo>|</mml:mo>
                                                </mml:mrow>
                                            </mml:mrow>
                                        </mml:mstyle>
                                    </mml:mrow>
                                    <mml:mrow>
                                        <mml:mn>2</mml:mn>
                                        <mml:mo stretchy="false">(</mml:mo>
                                        <mml:mn>1</mml:mn>
                                        <mml:mo>&#x2212;</mml:mo>
                                        <mml:mi>M</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>n</mml:mi>
                                        <mml:mi>i</mml:mi>
                                        <mml:mi>m</mml:mi>
                                        <mml:mi>u</mml:mi>
                                        <mml:mi>m</mml:mi>
                                        <mml:mrow>
                                            <mml:mo>(</mml:mo>
                                            <mml:mrow>
                                                <mml:mi>C</mml:mi>
                                                <mml:mi>S</mml:mi>
                                                <mml:mi>M</mml:mi>
                                                <mml:msubsup>
                                                    <mml:mi>F</mml:mi>
                                                    <mml:mi>j</mml:mi>
                                                    <mml:mrow>
                                                        <mml:mi>T</mml:mi>
                                                        <mml:mi>r</mml:mi>
                                                        <mml:mi>u</mml:mi>
                                                        <mml:mi>e</mml:mi>
                                                    </mml:mrow>
                                                </mml:msubsup>
                                            </mml:mrow>
                                            <mml:mo>)</mml:mo>
                                        </mml:mrow>
                                        <mml:mo stretchy="false">)</mml:mo>
                                    </mml:mrow>
                                </mml:mfrac>
                            </mml:mrow>
                        </mml:math> </disp-formula>
                </p>
                <p>
                    <disp-formula>
                        <mml:math display="block" id="math7">
                            <mml:mrow>
                                <mml:mtext>Where</mml:mtext>
                                <mml:mspace width="0.3em"/>
                                <mml:mi>C</mml:mi>
                                <mml:mi>S</mml:mi>
                                <mml:mi>M</mml:mi>
                                <mml:msup>
                                    <mml:mi>F</mml:mi>
                                    <mml:mrow>
                                        <mml:mi>P</mml:mi>
                                        <mml:mi>r</mml:mi>
                                        <mml:mi>e</mml:mi>
                                        <mml:mi>d</mml:mi>
                                    </mml:mrow>
                                </mml:msup>
                                <mml:mo>=</mml:mo>
                                <mml:mfrac>
                                    <mml:mrow>
                                        <mml:mo stretchy="false">(</mml:mo>
                                        <mml:mi>T</mml:mi>
                                        <mml:mi>P</mml:mi>
                                        <mml:mo>+</mml:mo>
                                        <mml:mi>F</mml:mi>
                                        <mml:mi>P</mml:mi>
                                        <mml:mo stretchy="false">)</mml:mo>
                                    </mml:mrow>
                                    <mml:mi>N</mml:mi>
                                </mml:mfrac>
                                <mml:mspace width="0.8em"/>
                                <mml:mi>a</mml:mi>
                                <mml:mi>n</mml:mi>
                                <mml:mi>d</mml:mi>
                                <mml:mspace width="0.4em"/>
                                <mml:mi>C</mml:mi>
                                <mml:mi>S</mml:mi>
                                <mml:mi>M</mml:mi>
                                <mml:msup>
                                    <mml:mi>F</mml:mi>
                                    <mml:mrow>
                                        <mml:mi>T</mml:mi>
                                        <mml:mi>r</mml:mi>
                                        <mml:mi>u</mml:mi>
                                        <mml:mi>e</mml:mi>
                                    </mml:mrow>
                                </mml:msup>
                                <mml:mo>=</mml:mo>
                                <mml:mfrac>
                                    <mml:mrow>
                                        <mml:mo stretchy="false">(</mml:mo>
                                        <mml:mi>T</mml:mi>
                                        <mml:mi>P</mml:mi>
                                        <mml:mo>+</mml:mo>
                                        <mml:mi>F</mml:mi>
                                        <mml:mi>N</mml:mi>
                                        <mml:mo stretchy="false">)</mml:mo>
                                    </mml:mrow>
                                    <mml:mi>N</mml:mi>
                                </mml:mfrac>
                            </mml:mrow>
                        </mml:math>   </disp-formula>
                </p>
                <p id="e5">
                    <bold>Equation 5.</bold> Cause-specific mortality fraction (CSMF) accuracy of classification: n is the total number of CODs and N is the total number of records.</p>
                <p>In contrast, CSMF accuracy (hereafter referred to as &#x2018;agreement&#x2019;) measures how closely an algorithm reproduces the overall COD distribution at the population level
                    <sup>
                        <xref ref-type="bibr" rid="ref-12">12</xref>
                    </sup>. As 
                    <xref ref-type="other" rid="e5">Equation 5</xref> shows, CSMF accuracy is based on the absolute error between the COD distribution predicted by an algorithm (pred) and the observed (true) COD distribution.</p>
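                <p>Equation 5 can be sketched in a few lines of Python. This is an illustrative sketch rather than the evaluation code used in this study: the function name and the toy labels are hypothetical, and the minimum in the denominator is assumed to be taken over the causes that occur in the true labels.</p>
                <code language="python" xml:space="preserve"><![CDATA[
```python
from collections import Counter


def csmf_accuracy(true_causes, predicted_causes):
    """1 - sum_j |CSMF_true_j - CSMF_pred_j| / (2 * (1 - min_j CSMF_true_j)).

    Assumes at least two distinct true causes; with a single cause the
    denominator would be zero.
    """
    n_records = len(true_causes)
    true_counts = Counter(true_causes)      # TP + FN per cause
    pred_counts = Counter(predicted_causes)  # TP + FP per cause
    causes = set(true_counts) | set(pred_counts)
    total_abs_error = sum(
        abs(true_counts[c] / n_records - pred_counts[c] / n_records)
        for c in causes
    )
    min_true_fraction = min(true_counts.values()) / n_records
    return 1 - total_abs_error / (2 * (1 - min_true_fraction))


# Hypothetical toy labels: one record of cause A is predicted as cause B.
true_causes = ["A", "A", "B", "C"]
predicted_causes = ["A", "B", "B", "C"]
print(csmf_accuracy(true_causes, predicted_causes))
```
]]></code>
                <p>Because only the cause fractions enter the formula, two errors that cancel each other out at the population level leave the value unchanged, which is why this measure is reported alongside the individual-level measures.</p>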
                <p>We measure the cumulative values of sensitivity, PCCC and agreement at each rank and for each algorithm; e.g., sensitivity at rank two represents the sensitivity of the rank one and rank two classifications combined, which facilitates measuring the overall performance of the algorithms for classifications at the top two or more ranks. If 60% of the individual classifications are correct at rank one and a further 15% are correct at rank two, then the cumulative sensitivity at rank two is 75%. Finally, we also perform a statistical test of significance on the results from all the datasets to ascertain that the differences in results are not due to chance. The choice of statistical test depends on the data distribution and the association between experiments. We use the Wilcoxon signed-rank test because we cannot assume that our results are normally distributed. Our null hypothesis is that there is no significant difference between OAA-NBC and another algorithm. This is further discussed in the results section.</p>
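                <p>The rank sums on which the Wilcoxon signed-rank test is based can be sketched as follows. This Python sketch is illustrative only: the function name and the paired values are hypothetical (they are not results from this study), zero differences are dropped, tied absolute differences receive average ranks, and the conversion of the statistic into a p-value, which a statistics library would normally supply, is omitted.</p>
                <code language="python" xml:space="preserve"><![CDATA[
```python
def wilcoxon_signed_rank_sums(sample_a, sample_b):
    """Return (W_plus, W_minus) for paired samples, dropping zero differences."""
    diffs = [a - b for a, b in zip(sample_a, sample_b) if a != b]
    # Indices of the differences, ordered by absolute value.
    ordered = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(ordered):
        # Extend j over the run of tied absolute differences starting at i.
        j = i
        while (j + 1 < len(ordered)
               and abs(diffs[ordered[j + 1]]) == abs(diffs[ordered[i]])):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for position in range(i, j + 1):
            ranks[ordered[position]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus


# Hypothetical paired scores of two algorithms on five datasets.
algorithm_a = [60, 55, 58, 62, 57]
algorithm_b = [56, 50, 59, 55, 57]
print(wilcoxon_signed_rank_sums(algorithm_a, algorithm_b))
```
]]></code>
                <p>The smaller of the two rank sums is the test statistic; a small value relative to the number of non-zero pairs is evidence against the null hypothesis of no difference.</p>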
            </sec>
            <sec id="S1">
                <title>Experimental setup</title>
                <p>In order to compare the performance between OAA-NBC, InterVA-4
                    <sup>
                        <xref ref-type="bibr" rid="ref-7">7</xref>
                    </sup>, Tariff
                    <sup>
                        <xref ref-type="bibr" rid="ref-6">6</xref>
                    </sup>, InSilicoVA
                    <sup>
                        <xref ref-type="bibr" rid="ref-8">8</xref>
                    </sup> and NBC
                    <sup>
                        <xref ref-type="bibr" rid="ref-11">11</xref>
                    </sup>, we followed a seven-step procedure. In Step one, we partitioned each VA dataset using an evaluation procedure commonly used in data mining: 10-fold cross validation
                    <sup>
                        <xref ref-type="bibr" rid="ref-18">18</xref>
                    </sup>. In 10-fold cross validation, a dataset is divided into 10 parts. Each part is created using stratified sampling; i.e., each part contains the same proportion of standardized CODs as the original dataset. In Step two, we selected one part for testing and nine parts for training from each VA dataset. In Step three, we trained OAA-NBC, InterVA-4, Tariff, InSilicoVA and NBC on the designated training data subsets from each partitioned VA dataset. In Step four, we generated classifications with ranks for each algorithm on the test part of each VA dataset. In Step five, we calculated the cumulative sensitivity, PCCC and agreement for each rank per VA dataset. In Step six, we repeated Steps two to five until each of the 10 parts had been used once for testing, for each VA dataset. In Step seven, we computed the average sensitivity, PCCC and agreement for each rank per VA dataset and algorithm.</p>
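                <p>Steps one, two and six of the procedure above can be sketched as follows. This is a minimal Python sketch and not the Weka stratification routine used in this study: the function names, the round-robin assignment and the toy labels are assumptions, and the resulting folds match the original COD proportions only approximately when the class counts are not multiples of the number of folds.</p>
                <code language="python" xml:space="preserve"><![CDATA[
```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=10, seed=0):
    """Assign record indices to folds so that each fold has a similar COD mix."""
    by_cause = defaultdict(list)
    for index, cause in enumerate(labels):
        by_cause[cause].append(index)
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    next_fold = 0
    for cause in sorted(by_cause):
        indices = by_cause[cause]
        rng.shuffle(indices)
        # Deal the records of this cause out to the folds in round-robin order.
        for index in indices:
            folds[next_fold % n_folds].append(index)
            next_fold += 1
    return folds


def cross_validation_splits(labels, n_folds=10, seed=0):
    """Yield (train_indices, test_indices), using each fold once for testing."""
    folds = stratified_folds(labels, n_folds, seed)
    for held_out in range(n_folds):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, test


# Hypothetical dataset: 100 records with three causes of death.
labels = ["A"] * 50 + ["B"] * 30 + ["C"] * 20
for train, test in cross_validation_splits(labels):
    print(len(train), len(test))
```
]]></code>
                <p>Each algorithm would then be trained on the training indices and evaluated on the test indices of every split, and the per-split measures averaged as in Step seven.</p>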
                <p>We implemented OAA-NBC in Java using the 
                    <ext-link ext-link-type="uri" xlink:href="http://weka.sourceforge.net/doc.stable/">Weka</ext-link> API
                    <sup>
                        <xref ref-type="bibr" rid="ref-18">18</xref>
                    </sup>. Weka provides APIs for the one-against-all approach and the Na&#x00ef;ve Bayes Multinomial classifier
                    <sup>
                        <xref ref-type="bibr" rid="ref-18">18</xref>
                    </sup>. We used the 
                    <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/web/packages/openVA/index.html">OpenVA</ext-link> package version 1.0.2 in R to run the InterVA-4, Tariff, InSilicoVA and NBC algorithms. The data were also transformed into the InterVA-4 input format (Y for 1 and empty for 0 values). It is important to note that the Tariff version provided in the OpenVA package is computationally different from IHME&#x2019;s SmartVA-Analyze application tool. We used the custom training option for InterVA-4 and InSilicoVA, as provided in the OpenVA package in R. In custom training, the names of symptoms do not need to be in the WHO standardised format, and the rankings of the conditional probability P(symptom|cause) are determined by matching the same quantile distributions in the default InterVA P(symptom|cause). The reason for choosing custom training instead of pre-trained global models is that different datasets have different proportions of symptoms and causes of death, and custom training allows algorithms to generate models customized for the dataset. It also allows for fair evaluation across algorithms, especially for the ones that only work by using customized training on datasets and acquire more knowledge of the dataset during testing.</p>
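                <p>The transformation into the InterVA-4 input format described above (Y for 1 and empty for 0 values) can be sketched as follows. The Python function and the symptom names are hypothetical and serve only to illustrate the mapping; they are not taken from the study's code or from the OpenVA package.</p>
                <code language="python" xml:space="preserve"><![CDATA[
```python
def to_interva_format(record):
    """Map binary symptom indicators to 'Y' (present) or '' (absent)."""
    return {
        symptom: "Y" if value == 1 else ""
        for symptom, value in record.items()
    }


# Hypothetical record with made-up symptom names.
record = {"fever": 1, "cough": 0, "chest_pain": 1}
print(to_interva_format(record))
```
]]></code>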
                <p>We performed data partitioning, as discussed in Step 1, using Java and Weka&#x2019;s
                    <sup>
                        <xref ref-type="bibr" rid="ref-18">18</xref>
                    </sup> stratified sampling API. Each algorithm was executed on the same partitioned data. We used a separate Java program to compute the cumulative measures of sensitivity, PCCC and agreement (CSMF accuracy) on the COD assignments of each algorithm for each VA dataset. This ensured that the evaluation measures were calculated in exactly the same manner for every algorithm. Our source code for all the experiments is available on 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/sshahriyar/va">GitHub</ext-link> and is archived at 
                    <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.1489267">Zenodo</ext-link>
                    <sup>
                        <xref ref-type="bibr" rid="ref-27">27</xref>
                    </sup>.</p>
            </sec>
        </sec>
        <sec sec-type="results">
            <title>Results</title>
            <sec>
                <title>Ranked CSMF accuracy comparison</title>
                <p>
                    <xref ref-type="fig" rid="f3">Figure 3</xref> shows the averaged agreement (CSMF accuracy) of each algorithm on each VA dataset, using the rank one (most likely) COD assignments and the cumulative assignments up to rank five. OAA-NBC produces the highest agreement for most of the VA datasets, ranging from 86% to 90% for rank one; it ranks second to, or ties with, NBC on the PHMRC Child datasets (global and India). Furthermore, the agreement of OAA-NBC was relatively consistent across the VA datasets, whereas that of some other algorithms, such as Tariff, InterVA-4 and InSilicoVA, varied considerably. As expected, agreement increased for every algorithm and VA dataset when the top five ranked classifications were included.</p>
                <fig fig-type="figure" id="f3" orientation="portrait" position="float">
                    <label>Figure 3. </label>
                    <caption>
                        <title>Ranks 1 and 5 cause-specific mortality fraction (CSMF) accuracies (agreement) across VA datasets and algorithms.</title>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://gatesopenresearch-files.f1000.com/manuscripts/13987/58fb9376-ce69-4a8f-be7d-15b9e9c21751_figure3.gif"/>
                </fig>
            </sec>
            <sec>
                <title>Ranked sensitivity comparison</title>
                <p>Individual-level cumulative sensitivity results for classification ranks one and five are shown in 
                    <xref ref-type="table" rid="T3">Table 3</xref>; cumulative PCCC values are not shown because they were very close to the cumulative sensitivity values. 
                    <xref ref-type="table" rid="T3">Table 3</xref> shows that OAA-NBC has the highest sensitivity values for the first ranked (most likely) COD assignments compared to the other algorithms, ranging from 53% to 63%. When all top five ranked classifications are considered, the sensitivity of OAA-NBC improves by 31&#x2013;38%, with cumulative values ranging from 91% to 95%. For Tariff, InterVA-4 and InSilicoVA, the sensitivity values are considerably lower (by 10&#x2013;40%) than those of OAA-NBC; NBC does not differ substantially from OAA-NBC, with differences of only 3&#x2013;7%. These results show that OAA-NBC consistently yields closer agreement with the physician review or clinical diagnoses at the individual level than the other algorithms on most of the VA datasets.</p>
                <table-wrap id="T3" orientation="portrait" position="anchor">
                    <label>Table 3. </label>
                    <caption>
                        <title>Cumulative sensitivity of rank 1 and 5 COD classifications by VA dataset and algorithm.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th colspan="1" rowspan="2"/>
                                <th align="center" colspan="14" rowspan="1" valign="top">VA dataset, rank, cumulative sensitivity (%)</th>
                            </tr>
                            <tr>
                                <th align="center" colspan="2" rowspan="1" valign="top">MDS</th>
                                <th align="center" colspan="2" rowspan="1" valign="top">Matlab</th>
                                <th align="center" colspan="2" rowspan="1" valign="top">Agincourt</th>
                                <th align="center" colspan="2" rowspan="1" valign="top">PHMRC
                                    <break/>Adult - Global</th>
                                <th align="center" colspan="2" rowspan="1" valign="top">PHMRC
                                    <break/>Adult - India</th>
                                <th align="center" colspan="2" rowspan="1" valign="top">PHMRC
                                    <break/>Child - Global</th>
                                <th align="center" colspan="2" rowspan="1" valign="top">PHMRC
                                    <break/>Child - India</th>
                            </tr>
                            <tr>
                                <th align="center" colspan="1" rowspan="1" valign="top">Algorithm</th>
                                <th align="center" colspan="1" rowspan="1" valign="top">Rank
                                    <break/>1</th>
                                <th align="center" colspan="1" rowspan="1" valign="top">Rank
                                    <break/>5</th>
                                <th align="center" colspan="1" rowspan="1" valign="top">Rank
                                    <break/>1</th>
                                <th align="center" colspan="1" rowspan="1" valign="top">Rank
                                    <break/>5</th>
                                <th align="center" colspan="1" rowspan="1" valign="top">Rank
                                    <break/>1</th>
                                <th align="center" colspan="1" rowspan="1" valign="top">Rank
                                    <break/>5</th>
                                <th align="center" colspan="1" rowspan="1" valign="top">Rank
                                    <break/>1</th>
                                <th align="center" colspan="1" rowspan="1" valign="top">Rank
                                    <break/>5</th>
                                <th align="center" colspan="1" rowspan="1" valign="top">Rank
                                    <break/>1</th>
                                <th align="center" colspan="1" rowspan="1" valign="top">Rank
                                    <break/>5</th>
                                <th align="center" colspan="1" rowspan="1" valign="top">Rank
                                    <break/>1</th>
                                <th align="center" colspan="1" rowspan="1" valign="top">Rank
                                    <break/>5</th>
                                <th align="center" colspan="1" rowspan="1" valign="top">Rank
                                    <break/>1</th>
                                <th align="center" colspan="1" rowspan="1" valign="top">Rank
                                    <break/>5</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">Tariff</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">31.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">71.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">40.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">75.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">27.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">72.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">35.9</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">74.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">44.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">79.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">37.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">83.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">39.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">86.3</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">InterVA-4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">48.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">82.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">34.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">79.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">46.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">78.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">36.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">82.2</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">41.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">84.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">45.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">91.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">51.2</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">93.0</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">InSilicoVA</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">45.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">85.9</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">35.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">80.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">35.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">80.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">35.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">79.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">50.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">87.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">43.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">89.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">49.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">92.4</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">NBC</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">56.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">90.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">50.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">87.2</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">48.2</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">87.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">47.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">88.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">54.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">86.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">51.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">93.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">58.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">92.4</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">OAA-NBC</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">61.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">94.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">57.9</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">91.2</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">55.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">93.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">53.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">91.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">60.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">93.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">54.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">93.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">63.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">94.7</td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
                <p>We also performed a Wilcoxon signed-rank test on the sensitivity values reported in 
                    <xref ref-type="table" rid="T3">Table 3</xref> for the five algorithms (we also included the rank two to rank three values, which are omitted from the table to save space). For 35 observations, the Wilcoxon signed-rank test yielded a Z-score of 5.194 and a two-tailed p-value of 2.47 &#x00d7; 10
                    <sup>-7</sup> between OAA-NBC and NBC. It yielded the same Z-scores and two-tailed p-values against InSilicoVA, InterVA-4, and Tariff. Thus, there is a statistically significant difference between the sensitivity values generated by OAA-NBC and those of the four other algorithms (p &lt; 0.05). Similarly, we conducted the Wilcoxon signed-rank test on 35 observations of agreement for the five algorithms and found a statistically significant difference between OAA-NBC and the other algorithms (Z=4.248, p &lt; 0.05).</p>
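                <p>For readers who want to reproduce this kind of paired comparison, the following is a minimal sketch of a two-sided Wilcoxon signed-rank test using the normal approximation. It is an illustration under stated assumptions, not the procedure used to obtain the values above: the function name, the handling of zero differences (dropped), the average-rank treatment of ties, and the absence of a continuity correction are all choices made for this example, and the input values are made up.</p>
                <code language="python">
```python
import math
from itertools import groupby


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Returns (w_plus, w_minus, z, p_value). Zero differences are dropped,
    tied absolute differences share their average rank, and no continuity
    correction is applied.
    """
    # Round to suppress floating-point noise so equal differences tie.
    diffs = [round(a - b, 10) for a, b in zip(x, y)]
    diffs = [d for d in diffs if d != 0]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Assign average ranks to the absolute differences.
    by_size = sorted((abs(d), i) for i, d in enumerate(diffs))
    ranks = [0.0] * n
    tie_term = 0.0
    position = 0
    for _, group in groupby(by_size, key=lambda pair: pair[0]):
        members = list(group)
        t = len(members)
        average_rank = position + (t + 1) / 2
        for _, index in members:
            ranks[index] = average_rank
        tie_term += t ** 3 - t
        position += t

    # Sum of ranks for positive differences; the remainder is W-.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = n * (n + 1) / 2 - w_plus

    # Normal approximation with the usual tie correction to the variance.
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, w_minus, z, p_value


# Made-up paired sensitivities (%), NOT values from the article.
method_a = [61.0, 58.0, 55.5, 53.0, 60.5, 54.5]
method_b = [60.0, 56.0, 52.5, 49.0, 55.5, 48.5]
w_plus, w_minus, z, p_value = wilcoxon_signed_rank(method_a, method_b)
print(f"W+={w_plus}, W-={w_minus}, z={z:.3f}, p={p_value:.4f}")
```
                </code>
                <p>In this toy input every difference favours the first method, so the negative rank sum is zero. The normal approximation is rough for so few pairs; with 35 paired observations, as in the comparison above, it is more reasonable, although an exact test or an established statistics package may still be preferable.</p>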
                <p>Thus, using the one-against-all approach with NBC (OAA-NBC) improves the performance of COD classification for VA records and yields better COD assignments at both the population and individual levels; the differences from the four other algorithms are statistically significant and unlikely to be due to chance. This is also consistent with the machine learning literature, which reports that the one-against-all approach improves the performance of classification algorithms when there are more than two classes (CODs)
                    <sup>
                        <xref ref-type="bibr" rid="ref-19">19</xref>
                    </sup>. However, this does not mean that OAA-NBC needs no improvement, as the overall sensitivity for the top-ranked COD per VA record is still below 80%. We also assessed sensitivity per COD; 
                    <xref ref-type="table" rid="T4">Table 4</xref> shows, for each VA dataset and algorithm, the sensitivity per cause for first-ranked predictions (the PHMRC Indian datasets are excluded to save space, as their results are similar to those for the PHMRC global datasets). The sensitivity values vary by VA dataset and cause for all of the algorithms; road and transport injuries and other injuries were the only causes that OAA-NBC predicted consistently well in four of the five VA datasets. However, for several causes the sensitivity of the OAA-NBC classifications was below 50% and, in some cases, 0% (four causes in the MDS dataset and two causes in the PHMRC &#x2013; Child global dataset). Sensitivity values are 0% for COD groups whose proportion of records is close to 1% in a VA dataset (the number of records for each COD in the VA datasets is shown in 
                    <xref ref-type="table" rid="T2">Table 2</xref>). In general, the algorithms&#x2019; performance varies across CODs and VA datasets. For example, in the MDS dataset, sensitivity was at or below 10% for HIV/AIDS, cancers, cardiovascular diseases, and chronic respiratory diseases for nearly every algorithm (in Table 4, the only exception is Tariff for cancers, at 16.7%). Algorithms such as OAA-NBC and NBC mostly achieve better sensitivity for COD groups that have a higher proportion of records in the training dataset. However, this is not always the case: sensitivity also depends on how distinguishable the VA records of a COD group are from those of all other COD groups. In the next section, we discuss the problem of imbalance within datasets and its effect on the algorithms&#x2019; classification accuracy.</p>
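                <p>The per-cause sensitivity discussed above can be sketched as follows. This is a minimal illustration, assuming that sensitivity for a cause is the percentage of records with that physician-assigned cause whose top-ranked prediction matches it; the function name and the labels are hypothetical and do not come from any of the VA datasets.</p>
                <code language="python">
```python
from collections import Counter


def sensitivity_per_cause(physician_cods, predicted_cods):
    """Return {cause: sensitivity in %} for top-ranked predictions.

    Sensitivity for a cause is the share of records with that
    physician-assigned cause whose top-ranked prediction matches it.
    """
    totals = Counter(physician_cods)
    hits = Counter(
        truth
        for truth, predicted in zip(physician_cods, predicted_cods)
        if truth == predicted
    )
    return {cause: 100.0 * hits[cause] / totals[cause] for cause in totals}


# Hypothetical labels, NOT records from any VA dataset in the article.
physician = ["acute respiratory"] * 4 + ["diarrhoeal"] * 3 + ["tuberculosis"]
predicted = [
    "acute respiratory", "acute respiratory", "acute respiratory", "diarrhoeal",
    "diarrhoeal", "diarrhoeal", "acute respiratory",
    "acute respiratory",
]
for cause, value in sensitivity_per_cause(physician, predicted).items():
    print(f"{cause}: {value:.1f}%")
```
                </code>
                <p>In this toy input the single tuberculosis record is assigned to the most common cause, so its sensitivity is 0%. This mirrors how a COD group with very few records can score 0% even when overall sensitivity is reasonable.</p>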
                <table-wrap id="T4" orientation="portrait" position="anchor">
                    <label>Table 4. </label>
                    <caption>
                        <title>Top-ranked (most likely) sensitivity scores per COD by VA dataset and algorithm, with physician-assigned COD distributions.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th colspan="1" rowspan="1"/>
                                <th colspan="1" rowspan="1"/>
                                <th align="left" colspan="17" rowspan="1" valign="top">Cause, sensitivity (%)</th>
                            </tr>
                            <tr>
                                <th align="left" colspan="1" rowspan="1" valign="top">VA Dataset</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Algorithm</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Acute
                                    <break/>respiratory</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">HIV/AIDS</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Diarrhoeal</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Tuberculosis</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Other &amp;
                                    <break/>unspecified
                                    <break/>infections</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Cancers</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Nutrition &amp;
                                    <break/>endocrine</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Cardiovascular
                                    <break/>diseases</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Chronic respiratory</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Liver cirrhosis</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Other NCDs</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Neonatal
                                    <break/>conditions</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Road &amp; transport
                                    <break/>injuries</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Other injuries</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Ill-defined</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Suicide</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Maternal</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="center" colspan="1" rowspan="6" valign="top">MDS</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">Physician
                                            <xref ref-type="other" rid="fn10">*</xref>
                                        </italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">27.7</italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">0.04</italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">22.2</italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">0.6</italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">20.6</italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">0.8</italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">3.0</italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">0.1</italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">0.2</italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">0.9</italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">11.0</italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">3.3</italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">0.8</italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">5.4</italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">3.2</italic>
	</bold>
</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">-</italic>
	</bold>
</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">-</italic>
	</bold>
</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">Tariff</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">36.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">10.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">47.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">42.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">19.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">16.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">31.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">5.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">23.9</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">3.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">11.2</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">84.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">57.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">25.7</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">InterVA-4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">78.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">55.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">51.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">43.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">9.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.9</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">8.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">3.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">29.2</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">1.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">15.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">70.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">71.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">10.4</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">InSilicoVA</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">61.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">55.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">50.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">32.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">42.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">21.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">6.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">13.9</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">82.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">69.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">63.3</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">NBC</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">74.9</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">70.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">31.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">46.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">4.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">41.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">1.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">18.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">22.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">15.2</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">73.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">80.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">49.0</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">OAA-NBC</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">85.2</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">78.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">17.9</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">51.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">25.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">4.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">23.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">11.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">79.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">80.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">25.7</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                            </tr>
                            <tr>
                                <td align="center" colspan="1" rowspan="6" valign="top">Matlab</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">Physician
                                            <xref ref-type="other" rid="fn10">*</xref>
                                        </italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">0.5</italic>
                                    </bold>
                                </td>
                                <td align="center" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">-</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">1.4</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">2.1</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">3.9</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">17.6</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">4.5</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">35.7</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">6.4</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">5.0</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">12.2</italic>
                                    </bold>
                                </td>
                                <td align="center" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">-</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">2.4</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">3.4</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">1.7</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">1.7</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">1.1</italic>
                                    </bold>
                                </td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">Tariff</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">15.0</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">53.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">55.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">15.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">41.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">61.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">38.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">79.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">50.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">9.9</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">57.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">51.2</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">13.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">70.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">16.7</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">InterVA-4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.0</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">26.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">51.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">29.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">48.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">21.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">32.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">42.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">61.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">7.4</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">81.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">37.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">70.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">15.0</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">InSilicoVA</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">20.0</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">50.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">34.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">11.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">17.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">34.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">47.9</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">71.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">53.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">8.2</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">91.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">19.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">13.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">86.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">8.3</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">NBC</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">10.0</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">21.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">42.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">15.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">55.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">43.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">64.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">66.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">57.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">20.0</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">83.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">15.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">21.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">76.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">5.0</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">OAA-NBC</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">20.0</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">51.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">30.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">7.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">67.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">38.9</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">75.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">75.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">53.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">23.8</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">96.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">39.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">2.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">75.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">5.0</td>
                            </tr>
                            <tr>
                                <td align="center" colspan="1" rowspan="6" valign="top">Agincourt</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">Physician
                                            <xref ref-type="other" rid="fn10">*</xref>
                                        </italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">1.9</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">34.5</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">1.1</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">11.8</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">7.4</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">4.2</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">1.2</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">6.5</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">0.5</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">1.5</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">3.8</italic>
                                    </bold>
                                </td>
                                <td align="center" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">-</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">3.8</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">6.3</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">12.2</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">2.1</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">1.0</italic>
                                    </bold>
                                </td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">Tariff</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">44.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">21.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">39.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">53.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">7.2</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">24.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">69.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">24.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">30.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">50.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">19.6</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">80.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">41.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">3.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">14.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">60.3</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">InterVA-4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">36.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">74.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">34.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">59.9</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">12.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">28.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">25.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">13.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">43.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">50.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">9.9</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">78.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">64.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">21.9</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">29.2</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">InSilicoVA</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">53.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">29.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">31.2</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">60.9</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">11.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">26.2</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">32.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">14.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">35.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">41.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">18.5</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">81.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">52.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">29.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">79.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">52.1</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">NBC</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">41.2</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">59.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">27.9</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">60.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">26.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">35.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">33.2</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">28.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">33.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">39.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">16.6</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">79.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">63.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">27.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">69.2</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">53.3</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">OAA-NBC</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">39.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">77.9</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">24.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">48.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">52.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">28.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">42.9</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">44.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">3.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">35.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">19.0</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">82.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">82.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">26.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">6.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">48.3</td>
                            </tr>
                            <tr>
                                <td align="center" colspan="1" rowspan="6" valign="top">PHMRC -
                                    <break/>Adult Global</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">Physician
                                            <xref ref-type="other" rid="fn10">*</xref>
                                        </italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">6.5</italic>
                                    </bold>
                                </td>
                                <td align="center" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">-</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">2.2</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">3.8</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">13.4</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">10.7</italic>
                                    </bold>
                                </td>
                                <td align="center" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">-</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">19.9</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">1.8</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">5.0</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">15.0</italic>
                                    </bold>
                                </td>
                                <td align="center" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">-</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">2.7</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">10.1</italic>
                                    </bold>
                                </td>
                                <td align="center" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">-</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">1.5</italic>
                                    </bold>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
                                    <bold>
                                        <italic toggle="yes">7.4</italic>
                                    </bold>
                                </td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">Tariff</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">26.0</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">28.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">47.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">26.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">48.7</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">30.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">19.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">64.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">5.8</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">64.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">37.8</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">22.9</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">89.9</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">InterVA-4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">14.5</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">5.9</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">14.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">45.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">47.7</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">32.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">45.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">87.2</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">13.0</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">29.2</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">40.8</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">25.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">61.8</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">InSilicoVA</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">16.1</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">36.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">22.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">27.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">39.4</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">25.2</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">32.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">46.9</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">13.1</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">76.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">59.0</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">35.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">80.3</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">NBC</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">26.7</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">31.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">30.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">40.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">60.0</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">49.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">41.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">60.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">21.3</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">61.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">69.6</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">35.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">84.1</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">OAA-NBC</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">22.7</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">22.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">20.3</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">52.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">64.2</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">64.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">27.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">62.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">26.3</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">59.7</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">74.3</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">18.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">90.1</td>
                            </tr>
                            <tr>
                                <td align="center" colspan="1" rowspan="6" valign="top">PHMRC -
                                    <break/>Child Global</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">Physician
                                            <xref ref-type="other" rid="fn10">*</xref>
                                        </italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">25.8</italic>
	</bold>
</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">-</italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">12.4</italic>
	</bold>
</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">-</italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">18.2</italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">1.4</italic>
	</bold>
</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">-</italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">3.7</italic>
	</bold>
</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">-</italic>
	</bold>
</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">-</italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">9.0</italic>
	</bold>
</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">-</italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">4.5</italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">15.7</italic>
	</bold>
</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">9.4</italic>
	</bold>
</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">-</italic>
	</bold>
</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">
	
                                    <bold>
		
                                        <italic toggle="yes">-</italic>
	</bold>
</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">Tariff</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">28.9</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">56.2</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">20.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">6.7</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">14.5</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">22.7</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">67.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">62.2</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">36.4</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">InterVA-4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">69.9</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">45.3</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">25.8</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">43.3</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">5.0</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">8.6</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">78.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">63.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">18.4</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">InSilicoVA</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">39.3</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">45.6</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">26.9</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">35.0</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">10.4</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">17.2</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">87.1</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">86.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">29.2</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">NBC</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">60.5</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">48.4</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">45.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">10.0</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">15.7</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">12.9</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">83.9</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">85.5</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">27.2</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1" valign="top">OAA-NBC</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">71.0</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">53.4</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">46.6</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.0</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.0</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">8.5</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">90.4</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">91.0</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">23.0</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="center" colspan="1" rowspan="1" valign="top">-</td>
                            </tr>
                        </tbody>
                    </table>
                    <table-wrap-foot>
                        <fn>
                            <p id="fn10">* Proportion of deaths assigned for each COD by physician(s) or clinical diagnoses (PHMRC)</p>
                        </fn>
                    </table-wrap-foot>
                </table-wrap>
            </sec>
        </sec>
        <sec sec-type="discussion">
            <title>Discussion</title>
            <p>Our approach (OAA-NBC) produces better population-level and individual-level agreement (sensitivity) across different VA surveys than the other algorithms. However, the overall sensitivity values are still in the range of 55&#x2013;61% and do not exceed 80% for the top-ranked COD assignments. There are several reasons for the low sensitivity values. First, each VA dataset is unique, with varying degrees of overlapping or differing symptoms. The symptom-cause information (SCI) was therefore unique to each VA dataset, and some algorithms could have had more trouble generating adequate SCI because of the logic employed by the algorithm itself and the VA data. This could help explain the low sensitivity scores by cause and per algorithm for the MDS data, which was one of the VA datasets with the fewest symptoms, and this could have affected the SCI used for COD assignment by the algorithms. In addition, some algorithms, such as InterVA-4 (when the input format is specified as following the WHO 2012 or 2014 VA instrument), require a set of predefined symptoms, or else prefer independent symptoms (e.g. had a fever) over dependent symptoms (e.g. fever lasted for a week) or interdependent symptoms (e.g. did s/he have diarrhoea and dysentery); the absence of such symptoms would also affect the algorithms&#x2019; ability to classify VA records correctly. One solution to this problem is to have better differentiating symptoms for each COD.</p>
            <p>One may argue that algorithms such as InterVA-4 and InSilicoVA (non-training option), which take a different input (a symptom list based on the WHO forms) for assigning CODs and do not need training on data, would be unfairly evaluated by using customized training. We therefore converted the symptoms in our datasets to WHO standardised names and evaluated InterVA-4 and InSilicoVA on the datasets. We used the same 10-fold cross validation method as in our earlier experiments, but we provided only the test set of each fold to the algorithms, which assigned causes of death based on the standardised symptom names. The output of these algorithms was one of the 63 standardised CODs. We mapped these 63 causes to our 17 CODs for a fair evaluation (see 
                <xref ref-type="table" rid="T6">Table 6</xref> for complete details on the mapping to the 17 COD categories). We observed that rank-one sensitivity remained between approximately 25% and 42% for InterVA-4, and between 20% and 43% for InSilicoVA, on all datasets. The use of pre-trained models on standardized VA data inputs did not yield better results than customized training on the datasets.</p>
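            <p>The mapping and scoring step described above can be sketched as follows. This is only an illustrative sketch, not the code used in the study: the lookup holds a handful of entries copied from the first rows of Table 6, and the function names, the default label for unmapped causes and the toy records are assumptions made for the example.</p>
            <code language="python" xml:space="preserve"><![CDATA[
```python
# Hypothetical excerpt of a WHO-label -> study-COD lookup. Only a few
# entries are shown; they are copied from the first rows of Table 6.
WHO_TO_STUDY_COD = {
    "Acute resp infect incl pneumonia": "Acute respiratory",
    "Neonatal pneumonia": "Acute respiratory",
    "Diarrhoeal diseases": "Diarrhoeal",
    "Pulmonary tuberculosis": "Pulmonary TB",
    "Malaria": "Other and unspecified infections",
    "Measles": "Other and unspecified infections",
}


def map_cause(who_label, lookup=WHO_TO_STUDY_COD, default="Unmapped"):
    """Collapse one WHO cause label to the study's shorter cause list."""
    return lookup.get(who_label, default)


def rank_one_sensitivity(true_cods, predicted_who_labels):
    """Share of records whose mapped top-ranked cause equals the true COD."""
    if len(true_cods) != len(predicted_who_labels):
        raise ValueError("inputs must have the same length")
    if not true_cods:
        raise ValueError("inputs must not be empty")
    hits = sum(
        1
        for truth, pred in zip(true_cods, predicted_who_labels)
        if map_cause(pred) == truth
    )
    return hits / len(true_cods)


# Toy example (invented records, for illustration only).
true_cods = ["Acute respiratory", "Diarrhoeal", "Pulmonary TB", "Diarrhoeal"]
predicted = [
    "Neonatal pneumonia",
    "Malaria",
    "Pulmonary tuberculosis",
    "Diarrhoeal diseases",
]
sensitivity = rank_one_sensitivity(true_cods, predicted)
```
]]></code>
            <p>Mapping the predicted labels before scoring means that an algorithm is not penalised for returning a finer-grained WHO cause that falls within the correct study category.</p>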
            <p>One may also argue for the use of more recent algorithm versions, such as InterVA-5, for assessments. Because the VA data used were captured prior to the release of the WHO 2016 forms, the resultant binary files would have many missing symptoms. Furthermore, InterVA-5 was only released for public use in September 2018. Although an enhanced algorithm may perform better because of the logic it employs, the VA data are also very relevant for performance. Since the VA data used in this study conformed better with the 2014 forms, we ran experiments using algorithms that were designed from the WHO 2014 VA forms or that do not require a specific input, for a fair comparison.</p>
            <p>VA datasets also differ in COD composition; some CODs have a large number of records, while others have few. This composition is highly imbalanced, which can bias any algorithm towards the CODs with a higher share of records in the training set. This implies that the overall agreement would most likely remain low for the algorithms in such cases. COD balancing can be performed by duplicating records for the minority CODs (CODs with the fewest records) or by decreasing the number of records for the majority CODs (CODs with the most records)
                <sup>
                    <xref ref-type="bibr" rid="ref-18">18</xref>
                </sup>. However, these types of artificial balancing approaches do not always yield improvements in results.</p>
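            <p>The two balancing strategies mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the cited reference: records are assumed to be dictionaries with a hypothetical <monospace>cod</monospace> key, every COD is brought to the size of the largest (oversampling) or smallest (undersampling) COD, and all function names are invented for the example.</p>
            <code language="python" xml:space="preserve"><![CDATA[
```python
import random
from collections import Counter, defaultdict


def group_by_cod(records):
    """Group records by their (hypothetical) 'cod' field."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["cod"]].append(rec)
    return groups


def cod_counts(records):
    """Number of records per COD."""
    return Counter(rec["cod"] for rec in records)


def oversample_minorities(records, seed=0):
    """Duplicate randomly chosen minority records until every COD
    has as many records as the largest COD."""
    rng = random.Random(seed)
    groups = group_by_cod(records)
    target = max(len(group) for group in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(group)
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


def undersample_majorities(records, seed=0):
    """Keep a random subset of each COD, sized to the smallest COD."""
    rng = random.Random(seed)
    groups = group_by_cod(records)
    target = min(len(group) for group in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(rng.sample(group, target))
    return balanced


# Toy training set (invented): six diarrhoeal and two pulmonary TB records.
records = [{"id": i, "cod": "Diarrhoeal"} for i in range(6)]
records += [{"id": 100 + i, "cod": "Pulmonary TB"} for i in range(2)]

over = oversample_minorities(records)
under = undersample_majorities(records)
```
]]></code>
            <p>In a cross-validation setting, balancing of this kind would normally be applied to the training folds only, so that the test fold keeps its original COD composition.</p>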
            <p>A further point for discussion relates to the distribution of CODs in the training and test datasets. In machine learning experiments, the proportion of records of each class (e.g., each COD) is usually kept the same in the training and test sets as in the original dataset
                <sup>
                    <xref ref-type="bibr" rid="ref-18">18</xref>
                </sup>. This allows for a fair evaluation of the algorithm; otherwise, too many VA records of a COD in the test set and too few in the training set would only result in poor performance of the algorithm for that COD. In real-life situations, when a machine learning application is in production, the training (historical) set may not contain all the variations, and the newly collected data may contain more variations of a particular COD. The common solution to this problem is to update the training data and re-train the algorithm to reflect newer SCI variations as they are observed
                <sup>
                    <xref ref-type="bibr" rid="ref-18">18</xref>
                </sup>. Nonetheless, to understand the effect of different COD compositions in the training and test sets, we performed another experiment using the Dirichlet distribution, which allowed us to vary the composition of records in the test set
                <sup>
                    <xref ref-type="bibr" rid="ref-28">28</xref>
                </sup>. Dirichlet distribution-based sampling models variability in the occurrence of classes (CODs) by resampling with replacement. We divided the dataset into 10 parts using the 10-fold cross validation method
                <sup>
                    <xref ref-type="bibr" rid="ref-18">18</xref>
                </sup> as in our experiments above. On each fold, we resampled the test set with replacement using the Dirichlet distribution
                <sup>
                    <xref ref-type="bibr" rid="ref-28">28</xref>
                </sup>, resulting in a different number of records for each COD. OAA-NBC, InterVA-4, Tariff, InSilicoVA and NBC were then evaluated on the resampled test set with a different distribution of CODs. The results are shown in 
                <xref ref-type="table" rid="T5">Table 5</xref> for the Matlab and MDS datasets. The overall classification performance decreased, as expected, because records of CODs with too few VA records in the actual training set were duplicated many times, in the new test set only, by the Dirichlet-based resampling. For example, if a record of a given COD is not classified correctly by an algorithm and it is repeated many times in the test set, then sensitivity will decrease for that COD. The overall performance of the algorithms therefore remains low. OAA-NBC and NBC still generally yield better performance than the other algorithms. We show results for these two datasets only, as the other VA datasets showed a similar dip in performance. An ideal training dataset would be a large repository of community VA deaths with enough variation in symptom patterns for each COD, all clinically verified; however, no such repository exists. The whole purpose of training on VA datasets is to help classify CODs in situations where deaths occur unattended.</p>
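            <p>One way to resample a test fold with a Dirichlet-distributed COD composition is sketched below. The exact procedure used in the experiment is not reproduced here; the sketch assumes a symmetric Dirichlet distribution with a hypothetical concentration parameter <monospace>alpha</monospace>, records stored as dictionaries with a hypothetical <monospace>cod</monospace> key, and a resampled fold of the same size as the original.</p>
            <code language="python" xml:space="preserve"><![CDATA[
```python
import random
from collections import defaultdict


def draw_dirichlet(k, alpha, rng):
    """Draw one point from a symmetric Dirichlet(alpha) distribution by
    normalising k independent gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def dirichlet_resample(test_records, alpha=1.0, seed=0):
    """Resample a test fold with replacement so that its COD composition
    follows fractions drawn from a Dirichlet distribution."""
    rng = random.Random(seed)
    by_cod = defaultdict(list)
    for rec in test_records:
        by_cod[rec["cod"]].append(rec)
    cods = sorted(by_cod)
    fractions = draw_dirichlet(len(cods), alpha, rng)
    # Pick the COD of each new record according to the drawn fractions,
    # then pick one record of that COD with replacement.
    chosen_cods = rng.choices(cods, weights=fractions, k=len(test_records))
    return [rng.choice(by_cod[cod]) for cod in chosen_cods]


# Toy test fold (invented): eight diarrhoeal and two pulmonary TB records.
test_fold = [{"id": i, "cod": "Diarrhoeal"} for i in range(8)]
test_fold += [{"id": 100 + i, "cod": "Pulmonary TB"} for i in range(2)]

resampled = dirichlet_resample(test_fold, seed=42)
```
]]></code>
            <p>Because a COD with only a few records can receive a large fraction, the same few records may appear many times in the resampled fold, which is the duplication effect described above.</p>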
            <table-wrap id="T5" orientation="portrait" position="anchor">
                <label>Table 5. </label>
                <caption>
                    <title>Comparison of cumulative sensitivity and cause-specific mortality fraction (CSMF) accuracy of rank 1 and 5 classifications using Dirichlet distribution on MDS and Matlab data.</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="center" colspan="1" rowspan="4" valign="bottom">Algorithm</th>
                            <th align="center" colspan="8" rowspan="1" valign="top">VA dataset, rank, cumulative sensitivity and CSMF accuracy (%)</th>
                        </tr>
                        <tr>
                            <th align="center" colspan="4" rowspan="1" valign="top">MDS</th>
                            <th align="center" colspan="4" rowspan="1" valign="top">Matlab</th>
                        </tr>
                        <tr>
                            <th align="center" colspan="2" rowspan="1" valign="top">Sensitivity</th>
                            <th align="center" colspan="2" rowspan="1" valign="top">CSMF accuracy</th>
                            <th align="center" colspan="2" rowspan="1" valign="top">Sensitivity</th>
                            <th align="center" colspan="2" rowspan="1" valign="top">CSMF accuracy</th>
                        </tr>
                        <tr>
                            <th align="center" colspan="1" rowspan="1">Rank 1</th>
                            <th align="center" colspan="1" rowspan="1">Rank 5</th>
                            <th align="center" colspan="1" rowspan="1">Rank 1</th>
                            <th align="center" colspan="1" rowspan="1">Rank 5</th>
                            <th align="center" colspan="1" rowspan="1">Rank 1</th>
                            <th align="center" colspan="1" rowspan="1">Rank 5</th>
                            <th align="center" colspan="1" rowspan="1">Rank 1</th>
                            <th align="center" colspan="1" rowspan="1">Rank 5</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="right" colspan="1" rowspan="1">Tariff</td>
                            <td align="right" colspan="1" rowspan="1">29.0</td>
                            <td align="right" colspan="1" rowspan="1">64.7</td>
                            <td align="right" colspan="1" rowspan="1">53.7</td>
                            <td align="right" colspan="1" rowspan="1">74.6</td>
                            <td align="right" colspan="1" rowspan="1">45.2</td>
                            <td align="right" colspan="1" rowspan="1">79.0</td>
                            <td align="right" colspan="1" rowspan="1">54.6</td>
                            <td align="right" colspan="1" rowspan="1">80.8</td>
                        </tr>
                        <tr>
                            <td align="right" colspan="1" rowspan="1">InterVA-4</td>
                            <td align="right" colspan="1" rowspan="1">33.6</td>
                            <td align="right" colspan="1" rowspan="1">63.9</td>
                            <td align="right" colspan="1" rowspan="1">49.4</td>
                            <td align="right" colspan="1" rowspan="1">70.7</td>
                            <td align="right" colspan="1" rowspan="1">33.4</td>
                            <td align="right" colspan="1" rowspan="1">71.5</td>
                            <td align="right" colspan="1" rowspan="1">51.6</td>
                            <td align="right" colspan="1" rowspan="1">75.1</td>
                        </tr>
                        <tr>
                            <td align="right" colspan="1" rowspan="1">InSilicoVA</td>
                            <td align="right" colspan="1" rowspan="1">38.1</td>
                            <td align="right" colspan="1" rowspan="1">75.9</td>
                            <td align="right" colspan="1" rowspan="1">57.2</td>
                            <td align="right" colspan="1" rowspan="1">80.5</td>
                            <td align="right" colspan="1" rowspan="1">37.7</td>
                            <td align="right" colspan="1" rowspan="1">81.4</td>
                            <td align="right" colspan="1" rowspan="1">59.4</td>
                            <td align="right" colspan="1" rowspan="1">85.8</td>
                        </tr>
                        <tr>
                            <td align="right" colspan="1" rowspan="1">NBC</td>
                            <td align="right" colspan="1" rowspan="1">41.7</td>
                            <td align="right" colspan="1" rowspan="1">74.7</td>
                            <td align="right" colspan="1" rowspan="1">60.4</td>
                            <td align="right" colspan="1" rowspan="1">79.6</td>
                            <td align="right" colspan="1" rowspan="1">38.7</td>
                            <td align="right" colspan="1" rowspan="1">73.7</td>
                            <td align="right" colspan="1" rowspan="1">57.6</td>
                            <td align="right" colspan="1" rowspan="1">76.7</td>
                        </tr>
                        <tr>
                            <td align="right" colspan="1" rowspan="1">OAA-NBC</td>
                            <td align="right" colspan="1" rowspan="1">41.0</td>
                            <td align="right" colspan="1" rowspan="1">75.0</td>
                            <td align="right" colspan="1" rowspan="1">59.8</td>
                            <td align="right" colspan="1" rowspan="1">79.2</td>
                            <td align="right" colspan="1" rowspan="1">45.6</td>
                            <td align="right" colspan="1" rowspan="1">86.2</td>
                            <td align="right" colspan="1" rowspan="1">60.4</td>
                            <td align="right" colspan="1" rowspan="1">88.3</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
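            <p>For readers who wish to reproduce metrics of the kind reported in Table 5, the sketch below shows one common way to compute them. It assumes that cumulative sensitivity at rank k counts a record as correct when the true COD appears among the k highest-ranked predictions, and that CSMF accuracy is one minus the summed absolute difference between true and predicted cause fractions, divided by twice one minus the smallest true fraction. CSMF accuracy is computed here from rank-one predictions only; the function names and toy data are hypothetical.</p>
            <code language="python" xml:space="preserve"><![CDATA[
```python
from collections import Counter


def cumulative_sensitivity(true_cods, ranked_predictions, k):
    """Share of records whose true COD is among the top-k ranked causes."""
    if not true_cods:
        raise ValueError("inputs must not be empty")
    hits = sum(
        1
        for truth, ranked in zip(true_cods, ranked_predictions)
        if truth in ranked[:k]
    )
    return hits / len(true_cods)


def csmf_accuracy(true_cods, predicted_cods):
    """CSMF accuracy from true and rank-one predicted causes."""
    if not true_cods:
        raise ValueError("inputs must not be empty")
    n = len(true_cods)
    true_frac = {cod: count / n for cod, count in Counter(true_cods).items()}
    if len(true_frac) == 1:
        raise ValueError("CSMF accuracy is undefined for a single true cause")
    pred_counts = Counter(predicted_cods)
    causes = set(true_frac) | set(pred_counts)
    abs_error = sum(
        abs(true_frac.get(cod, 0.0) - pred_counts.get(cod, 0) / n)
        for cod in causes
    )
    return 1.0 - abs_error / (2.0 * (1.0 - min(true_frac.values())))


# Toy example (invented): four deaths, each with two ranked predictions.
true_cods = ["Diarrhoeal", "Diarrhoeal", "Pulmonary TB", "Acute respiratory"]
ranked = [
    ["Diarrhoeal", "Pulmonary TB"],
    ["Pulmonary TB", "Diarrhoeal"],
    ["Acute respiratory", "Pulmonary TB"],
    ["Acute respiratory", "Diarrhoeal"],
]
rank1 = cumulative_sensitivity(true_cods, ranked, k=1)
rank2 = cumulative_sensitivity(true_cods, ranked, k=2)
accuracy = csmf_accuracy(true_cods, [r[0] for r in ranked])
```
]]></code>
            <p>In this toy example the population-level measure is more forgiving than the individual-level one: two misclassified deaths partly cancel each other out in the cause fractions.</p>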
            <table-wrap id="T6" orientation="portrait" position="anchor">
                <label>Table 6. </label>
                <caption>
                    <title>Complete mapping of ICD-10 and WHO cause labels to the cause list used for performance assessments.</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1" valign="top">No.</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">Cause of Death</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">WHO list of Causes</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">ICD-10 Range</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">1</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Acute respiratory</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Acute resp infect incl pneumonia, Neonatal
                                <break/>pneumonia</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">H65-H68, H70-H71, J00-J22, J32, J36,
                                <break/>J85-J86, P23</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">HIV/AIDS</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                            <td colspan="1" rowspan="1"/>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">3</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Diarrhoeal</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Diarrhoeal diseases</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">A00-A09</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">4</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Pulmonary TB</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Pulmonary tuberculosis</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">A15-A16, B90, J65</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">5</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Other and
                                <break/>unspecified
                                <break/>infections</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Sepsis (non-obstetric), HIV/AIDS related death,
                                <break/>Malaria, Measles, Meningitis and encephalitis,
                                <break/>Tetanus, Pertussis, Haemorrhagic fever, Other and
                                <break/>unspecified infect dis, Neonatal sepsis</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">A17-A33, A35-A99, B00-B17, B19-B89,
                                <break/>B91-B99, C46, D64, D84, G00-G09,
                                <break/>H10, H60, I30, I32-I33, K02, K04-K05,
                                <break/>K61, K65, K67, K81, L00-L04, L08,
                                <break/>M00-M01, M60, M86, N10, N30, N34,
                                <break/>N41, N49, N61, N70-N74, P35-P39,
                                <break/>R50, R75, ZZ21</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">6</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Neoplasms
                                <break/>(cancer)</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Oral neoplasms, Digestive neoplasms, Respiratory
                                <break/>neoplasms, Breast neoplasms, Reproductive
                                <break/>neoplasms MF, Other and unspecified neoplasms</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">C00-C26, C30-C45, C47-C58, C60-C97,
                                <break/>D00-D48, D91, N60, N62-N64, N87, R59</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">7</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Nutrition and
                                <break/>endocrine</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Severe anaemia, Severe malnutrition</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">D50-D53, E00-E02, E40-E46, E50-E64,
                                <break/>X53-X54</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">8</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Cardiovascular
                                <break/>Diseases (CVD)</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Diabetes mellitus, Acute cardiac disease, Stroke,
                                <break/>Other and unspecified cardiac dis</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">E10-E14, G45-G46, G81-G83, I60-I69,
                                <break/>I00-I03, I05-I15, I26-I28, I31, I34-I52,
                                <break/>I70-I99, R00-R01, R03, ZZ23</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">9</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Chronic
                                <break/>respiratory</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Chronic obstructive pulmonary dis, Asthma</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">J30-J31, J33-J35, J37-J64, J66-J84,
                                <break/>J90-J99, R04-R06, R84, R91</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">10</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Liver cirrhosis</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Liver cirrhosis</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">B18, F10, K70-K77, R16-R18, X45, Y15,
                                <break/>Y90-Y91</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">11</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Other non-
                                <break/>communicable
                                <break/>diseases</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Sickle cell with crisis, Acute abdomen, Renal
                                <break/>failure, Epilepsy, Congenital malformation, Other
                                <break/>and unspecified, Other and unspecified NCD</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">D55-D63, D65-D83, D86, D89, E03-E07,
                                <break/>E15-E35, E65-E68, E70-E90, F00-F09,
                                <break/>F11-F52, F54-F99, G10-G37, G40-G41,
                                <break/>G50-G80, G84-G99, H00-H06, H11-H59,
                                <break/>H61-H62, H69, H72-H95, K00-K01, K03,
                                <break/>K06-K14, K20-K31, K35-K38, K40-K60,
                                <break/>K62-K64, K66, K78-K80, K82-K93, L05,
                                <break/>L10-L99, M02-M54, M61-M85, M87-M99,
                                <break/>N00-N08, N11-N29, N31-N33, N35-N40,
                                <break/>N42-N48, N50-N59, N75-N86, N88-N99,
                                <break/>Q00-Q99, R10-R15, R19-R23, R26-R27,
                                <break/>R29-R49, R56, R63, R70-R74, R76-R77,
                                <break/>R80-R82, R85-R87, R90, ZZ25</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">12</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Neonatal
                                <break/>conditions</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Cause of death unknown, Prematurity, Birth
                                <break/>asphyxia, Other and unspecified neonatal CoD</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">C76, D64, G40, O60, P00, P01, P02-P03,
                                <break/>P05, P07, P10-P15, P21, P22, P24-P29,
                                <break/>P50-P52, P61, P77, P80, P90-P92, R04,
                                <break/>R06, Q00-Q99, W79, Z37</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">13</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Road and
                                <break/>transport injuries
                                <break/>(RTI)</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Road traffic accident, Other transport accident</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">V01-V99, Y85</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">14</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Other injuries</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Accid fall, Accid drowning and submersion,
                                <break/>Accid expos to smoke fire &amp; flame, Contact with
                                <break/>venomous plant/animal, Accid poisoning &amp; noxious
                                <break/>subs, Assault, Exposure to force of nature, Other
                                <break/>and unspecified external CoD</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">S00-S99, T00-T99, W00-W99, X00-X44,
                                <break/>X46-X52, X55-X59, X85-X99, Y00-Y14,
                                <break/>Y16-Y84, Y86-Y89, Y92-Y98, ZZ27</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">15</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Ill-defined</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">NA</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">P96, R02, R07-R09, R25, R51-R54,
                                <break/>R57-R58, R60-R62, R64-R69, R78-R79,
                                <break/>R83, R89, R92-R94, R96, R98-R99</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">16</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Suicide</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Intentional self-harm</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">X60-X84</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">17</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Maternal</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Ectopic pregnancy, Abortion-related death,
                                <break/>Pregnancy-induced hypertension, Obstetric
                                <break/>haemorrhage, Obstructed labour, Pregnancy-
                                <break/>related sepsis, Anaemia of pregnancy, Ruptured
                                <break/>uterus, Other and unspecified maternal CoD, Not
                                <break/>pregnant or recently delivered, Pregnancy ended
                                <break/>within 6 weeks of death, Pregnant at death, Birth
                                <break/>asphyxia, Fresh stillbirth, Macerated stillbirth</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">A34, F53, O00-O08, O10-O16, O20-O99</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
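            <p>The range notation in the table above can be applied mechanically to look up the cause for a given ICD-10 code. The following is a minimal sketch, not the study's own code: it copies only rows 3, 4, 13 and 16 of the table, and the names CAUSE_RANGES, parse_ranges and cause_for are hypothetical. Some codes (for example D64) appear in more than one row of the full table; the sketch simply returns the first match and does not attempt to resolve such overlaps.</p>
            <preformat>
```python
import re

# Excerpt of the mapping table (rows 3, 4, 13 and 16 only), copied as range strings.
CAUSE_RANGES = {
    "Diarrhoeal": "A00-A09",
    "Pulmonary TB": "A15-A16, B90, J65",
    "Road and transport injuries (RTI)": "V01-V99, Y85",
    "Suicide": "X60-X84",
}


def parse_ranges(spec):
    """Turn 'A15-A16, B90' into [('A15', 'A16'), ('B90', 'B90')]."""
    pairs = []
    for part in spec.split(","):
        bounds = part.strip().split("-")
        pairs.append((bounds[0], bounds[-1]))
    return pairs


def cause_for(icd_code, table=CAUSE_RANGES):
    """Return the first cause whose range contains the code's 3-character category."""
    category = icd_code.strip().upper()[:3]
    if re.fullmatch(r"[A-Z][0-9]{2}", category) is None:
        return None  # not a recognisable ICD-10 category
    for cause, spec in table.items():
        for low, high in parse_ranges(spec):
            # Letter-plus-two-digit strings of equal length sort correctly as text.
            if high >= category >= low:
                return cause
    return None


print(cause_for("A09.0"))  # Diarrhoeal
print(cause_for("j65"))    # Pulmonary TB
print(cause_for("Z99"))    # None (not covered by this excerpt)
```
            </preformat>
            <p>Comparing three-character categories as plain strings works here because every range in the excerpt stays within one letter; a fuller implementation would also need the age-group context that the table itself does not encode.</p>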
            <p>Finally, the performance of a machine learning algorithm depends on its underlying logic and on the VA data, specifically on whether the data yield SCI adequate to discriminate between classes (CODs). To mitigate the effects of applying one set of training data to all VA data, we trained each algorithm on data from the same dataset on which it was tested, using 10-fold cross-validation. Consequently, only the SCI generated from a given VA dataset was considered when the algorithms classified deaths in that dataset. For the most part, the algorithms performed consistently, with OAA-NBC performing better the majority of the time. Our results are reproducible; all of the scripts used and sample datasets are publicly available (see Experimental Setup section).</p>
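            <p>The within-dataset 10-fold cross-validation described above can be illustrated with a short sketch. It is not taken from the published scripts: the function name k_fold_indices, the dataset names and the dataset sizes are all hypothetical, and the classifier itself is left as a comment. The point shown is only that training and test records are always drawn from the same dataset and that each record is tested exactly once.</p>
            <preformat>
```python
import random


def k_fold_indices(n_records, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    order = list(range(n_records))
    random.Random(seed).shuffle(order)  # fixed seed keeps the split repeatable
    folds = [order[i::k] for i in range(k)]  # k nearly equal, disjoint folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical dataset names and sizes, for illustration only.
dataset_sizes = {"dataset_a": 25, "dataset_b": 103}

for name, n_records in dataset_sizes.items():
    n_tested = 0
    for train, test in k_fold_indices(n_records):
        # A classifier would be fitted on `train` and evaluated on `test` here;
        # both index lists refer to records of the same dataset.
        assert not set(train).intersection(test)
        n_tested += len(test)
    print(name, "records tested once each:", n_tested == n_records)
```
            </preformat>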
        </sec>
        <sec sec-type="conclusions">
            <title>Conclusion</title>
            <p>In this study, we enhanced the NBC algorithm using the one-against-all approach to assign CODs to records in multiple VA datasets from different settings. The results show that our approach achieves 6-8% better sensitivity and PCCC for individual-level COD classification than some of the current best-performing computer-coded VA algorithms (i.e., Tariff, InterVA-4, NBC and InSilicoVA). Population-level agreement for OAA-NBC and NBC was similar to or higher than that of the other algorithms used in the experiments. Overall, cumulative sensitivity, PCCC and CSMF accuracy scores show that, among the leading algorithms compared, OAA-NBC produces the classifications most similar to dual physician and clinical diagnostic COD assignments. The Wilcoxon signed-rank test indicates that these performance results are not due to chance.</p>
            <p>Thus, we conclude that using the one-against-all approach with NBC helped improve the accuracy of COD classification. The one-against-all approach (and other ensemble methods of machine learning) can also be used with other VA algorithms, not just with Na&#x00ef;ve Bayes. Although OAA-NBC generates the highest cumulative CSMF accuracy values, it still requires improvement to produce the most accurate COD classifications, especially at the individual level, where performance remains below 80%. In the future, we plan to extend this work to include the narratives present in the VA surveys for automated classification. Another endeavour would be to apply the one-against-all approach to the other algorithms to determine whether they can be improved further to classify community VA deaths more similarly to dual physician review.</p>
        </sec>
        <sec>
            <title>Data availability</title>
            <p>Some of the data used in the analysis have already been made available, specifically the PHMRC data, which can be found at: 
                <ext-link ext-link-type="uri" xlink:href="http://ghdx.healthdata.org/record/population-health-metrics-research-consortium-gold-standard-verbal-autopsy-data-2005-2011">http://ghdx.healthdata.org/record/population-health-metrics-research-consortium-gold-standard-verbal-autopsy-data-2005-2011</ext-link>.</p>
            <p>The other datasets are included with the source code: 
                <ext-link ext-link-type="uri" xlink:href="https://github.com/sshahriyar/va">https://github.com/sshahriyar/va</ext-link> (archived at 
                <ext-link ext-link-type="uri" xlink:href="https://dx.doi.org/10.5281/zenodo.1489267">https://doi.org/10.5281/zenodo.1489267</ext-link>
                <sup>
                    <xref ref-type="bibr" rid="ref-27">27</xref>
                </sup>).</p>
        </sec>
        <sec>
            <title>Software availability</title>
            <p>
                <bold>Source code available from:</bold> 
                <ext-link ext-link-type="uri" xlink:href="https://github.com/sshahriyar/va">https://github.com/sshahriyar/va</ext-link>
            </p>
            <p>
                <bold>Archived source code at time of publication:</bold> 
                <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.1489267">https://doi.org/10.5281/zenodo.1489267</ext-link>
                <sup>
                    <xref ref-type="bibr" rid="ref-27">27</xref>
                </sup>.</p>
            <p>
                <bold>License:</bold> 
                <ext-link ext-link-type="uri" xlink:href="https://opensource.org/licenses/MIT">MIT License</ext-link>.</p>
        </sec>
    </body>
    <back>
        <ref-list>
            <ref id="ref-1">
                <label>1</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Jha</surname>
                            <given-names>P</given-names>
                        </name>
			</person-group>:
                    <article-title>Reliable direct measurement of causes of death in low- and middle-income countries.</article-title>
                    <source>
				
                        <italic toggle="yes">BMC Med.</italic>
			</source>
                    <year>2014</year>;<volume>12</volume>(<issue>1</issue>):<fpage>19</fpage>.
                    <pub-id pub-id-type="pmid">24495839</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1741-7015-12-19</pub-id>
                    <pub-id pub-id-type="pmcid">3912491</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-2">
                <label>2</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <collab>United Nations General Assembly</collab>
			</person-group>:
                    <article-title>Transforming our world: the 2030 Agenda for Sustainable Development.</article-title> New York: United Nations; <year>2015</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.un.org/ga/search/view_doc.asp?symbol=A/RES/70/1&amp;Lang=E">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-3">
                <label>3</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <collab>World Health Organization</collab>
			</person-group>:
                    <article-title>International Statistical Classification of Diseases and Related Health Problems.</article-title> ICD-10: World Health Organization; <year>2012</year>.</mixed-citation>
            </ref>
            <ref id="ref-4">
                <label>4</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Setel</surname>
                            <given-names>PW</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Sankoh</surname>
                            <given-names>O</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Rao</surname>
                            <given-names>C</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Sample registration of vital events with verbal autopsy: a renewed commitment to measuring and monitoring vital statistics.</article-title>
                    <source>
				
                        <italic toggle="yes">Bull World Health Organ.</italic>
			</source>
                    <year>2005</year>;<volume>83</volume>(<issue>8</issue>):<fpage>611</fpage>&#x2013;<lpage>7</lpage>.
                    <pub-id pub-id-type="pmid">16184280</pub-id>
                    <pub-id pub-id-type="pmcid">2626308</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-5">
                <label>5</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Fottrell</surname>
                            <given-names>E</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Byass</surname>
                            <given-names>P</given-names>
                        </name>
			</person-group>:
                    <article-title>Verbal autopsy: methods in transition.</article-title>
                    <source>
				
                        <italic toggle="yes">Epidemiol Rev.</italic>
			</source>
                    <year>2010</year>;<volume>32</volume>(<issue>1</issue>):<fpage>38</fpage>&#x2013;<lpage>55</lpage>.
                    <pub-id pub-id-type="pmid">20203105</pub-id>
                    <pub-id pub-id-type="doi">10.1093/epirev/mxq003</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-6">
                <label>6</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>James</surname>
                            <given-names>SL</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Flaxman</surname>
                            <given-names>AD</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Murray</surname>
                            <given-names>CJ</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Performance of the Tariff Method: validation of a simple additive algorithm for analysis of verbal autopsies.</article-title>
                    <source>
				
                        <italic toggle="yes">Popul Health Metr.</italic>
			</source>
                    <year>2011</year>;<volume>9</volume>(<issue>1</issue>):<fpage>31</fpage>.
                    <pub-id pub-id-type="pmid">21816107</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1478-7954-9-31</pub-id>
                    <pub-id pub-id-type="pmcid">3160924</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-7">
                <label>7</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Byass</surname>
                            <given-names>P</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Chandramohan</surname>
                            <given-names>D</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Clark</surname>
                            <given-names>SJ</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Strengthening standardised interpretation of verbal autopsy data: the new InterVA-4 tool.</article-title>
                    <source>
				
                        <italic toggle="yes">Glob Health Action.</italic>
			</source>
                    <year>2012</year>;<volume>5</volume>:<fpage>1</fpage>&#x2013;<lpage>8</lpage>.
                    <pub-id pub-id-type="pmid">22944365</pub-id>
                    <pub-id pub-id-type="doi">10.3402/gha.v5i0.19281</pub-id>
                    <pub-id pub-id-type="pmcid">3433652</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-8">
                <label>8</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>McCormick</surname>
                            <given-names>TH</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Li</surname>
                            <given-names>Z</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Calvert</surname>
                            <given-names>C</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Probabilistic cause-of-death assignment using verbal autopsies.</article-title>
                    <source>
				
                        <italic toggle="yes">arXiv preprint arXiv:1411.3042.</italic>
			</source>
                    <year>2014</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/pdf/1411.3042.pdf">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-9">
                <label>9</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Flaxman</surname>
                            <given-names>AD</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Vahdatpour</surname>
                            <given-names>A</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Green</surname>
                            <given-names>S</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Random forests for verbal autopsy analysis: multisite validation study using clinical diagnostic gold standards.</article-title>
                    <source>
				
                        <italic toggle="yes">Popul Health Metr.</italic>
			</source>
                    <year>2011</year>;<volume>9</volume>:<fpage>29</fpage>.
                    <pub-id pub-id-type="pmid">21816105</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1478-7954-9-29</pub-id>
                    <pub-id pub-id-type="pmcid">3160922</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-10">
                <label>10</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>King</surname>
                            <given-names>G</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Lu</surname>
                            <given-names>Y</given-names>
                        </name>
			</person-group>:
                    <article-title>Verbal autopsy methods with multiple causes of death.</article-title>
                    <source>
				
                        <italic toggle="yes">Stat Sci.</italic>
			</source>
                    <year>2008</year>;<volume>23</volume>(<issue>1</issue>):<fpage>78</fpage>&#x2013;<lpage>91</lpage>.
                    <pub-id pub-id-type="doi">10.1214/07-STS247</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-11">
                <label>11</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Miasnikof</surname>
                            <given-names>P</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Giannakeas</surname>
                            <given-names>V</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Gomes</surname>
                            <given-names>M</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Naive Bayes classifiers for verbal autopsies: comparison to physician-based classification for 21,000 child and adult deaths.</article-title>
                    <source>
				
                        <italic toggle="yes">BMC Med.</italic>
			</source>
                    <year>2015</year>;<volume>13</volume>(<issue>1</issue>):<fpage>286</fpage>.
                    <pub-id pub-id-type="pmid">26607695</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s12916-015-0521-2</pub-id>
                    <pub-id pub-id-type="pmcid">4660822</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-12">
                <label>12</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Murray</surname>
                            <given-names>CJ</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Lozano</surname>
                            <given-names>R</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Flaxman</surname>
                            <given-names>AD</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Using verbal autopsy to measure causes of death: the comparative performance of existing methods.</article-title>
                    <source>
				
                        <italic toggle="yes">BMC Med.</italic>
			</source>
                    <year>2014</year>;<volume>12</volume>(<issue>1</issue>):<fpage>5</fpage>.
                    <pub-id pub-id-type="pmid">24405531</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1741-7015-12-5</pub-id>
                    <pub-id pub-id-type="pmcid">3891983</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-13">
                <label>13</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Byass</surname>
                            <given-names>P</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Huong</surname>
                            <given-names>DL</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Minh</surname>
                            <given-names>HV</given-names>
                        </name>
			</person-group>:
                    <article-title>A probabilistic approach to interpreting verbal autopsies: methodology and preliminary validation in Vietnam.</article-title>
                    <source>
				
                        <italic toggle="yes">Scand J Public Health Suppl.</italic>
			</source>
                    <year>2003</year>;<volume>31</volume>(<issue>62 suppl</issue>):<fpage>32</fpage>&#x2013;<lpage>7</lpage>.
                    <pub-id pub-id-type="pmid">14649636</pub-id>
                    <pub-id pub-id-type="doi">10.1080/14034950310015086</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-14">
                <label>14</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Serina</surname>
                            <given-names>P</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Riley</surname>
                            <given-names>I</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Stewart</surname>
                            <given-names>A</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Improving performance of the Tariff Method for assigning causes of death to verbal autopsies.</article-title>
                    <source>
				
                        <italic toggle="yes">BMC Med.</italic>
			</source>
                    <year>2015</year>;<volume>13</volume>(<issue>1</issue>):<fpage>291</fpage>.
                    <pub-id pub-id-type="pmid">26644140</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s12916-015-0527-9</pub-id>
                    <pub-id pub-id-type="pmcid">4672473</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-15">
                <label>15</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Desai</surname>
                            <given-names>N</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Aleksandrowicz</surname>
                            <given-names>L</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Miasnikof</surname>
                            <given-names>P</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Performance of four computer-coded verbal autopsy methods for cause of death assignment compared with physician coding on 24,000 deaths in low- and middle-income countries.</article-title>
                    <source>
				
                        <italic toggle="yes">BMC Med.</italic>
			</source>
                    <year>2014</year>;<volume>12</volume>(<issue>1</issue>):<fpage>20</fpage>.
                    <pub-id pub-id-type="pmid">24495855</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1741-7015-12-20</pub-id>
                    <pub-id pub-id-type="pmcid">3912488</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-16">
                <label>16</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Garenne</surname>
                            <given-names>M</given-names>
                        </name>
			</person-group>:
                    <article-title>Prospects for automated diagnosis of verbal autopsies.</article-title>
                    <source>
				
                        <italic toggle="yes">BMC Med.</italic>
			</source>
                    <year>2014</year>;<volume>12</volume>(<issue>1</issue>):<fpage>18</fpage>.
                    <pub-id pub-id-type="pmid">24495788</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1741-7015-12-18</pub-id>
                    <pub-id pub-id-type="pmcid">3912493</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-17">
                <label>17</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Mitchell</surname>
                            <given-names>TM</given-names>
                        </name>
			</person-group>:
                    <article-title>Machine learning.</article-title> WCB/McGraw-Hill, Boston, MA;<year>1997</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://profsite.um.ac.ir/~monsefi/machine-learning/pdf/Machine-Learning-Tom-Mitchell.pdf">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-18">
                <label>18</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Witten</surname>
                            <given-names>IH</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Frank</surname>
                            <given-names>E</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Hall</surname>
                            <given-names>MA</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Data Mining: Practical machine learning tools and techniques.</article-title>Morgan Kaufmann;<year>2016</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://books.google.co.in/books?id=1SylCgAAQBAJ&amp;printsec=frontcover&amp;dq=Data+Mining:+Practical+machine+learning+tools+and+techniques+2016&amp;hl=en&amp;sa=X&amp;ved=0ahUKEwignePL6uTeAhUIO48KHfTKDmgQ6AEIJzAA#v=onepage&amp;q&amp;f=false">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-19">
                <label>19</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Polat</surname>
                            <given-names>K</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>G&#x00fc;ne&#x015f;</surname>
                            <given-names>S</given-names>
                        </name>
			</person-group>:
                    <article-title>A novel hybrid intelligent method based on C4.5 decision tree classifier and one-against-all approach for multi-class classification problems.</article-title>
                    <source>
				
                        <italic toggle="yes">Expert Syst Appl.</italic>
			</source>
                    <year>2009</year>;<volume>36</volume>(<issue>2</issue>):<fpage>1587</fpage>&#x2013;<lpage>92</lpage>.
                    <pub-id pub-id-type="doi">10.1016/j.eswa.2007.11.051</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-20">
                <label>20</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Aleksandrowicz</surname>
                            <given-names>L</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Malhotra</surname>
                            <given-names>V</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Dikshit</surname>
                            <given-names>R</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Performance criteria for verbal autopsy-based systems to estimate national causes of death: development and application to the Indian Million Death Study.</article-title>
                    <source>
				
                        <italic toggle="yes">BMC Med.</italic>
			</source>
                    <year>2014</year>;<volume>12</volume>(<issue>1</issue>):<fpage>21</fpage>.
                    <pub-id pub-id-type="pmid">24495287</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1741-7015-12-21</pub-id>
                    <pub-id pub-id-type="pmcid">3912490</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-21">
                <label>21</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Kahn</surname>
                            <given-names>K</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Collinson</surname>
                            <given-names>MA</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>G&#x00f3;mez-Oliv&#x00e9;</surname>
                            <given-names>FX</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Profile: Agincourt health and socio-demographic surveillance system.</article-title>
                    <source>
				
                        <italic toggle="yes">Int J Epidemiol.</italic>
			</source>
                    <year>2012</year>;<volume>41</volume>(<issue>4</issue>):<fpage>988</fpage>&#x2013;<lpage>1001</lpage>.
                    <pub-id pub-id-type="pmid">22933647</pub-id>
                    <pub-id pub-id-type="doi">10.1093/ije/dys115</pub-id>
                    <pub-id pub-id-type="pmcid">3429877</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-22">
                <label>22</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Streatfield</surname>
                            <given-names>P</given-names>
                        </name>
			</person-group>:
                    <article-title>Health and Demographic Surveillance System-Matlab: Registration of health and demographic events 2003</article-title>. International Center for Diarrheal Disease Research.<year>2005</year>.</mixed-citation>
            </ref>
            <ref id="ref-23">
                <label>23</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Byass</surname>
                            <given-names>P</given-names>
                        </name>
			</person-group>:
                    <article-title>Usefulness of the Population Health Metrics Research Consortium gold standard verbal autopsy data for general verbal autopsy methods.</article-title>
                    <source>
				
                        <italic toggle="yes">BMC Med.</italic>
			</source>
                    <year>2014</year>;<volume>12</volume>(<issue>1</issue>):<fpage>23</fpage>.
                    <pub-id pub-id-type="pmid">24495341</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1741-7015-12-23</pub-id>
                    <pub-id pub-id-type="pmcid">3912496</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-24">
                <label>24</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Murray</surname>
                            <given-names>CJ</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Lopez</surname>
                            <given-names>AD</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Black</surname>
                            <given-names>R</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Population Health Metrics Research Consortium gold standard verbal autopsy validation study: design, implementation, and development of analysis datasets.</article-title>
                    <source>
				
                        <italic toggle="yes">Popul Health Metr.</italic>
			</source>
                    <year>2011</year>;<volume>9</volume>(<issue>1</issue>):<fpage>27</fpage>.
                    <pub-id pub-id-type="pmid">21816095</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1478-7954-9-27</pub-id>
                    <pub-id pub-id-type="pmcid">3160920</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-25">
                <label>25</label>
                <mixed-citation publication-type="journal">
                    <collab>WHO</collab>:
                    <article-title>International statistical classification of diseases and related health problems</article-title>.<year>2009</year>.</mixed-citation>
            </ref>
            <ref id="ref-26">
                <label>26</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Murray</surname>
                            <given-names>CJ</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Lozano</surname>
                            <given-names>R</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Flaxman</surname>
                            <given-names>AD</given-names>
                        </name>
				
                        <etal/>
			</person-group>:
                    <article-title>Robust metrics for assessing the performance of different verbal autopsy cause assignment methods in validation studies.</article-title>
                    <source>
				
                        <italic toggle="yes">Popul Health Metr.</italic>
			</source>
                    <year>2011</year>;<volume>9</volume>:<fpage>28</fpage>.
                    <pub-id pub-id-type="pmid">21816106</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1478-7954-9-28</pub-id>
                    <pub-id pub-id-type="pmcid">3160921</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-27">
                <label>27</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>sshahriyar</surname>
                        </name>
			</person-group>:
                    <article-title>sshahriyar/va: OAA-NBC and Experiments (Version 0.0.1).</article-title>
                    <source>
				
                        <italic toggle="yes">Zenodo.</italic>
			</source>
                    <year>2018</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.doi.org/10.5281/zenodo.1489268">http://www.doi.org/10.5281/zenodo.1489268</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-28">
                <label>28</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
				
                        <name name-style="western">
                            <surname>Frigyik</surname>
                            <given-names>BA</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Kapila</surname>
                            <given-names>A</given-names>
                        </name>
				
                        <name name-style="western">
                            <surname>Gupta</surname>
                            <given-names>MR</given-names>
                        </name>
			</person-group>:
                    <article-title>Introduction to the Dirichlet Distribution and Related Processes.</article-title>Technical Report UWEETR-2010-0006, Department of Electrical Engineering, University of Washington.<year>2010</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://vannevar.ece.uw.edu/techsite/papers/documents/UWEETR-2010-0006.pdf">Reference Source</ext-link>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report26787">
        <front-stub>
            <article-id pub-id-type="doi">10.21956/gatesopenres.13987.r26787</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Lu</surname>
                        <given-names>Ying</given-names>
                    </name>
                    <xref ref-type="aff" rid="r26787a1">1</xref>
                    <role>Referee</role>
                </contrib>
                <aff id="r26787a1">
                    <label>1</label>Department of Applied Statistics, Social Sciences and Humanities, Steinhardt School of Education, Culture and Human Development, New York University, New York, NY, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>3</day>
                <month>1</month>
                <year>2019</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2019 Lu Y</copyright-statement>
                <copyright-year>2019</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport26787" related-article-type="peer-reviewed-article" xlink:href="10.12688/gatesopenres.12891.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>First I would like to congratulate the authors for developing an effective solution to&#x00a0;the verbal autopsy classification problem. The results look very convincing, and the rationale of the methods seems to be reasonable. The source code is open-access.&#x00a0;</p>
            <p> I have several&#x00a0;questions / suggestions for&#x00a0;the authors: 
                <list list-type="order">
                    <list-item>
                        <p>How well does&#x00a0;the one-against-all method perform when the number of disease categories increases? Will the uncertainty go up significantly?</p>
                    </list-item>
                    <list-item>
                        <p>Since a different NBC&#x00a0;is&#x00a0;fit for each COD, the probability of a particular cause predicted for each death will be different. When the final cause is determined, it seems that these&#x00a0;individual probabilities should be weighted rather than simply taking the maximum of all. The weights can be chosen to be the values that optimize the&#x00a0;overall cause-specific mortality fraction (CSMF) distribution.</p>
                    </list-item>
                    <list-item>
                        <p>For each NBC, it seems that some feature selections can be done to improve the accuracy of these individual predictions.&#x00a0;</p>
                    </list-item>
                </list>
            </p>
            <p>Is the rationale for developing the new method (or application) clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the method technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions about the method and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>If any results are presented, are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Are sufficient details provided to allow replication of the method development and its use by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Applied statistics, classic statistical modeling, predictive analytics</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
        <sub-article article-type="response" id="comment3147-26787">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Murtaza</surname>
                            <given-names>Syed Shariyar</given-names>
                        </name>
                        <aff>Ryerson University, Canada</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>10</day>
                    <month>1</month>
                    <year>2019</year>
                </pub-date>
            </front-stub>
            <body>
                <p>Thank you for reviewing this article. Please find below replies to your questions.</p>
                <p>
                    <bold>Q1.</bold> How well does the one-against-all method perform when the number of disease categories increases? Will the uncertainty go up significantly?</p>
                <p>
                    <bold>REPLY 1:</bold> We are not sure whether by &#x201c;disease categories&#x201d; the reviewer means CODs or symptoms (features), so we answer from both perspectives.
                    <list list-type="bullet">
                        <list-item>
                            <p>Number of CODs increases:</p>
                            <p>It depends on the dataset. If we add a new COD with only a few records (e.g., 10), the accuracy of classification will decrease slightly (e.g., by approximately 1-2%). This is because such a small share of records contributes little to classification when some other CODs have a very large share (e.g., a few thousand records). Also, if most of the symptoms of the new COD are shared with another COD, there are not enough discriminating factors between the records of the two CODs, and the accuracy of classification for the new category would remain low in this case too. If the new category has a sufficient number of records (e.g., at least 50-100) and its symptoms are sufficiently discriminating, then machine learning approaches such as the one-against-all method with the Na&#x00ef;ve Bayes algorithm will be able to classify its records with good accuracy. For the current VA datasets, the accuracy of classification can be improved by increasing the share of records for disease categories that have very few records compared to others, and also by introducing better discriminating symptoms. The new symptoms can also be synthetic, generated with other machine learning approaches (see Reply 3).</p>
                        </list-item>
                        <list-item>
                            <p>Number of symptoms (features) increases:</p>
                            <p>If the number of symptoms increases and the new symptoms increase the discriminating power between CODs, then the accuracy will improve; otherwise, the additional symptoms will either leave the accuracy unchanged or decrease it.</p>
                        </list-item>
                    </list>
                </p>
                <p>
                    <bold>Q2.</bold> Since a different NBC is fit for each COD, the probability of a particular ......... overall cause-specific mortality fraction (CSMF) distribution.</p>
                <p>
                    <bold>REPLY 2:</bold> Each NBC generates a probability for its COD, and the final list of predicted CODs is produced by sorting the predictions from all NBCs in descending order of probability. However, not every NBC predicts its COD; some NBCs predict the cause &#x201c;Others&#x201d; instead (recall that each NBC has two causes to predict: its COD and &#x201c;Others&#x201d;). When an NBC predicts &#x201c;Others&#x201d;, it is indicating that the COD it knows is not the real cause, and we can simply ignore that prediction. As a result, with 15 NBCs the number of predicted CODs in the final list varies from one VA record to another.</p>
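                <p>The ranking step described above can be illustrated with a minimal sketch. It assumes that each one-against-all NBC returns a (predicted label, probability) pair; the function name, the input layout and the example values are hypothetical and are not taken from the released OAA-NBC code.</p>
                <preformat>
```python
from typing import Dict, List, Tuple

OTHERS = "Others"  # label each binary NBC uses for "not my COD"


def rank_predictions(
    nbc_outputs: Dict[str, Tuple[str, float]]
) -> List[Tuple[str, float]]:
    """Build the final ranked list of CODs for one VA record.

    nbc_outputs maps the COD an NBC was trained on to the pair
    (predicted label, probability) that the NBC produced.
    This input layout is an assumption made for illustration.
    """
    # Drop every NBC that voted "Others": it is saying that the COD
    # it knows is not the cause of this death.
    kept = [
        (label, prob)
        for label, prob in nbc_outputs.values()
        if label != OTHERS
    ]
    # Sort the remaining CODs by probability, highest first.
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical outputs of four NBCs for a single VA record.
example = {
    "HIV/AIDS": ("HIV/AIDS", 0.91),
    "Stroke": ("Others", 0.88),
    "Malaria": ("Malaria", 0.64),
    "TB": ("TB", 0.75),
}
ranked = rank_predictions(example)
```
                </preformat>
                <p>In this sketch the length of the ranked list depends on how many NBCs predict their own COD, which is why different VA records can receive lists of different lengths.</p>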
                <p>It is a good suggestion to weight the predicted probabilities of the CODs and then sort the predicted CODs by their final weighted probabilities. The CSMF distribution is highly imbalanced across CODs in the VA datasets. Assigning weights proportional to the CSMF distribution would therefore increase the chance of predicting the majority CODs, but these are already predicted accurately because of their large number of records, so this could eventually decrease the accuracy. In our view, a better way would be to use weights inversely proportional to the CSMF distribution, because that would give a better chance to those CODs that have fewer records and are not correctly predicted by the individual NBCs. This is a very good research direction; we would like to explore it further in our future work and have added it to the future work section of our paper.</p>
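                <p>As a rough illustration of the inverse weighting idea, the sketch below derives weights from per-COD record counts and re-ranks a list of (COD, probability) predictions. This is only one possible reading of the proposal, not an evaluated method: the function names, the normalisation of the weights to sum to one, and the example counts are all assumptions, and the weighted scores are no longer probabilities.</p>
                <preformat>
```python
from typing import Dict, List, Tuple


def inverse_csmf_weights(record_counts: Dict[str, int]) -> Dict[str, float]:
    """Return weights inversely proportional to each COD's CSMF.

    record_counts maps a COD to its number of training records.
    The weights are normalised to sum to 1 (an illustrative choice).
    """
    if not all(n > 0 for n in record_counts.values()):
        raise ValueError("every COD needs at least one record")
    total = sum(record_counts.values())
    # CSMF of a COD = its share of all records.
    csmf = {cod: n / total for cod, n in record_counts.items()}
    raw = {cod: 1.0 / frac for cod, frac in csmf.items()}
    norm = sum(raw.values())
    return {cod: w / norm for cod, w in raw.items()}


def rerank_with_weights(
    predictions: List[Tuple[str, float]], weights: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Multiply each predicted probability by its COD weight and re-sort."""
    weighted = [(cod, prob * weights[cod]) for cod, prob in predictions]
    return sorted(weighted, key=lambda pair: pair[1], reverse=True)


# Hypothetical example: COD "A" has 900 records, COD "B" has 100.
weights = inverse_csmf_weights({"A": 900, "B": 100})
reranked = rerank_with_weights([("A", 0.6), ("B", 0.5)], weights)
```
                </preformat>
                <p>With these hypothetical counts the rarer COD receives the larger weight, so it moves ahead of the majority COD after re-ranking. Whether such a shift improves overall accuracy would have to be tested experimentally.</p>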
                <p>
                    <bold>Q3.</bold> For each NBC, it seems that some feature selections can be done to improve the accuracy of these individual predictions.</p>
                <p>
                    <bold>REPLY 3:</bold> This is correct: better discriminating symptoms (features) can improve the accuracy of prediction for each COD (i.e., each NBC). Feature selection can be done subjectively, using expert judgement, or with feature selection algorithms from machine learning. Accuracy could also be improved by introducing additional features, which could be synthetic too; e.g., a feature X can be transformed into a new feature by adding a constant c (X+c), by raising it to a power such as X<sup>1/2</sup>, or by similar techniques. This could generate a new feature space that helps to classify the CODs better. There are many feature selection and feature transformation methods, so another set of exploratory experiments will be required to determine which ones actually improve the accuracy of COD classification. This is a good direction for future work, and we have added it to the future work section of our paper.</p>
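                <p>The two transformations mentioned above can be sketched as follows. The sketch assumes non-negative numeric features stored in a dictionary; the function name, the derived feature names and the example feature are hypothetical, and for purely binary symptom indicators the square root leaves the values unchanged.</p>
                <preformat>
```python
import math
from typing import Dict


def add_synthetic_features(
    record: Dict[str, float], c: float = 1.0
) -> Dict[str, float]:
    """Return a copy of the record extended with X + c and X ** 0.5.

    record maps a feature name to a non-negative numeric value.
    The suffixes "_plus_c" and "_sqrt" are illustrative names only.
    """
    out = dict(record)
    for name, x in record.items():
        out[f"{name}_plus_c"] = x + c        # shift by a constant
        out[f"{name}_sqrt"] = math.sqrt(x)   # power transform X^(1/2)
    return out


# Hypothetical record with one numeric symptom feature.
extended = add_synthetic_features({"fever_days": 9.0}, c=1.0)
```
                </preformat>
                <p>Whether any such derived feature actually helps an NBC separate CODs is an empirical question that the exploratory experiments described above would need to answer.</p>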
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report26788">
        <front-stub>
            <article-id pub-id-type="doi">10.21956/gatesopenres.13987.r26788</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Karat</surname>
                        <given-names>Aaron S.</given-names>
                    </name>
                    <xref ref-type="aff" rid="r26788a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-9643-664X</uri>
                </contrib>
                <contrib contrib-type="author">
                    <name>
                        <surname>Calvert</surname>
                        <given-names>Clara</given-names>
                    </name>
                    <xref ref-type="aff" rid="r26788a2">2</xref>
                    <role>Co-referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-3272-1040</uri>
                </contrib>
                <aff id="r26788a1">
                    <label>1</label>Department of Clinical Research, London School of Hygiene &amp; Tropical Medicine, London, UK</aff>
                <aff id="r26788a2">
                    <label>2</label>Department of Population Health, London School of Hygiene &amp; Tropical Medicine, London, UK</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>20</day>
                <month>12</month>
                <year>2018</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2018 Karat AS and Calvert C</copyright-statement>
                <copyright-year>2018</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport26788" related-article-type="peer-reviewed-article" xlink:href="10.12688/gatesopenres.12891.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>Thank you for the opportunity to review this article: it describes the development of a new method in an important area of global health and for the most part is well written and organised. Overall, the authors make a coherent argument, though we have a few suggestions on how certain aspects could be clarified or improved.</p>
            <p> 
                <bold>Introduction</bold> 
                <list list-type="order">
                    <list-item>
                        <p>This provides a good overview of the current state of automated VA classification (though describing King-Lu as a &#x2018;current leading&#x2019; method seems a bit of a stretch).&#x00a0;The justification for the development of this method could be fleshed out a little more, perhaps explaining (for those unfamiliar with how VA data feed into policy) why it is important for these methods to be more accurate.&#x00a0;To this end, the authors may want to consider citing the 2014 systematic review by Leitao et al. comparing PCVA with CCVA in LMIC and mentioning - even briefly - the large project underway to incorporate VA into CRVS systems (
                            <ext-link ext-link-type="uri" xlink:href="https://crvsgateway.info/A-stepwise-process~503">https://crvsgateway.info/A-stepwise-process~503</ext-link>).</p>
                    </list-item>
                    <list-item>
                        <p>The NBC is the model chosen for testing the one-against-all approach &#x2013; it would be good to include a couple of sentences justifying this choice based on previous literature before the final paragraph in the introduction.&#x00a0;</p>
                    </list-item>
                    <list-item>
                        <p>In general, we found the use of the term &#x2018;CoD diagnosis&#x2019; (used in the introduction and elsewhere in the manuscript) a little confusing.&#x00a0;We would suggest using &#x2018;assignment&#x2019; consistently throughout, to differentiate from &#x2018;diagnoses&#x2019; made by clinicians during life.</p>
                    </list-item>
                </list> </p>
            <p> 
                <bold>Methods</bold> 
                <list list-type="order">
                    <list-item>
                        <p>Though it is made reasonably clear in the text that the MDS, Agincourt, and Matlab CoD are based on physician review of VA data compared with physician review of clinical data for PHMRC CoD, we think that this distinction could (and should) be made more clearly and repeatedly throughout the manuscript, including in Tables 1, 2, and 4.&#x00a0;As the authors are no doubt aware, the use of PCVA CoD as a gold standard is not ideal, constituting, to some extent, a &#x2018;circular&#x2019; comparison, as both methods ultimately rely on the quality of the VA data.&#x00a0;We feel that the authors could make greater efforts to make clear (to the non-expert reader) this key difference between the different datasets. (Note the justification for the use of PCVA CoD as gold standard [page 6, under &#x2018;assessment methods&#x2019;] does not really address this issue &#x2013; a high level of agreement between physicians reading the same VA data does not have any bearing on the objective &#x2018;truth&#x2019; of their assignments.)</p>
                    </list-item>
                    <list-item>
                        <p>A minor point: the use of &#x2018;historical&#x2019; and &#x2018;new&#x2019; VA surveys in Figure 1 is potentially misleading, as it suggests that new data were collected and used to test the algorithm/s.&#x00a0;Would &#x2018;train&#x2019; and &#x2018;test&#x2019; be more appropriate?</p>
                    </list-item>
                    <list-item>
                        <p>Figure 2 is a helpful representation of the OAA approach.&#x00a0;It may be useful to combine figures 1 and 2, showing in one place the workings of the method and how it fits into the process and, perhaps, showing in more detail how outputs from the multiple models are &#x2018;re-assembled&#x2019; to give one list of causes and probabilities that can then be interpreted or compared with the outputs from other methods.</p>
                    </list-item>
                    <list-item>
                        <p>The authors provide a detailed description of the methods used to compare the CoD assigned by different methods, citing the guidance from Murray et al. in 2011 (reference 26). However, they do not report the chance-corrected CSMF accuracy (as described by Flaxman et al. (2015)
                            <sup>
                                <xref ref-type="bibr" rid="rep-ref-26788-1">1</xref>
                            </sup>) &#x2013; could the reasons for this be mentioned?&#x00a0;</p>
                    </list-item>
                    <list-item>
                        <p>Referring to CSMF accuracy as &#x201c;agreement&#x201d; is potentially confusing &#x2013; we would suggest using the full term or &#x201c;CSMFa&#x201d; throughout.</p>
                    </list-item>
                    <list-item>
                        <p>The description of the calculation of cumulative sensitivity is a little confusing.&#x00a0;Does the &#x2018;15% more correct at rank 2&#x2019; include only those which are also correct at rank 1? i.e., if methods corresponded at rank 2 but not at rank 1, would they be included in the cumulative estimate?&#x00a0;This is not a method previously described in the VA literature and is fairly central to the interpretation of the results presented, so more detail is necessary.&#x00a0;It would also be helpful to provide some justification for the choice of reporting cumulative sensitivity to rank 5.</p>
                    </list-item>
                    <list-item>
                        <p>A minor point: the description of computing the &#x2018;average&#x2019; sensitivity, PCCC, and agreement (page 7, column 2, end of paragraph 1) is a little vague &#x2013; please consider using &#x2018;mean&#x2019; or another appropriate technical term.</p>
                    </list-item>
                </list> &#x00a0;</p>
            <p> 
                <bold>Results</bold> 
                <list list-type="order">
                    <list-item>
                        <p>It is not clear from the text in paragraph 1 of the results that agreements for &#x2018;rank 5&#x2019; are cumulative from ranks 1 to 5. Stating that the most likely 
                            <italic>and</italic> the fifth most likely were used implies that ranks 2&#x2013;4 were excluded.&#x00a0;Similarly, in Table 3, although &#x2018;cumulative sensitivity&#x2019; is mentioned, the authors may want to consider changing the column headers from &#x2018;Rank 5&#x2019; to &#x2018;Ranks 1&#x2013;5&#x2019;, to signpost more clearly what the numbers represent.</p>
                    </list-item>
                    <list-item>
                        <p>The decision to display only estimates of sensitivity, without any estimates of PCCC, is defended in the text; however, for full transparency and to allow for comparison with other similar studies, none of which (to our knowledge) report on cumulative sensitivity, the authors should consider including these results in a supplementary table/appendix.&#x00a0;It would also be helpful to provide numeric values for the estimates of CSMF accuracy presented in Figure 3.</p>
                    </list-item>
                    <list-item>
                        <p>It is not clear what exactly was being tested when the authors write &#x201c;we conducted the Wilcoxon signed rank test on 35 observations of agreements for the five algorithms&#x201d; (page 9, second paragraph under &#x201c;Ranked sensitivity comparison&#x201d;). We assume that the PCCC values were being tested, but where does the 35 come from? It would be helpful to have more detail on this in the methods section.</p>
                    </list-item>
                    <list-item>
                        <p>If possible, please provide the exact p-values (or at least a range of values) from applying the Wilcoxon signed rank test to the population-level agreements; alternatively, these could be provided in a supplementary table. It would also be helpful to clarify whether there was any evidence that sensitivity was statistically different between OAA-NBC and NBC and include a p-value for this.</p>
                    </list-item>
                    <list-item>
                        <p>A minor point: when discussing the Wilcoxon signed rank statistical test, it is written that &#x201c;we also included rank two and rank three values&#x201d; &#x2013; what about rank 4 values? Why would this be left out?</p>
                    </list-item>
                </list> &#x00a0;</p>
            <p> 
                <bold>Discussion</bold> 
                <list list-type="order">
                    <list-item>
                        <p>A number of results are described in the discussion section (results of testing pre-trained models and Dirichlet distributions-based samples [Table 5]).&#x00a0;Might these more appropriately be moved to the results, with the corresponding methods described and tables included as appendices as needed?</p>
                    </list-item>
                    <list-item>
                        <p>In Table 6, it is not clear why &#x2018;HIV/AIDS-related deaths&#x2019; (ICD-10 codes B20&#x2013;B24) are included under &#x2018;other and unspecified infections&#x2019;.&#x00a0;Is this an error?&#x00a0;This does not correspond with the cause-specific sensitivities shown in Table 4 - please could the authors clarify?</p>
                    </list-item>
                    <list-item>
                        <p>It would be pertinent to acknowledge, again, the difference between the &#x2018;reference standards&#x2019; used for comparison and to discuss (even briefly) the potential implications of using CoD derived from VA data as reference.&#x00a0;Greater clarity in describing the two reference standards would also be useful; for example, in the first paragraph under &#x2018;Conclusions&#x2019;, describing &#x201c;dual physician 
                            <italic>assignment based on</italic> 
                            <italic>VA data</italic> and clinical diagnostic COD&#x2026;&#x201d; would more clearly make the distinction.</p>
                    </list-item>
                    <list-item>
                        <p>It would be useful to include a paragraph comparing the results of this exercise to previous validation exercises done on the algorithms; did the authors find similar results to, for example, James et al. (Pop Health Met 2011)?&#x00a0;If not, what are the differences in the exercises undertaken?</p>
                    </list-item>
                    <list-item>
                        <p>The reporting of cumulative sensitivity as the main measure of agreement is an unusual aspect of this study; acknowledgment of this as a potential limitation would help provide context for comparisons with other similar studies.</p>
                    </list-item>
                </list> &#x00a0;</p>
            <p> 
                <bold>Minor points</bold> 
                <list list-type="order">
                    <list-item>
                        <p>In the abstract, the authors write: &#x201c;The results demonstrate that our approach improves the classification from 6% to 8%&#x201d;, which could be interpreted as suggesting that sensitivity was only 6% in the other algorithms. Perhaps this could be re-phrased along the lines of &#x201c;The results demonstrate that our approach improves the classification by between 6% and 8% compared with the other algorithms&#x201d;.</p>
                    </list-item>
                    <list-item>
                        <p>The Matlab and Agincourt datasets are referred to as DHS &#x2013; it would be better to refer to them as Health and Demographic Surveillance Sites (HDSS) to prevent confusion between these and the Demographic and Health Surveys.</p>
                    </list-item>
                    <list-item>
                        <p>The article switches from past to present tense a number of times; for example, the first paragraph of methods is (mostly) in present tense, but most of the rest of the methods is in past tense. For consistency and to improve readability, we would suggest re-writing these passages in past tense.</p>
                    </list-item>
                    <list-item>
                        <p>Probable typo: methods, paragraph 2, line 15 &#x2013; &#x201c;datasets&#x201d; should be &#x201c;dataset&#x201d;?</p>
                    </list-item>
                    <list-item>
                        <p>Per normal conventions, please consider adding legends to tables and figures spelling out any acronyms used.</p>
                    </list-item>
                </list>
            </p>
            <p>Is the rationale for developing the new method (or application) clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the method technically sound?</p>
            <p>Partly</p>
            <p>Are the conclusions about the method and its performance adequately supported by the findings presented in the article?</p>
            <p>Partly</p>
            <p>If any results are presented, are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Are sufficient details provided to allow replication of the method development and its use by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>HIV &amp; TB epidemiology, demographic surveillance, maternal health, verbal autopsy methods</p>
            <p>We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however we have significant reservations, as outlined above.</p>
        </body>
        <back>
            <ref-list>
                <title>References</title>
                <ref id="rep-ref-26788-1">
                    <label>1</label>
                    <mixed-citation publication-type="journal">
                        <person-group person-group-type="author"/>:
                        <article-title>Measuring causes of death in populations: a new metric that corrects cause-specific mortality fractions for chance.</article-title>
                        <source>
                            <italic>Popul Health Metr</italic>
                        </source>.<year>2015</year>;<volume>13</volume>:
                        <elocation-id>10.1186/s12963-015-0061-1</elocation-id>
                        <fpage>28</fpage>
                        <pub-id pub-id-type="pmid">26464564</pub-id>
                        <pub-id pub-id-type="doi">10.1186/s12963-015-0061-1</pub-id>
                    </mixed-citation>
                </ref>
            </ref-list>
        </back>
        <sub-article article-type="response" id="comment3146-26788">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Murtaza</surname>
                            <given-names>Syed Shariyar</given-names>
                        </name>
                        <aff>Ryerson University, Canada</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>10</day>
                    <month>1</month>
                    <year>2019</year>
                </pub-date>
            </front-stub>
            <body>
                <p>Thank you for reviewing this article. Please find below replies to your questions. We have also submitted a modified version with your recommendations.</p>
                <p>
                    <bold>Introduction</bold>
                </p>
                <p>Q1. This provides a good overview of the current state of automated VA classification (though describing King-Lu as a &#x2018;current leading&#x2019; method seems a bit of a stretch). ...................orate VA into CRVS systems (
                    <ext-link ext-link-type="uri" xlink:href="https://crvsgateway.info/A-stepwise-process~503">https://crvsgateway.info/A-stepwise-process~503</ext-link>).</p>
                <p>REPLY 1: We have made the modification in the Introduction Section.</p>
                <p>Q2. The NBC is the model chosen for testing the one-against-all approach ........... introduction.&#x00a0;</p>
                <p>REPLY 2: In our earlier version, we discussed the justification in the Methods section. We have also added a similar statement in the last paragraph of the Introduction, as suggested by the reviewers.</p>
                <p>Q3: In general, we found the use of the term &#x2018;CoD diagnosis&#x2019;....................made by clinicians during life.</p>
                <p>REPLY 3: We have changed diagnosis to assignment to avoid confusion as suggested.</p>
                <p>
                    <bold>Methods</bold>
                </p>
                <p>Q4. Though it is made reasonably clear in the text that the MDS, Agincourt, and Matlab CoD are based ....................... &#x2018;truth&#x2019; of their assignments.)</p>
                <p>REPLY 4: Table 4 already had a footnote; however, based on your suggestion we clarified it further and we added footnotes for Table 1 and Table 2.</p>
                <p>Q5. A minor point: the use of &#x2018;historical&#x2019; and &#x2018;new&#x2019; VA surveys in ..............&#x00a0; &#x2018;test&#x2019; be more appropriate?</p>
                <p>REPLY 5: We have modified the figure with the inclusion of the terms suggested.</p>
                <p>Q6. Figure 2 is a helpful representation of the OAA approach............... or compared with the outputs from other methods.</p>
                <p>REPLY 6: This is a good suggestion but we would like to keep the figures separate as it would clutter one figure, and the abstraction makes it easier to understand. However, we made some changes in Figure 1 which captures your suggestion. The output of each model is assembled in a simple manner: COD predictions/assignments from all models are simply sorted by their probability of prediction/assignments.</p>
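                <p>The assembly step described in Reply 6 can be illustrated with a minimal sketch. It assumes that each one-against-all model returns a single probability for its own cause; the function name, the cause labels and the probability values are hypothetical and are not taken from the article.</p>
                <preformat><![CDATA[
```python
def assemble_ranked_causes(model_probabilities, top_k=None):
    """Sort causes by the probability reported by their one-against-all model.

    model_probabilities maps each cause label to the probability produced by
    the binary model trained for that cause (an assumed interface).
    """
    ranked = sorted(
        model_probabilities.items(),
        key=lambda item: item[1],
        reverse=True,  # most probable cause first
    )
    return ranked if top_k is None else ranked[:top_k]


# Hypothetical probabilities for one VA record (illustration only).
scores = {"malaria": 0.15, "tuberculosis": 0.60, "stroke": 0.05, "pneumonia": 0.20}
top3 = assemble_ranked_causes(scores, top_k=3)
```
]]></preformat>
                <p>In this sketch, the first entry of the returned list corresponds to the rank 1 assignment, and the remaining entries give the next most likely causes.</p>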
                <p>Q7. The authors provide a detailed description of the methods ................................... be mentioned?&#x00a0;</p>
                <p>REPLY 7:&#x00a0; CSMF accuracy is the most widely used measure in VA assessment studies, and this is the primary reason for choosing it in our study too. Chance-corrected CSMF (CCCSMF) accuracy could have been used in our study, but it would not have made any difference to the overall results other than reducing the CSMF accuracy values for each algorithm/method. This can be understood from the equation presented by Flaxman et al. for chance-correcting previous results: CCCSMF = (CSMF &#x2013; mean random allocation) / (1 &#x2013; mean random allocation). The mean random allocation value in this equation for a dataset is measured by performing random predictions using the Dirichlet distribution many times and taking their mean. This would be a constant number for a dataset, and it would only end up reducing each CSMF accuracy value by a constant rate.</p>
                <p>Furthermore, we have shown results separately using the Dirichlet distribution for different datasets and methods. We have also shown results on individual causes alongside individual sensitivity measures. All these different perspectives mitigate doubts about incorrectly reported performance of the methods in our study.&#x00a0;</p>
                <p>On another note, the Dirichlet distribution method only duplicates or reduces VA records in a training or test dataset, which only results in reduced performance of the methods. An appropriate approach would be to have a training set with all the variations of each cause of death that are expected to be observed in the field.</p>
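                <p>The chance correction quoted in Reply 7 can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name is hypothetical, the default of 0.632 is assumed here to be the mean random-allocation value reported by Flaxman et al., and the CSMF accuracy values are invented for the example.</p>
                <preformat><![CDATA[
```python
def chance_corrected_csmf_accuracy(csmf_accuracy, mean_random_allocation=0.632):
    """Rescale CSMF accuracy so that random allocation scores about 0 and a
    perfect method scores 1.

    mean_random_allocation is the mean CSMF accuracy of random predictions;
    the default value is an assumption and can be replaced per dataset.
    """
    if not 0.0 <= mean_random_allocation < 1.0:
        raise ValueError("mean_random_allocation must be in [0, 1)")
    return (csmf_accuracy - mean_random_allocation) / (1.0 - mean_random_allocation)


# Hypothetical CSMF accuracy values for two methods (not results from the article).
raw = {"method_a": 0.80, "method_b": 0.70}
corrected = {name: chance_corrected_csmf_accuracy(value) for name, value in raw.items()}
```
]]></preformat>
                <p>Because the same constant is applied to every method evaluated on a dataset, the corrected values are lower than the raw values but the ordering of the methods is unchanged.</p>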
                <p>Q8. Referring to CSMF accuracy as &#x201c;agreement&#x201d; is potentially confusing &#x2013; we would suggest using the full term or &#x201c;CSMFa&#x201d; throughout.</p>
                <p>REPLY 8: We have changed agreement to CSMF accuracy throughout the entire paper as suggested.</p>
                <p>Q9. The description of the calculation of cumulative sensitivity is a little confusing. ............&#x00a0; provide some justification for the choice of reporting cumulative sensitivity to rank 5.</p>
                <p>REPLY 9: Reporting cumulative results is a popular approach in the applied machine learning and software engineering literature (see for example [1][2][3]). The random probability of predicting the cause of a problem is 1/N, where N is the number of causes. When the data are not big, not separable, and have many causes, the first-rank prediction from any algorithm will not come close to the 100% mark.&#x00a0; It is then useful to know how an algorithm would fare on the top few predictions of causes (e.g., top 3 ranks, top 5 ranks, etc.), because an accuracy of 90% on the top 4 causes implies that there is a 25% (1/4) probability of 90% accurate sensitivity (predictions). This is better than reviewing N causes (approximately 15 in our datasets), which has a 6.7% (1/15) probability of success.</p>
                <p>Yes, if an algorithm has 15% sensitivity at rank 1 and 20% sensitivity at rank 2, then the cumulative sensitivity would be 35% at rank 2. Cumulative sensitivity at rank N is the sum of the sensitivity values from rank 1 to rank N. Suppose method A has sensitivity values of 30% and 20% for the top two ranks, and method B has sensitivity values of 20% and 30%. The cumulative sensitivity at rank 2 for both methods A and B would then be 50%.&#x00a0; However, this was not the case in our experiments: OAA-NBC consistently yielded better results at all ranks (from 1 to 5 and afterwards). The choice of the top 5 ranked predictions is subjective; it could have been the top 4 or top 3 too.</p>
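                <p>The cumulative calculation described above can be sketched as follows. The function names and the toy records are hypothetical, and the sketch computes sensitivity as an overall fraction of records rather than a per-cause mean, which is a simplifying assumption.</p>
                <preformat><![CDATA[
```python
from itertools import accumulate


def rank_sensitivity(true_causes, ranked_predictions, max_rank):
    """Fraction of records whose true cause sits at exactly rank k (k = 1..max_rank)."""
    n = len(true_causes)
    hits = [0] * max_rank
    for truth, ranked in zip(true_causes, ranked_predictions):
        for k, cause in enumerate(ranked[:max_rank]):
            if cause == truth:
                hits[k] += 1  # counted once, at the rank where the match occurs
                break
    return [h / n for h in hits]


def cumulative_sensitivity(per_rank):
    """Running sum of per-rank sensitivity: the value at rank N covers ranks 1..N."""
    return list(accumulate(per_rank))


# The two methods from the reply, in percentage points.
method_a = cumulative_sensitivity([30, 20])
method_b = cumulative_sensitivity([20, 30])

# Toy records (illustration only): true cause and ranked predictions per record.
truths = ["a", "b", "c", "d"]
ranked = [["a", "b"], ["a", "b"], ["a", "b"], ["c", "d"]]
per_rank = rank_sensitivity(truths, ranked, max_rank=2)
cumulative = cumulative_sensitivity(per_rank)
```
]]></preformat>
                <p>In this sketch, a record whose true cause is matched at rank 2 but not at rank 1 contributes to the cumulative value at rank 2, so methods A and B both reach 50% at rank 2 despite differing at rank 1.</p>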
                <p>The concept of cumulative reporting is straightforward: it does not affect the traditional method of reporting results (which concerns only the first rank) and only adds information to the existing way of reporting. This should not be a source of concern for evaluation. We have modified the text in the last paragraph of the Assessment Methods section in the Methods section to make the explanation clearer.</p>
                <p>[1] S. S. Murtaza, N. H. Madhavji, M. Gittens and A. Hamou-Lhadj, "Identifying Recurring Faulty Functions in Field Traces of a Large Industrial Software System," in 
                    <italic>IEEE Transactions on Reliability</italic>, vol. 64, no. 1, pages 269-283, &#x00a0;2015.</p>
                <p>[2] W. Wong,&#x00a0; V. Debroy, R. Golden, X. Xiaofeng, B. Thuraisingham, Effective software fault localization using &#x00a0;an RBF neural network,&#x00a0; IEEE Trans. Reliab, Issue 61, Vol 1, pages 149&#x2013;169, &#x00a0;2012.</p>
                <p>[3] S. S. Murtaza, A. Hamou-Lhadj, N. H. Madhavji, M. Gittens, An empirical study on the use of mutant traces for diagnosis of faults in deployed systems, Journal of Systems and Software, Volume 90, pages 29-44, 2014.&#x00a0;</p>
                <p>Q10. A minor point: the description of computing the &#x2018;average&#x2019; sensitivity, PCCC, and agreement (page 7, column 2, end of paragraph 1) is a little vague &#x2013; please consider using &#x2018;mean&#x2019; or another appropriate technical term.</p>
                <p>REPLY 10: We have made the modification.</p>
                <p>
                    <bold>Results</bold>
                </p>
                <p>Q11. It is not clear from the text in paragraph 1 of the results that agreements..........to signpost more clearly what the numbers represent.</p>
                <p>REPLY 11: We have made the modifications everywhere in the text to further articulate that fifth rank represents the cumulative value from rank 1 to rank 5 as per your suggestion.</p>
                <p>Q12. The decision to display only estimates of sensitivity, ................. estimates of CSMF accuracy presented in Figure 3.</p>
                <p>REPLY 12: The reason for removing results (rank values, PCCC values, etc.) was to avoid cluttering the text with many tables and to increase readability. We have added an appendix to the paper that reports all the results, including sensitivity, PCCC, CSMF accuracy and values at different ranks.</p>
                <p>Q13. It is not clear what exactly was being tested when the ...........It would be helpful to have more detail on this in the methods section.</p>
                <p>REPLY 13: The 35 observations refer to the 5 ranked prediction values (rank 1 to rank 5) across the seven VA datasets. So, for each algorithm, we have 35 observations each of sensitivity, PCCC and CSMF accuracy values. All the data are now present in Appendix A. The word agreement has been removed and replaced with CSMF accuracy.</p>
                <p>Q14. If possible, please provide the exact p-values (or at least a range of values) from applying the Wilcoxon signed rank test.......... OAA-NBC and NBC and include a p-value for this.</p>
                <p>REPLY 14: We already provided the evidence of p values between OAA-NBC and NBC. Below is a sample from the text of Results section:</p>
                <p>&#x201c;We also performed a Wilcoxon signed rank statistical test on the reported sensitivity in Table 3, generated from the five&#x2026;&#x2026;&#x2026;&#x2026;&#x2026;.. the Wilcoxon signed ranked test yielded Z-score=5.194 and two tailed p-value=2.47 x 10<sup>-7</sup> between OAA-NBC and NBC.&#x201d;</p>
                <p>For the Wilcoxon test on CSMF accuracy, we found Z-score=4.248 and p-value=2.15 x 10<sup>-5</sup> between OAA-NBC and NBC. Exactly the same values were also obtained when testing OAA-NBC against each of the other algorithms in a pairwise manner.</p>
                <p>The p values are extremely small in all the comparisons of OAA-NBC against other algorithms for both sensitivity and CSMF accuracy. Since the values are the same (for CSMF accuracy and for sensitivity; see Results section), it is not worth showing this many similar p values, especially now that all the data are in Appendix A and the p values are trivial to determine.</p>
                <p>We did not perform the test for PCCC, as those values are similar to the sensitivity values and would not add any information. Finally, we have modified the text to show exact p values for the CSMF accuracy of OAA-NBC vs NBC too.</p>
                <p>Q15. A minor point: when discussing the Wilcoxon signed rank......... would this be left out?</p>
                <p>REPLY 15: Thank you for pointing this out. It was a typo, and we changed it to &#x201c;rank two to rank four&#x201d; in the text. All rank 1 to rank 5 values were used.</p>
                <p>
                    <bold>Discussion</bold>
                </p>
                <p>Q16. A number of results are described in the discussion section ................. tables included as appendices as needed?</p>
                <p>REPLY 16: We would like to keep these results separate from the main results and the actual method, as they are not part of the proposed method. Dirichlet distribution-based variations of the test set are not a recommended approach in standard machine learning texts; however, researchers in VA studies have used this method to evaluate algorithms. So, for consistent comparison with the literature, we have also performed experiments using the Dirichlet distribution. We have also added details of the results based on the Dirichlet distribution in Appendix A. Similarly, we would like to keep the pre-trained models separate too, as all other algorithms have customized training. Pre-trained models actually generate poor results, and it is not fair to compare them with the customized models in the Results section.</p>
                <p>Q17. In Table 6, it is not clear why &#x2018;HIV/AIDS-related deaths&#x2019;....... please could the authors clarify?</p>
                <p>REPLY 17: Thank you for noting this; it was a typo, and we have fixed the error in Table 6.</p>
                <p>Q18. It would be pertinent to acknowledge, again, ................. describing &#x201c;dual physician 
                    <italic>assignment based on</italic> 
                    <italic>VA data</italic> and clinical diagnostic COD&#x2026;&#x201d; would more clearly make the distinction.</p>
                <p>REPLY 18: Thank you for pointing this out; we have modified the text as suggested.</p>
                <p>Q19. It would be useful to include a paragraph comparing the results .......... If not, what are the differences in the exercises undertaken?</p>
                <p>REPLY 19: Regarding the paper on the Tariff algorithm pointed out by the reviewers, our results show that for PHMRC adult and child, PCCC values remain around 30% for the first rank (see Appendix A), and James et al. reported values in the range of 22-40% (for the first rank only). Similarly, mean CSMF values in our case remain close to 70%, and their median CSMF values also remain close to 70%. The main difference, however, is that they partitioned the PHMRC data based on health care experience, whereas we used all the PHMRC data and a partition of PHMRC based on Indian origin. It is not possible to compare the results exactly because of the different partitions.</p>
                <p>We have added complete details of the results in the Appendix A, and it should be transparent now in terms of comparison with any paper. Due to many differences in the setup of the experiments (as noted above), it is not possible to write the similarities and differences with all the past studies in one paragraph. This is mitigated by the fact that we have executed all the algorithms and shown all the results in a transparent manner. Thus, individual comparisons with studies will not generate any value in terms of comparisons of results.</p>
                <p>Q20. The reporting of cumulative sensitivity as the main measure of agreement is an unusual aspect of this study; acknowledgment of this as a potential limitation would help provide context for comparisons with other similar studies.</p>
                <p>REPLY 20: We disagree with the reviewers on this comment. There seems to be some confusion among the reviewers around this concept. We have added clarification in the text about the concept as per their earlier comment. We have not introduced any new way of measuring the performance of algorithms; in fact, cumulative frequency, cumulative distributions, etc. are common concepts in statistics. Cumulative reporting is also common in the applied machine learning literature (see above). In the case of the top-rank prediction (rank 1), the results are the same as in the traditional method of reporting sensitivity, PCCC, CSMF accuracy or any other measure. For the next most likely predictions (i.e., rank 2 and onwards), the cumulative values just show the sum of the previous values. It is a very simple concept; it only provides additional information and does not hide or conceal any results. This adds to the richness of the information in the paper and is not a weakness in any way, because earlier researchers have not shown such information. We believe that the public health community will only benefit from such information.</p>
                <p>
                    <bold>Minor points</bold>
                </p>
                <p>Q21. In the abstract, the authors write: ................... &#x201c;The results demonstrate that our approach improves the classification by between 6% and 8% compared with the other algorithms&#x201d;.</p>
                <p>REPLY 21: We have made the change as per your suggestion.</p>
                <p>Q22. The Matlab and Agincourt datasets are referred to as DHS &#x2013; it would be better to refer to them as Health and Demographic Surveillance Sites (HDSS) to prevent confusion between these and the Demographic and Health Surveys.</p>
                <p>REPLY 22: Thank you for pointing this out. We changed the text &#x201c;South African Agincourt Demographic and Health Survey (DHS) dataset , and Bangladeshi Matlab DHS dataset&#x201d; to &#x201c;South African Agincourt Demographic Surveillance Sites (HDSS) dataset , and Bangladeshi Matlab HDSS dataset.&#x201d;</p>
                <p>Q23. The article switches from past to present tense a number of times; ....................we would suggest re-writing these passages in past tense.</p>
                <p>REPLY 23: Changed</p>
                <p>Q24. Probable typo: methods, paragraph 2, line 15 &#x2013; &#x201c;datasets&#x201d; should be &#x201c;dataset&#x201d;?</p>
                <p>REPLY 24: Changed</p>
                <p>Q25. Per normal conventions, please consider adding legends to tables and figures spelling out any acronyms used</p>
                <p>REPLY 25: We have carefully reviewed all the tables and figures. We have added descriptions of the acronyms WHO, ICD, COD and VA. For the names of datasets and algorithms, expanding the acronyms in the tables would add a lot of redundant information for known items, so we have avoided that.</p>
            </body>
        </sub-article>
    </sub-article>
</article>
