<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">Gates Open Res</journal-id>
            <journal-title-group>
                <journal-title>Gates Open Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2572-4754</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/gatesopenres.13202.1</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Research Article</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>The relative incidence of COVID-19 in healthcare workers versus non-healthcare workers: evidence from a web-based survey of Facebook users in the United States</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 1; peer review: 2 approved with reservations, 1 not approved]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Flaxman</surname>
                        <given-names>Abraham D.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-6033-4713</uri>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Henning</surname>
                        <given-names>Daniel J.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Duber</surname>
                        <given-names>Herbert C.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-5077-3170</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Institute for Health Metrics and Evaluation, University of Washington, Seattle, WA, 98195, USA</aff>
                <aff id="a2">
                    <label>2</label>Department of Emergency Medicine, University of Washington, Seattle, WA, 98195, USA</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:abie@uw.edu">abie@uw.edu</email>
                </corresp>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>ADF has consulted recently for Janssen; SwissRe; Sanofi; Merck for Mothers; and Agathos, Ltd. DJH has received research funding from Baxter and performed consulting services for Cytovale. HCD has no competing interests to disclose.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>27</day>
                <month>11</month>
                <year>2020</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2020</year>
            </pub-date>
            <volume>4</volume>
            <elocation-id>174</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>13</day>
                    <month>11</month>
                    <year>2020</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2020 Flaxman AD et al.</copyright-statement>
                <copyright-year>2020</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://gatesopenresearch.org/articles/4-174/pdf"/>
            <abstract>
                <p>
                    <bold>Background</bold>: Healthcare workers are at the forefront of the COVID-19 pandemic and it is essential to monitor the relative infection rate of this group, as compared to workers in other occupations. This study aimed to produce estimates of the relative incidence ratio between healthcare workers and workers in non-healthcare occupations.</p>
                <p>
                    <bold>Methods</bold>: Analysis of cross-sectional data from a daily, web-based survey of 1,788,795 Facebook users from September 6, 2020 to October 18, 2020. Participants were Facebook users in the United States aged 18 and above who were tested for COVID-19 because of an employer or school requirement in the past 14 days. The exposure variable was a self-reported history of working in healthcare in the past four weeks and the main outcome was a self-reported positive test for COVID-19.</p>
                <p>
                    <bold>Results</bold>: On October 18, 2020, in the United States, there was a relative COVID-19 incidence ratio of 0.7 (95% UI 0.6 to 0.8) between healthcare workers and workers in non-healthcare occupations.</p>
                <p>
                    <bold>Conclusions:</bold> Currently, in the United States, healthcare workers have a substantially and significantly lower COVID-19 incidence rate than workers in non-healthcare occupations.</p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>COVID-19</kwd>
                <kwd>healthcare workers</kwd>
            </kwd-group>
            <funding-group>
                <award-group id="fund-1">
                    <funding-source>Gates Foundation</funding-source>
                    <award-id>OPP1170133</award-id>
                </award-group>
                <award-group id="fund-2" xlink:href="http://dx.doi.org/10.13039/100000001">
                    <funding-source>National Science Foundation</funding-source>
                    <award-id>DMS-1839116</award-id>
                </award-group>
                <funding-statement>This work was supported by the Gates Foundation [OPP1170133] and the National Science Foundation [DMS-1839116].</funding-statement>
                <funding-statement>
                    <italic>The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</italic>
                </funding-statement>
            </funding-group>
        </article-meta>
    </front>
    <body>
        <sec sec-type="intro">
            <title>Introduction</title>
            <p>In August, the Peterson-KFF Health System Tracker published a collection of charts showing how healthcare utilization has declined during the COVID-19 pandemic in the United States
                <sup>
                    <xref ref-type="bibr" rid="ref-1">1</xref>
                </sup>, with facility discharge volume down by over 25% and cancer screening volumes down by over 85% from 2019 levels. This decrease is consistent with evidence from other sources
                <sup>
                    <xref ref-type="bibr" rid="ref-2">2</xref>,
                    <xref ref-type="bibr" rid="ref-3">3</xref>
                </sup>, and could be driven by a perceived risk of interacting with workers at health facilities. It remains to be seen how much this delayed and forgone care will harm population health. Meanwhile, a Wall Street Journal analysis of Centers for Disease Control and Prevention (CDC) data found that at least 7,400 COVID-19 infections were transmitted in US hospitals in 2020
                <sup>
                    <xref ref-type="bibr" rid="ref-4">4</xref>
                </sup>. Access to adequate resources for infection prevention among healthcare workers (HCWs) remains a topic of urgent importance
                <sup>
                    <xref ref-type="bibr" rid="ref-5">5</xref>
                </sup>.</p>
            <p>There is currently no population-based evidence quantifying the relative COVID-19 incidence rate among HCWs as compared to workers in non-healthcare occupations (non-HCWs) in the US. We hypothesized that the rate of COVID-19 infection is not substantially elevated among HCWs, and that HCWs might even have a lower incidence rate than non-HCWs. To investigate, we analyzed data from a large survey of Facebook users.</p>
        </sec>
        <sec sec-type="methods">
            <title>Methods</title>
            <sec>
                <title>Study design</title>
                <p>We analyzed individual participant data from a large, web-based survey of Facebook users aged 18 and above in the United States (around 300,000 respondents per week). Every day, Facebook offered a random sample of US-based users a Qualtrics survey run by the Delphi lab at Carnegie Mellon University, which made the data rapidly available to other academic researchers
                    <sup>
                        <xref ref-type="bibr" rid="ref-6">6</xref>
                    </sup>. Facebook also provided survey weights to adjust for the demographics of the active Facebook user population
                    <sup>
                        <xref ref-type="bibr" rid="ref-7">7</xref>,
                        <xref ref-type="bibr" rid="ref-8">8</xref>
                    </sup>. This sort of survey data has been used previously to perform population-based analyses related to COVID-19, though never before at such a large scale
                    <sup>
                        <xref ref-type="bibr" rid="ref-9">9</xref>,
                        <xref ref-type="bibr" rid="ref-10">10</xref>
                    </sup>. Our analysis relied on the responses to two lines of questions: (1) questions about recent work history, worded as, &#x201c;In the past 4 weeks, did you do any kind of work for pay?&#x201d; and if so, &#x201c;[p]lease select the occupational group that best fits the main kind of work you were doing in the last four weeks&#x201d;; and (2) questions about COVID-19 testing history, worded as, &#x201c;Have you 
                    <bold>ever</bold> been tested for coronavirus (COVID-19)?&#x201d;, &#x201c;[h]ave you been tested for coronavirus (COVID-19) in the 
                    <bold>last 14 days</bold>?&#x201d;, &#x201c;[d]id this test find that you had coronavirus (COVID-19)&#x201d;, and &#x201c;[d]o any of the following reasons describe why you were tested for coronavirus (COVID-19) in 
                    <bold>the last 14 days</bold>? Please select all that apply.&#x201d;</p>
                <p>We analyzed the most recently available six weeks of data from September 6, 2020 to October 18, 2020, which provided more than 80% power to detect a 30% difference between COVID-19 prevalence in HCWs and non-HCWs (details below).</p>
            </sec>
            <sec>
                <title>Variables</title>
                <p>To quantify the relative risk of COVID-19 among healthcare workers (HCWs) versus workers in non-healthcare occupations (non-HCWs), we used the response to the occupational group question as our exposure variable: we coded respondents who selected the option &#x201c;Healthcare practitioners and technicians&#x201d; or &#x201c;Healthcare support&#x201d; as HCWs, and all others, including those with a missing value, as non-HCWs. We identified individuals with COVID-19 as those who reported that they had tested positive for COVID-19 in the last 14 days.</p>
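                <p>As an illustration only, the coding rule described above could be sketched in Python roughly as follows. The column names (occupation_group, tested_14d, test_positive) and the function name are hypothetical placeholders, not the survey's actual variable names.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
```python
import pandas as pd

# Occupational group options that are coded as healthcare workers (from the text).
HCW_GROUPS = {"Healthcare practitioners and technicians", "Healthcare support"}


def code_variables(responses: pd.DataFrame) -> pd.DataFrame:
    """Add boolean exposure (hcw) and outcome (covid_positive) columns.

    Column names are hypothetical; they are assumptions of this sketch.
    """
    coded = responses.copy()
    # Missing occupation values are not in HCW_GROUPS, so they become non-HCW.
    coded["hcw"] = coded["occupation_group"].isin(HCW_GROUPS)
    # Outcome: tested in the last 14 days and that test was positive.
    # eq(True) treats missing answers as False.
    coded["covid_positive"] = (
        coded[["tested_14d", "test_positive"]].eq(True).all(axis=1)
    )
    return coded


# Toy responses for illustration only (not survey data).
example = pd.DataFrame(
    {
        "occupation_group": ["Healthcare support", "Education", None],
        "tested_14d": [True, True, None],
        "test_positive": [True, False, None],
    }
)
coded = code_variables(example)
print(coded[["hcw", "covid_positive"]])
```
</preformat>
                </p>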
            </sec>
            <sec>
                <title>Statistical methods</title>
                <p>We calculated the endorsement rate of a positive COVID-19 test (ER) for the HCW and non-HCW populations as the survey-weighted percent of respondents in each group who reported COVID-19, and calculated the relative COVID-19 incidence ratio (RR) with the equation</p>
                <p>&#x00a0;&#x00a0;&#x00a0;RR = (ER among HCWs) / (ER among non-HCWs).</p>
                <p>We quantified the uncertainty in this ratio using non-parametric bootstrap resampling to obtain a 95% uncertainty interval
                    <sup>
                        <xref ref-type="bibr" rid="ref-11">11</xref>
                    </sup>. To control for confounding due to differential access to COVID-19 testing, we restricted our analysis to only HCWs and non-HCWs who were tested in the last 14 days because their employer or school required it.</p>
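                <p>The calculation above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used for this study: the column names (hcw, covid_positive, weight) and function names are hypothetical, and a percentile interval is assumed because the text does not specify which bootstrap interval was used.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
```python
import numpy as np
import pandas as pd


def weighted_rr(data: pd.DataFrame) -> float:
    """RR = (weighted ER among HCWs) / (weighted ER among non-HCWs)."""
    hcw = data["hcw"].to_numpy(dtype=bool)
    pos = data["covid_positive"].to_numpy(dtype=float)
    w = data["weight"].to_numpy(dtype=float)
    er_hcw = np.average(pos[hcw], weights=w[hcw])
    er_non = np.average(pos[~hcw], weights=w[~hcw])
    return er_hcw / er_non


def bootstrap_ui(data: pd.DataFrame, n_boot: int = 1000, seed: int = 0):
    """Percentile 95% interval from resampling respondents with replacement.

    Assumes every resample contains respondents from both groups, which
    holds in practice for large samples.
    """
    rng = np.random.default_rng(seed)
    n = len(data)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        draws[b] = weighted_rr(data.iloc[idx])
    return np.percentile(draws, [2.5, 97.5])


# Toy data for illustration only (not survey data).
rng = np.random.default_rng(1)
n = 5000
toy = pd.DataFrame({"hcw": rng.binomial(1, 0.339, size=n).astype(bool)})
p_pos = np.where(toy["hcw"], 0.7 * 0.049, 0.049)
toy["covid_positive"] = rng.binomial(1, p_pos).astype(bool)
toy["weight"] = rng.uniform(0.5, 1.5, size=n)

lower, upper = bootstrap_ui(toy, n_boot=200)
print(weighted_rr(toy), lower, upper)
```
</preformat>
                </p>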
                <p>As sensitivity analyses, we also considered alternative inclusion criteria and more restrictive subsets of HCWs. The survey provided sample weights that adjust for non-response bias, which we used in our main analysis; as a sensitivity analysis, we repeated our calculation using the unweighted data. To investigate the possibility that workplace testing practices differ between HCW and non-HCW occupational settings, we also repeated our analysis with additional filtering based on the &#x201c;why you were tested&#x201d; question. The main result used the subset of individuals who responded that they were tested in the last 14 days because of employer or educational requirements. Because this question has a &#x201c;select all that apply&#x201d; answer type and includes &#x201c;I felt sick&#x201d; as an option, respondents in this subset could also have been tested because they felt sick. As a further sensitivity analysis, we used only those individuals who were tested because of a workplace requirement 
                    <italic toggle="yes">and</italic> did not feel sick.</p>
                <p>
                    <italic toggle="yes">Power calculation:</italic> To determine the sample size necessary to detect a difference of 30% between the COVID-19 prevalence of HCWs and non-HCWs, we developed a small simulation model in which the fraction of HCWs in the general population and the COVID-19 prevalence in the general population both match those observed in the survey data.</p>
                <p>Of respondents who were tested in the last 14 days because their employer or school required it, 33.9% were HCWs and 4.9% tested positive for COVID-19, so we simulated populations of size 
                    <italic toggle="yes">n</italic> with this fraction of HCWs and this positive rate among the non-HCW population. We made the positive rate among the HCW population 30% lower:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">import numpy as np
import pandas as pd


def sim_data(n_simulants):
    frac_hcw = 0.339  # fraction of HCWs among respondents with a required test
    frac_cli = 0.049  # positive rate assigned to non-HCWs
    rr_hcw = 0.7      # HCW positive rate is 30% lower

    data = pd.DataFrame(index=range(n_simulants))
    # Assign HCW status independently with probability frac_hcw.
    data['hcw'] = np.random.uniform(size=n_simulants) &lt; frac_hcw
    # Per-simulant probability of a positive test, by HCW status.
    cli_pr = np.where(data.hcw, rr_hcw * frac_cli, frac_cli)
    data['cli'] = np.random.uniform(size=n_simulants) &lt; cli_pr
    return data</preformat>
                </p>
                <p>Then, for populations ranging in size from 
                    <italic toggle="yes">n =</italic> 500 to 9,500, we repeatedly synthesized a simulated population, calculated the RR of COVID-19 between the HCWs and non-HCWs as described above, and checked if the upper bound of the uncertainty interval was less than 1.0. We replicated this experiment 10,000 times for each population size 
                    <italic toggle="yes">n</italic> and found the 
                    <italic toggle="yes">n</italic> at which the upper bound of the uncertainty interval was less than 1.0 in at least 80% of the replications.</p>
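                <p>A compact sketch of this power experiment is given below. It is illustrative only and departs from the procedure described above in ways that are assumptions of the sketch: it draws the four cell counts of the 2 by 2 table directly rather than building a table of simulants, it performs the bootstrap as a multinomial draw over those cells (equivalent to resampling individuals with replacement for an unweighted RR), and it uses far fewer replications by default. The function names are hypothetical.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
```python
import numpy as np


def simulate_counts(n, rng, frac_hcw=0.339, frac_pos=0.049, rr_hcw=0.7):
    """Cell counts [HCW+, HCW-, non-HCW+, non-HCW-] for one simulated population."""
    n_hcw = rng.binomial(n, frac_hcw)
    hcw_pos = rng.binomial(n_hcw, rr_hcw * frac_pos)
    non_pos = rng.binomial(n - n_hcw, frac_pos)
    return np.array([hcw_pos, n_hcw - hcw_pos, non_pos, n - n_hcw - non_pos])


def rr_upper_bound(counts, rng, n_boot=500):
    """97.5th percentile of the bootstrap distribution of the unweighted RR."""
    n = counts.sum()
    # Resampling n individuals with replacement is a multinomial draw
    # over the four observed cells.
    boot = rng.multinomial(n, counts / n, size=n_boot).astype(float)
    with np.errstate(divide="ignore", invalid="ignore"):
        er_hcw = boot[:, 0] / (boot[:, 0] + boot[:, 1])
        er_non = boot[:, 2] / (boot[:, 2] + boot[:, 3])
        rr = er_hcw / er_non
    return np.nanpercentile(rr, 97.5)


def power(n, n_reps=200, seed=0):
    """Fraction of replications whose interval upper bound is below 1.0."""
    rng = np.random.default_rng(seed)
    hits = sum(
        1.0 > rr_upper_bound(simulate_counts(n, rng), rng)
        for _ in range(n_reps)
    )
    return hits / n_reps


for n in (500, 7000):
    print(n, power(n))
```
</preformat>
                </p>
                <p>Scanning a grid of population sizes with this function and taking the first size at which the estimated power reaches 0.8 mirrors the search described above; the number of replications controls the Monte Carlo error of each power estimate.</p>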
            </sec>
            <sec>
                <title>Ethical statement</title>
                <p>These research activities used no identifiable private information and were therefore exempt from institutional review board review.</p>
            </sec>
        </sec>
        <sec sec-type="results">
            <title>Results</title>
            <p>The survey data contained 40,552 respondents who were tested due to workplace requirements during the study period: 13,747 HCWs and 26,805 non-HCWs (see 
                <xref ref-type="table" rid="T1">Table 1</xref> for demographic details). There were 1,993 respondents who reported a positive test for COVID-19 in the last 14 days (527 among HCWs and 1,466 among non-HCWs).</p>
            <table-wrap id="T1" orientation="portrait" position="anchor">
                <label>Table 1. </label>
                <caption>
                    <title>Characteristics of survey respondents.</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="center" colspan="1" rowspan="1"/>
                            <th align="center" colspan="2" rowspan="1">Non-healthcare workers</th>
                            <th align="center" colspan="2" rowspan="1">Healthcare workers</th>
                        </tr>
                        <tr>
                            <th align="right" colspan="1" rowspan="1"/>
                            <th align="right" colspan="1" rowspan="1">n</th>
                            <th align="center" colspan="1" rowspan="1">(%)</th>
                            <th align="right" colspan="1" rowspan="1">n</th>
                            <th align="center" colspan="1" rowspan="1">(%)</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <bold>Total</bold>
                            </td>
                            <td align="right" colspan="1" rowspan="1"> 1,672,980</td>
                            <td align="center" colspan="1" rowspan="1"> 100.0</td>
                            <td align="right" colspan="1" rowspan="1"> 115,814</td>
                            <td align="center" colspan="1" rowspan="1"> 100.0</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <bold>Tested in last 14 days</bold>
                            </td>
                            <td align="right" colspan="1" rowspan="1"> 123,830</td>
                            <td align="center" colspan="1" rowspan="1"> 7.4</td>
                            <td align="right" colspan="1" rowspan="1"> 21,071</td>
                            <td align="center" colspan="1" rowspan="1"> 18.2</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <bold>Test required by work or school</bold>
                            </td>
                            <td align="right" colspan="1" rowspan="1"> 26,805</td>
                            <td align="center" colspan="1" rowspan="1"> 1.6</td>
                            <td align="right" colspan="1" rowspan="1"> 13,747</td>
                            <td align="center" colspan="1" rowspan="1"> 11.9</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <bold>Among those with required test</bold>
                            </td>
                            <td align="right" colspan="1" rowspan="1"/>
                            <td align="right" colspan="1" rowspan="1"/>
                            <td align="right" colspan="1" rowspan="1"/>
                            <td align="right" colspan="1" rowspan="1"/>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <bold>Male gender</bold>
                            </td>
                            <td align="right" colspan="1" rowspan="1"> 8,662</td>
                            <td align="center" colspan="1" rowspan="1"> 32.3</td>
                            <td align="right" colspan="1" rowspan="1"> 1,972</td>
                            <td align="center" colspan="1" rowspan="1"> 14.3</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <bold>Age in years</bold>
                            </td>
                            <td align="right" colspan="1" rowspan="1"/>
                            <td align="right" colspan="1" rowspan="1"/>
                            <td align="right" colspan="1" rowspan="1"/>
                            <td align="right" colspan="1" rowspan="1"/>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <bold>18 to 24</bold>
                            </td>
                            <td align="right" colspan="1" rowspan="1"> 3,356</td>
                            <td align="center" colspan="1" rowspan="1"> 12.5</td>
                            <td align="right" colspan="1" rowspan="1"> 761</td>
                            <td align="center" colspan="1" rowspan="1"> 5.5</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <bold>25 to 34</bold>
                            </td>
                            <td align="right" colspan="1" rowspan="1"> 4,648</td>
                            <td align="center" colspan="1" rowspan="1"> 17.3</td>
                            <td align="right" colspan="1" rowspan="1"> 2,374</td>
                            <td align="center" colspan="1" rowspan="1"> 17.3</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <bold>35 to 44</bold>
                            </td>
                            <td align="right" colspan="1" rowspan="1"> 4,784</td>
                            <td align="center" colspan="1" rowspan="1"> 17.8</td>
                            <td align="right" colspan="1" rowspan="1"> 3,058</td>
                            <td align="center" colspan="1" rowspan="1"> 22.2</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <bold>45 to 54</bold>
                            </td>
                            <td align="right" colspan="1" rowspan="1"> 4,797</td>
                            <td align="center" colspan="1" rowspan="1"> 17.9</td>
                            <td align="right" colspan="1" rowspan="1"> 3,377</td>
                            <td align="center" colspan="1" rowspan="1"> 24.6</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <bold>55 to 64</bold>
                            </td>
                            <td align="right" colspan="1" rowspan="1"> 3,983</td>
                            <td align="center" colspan="1" rowspan="1"> 14.9</td>
                            <td align="right" colspan="1" rowspan="1"> 3,141</td>
                            <td align="center" colspan="1" rowspan="1"> 22.8</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <bold>65 to 74</bold>
                            </td>
                            <td align="right" colspan="1" rowspan="1"> 1,204</td>
                            <td align="center" colspan="1" rowspan="1"> 4.5</td>
                            <td align="right" colspan="1" rowspan="1"> 920</td>
                            <td align="center" colspan="1" rowspan="1"> 6.7</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1">
                                <bold>75 and older</bold>
                            </td>
                            <td align="right" colspan="1" rowspan="1"> 476</td>
                            <td align="center" colspan="1" rowspan="1"> 1.8</td>
                            <td align="right" colspan="1" rowspan="1"> 105</td>
                            <td align="center" colspan="1" rowspan="1"> 0.8</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
            <p>Among HCWs with a required test, 527 of 13,747 (3.8%) reported a positive test in the last 14 days, while among non-HCWs with a required test, 1,466 of 26,805 (5.5%) reported a positive test, for a relative COVID-19 incidence ratio (RR) of 0.7 (95% UI 0.6 to 0.8) (
                <xref ref-type="table" rid="T2">Table 2</xref>).</p>
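            <p>As a quick arithmetic check, the percentages quoted above can be reproduced from the reported counts. The snippet below computes only a crude ratio of those counts; it is not the survey-weighted estimate or its bootstrap interval, and the variable names are ours.</p>
            <p>
                <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
```python
# Counts reported in the text for respondents with a required test.
hcw_pos, hcw_tested = 527, 13_747
non_pos, non_tested = 1_466, 26_805

er_hcw = hcw_pos / hcw_tested  # share of HCWs reporting a positive test
er_non = non_pos / non_tested  # share of non-HCWs reporting a positive test
crude_rr = er_hcw / er_non     # crude (unweighted) ratio of the two shares

print(f"{100 * er_hcw:.1f}% vs {100 * er_non:.1f}%, crude ratio {crude_rr:.2f}")
# prints: 3.8% vs 5.5%, crude ratio 0.70
```
</preformat>
            </p>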
            <table-wrap id="T2" orientation="portrait" position="anchor">
                <label>Table 2. </label>
                <caption>
                    <title>Relative COVID-19 incidence rate (RR) and counts of healthcare workers and non-healthcare workers and their crude prevalence counts and rates.</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="center" colspan="3" rowspan="1">Healthcare workers</th>
                            <th align="center" colspan="3" rowspan="1">Non-healthcare workers</th>
                            <th align="center" colspan="2" rowspan="1"/>
                        </tr>
                        <tr>
                            <th align="center" colspan="1" rowspan="1">Tested</th>
                            <th align="center" colspan="1" rowspan="1">Positive</th>
                            <th align="center" colspan="1" rowspan="1">%</th>
                            <th align="center" colspan="1" rowspan="1">Tested</th>
                            <th align="center" colspan="1" rowspan="1">Positive</th>
                            <th align="center" colspan="1" rowspan="1">%</th>
                            <th align="right" colspan="1" rowspan="1">RR</th>
                            <th align="right" colspan="1" rowspan="1">95% UI</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="center" colspan="1" rowspan="1">13,747</td>
                            <td align="center" colspan="1" rowspan="1">527</td>
                            <td align="center" colspan="1" rowspan="1">3.8</td>
                            <td align="center" colspan="1" rowspan="1">26,805</td>
                            <td align="center" colspan="1" rowspan="1">1,466</td>
                            <td align="center" colspan="1" rowspan="1">5.5</td>
                            <td align="center" colspan="1" rowspan="1">0.7</td>
                            <td align="center" colspan="1" rowspan="1">0.6 to 0.8</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
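            <p>As a rough cross-check on Table 2, the ratio of the two crude positivity rates can be recomputed directly from the counts. The sketch below does this and adds a log-scale Wald interval, a standard textbook approximation that is assumed here for illustration only. It ignores the survey weights and is not the procedure used to produce the RR and 95% UI in the table; the function name relative_risk is hypothetical.</p>
            <code language="python"><![CDATA[
```python
import math


def relative_risk(pos_exposed, n_exposed, pos_reference, n_reference, z=1.96):
    """Crude risk ratio with a log-scale Wald interval (illustrative only)."""
    risk_exposed = pos_exposed / n_exposed
    risk_reference = pos_reference / n_reference
    rr = risk_exposed / risk_reference
    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(
        1 / pos_exposed - 1 / n_exposed + 1 / pos_reference - 1 / n_reference
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Counts from Table 2: HCWs (527 of 13,747) and non-HCWs (1,466 of 26,805).
rr, lower, upper = relative_risk(527, 13747, 1466, 26805)
print(f"RR = {rr:.2f} ({lower:.2f} to {upper:.2f})")
```
]]></code>
            <p>Under these assumptions the crude ratio is about 0.70, with an approximate interval of roughly 0.64 to 0.77, which agrees with the RR column of Table 2 after rounding.</p>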
            <p>Our power calculation simulation showed that 7,000 simulants provide 80% power to reject the null hypothesis that HCWs and non-HCWs have the same risk (an RR of 1.0) if, in truth, the RR is 0.7. Since the survey currently collects responses from around 7,000 individuals per week who report taking a required COVID-19 test, the simulation results imply that six weeks of data provide more than sufficient power.</p>
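            <p>A power simulation of this general kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulation code used for the study: the HCW share (0.34) and the baseline positivity among non-HCWs (0.055) are read off Table 2, the test is a two-sided z-test on log(RR) at the 5% level, and the function name simulate_power and its parameters are hypothetical.</p>
            <code language="python"><![CDATA[
```python
import math
import random


def simulate_power(n_total, true_rr, hcw_share=0.34, base_risk=0.055,
                   n_sims=400, seed=12345):
    """Share of simulated surveys in which a two-sided z-test on log(RR)
    rejects the null hypothesis RR = 1 at the 5% level."""
    rng = random.Random(seed)
    n_hcw = round(n_total * hcw_share)
    n_other = n_total - n_hcw
    rejections = 0
    for _ in range(n_sims):
        # Number of positive tests in each group (sum of Bernoulli draws).
        pos_hcw = sum(rng.random() < base_risk * true_rr for _ in range(n_hcw))
        pos_other = sum(rng.random() < base_risk for _ in range(n_other))
        if pos_hcw == 0 or pos_other == 0:
            continue  # log(RR) is undefined; counted as a failure to reject
        log_rr = math.log((pos_hcw / n_hcw) / (pos_other / n_other))
        se = math.sqrt(1 / pos_hcw - 1 / n_hcw + 1 / pos_other - 1 / n_other)
        if abs(log_rr) / se > 1.96:
            rejections += 1
    return rejections / n_sims


power = simulate_power(n_total=7000, true_rr=0.7)
print(f"Estimated power with 7,000 simulants: {power:.2f}")
```
]]></code>
            <p>The estimate depends on the assumed HCW share, baseline positivity, and choice of test, so this sketch should not be expected to reproduce the 80% figure exactly.</p>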
            <sec>
                <title>Sensitivity analyses</title>
                <p>When we repeated our calculation using the unweighted survey responses to calculate the COVID-19 incidence ratio, we found an even smaller relative incidence ratio of 0.4 (95% UI 0.3 to 0.5).</p>
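                <p>The difference between a weighted and an unweighted estimate can be illustrated with a small sketch. The records below are made-up toy respondents, not survey data, and the field names (hcw, positive, weight) and function names are hypothetical; the point is only that the same responses can yield different ratios once each respondent counts in proportion to a survey weight.</p>
                <code language="python"><![CDATA[
```python
def positivity(records, weighted):
    """Share of positive tests, optionally weighting each respondent."""
    total = sum(r["weight"] if weighted else 1.0 for r in records)
    positive = sum(
        (r["weight"] if weighted else 1.0) for r in records if r["positive"]
    )
    return positive / total


def relative_risk(records, weighted=True):
    """Ratio of HCW positivity to non-HCW positivity."""
    hcw = [r for r in records if r["hcw"]]
    other = [r for r in records if not r["hcw"]]
    return positivity(hcw, weighted) / positivity(other, weighted)


# Made-up toy respondents, for illustration only (not survey data).
toy = [
    {"hcw": True, "positive": True, "weight": 3.0},
    {"hcw": True, "positive": False, "weight": 1.0},
    {"hcw": True, "positive": False, "weight": 1.0},
    {"hcw": True, "positive": False, "weight": 1.0},
    {"hcw": False, "positive": True, "weight": 1.0},
    {"hcw": False, "positive": True, "weight": 1.0},
    {"hcw": False, "positive": False, "weight": 2.0},
    {"hcw": False, "positive": False, "weight": 2.0},
]
unweighted_rr = relative_risk(toy, weighted=False)
weighted_rr = relative_risk(toy, weighted=True)
print(unweighted_rr, weighted_rr)
```
]]></code>
                <p>In this toy example the unweighted ratio is smaller than the weighted one because the single positive HCW carries a large weight; the direction and size of any such shift in real data depend entirely on how the weights are distributed.</p>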
                <p>When we repeated our analysis restricted to only specific types of HCWs, as afforded by the questionnaire, we found a range of risks, usually less than 1.0, with substantially less certainty due to small sample sizes (
                    <xref ref-type="table" rid="T3">Table 3</xref>).</p>
                <table-wrap id="T3" orientation="portrait" position="anchor">
                    <label>Table 3. </label>
                    <caption>
                        <title>Relative COVID-19 incidence rate (RR) and counts of healthcare workers (HCWs) and non-healthcare workers stratified by worker subtype.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="center" colspan="1" rowspan="1"/>
                                <th align="center" colspan="1" rowspan="1">Number of
                                    <break/>non-HCWs</th>
                                <th align="center" colspan="1" rowspan="1">Number of
                                    <break/>HCWs</th>
                                <th align="center" colspan="1" rowspan="1">Relative
                                    <break/>risk</th>
                                <th align="center" colspan="1" rowspan="1">Lower
                                    <break/>bound</th>
                                <th align="center" colspan="1" rowspan="1">Upper
                                    <break/>bound</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">All HCWs</td>
                                <td align="right" colspan="1" rowspan="1">26,805</td>
                                <td align="right" colspan="1" rowspan="1">13,747</td>
                                <td align="right" colspan="1" rowspan="1">0.7</td>
                                <td align="right" colspan="1" rowspan="1">0.6</td>
                                <td align="right" colspan="1" rowspan="1">0.8</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Physician or surgeon</td>
                                <td align="right" colspan="1" rowspan="1">40,277</td>
                                <td align="right" colspan="1" rowspan="1">275</td>
                                <td align="right" colspan="1" rowspan="1">2.6</td>
                                <td align="right" colspan="1" rowspan="1">1.8</td>
                                <td align="right" colspan="1" rowspan="1">3.5</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Registered nurse (including nurse
                                    <break/>practitioner)</td>
                                <td align="right" colspan="1" rowspan="1">37,573</td>
                                <td align="right" colspan="1" rowspan="1">2,979</td>
                                <td align="right" colspan="1" rowspan="1">0.6</td>
                                <td align="right" colspan="1" rowspan="1">0.6</td>
                                <td align="right" colspan="1" rowspan="1">0.8</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Licensed practical or licensed vocational
                                    <break/>nurse</td>
                                <td align="right" colspan="1" rowspan="1">38,560</td>
                                <td align="right" colspan="1" rowspan="1">1,992</td>
                                <td align="right" colspan="1" rowspan="1">0.6</td>
                                <td align="right" colspan="1" rowspan="1">0.5</td>
                                <td align="right" colspan="1" rowspan="1">0.8</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Physician assistant</td>
                                <td align="right" colspan="1" rowspan="1">40,405</td>
                                <td align="right" colspan="1" rowspan="1">147</td>
                                <td align="right" colspan="1" rowspan="1">0.7</td>
                                <td align="right" colspan="1" rowspan="1">0.4</td>
                                <td align="right" colspan="1" rowspan="1">1.3</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Dentist</td>
                                <td align="right" colspan="1" rowspan="1">40,518</td>
                                <td align="right" colspan="1" rowspan="1">34</td>
                                <td align="right" colspan="1" rowspan="1">0.4</td>
                                <td align="right" colspan="1" rowspan="1">0.0</td>
                                <td align="right" colspan="1" rowspan="1">0.8</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Any other treating practitioner</td>
                                <td align="right" colspan="1" rowspan="1">40,189</td>
                                <td align="right" colspan="1" rowspan="1">363</td>
                                <td align="right" colspan="1" rowspan="1">0.5</td>
                                <td align="right" colspan="1" rowspan="1">0.3</td>
                                <td align="right" colspan="1" rowspan="1">0.9</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Pharmacist</td>
                                <td align="right" colspan="1" rowspan="1">40,473</td>
                                <td align="right" colspan="1" rowspan="1">79</td>
                                <td align="right" colspan="1" rowspan="1">0.3</td>
                                <td align="right" colspan="1" rowspan="1">0.1</td>
                                <td align="right" colspan="1" rowspan="1">0.8</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Any therapist</td>
                                <td align="right" colspan="1" rowspan="1">39,371</td>
                                <td align="right" colspan="1" rowspan="1">1,181</td>
                                <td align="right" colspan="1" rowspan="1">0.5</td>
                                <td align="right" colspan="1" rowspan="1">0.4</td>
                                <td align="right" colspan="1" rowspan="1">0.7</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Any health technologist or technician</td>
                                <td align="right" colspan="1" rowspan="1">39,062</td>
                                <td align="right" colspan="1" rowspan="1">1,490</td>
                                <td align="right" colspan="1" rowspan="1">1.0</td>
                                <td align="right" colspan="1" rowspan="1">0.7</td>
                                <td align="right" colspan="1" rowspan="1">1.2</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Veterinarian</td>
                                <td align="right" colspan="1" rowspan="1">40,519</td>
                                <td align="right" colspan="1" rowspan="1">33</td>
                                <td align="right" colspan="1" rowspan="1">0.3</td>
                                <td align="right" colspan="1" rowspan="1">0.0</td>
                                <td align="right" colspan="1" rowspan="1">1.1</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Nursing assistant or psychiatric aide</td>
                                <td align="right" colspan="1" rowspan="1">39,045</td>
                                <td align="right" colspan="1" rowspan="1">1,507</td>
                                <td align="right" colspan="1" rowspan="1">1.0</td>
                                <td align="right" colspan="1" rowspan="1">0.8</td>
                                <td align="right" colspan="1" rowspan="1">1.3</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Home health or personal care aide</td>
                                <td align="right" colspan="1" rowspan="1">39,999</td>
                                <td align="right" colspan="1" rowspan="1">553</td>
                                <td align="right" colspan="1" rowspan="1">0.8</td>
                                <td align="right" colspan="1" rowspan="1">0.5</td>
                                <td align="right" colspan="1" rowspan="1">1.0</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Occupational or physical therapy
                                    <break/>assistant or aide</td>
                                <td align="right" colspan="1" rowspan="1">40,477</td>
                                <td align="right" colspan="1" rowspan="1">75</td>
                                <td align="right" colspan="1" rowspan="1">1.3</td>
                                <td align="right" colspan="1" rowspan="1">0.5</td>
                                <td align="right" colspan="1" rowspan="1">1.9</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Massage therapist</td>
                                <td align="right" colspan="1" rowspan="1">40,549</td>
                                <td align="right" colspan="1" rowspan="1">3</td>
                                <td align="right" colspan="1" rowspan="1">4.6</td>
                                <td align="right" colspan="1" rowspan="1">0.0</td>
                                <td align="right" colspan="1" rowspan="1">8.1</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Dental assistant</td>
                                <td align="right" colspan="1" rowspan="1">40,534</td>
                                <td align="right" colspan="1" rowspan="1">18</td>
                                <td align="right" colspan="1" rowspan="1">0.0</td>
                                <td align="right" colspan="1" rowspan="1">0.0</td>
                                <td align="right" colspan="1" rowspan="1">0.0</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Medical assistant</td>
                                <td align="right" colspan="1" rowspan="1">40,415</td>
                                <td align="right" colspan="1" rowspan="1">137</td>
                                <td align="right" colspan="1" rowspan="1">1.1</td>
                                <td align="right" colspan="1" rowspan="1">0.5</td>
                                <td align="right" colspan="1" rowspan="1">1.7</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Medical transcriptionist</td>
                                <td align="right" colspan="1" rowspan="1">40,526</td>
                                <td align="right" colspan="1" rowspan="1">26</td>
                                <td align="right" colspan="1" rowspan="1">0.6</td>
                                <td align="right" colspan="1" rowspan="1">0.0</td>
                                <td align="right" colspan="1" rowspan="1">1.5</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Pharmacy aide</td>
                                <td align="right" colspan="1" rowspan="1">40,536</td>
                                <td align="right" colspan="1" rowspan="1">16</td>
                                <td align="right" colspan="1" rowspan="1">0.0</td>
                                <td align="right" colspan="1" rowspan="1">0.0</td>
                                <td align="right" colspan="1" rowspan="1">0.0</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Phlebotomist</td>
                                <td align="right" colspan="1" rowspan="1">40,524</td>
                                <td align="right" colspan="1" rowspan="1">28</td>
                                <td align="right" colspan="1" rowspan="1">3.4</td>
                                <td align="right" colspan="1" rowspan="1">0.7</td>
                                <td align="right" colspan="1" rowspan="1">4.8</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Veterinary assistant</td>
                                <td align="right" colspan="1" rowspan="1">40,547</td>
                                <td align="right" colspan="1" rowspan="1">5</td>
                                <td align="right" colspan="1" rowspan="1">3.4</td>
                                <td align="right" colspan="1" rowspan="1">0.0</td>
                                <td align="right" colspan="1" rowspan="1">12.0</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Any other healthcare support worker</td>
                                <td align="right" colspan="1" rowspan="1">38,379</td>
                                <td align="right" colspan="1" rowspan="1">2,173</td>
                                <td align="right" colspan="1" rowspan="1">0.5</td>
                                <td align="right" colspan="1" rowspan="1">0.4</td>
                                <td align="right" colspan="1" rowspan="1">0.6</td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
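                <p>A simple consistency check on Table 3 is to add the two count columns in each row. The sketch below copies the counts from the table and collects the row totals. One reading of the result, which the text does not confirm, is that in the subtype rows the comparison column counts all tested respondents outside that subtype rather than non-HCWs only.</p>
                <code language="python"><![CDATA[
```python
# (row label, comparison-group count, subtype count), copied from Table 3.
rows = [
    ("All HCWs", 26805, 13747),
    ("Physician or surgeon", 40277, 275),
    ("Registered nurse (including nurse practitioner)", 37573, 2979),
    ("Licensed practical or licensed vocational nurse", 38560, 1992),
    ("Physician assistant", 40405, 147),
    ("Dentist", 40518, 34),
    ("Any other treating practitioner", 40189, 363),
    ("Pharmacist", 40473, 79),
    ("Any therapist", 39371, 1181),
    ("Any health technologist or technician", 39062, 1490),
    ("Veterinarian", 40519, 33),
    ("Nursing assistant or psychiatric aide", 39045, 1507),
    ("Home health or personal care aide", 39999, 553),
    ("Occupational or physical therapy assistant or aide", 40477, 75),
    ("Massage therapist", 40549, 3),
    ("Dental assistant", 40534, 18),
    ("Medical assistant", 40415, 137),
    ("Medical transcriptionist", 40526, 26),
    ("Pharmacy aide", 40536, 16),
    ("Phlebotomist", 40524, 28),
    ("Veterinary assistant", 40547, 5),
    ("Any other healthcare support worker", 38379, 2173),
]

# Set of distinct row totals; a single value means every row sums the same.
totals = {comparison + subtype for _, comparison, subtype in rows}
print(totals)
```
]]></code>
                <p>Every row sums to 40,552, the total number of respondents in Table 2 (13,747 HCWs plus 26,805 non-HCWs).</p>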
                <p>When we used only those individuals who were tested because of a workplace requirement 
                    <italic toggle="yes">and</italic> did not feel sick, we obtained a relative risk closer to 1.0. Using only those tested because of a workplace requirement who also 
                    <italic toggle="yes">did</italic> feel sick, we still obtained a relative risk substantially smaller than 1.0 (
                    <xref ref-type="table" rid="T4">Table 4</xref>). Although this finding could suggest that differences in testing patterns between healthcare and other work settings are partially responsible for the different positivity rates among HCWs and non-HCWs, it could also be driven by greater access to COVID-19 testing for confirmation of illness among HCWs experiencing symptoms. The recall period of 14 days provides ample time for an individual to receive a workplace test without symptoms, then develop symptoms, and then receive another test to determine if the symptoms are due to COVID-19, and HCWs might have more opportunity to access such a follow-up test, since they are visiting a healthcare setting for work already.</p>
                <table-wrap id="T4" orientation="portrait" position="anchor">
                    <label>Table 4. </label>
                    <caption>
                        <title>Relative COVID-19 incidence rate (RR) and counts of healthcare workers and non-healthcare workers stratified by those who reported they felt/did not feel sick as an additional reason for getting tested.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="right" colspan="1" rowspan="1"/>
                                <th align="right" colspan="1" rowspan="1">Number of non-HCWs</th>
                                <th align="right" colspan="1" rowspan="1">Number of HCWs</th>
                                <th align="right" colspan="1" rowspan="1">Relative risk</th>
                                <th align="right" colspan="1" rowspan="1">Lower bound</th>
                                <th align="right" colspan="1" rowspan="1">Upper bound</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="right" colspan="1" rowspan="1">Test required, did not feel sick</td>
                                <td align="right" colspan="1" rowspan="1">23,523</td>
                                <td align="right" colspan="1" rowspan="1">12,789</td>
                                <td align="right" colspan="1" rowspan="1">1.1</td>
                                <td align="right" colspan="1" rowspan="1">1.0</td>
                                <td align="right" colspan="1" rowspan="1">1.2</td>
                            </tr>
                            <tr>
                                <td align="right" colspan="1" rowspan="1">Test required, felt sick</td>
                                <td align="right" colspan="1" rowspan="1">3,282</td>
                                <td align="right" colspan="1" rowspan="1">958</td>
                                <td align="right" colspan="1" rowspan="1">0.8</td>
                                <td align="right" colspan="1" rowspan="1">0.7</td>
                                <td align="right" colspan="1" rowspan="1">0.9</td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
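                <p>The counts in Table 4 also show how often each group reported feeling sick as an additional reason for testing. The short sketch below computes that share from the table; it is descriptive arithmetic on the published counts only and does not use the survey weights.</p>
                <code language="python"><![CDATA[
```python
# Counts from Table 4: respondents with a required test, by group and
# whether they also reported feeling sick.
counts = {
    "non-HCW": {"not_sick": 23523, "sick": 3282},
    "HCW": {"not_sick": 12789, "sick": 958},
}

# Share of each group that reported feeling sick.
sick_share = {
    group: c["sick"] / (c["sick"] + c["not_sick"])
    for group, c in counts.items()
}
for group, share in sick_share.items():
    print(f"{group}: {share:.1%} reported feeling sick")
```
]]></code>
                <p>By this calculation about 12.2% of non-HCWs and about 7.0% of HCWs with a required test also reported feeling sick, so the two groups differ in their mix of symptomatic and asymptomatic testing.</p>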
            </sec>
        </sec>
        <sec sec-type="discussion">
            <title>Discussion</title>
            <p>This study used a population-based approach to examine the relative risk of COVID-19 infection among HCWs compared with non-HCWs. Finding a relative COVID-19 incidence ratio substantially and significantly less than 1.0 is an unequivocally positive finding, indicating that the infection control measures being taken by HCWs are, overall, effective.</p>
            <p>Our findings are consistent with the limited other evidence available on the risk of COVID-19 in healthcare facility settings
                <sup>
                    <xref ref-type="bibr" rid="ref-12">12</xref>&#x2013;
                    <xref ref-type="bibr" rid="ref-15">15</xref>
                </sup>, and, taken together, this growing body of evidence suggests that providing and seeking healthcare at this point in the epidemic is quite safe. HCWs need not fear contracting or transmitting infections more than other workers do, and patients should not defer needed care at present over concern that they will be exposed to COVID-19 during their interactions with HCWs.</p>
            <p>This outbreak and our understanding of it have both changed rapidly in the past, and may do so again, so we will continue to update this information.</p>
            <sec>
                <title>Limitations</title>
                <p>This work has at least three limitations. First, our results are based on self-reported data and are therefore subject to both recall bias and social desirability bias, although the questions we relied on did not seem particularly at risk for either of these biases. The question &#x201c;have you been tested for COVID-19 in the last 14 days?&#x201d; likely included positive responses from individuals who received seroprevalence testing as well as PCR testing, which could also introduce a small amount of bias. Second, our approach required a large sample size to obtain a sufficiently precise estimate of RR, but this seems safer than including respondents who did not report receiving a required test, as that could introduce confounding. Third, it is possible that there was still uncontrolled confounding due to differential access to tests between HCWs and non-HCWs. Our sensitivity analysis found substantively similar results when restricted only to individuals who had workplace testing when they did not feel sick, but since we have only considered respondents with tests required by their employer or school, this might focus on non-HCW settings with better-than-average infection control policies (for example, they 
                    <italic toggle="yes">are</italic> doing asymptomatic testing) and therefore the relative risk for HCWs might be even lower than our method estimated.</p>
            </sec>
        </sec>
        <sec sec-type="conclusions">
            <title>Conclusion</title>
            <p>As of October 2020, the relative infection ratio of HCWs to non-HCWs in the United States is reassuringly low. Infection control remains essential, and HCWs must continue to be protected as the COVID-19 pandemic continues, to ensure their own safety and that of their co-workers and patients.</p>
        </sec>
        <sec>
            <title>Data availability</title>
            <sec>
                <title>Underlying data</title>
                <p>The underlying data used in this study are available to academic researchers for research purposes from Facebook at: 
                    <ext-link ext-link-type="uri" xlink:href="https://www.facebook.com/research-operations/rfp/?title=covid19-symptom-survey-data-access">https://www.facebook.com/research-operations/rfp/?title=covid19-symptom-survey-data-access</ext-link>. Conditions of access and instructions for applications can be found at 
                    <ext-link ext-link-type="uri" xlink:href="https://dataforgood.fb.com/docs/covid-19-symptom-survey-request-for-data-access/">https://dataforgood.fb.com/docs/covid-19-symptom-survey-request-for-data-access/</ext-link>.</p>
            </sec>
        </sec>
        <sec>
            <title>Code availability</title>
            <p>Reproducibility code available from: 
                <ext-link ext-link-type="uri" xlink:href="https://github.com/aflaxman/covid_hcw_rr">https://github.com/aflaxman/covid_hcw_rr</ext-link>
            </p>
            <p>Archived code at time of publication: 
                <ext-link ext-link-type="uri" xlink:href="http://doi.org/10.5281/zenodo.4270368">http://doi.org/10.5281/zenodo.4270368</ext-link>
                <sup>
                    <xref ref-type="bibr" rid="ref-16">16</xref>
                </sup>.</p>
            <p>License: 
                <ext-link ext-link-type="uri" xlink:href="https://opensource.org/licenses/GPL-3.0">GNU General Public License v3.0</ext-link> </p>
        </sec>
    </body>
    <back>
        <ref-list>
            <ref id="ref-1">
                <label>1</label>
                <mixed-citation publication-type="journal">
                    <article-title>How have healthcare utilization and spending changed so far during the coronavirus pandemic?</article-title>Peterson-KFF Health System Tracker. [cited 2020 Oct 21].
                    <ext-link ext-link-type="uri" xlink:href="https://www.healthsystemtracker.org/chart-collection/how-have-healthcare-utilization-and-spending-changed-so-far-during-the-coronavirus-pandemic/">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-2">
                <label>2</label>
                <mixed-citation publication-type="journal">
                    <article-title>COVID-19 Effects On Care Volumes: What They Might Mean And How We Might Respond</article-title>. [cited 2020 Oct 21].
                    <pub-id pub-id-type="doi">10.1377/hblog20200702.788062/full</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-3">
                <label>3</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Alexander</surname>
                            <given-names>GC</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Tajanlangit</surname>
                            <given-names>M</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Heyward</surname>
                            <given-names>J</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Use and Content of Primary Care Office-Based vs Telemedicine Care Visits During the COVID-19 Pandemic in the US.</article-title>
                    <source>
						
                        <italic toggle="yes">JAMA Netw Open.</italic>
					</source>
                    <year>2020</year>;<volume>3</volume>(<issue>10</issue>):<fpage>e2021476</fpage>.
                    <pub-id pub-id-type="pmid">33006622</pub-id>
                    <pub-id pub-id-type="doi">10.1001/jamanetworkopen.2020.21476</pub-id>
                    <pub-id pub-id-type="pmcid">7532385</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-4">
                <label>4</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Evans</surname>
                            <given-names>M</given-names>
                        </name>
					</person-group>:
                    <article-title>WSJ News Exclusive | Hospitals Failed to Fully Contain Covid-19 Inside Their Walls.</article-title>
                    <source>
						
                        <italic toggle="yes">Wall Street Journal.</italic>
					</source>
                    <year>2020</year>[cited 2020 Oct 21].
                    <ext-link ext-link-type="uri" xlink:href="https://www.wsj.com/articles/hospitals-failed-to-fully-contain-covid-19-inside-their-walls-11600176536">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-5">
                <label>5</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Jewett</surname>
                            <given-names>RL</given-names>
                        </name>
					</person-group>:
                    <article-title>Battle rages inside US hospitals over how Covid-19 strikes and kills.</article-title>
                    <source>
						
                        <italic toggle="yes">Guardian.</italic>
					</source>
                    <year>2020</year>[cited 2020 Oct 21].
                    <ext-link ext-link-type="uri" xlink:href="http://www.theguardian.com/world/2020/sep/23/us-hospitals-coronavirus-battle-cdc">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-6">
                <label>6</label>
                <mixed-citation publication-type="journal">
                    <article-title>COVID-19 Symptom Surveys through Facebook  The Delphi Blog</article-title>. [cited 2020 Oct 21].
                    <ext-link ext-link-type="uri" xlink:href="https://delphi.cmu.edu/blog/2020/08/26/covid-19-symptom-surveys-through-facebook/">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-7">
                <label>7</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Barkay</surname>
                            <given-names>N</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Cobb</surname>
                            <given-names>C</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Eilat</surname>
                            <given-names>R</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Weights and Methodology Brief for the COVID-19 Symptom Survey by University of Maryland and Carnegie Mellon University, in Partnership with Facebook.</article-title>
                    <source>
						
                        <italic toggle="yes">ArXiv200914675 Cs.</italic>
					</source>
                    <year>2020</year>[cited 2020 Oct 21].
                    <ext-link ext-link-type="uri" xlink:href="http://arxiv.org/abs/2009.14675">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-8">
                <label>8</label>
                <mixed-citation publication-type="journal">
                    <article-title>Data for Good: New Tools to Help Health Researchers Track and Combat COVID-19</article-title>. About Facebook. 2020 [cited 2020 Oct 21].
                    <ext-link ext-link-type="uri" xlink:href="https://about.fb.com/news/2020/04/data-for-good/">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-9">
                <label>9</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>PW</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Lu</surname>
                            <given-names>WH</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Ko</surname>
                            <given-names>NY</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>COVID-19-Related Information Sources and the Relationship With Confidence in People Coping with COVID-19: Facebook Survey Study in Taiwan.</article-title>
                    <source>
						
                        <italic toggle="yes">J Med Internet Res.</italic>
					</source>
                    <year>2020</year>;<volume>22</volume>(<issue>6</issue>):<fpage>e20021</fpage>.
                    <pub-id pub-id-type="pmid">32490839</pub-id>
                    <pub-id pub-id-type="doi">10.2196/20021</pub-id>
                    <pub-id pub-id-type="pmcid">7279044</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-10">
                <label>10</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Srivastav</surname>
                            <given-names>AK</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Sharma</surname>
                            <given-names>N</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Samuel</surname>
                            <given-names>AJ</given-names>
                        </name>
					</person-group>:
                    <article-title>Impact of Coronavirus disease-19 (COVID-19) lockdown on physical activity and energy expenditure among physiotherapy professionals and students using web-based open E-survey sent through WhatsApp, Facebook and Instagram messengers.</article-title>
                    <source>
						
                        <italic toggle="yes">Clin Epidemiol Glob Health.</italic>
					</source>
                    <year>2020</year>.
                    <pub-id pub-id-type="pmid">32838062</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.cegh.2020.07.003</pub-id>
                    <pub-id pub-id-type="pmcid">7358172</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-11">
                <label>11</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Efron</surname>
                            <given-names>B</given-names>
                        </name>
					</person-group>:
                    <article-title>Bootstrap Methods: Another Look at the Jackknife.</article-title>
                    <source>
						
                        <italic toggle="yes">Ann Stat.</italic>
					</source>
                    <year>1979</year>;<volume>7</volume>(<issue>1</issue>):<fpage>1</fpage>&#x2013;<lpage>26</lpage>.
                    <ext-link ext-link-type="uri" xlink:href="https://www.jstor.org/stable/2958830?seq=1">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-12">
                <label>12</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Nalleballe</surname>
                            <given-names>K</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Siddamreddy</surname>
                            <given-names>S</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Kovvuru</surname>
                            <given-names>S</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Risk of COVID-19 from hospital admission during the pandemic.</article-title>
                    <source>
						
                        <italic toggle="yes">Infect Control Hosp Epidemiol.</italic>
					</source> 1&#x2013;7.
                    <pub-id pub-id-type="pmid">33028457</pub-id>
                    <pub-id pub-id-type="doi">10.1017/ice.2020.1249</pub-id>
                    <pub-id pub-id-type="pmcid">7578623</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-13">
                <label>13</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Ridgway</surname>
                            <given-names>JP</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Robicsek</surname>
                            <given-names>AA</given-names>
                        </name>
					</person-group>:
                    <article-title>Risk of coronavirus disease 2019 (COVID-19) acquisition among emergency department patients: A retrospective case control study.</article-title>
                    <source>
						
                        <italic toggle="yes">Infect Control Hosp Epidemiol.</italic>
					</source>
                    <year>2020</year>; 1&#x2013;3.
                    <pub-id pub-id-type="pmid">32962781</pub-id>
                    <pub-id pub-id-type="doi">10.1017/ice.2020.1224</pub-id>
                    <pub-id pub-id-type="pmcid">7542312</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-14">
                <label>14</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Reale</surname>
                            <given-names>SC</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Fields</surname>
                            <given-names>KG</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Lumbreras-Marquez</surname>
                            <given-names>MI</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Association Between Number of In-Person Health Care Visits and SARS-CoV-2 Infection in Obstetrical Patients.</article-title>
                    <source>
						
                        <italic toggle="yes">JAMA.</italic>
					</source>
                    <year>2020</year>;<volume>324</volume>(<issue>12</issue>):<fpage>1210</fpage>&#x2013;<lpage>1212</lpage>.
                    <pub-id pub-id-type="pmid">32797148</pub-id>
                    <pub-id pub-id-type="doi">10.1001/jama.2020.15242</pub-id>
                    <pub-id pub-id-type="pmcid">7428807</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-15">
                <label>15</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Self</surname>
                            <given-names>WH</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Tenforde</surname>
                            <given-names>MW</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Stubblefield</surname>
                            <given-names>WB</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Seroprevalence of SARS-CoV-2 Among Frontline Health Care Personnel in a Multistate Hospital Network - 13 Academic Medical Centers, April-June 2020.</article-title>
                    <source>
						
                        <italic toggle="yes">MMWR Morb Mortal Wkly Rep.</italic>
					</source>
                    <year>2020</year>;<volume>69</volume>(<issue>35</issue>):<fpage>1221</fpage>&#x2013;<lpage>6</lpage>.
                    <pub-id pub-id-type="pmid">32881855</pub-id>
                    <pub-id pub-id-type="doi">10.15585/mmwr.mm6935e2</pub-id>
                    <pub-id pub-id-type="pmcid">7470460</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-16">
                <label>16</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Flaxman</surname>
                            <given-names>A</given-names>
                        </name>
					</person-group>:
                    <article-title>aflaxman/covid_hcw_rr: As submitted to Gates Open Research (Version v1.0.0).</article-title>
                    <source>
						
                        <italic toggle="yes">Zenodo.</italic>
					</source>
                    <year>2020</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.doi.org/10.5281/zenodo.4270368">http://www.doi.org/10.5281/zenodo.4270368</ext-link>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report30426">
        <front-stub>
            <article-id pub-id-type="doi">10.21956/gatesopenres.14411.r30426</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Driscoll</surname>
                        <given-names>Tim</given-names>
                    </name>
                    <xref ref-type="aff" rid="r30426a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-0057-2490</uri>
                </contrib>
                <aff id="r30426a1">
                    <label>1</label>School of Public Health, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW, Australia</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>Dr Flaxman works at the Institute for Health Metrics and Evaluation, which runs the Global Burden of Disease (GBD) study.&#x00a0; I am head of the Occupational Risk Factors Expert Working Group working on the GBD study.&#x00a0;I have co-authored papers with Dr Flaxman that have arisen from this study but have not worked closely with him on any aspects of the study and the papers that we have co-authored have had a large number of co-authors. I don't have a personal relationship with Dr Flaxman. I believe I can provide an objective review of this paper.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>1</day>
                <month>4</month>
                <year>2021</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2021 Driscoll T</copyright-statement>
                <copyright-year>2021</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport30426" related-article-type="peer-reviewed-article" xlink:href="10.12688/gatesopenres.13202.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>This paper presents an analysis of data collected from respondents in the United States to a Facebook survey and focuses on a comparison of the rate of COVID-19 in health care workers with that in workers in other sectors. The main finding was that infection is less common in health care workers than in non-health care workers, with the authors concluding that the results suggest it is &#x201c;safe&#x201d; (in terms of risk of COVID-19 infection) to be a health care worker. The methodology seems appropriate. The structure of the paper is good and the meaning is generally clear.</p>
            <p> </p>
            <p> In terms of the Methods, there are inconsistencies in the terminology and I can&#x2019;t see any reason for this. Most particularly, there is mention of an &#x201c;
                <italic>endorsement rate</italic>&#x201d;, which is the basis of the &#x201c;
                <italic>relative COVID-19 incidence ratio</italic>&#x201d;, but this endorsement rate is not mentioned again in the manuscript. In the Results section, there is mention of a &#x201c;
                <italic>relative COVID-19 prevalence ratio</italic>&#x201d; and a &#x201c;
                <italic>Relative COVID-19 incidence rate</italic>&#x201d;. In the Discussion, &#x201c;
                <italic>relative COVID-19 incidence ratio</italic>&#x201d; is mentioned again. I presume all three of these terms represent the same quantity. If so, it seems just one term should be used. If not, there needs to be further explanation about what has been calculated and why. It appears that the information presented is prevalence rather than incidence, because although the testing was in the previous 14 days the positive result could reflect past disease, depending on the type of test. If it is assumed the testing was done via PCR and further assumed this PCR test would only be positive for recent (in the previous two weeks or so) infection, then incidence would be an appropriate term to use, but then the implications of this assumption should be considered in the Discussion. Either way, the uncertainty arising from lack of information about the testing seems to be a limitation that could usefully be included at the end of the Discussion.</p>
            <p> </p>
            <p> The conclusion that &#x201c;
                <italic>HCWs need not fear contracting or transmitting infections more than other workers do&#x2026;</italic>&#x201d; seems too strong given the limitations of the data used for this study and the &#x201c;
                <italic>&#x2026;limited other evidence available&#x2026;</italic>&#x201d;, as acknowledged by the authors. Similarly, the preceding statement that the result is &#x201c;
                <italic>&#x2026;an unequivocally positive finding&#x2026;</italic>&#x201d; is at odds with the limitations considered later in the paper. I agree that if the results are accepted on face value they imply that health care workers are at lower risk than non-health care workers, but the other aspects just mentioned mean that conclusions based on these results should be guarded. Also, health care workers are analysed as a group, or in smaller but still broad groups in Table 3. This group will contain a mixture of people working directly with the public (front-line health workers) in a clinical setting and people working in health care but with minimal contact with patients. It might well be that the front-line health workers do indeed have a higher risk of infection than the general public, but that this is not reflected in the study results because the other health care workers have a much lower risk of infection. The fact that the &#x201c;
                <italic>Physician or surgeon</italic>&#x201d; group appears to have a higher risk (RR=2.6) supports this concern. Having mentioned Table 3, the interpretation of this is not clear. Why are there different numbers of non-health care workers in each row, and why do they appear in any row if each row represents a different type of health care worker? It would be helpful to explain this.</p>
            <p> </p>
            <p> There is quite a bit of space in the paper considering the power of the study. The reason for this is not clear. The power calculations are based on an assumed difference of at least 30% in the &#x201c;prevalence&#x201d; of COVID-19 between health care workers and non-health care workers. This would be important if the difference found was less than 30%. However, since the difference found was 30%, the power calculations don&#x2019;t seem relevant.&#x00a0; Also, the program to undertake this power calculation was included in the paper. I am not sure this adds much; I don&#x2019;t mind it being there but it is not further considered and in fact isn&#x2019;t directly referred to &#x2013; it just appears in the text at the end of, or actually part of, the last sentence in the section describing the power calculation. That seems odd.</p>
            <p> </p>
            <p> The authors rightly identify some limitations in their work. These primarily result from the data used in the analysis rather than from the analysis used. The authors note the potential for some forms of reporting bias and for uncontrolled confounding, both of which I agree may be of concern.&#x00a0; They also mention the need for a large sample size, which doesn&#x2019;t seem to be a limitation in terms of interpreting the results of the study; the large sample size is not a source of bias, just something that requires greater statistical resources.</p>
            <p>Is the work clearly and accurately presented and does it cite the current literature?</p>
            <p>Yes</p>
            <p>If applicable, is the statistical analysis and its interpretation appropriate?</p>
            <p>Yes</p>
            <p>Are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Is the study design appropriate and is the work technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions drawn adequately supported by the results?</p>
            <p>Partly</p>
            <p>Are sufficient details of methods and analysis provided to allow replication by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Epidemiology, occupational medicine</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment3439-30426">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Flaxman</surname>
                            <given-names>Abraham</given-names>
                        </name>
                        <aff>University of Washington, Seattle, USA</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>As stated in manuscript.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>17</day>
                    <month>5</month>
                    <year>2021</year>
                </pub-date>
            </front-stub>
            <body>
                <p>
                    <italic>In terms of the Methods, there are inconsistencies in the terminology and I can&#x2019;t see any reason for this. Most particularly, there is mention of an &#x201c;endorsement rate&#x201d;, which is the basis of the &#x201c;relative COVID-19 incidence ratio&#x201d;, but this endorsement rate is not mentioned again in the manuscript. In the Results section, there is mention of a &#x201c;relative COVID-19 prevalence ratio&#x201d; and a &#x201c;Relative COVID-19 incidence rate&#x201d;. In the Discussion, &#x201c;relative COVID-19 incidence ratio&#x201d; is mentioned again. I presume all three of these terms represent the same quantity. If so, it seems just one term should be used. If not, there needs to be further explanation about what has been calculated and why. It appears that the information presented is prevalence rather than incidence, because although the testing was in the previous 14 days the positive result could reflect past disease, depending on the type of test. If it is assumed the testing was done via PCR and further assumed this PCR test would only be positive for recent (in the previous two weeks or so) infection, then incidence would be an appropriate term to use, but then the implications of this assumption should be considered in the Discussion. Either way, the uncertainty arising from lack of information about the testing seems to be a limitation that could usefully be included at the end of the Discussion.</italic>
                </p>
                <p> </p>
                <p> 
                    <bold>Response: </bold>We have standardized our terminology on incidence, which we think is the most precise and accurate of the terms we used originally; thank you for calling attention to this inconsistency.&#x00a0; We have also added to the limitations section to highlight the way 14-day recall is not exactly &#x201c;incidence&#x201d;.</p>
                <p> </p>
                <p> 
                    <italic>The conclusion that &#x201c;HCWs need not fear contracting or transmitting infections more than other workers do&#x2026;&#x201d; seems too strong given the limitations of the data used for this study and the &#x201c;&#x2026;limited other evidence available&#x2026;&#x201d;, as acknowledged by the authors. Similarly, the preceding statement that the result is &#x201c;&#x2026;an unequivocally positive finding&#x2026;&#x201d; is at odds with the limitations considered later in the paper. I agree that if the results are accepted on face value they imply that health care workers are at lower risk than non-health care workers, but the other aspects just mentioned mean that conclusions based on these results should be guarded. Also, health care workers are analysed as a group, or in smaller but still broad groups in Table 3. This group will contain a mixture of people working directly with the public (front-line health workers) in a clinical setting and people working in health care but with minimal contact with patients. It might well be that the front-line health workers do indeed have a higher risk of infection than the general public, but that this is not reflected in the study results because the other health care workers have a much lower risk of infection. The fact that the &#x201c;Physician or surgeon&#x201d; group appears to have a higher risk (RR=2.6) supports this concern.</italic>
                </p>
                <p> </p>
                <p> 
                    <bold>Response: </bold>We have moderated the discussion in light of this comment, as well as the similar concerns from Reviewer 2.</p>
                <p> </p>
                <p> 
                    <italic>Having mentioned Table 3, the interpretation of this is not clear. Why are there different numbers of non-health care workers in each row, and why do they appear in any row if each row represents a different type of health care worker? It would be helpful to explain this.</italic>
                </p>
                <p> </p>
                <p> 
                    <bold>Response: </bold>Each row besides the first row compares a subtype of HCWs to everyone who is not of that subtype.&#x00a0; We have edited the column headings to make this clearer.</p>
                <p> </p>
                <p> 
                    <italic>There is quite a bit of space in the paper considering the power of the study. The reason for this is not clear. The power calculations are based on an assumed difference of at least 30% in the &#x201c;prevalence&#x201d; of COVID-19 between health care workers and non-health care workers. This would be important if the difference found was less than 30%. However, since the difference found was 30%, the power calculations don&#x2019;t seem relevant.&#x00a0; Also, the program to undertake this power calculation was included in the paper. I am not sure this adds much; I don&#x2019;t mind it being there but it is not further considered and in fact isn&#x2019;t directly referred to &#x2013; it just appears in the text at the end of, or actually part of, the last sentence in the section describing the power calculation. That seems odd.</italic>
                </p>
                <p> </p>
                <p> 
                    <bold>Response: </bold>We did this power calculation in so much detail because we wanted to get our results out as soon as possible, but not so soon that we were fooled by chance variation in the data.&#x00a0; We have taken it out to focus the reader on the most important parts, especially now that there is so much more data available.</p>
                <p> </p>
                <p> 
                    <italic>The authors rightly identify some limitations in their work. These primarily result from the data used in the analysis rather than from the analysis used. The authors note the potential for some forms of reporting bias and for uncontrolled confounding, both of which I agree may be of concern.&#x00a0; They also mention the need for a large sample size, which doesn&#x2019;t seem to be a limitation in terms of interpreting the results of the study; the large sample size is not a source of bias, just something that requires greater statistical resources.</italic>
                </p>
                <p> </p>
                <p> 
                    <bold>Response: </bold>We thank the reviewer for this perspective, and have attempted to edit the limitations section to make it clearer.</p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report30475">
        <front-stub>
            <article-id pub-id-type="doi">10.21956/gatesopenres.14411.r30475</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Hawkins</surname>
                        <given-names>Devan</given-names>
                    </name>
                    <xref ref-type="aff" rid="r30475a1">1</xref>
                    <role>Referee</role>
                </contrib>
                <contrib contrib-type="author">
                    <name>
                        <surname>Goldstein-Gelb</surname>
                        <given-names>Marcy</given-names>
                    </name>
                    <xref ref-type="aff" rid="r30475a2">2</xref>
                    <role>Co-referee</role>
                </contrib>
                <aff id="r30475a1">
                    <label>1</label>Department of Public Health Program, Schools of Arts and Sciences, MCPHS University, Boston, MA, USA</aff>
                <aff id="r30475a2">
                    <label>2</label>National Council for Occupational Safety and Health, Somerville, MA, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>29</day>
                <month>3</month>
                <year>2021</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2021 Hawkins D and Goldstein-Gelb M</copyright-statement>
                <copyright-year>2021</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport30475" related-article-type="peer-reviewed-article" xlink:href="10.12688/gatesopenres.13202.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>reject</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>Thank you for the invitation to review this paper. The paper addresses an important topic (the risk of acquiring COVID-19 among healthcare workers). The authors apply unique methods to study the problem. However, we have some concerns about how the analysis was performed and how the results were interpreted. Below, we provide details about these concerns.&#x00a0;</p>
            <p> </p>
            <p> Introduction: 
                <list list-type="bullet">
                    <list-item>
                        <p>The authors should provide some information about previous studies that have examined the risk for COVID-19 among healthcare workers and also justify why they hypothesized that healthcare workers would have a lower risk. Some studies have suggested that they have an elevated risk. Below are some studies that have examined the risk/potential risk for COVID-19 among healthcare workers: 
                            <list list-type="bullet">
                                <list-item>
                                    <p>Baker 
                                        <italic>et al.</italic> (2020
                                        <sup>
                                            <xref ref-type="bibr" rid="rep-ref-30475-1">1</xref>
                                        </sup>).</p>
                                </list-item>
                                <list-item>
                                    <p>Burrer 
                                        <italic>et al.</italic> (2020
                                        <sup>
                                            <xref ref-type="bibr" rid="rep-ref-30475-2">2</xref>
                                        </sup>).</p>
                                </list-item>
                                <list-item>
                                    <p>Hawkins 
                                        <italic>et al.</italic> (2020
                                        <sup>
                                            <xref ref-type="bibr" rid="rep-ref-30475-3">3</xref>
                                        </sup>).</p>
                                </list-item>
                                <list-item>
                                    <p>Ran 
                                        <italic>et al.</italic> (2020
                                        <sup>
                                            <xref ref-type="bibr" rid="rep-ref-30475-4">4</xref>
                                        </sup>).</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                </list> </p>
            <p> Methods: 
                <list list-type="bullet">
                    <list-item>
                        <p>The authors should explain in more detail the justification for weighting to the overall Facebook population. If the goal is to ensure that the healthcare workers surveyed through Facebook are representative of healthcare workers, this type of weighting may not help.&#x00a0;</p>
                    </list-item>
                    <list-item>
                        <p>Was industry information available? There is good reason to suspect that risk will differ across industries. In some cases, HCWs will even be working from home with telehealth. It may be useful to: 
                            <list list-type="bullet">
                                <list-item>
                                    <p>1) Compare healthcare workers employed in the healthcare industry to other health care workers</p>
                                </list-item>
                                <list-item>
                                    <p>2) Examine the risk among different industries&#x00a0;</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>We strongly recommend including all positive tests as a sensitivity analysis, not just those required by work. I agree that differential testing may introduce a bias, but it would be better to show all the data so that we can consider the potential magnitude of that bias. There may actually be an even greater differential between HCWs and other workers. &#x00a0;In fact, most non-health care workers probably don't get tested through employer requirements, and only know that they have COVID after becoming sick.</p>
                    </list-item>
                    <list-item>
                        <p>Additionally, we strongly recommend having a different reference population than all non-healthcare workers. Other high risk workers are included in the current reference group, which may have the impact of making the risk among healthcare workers appear lower. Potentially consider including major census or SOC occupations for comparison.&#x00a0;</p>
                    </list-item>
                    <list-item>
                        <p>For non-health care workers, did the survey ask whether they worked outside the home, or was it simply assumed that they did? &#x00a0;Naturally, if they were tested but work from home, that would overrepresent work-relatedness, though I would assume testing would not be an employer requirement if they work from home.</p>
                    </list-item>
                    <list-item>
                        <p>Was the survey only conducted in English?&#x00a0;</p>
                    </list-item>
                </list> </p>
            <p> Results: 
                <list list-type="bullet">
                    <list-item>
                        <p>The demographics for healthcare workers should be compared to national data about healthcare worker demographics. These data can be obtained from the CPS or census. CPS is linked here: https://www.bls.gov/cps/tables.htm</p>
                    </list-item>
                    <list-item>
                        <p>Consider separating occupations into major categories for more fair comparisons. You may consider weighting to this data rather than the Facebook demographics.&#x00a0;</p>
                    </list-item>
                    <list-item>
                        <p>Is race/ethnicity data available? If workers of color are under-represented this could introduce bias to the study, because these workers may be more likely to be employed in higher risk healthcare occupations.&#x00a0;</p>
                    </list-item>
                    <list-item>
                        <p>Table 3: How do the distributions of detailed occupations compare to national data about employment in these occupations? The CPS data linked above can be used to assess this. Bias may be introduced if certain occupations are underrepresented.&#x00a0;</p>
                    </list-item>
                    <list-item>
                        <p>Table 3: The authors should discuss the variability in rates according to specific healthcare occupations. They may consider grouping the occupations according to major healthcare occupation categories (practitioners, support, etc.). Some occupations have elevated rates. &#x00a0;</p>
                    </list-item>
                </list> </p>
            <p> Discussion: 
                <list list-type="bullet">
                    <list-item>
                        <p>We strongly recommend removing this finding: &#x201c;an unequivocally positive findings, indicating that infection control measures being taken by HCWs in total are effective.&#x201d; Based on the limitations of this study, we do not believe that the findings support this conclusion. The findings may be suggestive of effective measures being taken if some of the limitations in the methods/results are addressed.&#x00a0;</p>
                    </list-item>
                    <list-item>
                        <p>Consider other findings linked above which are not consistent with this study&#x2019;s findings of a lower risk among HCWs.</p>
                    </list-item>
                    <list-item>
                        <p>We strongly discourage concluding that HCWs should not fear contracting or transmitting infections more than other workers. HCWs don't base their fear on how their likelihood of exposure compares with that of other workers; they are afraid because of other factors, including often not having adequate protection methods.&#x00a0;</p>
                    </list-item>
                </list>
            </p>
            <p>Is the work clearly and accurately presented and does it cite the current literature?</p>
            <p>Partly</p>
            <p>If applicable, is the statistical analysis and its interpretation appropriate?</p>
            <p>Partly</p>
            <p>Are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Is the study design appropriate and is the work technically sound?</p>
            <p>Partly</p>
            <p>Are the conclusions drawn adequately supported by the results?</p>
            <p>Partly</p>
            <p>Are sufficient details of methods and analysis provided to allow replication by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Devan Hawkins: Occupational health epidemiologist</p>
            <p>We confirm that we have read this submission and believe that we have an appropriate level of expertise to state that we do not consider it to be of an acceptable scientific standard, for reasons outlined above.</p>
        </body>
        <back>
            <ref-list>
                <title>References</title>
                <ref id="rep-ref-30475-1">
                    <label>1</label>
                    <mixed-citation publication-type="journal">
                        <person-group person-group-type="author"/>:
                        <article-title>Estimating the burden of United States workers exposed to infection or disease: A key factor in containing risk of COVID-19 infection.</article-title>
                        <source>
                            <italic>PLoS One</italic>
                        </source>.<year>2020</year>;<volume>15</volume>(<issue>4</issue>) :
                        <elocation-id>10.1371/journal.pone.0232452</elocation-id>
                        <fpage>e0232452</fpage>
                        <pub-id pub-id-type="pmid">32343747</pub-id>
                        <pub-id pub-id-type="doi">10.1371/journal.pone.0232452</pub-id>
                    </mixed-citation>
                </ref>
                <ref id="rep-ref-30475-2">
                    <label>2</label>
                    <mixed-citation publication-type="journal">
                        <person-group person-group-type="author"/>:
                        <article-title>Characteristics of Health Care Personnel with COVID-19 &#x2014; United States, February 12&#x2013;April 9, 2020</article-title>.
                        <source>
                            <italic>MMWR. Morbidity and Mortality Weekly Report</italic>
                        </source>.<year>2020</year>;<volume>69</volume>(<issue>15</issue>) :
                        <elocation-id>10.15585/mmwr.mm6915e6</elocation-id>
                        <fpage>477</fpage>-<lpage>481</lpage>
                        <pub-id pub-id-type="doi">10.15585/mmwr.mm6915e6</pub-id>
                    </mixed-citation>
                </ref>
                <ref id="rep-ref-30475-3">
                    <label>3</label>
                    <mixed-citation publication-type="journal">
                        <person-group person-group-type="author"/>:
                        <article-title>COVID-19 deaths by occupation, Massachusetts, March 1-July 31, 2020.</article-title>
                        <source>
                            <italic>Am J Ind Med</italic>
                        </source>.<year>2021</year>;<volume>64</volume>(<issue>4</issue>) :
                        <elocation-id>10.1002/ajim.23227</elocation-id>
                        <fpage>238</fpage>-<lpage>244</lpage>
                        <pub-id pub-id-type="pmid">33522627</pub-id>
                        <pub-id pub-id-type="doi">10.1002/ajim.23227</pub-id>
                    </mixed-citation>
                </ref>
                <ref id="rep-ref-30475-4">
                    <label>4</label>
                    <mixed-citation publication-type="journal">
                        <person-group person-group-type="author"/>:
                        <article-title>Risk Factors of Healthcare Workers With Coronavirus Disease 2019: A Retrospective Cohort Study in a Designated Hospital of Wuhan in China</article-title>.
                        <source>
                            <italic>Clinical Infectious Diseases</italic>
                        </source>.<year>2020</year>;<volume>71</volume>(<issue>16</issue>) :
                        <elocation-id>10.1093/cid/ciaa287</elocation-id>
                        <fpage>2218</fpage>-<lpage>2221</lpage>
                        <pub-id pub-id-type="doi">10.1093/cid/ciaa287</pub-id>
                    </mixed-citation>
                </ref>
            </ref-list>
        </back>
        <sub-article article-type="response" id="comment3438-30475">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Flaxman</surname>
                            <given-names>Abraham</given-names>
                        </name>
                        <aff>University of Washington, Seattle, USA</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>As stated in manuscript.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>17</day>
                    <month>5</month>
                    <year>2021</year>
                </pub-date>
            </front-stub>
            <body>
                <p>
                    <italic>Introduction:</italic> 
                    <list list-type="bullet">
                        <list-item>
                            <p>
                                <italic>The authors should provide some information about previous studies that have examined the risk for COVID-19 among healthcare workers and also justify why they hypothesized that healthcare workers would have a lower risk. Some studies have suggested that they have an elevated risk. Below are some studies that have examined the risk/potential risk for COVID-19 among healthcare workers:</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>
                                <italic>1. Baker MG, Peckham TK, Seixas NS: Estimating the burden of United States workers exposed to infection or disease: A key factor in containing risk of COVID-19 infection. PLoS One. 2020;&#x00a0;
                                    <bold>15</bold>&#x00a0;(4): e0232452&#x00a0;</italic>
                                <ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/pubmed/32343747">
                                    <italic>PubMed Abstract</italic>
                                </ext-link>
                                <italic>&#x00a0;|&#x00a0;</italic>
                                <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1371/journal.pone.0232452">
                                    <italic>Publisher Full Text</italic>
                                </ext-link>
                            </p>
                        </list-item>
                        <list-item>
                            <p>
                                <italic>2. CDC COVID-19 Response Team, Burrer S, de Perio M, et al.: Characteristics of Health Care Personnel with COVID-19 &#x2014; United States, February 12&#x2013;April 9, 2020.&#x00a0;MMWR. Morbidity and Mortality Weekly Report. 2020;&#x00a0;
                                    <bold>69</bold>&#x00a0;(15): 477-481&#x00a0;</italic>
                                <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.15585/mmwr.mm6915e6">
                                    <italic>Publisher Full Text</italic>
                                </ext-link>
                            </p>
                        </list-item>
                        <list-item>
                            <p>
                                <italic>3. Hawkins D, Davis L, Kriebel D: COVID-19 deaths by occupation, Massachusetts, March 1-July 31, 2020. Am J Ind Med. 2021;&#x00a0;
                                    <bold>64</bold>&#x00a0;(4): 238-244&#x00a0;</italic>
                                <ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov/pubmed/33522627">
                                    <italic>PubMed Abstract</italic>
                                </ext-link>
                                <italic>&#x00a0;|&#x00a0;</italic>
                                <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1002/ajim.23227">
                                    <italic>Publisher Full Text</italic>
                                </ext-link>
                            </p>
                        </list-item>
                        <list-item>
                            <p>
                                <italic>4. Ran L, Chen X, Wang Y, Wu W, et al.: Risk Factors of Healthcare Workers With Coronavirus Disease 2019: A Retrospective Cohort Study in a Designated Hospital of Wuhan in China.&#x00a0;Clinical Infectious Diseases. 2020;&#x00a0;
                                    <bold>71</bold>&#x00a0;(16): 2218-2221&#x00a0;</italic>
                                <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1093/cid/ciaa287">
                                    <italic>Publisher Full Text</italic>
                                </ext-link>
                            </p>
                        </list-item>
                    </list> 
                    <bold>Response: </bold>Thank you for calling our attention to this growing body of work. We have added to this introduction to include this prior work and clarify our hypothesis.</p>
                <p> </p>
                <p> 
                    <italic>Methods:</italic> 
                    <list list-type="bullet">
                        <list-item>
                            <p>
                                <italic>The authors should explain in more detail the justification for weighting to the overall Facebook population. If the goal is to ensure that the healthcare workers surveyed through Facebook are representative of healthcare workers, this type of weighting may not help.&#x00a0;</italic>
                            </p>
                        </list-item>
                    </list> 
                    <bold>Response: </bold>Thank you for identifying this risk to the validity of our findings. We have added more detail about the weights in the Study Design section, as well as additional caveats about using the weights for the HCW population in sensitivity analyses in the Statistical Methods section. We have also added to the limitations section to provide more caveats about the risk of non-response bias.</p>
                <p> </p>
                <p> 
                    <italic>Was industry information available? There is good reason to suspect that risk will differ across industries. In some cases, HCWs will even be working from home with telehealth. It may be useful to:</italic> 
                    <list list-type="bullet">
                        <list-item>
                            <p>
                                <italic>1) Compare healthcare workers employed in the healthcare industry to other health care workers</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>
                                <italic>2) Examine the risk among different industries&#x00a0;</italic>
                            </p>
                        </list-item>
                    </list> 
                    <bold>Response: </bold>Unfortunately, the survey instrument does not distinguish between occupation and industry, and therefore we can only examine risk between different occupations, as identified by responses to the question &#x201c;[p]lease select the occupational group that best fits the main kind of work you were doing in the last four weeks&#x201d;.&#x00a0; Respondents selected a single category from a short list, and then a detailed category from a longer list; all of the detailed HCW categories are listed in Table 3.</p>
                <p> </p>
                <p> 
                    <italic>We strongly recommend including all positive tests as a sensitivity analysis, not just those required by work. I agree that differential testing may introduce a bias, but it would be better to show all the data so that we can consider the potential magnitude of that bias. There may actually be an even greater differential between HCWs and other workers. &#x00a0;In fact, most non-health care workers probably don't get tested through employer requirements, and only know that they have COVID after becoming sick.</italic>
                </p>
                <p> </p>
                <p> 
                    <bold>Response: </bold>The results of this proposed sensitivity analysis might surprise the reviewer: in an analysis of all survey respondents (123,448 HCWs and 1,699,214 non-HCWs) we find that among HCWs (tested and untested), 1,674 of 123,448 (1.4%) reported a positive test in the last 14 days; while among non-HCWs (tested and untested), 11,963 of 1,699,214 (0.70%) reported a positive test.&#x00a0; This yields a ratio of 1.8 (95% UI 1.52 to 2.03), but it is confounded by the fact that HCWs have greater access to testing than non-HCWs and cannot be used as an estimate of the relative incidence ratio of COVID-19.</p>
                <p> </p>
                <p> If we restrict our analysis to individuals who were tested in the last 14 days, we find 156,127 respondents who were tested (regardless of workplace requirements) in the time period we focused on: 22,594 HCWs and 133,533 non-HCWs. Among HCWs tested (regardless of whether the test was required), 1,674 of 22,594 (7.4%) reported a positive test in the last 14 days, while among non-HCWs tested (regardless of whether the test was required), 11,963 of 133,533 (8.96%) reported a positive test, for an RR of 0.8 (95% UI 0.78 to 0.83).</p>
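                <p> The crude, unweighted version of this ratio can be checked directly from the counts quoted above. The following is a minimal sketch in Python, not the authors' analysis code: the function name crude_risk_ratio and the variable names are hypothetical, and the calculation ignores the survey weights and the uncertainty interval, so it should only be expected to agree approximately with the reported RR.</p>
                <preformat>
```python
def crude_risk_ratio(pos_a, n_a, pos_b, n_b):
    """Unweighted ratio of test positivity in group A to that in group B."""
    return (pos_a / n_a) / (pos_b / n_b)


# Counts quoted in the response (respondents tested in the last 14 days).
hcw_positive, hcw_tested = 1674, 22594
non_hcw_positive, non_hcw_tested = 11963, 133533

# Share of tested respondents in each group who reported a positive test.
hcw_rate = hcw_positive / hcw_tested
non_hcw_rate = non_hcw_positive / non_hcw_tested

# Crude ratio of HCW positivity to non-HCW positivity (no survey weights).
rr = crude_risk_ratio(hcw_positive, hcw_tested, non_hcw_positive, non_hcw_tested)

print(f"HCW positivity: {hcw_rate:.2%}")
print(f"non-HCW positivity: {non_hcw_rate:.2%}")
print(f"crude ratio: {rr:.2f}")
```
                </preformat>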
                <p> </p>
                <p> 
                    <bold>Response: </bold>We prefer to keep this complexity out of the main paper. In some occupations, required testing happens only after symptoms develop; in light of this, we prefer to investigate this potential confounding through our sensitivity analysis, which uses only required tests among asymptomatic workers.</p>
                <p> </p>
                <p> 
                    <italic>Additionally, we strongly recommend having a different reference population than all non-healthcare workers. Other high risk workers are included in the current reference group, which may have the impact of making the risk among healthcare workers appear lower. Potentially consider including major census or SOC occupations for comparison.&#x00a0;</italic>
                </p>
                <p> </p>
                <p> 
                    <bold>Response: </bold>We prefer to focus our discussion on a comparison of HCWs with all non-HCWs, but the reviewer raises an interesting additional question.&#x00a0; Although we choose to leave a full investigation of these occupational comparisons for future work, we cannot resist examining them briefly in this response. After HCWs, the occupations with the highest rates of required testing are (16) Other occupation, (2) education, training, and library, (11) office and administration services, and (7) food preparation and serving. Our comparison of HCWs to workers in occupation "Other" found a relative COVID-19 incidence ratio of 0.97 (95% UI 0.82 to 1.12).</p>
                <p> </p>
                <p> This also identifies an important divergence between the &#x201c;non-HCW&#x201d; population and the worker population: there are 9,652 respondents without an occupation code included in the non-HCW population.&#x00a0; Repeating our analysis with these respondents excluded finds a ratio of 0.60 (95% UI 0.55 to 0.67).</p>
                <p> </p>
                <p> 
                    <italic>For non-health care workers, did the survey ask whether they worked outside the home, or was it simply assumed that they did? &#x00a0;Naturally, if they were tested but work from home, that would overrepresent work-relatedness, though I would assume testing would not be an employer requirement if they work from home.</italic>
                </p>
                <p> </p>
                <p> 
                    <bold>Response: </bold>The survey does include the question &#x201c;Was any of your work for pay in the last four weeks outside your home?&#x201d;, and as an additional sensitivity analysis, which we excluded from our report, we repeated the same analysis stratified by work-from-home status. We were surprised to find quantitatively similar results among those who work from home and those who do not.</p>
                <p> </p>
                <p> 
                    <italic>Was the survey only conducted in English?&#x00a0;</italic>
                </p>
                <p> </p>
                <p> <bold>Response: </bold>The survey was translated into multiple languages (Spanish, French, Portuguese, Chinese, Vietnamese).&#x00a0; We have added a reference to the 
                    <ext-link ext-link-type="uri" xlink:href="https://cmu-delphi.github.io/delphi-epidata/symptom-survey/">https://cmu-delphi.github.io/delphi-epidata/symptom-survey/</ext-link> website with full details on the survey instrument.</p>
                <p> </p>
                <p> 
                    <italic>Results:</italic>
                </p>
                <p> &#x00a0; 
                    <list list-type="bullet">
                        <list-item>
                            <p>
                                <italic>The demographics for healthcare workers should be compared to national data about healthcare worker demographics. These data can be obtained from the CPS or census. CPS is linked here: https://www.bls.gov/cps/tables.htm</italic>
                            </p>
                        </list-item>
                    </list> 
                    <bold>Response: </bold>We appreciate this suggestion, but prefer to keep the main paper simpler and instead include the comparison in this response only.&#x00a0; Among survey respondents, HCWs were 85.7% female, while among employed persons in 2020, &#x201c;Healthcare practitioners and technical occupations&#x201d; were 74.4% female.&#x00a0; The age distribution was also similar, but not identical.</p>
                <p> </p>
                <p> 
                    <italic>Consider separating occupations into major categories for more fair comparisons. You may consider weighting to this data rather than the Facebook demographics.&#x00a0;</italic>
                </p>
                <p> </p>
                <p> 
                    <bold>Response: </bold>We agree that this would be a valuable extension of the approach we have applied in this paper, but we would like to limit the scope of this work to focus solely on the comparison of HCWs to non-HCWs, and leave further investigation and comparison of other occupations and categories for future work.&#x00a0; We agree that additional sensitivity analyses would be warranted in this future work to determine if alternative weighting of the data yields substantively divergent results.&#x00a0; We believe, however, that our sensitivity analyses for the HCW versus non-HCW comparison establish that the substantive finding of an RR substantially below 1.0 for HCWs is robust.</p>
                <p> </p>
                <p> 
                    <italic>Is race/ethnicity data available? If workers of color are under-represented this could introduce bias to the study, because these workers may be more likely to be employed in higher risk healthcare occupations.&#x00a0;</italic>
                </p>
                <p> </p>
                <p> 
                    <bold>Response: </bold>The survey instrument did include race and ethnicity information, but we do not currently have access to these columns of the data. Subsequent work investigating racial and ethnic differences in both response rates and test results would be very interesting.</p>
                <p> </p>
                <p> 
                    <italic>Table 3: How do the distributions of detailed occupations compare to national data about employment in these occupations? The CPS data linked above can be used to assess this. Bias may be introduced if certain occupations are underrepresented.&#x00a0;</italic>
                </p>
                <p> </p>
                <p> 
                    <bold>Response: </bold>Some of the age distributions are quite similar, for example for nurses, while others have small sample sizes and are probably biased by differential response patterns, for example physicians.&#x00a0; Though we included all subcategories for completeness, we felt it was important to include the sample size as well, to make sure readers were not overly influenced by the calculations based on only a small number of respondents.</p>
                <p> </p>
                <p> We agree that this would be a valuable extension of the approach we have applied in this paper, but we would like to limit the scope of this work to focus solely on the comparison of HCWs to non-HCWs, and leave further investigation and comparison of other occupations and categories for future work.</p>
                <p> </p>
                <p> 
                    <italic>Discussion:</italic> 
                    <list list-type="bullet">
                        <list-item>
                            <p>
                                <italic>We strongly recommend removing this finding: &#x201c;an unequivocally positive findings, indicating that infection control measures being taken by HCWs in total are effective.&#x201d; Based on the limitations of this study, we do not believe that the findings support this conclusion. The findings may be suggestive of effective measures being taken if some of the limitations in the methods/results are addressed.&#x00a0;</italic>
                            </p>
                        </list-item>
                    </list> 
                    <bold>Response: </bold>We appreciate the reviewer's recommendation and have substantially moderated the discussion to keep readers aware of the limitations of our approach and to avoid overstating the implications of our findings.</p>
                <p> </p>
                <p> 
                    <italic>Consider other findings linked above which are not consistent with this study&#x2019;s findings of a lower risk among HCWs.</italic>
                </p>
                <p> </p>
                <p> 
                    <bold>Response: </bold>We have referred to this contrasting evidence base in the discussion now, as well as in the introduction. 
                    <list list-type="bullet">
                        <list-item>
                            <p><italic>We strongly discourage concluding that HCWs should not fear contracting or transmitting infections more than other workers. HCWs don't base their fear on how their likelihood of exposure compares with that of other workers; they are afraid because of other factors, including often not having adequate protection methods.&#x00a0;</italic></p>
                        </list-item>
                    </list> 
                    <bold>Response: </bold>We have moderated the language in our conclusion, and thank the reviewer again for helping us avoid over-stating the implications of our findings.</p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report30079">
        <front-stub>
            <article-id pub-id-type="doi">10.21956/gatesopenres.14411.r30079</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Reinhart</surname>
                        <given-names>Alex</given-names>
                    </name>
                    <xref ref-type="aff" rid="r30079a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-6658-514X</uri>
                </contrib>
                <aff id="r30079a1">
                    <label>1</label>Department of Statistics &amp; Data Science, Carnegie Mellon University, Pittsburgh, PA, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>I am a member of the Delphi group at Carnegie Mellon University. Delphi, in collaboration with Facebook and researchers at the University of Maryland, conducts the survey whose data is analyzed in this article, and I manage much of the process on behalf of Delphi (with assistance from Delphi team members). Delphi makes this data available to many researchers, including the authors of this article. I was not involved in the analysis conducted by the authors of this article, and have not corresponded with them about this research, so my review of the scientific merit of the work has been conducted independently. I confirm that this has not affected my ability to write an objective and unbiased review of this article.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>4</day>
                <month>12</month>
                <year>2020</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2020 Reinhart A</copyright-statement>
                <copyright-year>2020</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport30079" related-article-type="peer-reviewed-article" xlink:href="10.12688/gatesopenres.13202.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>This presents a timely and useful analysis of large-scale survey data. For an analysis like this, it's very important to clearly present the meaning of the data and the caveats in the survey design; the authors do a good job here, and my comments here focus on making the paper even clearer.</p>
            <p> </p>
            <p> The analysis seems reasonable overall, and, subject to the limitations of the survey design, a useful contribution to the area.</p>
            <p> </p>
            <p> I've separated my comments into "Main comments", which I think should be addressed to make the article more sound, and "Minor comments" that just make minor improvements to the paper.</p>
            <p> </p>
            <p> 
                <bold>Main comments:</bold> 
                <list list-type="bullet">
                    <list-item>
                        <p>The "Sensitivity analyses" section (page 5) explains that "When we repeated our calculation using the unweighted survey responses to calculate the COVID-19 incidence ratio, we found an even smaller relative incidence ratio of 0.4 (95% UI 0.3 to 0.5)." This seems surprising. Do you have any hypotheses that could explain why this is? It suggests that either the age and gender distributions for HCWs and non-HCWs are quite different (since the survey weights correct for age and gender) or that the estimated non-response for the groups are quite different.</p>
                    </list-item>
                    <list-item>
                        <p>The last paragraph of the Discussion suggests the possibility that "since we have only considered respondents with tests required by their employer or school, this might focus on non-HCW setting with better-than-average infection control policies". This may be a good subject for an additional table of results: A comparison of the distributions of occupation among non-HCW people who were required to be tested and those who were not. Such a table would tell the reader whether those who are required to be tested are from an unusual group of occupations, to help tell whether those occupations might be higher or lower risk than average.</p>
                    </list-item>
                    <list-item>
                        <p>Table 3 contains a "Number of non-HCWs" column, but I don't know how to interpret this. What does it mean to say that there were 26,805 non-HCWs in the "All HCWs" row?</p>
                    </list-item>
                    <list-item>
                        <p>In the Limitations (page 6), the authors mention recall bias and social desirability bias as possible problems. But another key bias would be response bias: while Facebook's weights try to adjust for non-response, if they do not completely adjust for every possible factor related to non-response, there can still be bias. For example, if people who are much more concerned about COVID and take more precautions are also more likely to participate in the survey, and if Facebook does not have covariates that can predict this accurately, the survey sample can be biased relative to the population. It would be good to address this and indicate how it could affect the results.</p>
                    </list-item>
                </list> </p>
            <p> 
                <bold>Minor comments:</bold> 
                <list list-type="bullet">
                    <list-item>
                        <p>The "Study design" subsection mentions that "Facebook also provided survey weights to adjust for the demographics of the active Facebook user population." It would be good to be explicit about what corrections are included in the weights: 
                            <list list-type="bullet">
                                <list-item>
                                    <p>The weights adjust for non-response, using Facebook's estimate of the probability of each sampled individual participating in the survey.</p>
                                </list-item>
                                <list-item>
                                    <p>The weights are then post-stratified by age and gender only.</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>In the "Study design" subsection, the second paragraph states "We analyzed the most recently available six weeks of data from September 6, 2020 to October 18, 2020", but Wave 4 of the survey (containing the occupation and testing questions) was only deployed on September 8, 2020. If data from September 6 and 7 was included, I assume it was left out of the study, because the respondents would not have answered the relevant questions.</p>
                    </list-item>
                    <list-item>
                        <p>It may help readers to be explicit about the survey text and its location. The 
                            <ext-link ext-link-type="uri" xlink:href="https://cmu-delphi.github.io/delphi-epidata/symptom-survey/">survey documentation site</ext-link> contains the full text of each survey wave, and referring to this could help readers who want to read the survey text and flow.</p>
                    </list-item>
                </list>
            </p>
            <p>Is the work clearly and accurately presented and does it cite the current literature?</p>
            <p>Yes</p>
            <p>If applicable, is the statistical analysis and its interpretation appropriate?</p>
            <p>Yes</p>
            <p>Are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Is the study design appropriate and is the work technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions drawn adequately supported by the results?</p>
            <p>Yes</p>
            <p>Are sufficient details of methods and analysis provided to allow replication by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>I am a professional statistician and assistant teaching professor of Statistics &amp; Data Science at Carnegie Mellon University. I am also a member of the Delphi group, and manage the collection of the survey data described in this article; see my Competing Interests for further details.</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard; however, I have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment3437-30079">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Flaxman</surname>
                            <given-names>Abraham</given-names>
                        </name>
                        <aff>University of Washington, Seattle, USA</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>As stated in manuscript.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>17</day>
                    <month>5</month>
                    <year>2021</year>
                </pub-date>
            </front-stub>
            <body>
                <p>
                    <italic>This presents a timely and useful analysis of large-scale survey data. For an analysis like this, it's very important to clearly present the meaning of the data and the caveats in the survey design; the authors do a good job here, and my comments here focus on making the paper even clearer.</italic>
                </p>
                <p> 
                    <bold>Response: </bold>We thank the reviewer for this assessment.</p>
                <p> </p>
                <p> 
                    <italic>The analysis seems reasonable overall, and, subject to the limitations of the survey design, a useful contribution to the area.</italic>
                </p>
                <p> 
                    <italic>I've separated my comments into "Main comments", which I think should be addressed to make the article more sound, and "Minor comments" that just make minor improvements to the paper.</italic>
                </p>
                <p> </p>
                <p> 
                    <italic>
                        <bold>Main comments:</bold>
                    </italic> 
                    <list list-type="bullet">
                        <list-item>
                            <p>
                                <italic>The "Sensitivity analyses" section (page 5) explains that "When we repeated our calculation using the unweighted survey responses to calculate the COVID-19 incidence ratio, we found an even smaller relative incidence ratio of 0.4 (95% UI 0.3 to 0.5)." This seems surprising. Do you have any hypotheses that could explain why this is? It suggests that either the age and gender distributions for HCWs and non-HCWs are quite different (since the survey weights correct for age and gender) or that the estimated non-response for the groups are quite different.</italic>
                            </p>
                        </list-item>
                    </list> 
                    <bold>Response: </bold>This appears to be an error in our number-plugging!&#x00a0; The archived code corresponding to this submission gives a relative incidence ratio of 0.70 (95% UI 0.65 to 0.74). We apologize for this and thank the reviewer for the careful reading that helped us find and fix this defect!</p>
                <p> </p>
                <p> 
                    <italic>The last paragraph of the Discussion suggests the possibility that "since we have only considered respondents with tests required by their employer or school, this might focus on non-HCW setting with better-than-average infection control policies". This may be a good subject for an additional table of results: A comparison of the distributions of occupation among non-HCW people who were required to be tested and those who were not. Such a table would tell the reader whether those who are required to be tested are from an unusual group of occupations, to help tell whether those occupations might be higher or lower risk than average.</italic>
                </p>
                <p> </p>
                <p> 
                    <bold>Response: </bold>We appreciate the reviewer&#x2019;s suggestion, but prefer to restrict the scope of this paper to focus only on HCWs, and leave investigation of other occupations for future research.</p>
                <p> </p>
                <p> 
                    <italic>Table 3 contains a "Number of non-HCWs" column, but I don't know how to interpret this. What does it mean to say that there were 26,805 non-HCWs in the "All HCWs" row?</italic>
                </p>
                <p> </p>
                <p> 
                    <bold>Response: </bold>Thank you for flagging this confusing terminology.&#x00a0; By &#x201c;non-HCWs&#x201d; we meant the number of respondents who are 
                    <italic>not </italic>in the HCW subgroup for which the row reports the relative risk.&#x00a0; We have renamed the column headers to make this clearer. 
                    <list list-type="bullet">
                        <list-item>
                            <p>
                                <italic>In the Limitations (page 6), the authors mention recall bias and social desirability bias as possible problems. But another key bias would be response bias: while Facebook's weights try to adjust for non-response, if they do not completely adjust for every possible factor related to non-response, there can still be bias. For example, if people who are much more concerned about COVID and take more precautions are also more likely to participate in the survey, and if Facebook does not have covariates that can predict this accurately, the survey sample can be biased relative to the population. It would be good to address this and indicate how it could affect the results.</italic>
                            </p>
                        </list-item>
                    </list> 
                    <bold>Response: </bold>Thank you for calling attention to this important limitation.&#x00a0; We have added a sentence to the limitations section about it.</p>
                <p> </p>
                <p> 
                    <bold>
                        <italic>Minor comments:</italic>
                    </bold> 
                    <list list-type="bullet">
                        <list-item>
                            <p>
                                <italic>The "Study design" subsection mentions that "Facebook also provided survey weights to adjust for the demographics of the active Facebook user population." It would be good to be explicit about what corrections are included in the weights:</italic> 
                                <list list-type="bullet">
                                    <list-item>
                                        <p>
                                            <italic>The weights adjust for non-response, using Facebook's estimate of the probability of each sampled individual participating in the survey.</italic>
                                        </p>
                                    </list-item>
                                    <list-item>
                                        <p>
                                            <italic>The weights are then post-stratified by age and gender only.</italic>
                                        </p>
                                    </list-item>
                                </list> </p>
                        </list-item>
                    </list> 
                    <bold>Response: </bold>We have edited the text to include this detail explicitly.</p>
                <p> </p>
                <p> 
                    <italic>In the "Study design" subsection, the second paragraph states "We analyzed the most recently available six weeks of data from September 6, 2020 to October 18, 2020", but Wave 4 of the survey (containing the occupation and testing questions) was only deployed on September 8, 2020. If data from September 6 and 7 was included, I assume it was left out of the study, because the respondents would not have answered the relevant questions.</italic>
                </p>
                <p> </p>
                <p> 
                    <bold>Response: </bold>Good point. We have updated the text to reflect that the dates analyzed include only Wave 4 data, and shifted the data end date to still include precisely six weeks of data. This resulted in minor changes to many of our results, but no changes to our substantive findings.</p>
                <p> 
                    <list list-type="bullet">
                        <list-item>
                            <p>
                                <italic>It may help readers to be explicit about the survey text and its location. The&#x00a0;</italic>
                                <ext-link ext-link-type="uri" xlink:href="https://cmu-delphi.github.io/delphi-epidata/symptom-survey/">
                                    <italic>survey documentation site</italic>
                                </ext-link>
                                <italic>&#x00a0;contains the full text of each survey wave, and referring to this could help readers who want to read the survey text and flow.</italic>
                            </p>
                        </list-item>
                    </list> 
                    <bold>Response: </bold>Thank you for suggesting this; we have added a reference to this documentation.</p>
            </body>
        </sub-article>
    </sub-article>
</article>
