Young infant clinical signs study, Pakistan: a data note

Neonatal sepsis is the leading cause of child death globally, with most of these deaths occurring in the first week of life. It is of utmost public health importance that clinical signs predictive of severe illness and need for referral are identified early in the course of illness. From 2002 to 2005, a multi-country trial called the Young Infant Clinical Signs Study (YICSS) was conducted at seven sites across three South Asian countries (Bangladesh, India, and Pakistan), two African countries (Ghana and South Africa), and one South American country (Bolivia). The study aimed to develop a simplified algorithm to be used by primary healthcare workers for the identification of sick young infants needing prompt referral and treatment. The main study enrolled 8,889 young infants aged 0-59 days. This dataset contains observations on 2950 young infants aged 0-59 days from the Pakistan site. The data was collected between 2003 and 2004, with information on the most prevalent signs and symptoms. The data from this study was used to update the Integrated Management of Childhood Illness guidelines. The World Health Organisation (WHO) seven-sign algorithm has been used in other major community-based trials to study possible serious bacterial infection and its treatment regimens.


Introduction
In 2015, around 45% of all under-five mortality occurred in the first month of life, particularly in low- and middle-income countries 1. The majority of neonatal deaths occur due to infections; thus, it is important to identify sick infants needing urgent referral and hospitalization. The Young Infant Clinical Signs Study (YICSS) is a multi-country study conducted across six low- and middle-income countries (Bangladesh, Bolivia, Ghana, India, South Africa, and Pakistan) at seven sites. The study determined the predictive values of various clinical signs and symptoms that can be used by primary healthcare workers to identify infants with severe illness requiring hospitalization, as compared to an expert pediatrician diagnosis 2. The study enrolled infants in two age groups: 3,177 infants in the 0-6 days age group and 5,712 infants in the 7-59 days age group. Out of the 31 signs and symptoms recorded by the community health workers, 12 identified severe illness in the first week of life, and the algorithm was further reduced to seven key predictors of disease severity. The findings of this study formed the basis for updating the Integrated Management of Childhood Illness (IMCI) guidelines. Since its conception, the World Health Organisation (WHO) IMCI algorithm has been used in other major community-based studies. A multicenter observational cohort study, the Aetiology of Neonatal Infection in South Asia (ANISA), and a randomised, open-label equivalence trial, the Simplified Antibiotic Therapy Trial (SATT), were both based on the IMCI seven-sign algorithm and proved to be important steps in understanding the infectious etiology behind neonatal possible serious bacterial infections and the antibiotic regimens that can be given when referral is not possible 3,4.
The datasets presented in this paper are from the site-specific study conducted by the Aga Khan University in Karachi, Pakistan. The study aimed to validate clinical signs and symptoms in Pakistani infants in order to identify severe illness and predict hospital admissions. It was the largest site-specific cohort in the YICSS. The data is more representative of the study population since it was collected through community-based referral to primary healthcare centers. The data collection for the primary study occurred from 2003 to 2004. We believe these site-specific results remain relevant for similar low- and middle-income settings. The dataset can be used by researchers to replicate the analysis or to update systematic reviews and meta-analyses.

Data description
The dataset includes 2950 observations from 1,633 infants aged 0-6 days, 817 infants aged 7-27 days and 500 infants aged 28-59 days. Infants were enrolled from September 2003 to November 2004. Five files have been uploaded and are available for download 5: "YICSS Dataset Version 2.xlsx", "YIS codebook Version 2.xlsx", "Form A.pdf", "Form B.pdf" and "Form C.pdf". There is a total of 253 fields in "YIS Dataset Version 2.xlsx". "Form A.pdf", "Form B.pdf" and "Form C.pdf" are the study tools that were used to collect this data. The codebook "YIS codebook Version 2.xlsx" gives information on individual variables.

Data collection
The study was conducted in two peri-urban sites, Rehri Goth and Ibrahim Hyderi, and an urban squatter settlement, Bilal Colony, in Karachi, Pakistan. Both peri-urban sites had Primary Healthcare Centers (PHC) run by the Aga Khan University Hospital (AKUH). All infants aged less than 60 days who were either self-referred to our center or referred by Community Health Workers (CHW) during community surveillance were first screened by a trained Lady Health Visitor (LHV) (Study Person A) after determining eligibility and taking informed consent from the parent/guardian. Form A included a questionnaire on the socio-demographic details and clinical signs, which were recorded by the LHV. The infants were then referred to an experienced pediatrician (Study Person B) who was blinded to the assessment of the LHV. The pediatrician determined the need for immediate referral of the infants to a tertiary care hospital, the National Institute of Child Health (NICH), based on their clinical presentation. Infant pulse and oximetry were also performed. Form B collected information on the expert paediatrician assessment and Form C described the final clinical diagnoses at the end of hospitalisation. A detailed description of the methodology has been published previously 2,6.

Data entry and management
All the forms were first checked for completion and correctness. Data was then double entered into an Epi-Data database (V.2.1, Epidata Association, Odense, Denmark). Data cleaning and consistency checks were performed at the data coordination centre in Melbourne, Australia. The quality of data received from study sites was also monitored.

Statistical methods
Following the methodology of the primary paper, we replicated the analysis for the 0-6-day age group in Pakistani infants 2. We determined a simple association between each of the clinical signs and symptoms and the study outcome (i.e., severe illness requiring urgent referral, as confirmed by paediatrician diagnosis) using sensitivity, specificity, and odds ratio (OR) with 95% confidence intervals (CIs). We developed multiple logistic regression models to determine predictors of urgent referral. We used backward selection to identify predictors with an OR of at least 2 and a 95% confidence interval excluding 1. Signs were omitted when the corresponding p-value in the multivariable model was greater than 0.05 or the adjusted OR fell below 2, and were thus considered unlikely to have major prognostic value. A further reduction of the list of signs was then made on the grounds of prevalence and clinical judgement, omitting signs that were rarely reported. All analyses were performed using Stata version 9.2 software.
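For readers who wish to replicate the per-sign measures outside Stata, the following is a minimal Python sketch of the quantities named above, not the authors' code. It assumes each sign has been cross-tabulated against the paediatrician's referral decision into a 2x2 table, and it assumes a Woolf (log-based) interval for the OR, since the text does not state which interval method was used. The function names, the retention helper and the example counts are hypothetical.

```python
import math
from typing import NamedTuple


class SignAccuracy(NamedTuple):
    sensitivity: float
    specificity: float
    odds_ratio: float
    ci_low: float
    ci_high: float


def sign_accuracy(tp: int, fp: int, fn: int, tn: int, z: float = 1.96) -> SignAccuracy:
    """Accuracy of one clinical sign against the reference outcome.

    tp: sign present, referral needed    fp: sign present, referral not needed
    fn: sign absent, referral needed     tn: sign absent, referral not needed
    """
    if min(tp, fp, fn, tn) <= 0:
        raise ValueError("all four cells must be positive for a Woolf interval")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    odds_ratio = (tp * tn) / (fp * fn)
    # Assumption: Woolf standard error of ln(OR); z = 1.96 gives a 95% interval.
    se_log_or = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    ci_low = math.exp(math.log(odds_ratio) - z * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + z * se_log_or)
    return SignAccuracy(sensitivity, specificity, odds_ratio, ci_low, ci_high)


def keep_sign(adjusted_or: float, p_value: float,
              min_or: float = 2.0, alpha: float = 0.05) -> bool:
    """Retention rule from the text: omit if p > 0.05 or adjusted OR < 2."""
    return p_value <= alpha and adjusted_or >= min_or


if __name__ == "__main__":
    # Hypothetical counts for illustration only; not taken from the dataset.
    result = sign_accuracy(tp=40, fp=60, fn=10, tn=890)
    print(result)
    print(keep_sign(adjusted_or=2.5, p_value=0.01))
```

The retention helper only encodes the stated thresholds; the adjusted ORs and p-values themselves would come from a multivariable logistic regression fitted separately.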

Ethics
This study was approved by the Ethics Review Committee of Aga Khan University and the Johns Hopkins University Institutional Review Board.

Pascal M Lavoie
Department of Pediatrics, The University of British Columbia, Vancouver, British Columbia, Canada
This study reports data from 2950 infants 0 to 59 days of age, enrolled from 3 sites in Pakistan in 2003-2004 as part of the Young Infant Clinical Signs Study. The purpose was to develop compounded clinical signs at presentation, predictive of an urgent need for referral, defined as "infants with severe illness requiring hospitalization as compared to an expert pediatrician diagnosis". The findings of this study formed the basis for the Integrated Management of Childhood Illness (IMCI) guidelines. The data is clearly presented. I don't have major comments. However, it would be helpful to clarify:
1) The statistical method description indicated that the analysis was replicated for the 0-6 days age group in Pakistani infants. However, ref #2 refers to children under age 2 months. The data presented includes all 2950 infants, presumably with infants >6 days. I did not see a specific analysis about the 0-6 days group in the current manuscript. Please clarify what this is referring to.
2) The data is clearly labeled, but for categorical variables I was not able to access information about how the categories were defined. For example, tage refers to age groups 1, 2, etc., but the data dictionary (YIS codebook Version 2.xlsx) did not seem to specify what these categories were. It may be in the original publication, but it would be useful to indicate in this specific article. For other things like "who attended delivery" I was not able to find information on how this was coded. Please clarify...

Julia Johnson
Division of Neonatology, Johns Hopkins University, Baltimore, Maryland, USA
The authors previously published their analysis of predictors for severe illness in the Pakistani infants enrolled in the YICSS study (reference 6). This data note makes available the site-specific data, linked appropriately, to promote ready replication of the previously performed analyses and/or inclusion of the data in future systematic reviews and meta-analyses. The data note is written clearly, with adequate description of data collection, entry, and management, and a brief overview of methods used for the analysis.

Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.

This is an interesting data note from one of the original centers of the Young Infant Study. It includes the important signs and symptoms for diagnosing underlying serious bacterial infection, which will be helpful for peripheral health workers to suspect sepsis in newborns. Form A: The authors have mentioned diarrhea in the history. However, many infants are admitted with a history of abdominal distension rather than diarrhea, with sepsis and ileus. Form B: Secondly, they have mentioned gestational age. Correct assessment of gestational age is critical. Unfortunately, it is often not assessed correctly in low- and middle-income countries. This needs to be highlighted. Rupture of membranes: The authors have mentioned 12 hours. Traditionally it is 18 or 24 hours. Please explain.

Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Neonatal ventilation, Infection control and Neonatal sepsis.

Salahuddin Ahmed
Projahnmo Research Foundation, Dhaka, Bangladesh
The rationale is described clearly. The methods are described clearly and are replicable by others. In the introduction, the authors need to add a reference for the statement that the majority of neonatal deaths occur due to infection. In the data collection section, "Infant pulse and oximetry were also performed" is written, but the most commonly used term is pulse oximetry. Which pulse oximeter was used? This needs to be added. How many days of training did the LHV receive to assess children less than 60 days old?
Is the rationale for creating the dataset(s) clearly described? Yes

Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Infectious disease epidemiology, technology used to diagnose disease, pneumonia
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Are the datasets clearly presented in a useable and accessible format? Partly
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Neonatology, global health research, neonatal sepsis, immunology, clinical care, epidemiology.
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Reviewer Report 19 July 2024
https://doi.org/10.21956/gatesopenres.14560.r36061
© 2024 Johnson J. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Reviewer Report 19 March 2024
https://doi.org/10.21956/gatesopenres.14560.r35885
© 2024 Saha B. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Bijan Saha
Department of Neonatology, Institute of Post Graduate Medical Education & Research and SSKM Hospital, Kolkata, West Bengal, India

Reviewer Report 29 November 2021
https://doi.org/10.21956/gatesopenres.14560.r31413
© 2021 Ahmed S. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.