Administrative Data Improves Quality of Cervical Pre-cancer Surveillance in Davidson County, Tennessee, United States

Background: Accurate data are critical for public health surveillance yet can be challenging to ensure. The Tennessee (TN) HPV Vaccine IMPACT Project aims to assess the effectiveness of the human papillomavirus (HPV) vaccine in preventing cervical cancer and high-grade dysplasia through laboratory reporting of pathology results among Davidson County women. This project assessed the feasibility and value of using administrative data sources to improve the quality and completeness of data on high-grade cervical events in TN HPV-IMPACT between 2013 and 2017. Method: We queried three administrative data systems (Hospital Discharge Data System, Ambulatory Surgical Treatment Center, and Tennessee Medicaid [TennCare]) for eligible women with cervical pre-cancer diagnostic and procedural codes from 2013 to 2017. We assessed data completeness from standard surveillance practices and from the addition of cases identified and verified through linkage with administrative data. Additionally, eligible women were linked to TennCare to fill in missing demographic, insurance, and vaccination data elements. Results: Overall, use of administrative data systems increased the number of women identified with cervical pre-cancer by 5% during the study years. Linkage to TennCare improved data completeness on race/ethnicity, insurance, and vaccination status by 10% to 20%. Conclusion: Linkage with administrative databases is a feasible and effective method to improve public health data quality.


Introduction
Human papillomavirus (HPV) causes approximately 31,500 preventable cancers in women in the United States (U.S.) annually [1]. Since 2006, three HPV vaccines have been licensed in the United States [2]. The full impact of the vaccination program on HPV-associated cancers may not be observed for decades, given the natural history of HPV infection and carcinogenesis [3]. Therefore, monitoring efforts target intermediate outcomes, including high-grade cervical intraepithelial neoplasia grades 2 and 3 and adenocarcinoma in situ (together referred to as CIN2+), which can be detected earlier through cervical cancer screening [4]. Since 2008, the Tennessee HPV Vaccine IMPACT (TN HPV-IMPACT) Project has conducted active surveillance of diagnostic pathology laboratories to identify all pathology reports of CIN2+ in women aged ≥18 years (the age at which cervical cancer screening was recommended to start in 2008) living in Davidson County, TN, to examine the effectiveness of HPV vaccination in the population. Complete data collection of all CIN2+ events is critical for accurate and unbiased public health reporting. However, assessing data completeness through individual, detailed auditing of reporting laboratories, as TN HPV-IMPACT has done, is time-consuming and not always feasible.
This analysis reports the use of administrative data to improve public health data quality. Administrative data are generated at every health care encounter, whether through a visit to a physician's office, a diagnostic procedure, an admission to a hospital, or receipt of a prescription at a community pharmacy [5]. Administrative data have been used for a variety of public health purposes. For example, the Healthcare Effectiveness Data and Information Set (HEDIS) captures administrative and electronic health data and has been used to assess adherence to guidelines of care, such as cancer screening [6]. We aimed to demonstrate the usefulness of administrative data in public health data quality improvement. We performed audits using administrative data to identify Davidson County women aged ≥18 years with CIN2+ lesions who may have been missed by laboratory-based, standard surveillance practices in HPV-IMPACT.

Study population
TN HPV-IMPACT is part of a multi-site, ongoing, population-based surveillance program funded by the CDC to monitor the incidence of CIN2+ (reportable conditions in TN) among women aged ≥18 years and examine the effectiveness of HPV vaccination. Since 2008, the project has monitored the incidence of CIN2+ among women aged ≥18 years because, at the time, U.S. cervical cancer screening guidelines recommended routine cervical dysplasia screening only for women ≥18 years of age. This surveillance is part of the Emerging Infections Program funded by the Centers for Disease Control and Prevention (CDC). The catchment area, Davidson County, TN, is one of the five sites (California, Connecticut, New York, Oregon, and Tennessee) conducting this surveillance project [7]. The public health activities of TN HPV-IMPACT were considered non-human research by the institutional review boards of CDC, Vanderbilt University Medical Center, and the Tennessee Department of Health.

Data sources
As part of standard surveillance, we identify Davidson County women with CIN2+ diagnoses through reporting by pathology laboratories. To audit reporting completeness for the years 2013 through 2017, we used three different TN administrative data systems, one of which contained Medicaid data. Medicaid provides health coverage to millions of people in the U.S., including eligible low-income adults, children, pregnant women, elderly adults, and people with disabilities. Medicaid is administered by states, according to federal requirements. The program is funded jointly by states and the federal government [8]. The Tennessee Medicaid (TennCare) data system contains comprehensive medical claims data on low-income women enrolled in the state program [9]. The second system, the TN Hospital Discharge Data System (HDDS), contains medical claims data from hospital-based inpatient and outpatient surgical procedures. The third system, the TN Ambulatory Surgery Treatment Center data system (ASTC), contains medical claims data on outpatient procedures performed in ambulatory surgery centers [10]. We receive these data from the Tennessee Department of Health (TDH). These datasets are maintained by TDH, which routinely checks them for data quality before releasing them for public health surveillance purposes. To further strengthen data completeness, we used three datasets rather than one.

Inclusion and exclusion criteria
For all three data systems, we used Current Procedural Terminology (CPT) codes for colposcopy (57421, 57454, 57455, 57456, 57460, 57461, 57500, 57520, and 57522), International Classification of Diseases (ICD) 9 and 10 codes for cervical dysplasia (ICD-9: 233.1, 622.1, 622.11, 622.12, and 795.04; ICD-10: N87.1, N87.2, N87.9, D06.0, D06.1, D06.7, D06.9, and R87.613; ICD-9 for 2013 through September 2015 and ICD-10 from October 2015 through 2017), and Davidson County zip codes to identify women in the catchment area. Using a window of seven days before or after the procedure or diagnosis date, we searched for information including the woman's address, hospital or facility name, physician name, and insurance to validate new or known events. We excluded women already identified through standard TN HPV-IMPACT surveillance and those outside the surveillance area. Women with potential newly identified events were then investigated by review of relevant pathology reports. If a pathology report could not be retrieved, the diagnosis was considered not confirmed and was therefore not included as a new event. If a woman had a qualifying CIN2+ diagnosis, was living in Davidson County at the time of her diagnosis, and had not previously been identified, she was classified as identified through audit.
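The screening step described above can be illustrated with a minimal Python sketch. It is not the project's actual query: the record fields (`person_id`, `zip`, `service_date`, `cpt`, `icd`) are hypothetical, the ZIP code in the example is illustrative only, and the sketch assumes a claim qualifies if it carries either a listed CPT code or a listed ICD code for the coding era of its service date, which the text does not specify.

```python
from datetime import date

# Codes listed in the inclusion criteria.
CPT_CODES = {"57421", "57454", "57455", "57456", "57460",
             "57461", "57500", "57520", "57522"}
ICD9_CODES = {"233.1", "622.1", "622.11", "622.12", "795.04"}
ICD10_CODES = {"N87.1", "N87.2", "N87.9", "D06.0", "D06.1",
               "D06.7", "D06.9", "R87.613"}
ICD10_START = date(2015, 10, 1)   # ICD-10 applies from October 2015
STUDY_START, STUDY_END = date(2013, 1, 1), date(2017, 12, 31)


def has_qualifying_code(claim):
    """Assumption: a listed CPT code OR an era-appropriate ICD code qualifies."""
    if claim["cpt"] in CPT_CODES:
        return True
    icd_set = ICD10_CODES if claim["service_date"] >= ICD10_START else ICD9_CODES
    return claim["icd"] in icd_set


def audit_candidates(claims, catchment_zips, known_person_ids):
    """Claims in the study period and catchment area with a qualifying code,
    excluding women already identified through standard surveillance."""
    return [
        c for c in claims
        if STUDY_START <= c["service_date"] <= STUDY_END
        and c["zip"] in catchment_zips
        and has_qualifying_code(c)
        and c["person_id"] not in known_person_ids
    ]


def within_window(claim_date, event_date, days=7):
    """True if the claim falls within `days` before or after the event date."""
    return abs((claim_date - event_date).days) <= days


# Hypothetical records, for illustration only.
claims = [
    {"person_id": "A", "zip": "37203", "service_date": date(2014, 5, 1),
     "cpt": "57460", "icd": None},
    {"person_id": "B", "zip": "37203", "service_date": date(2016, 2, 1),
     "cpt": None, "icd": "622.11"},   # ICD-9 code dated in the ICD-10 era
    {"person_id": "C", "zip": "37203", "service_date": date(2016, 2, 1),
     "cpt": None, "icd": "N87.1"},    # already known to surveillance
    {"person_id": "D", "zip": "99999", "service_date": date(2014, 5, 1),
     "cpt": "57460", "icd": None},    # outside the catchment area
]
candidates = audit_candidates(claims, {"37203"}, {"C"})
```

In this sketch only woman A remains as a candidate. Candidates would still need confirmation by pathology report review before being counted as new events.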
In addition, we linked women identified through standard surveillance to the TennCare data system to fill in data elements missing from routine surveillance in HPV-IMPACT, specifically HPV vaccination, insurance status, and race/ethnicity. Race/ethnicity information was obtained for women who were ever enrolled in TennCare. Insurance status was confirmed for women enrolled in TennCare at the time of their procedure. We obtained HPV vaccine type, number of doses, and date information for women insured by TennCare using CPT codes (90649, 90650, and 90651). Women were categorized as unvaccinated if there were no codes for HPV vaccination and they were continuously enrolled in TennCare from June 1, 2006 (HPV vaccine release date) through the date of CIN2+ diagnosis. Race/ethnicity, insurance, and vaccine information on each woman was classified as new information if it was previously missing and was found through the TennCare data system.
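The vaccination classification rule above can be sketched as follows. This is an illustrative sketch, not the project's code: the input shapes (a list of CPT codes per woman and a list of enrollment `(start, end)` date pairs) are hypothetical, any gap of one day or more is assumed to break continuous enrollment, and the `"unknown"` label for women who meet neither condition is an assumption, since the text does not name that category.

```python
from datetime import date, timedelta

HPV_VACCINE_CPT = {"90649", "90650", "90651"}   # codes named in the text
HPV_VACCINE_RELEASE = date(2006, 6, 1)          # release date given in the text


def continuously_enrolled(spans, start, end):
    """True if the enrollment spans cover every day from start through end.

    `spans` is a list of (span_start, span_end) dates, both inclusive.
    Assumption: any uncovered day counts as a break in enrollment.
    """
    covered_through = start - timedelta(days=1)   # last day known to be covered
    for span_start, span_end in sorted(spans):
        if span_start > covered_through + timedelta(days=1):
            break                                 # gap before this span begins
        covered_through = max(covered_through, span_end)
        if covered_through >= end:
            return True
    return covered_through >= end


def vaccination_status(claim_cpts, spans, diagnosis_date):
    """Classify a woman from her TennCare claims and enrollment history."""
    if HPV_VACCINE_CPT & set(claim_cpts):
        return "vaccinated"
    if continuously_enrolled(spans, HPV_VACCINE_RELEASE, diagnosis_date):
        return "unvaccinated"
    return "unknown"   # hypothetical label: no codes, but enrollment had gaps
```

The continuous-enrollment requirement matters because a woman without vaccination codes could have been vaccinated while covered by another insurer; absence of codes is informative only when TennCare would have seen every claim.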

Ethical issue
Use of the datasets was permitted by the institutions involved.

Results
From 2013 through 2017, we identified 2580 Davidson County women with procedure and diagnosis codes indicating possible CIN2+ using the three administrative data systems (Fig. 1). Of these, 1469 (57%) were excluded because they had already been identified through standard surveillance in HPV-IMPACT (838 women from HDDS, 60 from ASTC, and 571 from TennCare). These 1469 previously identified women also represented 70% of the 2094 women captured using standard surveillance. Of the remaining 1111 potentially new women captured by the administrative audits, 23 (2%) were excluded because the pathology report could not be reviewed. Of the remaining 1088 women whose pathology reports were available, 106 (4% of the 2580 women initially identified) met criteria by catchment area and CIN2+ diagnosis and had not been identified through standard surveillance (Fig. 1). These 106 women identified through administrative audit contributed 5% of total cases of CIN2+ (n = 2200). Each administrative data system yielded between 3% and 6% additional women with CIN2+ from 2013 through 2017 (Table 1).
To augment other missing data elements, linkage to TennCare yielded 1108 women in TN HPV-IMPACT who were enrolled at some point, representing 50% of all women. For these women, we obtained new race/ethnicity information for a total of 97 (9%). Of the women enrolled in TennCare at any time, 566 (51%) were enrolled at the time of their CIN2+ diagnosis, yielding new insurance information for 201 women (18%). In addition, 202 (18%) women had new HPV vaccination history confirmed through TennCare data. For each of these three metrics (race/ethnicity, insurance at time of diagnosis, and vaccination history), 10% to 20% of the data were new information that had previously been missing (Table 2).
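As a check on the arithmetic, the counts reported above can be reproduced with a short Python sketch. The variable names are illustrative; the numbers are those stated in the text. The sketch also shows that 106 is about 4% of the 2580 women initially identified but about 10% of the 1088 women with available pathology reports, so the reported 4% appears to use the former as its denominator.

```python
def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Audit flow (counts as reported in the text).
flagged = 2580                 # women with qualifying codes
previously_identified = 1469   # already captured by standard surveillance
standard_surveillance = 2094   # all women captured by standard surveillance
no_pathology_report = 23
new_via_audit = 106

remaining = flagged - previously_identified           # 1111
reviewed = remaining - no_pathology_report            # 1088
total_cases = standard_surveillance + new_via_audit   # 2200

audit_summary = {
    "previously identified / flagged": pct(previously_identified, flagged),
    "previously identified / standard": pct(previously_identified, standard_surveillance),
    "no report / remaining": pct(no_pathology_report, remaining),
    "new / flagged": pct(new_via_audit, flagged),
    "new / reviewed": pct(new_via_audit, reviewed),
    "new / total cases": pct(new_via_audit, total_cases),
}

# TennCare linkage (counts as reported in the text).
ever_enrolled = 1108
linkage_summary = {
    "ever enrolled / total cases": pct(ever_enrolled, total_cases),
    "new race/ethnicity": pct(97, ever_enrolled),
    "enrolled at diagnosis": pct(566, ever_enrolled),
    "new insurance": pct(201, ever_enrolled),
    "new vaccination history": pct(202, ever_enrolled),
}

for label, value in {**audit_summary, **linkage_summary}.items():
    print(f"{label}: {value}%")
```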

Discussion
Augmenting standard surveillance with administrative data resulted in a 5% increase in women with CIN2+. Using administrative data not only identified these additional women but also validated the high completeness of laboratory reporting of women with CIN2+.
We did not observe a common cause or source of the 106 newly identified cases. The 13 pathology laboratories that serve our catchment area use different search methods (natural language processing, manual keyword search, SNOMED and ICD codes) to report CIN2+ conditions. Some of the missed cases captured through our audits could be due to differences in search methods (ICD codes in the audits versus manual keyword search in regular laboratory reporting). We plan to share the results of these audits with our laboratory facilities to improve their case identification and reporting processes. Laboratory reporting is a very useful public health surveillance tool, but personnel changes, laboratory policies, company restructuring, and, most recently, a pandemic can cause changes or lapses in reporting. Administrative audits can detect such lapses. In addition, linkage with an insurance data system (TennCare) helped to capture missing vaccination, insurance, and race/ethnicity information.
Our study is novel in reporting how administrative data can be used to augment public health disease surveillance information for auditing and data quality improvement purposes. Administrative data have been used to augment clinical registries or check the quality of registries in clinical and healthcare studies [11]. Administrative data are also used for validation purposes or quality assessment [12,13]. Public health programs have used administrative data for capture-recapture analyses to assess disease burden within a population [14,15]. In this study, we used administrative data for data quality assessment of our standard surveillance methods. Similar uses of administrative data have been reported by other public health entities. The New York State Department of Health, for example, used administrative data to validate the accuracy of surgical site infection data reported to the National Healthcare Safety Network by hospitals between 2009 and 2010. They reviewed a sample of surgical site infections identified by administrative data as an efficient means to identify errors in reporting [16]. The results of our study further support the use of administrative data sources for public health disease surveillance programs.
Administrative audits have some limitations. Administrative data systems may not include all relevant search criteria for a given outcome. The HDDS and ASTC data systems contain only cervical treatment procedures, such as loop electrosurgical excision and conization procedures, and do not contain all diagnostic procedures, such as cervical biopsies and endocervical curettage. Therefore, women missed through standard surveillance who had only diagnostic procedures, with no subsequent treatment procedure, would not have been captured using those databases. Additionally, pathology reports were not available for 2% of the potentially new women identified through the three administrative data systems. We were therefore unable to determine whether they met project criteria, and they were not counted. We also recognize that administrative data may not be readily available to all public health surveillance programs. Lastly, women could be missed because of coding errors in the data systems or incorrect addresses. Coding errors in administrative data due to human error are difficult to avoid but are generally rare. Using more than one administrative data system further reduces the risk that coding errors or missing data in any one source lead to missed cases. Overall, we found the use of administrative data to be an efficient and feasible method to audit laboratories for incomplete reporting and assure data completeness. Alternative approaches for data audits when there are large numbers of events or limited resources include sampling techniques.

Conclusions
This report highlights the usefulness of administrative data to improve public health data quality and accuracy. Overall, we found that the use of administrative data systems successfully improved data completeness for TN HPV-IMPACT. The use of these types of data, which are collected by many State Departments of Health, may have quality improvement applications for other public health surveillance programs.

Fig. 1. Total women identified through administrative audit data systems (HDDS, ASTC, and TennCare).
a As part of standard surveillance, the TN HPV-IMPACT Project identifies pathology reports of CIN2+ diagnoses in Davidson County women through laboratory-based reporting.
b TN HPV-IMPACT Project criteria include living in the catchment area (Davidson County) at the time of procedure and having a CIN2+ diagnosis.

Table 1. Total women identified through standard surveillance and administrative audit data systems (HDDS, ASTC, and TennCare) by year.

Table 2. New race/ethnicity, insurance, and vaccine information identified through the TennCare data system.