A multi-stage approach to maximizing geocoding success in a large population-based cohort study through automated and interactive processes

Sonderman, Jennifer S. and Mumma, Michael T. and Cohen, Sarah S. and Cope, Elizabeth L. and Blot, William J. and Signorello, Lisa B. (2012) A multi-stage approach to maximizing geocoding success in a large population-based cohort study through automated and interactive processes. Geospatial Health, 6 (2). pp. 273-284. ISSN 1970-7096

Official URL: http://www.geospatialhealth.unina.it/


To enable spatial analyses within a large, prospective cohort study of nearly 86,000 adults enrolled in a 12-state area in the southeastern United States of America from 2002-2009, a multi-stage geocoding protocol was developed to efficiently maximize the proportion of participants assigned an address-level geographic coordinate. Addresses were parsed, cleaned and standardized before applying a combination of automated and interactive geocoding tools. Our full protocol increased the non-Post Office (PO) Box match rate from 74.5% to 97.6%. Overall, we geocoded 99.96% of participant addresses, with only 5.2% at the ZIP code centroid level (2.8% PO Box and 2.3% non-PO Box addresses). One key to reducing the need for interactive geocoding was the use of multiple base maps. Still, addresses in areas with population density <44 persons/km² were much more likely to require resource-intensive interactive geocoding than those in areas with >920 persons/km² (odds ratio (OR) = 5.24; 95% confidence interval (CI) = 4.23, 6.49), as were addresses collected from participants during in-person interviews compared with mailed questionnaires (OR = 1.83; 95% CI = 1.59, 2.11). This study demonstrates that population density and address ascertainment method can influence automated geocoding results and that high success in address-level geocoding is achievable for large-scale studies covering wide geographical areas.
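The staged flow described in the abstract (standardize, try several base maps automatically, then hand off to interactive review) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the names `standardize`, `geocode`, `base_maps` and `zip_centroid`, the stage labels, and the PO Box pattern are all hypothetical, and each base map is modeled as a plain function that returns a coordinate or `None`.

```python
import re
from typing import Callable, Optional, Sequence, Tuple

Coord = Tuple[float, float]
# Hypothetical interface: a geocoder maps a text key to a coordinate or None.
Geocoder = Callable[[str], Optional[Coord]]

# Assumed pattern for PO Box addresses; real data would need a broader rule set.
PO_BOX = re.compile(r"\bP\.?\s*O\.?\s*BOX\b|\bPOST\s+OFFICE\s+BOX\b", re.IGNORECASE)


def standardize(address: str) -> str:
    """Drop stray punctuation, collapse whitespace, and upper-case the address."""
    cleaned = re.sub(r"[^\w\s#/-]", " ", address)
    return re.sub(r"\s+", " ", cleaned).strip().upper()


def geocode(
    address: str,
    zip_code: str,
    base_maps: Sequence[Tuple[str, Geocoder]],
    zip_centroid: Geocoder,
) -> Tuple[Optional[Coord], str]:
    """Return (coordinate, stage label) for one address."""
    std = standardize(address)

    if PO_BOX.search(std):
        # PO Boxes have no street location, so only the ZIP centroid is tried.
        coord = zip_centroid(zip_code)
        return (coord, "zip_centroid") if coord is not None else (None, "unmatched")

    # Automated stage: try each base map in order and stop at the first match.
    for name, geocoder in base_maps:
        coord = geocoder(std)
        if coord is not None:
            return coord, f"address:{name}"

    # No automated match: queue the address for interactive geocoding.
    return None, "interactive_review"
```

In this sketch, an address that still cannot be placed after interactive review would be passed to `zip_centroid` as a last resort, which mirrors the small share of non-PO Box addresses the abstract reports at the ZIP code centroid level.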

Item Type: Article
Uncontrolled Keywords: Epidemiologic methods, geographical information systems, prospective studies, residence characteristics, United States of America
Subjects: 600 Technology - Applied sciences > 610 Medicine and health (class here the technology of medical services) > 614 Forensic medicine; incidence of diseases; public preventive medicine > 614.4 Incidence of diseases and public measures to prevent them (class here epidemiology, clinical epidemiology) > 614.42 Incidence of diseases and public measures to prevent them. Incidence (class here prevalence; medical geography; spatial epidemiology; health surveys)
900 History, geography and auxiliary disciplines > 910 Geography and travel > 910.285 Geographic information systems
Depositing User: Chiara Ceccucci
Date Deposited: 13 Aug 2012 12:20
Last Modified: 13 Aug 2012 12:20
URI: http://eprints.bice.rm.cnr.it/id/eprint/4206
