Working toward effective anonymization for surveillance data: innovation at South Africa’s Agincourt Health and Socio-Demographic Surveillance Site

We report findings from analyses of the error introduced by several masking techniques applied to data from the Agincourt Health and Socio-Demographic Surveillance System in rural South Africa. Using a vegetation index, the Normalized Difference Vegetation Index (NDVI), at the household scale, we compare the “true” NDVI values with those calculated after masking. We also examine the tradeoffs between accuracy and the protection of respondent privacy. The exploration suggests that, in this study setting and for NDVI, geomasking approaches that use buffers and account for population density produce the most accurate results. However, the exploration also clearly demonstrates the tradeoff between accuracy and privacy: greater accuracy comes with a higher potential for respondent identification. These analyses illustrate a process that should characterize spatially informed research, but within which particular decisions must be shaped by the research setting and objectives. In the long run, we aim to provide insight into masking’s potential and perils in order to facilitate population-environment-health research.
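To make the idea of buffer-based, density-aware geomasking concrete, the following is a minimal sketch of one common variant, often called "donut" masking, in which each household point is displaced by a random distance between an inner and an outer radius. It is not the authors' implementation: the function names, the radii, the density-scaling rule, and the synthetic `toy_ndvi` surface are all assumptions made for illustration, and coordinates are assumed to be in metres in a projected system.

```python
import math
import random


def donut_mask(x, y, r_min, r_max, rng):
    """Displace (x, y) by a random distance in [r_min, r_max] in a random direction."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    # Sampling the squared radius uniformly spreads points evenly by area over the annulus.
    dist = math.sqrt(rng.uniform(r_min ** 2, r_max ** 2))
    return x + dist * math.cos(angle), y + dist * math.sin(angle)


def density_scaled_radii(density, base_min=50.0, base_max=250.0, ref_density=100.0):
    """Hypothetical rule: smaller displacements where population density is higher."""
    scale = math.sqrt(ref_density / density)
    return base_min * scale, base_max * scale


def toy_ndvi(x, y):
    """Synthetic smooth surface standing in for a real NDVI raster (assumption)."""
    return 0.5 + 0.25 * math.sin(x / 500.0) + 0.25 * math.cos(y / 500.0)


def mean_abs_ndvi_error(households, rng):
    """Mean absolute difference between 'true' and masked NDVI over all households."""
    errors = []
    for x, y, density in households:
        r_min, r_max = density_scaled_radii(density)
        mx, my = donut_mask(x, y, r_min, r_max, rng)
        errors.append(abs(toy_ndvi(x, y) - toy_ndvi(mx, my)))
    return sum(errors) / len(errors)


# Illustrative households: (x, y, local population density); values are made up.
rng = random.Random(42)
households = [(1000.0, 2000.0, 50.0), (1500.0, 2500.0, 400.0), (3000.0, 1000.0, 100.0)]
print(mean_abs_ndvi_error(households, rng))
```

In this sketch, widening the radii increases the displacement and so tends to increase the NDVI error, while narrowing them keeps masked points closer to the true locations, which mirrors the accuracy-privacy tradeoff described above.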
Source: Population and Environment. Category: Environmental Health. Source type: research.