Non-name identifiers have been suggested by some as an alternative to name-based HIV reporting. However, the approach has also been criticized because a non-name system may produce a high duplicate count and may be costly to develop and maintain. Because of these concerns, the HIV Epidemiology Program, Data Acquisition Unit, is evaluating the feasibility of non-name identifiers, using our AIDS surveillance system as a model.
- Evaluate the accuracy of various non-name identifier algorithms.
- Evaluate the completeness of reporting for the data elements needed to construct the non-name identifiers.
- For each algorithm, evaluate and compare the performance of three data matching techniques (exact match, fuzzy match, and wildcard match) to identify true matches and false matches.
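The three matching techniques named in the last objective can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the identifier layout (initials, date of birth, gender, last four SSN digits), the 0.9 similarity threshold, the use of `?` as the wildcard for an unknown character, and the function names. It does not represent the matching programs being designed for this evaluation.

```python
from difflib import SequenceMatcher
from fnmatch import fnmatchcase


def exact_match(a: str, b: str) -> bool:
    """Two coded identifiers match only if every character agrees."""
    return a == b


def fuzzy_match(a: str, b: str, threshold: float = 0.9) -> bool:
    """Match when the similarity ratio reaches a threshold.

    The 0.9 threshold is an illustrative assumption, not a value from the study.
    """
    return SequenceMatcher(None, a, b).ratio() >= threshold


def wildcard_match(pattern: str, code: str) -> bool:
    """Match a code against a pattern in which '?' stands for one unknown character.

    Assumes identifiers are plain letters and digits, so no other
    pattern characters appear.
    """
    return fnmatchcase(code, pattern)


# Hypothetical identifiers: initials + YYYYMMDD date of birth + gender + last four SSN digits
report_a = "JD19600102M1234"
report_b = "JD19600112M1234"   # one digit of the birth date differs
partial = "JD19600102M????"    # SSN digits not reported

print(exact_match(report_a, report_b))    # identifiers differ, so no exact match
print(fuzzy_match(report_a, report_b))    # near-identical strings can still match
print(wildcard_match(partial, report_a))  # unknown digits are treated as wildcards
```

The sketch also shows the trade-off the comparison is meant to quantify: looser techniques recover matches lost to data entry errors or missing elements, but they can also link reports that belong to different people.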
From September 1997 through December 1997, the personal identifying information from all AIDS case reports submitted by our reporting sources was entered into an ACCESS database. Nine non-name identifiers were generated based on specific algorithms using various combinations of initials, date of birth, gender, race/ethnicity, and the last four digits of the social security number. To date, sensitivity, specificity, positive predictive value, and negative predictive value have been calculated to evaluate each algorithm. Coded data have been transferred to the Centers for Disease Control and Prevention (CDC), where computer programming and statistical experts are designing data matching programs to identify the most accurate method for distinguishing true matches from false matches.
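As a rough illustration of the method, the sketch below builds one possible non-name identifier and computes the four accuracy measures from match decisions compared against known truth. The identifier layout, the function names, and the sample values are hypothetical; the nine algorithms actually evaluated are not specified here, and no study results are reproduced.

```python
def build_identifier(first: str, last: str, dob: str, gender: str, ssn_last4: str) -> str:
    """Concatenate data elements into a coded identifier.

    Hypothetical layout: first initial, last initial, date of birth
    (YYYYMMDD), gender code, last four SSN digits.
    """
    return f"{first[0]}{last[0]}{dob}{gender}{ssn_last4}".upper()


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def evaluate(predicted: list[bool], truth: list[bool]) -> dict[str, float]:
    """Compare match decisions with known truth for the same record pairs.

    predicted[i] is True when the algorithm declared pair i a match;
    truth[i] is True when pair i really is the same person.
    """
    tp = sum(p and t for p, t in zip(predicted, truth))
    fp = sum(p and not t for p, t in zip(predicted, truth))
    fn = sum(not p and t for p, t in zip(predicted, truth))
    tn = sum(not p and not t for p, t in zip(predicted, truth))
    return {
        "sensitivity": _ratio(tp, tp + fn),  # true matches correctly identified
        "specificity": _ratio(tn, tn + fp),  # true non-matches correctly identified
        "ppv": _ratio(tp, tp + fp),          # declared matches that are real
        "npv": _ratio(tn, tn + fn),          # declared non-matches that are real
    }


# Fictional person and made-up decisions, for illustration only
code = build_identifier("Jane", "Doe", "19600102", "F", "1234")
metrics = evaluate(
    predicted=[True, True, False, False, True],
    truth=[True, False, False, True, True],
)
print(code)
print(metrics)
```

Running the same evaluation for each algorithm and each matching technique would give a table of the four measures that can be compared directly.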
Gordon Bunch, M.A.