HIDE: Short IDs for Robust and Anonymous Linking of Users Across Multiple Sessions in Small HCI Experiments
Record linking is needed to analyze observations across multiple sessions. However, recent privacy legislation such as the General Data Protection Regulation (GDPR) restricts the storage of information that identifies individuals. Obtaining permission to store information about individuals can be bureaucratic and time-consuming. Anonymous schemes such as self-generated ids and machine-generated ids have been proposed. However, self-generated linking ids demand effort from the participants, while machine-assisted schemes typically generate long and incomprehensible ids. Consequently, there is a risk that students and researchers will limit their research to single-session experiments to avoid privacy issues. To simplify the administration of small multi-session experiments, the HIDE procedure is proposed for generating short human-readable ids that link participants across multiple sessions while maintaining anonymity and being robust to input errors. Unlike previous approaches, the goal is to minimize the length of the linking ids. First, the procedure converts the participant’s name into a phonetic representation. Next, this phonetic representation is hashed, and a truncated snippet of the hash is used as the linking id. HIDE is initialized by searching for a salt that minimizes the id lengths. Experiments show that the procedure is capable of coding small experiments with 20 participants using two digits, and experiments of around 200 participants using four digits. An implementation of the procedure has been made available through a simple web interface. It is hoped that the procedure can help students and HCI researchers collect more comprehensive data by following participants over time, while protecting their privacy.
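The three steps described above (phonetic conversion, salted hashing with truncation, and the salt search) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the phonetic algorithm, the hash function, or how the hash is truncated, so the simplified Soundex encoding, SHA-256, the decimal truncation, and the names `phonetic`, `link_id`, and `find_salt` are all assumptions made for illustration.

```python
import hashlib

# Soundex-style consonant groups. ASSUMPTION: the abstract does not name the
# phonetic algorithm, so a simplified Soundex stands in for it here.
_GROUPS = {"1": "bfpv", "2": "cgjkqsxz", "3": "dt", "4": "l", "5": "mn", "6": "r"}
_CODES = {ch: digit for digit, chars in _GROUPS.items() for ch in chars}


def phonetic(name: str) -> str:
    """Return a 4-character simplified Soundex key for a name."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    key = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for c in letters[1:]:
        code = _CODES.get(c, "")
        if code and code != prev:
            key += code
        if c not in "hw":  # h and w do not separate repeated codes
            prev = code
    return (key + "000")[:4]


def link_id(name: str, salt: int, digits: int) -> str:
    """Hash the salted phonetic key and keep only `digits` decimal digits."""
    # ASSUMPTION: SHA-256 and modulo truncation are illustrative choices.
    payload = f"{salt}:{phonetic(name)}".encode("utf-8")
    value = int(hashlib.sha256(payload).hexdigest(), 16)
    return str(value % 10**digits).zfill(digits)


def find_salt(names, max_salt=10_000, max_digits=6):
    """Search for the shortest id length, and a salt, with no id collisions."""
    # Names that share a phonetic key cannot be told apart, so count keys.
    n_keys = len({phonetic(n) for n in names})
    for digits in range(1, max_digits + 1):
        if n_keys > 10**digits:
            continue  # more keys than available ids at this length
        for salt in range(max_salt):
            ids = {link_id(n, salt, digits) for n in names}
            if len(ids) == n_keys:
                return salt, digits
    raise ValueError("no collision-free salt found within the search limits")
```

In this sketch, calling `find_salt` once on the participant list yields a `(salt, digits)` pair that is then reused in every session, so `link_id("John", salt, digits)` and `link_id("Jon", salt, digits)` produce the same id because both spellings share a phonetic key. Trying shorter lengths first is what keeps the ids short: a longer id is only accepted when no salt in the search range avoids collisions at the shorter length.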