“Anonymised” data lies at the core of everything from modern medical research to personalised recommendations and AI techniques. Unfortunately, according to a paper, successfully anonymising data is practically impossible for any complex dataset. An anonymised dataset is supposed to have had all personally identifiable information removed from it, while retaining a core of useful information for researchers to work on without fear of invading privacy. For instance, a hospital may remove patients’ names, addresses and dates of birth from a set of health records in the hope that researchers can use the large sets of records to uncover hidden links between conditions. But in practice, data can be deanonymised in a number of ways. In 2008, an anonymised Netflix dataset of film ratings was deanonymised by comparing the ratings with public scores on the IMDb film website; in 2014, the home addresses of New York […]