Last week I wrote about a very early ‘finding’ under Canada’s Personal Information Protection and Electronic Documents Act which raises some issues about how the law might apply in the rapidly developing big data environment. This week I look at a more recent ‘finding’ – this time 5 years old – that should raise red flags regarding the extent to which Canada’s laws will protect individual privacy in the big data age.
In 2009, Assistant Privacy Commissioner Elizabeth Denham (who is now the B.C. Privacy Commissioner) issued her findings following an investigation into a complaint by the Canadian Internet Policy and Public Interest Clinic (CIPPIC) about the practices of a Canadian direct marketing company. The company combined information from different sources to create profiles of individuals linked to their home addresses. Customized mailing lists based on these profiles were then sold to clients seeking to market their products or services to individuals within particular demographics.
Consumer profiling is a big part of big data analytics, and today consumer profiles will draw upon vast stores of personal information collected from a broad range of online and offline sources. The data sources at issue in this case were much simpler, but the lessons that can be learned remain important.
The respondent organization used aggregate geodemographic data, obtained from Statistics Canada and sorted according to census dissemination areas. This data was not specific to particular identifiable individuals – the aggregated data was not meant to reveal personal information, but it did give a sense of, for example, the distribution of income by geographic area (in this case, by postal code). The company then matched this demographic data to name and address information taken from telephone directories. Based on the geodemographic data, assumptions were made about income, marital status, likely home ownership, and so on. The company also added its own assumptions about religion, ethnicity and gender based upon the telephone directory information – essentially drawing inferences from the subscribers’ names. These assumptions were made according to ‘proprietary models’. Other proprietary models were used to infer whether the individuals lived in single- or multi-family dwellings. The result was a set of profiles of named individuals with inferences drawn about their income, ethnicity and gender. CIPPIC’s complaint was that the respondent company was collecting, using and disclosing the personal information of Canadians without their consent.
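To make the mechanics concrete, here is a minimal sketch in Python of that kind of record linkage. Everything in it is invented for illustration – the field names, the sample data, and the crude name-based lookups standing in for the respondent’s ‘proprietary models’, which were never made public.

```python
# Toy reconstruction of the profiling pipeline described above.
# All field names, data, and inference rules are invented for
# illustration; the actual proprietary models are not public.

# Aggregate geodemographic data keyed by postal code (as derived from
# census dissemination areas): no individual is identified here.
AGGREGATE_STATS = {
    "K1A 0A1": {"median_income": 95_000, "pct_homeowners": 0.82},
    "M5V 2T6": {"median_income": 61_000, "pct_homeowners": 0.35},
}

# Name and address data as it appears in a telephone directory:
# "publicly available personal information" under the regulations.
DIRECTORY = [
    {"name": "Leslie O'Keefe", "address": "12 Elm St", "postal_code": "K1A 0A1"},
    {"name": "Maria Rossi", "address": "88 King St W", "postal_code": "M5V 2T6"},
]

# Stand-ins for the proprietary models: crude lookups from name
# fragments to an inferred ethnicity and gender.
SURNAME_ETHNICITY = {"O'Keefe": "Irish", "Rossi": "Italian"}
FIRST_NAME_GENDER = {"Leslie": "female", "Maria": "female"}


def build_profile(entry):
    """Attach neighbourhood-level statistics and name-based inferences
    to a single directory listing, yielding an individual profile."""
    first, _, last = entry["name"].partition(" ")
    stats = AGGREGATE_STATS.get(entry["postal_code"], {})
    return {
        **entry,
        # Aggregate data, now attributed to a named individual.
        "assumed_income": stats.get("median_income"),
        "assumed_homeowner": stats.get("pct_homeowners", 0) > 0.5,
        # Inferences drawn solely from the subscriber's name.
        "assumed_ethnicity": SURNAME_ETHNICITY.get(last, "unknown"),
        "assumed_gender": FIRST_NAME_GENDER.get(first, "unknown"),
    }


if __name__ == "__main__":
    for listing in DIRECTORY:
        print(build_profile(listing))
```

The output of build_profile is precisely the artifact at issue in the complaint: a named individual at a home address, tied to attributed income, home ownership, ethnicity and gender.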
The findings of the Assistant Privacy Commissioner (APC) are troubling for a number of reasons. She began by characterizing the telephone directory information as “publicly available personal information”. Under PIPEDA, information that falls into this category, as defined by the regulations, can be collected, used and disclosed without consent, so long as the collection, use and disclosure are for the purposes for which it was made public. Telephone directories fall within the Regulations Specifying Publicly Available Information. However, the respondent organization did more than simply resell directory information.
Personal information is defined in PIPEDA as “information about an identifiable individual”. The APC characterized the aggregate geodemographic data as information about certain neighborhoods, and not information about identifiable individuals. She stated that “the fact that a person lives in a neighborhood with certain characteristics” was not personal information about that individual.
The final piece of information associated with the individuals in this case was the set of assumptions about, among other things, religion, ethnicity and gender. The APC characterized these as “assumptions”, rather than personal information – after all, the assumptions might not be correct.
Because the respondent’s clients provided the company with the demographic characteristics of the groups they sought to reach, and because the respondent company merely furnished names and addresses in response to these requests, the APC concluded that the only personal information collected, used or disclosed was publicly available personal information for which consent was not required. (And, in case you are wondering, allowing people to contact individuals was one of the purposes for which telephone directory information is published – so the “use” of sending out marketing information fell within the scope of the exception.)
Thus, because each piece of information used in the profile was considered separately, the respondent’s creation of consumer profiles from diffuse information sources fell right through the cracks in Canada’s data protection legislation. This does not bode well for consumer privacy in an age of big data analytics.
The most troubling part of the APC’s approach is the dismissal of “assumptions” made about individuals as merely assumptions and not personal information. Consumer profiling is about attributing characteristics to individuals based on an analysis of their personal information from a variety of sources. It is also about acting on those assumptions once the profile is created. The assumptions may be wrong and the data may be flawed, but the consumer will nonetheless have to bear the effects of that profile. These effects may be as minor as receiving advertising that may or may not match their activities or interests; but they could be as significant as decisions about entitlement to certain products or services, about what price they should be offered for those products or services, or about their desirability as a customer, tenant or employee. If the assumptions are not “actual” personal information, they certainly have the same effect, and should be treated as personal information. Indeed, the law accepts that personal information in the hands of an organization may be incorrect (hence the right to correct personal information), and it accepts that opinions about an individual constitute that individual’s personal information, even though the opinions may be unfair.
The treatment of the aggregate geodemographic information is also problematic. On its own, aggregate geodemographic information is safely described as information about neighborhoods rather than about individuals. But when someone looks up the names and addresses of the individuals living in an area and matches them to the average age, income and other data associated with their postal codes, that aggregate data has been converted into personal information. As with the ethnicity and gender assumptions, the age, income and other assumptions may be close or they may be way off base. Either way, they become part of a profile of an individual that will be used to make decisions about that person. Leslie O’Keefe may not be Irish, he may not be a woman, and he may not make $100,000 a year – but if he is profiled in this way for marketing or other purposes, it is not clear why he should have no recourse under data protection laws.
Of course, the challenge faced by the APC in this case was how to manage the ‘balance’ set out in s. 3 of PIPEDA between the privacy interests of individuals and the commercial need to collect, use and disclose personal information. In this case, to find that consent – that cornerstone of data protection laws – was required for the use and disclosure of manufactured personal information would be to hamstring an industry built on the sale of such information. As the use – and the sophistication – of big data analytics advances, organizations will continue to insist that they cannot function or compete without massive stores of personal information. If this case is any indication, decision makers will be asked to continue to blur and shrink the edges of key concepts in the legislation, such as “consent” and “personal information”.
The PIPEDA complaint in this case dealt with relatively unsophisticated data used for relatively mundane purposes, and its importance may be too easily overlooked as a result. But how we define personal information and how we interpret data protection legislation will matter enormously as the role of big data analytics in our lives continues to grow. Both this decision and the one discussed last week offer some insights into how Canada’s data protection laws might be interpreted or applied – and they raise red flags about the extent to which these laws are adequately suited to protecting privacy in the big data era.