
Tuesday, 27 May 2025 05:18

New Clearview AI Decision Has Implications for OpenAI Investigation

The Alberta Court of King’s Bench has issued a decision in Clearview AI’s application for judicial review of an Order made by the province’s privacy commissioner. The Commissioner had ordered Clearview AI to take certain steps following a finding that the company had breached Alberta’s Personal Information Protection Act (PIPA) when it scraped billions of images – including those of Albertans – from the internet to create a massive facial recognition database marketed to police services around the world. The court’s decision is a partial victory for the commissioner. It is interesting and important for several reasons – including for its relevance to generative AI systems and the ongoing joint privacy investigation into OpenAI. These issues are outlined below.

Brief Background

Clearview AI became notorious in 2020 following a New York Times article which broke the story on the company’s activities. Data protection commissioners in Europe and elsewhere launched investigations, which overwhelmingly concluded that the company violated applicable data protection laws. In Canada, the federal privacy commissioner joined forces with the Quebec, Alberta and British Columbia (BC) commissioners, each of which has private sector jurisdiction. Their joint investigation report concluded that their respective laws applied to Clearview AI’s activities as there was a real and substantial connection to their jurisdictions. They found that Clearview collected, used and disclosed personal information without consent, and that no exceptions to consent applied. The key exception advanced by Clearview AI was the exception for “publicly available information”. The Commissioners found that the scope of this exception, which was similarly worded in the federal, Alberta and BC laws, required a narrow interpretation and that the definition in the regulations enacted under each of these laws did not include information published on the internet. The commissioners also found that, contrary to shared legislative requirements, the collection and use of the personal information by Clearview AI was not for a purpose that a reasonable person would consider appropriate in the circumstances. The report of findings made a number of recommendations that Clearview ultimately did not accept. The Quebec, BC and Alberta commissioners all have order making powers (which the federal commissioner does not). Each of these commissioners ordered Clearview to correct its practices, and Clearview sought judicial review of each of these orders. The decision of the BC Supreme Court (which upheld the Commissioner’s order) is discussed in an earlier post. The decision from Quebec has yet to be issued.

In Alberta, Clearview AI challenged the commissioner’s jurisdiction on the basis that Alberta’s PIPA did not apply to its activities. It also argued that the Commissioner’s interpretation of “publicly available information” was unreasonable. In the alternative, Clearview AI argued that ‘publicly available information’, as interpreted by the Commissioner, was an unconstitutional violation of its freedom of expression. It also contested the Commissioner’s finding that Clearview did not have a reasonable purpose for collecting, using and disclosing the personal information.

The Jurisdictional Question

Courts have established that Canadian data protection laws will apply where there is a real and substantial connection to the relevant jurisdiction. Clearview AI argued that it was a US-based company that scraped most of its data from social media websites mainly hosted outside of Canada, and that therefore its activities took place outside of Canada and its provinces. Yet, as Justice Feasby noted, “[s]trict adherence to the traditional territorial conception of jurisdiction would make protecting privacy interests impossible when information may be located everywhere and nowhere at once” (at para 50). He noted that there was no evidence regarding the actual location of the servers of social media platforms, and that Clearview AI’s scraping activities went beyond social media platforms. Justice Feasby ruled that he was entitled to infer from available evidence that images of Albertans were collected from servers located in Canada and in Alberta. He observed that in any event, Clearview marketed its services to police in Alberta, and its voluntary decision to cease offering those services did not alter the fact that it had been doing business in Alberta and could do so again. Further, the information at issue in the order was personal information of Albertans. All of this gave rise to a real and substantial connection with Alberta.

Publicly Available Information

The federal Personal Information Protection and Electronic Documents Act (PIPEDA) contains an exception to the consent requirement for “publicly available information”. The meaning of this term is defined in the Regulations Specifying Publicly Available Information. The relevant category is found in s. 1(e) which specifies “personal information that appears in a publication, including a magazine, book or newspaper, in printed or electronic form, that is available to the public, where the individual has provided the information.” Alberta’s PIPA contains a similar exception (as does BC’s law), although the wording is slightly different. Section 7(e) of the Alberta regulations creates an exception to consent where:

(e) the personal information is contained in a publication, including, but not limited to, a magazine, book or newspaper, whether in printed or electronic form, but only if

(i) the publication is available to the public, and

(ii) it is reasonable to assume that the individual that the information is about provided that information; [My emphasis]

In their joint report of findings, the Commissioners found that their respective “publicly available information” exceptions did not include social media platforms.

Clearview AI made much of the wording of Alberta’s exception, arguing that even if it could be said that the PIPEDA language excluded social media platforms, the use of the words “including but not limited to” in the Alberta regulation made it clear that the list was not closed, nor was it limited to the types of publications referenced.

In interpreting the exceptions for publicly available information, the Commissioners emphasized the quasi-constitutional nature of privacy legislation. They found that the privacy rights should receive a broad and expansive interpretation and the exceptions to those rights should be interpreted narrowly. The commissioners also found significant differences between social media platforms and the more conventional types of publications referenced in their respective regulations, making it inappropriate to broaden the exception. Justice Feasby, applying reasonableness as the appropriate standard of review, found that the Alberta Commissioner’s interpretation of the exception was reasonable.

Freedom of Expression

Had the court’s decision ended there, the outcome would have been much the same as the result in the BC Supreme Court. However, in this case, Clearview AI also challenged the constitutionality of the regulations. It sought a declaration that if the exception were interpreted as limited to books, magazines and comparable publications, then this violated its freedom of expression under s. 2(b) of the Canadian Charter of Rights and Freedoms.

Clearview AI argued that its scraping of the internet for the commercial purpose of providing information services to its clients was expressive and was therefore protected speech. Justice Feasby noted that Clearview’s collection of internet-based information was bot-driven and not engaged in by humans. Nevertheless, he found that “scraping the internet with a bot to gather images and information may be protected by s. 2(b) when it is part of a process that leads to the conveyance of meaning” (at para 104).

Interestingly, Justice Feasby noted that since Clearview no longer offered its services in Canada, any expressive activities took place outside of Canada, and thus were arguably not protected by the Charter. However, he acknowledged that the services had at one point been offered in Canada and could be again. He observed that “until Clearview removes itself permanently from Alberta, I must find that its expression in Alberta is restricted by PIPA and the PIPA Regulation” (at para 106).

Having found a prima facie breach of s. 2(b), Justice Feasby considered whether this was a reasonable limit demonstrably justified in a free and democratic society, under s. 1 of the Charter. The Commissioner argued that the expression at issue in this case was commercial in nature and thus of lesser value. Justice Feasby was not persuaded by category-based assumptions of value; rather, he preferred an approach in which the regulation of commercial expression is consistent with and proportionate to its character.

Justice Feasby found that the Commissioner’s reasonable interpretation of the exception in s. 7 of the regulations meant that it would exclude social media platforms or “other kinds of internet websites where images and personal information may be found” (at para 118). He noted that this is a source-based exception – in other words that some publicly available information may be used without knowledge or consent, but not other similar information. The exclusion depends on the source and not the purpose of use for the personal information. Justice Feasby expressed concern that the same exception that would exclude the scraping of images from the internet for the creation of a facial recognition database would also apply to the activities of search engines widely used by individuals to gain access to information on the internet. He thus found that the publicly available information exception was overbroad, stating: “Without a reasonable exception to the consent requirement for personal information made publicly available on the internet without use of privacy settings, internet search service providers are subject to a mandatory consent requirement when they collect, use and disclose such personal information by indexing and delivering search results” (at para 138). He stated: “I take judicial notice of the fact that search engines like Google are an important (and perhaps the most important) way individuals access information on the internet” (at para 144).

Justice Feasby also noted that while it was important to give individuals some level of control over their personal information, “it must also be recognized that some individuals make conscious choices to make their images and information discoverable by search engines and that they have the tools in the form of privacy settings to prevent the collection, use, and disclosure of their personal information” (at para 143). His constitutional remedy – to strike the words “including, but not limited to magazines, books, and newspapers” from the regulation – was designed to allow “the word ‘publication’ to take its ordinary meaning which I characterize as ‘something that has been intentionally made public’” (at para 149).

The Belt and Suspenders Approach

Although excising part of the publicly available information definition seems like a major victory for Clearview AI, in practical terms it is not. This is because of what the court refers to as the law’s “belt and suspenders approach”. This metaphor suggests that there are two routes to keep up privacy’s pants – and loosening the belt does not remove the suspenders. In this case, the suspenders are located in the clause found in PIPA, as well as in its federal and BC counterparts, that limits the collection, use and disclosure of personal information to only that which “a reasonable person would consider appropriate in the circumstances”. The court ruled that the Commissioner’s conclusion that the scraping of personal information was not for purposes that a reasonable person would consider appropriate in the circumstances was reasonable and should not be overturned. This approach, set out in the joint report of findings, emphasized that the company’s mass data scraping involved over 3 billion images of individuals, including children. It was used to create biometric face prints that would remain in Clearview’s databases even if the source images were removed from the internet, and it was carried out for commercial purposes. The commissioners also found that the purposes were not related to the reasons why individuals might have shared their photographs online, could be used to the detriment of those individuals, and created the potential for a risk of significant harm. Continuing with his analogy to search engines, Justice Feasby noted that Clearview AI’s use of publicly available images was very different from the use of the same images by search engines. The different purposes are essential to the reasonableness determination. 
Justice Feasby states: “The ‘purposes that are reasonable’ analysis is individualized such that a finding that Clearview’s use of personal information is not for reasonable purposes does not apply to other organizations and does not threaten the operations of the internet” (at para 159). He noted that the commercial dimensions of the use are not determinative of reasonableness. However, he observed that “where images and information are posted to social media for the purpose of sharing with family and friends (or prospective friends), the commercialization of such images and information by another party may be a relevant consideration in determining whether the use is reasonable” (at para 160).

The result is that Clearview AI’s scraping of images from the public internet violates Alberta’s PIPA. The court further ruled that the Commissioner’s order was clear and specific, and capable of being implemented. Justice Feasby required Clearview AI to report within 50 days on its good faith progress in taking steps to cease the collection, use and disclosure of images and biometric data collected from individuals in Alberta, and to delete images and biometric data in its database that are from individuals in Alberta.

Harmonized Approaches to Data Protection Law in Canada

This decision highlights some of the challenges to the growing collaboration and cooperation of privacy commissioners in Canada when it comes to interpreting key terms and concepts in substantially similar legislation. Increasingly, the commissioners engage in joint investigations where complaints involve organizations operating in multiple jurisdictions in Canada. While this occurs primarily in the private sector context, it is not exclusively the case, as a recent joint investigation between the BC and Ontario commissioners into a health data breach demonstrates. Joint investigations conserve regulator resources and save private sector organizations from having to respond to multiple similar and concurrent investigations. In addition, joint investigations can lead to harmonized approaches and interpretations of shared concepts in similar legislation. This is a good thing for creating certainty and consistency for those who do business across Canadian jurisdictions.

However, harmonized approaches are vulnerable to multiple judicial review applications, as was the case following the Clearview AI investigation. Although the BC Supreme Court found that the BC Commissioner’s order was reasonable, what the Alberta King’s Bench decision demonstrates is that a common front can be fractured. Justice Feasby found that a slight difference in wording between Alberta’s regulations and those in BC and at the federal level was sufficient to justify finding the scope of Alberta’s publicly available information exception to be unconstitutional.

Harmonized approaches may also be vulnerable to unilateral legislative change. In this respect, it is worth noting that an Alberta report on the impending reform of PIPA recommends “that the Government take all necessary steps, including through proposing amendments to the Personal Information Protection Act, to improve alignment of all provincial privacy legislation, including in the private, public and health sectors” (at p. 13).

The Elephant in the Room: Generative AI and Data Protection Law in Canada

In his reasons, Justice Feasby made Google’s search functions a running comparison for Clearview AI’s data scraping practices. Perhaps a better example would have been the data scraping that takes place in order to train generative AI models. However, the court may have avoided that example because there is an ongoing investigation by the Alberta, Quebec, BC and federal commissioners into OpenAI’s practices. The findings in that investigation are overdue – perhaps the delay has, at least in part, been caused by anticipation of what might happen with the Alberta Clearview AI judicial review. The Alberta decision will likely present a conundrum for the commissioners.

Reading between the lines of Justice Feasby’s decision, it is entirely possible that he would find that the scraping of the public internet to gather training data for generative AI systems would both fall within the exception for publicly available information and be for a purpose that a reasonable person would consider appropriate in the circumstances. Generative AI tools are now widely used – more widely even than search engines since these tools are now also embedded in search engines themselves. To find that personal information that may be indiscriminately found on the internet cannot be collected and used in this way because consent is required is fundamentally impractical. In the EU, the legitimate interest exception in the GDPR provides latitude for use in this way without consent, and recent guidance from the European Data Protection Supervisor suggests that legitimate interests, combined where appropriate with Data Protection Impact Assessments, may address key data protection issues.

In this sense, the approach taken by Justice Feasby seems to carve a path for data protection in a GenAI era in Canada by allowing data scraping of publicly available sources on the Internet in principle, subject to the limit that any such collection or any ensuing use or disclosure of the personal information must be for purposes that a reasonable person would consider appropriate in the circumstances. However, this is not a perfect solution. In the first place, unlike the EU approach, which ensures that other privacy protective measures (such as privacy impact assessments) govern this kind of mass collection, Canadian law remains outdated and inadequate. Further, the publicly available information exceptions – including Alberta’s even after its constitutional nip and tuck – also emphasize that, to use the language of Alberta’s PIPA regulations, it must be “reasonable to assume that the individual that the information is about provided that information”. In fact, there will be many circumstances in which individuals have not provided the information posted online about them. This is the case with photos from parties, family events and other social interactions. Further, social media – and the internet as a whole – is full of non-consensual images, gossip, anecdotes and accusations.

The solution crafted by the Alberta Court of King’s Bench is therefore only a partial solution. A legitimate interest exception would likely serve much better in these circumstances, particularly if it is combined with broader governance obligations to ensure that privacy is adequately considered and assessed. Of course, before this happens, the federal government’s privacy reform measures in Bill C-27 must be resuscitated in some form or another.

Published in Privacy

Monday, 17 March 2025 06:16

Clearview AI and Compliance with Canadian Privacy Laws

The Clearview AI saga has a new Canadian instalment. In December 2024, the British Columbia Supreme Court rendered a decision on Clearview AI’s application for judicial review of an order issued by the BC Privacy Commissioner. This post explores that decision and some of its implications. The first part sets the context, the next discusses the judicial review decision, and part three looks at the ramifications for Canadian privacy law of the larger (and ongoing) legal battle.

Context

Late in 2021, the Privacy Commissioners of BC, Alberta, Quebec and Canada issued a joint report on their investigation into Clearview AI (My post on this order is here). Clearview AI, a US-based company, had created a massive facial recognition technology (FRT) database from images scraped from the internet that it marketed to law enforcement agencies around the world. The investigation was launched after a story broke in the New York Times about Clearview’s activities. Although Canadian police services initially denied using Clearview AI, the RCMP later admitted that it had purchased two licences. Other Canadian police services made use of promotional free accounts.

The joint investigation found that Clearview AI had breached the private sector data protection laws of the four investigating jurisdictions by collecting and using sensitive personal information without consent and by doing so for purposes that a reasonable person would not consider appropriate in the circumstances. The practices also violated Quebec’s Act to establish a legal framework for information technology. Clearview AI disagreed with these conclusions. It indicated that it would temporarily cease its operations in Canada but maintained that it was entitled to scrape content from the public web. After failing to respond to the recommendations in the joint report, the Commissioners of Quebec, BC and Alberta issued orders against the company. These orders required Clearview AI to cease offering its services in their jurisdictions, to make best efforts to stop collecting the personal information of those within their respective provincial boundaries, and to delete personal information in its databases that had been improperly collected from those within their boundaries. No order issued from the federal Commissioner, who does not have order making powers under the Personal Information Protection and Electronic Documents Act (PIPEDA). He could have applied to the Federal Court for an order but chose not to do so (more on that in Part 3 of this post).

Clearview AI declined to comply with the provincial orders, other than to note that it had already temporarily ceased operations in Canada. It then applied for judicial review of the orders in each of the three provinces.

To date, only the challenge to the BC Order has been heard and decided. In the BC application, Clearview argued that the Commissioner’s decision was unreasonable. Specifically, it argued that BC’s Personal Information Protection Act (PIPA) did not apply to Clearview AI, that the information it scraped was exempt from consent requirements because it was “publicly available information”, and that the Commissioner’s interpretation of purposes that a reasonable person would consider appropriate in the circumstances was unreasonable and failed to consider Charter values. In his December 2024 decision, Justice Shergill of the BC Supreme Court disagreed, upholding the Commissioner’s order.

The BC Supreme Court Decision on Judicial Review

Justice Shergill confirmed that BC’s PIPA applies to Clearview AI’s activities, notwithstanding the fact that Clearview AI is a US-based company. He noted that applying the ‘real and substantial connection’ test – which considers the nature and extent of connections between a party’s activities and the jurisdiction in which proceedings are initiated – leads to that conclusion. There was evidence that Clearview AI’s database had been marketed to and used by police services in BC, as well as by the RCMP which polices many parts of the province. Further, Justice Shergill noted that Clearview’s data scraping practices were carried out worldwide and captured data about BC individuals including, in all likelihood, data from websites hosted in BC. Interestingly, he also found that Clearview’s scraping of images from social media sites such as Facebook, YouTube and Instagram created a sufficient connection, as these sites “undoubtedly have hundreds of thousands if not millions of users in British Columbia” (at para 91). In reaching his conclusion, Justice Shergill emphasized “the important role that privacy plays in the preservation of our societal values, the ‘quasi-constitutional’ status afforded to privacy legislation, and the increasing significance of privacy laws as technology advances” (at para 95). He also found that there was nothing unfair about applying BC’s PIPA to Clearview AI, as the company “chose to enter British Columbia and market its product to local law enforcement agencies. It also chooses to scrape data from the Internet which involves personal information of people in British Columbia” (at para 107).

Sections 12(1)(e), 15(1)(e) and 18(1)(e) of PIPA provide exceptions to the requirement of knowledge and consent for the collection, use and disclosure of personal information where “the personal information is available to the public” as set out in regulations. The PIPA Regulations include “printed or electronic publications, including a magazine, book, or newspaper in printed or electronic form.” Similar exceptions are found in the federal PIPEDA and in Alberta’s Personal Information Protection Act. Clearview AI had argued that public internet websites, including social media sites, fell within the category of electronic publications and their scraping was thus exempt from consent requirements. The commissioners disagreed, and Clearview AI challenged this interpretation as unreasonable.

Justice Shergill found that the Commissioners’ conclusion that social media websites fell outside the exception for publicly available information was reasonable. The BC Commissioner was entitled to read the list in the PIPA Regulations as a “narrow set of sources” (at para 160). Justice Shergill reviewed the reasoning in the joint report for why social media sites should be treated differently from other types of publications mentioned in the exception. These include the fact that social media sites are dynamic and not static and that individuals exercise a different level of control over their personal information on social media platforms than on news or other such sites. Although the legislation may require a balancing of privacy rights with private sector interests, Justice Shergill found that it was reasonable for the Commissioner to conclude that privacy rights should be given precedence over commercial interests in the overall context of the legislation. Referencing the Supreme Court of Canada’s decision in Lavigne, Justice Shergill noted that “it is the protection of individual privacy that supports the quasi-constitutional status of privacy legislation, not the right of the organization to collect and use personal information” (at para 174). An individual’s ability to control what happens to their personal information is fundamental to the autonomy and dignity protected by privacy rights and “it is thus reasonable to conclude that any exception to these important rights should be interpreted narrowly” (at para 175).

Clearview AI argued that posting photos to social media sites reflected an individual’s autonomous choice to surrender the information to the public domain. Justice Shergill preferred the Commissioner’s interpretation, which considered the sensitivity of the biometric information, and the impact its collection and use could have on individuals. He referenced the Supreme Court of Canada’s decision in R. v. Bykovets (my post on this case is here), which emphasized that “individuals ‘may choose to divulge certain information for a limited purpose, or to a limited class of persons, and nonetheless retain a reasonable expectation of privacy’” (at para 162, citing para 46 of Bykovets).

Clearview AI also argued that the Commissioner was unreasonable in not taking into account Charter values in his interpretation of PIPA. In particular, the company was of the view that the freedom of expression, which guarantees the right both to communicate and to receive information, extended to the ability to access and use publicly available information without restriction. Although Justice Shergill found that the Commissioner could have been more direct in his consideration of Charter values, his decision was still not unreasonable on this point. The Commissioner did not engage with the Charter values issues at length because he did not consider the law to be ambiguous – Charter values-based interpretation comes into play in helping to resolve ambiguities in the law. As Justice Shergill noted, “It is difficult to understand how Clearview’s s. 2(b) Charter rights are infringed through an interpretation of ‘publicly available’ which excludes it from collecting personal information from social media websites without consent” (at para 197).

Like its counterpart legislation in Alberta and at the federal level, BC’s PIPA contains a section that articulates the overarching principle, that any collection, use or disclosure of personal information must be for purposes that a reasonable person would consider appropriate in the circumstances. This means, among other things, that even if the exception to consent had applied in this case, the collection and use of the scraped personal information would still have had to have been for a reasonable purpose.

The Commissioners had found that overall, Clearview’s scraping of vast quantities of sensitive personal information from the internet to build a massive facial recognition database was not for a purpose that a reasonable person would find appropriate in the circumstances. Clearview AI preferred to characterize its purpose as providing a service to the benefit of law enforcement and national security. In their joint report, the Commissioners had rejected this characterization, noting that it did not justify the massive, widespread scraping of personal information by a private sector company. Further, the Commissioners had noted that such an activity could have negative consequences for individuals, including cybersecurity risks and risks that errors could lead to reputational harm. They also observed that the activity contributed to “broad-based harm inflicted on all members of society, who find themselves under continual mass surveillance by Clearview based on its indiscriminate scraping and processing of their facial images” (at para 253). Justice Shergill found that the record supported these conclusions, and that the Commissioners’ interpretation of reasonable purposes was reasonable.

Clearview AI also argued that the Commissioner’s Order was “unnecessary, unenforceable or overbroad”, and should thus be quashed (at para 258). Justice Shergill accepted the Commissioner’s argument that the order was necessary because Clearview had only temporarily suspended its services in Canada, leaving open the possibility that it would offer its services to Canadian law enforcement agencies in the future. He also accepted the Commissioner’s argument that compliance with the order was possible, noting that Clearview had accepted certain steps for ceasing collection and removing images in its settlement of an Illinois class action lawsuit. The order required the company to use “best efforts”, in an implicit acknowledgement that a perfect solution was likely impossible. Clearview argued that a “best efforts” standard was too vague to be enforceable; Justice Shergill disagreed, noting that courts often used “best efforts” language. Further, and quite interestingly, Justice Shergill noted that “if it is indeed impossible for Clearview to sufficiently identify personal information sourced from people in British Columbia, then this is a situation of Clearview’s own making” (at para 279). He noted that “[i]t is not an answer for Clearview to say that because the data was indiscriminately collected, any order requiring it to cease collecting data of persons present in a particular jurisdiction is unenforceable” (at para 279).

Implications

This is a significant decision as it upholds interpretations of important provisions of BC PIPA. These provisions are similar to ones in Alberta’s PIPA and in the federal PIPEDA. However, it is far from the end of the Clearview AI saga, and there is much to continue to watch.

In the first place, the BC Supreme Court decision is already under appeal to the BC Court of Appeal. If the Court of Appeal upholds this decision, it will be a major victory for the BC Commissioner. Yet, either way, there is likely to be a further application for leave to appeal to the Supreme Court of Canada. It may be years before the issue is finally resolved. In this time, data protection laws in BC, Alberta and at the federal level might well be reformed. It will therefore also be important to examine any new bills to see whether the provisions at issue in this case are addressed in any way or left as is.

In the meantime, Clearview AI has also filed for judicial review of the orders of the Quebec and Alberta commissioners, and these applications are moving forward. All three orders (BC, Alberta and Quebec) are based on the same joint findings. A decision by either or both of the Quebec and Alberta superior courts that the orders are unreasonable could strike a significant blow to the united front that Canada’s commissioners are increasingly showing on privacy issues that affect all Canadians. There is therefore a great deal riding on the outcomes of these applications. In any event, regardless of the outcomes, expect applications for leave to appeal to the Supreme Court of Canada. Leave to appeal is less likely to be granted if all three provincial courts of appeal take a similar approach to the issues. It is at this point impossible to predict how this litigation will play out.

It is notable that the Privacy Commissioner of Canada, who has no order making powers under PIPEDA but who can apply to Federal Court for an order, declined to do so. Under PIPEDA, such an application requires a hearing de novo by the Federal Court – this means that unlike the courts in the provincial judicial review proceedings, the Federal Court need not show any deference to the federal Commissioner’s findings. Instead, the Court would proceed to a determination of the issues after hearing and considering the parties’ evidence and argument. One might wonder whether the rather bruising decision of the Federal Court in Privacy Commissioner v. Facebook (which was subsequently overturned by the Federal Court of Appeal) might have influenced the Commissioner not to roll the dice by seeking an order with so much at stake. That a hearing de novo before the Federal Court could upset the apple cart of the Commissioners’ attempts to co-ordinate efforts, reduce duplication and harmonize interpretation is sobering. Yet, it also means that if this litigation saga ends with the conclusion that the orders are reasonable and enforceable, BC, Alberta and Quebec residents will have received results in the form of orders requiring Clearview to delete images and to geo-fence any future collection of images to protect those within those provinces (which will still need to be made enforceable in the US) – while Canadians elsewhere in the country will not. Canadians will need long-promised but as yet undelivered reform of PIPEDA to address the ability of the federal Commissioner to issue orders – ones that will be subject to judicial review with appropriate deference, rather than second-guessed by the Personal Information and Data Protection Tribunal proposed in Bill C-27.

Concluding thoughts

Despite rulings from privacy and data protection commissioners around the world that Clearview AI is in breach of their respective laws, and notwithstanding two class action lawsuits in the US under the Illinois Biometric Information Privacy Act, the company has continued to grow its massive FRT database. At the time of the Canadian investigation, the database was said to hold 3 billion images. Current reports place this number at over 50 billion. Considering the resistance of the company to compliance with Canadian law, this raises the question of what it will take to motivate compliance by resistant organizations. As the proposed amendments to Canada’s federal private sector privacy laws wither on the vine after neglect and mismanagement in their journey through Parliament, this becomes a pressing and important question.

Published in Privacy

Thursday, 23 December 2021 13:05

Provinces Issue Orders Requiring Clearview AI to Comply with Data Protection Laws - But Then What?

On December 7, 2021, the privacy commissioners of Quebec, British Columbia and Alberta issued orders against the US-based company Clearview AI, following its refusal to voluntarily comply with the findings in the joint investigation report they issued along with the federal privacy commissioner on February 3, 2021.

Clearview AI gained worldwide attention in early 2020 when a New York Times article revealed that its services had been offered to law enforcement agencies for use in a largely non-transparent manner in many countries around the world. Clearview AI’s technology also has the potential for many different applications including in the private sector. It built its massive database of over 10 billion images by scraping photographs from publicly accessible websites across the Internet, and deriving biometric identifiers from the images. Users of its services upload a photograph of a person. The service then analyzes that image and compares it with the stored biometric identifiers. Where there is a match, the user is provided with all matching images and their metadata, including links to the sources of each image.

Clearview AI has been the target of investigation by data protection authorities around the world. France’s Commission Nationale de l'Informatique et des Libertés has found that Clearview AI breached the General Data Protection Regulation (GDPR). Australia and the UK conducted a joint investigation which similarly found the company to be in violation of their respective data protection laws. The UK commissioner has since issued a provisional view, stating its intent to levy a substantial fine. Legal proceedings are currently underway in Illinois, a state which has adopted biometric privacy legislation. Canada’s joint investigation report issued by the federal, Quebec, B.C. and Alberta commissioners found that Clearview AI had breached the federal Personal Information Protection and Electronic Documents Act, as well as the private sector data protection laws of each of the named provinces.

The Canadian joint investigation set out a series of recommendations for Clearview AI. Specifically, it recommended that Clearview AI cease offering its facial recognition services in Canada, “cease the collection, use and disclosure of images and biometric facial arrays collected from individuals in Canada”, and delete any such data in its possession. Clearview AI responded by saying that it had temporarily ceased providing its services in Canada, and that it was willing to continue to do so for a further 18 months. It also indicated that if it offered services in Canada again, it would require its clients to adopt a policy regarding facial recognition technology, and it would offer an audit trail of searches.

On the second and third recommendations, Clearview AI responded that it was simply not possible to determine which photos in its database were of individuals in Canada. It also reiterated its view that images found on the Internet are publicly available and free for use in this manner. It concluded that it had “already gone beyond its obligations”, and that while it was “willing to make some accommodations and met some of the requests of the Privacy Commissioners, it cannot commit itself to anything that is impossible and or [sic] required by law.” (Letter reproduced at para 3 of Order P21-08).

In this post I consider three main issues that flow from the orders issued by the provincial commissioners. The first relates to the cross-border reach of Canadian law. The second relates to enforcement (or lack thereof) in the Canadian context, particularly as compared with what is available in other jurisdictions such as the UK and the EU. The third issue relates to the interest shown by the commissioners in a compromise volunteered by Clearview AI in the ongoing Illinois litigation – and what this might mean for Canadians’ privacy.

1. Jurisdiction

Clearview AI maintains that Canadian laws do not apply to it. It argues that it is a US-based company with no physical presence in Canada. Although it initially provided its services to Canadian law enforcement agencies (see this CBC article for details of the use of Clearview by Toronto Police Services), it had since ceased to do so – thus, it no longer had clients in Canada. It scraped its data from platform companies such as Facebook and Instagram, and while many Canadians have accounts with such companies, Clearview’s scraping activities involved access to data hosted on platforms outside of Canada. It therefore argued that it not only did not operate in Canada, it had no ‘real and substantial’ connection to Canada.

The BC Commissioner did not directly address this issue. In his Order, he finds a hook for jurisdiction by referring to the personal data as having been “collected from individuals in British Columbia without their consent”, although it is clear there is no direct collection. He also notes Clearview’s active contemplation of resuming its services in Canada. Alberta’s Commissioner makes a brief reference to jurisdiction, simply stating that “Provincial privacy legislation applies to any private sector organization that collects, uses and discloses information of individuals within that province” (at para 12). The Quebec Commissioner, by contrast, gives a thorough discussion of the jurisdictional issues. In the first place, she notes that some of the images came from public Quebec sources (e.g., newspaper websites). She also observes that nothing indicates that images scraped from Quebec sources have been removed from the database; they therefore continue to be used and disclosed by the company.

Commissioner Poitras cited the Federal Court decision in Lawson for the principle that PIPEDA could apply to a US-based company that collected personal information from Canadian sources – so long as there is a real and substantial connection to Canada. She found a connection to Quebec in the free accounts offered to, and used by, Quebec law enforcement officials. She noted that the RCMP, which operates in Quebec, had also been a paying client of Clearview’s. When Clearview AI was used by clients in Quebec, those clients uploaded photographs to the service in the search for a match. This also constituted a collection of personal information by Clearview AI in Quebec.

Commissioner Poitras found that the location of Clearview’s business and its servers is not a determinative jurisdictional factor for a company that offers its services online around the world, and that collects personal data from the Internet globally. She found that Clearview AI’s database was at the core of its services, and a part of that database was comprised of data from Quebec and about Quebeckers. Clearview had offered its service in Quebec, and its activities had a real impact on the privacy of Quebeckers. Commissioner Poitras noted that millions of images of Quebeckers were appropriated by Clearview without the consent of the individuals in the images; these images were used to build a global biometric facial recognition database. She found that it was particularly important not to create a situation where individuals are denied recourse under quasi-constitutional laws such as data protection laws. These elements in combination, in her view, would suffice to create a real and substantial connection.

Commissioner Poitras did not accept that Clearview’s suspension of Canadian activities changed the situation. She noted that information that had been collected in Quebec remained in the database, which continued to be used by the company. She stated that a company could not appropriate the personal information of a substantial number of Quebeckers, commercialise this information, and then avoid the application of the law by saying they no longer offered services in Quebec.

The jurisdictional questions are both important and thorny. This case is different from cases such as Lawson and Globe24hrs, where the connections with Canada were more straightforward. In Lawson, there was clear evidence that the company offered its services to clients in Canada. It also directly obtained some of its data about Canadians from Canadian sources. In Globe24hrs, there was likewise evidence that Canadians were being charged by the Romanian company to have their personal data removed from the database. In addition, the data came from Canadian court decisions that were scraped from websites located in Canada. In Clearview AI, while some of the scraped data may have been hosted on servers located in Canada, most was scraped from offshore social media platform servers. If Clearview AI stopped offering its services in Canada and stopped scraping data from servers located in Canada, what recourse would Canadians have? The Quebec Commissioner attempts to address this question, but her reasons are based on factual connections that might not be present in the future, or in cases involving other data-scraping respondents. There needs to be a theory of real and substantial connection that specifically addresses the scraping of data from third-party websites, contrary to those websites’ terms of use and contrary to the legal expectations of the sites’ users, and that can anchor the jurisdiction of Canadian law even when the scraper has no other connection to Canada.

Canada is not alone with these jurisdictional issues – Australia’s orders to Clearview AI are currently under appeal, and the jurisdiction of the Australian Commissioner to make such orders will be one of the issues on appeal. A jurisdictional case – one that is convincing not just to privacy commissioners but to the foreign courts that may have to one day determine whether to enforce Canadian decisions – needs to be made.

2. Enforcement

At the time the facts of the Clearview AI investigation arose, all four commissioners had limited enforcement powers. The three provincial commissioners could issue orders requiring an organization to change its practices. The federal commissioner has no order-making powers, but can apply to Federal Court to ask that court to issue orders. The relative impotence of the commissioners is illustrated by Clearview’s hubristic response, cited above, that indicates that it had already “gone beyond its obligations”. Clearly, it considered that nothing the commissioners had to say on the matter amounted to an obligation.

The Canadian situation can be contrasted with that in the EU, where commissioners’ orders requiring organizations to change their non-compliant practices are now reinforced by the power to levy significant administrative monetary penalties (AMPs). The same situation exists in the UK. There, the data commissioner has just issued a preliminary enforcement notice and a proposed fine of £17M against Clearview AI. As noted earlier, the enforcement situation is beginning to change in Canada – Quebec’s newly amended legislation permits the levying of substantial AMPs. When some version of Bill C-11 is reintroduced in Parliament in 2022, it will likely also contain the power to levy AMPs. BC and Alberta may eventually follow suit. When this happens, the challenge will be first, to harmonize enforcement approaches across those jurisdictions; and second, to ensure that these penalties can meaningfully be enforced against offshore companies such as Clearview AI.

On the enforcement issue, it is perhaps also worth noting that the orders issued by the three Commissioners in this case are all slightly different. The Quebec Commissioner orders Clearview AI to cease collecting images of Quebeckers without consent, and to cease using these images to create biometric identifiers. She also orders the destruction, within 90 days of receipt of the order, of all of the images collected without the consent of Quebeckers, as well as the destruction of the biometric identifiers. Alberta’s Commissioner orders that Clearview cease offering its services to clients in Alberta, cease the collection and use of images and biometrics collected from individuals in Alberta, and delete the same from its databases. BC’s order prohibits the offering of Clearview AI’s services using data collected from British Columbians without their consent to clients in British Columbia. The BC Commissioner also orders that Clearview AI use “best efforts” to cease its collection, use and disclosure of images and biometric identifiers of British Columbians without their consent, as well as to use the same “best efforts” to delete images and biometric identifiers collected without consent.

It is to these “best efforts” that I next turn.

3. The Illinois Compromise

All three Commissioners make reference to a compromise offered by Clearview AI in the course of ongoing litigation in Illinois under Illinois’ Biometric Information Privacy Act. By referring to “best efforts” in his Order, the BC Commissioner seems to be suggesting that something along these lines would be an acceptable compromise in his jurisdiction.

In its response to the Canadian commissioners, Clearview AI raised the issue that it cannot easily know which photographs in its database are of residents of particular provinces, particularly since these are scraped from the Internet as a whole – and often from social media platforms hosted outside Canada.

Yet Clearview AI has indicated that it has changed some of its business practices to avoid infringing Illinois law. This includes “cancelling all accounts belonging to any entity based in Illinois” (para 12, BC Order). It also includes blocking from any searches all images in the Clearview database that are geolocated in Illinois. In the future, it also offers to create a “geofence” around Illinois. This means that it “will not collect facial vectors from any scraped images that contain metadata associating them with Illinois” (para 12 BC Order). It will also “not collect facial vectors from images stored on servers that are displaying Illinois IP addresses or websites with URLs containing keywords such as “Chicago” or “Illinois”.” Clearview apparently offers to create an “opt-out” mechanism whereby people can ask to have their photos excluded from the database. Finally, it will require its clients to not upload photos of Illinois residents. If such a photo is uploaded, and it contains Illinois-related metadata, no search will be performed.

The central problem with accepting the ‘Illinois compromise’ is that it allows a service built on illegally scraped data to continue operating with only a reduced privacy impact. Ironically, it also requires individuals who wish to benefit from this compromise to provide more personal data in their online postings. Many people actually suppress geolocation information from their photographs to protect their privacy, yet the ‘Illinois compromise’ can only exclude photos that contain geolocation data. Even with geolocation turned on, it would not exclude the vacation pics of any BC residents taken outside of BC (for example). Further, limiting scraping of images from Illinois-based sites will not prevent the photos of Illinois-based individuals from being included within the database a) if they are already in there, and b) if the images are posted on social media platforms hosted elsewhere.

Clearview AI is a business built upon data collection practices that are illegal in a large number of countries outside the US. The BC Commissioner is clearly of the opinion that a compromise solution is the best that can be hoped for, and he may be right in the circumstances. Yet it is a bitter pill to think that such flouting of privacy laws will ultimately be rewarded, as Clearview gets to keep and commercialize its facial recognition database. Accepting such a compromise could limit the harms of the improper exploitation of personal data, but it does not stop the exploitation of that data in all circumstances. And even this unhappy compromise may be out of reach for Canadians given the rather toothless nature of our current laws – and the jurisdictional challenges discussed earlier.

If anything, this situation cries out for global and harmonized solutions. Notably it requires the US to do much more to bring its wild-west approach to personal data exploitation in line with the approaches of its allies and trading partners. It also will require better cooperation on enforcement across borders. It may also call for social media giants to take more responsibility when it comes to companies that flout their terms and conditions to scrape their sites for personal data. The Clearview AI situation highlights these issues – as well as the dramatic impacts data misuse may have on privacy as personal data continues to be exploited for use in powerful AI technologies.

Published in Privacy

Thursday, 04 February 2021 10:06

How Might Bill C-11 Affect the Outcome of a Clearview AI-type Complaint?

A joint ruling from the federal Privacy Commissioner and his provincial counterparts in Quebec, B.C., and Alberta has found that U.S.-based company Clearview AI breached Canadian data protection laws when it scraped photographs from social media websites to create the database it used to support its facial recognition technology. According to the report, the database contained the biometric data of “a vast number of individuals in Canada, including children.” Investigations of complaints under public sector data protection laws about police use of Clearview AI’s services are still ongoing.

The Commissioners’ findings are unequivocal. The information collected by Clearview AI is sensitive biometric data. Express consent was required for its collection and use, and Clearview AI did not obtain consent. The company’s argument that consent was not required because the information was publicly available was firmly rejected. The Commissioners described Clearview AI’s actions as constituting “the mass identification and surveillance of individuals by a private entity in the course of commercial activity.” (at para 72) In defending itself, Clearview AI put forward arguments that were clearly at odds with Canadian law. It also resisted the jurisdiction of the Canadian Commissioners, notwithstanding the fact that it collected the personal data of Canadians and offered its commercial services to Canadian law enforcement agencies. Clearview AI did not accept the Commissioners’ findings, and “has not committed to following” the recommendations.

At the time of this report, Bill C-11, a bill to reform Canada’s current data protection law, is before Parliament. The goal of this post is to consider what difference Bill C-11 might make to the outcome of complaints like this one should it be passed into law. I consider both the substantive provisions of the bill and its new enforcement regime.

Consent

As in the current Personal Information Protection and Electronic Documents Act (PIPEDA), consent is a core requirement of Bill C-11. To collect, use or disclose personal information, an organization must either obtain valid consent, or its activities must fall into one of the exceptions to consent. In the Clearview AI case, there was no consent, and the disputed PIPEDA exception to the consent requirement was the one for ‘publicly available personal information’. While this exception seems broad on its face, to qualify, the information must fall within the parameters set out in the Regulations Specifying Publicly Available Personal Information. These regulations focus on certain categories of publicly available information – such as registry information (land titles, for example), court registries and decisions, published telephone directory information, and public business information listings. In most cases, the regulations provide that the use of the information must also relate directly to the purposes for which it was made public. The regulations also contain an exception for “personal information that appears in a publication, including a magazine, book or newspaper, in printed or electronic form, that is available to the public, where the individual has provided the information.” The interpretation of this provision was central to Clearview AI’s defense of its practices. It argued that social media postings were “personal information that appears in a publication.”

The Commissioners adopted a narrow interpretation consistent with this being an exception in quasi-constitutional legislation. They distinguished between the types of publications mentioned in the exception and uncurated, dynamic social-media sites. The Commissioners noted that unlike newspapers or magazines, individuals retain a degree of control over the content of their social media sites. They also observed that to find that all information on the internet falls within the publicly available information exception “would create an extremely broad exemption that undermines the control users may otherwise maintain over their information at the source.” (at para 65) Finally, the Commissioners observed that the exception applied to information provided by the data subject, but that photographs were scraped by Clearview AI regardless of whether they were posted by the data subject or by someone else.

Would the result be any different under Bill C-11? In section 51, Bill C-11 replicates the “publicly available information exception” for collection, use or disclosure of personal information. Like PIPEDA, it also leaves the definition of this term to regulations. However, Canadians should be aware that there has been considerable pressure to expand the regulations so that personal information shared on social media sites is exempted from the consent requirement. For example, in past hearings into PIPEDA reform, the House of Commons ETHI Committee at one point appeared swayed by industry arguments that PIPEDA should be amended to include websites and social media within this exception. Bill C-11 does not resolve this issue; but if passed, it might well be on the table in the drafting of regulations. If nothing else, the Clearview AI case provides a stark illustration of just how important this issue is to the privacy of Canadians.

However, data scrapers may be able to look elsewhere in Bill C-11 for an exception to consent. Bill C-11 contains new exceptions to consent for “business operations” which I have criticized here. One of these exceptions would almost certainly be relied upon by a company in Clearview AI’s position if the bill were passed. The exceptions allow for the collection and use of personal information without an individual’s knowledge or consent if, among other things, it is for “an activity in the course of which obtaining the individual’s consent would be impracticable because the organization does not have a direct relationship with the individual.” (18(2)(e)). A company that scrapes data from social media sites to create a facial recognition database would find it impracticable to get consent because it has no direct relationship with any of the affected individuals. The exception seems to fit.

That said, s. 18(1) does set some general guardrails. The one that seems relevant in this case is that the exceptions to consent are only available where “a reasonable person would expect such a collection or use for that activity”. Hopefully, collection of images from social media websites to fuel facial recognition technology would not be something that a reasonable person would expect; certainly, the Commissioners would not find it to be so. In addition, section 12 of Bill C-11 requires that information be collected or used “only for purposes that a reasonable person would consider appropriate in the circumstances” (a requirement carried over from PIPEDA, s. 5(3)). In their findings, the Commissioners ruled that the collection and use of images by Clearview AI was for a purpose that a reasonable person would find inappropriate. The same conclusion could be reached under Bill C-11.

There is reason to be cautiously optimistic, then, that Bill C-11 would lead to the same result on a similar set of facts: the conclusion that the wholesale scraping of personal data from social media sites to build a facial recognition database without consent is not permitted. However, the scope of the exception in s. 18(2)(e) is still a matter of concern. The more exceptions that an organization pushing the boundaries feels it can wriggle into, the more likely it will be to engage in privacy-compromising activities. In addition, there may be a range of different uses for scraped data and “what a reasonable person would expect” is a rather squishy buffer between privacy and wholesale data exploitation.

Enforcement

Bill C-11 is meant to substantially increase enforcement options when it comes to privacy. Strong enforcement is particularly important in cases where organizations are not interested in accepting the guidance of regulators. This is certainly the case with Clearview AI, which expressly rejected the Commissioners’ findings. Would Bill C-11 strengthen the regulator’s hand?

The Report of Findings in this case reflects the growing trend of having the federal and provincial commissioners that oversee private sector data protection laws jointly investigate complaints involving issues that affect individuals across Canada. This cooperation is important as it ensures consistent interpretation of what is meant to be substantially similar legislation across jurisdictions. Nothing in Bill C-11 would prevent the federal Commissioner from continuing to engage in this cross-jurisdictional collaboration – in fact, subsection 116(2) expressly encourages it.

Some will point to the Commissioner’s new order-making powers as another way to strengthen his enforcement hand. The Commissioner can now direct an organization to take measures to comply with the legislation or to cease activities that are in contravention of the legislation (s. 92(2)). This is a good thing. However, these orders are subject to appeal to the new Personal Information and Data Protection Tribunal (the Tribunal). By contrast, orders of the Commissioners of BC and Alberta are final, subject only to judicial review.

In addition, it is not just the orders of the Commissioner that are appealable under C-11, but also his findings. This raises questions about how the new structure under Bill C-11 might affect cooperative inquiries like the one in this case. Conclusions shared with other Commissioners can be appealed by respondents to the Tribunal, which owes no deference to the Commissioner on questions of law. As I and others have already noted, the composition of the Tribunal is somewhat concerning; Bill C-11 would require only a minimum of one member of the Tribunal to have expertise in privacy law. While it is true that proceedings before the Federal Court are de novo, and thus the Commissioner is afforded no formal deference in that context either, access to Federal Court is more limited than the wide-open appeals route to the Tribunal. The Bill C-11 structure really seems to shift the authority to interpret and apply the law away from the Commissioner and to the mysterious and not necessarily expert Tribunal.

Bill C-11 also has a much-touted new power to issue substantial fines for breach of the legislation. Interestingly, however, this does not seem to be the kind of case in which a fine would be available. Fines are available only with respect to the breach of certain obligations under the statute, which are listed in s. 93(1) of Bill C-11. Playing fast and loose with the requirement to obtain consent is not one of them. This is interesting, given the supposedly central place consent plays within the Bill. Further thought might need to be given to the list of ‘fine-able contraventions’.

Overall, then, although C-11 could lead to a very similar result on similar facts, the path to that result may be less certain. It is also not clear that there is anything in the enforcement provisions of the legislation that will add heft to the Commissioner’s findings. In practical terms, the decisions that matter will be those of the Tribunal, and it remains to be seen how well this Tribunal will serve Canadians.

Published in Privacy

Tuesday, 24 September 2019 07:19

Tussle over scraped corporate registry data exposes tensions between transparency and privacy

An interesting case from Quebec demonstrates the tension between privacy and transparency when it comes to public registers that include personal information. It also raises issues around ownership and control of data, including the measures used to prevent data scraping. The way the litigation was framed means that not all of these questions are answered in the decision, leaving some lingering public policy questions.

Quebec’s Enterprise Registrar oversees a registry, in the form of a database, of all businesses in Quebec, including corporations, sole proprietorships and partnerships. The Registrar is empowered to do so under the Act respecting the legal publicity of enterprises (ALPE), which also establishes the database. The Registrar is obliged to make this register publicly accessible, including remotely by technological means, and basic use of the database is free of charge.

The applicant in this case is OpenCorporates, a U.K.-based organization dedicated to ensuring total corporate transparency. According to its website, OpenCorporates has created and maintains “the largest open database of companies in the world”. It currently has data on companies located in over 130 jurisdictions. Most of this data is drawn from reliable public registries. In addition to providing a free, searchable public resource, OpenCorporates also sells structured data to financial institutions, government agencies, journalists and other businesses. The money raised from these sales finances its operations.

OpenCorporates gathers its data using a variety of means. In 2012, it began to scrape data from Quebec’s Enterprise Register. Data scraping involves the use of ‘bots’ to visit and automatically harvest data from targeted web pages. It is a common data-harvesting practice, widely used by journalists, civil society actors and researchers, as well as companies large and small. As common as it may be, it is not always welcome, and there has been litigation in Canada and around the world about the legality of data scraping practices, chiefly in contexts where the defendant is attempting to commercialize data scraped from a business rival.
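To make the mechanics concrete, here is a minimal sketch of what a scraping ‘bot’ does once it has retrieved a page: it reads the page’s markup and pulls out structured records. It is only an illustration and does not describe how OpenCorporates actually gathers its data. The sample page, the field names and the `RegistryParser` class are all hypothetical, and the example parses an inline string rather than visiting any real site.

```python
from html.parser import HTMLParser

# Hypothetical sample page standing in for a public registry listing.
# A real bot would first download pages like this one over HTTP.
SAMPLE_PAGE = """
<table>
  <tr><td class="name">Example Widgets Inc.</td><td class="id">1000001</td></tr>
  <tr><td class="name">Sample Holdings Ltd.</td><td class="id">1000002</td></tr>
</table>
"""


class RegistryParser(HTMLParser):
    """Collects one dict per table row, keyed by each cell's class attribute."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._field = None  # class of the <td> currently being read, if any
        self._row = {}

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = {}
        elif tag == "td":
            self._field = dict(attrs).get("class")

    def handle_data(self, data):
        # Keep text only when it sits inside a labelled cell.
        if self._field and data.strip():
            self._row[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None
        elif tag == "tr" and self._row:
            self.records.append(self._row)
            self._row = {}


def scrape(html):
    """Return the list of records found in the given HTML string."""
    parser = RegistryParser()
    parser.feed(html)
    return parser.records


records = scrape(SAMPLE_PAGE)
for record in records:
    print(record["id"], record["name"])
```

Repeated automatically across thousands of pages, this kind of extraction is what turns a browsable public website into a bulk dataset, which is why site operators often respond with terms of service and technical barriers.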

In 2016 the Registrar changed the terms of service for the Enterprise Register. These changes essentially prohibited web scraping activities, as well as the commercialization of data extracted from the site. The new terms also prohibit certain types of information analyses; for example, they bar searches for data according to the name and address of a particular person. All visitors to the site must agree to the Terms of Service. The Registrar also introduced technological measures to make it more difficult for bots to scrape its data.

Opencorporates Ltd. c. Registraire des entreprises du Québec is not a challenge to the Register’s new, restrictive terms and conditions. Instead, because the Registrar also sent OpenCorporates a cease and desist letter demanding that it stop using the data it had collected prior to the change in Terms of Service, OpenCorporates sought a declaration from the Quebec Superior Court that it was entitled to continue to use this earlier data.

The Registrar acknowledged that nothing in the ALPE authorizes it to control uses made of any data obtained from its site. Further, until it posted the new terms and conditions for the site, nothing limited what users could do with the data. The Registrar argued that it had the right to control the pre-2016 data because of the purpose of the Register. It argued that the ALPE established the Register as the sole source of public data on Quebec businesses, and that the database was designed to protect the personal information that it contained (i.e. the names and addresses of directors of corporations). For example, it does not permit extensive searches by name or address. OpenCorporates, by contrast, permits the searching of all of its data, including by name and address.

The court characterized the purpose of the Register as being to protect individuals and corporations that interact with other corporations by assuring them easy access to identity information, including the names of those persons associated with a corporation. An electronic database gives users the ability to make quick searches, and to do so from a distance. Quebec’s Act to Establish a Legal Framework for Information Technology provides that where a document contains personal information and is made public for particular purposes, any extensive searches of the document must be limited to those purposes. This law places the onus on the person responsible for providing access to the document to put in place appropriate technological protection measures. Under the ALPE, the Registrar can carry out more comprehensive searches of the database on behalf of users who must make their request to the Registrar. Even then, the ALPE prohibits the Registrar from using the name or address of an individual as a basis for a search. According to the Registrar, a member of the public has the right to know, once they have the name of a company, with whom they are dealing; they do not have the right to determine the number of companies to which a physical person is linked. By contrast, this latter type of search is one that could be carried out using the OpenCorporates database.

The court noted that it was not its role to consider the legality of OpenCorporates’ database, nor to consider the use made by others of that database. It also observed that individuals concerned about potential privacy breaches facilitated by OpenCorporates might have recourse under Quebec privacy law. Justice Rogers’ focus was on the specific question of whether the Registrar could prevent OpenCorporates from using the data it gathered prior to the change of terms of service in 2016. On this point, the judge ruled in favour of OpenCorporates. In her view, OpenCorporates’ gathering of this data was not in breach of any law that the Registrar could rely upon (leaving aside any potential privacy claims by individuals whose data was scraped). Further, she found that nothing in the ALPE gave the Registrar a monopoly on the creation and maintenance of a database of corporate data. She observed that the use made by OpenCorporates of the data was not contrary to the purpose of the ALPE, which was to create greater corporate transparency and to protect those who interacted with corporations. She ruled that nothing in the ALPE obligated the Registrar to eliminate all privacy risks. The names and addresses of those involved with corporations are public information; the goal of the legislation is to facilitate digital access to the data while at the same time placing limits on bulk searches. Nothing in the ALPE prevented another organization from creating its own database of Quebec businesses. Since OpenCorporates did not breach any laws or terms of service in collecting the information between 2012 and 2016, nothing prevented it from continuing to use that information in its own databases. Justice Rogers issued a declaration to the effect that the Registrar was not permitted to prevent OpenCorporates from publishing and distributing the data it collected from the Register prior to 2016.

While this was a victory for OpenCorporates, it did not do much more than ensure its right to continue to use data that will become increasingly dated. There is perhaps some value in the Court’s finding that the existence of a public database does not, on its own, preclude the creation of derivative databases. However, the decision leaves some important questions unanswered. In the first place, it alludes to but offers no opinion on the ability to challenge the inclusion of the data in the OpenCorporates database on privacy grounds. While a breach of privacy argument might be difficult to maintain in the case of public data regarding corporate ownership, it is still unpredictable how it might play out in court. This is far less sensitive data than that involved in the scraping of court decisions litigated before the Federal Court in A.T. v. Globe24hr.com; there is a public interest in making the specific personal information available in the Registry; and the use made by OpenCorporates is far less exploitative than in Globe24hr. Nevertheless, the privacy issues remain a latent difficulty. Overall, the decision tells us little about how to strike an appropriate balance between the values of transparency and privacy. The legislation and the Registrar’s approach are designed to make it difficult to track corporate ownership or involvement across multiple corporations. There is rigorous protection of information with low privacy value and with a strong public dimension, with transparency being weakened as a result. It is worth noting that another lawsuit against the Register may be in the works. It is reported that the CBC is challenging the decision of the Registrar to prohibit searches by names of directors and managers of companies as a breach of the right to freedom of expression.

Because the terms of service were not directly at issue in the case, there is also little to go on with respect to the impact of such terms. To what extent can terms of service limit what can be done with publicly accessible data made available over the Internet? The recent U.S. case of hiQ Labs Inc. v. LinkedIn Corp. raises interesting questions about freedom of expression and the right to harvest publicly accessible data. This and other important issues remain unaddressed in what is ultimately an interesting but unsatisfying court decision.

Published in Privacy

Thursday, 30 November 2017 13:14

Data deficits and the regulation of the sharing economy

Last year I attended a terrific workshop at UBC’s Allard School of Law. The workshop was titled ‘Property in the City’, and panelists presented work on a broad range of issues relating to law in the urban environment. A special issue of the UBC Law Review has just been published featuring some of the output of this workshop. The issue contains my own paper (discussed below and available here) that explores skirmishes over access to and use of Airbnb platform data.

Airbnb is a ‘sharing economy’ platform that facilitates the booking of short-term accommodation. The company is premised on the idea that many urban dwellers have excess space – rooms in homes or apartments – or have space they do not use at certain periods of the year (entire homes or apartments while on vacation, for example) – and that a digital marketplace can maximize efficient use of this space by matching those seeking temporary accommodation with those having excess space. The Airbnb web site claims that it “connects people to unique travel experiences at any price point” and at the same time “is the easiest way for people to monetize their extra space and showcase it to an audience of millions.”

This characterization of Airbnb is open to challenge. Several studies, including ones by the Canadian Centre for Policy Alternatives, the City of Vancouver, and the NY State Attorney General, suggest that a significant number of units for rent on Airbnb are offered as part of commercial enterprises. The description also belies Airbnb’s disruptive impact. The re-characterization and commodification of ‘surplus’ private spaces neatly evades the regulatory frameworks designed for the marketing of short-term accommodation and leaves licensed short-term accommodation providers complaining that their highly regulated businesses are being undermined by competition from those not bearing the same regulatory burdens. At the same time, many housing advocates and city officials are concerned about the impact of platforms such as Airbnb on the availability and affordability of long-term housing.

These challenges are made more difficult to address by the fact that the data needed to understand the impact of platform companies, along with data about short-term rentals that would otherwise be captured through regulatory processes, are effectively privatized in the hands of Airbnb. Data deficits of this kind pose a challenge to governments, civil society and researchers.

My paper explores the impact of a company such as Airbnb on cities from the perspective of data. I argue that platform-based, short-term rental activities have a fundamental impact on what data are available to municipal governments who struggle to regulate in the public interest, as well as to civil society groups and researchers that attempt to understand urban housing issues. The impacts of platform companies are therefore not just disruptive of incumbent industries; they disrupt planning and regulatory processes by masking activities and creating data deficits. My paper considers some of the currently available solutions to the data deficits, which range from self-help type recourses such as data scraping to entering into data-sharing agreements with the platform companies. Each of these solutions has its limits and drawbacks. I argue that further action may be required by governments to ensure their data needs are adequately met.

Although this paper focuses on Airbnb, it is worth noting that the data deficits discussed in the paper are merely a part of a larger context in which evolving technologies shift control over some kinds of data from public to private hands. Ensuring the ability of governments and civil society to collect, retain, and share data of a sufficient quality to both enable and to enhance governance, transparency, and accountability should be priorities for municipal governments, and should also be supported by law and policy at provincial and federal levels.

Published in E-Commerce & Internet Law

Monday, 21 August 2017 06:12

Interlocutory decision addresses scraping of publicly accessible data

Skirmishes over the right to freely access and use “publicly available” data hosted by internet platform companies have led to an interesting decision from the U.S. District Court for the Northern District of California. The decision is on a motion for an interlocutory injunction, so it does not decide the merits of the competing claims. Nevertheless, it provides insight into a set of issues that are likely only to increase in importance as these rich troves of data are mined by competitors, opportunistic businesses, big data giants, researchers and civil society actors.

The parties in hiQ Labs Inc. v LinkedIn Corp. are companies whose business models are based upon career-related personal information provided by professionals. LinkedIn offers a professional networking platform to over 500 million users, and it is easily the leading company in its space. hiQ, for its part, is a data analytics company with two main products aimed at enterprises. The first is “Keeper”, a product which informs corporations about which of their employees are at greatest risk of being poached by other companies. The second is “Skill Mapper” which provides businesses with summaries of the skills of their employees. For both of its products hiQ relies on data that it scrapes from LinkedIn’s publicly accessible web pages.

Data featured on LinkedIn’s site are provided by users who create accounts and populate their profiles with a broad range of information about their background and skills. LinkedIn members have some control over the extent to which their information will be shared by others. They can choose to limit access to their profile information to only their close contacts or to an expanded list of contacts. Alternatively, they can provide access to all other members of LinkedIn. They also have the option to make their profiles entirely public. These public profiles are searchable by search engines such as Google. It is the data in the fully public profiles that is scraped and used by hiQ.

hiQ is not the only company that scrapes data from LinkedIn as part of an independent business model. In fact, LinkedIn has only recently attempted to take legal action against a large number of users of its data. hiQ was just one of many companies that received a cease and desist letter from LinkedIn. Because being cut off from the LinkedIn data would effectively decimate its business, hiQ responded by seeking a declaration from the California court that its activities were legal. The recent decision from the court is in relation to hiQ’s request for an interlocutory injunction that will allow it to continue to access the LinkedIn data pending resolution of the substantive legal issues raised by both sides.

hiQ argued that in moving against its data scraping activities, LinkedIn engaged in unfair business practices and violated its free speech rights under the California constitution. LinkedIn, for its part, argued that hiQ’s data scraping activities violated the Computer Fraud and Abuse Act (CFAA), as well as the digital locks provisions of the Digital Millennium Copyright Act (DMCA) (although these latter claims do not feature in the decision on the interlocutory injunction).

As with other platform companies, access to and use of LinkedIn’s site is governed by website Terms of Service (TOS). These TOS prohibit data scraping. When LinkedIn demanded that hiQ cease scraping data from its site, it also implemented technological protection measures to prevent access by hiQ to its data. LinkedIn’s claims under the CFAA and the DMCA are based largely on the circumvention of these technological barriers by hiQ.

The court ultimately granted the injunction barring LinkedIn from limiting hiQ’s access to its publicly available data pending the resolution of the issues in the case. In doing so, it expressed its doubts that the CFAA applied to hiQ’s activity, noting that if it did, it would “profoundly impact open access to the Internet.” It also found that attempts by LinkedIn to block hiQ’s access might be in breach of state law as anti-competitive behavior. In reaching its decision, the court had some interesting things to say about the importance of access to publicly accessible data, and the privacy rights of those who provided the data. These issues are highlighted in the discussion below.

In deciding whether to grant an interlocutory injunction, a court must assess both the possibility of irreparable harm and the balance of convenience as between the parties. In this case, the court found that denying hiQ access to LinkedIn data would essentially put it out of business – causing it irreparable harm. LinkedIn argued that it was imperative that it be allowed to protect its data because of its users’ privacy interests. While hiQ only scraped data from public profiles, LinkedIn argued that even those users with public profiles had privacy interests. It noted that 50 million of its users with public profiles had selected its “Do Not Broadcast” feature which prevents profile updates from being broadcast to a user’s connections. LinkedIn described this as a privacy feature that would essentially be circumvented by routine data scraping. The court was not convinced. In the first place, it found that there might be many reasons besides privacy concerns that motivated users to choose “Do Not Broadcast”. It gave as an example the concern by users that their connections not be spammed by endless notifications. The court also noted that LinkedIn had its own service for professional recruiters that kept them apprised of updates even from users who had implemented “Do Not Broadcast”. The court dismissed arguments by LinkedIn that this was different because users had consented to such sharing in their privacy policy. The court stated: “It is unlikely, however, that most users’ actual privacy expectations are shaped by the fine print of a privacy policy buried in the User Agreement that likely few, if any, users have actually read.” [Emphasis in original] This is interesting, because the court discounts the relevance of a privacy policy in informing users’ expectations of privacy. Essentially, the court finds that users who make their profiles public have no real expectation of privacy in the information. LinkedIn could therefore not rely on its users’ privacy interests to justify its actions.

In assessing whether the parties raised serious questions going to the merits of the case, the court considered LinkedIn’s arguments about the CFAA. The CFAA essentially criminalizes intentional access to a computer without authorization, or in a way that exceeds the authorization provided, with the result that information is obtained. The question, therefore, was whether hiQ’s continued access to the LinkedIn site after LinkedIn expressly revoked any permission and tried to bar its access, was a violation of the CFAA. The court dismissed the cases cited by LinkedIn in support of its position, noting that these cases involved unauthorized access to password protected sites as opposed to accessing publicly available information.

The court observed that the CFAA was enacted largely to deal with the problem of computer hacking. It noted that if the application of the law was extended to publicly accessible websites it would greatly expand the scope of the legislation with serious consequences. The court noted that this would mean that “merely viewing a website in contravention of a unilateral directive from a private company would be a crime.” [Emphasis in original] It went on to note that “The potential for such exercise of power over access to publicly viewable information by a private entity weaponized by the potential of criminal sanctions is deeply concerning.” The court placed great emphasis on the importance of an open internet. It noted that “LinkedIn, here, essentially seeks to prohibit hiQ from viewing a sign publicly visible to all”. It clearly preferred an interpretation of the CFAA that would be limited to unauthorized access to a computer system through some form of “authentication gateway”.

The court also found that hiQ raised serious questions that LinkedIn’s behavior might fall afoul of competition laws in California. It noted that LinkedIn is in a dominant position in the field of professional networking, and that it might be leveraging its position to get a “competitively unjustified advantage in a different market.” It also accepted that it was possible that LinkedIn was denying its competitors access to an essential facility that it controls.

The court was not convinced by hiQ’s arguments that the technological barriers erected by LinkedIn violated the free speech guarantees in the California Constitution. Nevertheless, it found that on balance the public interest favoured the granting of the injunction to hiQ pending the outcome of litigation on the merits.

This dispute is extremely interesting and worth following. There are a growing number of platforms that host vast stores of publicly accessible data, and these data are often relied upon by upstart businesses (as well as established big data companies, researchers, and civil society) for a broad range of purposes. The extent to which a platform company can control its publicly accessible data is an important issue, and one which, as the California court points out, will have important public policy ramifications. The related privacy issues – where the data is also personal information – are also important and interesting. These latter issues may be treated differently in different jurisdictions depending upon the applicable data protection laws.

Published in Privacy