Teresa Scassa - Blog

On November 23, 2018, Waterfront Toronto hosted a Civic Labs workshop in Toronto. The theme of the workshop was Smart City Data Governance. I was asked to give a 10-minute presentation on the topic. What follows is a transcript of my remarks.

Smart city governance relates to how smart cities govern themselves and their processes; how they engage citizens and how they are transparent and accountable to them. Too often the term “smart city” is reduced to an emphasis on technology and on technological solutionism – in other words “smart cities” are presented as a way in which to use technology to solve urban problems. In its report on Open Smart Cities, Open North observes that “even when driven in Canada by good intentions and best practices in terms of digital strategies, . . . [the smart city] remains a form of innovation and efficient driven technological solutionism that is not necessarily integrated with urban plans, with little or no public engagement and little to no relation to contemporary open data, open source, open science or open government practices”.

Smart cities governance puts the emphasis on the “city” rather than the “smart” component, focusing attention on how decisions are made and how the public is engaged. Open North’s definition of the Open Smart City is in fact a normative statement about digital urban governance:

An Open Smart City is where residents, civil society, academics, and the private sector collaborate with public officials to mobilize data and technologies when warranted in an ethical, accountable and transparent way to govern the city as a fair, viable and liveable commons and balance economic development, social progress and environmental responsibility.

This definition identifies the city government as playing a central role, with engagement from a range of different actors, and with particular economic, social and environmental goals in mind. This definition of a smart city involves governance in a very basic and central way – stakeholders are broadly defined and they are engaged not just in setting limits on smart cities technology, but in deciding what technologies to adopt and deploy and for what purposes.

There are many interesting international models of smart city governance – many of them arise in the context of specific projects, often of a relatively modest scale. Many involve attempts to find ways to include city residents in both identifying and solving problems, and the use of technology is relevant both to this engagement and to finding solutions.

The Sidewalk Toronto project is somewhat different since this is not a City of Toronto smart city initiative. Rather, it is the tri-governmental entity Waterfront Toronto that has been given the lead governance role. This has proved challenging since, while Waterfront Toronto has a public-oriented mandate, it is not a democratically elected body, and its core mission is to oversee the transformation of specific brownfield lands into viable communities. This is important to keep in mind in thinking about governance issues. Waterfront Toronto has had to build public engagement into its governance framework in ways that are different from a municipal government. The participation of federal and provincial privacy commissioners and of representatives from federal and provincial governments feeds into governance, as does Waterfront Toronto’s Digital Strategy Advisory Panel (DSAP), and there has been public outreach. There will also be review of, and consultation on, the Master Innovation Development Plan (MIDP) once it is publicly released. But it is a different model from city government and this may set it apart in important ways from other smart cities initiatives in Canada and around the world.

Setting aside for a moment the smart cities governance issue, let’s discuss data governance. The two are related – especially with respect to the issue of what data is collected in the smart city and for what purposes.

Broadly speaking, data governance goes to the question of how data will be stewarded (and by whom) and for what purposes. Data governance is about managing data. As such, it is not a new concept. Data governance is a practice that is current in both private and public sector contexts. Most commonly it takes place within a single organization which develops practices and protocols to manage its existing and future data. Governance issues include considering who is responsible for the data, who is entitled to set the rules for access to and reuse of it, how those rules will be set, and who will profit/benefit from the data and on what terms. It also includes addressing issues such as data security, standards, interoperability, and localization. Where the data include personal information, compliance with privacy laws is an aspect of data governance. But governance is not limited to compliance – for example, an organization may adopt higher standards than those required by privacy law, or may develop novel approaches to managing and protecting personal information.

There are many different data governance models. Some (particularly in the public sector) are shaped by legislation, regulations and government policies. Others may be structured by internal policies, standards, industry practice, and private law instruments such as contracts or trusts. As the term is commonly used, data governance does not necessarily implicate citizen involvement or participation in the same way as “smart city governance” does – it is the “city” part of “smart city governance” that brings into focus democratic principles of transparency, accountability, engagement and so on. However, where there is a public sector dimension to the collection or control of data, then public sector laws, including those relating to transparency and accountability, may apply.

With the rise of the data economy, data sharing is becoming an important activity for both public and private sector actors. As a result, new models of data governance are needed to facilitate data sharing. There are many different benefits that flow from data sharing. It may be carried out for financial gain, or it may be done to foster innovation, enable new insights, stimulate the economy, increase transparency, solve thorny problems, and so on. There are also different possible beneficiaries. Data may be shared amongst a group of entities each of which will find advantages in the mutual pooling of their data resources. Or it may be shared broadly in the hope of generating new data-based solutions to existing problems. In some cases, data sharing has a profit motive. The diversity of actors, beneficiaries, and motivations makes it necessary to find multiple, diverse and flexible frameworks and principles to guide data sharing arrangements.

Open government data regimes are an important example of a data governance model for data sharing. Many governments have decided that opening government data is a significant public policy goal, and have done a tremendous amount of work to create the infrastructure not just for sharing data, but for doing it in a useful, accessible and appropriate manner. This means the development of standards for data and metadata, and the development of portals and search functions. It has meant paying attention to issues of interoperability. It has also required governments to consider how best to protect privacy and confidential information, or information that might impact on security issues. Once open, the sharing frameworks are relatively straightforward – open data portals typically offer data to anyone, with no registration requirement, under a simple open licence.

Governments are not the only ones developing open data portals – research institutions are increasingly searching for ways in which to publicly share research outputs including publications and data. Some research data infrastructures support sharing, but not necessarily on fully open terms – this requires another level of consideration as to the policy reasons for limiting access, how to limit access effectively, and how to set and ensure respect for appropriate limits on reuse.

The concept of a data trust has also received considerable attention as a means of data sharing. The term data trust is now so widely and freely used that it does not have a precise meaning. In its publication “What is a Data Trust”, the Open Data Institute (ODI) identifies at least 5 different concepts of a data trust, and provides examples of each:

· A data trust as a repeatable framework of terms and mechanisms.

· A data trust as a mutual organisation.

· A data trust as a legal structure.

· A data trust as a store of data.

· A data trust as public oversight of data access.

The diversity of “data trusts” means that there are a growing number of models to study and consider. However, it also makes it a little dangerous to talk about “data trust” as if it has a precise meaning. With data trusts, the devil is very much in the details. If Sidewalk Labs is to propose a ‘data trust’ for the management of data gathered in the Sidewalk Toronto development, then it will be important to probe into exactly what the term means in this context.

What Sidewalk Labs is proposing is a particular vision of a data trust as a data governance model for data sharing in a smart cities development. It is admittedly a work in progress. It has some fairly particular characteristics. For example, not only is it a framework to set the parameters for sharing the subset of data collected through the project that Sidewalk Labs defines as “urban data”, it also contemplates providing governance for any proposals by third parties who might want to engage in the collection of new kinds, categories or volumes of data.

In thinking about the proposed ‘trust’, some questions I would suggest considering are:

1) What is the relationship between the proposed trust and the vision for smart city governance? In other words, to what extent are the public and public sector decision-makers engaged in determining what data will be governed by the trust, for whose benefit, and on what terms sharing will take place?

2) A data governance model does not make up for the absence of robust smart city governance up front (in identifying the problems to be solved, the data to be collected to solve them, etc.). If this piece is missing, then discussion of the trust may involve discussing the governance of data where there is no group consensus or input as to its collection. How should this be done (if at all)?

3) A data governance model can be created for the data of a single entity (e.g. an open government portal, or a data governance framework for a corporation); but it can also be developed to facilitate data sharing between entities, or even between a group of entities and a broader public. So an important question in the Sidewalk Toronto context is what model is this? Is this Sidewalk Labs data that is being shared? Or is it Waterfront’s? Or the City’s? Who has custody/control or ownership of the data that will be governed by the ‘trust’?

4) Data governance is crucial with respect to all data held by an entity. Not all data collected through the Sidewalk Toronto project will fall within Sidewalk’s definition of “urban data” (for which the ‘trust’ is proposed). If the data governance model under consideration only deals with a subset of data, then there must be some form of data governance for the larger set. What is it? And who determines its parameters?

Published in Privacy

Late in the afternoon of Monday, October 15, 2018, Sidewalk Labs released a densely-packed slide-deck which outlined its new and emerging data governance plan for the Sidewalk Toronto smart city development. The plan was discussed by Waterfront Toronto’s Digital Strategy Advisory Panel at their meeting on Thursday, October 18. I am a member of that panel, and this post elaborates upon the comments I made at that meeting.

Sidewalk Labs’ new data governance proposal builds upon the Responsible Data Use Policy Framework (RDUPF) document which had been released by Sidewalk Labs in May 2018. It is, however, far more than an evolution of that document – it is a different approach reflecting a different smart city concept. It is so different that Ann Cavoukian, advisor to Sidewalk Labs on privacy issues, resigned on October 19. The RDUPF had made privacy by design its core focus and promised the anonymization of all sensor data. Cavoukian cited the fact that the new data governance framework contemplated that not all personal information would be deidentified as a reason for her resignation.

Neither privacy by design nor data anonymization is a privacy panacea, and the RDUPF document had a number of flaws. One of them was that by championing deidentification of personal information as the key to responsible data use, it very clearly only addressed privacy concerns relating to a subset of the data that would inevitably be collected in the proposed smart city. In addition, by focusing on privacy by design, it did little to address the many other data governance issues the project faced.

The new proposal embraces a broader concept of data governance. It is cognizant of privacy issues but also considers issues of data control, access, reuse, and localization. In approaching data governance, Sidewalk is also proposing using a ‘civic data trust’ as a governance model. Sidewalk has made it clear that this is a work in progress and that it is open to feedback and comment. It received some at the DSAP meeting on Thursday, and more is sure to come.

My comments at the DSAP focused on two broad issues. The first was data and the second was governance. I prefaced my discussion of these by warning that in my view it is a mistake to talk about data governance using either of the Sidewalk Labs documents as a departure point. This is because these documents embed assumptions that need to be examined rather than simply accepted. They propose a different starting point for the data governance conversation than I think is appropriate, and as a result they unduly shape and frame that discussion.

Data

Both the RDUPF and the current data governance proposal discuss how the data collected by the Sidewalk Toronto development will be governed. However, neither document actually presents a clear picture of what those data are. Instead, both documents discuss a subset of data. The RDUPF discussed only depersonalized data collected by sensors. The new proposal discusses only what it defines as “urban data”:

Urban Data is data collected in a physical space in the city, which includes:

● Public spaces, such as streets, squares, plazas, parks, and open spaces

● Private spaces accessible to the public, such as building lobbies, courtyards, ground-floor markets, and retail stores

● Private spaces not controlled by those who occupy them (e.g. apartment tenants)

This is very clearly only a subset of smart cities data. (It is also a subset that raises a host of questions – but those will have to wait for another blog post.)

In my view, any discussion of data governance in the Sidewalk Toronto development should start with a mapping out of the different types of data that will be collected, by whom, for what purposes, and in what form. It is understood that this data landscape may change over time, but at least a mapping exercise may reveal the different categories of data, the issues they raise, and the different governance mechanisms that may be appropriate depending on the category. By focusing on deidentified sensor data, for example, the RDUPF did not address personal information collected in relation to the consumption of many services that will require identification – e.g., for billing or metering purposes. In the proposed development, what types of services will require individuals to identify themselves? Who will control such data? How will it be secured? What will policies be with respect to disclosure to law enforcement without a warrant? What transparency measures will be in place? Will service consumption data also be deidentified and made available for research? In what circumstances? I offer this as an example of a different category of data that still requires governance, and that still needs to be discussed in the context of a smart cities development. This type of data would also fall outside the category of “urban data” in the second governance plan, making that plan only a piece of the overall data governance required, as there are many other categories of data that are not captured by “urban data”. The first step in data governance must be for all involved to understand what data is being collected, how, why, and by whom.

The importance of this is also made evident by the fact that between the RDUPF and the new governance plan, the very concept of the Sidewalk Toronto smart city seems to have changed. The RDUPF envisioned a city in which sensors were installed by Sidewalk and Sidewalk was committing to the anonymization of any collected personal information. In the new version, the model seems to be of the smart city as a technology platform on which any number of developers will be invited to build. As a result, the data governance model proposes an oversight body to provide approval for new data collection in public spaces, and to play some role in the sharing of the collected data if appropriate. This is partly behind the resignation of Ann Cavoukian. She objected to the fact that this model accepts that some new applications might require the collection of personal information and so deidentification could not be an upfront promise for all data collected.

The technology-platform model seems responsive to concerns that the smart city would effectively be subsumed by a single corporation. It allows other developers to build on the platform – and by extension to collect and process data. Yet from a governance perspective this is much messier. A single corporation can make bold commitments with respect to its own practices; it may be difficult or inappropriate to impose these on others. It also makes it much more difficult to predict what data will be collected and for what purposes. This does not mean that the data mapping exercise is not worthwhile – many kinds and categories of data are already foreseeable and mapping data can help to understand different governance needs. In fact, it is likely that a project this complex will require multiple data governance models.

Governance

The second point I tried to make in my 5 minutes at the Thursday meeting was about data governance. The new data governance plan raises more questions than it answers. One glaring issue seems to be the place for our already existing data governance frameworks. These include municipal and provincial Freedom of Information and Protection of Privacy Acts and PIPEDA. They may also include the City of Toronto’s open data policies and platforms. There are very real questions to be answered about which smart city data will be private sector data and which will be considered to be under the custody or control of a provincial or municipal government. Government has existing legal obligations about the management of data that are under its custody or control, and these obligations include the protection of privacy as well as transparency. A government that decides to implement a new data collection program (traffic cameras, GPS trackers on municipal vehicles, etc.) would be the custodian of this data, and it would be subject to relevant provincial laws. The role of Sidewalk Labs in this development challenges, at a very fundamental level, the understanding of who is ultimately responsible for the collection and governance of data about cities, their services and infrastructure. Open government data programs invite the private sector to innovate using public data. But what is being envisaged in this proposal seems to be a privatization of the collection of urban data – with some sort of ‘trust’ model put in place to soften the reality of that privatization.

The ‘civic data trust’ proposed by Sidewalk Labs is meant to be an innovation in data governance, and I am certainly not opposed to the development of innovative data governance solutions. However, the use of the word “trust” in this context feels wrong, since the model proposed is not a data trust in any real sense of the word. This view seems to be shared by civic data trust advocate Sean McDonald in an article written in response to the proposal. It is also made clear in a post by the Open Data Institute which attempts to define the concept of a civic data trust. In fact, it is hard to imagine such an entity being created and structured without significant government involvement. This perhaps is at the core of the problem with the proposal – and at the root of some of the pushback the Sidewalk Toronto project has been experiencing. Sidewalk Labs is a corporation – an American one at that – and it is trying to develop a framework to govern vast amounts of data collected about every aspect of city life in a proposed development. But smart cities are still cities, and cities are public institutions created and structured by provincial legislation and with democratically elected councils. If data is to be collected about the city and its residents, it is important to ask why government is not, in fact, much more deeply implicated in any development of both the framework for deciding who gets to use city infrastructure and spaces for data collection, and what data governance model is appropriate for smart cities data.

Published in Privacy
