Wednesday, 11 May 2016 08:04 Written by Teresa Scassa
A Canadian court has just handed down a decision in a case that interweaves interesting issues about copyright in data with issues around how the government can limit the scope of these rights in its view of the public interest. The case is complex – it involves a large number of defendants and is tied to a range of other lawsuits relating to the regulatory regime for oil and gas exploration in Canada. The complexity of the case is such that I will divide my analysis over two blog posts. This – the first – will address the issues around whether there is copyright in the data submitted to the regulator; the second blog post will deal with the issues relating to the curtailment of the copyright within the context of the regulatory regime.
The plaintiff in this case and in the mass of related litigation is Geophysical Service Inc. (GSI). GSI is a Canadian company that is in the business of carrying out marine seismic surveys and licensing the data that it collects and compiles as a result of its activities. It claims that its flood of litigation around the copyright and regulatory regime issues resulted from the fact that the government’s approach is driving it out of business. As copyright is often touted as providing incentives to create and innovate, GSI’s precarious status as an innovator in this area sets an interesting context for the issues raised in the litigation.
In a nutshell, GSI – like other companies in this field – had to obtain a licence from the national regulator to conduct its expensive, time- and labour-intensive work. A condition of the licence was that the data it generated and processed into information products would be submitted to the appropriate regulatory bodies that oversee offshore oil and gas exploration. It is this data and the related information products that GSI claims are protected by copyright law. Under the statutes governing the regulatory process, data submitted to the regulator can be made public after a 5-year period. GSI was in the business of selling its data and information products to companies engaged in oil and gas exploration. GSI argued that the fact that the same data and analysis could be released to the public after 5 years, and was, as a matter of policy, released between 5 and 15 years after its submission, made its business ultimately unsustainable. It argued, therefore, that it had copyright in the data it collected and in the analytics it carried out on the data. It then argued that the regulator, by releasing this data to the public before the expiration of the copyright term, infringed its copyrights. It also maintained that the other private sector companies that made use of its data, obtained from public sources, violated its copyrights.
The first issue, therefore, was whether the seismic data and related information products produced by GSI amounted to original works that could be protected by copyright law. It is a basic principle of copyright law that there can be no copyright in facts – facts are in the public domain. At the same time, however, it is possible to have copyright in a compilation of facts – so long as that compilation meets the requirements of originality. According to the Supreme Court of Canada in CCH Canadian v. Law Society of Upper Canada, originality requires that a work: a) is not copied; b) reflects an exercise of skill and judgment; and c) can be attributed to a human author. In this case, the defendants argued that the GSI data was ‘copied’ from the environment (i.e. it was factual material not protected by copyright law); that its collection and compilation did not involve sufficient skill and judgment because it was in part automated, and in part collected and compiled according to industry standards; and that the technology-assisted and highly human- and other resource-intensive process involved in its collection and compilation meant that it did not originate from an identifiable human author.
Justice Eidsvik of the Alberta Court of Queen’s Bench found resoundingly for the plaintiff on the copyright issues. She carefully considered the manner in which the seismic data was both collected and processed. She found that both the raw data and the processed data constituted “works” within the meaning of the Copyright Act. She analogized the raw seismic data to a literary work or a literary compilation. She also found that some of the seismic sections – data represented as squiggly lines – would fall within the definition of an artistic work. Both “works” in this case met the necessary threshold for originality. She observed that the creation and compilation of the seismic data required significant levels of skill, noting that “The data is created, not merely collected, through the intervention of human skill” (at para 79). The collection of this seismic data requires a complex series of choices. She accepted the analogy that it was like taking a photograph. Justice Eidsvik observed:
In this case, the photograph is not just a quick snapshot; rather, it is one that requires careful selection of the location, angle of technological instruments (e.g. the size and depth of the airguns, the length and depth of the streamers, and the number and placement of hydrophones), and finally the filtering and refining of the product. (at para 80)
She also found apt an analogy from one of the expert witnesses between the creation of the data and the conducting of a symphony, where the conductor “ensures that some instruments are played louder, or softer, or faster or slower, to make a beautiful creation. The same types of decisions are made on board the seismic acquisition ship to obtain “beautiful” raw seismic data.” (at para 81)
Having found copyright in the compilation of raw data, it is not surprising that the judge also found copyright in the processed data as well. She found that substantial skill and judgment went into the processing of the data, stating that “The raw data is not simply pumped into a computer and a useful product comes out.” (at para 83) She found that the quality of the processed data is very much dependent upon the participation of a skilled processor, and that different companies would produce different processed data from the raw data depending upon the skill of the processor involved.
Justice Eidsvik also found that the requisite human author was present. In doing so, she addressed the Telstra decision from the Australian High Court which had found no copyright in a telephone directory in part because it was created following a largely automated process in which there was relatively little human input. In this case, she found the human input to be a significant factor in determining the quality of the output at both the stage of acquisition of the data and the processing stage. She reviewed the few Canadian cases involving compilations of data, noting that in cases where human input is more significant in terms of the choices made in arranging the facts, the courts accept that the compilation is original.
Justice Eidsvik rejected the argument that it is necessary to identify a specific human author in order to find copyright in a complex factual work. She accepted that a team of “authors” could create a factual compilation. Nevertheless, she was also prepared to identify in this case the head of the seismic crew on the ship as the author of the raw data and the person in charge of the computing as the author of the processed data. She noted as well that in this case the actual owner of the copyright would be the employer of both of these individuals – GSI.
In finding copyright in both the raw and the processed data, Justice Eidsvik was careful to note that she was not deviating from the principle that there could be no copyright in facts or ideas. She found that the “seismic data is an expression of GSI’s views of what the image of the subsurface of the surveyed areas represents.” (at para 97). The raw facts – the features of the subsurface – are there for anyone to see and are in the public domain – but the data collected about those facts is authored. Critical data theorists will recognize here the seeds of the essential subjectivity of collected data, where choices are made as to how to collect the data, and according to what parameters.
Justice Eidsvik also rejected the idea that the works at issue lacked originality because their collection and compilation were dictated by “practical considerations, utility or externally imposed requirements.” (at para 105) Notwithstanding the presence of industry standards that would influence some of the decision-making involved in the collection and processing of the data, she found that “the original skill and judgment that comes to bear on the final product of the seismic work far outweighs the portion of “hard wired” industry standards in play.” (at para 105)
Based on the facts of this case it is not surprising that Justice Eidsvik would conclude that there was copyright in both the compilation of seismic data and in the processed data. Her extensive review of the process by which the data is first collected and then processed reveals a substantial amount of skill and judgment. In a “datified” society, the decision may give some comfort to those who collect and process all manner of data: their products – whether compilations of raw data or processed data (analytics) – are works that can be protected under copyright law. Such protection will be dependent upon an ability to show that the collection and/or processing involve choices motivated by skill and judgment, rather than mechanical decision-making or compliance with industry norms or standards.
While for GSI it was a victory to have copyright confirmed in its data products, the victory was largely Pyrrhic. The second part of the decision – and the part that I will consider in a subsequent blog post – deals with the regulatory regime which the court ultimately finds to have effectively expropriated this copyright interest. Stay tuned!
Monday, 25 April 2016 07:06
Sourcing cycling data from the private sector: Some questions about data analytics and city planning
Written by Teresa Scassa
A recent news story from the Ottawa area raises interesting questions about big data, smart cities, and citizen engagement. The CBC reported that Ottawa and Gatineau have contracted with Strava, a private sector company, to purchase data on cycling activity in their municipal boundaries. Strava makes a fitness app that can be downloaded for free onto a smart phone or other GPS-enabled device. The app uses the device’s GPS capabilities to gather data about the users’ routes travelled. Users then upload their data to Strava to view the data about their activities. Interested municipalities can contract with Strava Metro for aggregate de-identified data regarding users’ cycling patterns over a period of time (Ottawa and Gatineau have apparently contracted for 2 years’ worth of data). According to the news story, their goal is to use this data in planning for more bike-friendly cities.
On the face of it, this sounds like an interesting idea with a good objective in mind. And arguably, while the cities might create their own cycling apps to gather similar data, it might be cheaper in the end for them to contract for the Strava data rather than to design and then promote the use of their own apps. But before cities jump on board with such projects, there are a number of issues that need to be taken into account.
One of the most important issues, of course, is the quality of the data that will be provided to the city, and its suitability for planning purposes. The data sold to the city will only be gathered from those cyclists who carry GPS-enabled devices, and who use the Strava app. This raises the question of whether some cyclists – those, for example, who use bikes to get around to work, school or to run errands and who aren’t interested in fitness apps – will not be included in planning exercises aimed at determining where to add bike paths or bike lanes. Is the data more likely to come from spandex-wearing, affluent, hard core recreational cyclists than from other members of the cycling community? The cycling advocacy group Citizens for Safe Cycling in Ottawa is encouraging the public to use the app to help the data-gathering exercise. Interestingly, this group acknowledges that the typical Strava user is not necessarily representative of the average Ottawa cyclist. This is in part why they are encouraging a broader public use of the app. They express the view that some data is better than no data. Nevertheless, it is fair to ask whether this is an appropriate data set to use in urban planning. What other data will be needed to correct for its incompleteness, and are there plans in place to gather this data? What will the city really know about who is using the app and who is not? The purchased data will be de-identified and aggregated. Will the city have any idea of the demographic it represents? Still on the issue of data quality, it should be noted that some Strava users make use of the app’s features to ride routes that create amusing map pictures (just Google “strava funny routes” to see some examples). How much of the city’s data will reflect this playful spirit rather than actual data about real riding routes is a question also worth asking.
Another important issue – and this is a big one in the emerging smart cities context – relates to data ownership. Because the data is collected by Strava and then sold to the cities for use in their planning activities, it is not the cities’ own data. The CBC report makes it clear that the contract between Strava and its urban clients leaves ownership of the data in Strava’s hands. As a result, this data on cycling patterns in Ottawa cannot be made available as open data, nor can it be otherwise published or shared. It will also not be possible to obtain the data through an access to information request. This will surely reduce the transparency of planning decisions made in relation to cycling.
Smart cities and big data analytics are very hot right now, and we can expect to see all manner of public-private collaborations in the gathering and analysis of data about urban life. Much of this data may come from citizen-sensors as is the case with the Strava data. As citizens opt or are co-opted into providing the data that fuels analytics, there are many important legal, ethical and public policy questions which need to be asked.
Published in Geospatial Data/Digital Cartography