A decision of the U.S. Court of Appeals for the 10th Circuit highlights the power dynamics around rights to collect and share data. It marks an important victory for environmental activists, and should also be of interest to all those who engage in citizen science, as well as community-based environmental monitoring.

The case arose after the Wyoming legislature passed a law titled Trespassing to Unlawfully Collect Resource Data that imposed civil and criminal liability on any person who crossed over private land in order to “access adjacent or proximate land where he collects resource data.” The statutory definitions of resource data included all kinds of data gathering activity from taking notes to photographing wildlife or taking samples of soil or water.

The backstory to the legislation involved efforts by environmental activists with the Western Watersheds Project to document the impact of cattle grazing on water quality, and to push for limits on grazing on public lands. These efforts were opposed by cattle ranchers, who apparently carry enough clout to push the legislature to enact such a law. A predecessor statute in 2015, titled Trespassing to Collect Data, created civil and criminal liability for collecting data on “open lands”. After the constitutionality of the 2015 law was challenged, it was amended to prohibit crossing private land without permission in order to collect data on “adjacent or proximate land” (which might be public land). It was this amended version that was considered by the appellate court.

The issue before the Court was not whether there was a broad right to collect resource data on either public or private land. Rather, it was whether the state, by creating new civil and criminal trespass penalties for those who crossed private land without permission in order to collect data on public land, violated the free speech rights of the data collectors. The plaintiffs’ argument was essentially that although there were already penalties for trespass on private land, the statute created additional penalties for those who trespassed on private land for the purpose of collecting data on public land. Thus, the court framed the issue as “not whether trespassing is protected conduct, but whether the act of collecting resource data on public lands qualifies as protected speech.” The court noted that the prohibited acts under the law involved “collecting water samples, taking handwritten notes about habitat conditions, making an audio recording of one’s observation of vegetation, or photographing animals”, so long as location data was also included.

The Court noted that a number of federal and state environmental statutes and regulations provided for public submission of environmental data as part of assessment and decision-making processes. The plaintiffs argued that a law restricting their ability to gather environmental data inhibited their ability to participate in such processes, thus limiting their freedom of speech. The Court agreed, noting that the First Amendment extends to the “creation” of speech. The Court observed that “An individual who photographs animals or takes notes about habitat conditions is creating speech in the same manner as an individual who records a police encounter”. The Court also found that the taking of samples, though “somewhat further afield of pure speech”, was protected. In this case, the samples were characterized by the Court as “information plaintiffs need to engage in environmental advocacy”. The Court also observed that the plaintiffs used the data they collected in advocacy activities, and that this type of political engagement was at the core of the First Amendment protection.

The Court does caution that there is no general “unrestrained right to gather information”. As a result, laws that ban activities, and thereby incidentally limit the ability to gather information about those activities, would not run afoul of the First Amendment. In this way, a general prohibition on trespass does not offend the First Amendment, even if it means that someone would be equally barred from trespassing to gather information. What was problematic here was that the laws created new penalties that specifically applied to trespass for data gathering activities.

Although the legislation in this case might seem to be an outlier, the product of aggressive lobbying of government by a stakeholder group, the issues it raises have a broader significance. Control over data, access to data and even the ability to create data are all crucially important in our data-driven society. My ongoing research explores issues of ownership, control and access to data – expect to see more posts on these topics over the course of the year.

A 2016 European Commission report titled Survey report: data management in Citizen Science projects provides interesting insights into how such projects manage the data they collect. Proper management is, of course, essential to ensure that the collected data can be used and reused by project leaders as well as by other downstream users. It is relevant as well to the protection of the privacy of citizen participants. The authors of this report surveyed a large number of citizen science projects. From the 121 responses received they distilled findings that explore the diversity of the citizen science projects, and that reveal a troubling lack of thorough data management practices. A significant shortcoming for many projects was the lack of appropriate data licences to govern reuse of either raw or aggregate data collected.

There has been growing pressure on those carrying out research using public resources to make the fruits of the research – including the research data – publicly available for consultation, verification or reuse. But doing so is not as simple as a binary open/closed choice. There are a number of different questions that researchers must address: Should the raw data be made open or only the aggregate data? Should it be immediately available or available only after an embargo period? Is all data suitable for release or should some be protected for public policy reasons (such as protecting privacy)? And what, if any, terms and conditions should be imposed on reuse?

The authors of the EC report, Sven Schade and Chrysi Tsinaraki, found that overall there was a relatively high level of data sharing from citizen science projects. Significantly, 38% of the respondents to their survey provided access to their raw data; 37% provided access to aggregate data and 30% provided access to both. One interesting observation in this respect was that 68% of those respondents who provided access to their raw data also included within this dataset personal identifiers of citizen contributors to the project. Such data might be advertently collected, as where individuals provide personal information with their data uploads. In some cases, the scope of personal information might be significant. Contributions to a project might include geolocation information and geodemographic details. Schade and Tsinaraki asked respondents about their practices when it came to obtaining informed consent to data collection from project participants; they found that 25% of respondents did not obtain such consent whereas 53% relied upon a generic terms of use document to obtain consent. It was not entirely clear whether the consent being sought related to privacy issues or to obtaining any necessary rights to use or disseminate the data being collected (which might, for example, include copyright-protected photographs). In any event, the results of the survey suggest that there is a significant lack of attention to both privacy and IP rights issues in citizen science projects.

On the issue of data licensing, Schade and Tsinaraki found that the conditions imposed on reuse by different projects varied. A majority of those who made data available believed that the data was in the public domain, while others imposed conditions such as non-commercial or share-alike restrictions. When asked which licence they used to achieve these goals, 32 out of 56 respondents indicated that they used one of the commonly available template licences such as Creative Commons or Open Data Commons. A surprising number of respondents indicated that no particular licence was used. While data released in this way might be presumed to be “open”, the usefulness of the data might well be hampered by a lack of clarity regarding the scope of permitted reuse.

In addition to providing access to data, the authors of the Report asked whether citizen science researchers allowed open access to research results (presumably in the form of published papers and other output). While the overwhelming majority of projects indicated that they used open access options (ranging from public domain dedication to open access with conditions), Schade and Tsinaraki also found that 14 of the projects they considered used licences with terms that were not consistent with the reuse conditions that the researchers had identified. Clearly there is a need for greater support for projects in developing or choosing appropriate licences.

Although many of the projects indicated that they provided access to their data, the duration of that access was less certain. The authors found that 42% of projects intended to guarantee access to their data only within the lifespan of the project. The authors also found that 40% of projects that provide data access do not provide comprehensive metadata along with the data. This would certainly limit the value of the data for reuse. Both these issues are important in the context of citizen science projects, which are often grant-funded and temporally limited. The ability to archive and preserve research data and to make it available for meaningful access and reuse should be part of researchers’ data management plans, and is something which should be supported by research institutions and funding agencies.

Overall, the Report provides data that suggests that the burgeoning field of citizen science needs more support when it comes to all aspects of data management. Proper data management practices will help citizen science researchers to meet their own objectives, to share their data effectively and appropriately, and to protect the rights and interests of participants.

Note: In 2015 I drafted a report, with Haewon Chung, for the Wilson Center Commons Lab titled Managing Intellectual Property Rights in Citizen Science. This report addresses many licensing issues related to the collection, sharing and reuse of citizen science data and outputs. It is available under a Creative Commons Licence.


