Sensoring the Oceans: The Argo Floats Array in the Governance of Science Data Infrastructures

What role do governance arrangements, background legal rules, and the core infrastructures play in enabling data collection, determining what “ocean data” is produced, and when and how it is made available? We explore this question by focusing on data about oceanic features produced by Argo – an international program, operationalized by state agencies and research institutions, that comprises arrays of autonomous floats for ocean observation. Through examination of annual meeting notes, interviews, and observation of the Argo Steering Committee’s annual meeting, we analyze the techniques and practices involved in planning, testing, calibrating, validating, and error-correcting that ultimately lead to the production, transmission, and dissemination of Argo data. We then position Argo within the institutional governance of oceans, weather, climate and, most recently, earth systems to illustrate both the evolution of Argo’s role and its evolving and uneasy position within different governance approaches. In the conclusion, we challenge the utility of “ocean data” as an analytical category and highlight the risks of over-coordination and institutionalization of data infrastructures. We suggest that allowing data infrastructures like Argo to develop organically might lead to productive (if unexpected) connections, fusions, or splits, which might in turn reorient the focus of observation towards unexplored interactions between and within earth systems. We hope that our analysis helps bring to the fore some core data-infrastructural features of planetary governance as it now exists and will (have to) rapidly further evolve.

The final version of this working paper will be published in “Governance by Data: Infrastructures of Algorithmic Rule” (Cambridge University Press), edited by Fleur Johns, Gavin Sullivan, and Dimitri van Den Meerssche.

Datafication, Power, and Publics in India's National Digital Health Ecosystem

While evident for a long time, the COVID-19 pandemic starkly illustrated the need to strengthen India’s public healthcare system. But since 2017, the solution to India’s public health woes has taken the shape of the National Digital Health Ecosystem (NDHE) – a digital system for the generation, use, and ‘frictionless’ circulation of health data across healthcare actors through the use of artefacts such as health IDs, electronic health records, data standards, and federated computing architectures. These artefacts are not neutral technological systems. Rather, together with social practices, they constitute a “data infrastructure”. Seeing the NDHE as a data infrastructure allows us to visibilise the regulatory effects of the NDHE, i.e., the ways in which the NDHE creates “communities of the affected” whose access to public health is now mediated by affordances granted by the NDHE. This, in turn, shapes law and regulation of the NDHE, where legal frameworks for (health) data protection are not weakened by accident, but weakened by design. At the same time, the regulatory effects of the NDHE can and should be regulated by law, by channeling law’s commitment to the creation of healthy public spheres to ensure the vitality of a democracy. Accordingly, this paper makes three contributions: one, it provides a brief overview of the political economy and the regulatory effects of the NDHE; two, it analyses the ways in which the regulatory effects of the NDHE shape legal frameworks for health data to disempower the individuals and communities who generate this data; and three, it outlines research and policy suggestions for how the law can intervene in limiting the exclusionary data-politics of the NDHE.

This paper originated in the seminar Global Data Law II: Ordering and Power. It will be published by the National Law School of India University’s Socio-Legal Review, Vol 20, Issue 1 (2024).

Zoning Data Flows

This article explores how China is developing a unique location-based data outbound deregulation regime to mitigate the negative effects of its initial security-driven regulations. A major move is repurposing free trade zones with data outbound negative lists. Using an infrastructural-thinking framework, this article examines the evolution of data outbound regulation in China, recent initiatives in the country's free trade zones, and the dynamics between local and central governments. China's data outbound practices are enabled and constrained by its global information and communications technology (ICT) infrastructural connectivity and domestic distribution. Free trade zones become appealing deregulation testing grounds due to their overlap with critical ICT hub locations and their role as sites for policy experimentation. The ongoing pilot projects, through the interplay of law and infrastructure, have the potential to channel China's data outbound activities into specific areas, thereby increasing their visibility, making them more amenable to regulation, and fostering both local and national economies.

Published in Tsinghua China Law Review Vol. 16 No. 2 (2024), pp. 191-223. This paper draws insights from Guarini Global Law & Tech’s Global Data Law Project and Institute for International Law and Justice’s Infrastructure as Regulation Project.

Empowering Law in Earth System Models

This blog explores the power relation between law and science in global environmental governance, drawing on the Global Data Law and Infrastructure as Regulation (InfraReg) projects at NYU Law. The identification and understanding of global environmental crises have predominantly depended on science and, more recently, on data-driven approaches.

Historically, international environmental law has primarily focused on institutional support for environmental science rather than engaging in the substantive processes of its norm creation. However, a paradigm shift is needed. Environmental physical models often condition and/or are coupled with social system models, directing the creation of climate change scenarios, especially those by the IPCC. These scenarios are widely embraced by governments and corporations and have an enormous impact on climate governance, while evading scrutiny from international law.

Emerging proposals advocate for examining these processes through the right to science, as enshrined in the ICESCR, and for integrating broader concepts of climate and energy justice. This blog argues that, in addition, an overlooked perspective lies in the inequities of data generation and infrastructure distribution. Given the complexities and chaotic nature of Earth systems, these disparities create profound injustices that cannot be sufficiently addressed through participation and due process reforms. Instead, the mobilization of various regimes of international law and institutions is necessary.

This piece is part of the American Branch’s first blogging symposium, examining the ILW 2024 theme of ‘Powerless law or law for the powerless?’ from an International Environmental and Energy Law perspective. The blog post builds on insights developed in GGLT’s Planetary Futures project.

From Headlines to AI: Narrowing the Bargaining Gaps between News and AI Companies

This paper explores the interplay between litigation, legislation, and infrastructures as regulation in relation to the scraping of news texts by artificial intelligence (AI). It delves into the pivotal role of scraping public texts from news websites in training AI, which raises conflicts over data generation and revenue distribution between news and AI companies. The current bargaining imbalances between the parties prevent news companies from receiving adequate compensation, potentially undermining incentives for public news creation. The paper analyzes the limitations of litigation in addressing text-scraping disputes due to its lengthy, costly nature and fragmented US case law. It proposes targeted legislative interventions inspired by Australian and Canadian models regarding the presentation of news on digital advertising platforms. The three proposed regulatory measures are collaborative negotiations by the news companies, integrating AI technologies into future collaborations, and regulating robots.txt and ai.txt infrastructures while embracing the fair use doctrine. These tools can improve the bargaining in the shadow of the law between news and AI companies by tipping the scale in favor of news companies. Despite their challenges, these regulatory measures suggest new avenues for value distribution between the news and AI companies in the ever-evolving technological landscape.

This paper was initially written for the Global Data Law course. It won second prize in the Berkeley Technology Law Journal writing competition 2024 and is forthcoming in that journal.

China's Interim Measures for the Management of Generative AI Services

On August 15, 2023, the Interim Measures for the Management of Generative AI Services (Measures) – China’s first binding regulation on generative AI – came into force. The Interim Measures were jointly issued by the Cyberspace Administration of China (CAC), along with six other agencies, on July 10, 2023, following a public consultation on an earlier draft of the Measures that concluded in May 2023. 

This blog post is a follow-up to an earlier guest blog post, “Unveiling China’s Generative AI Regulation” published by the Future of Privacy Forum (FPF) on June 23, 2023, that analyzed the earlier draft of the Measures. This post compares the final version of the regulation with the earlier draft version and highlights key provisions.

Notable changes in the final version of the Measures include:

  • A shift in institutional dynamics, with the CAC playing a less prominent role;

  • Clarification of the Measures’ applicability and scope;

  • Introduction of responsibilities for users;

  • Introduction of additional responsibilities for providers, such as taking effective measures to improve the quality of training data, signing service agreements with registered users, and promptly addressing illegal content;

  • Assignment of responsibilities to government agencies to strengthen the management of generative AI services; and

  • Introduction of a transparency requirement for generative AI services, in addition to the existing responsibilities for providers to increase the accuracy and reliability of generated content.

Published by the Future of Privacy Forum blog. The blog post builds on insights developed in the context of Guarini Global Law & Tech’s conference on “how (not) to regulate generative AI”.

Attributive Justice in International Law: The Global Law and Infrastructure of Pathogen Genomic Sequence Data-Sharing and Benefit-Sharing

A succession of epidemic diseases among humans in the first decades of the 21st century renewed long-standing controversies about power imbalances and justice in the global production, use, and distribution of scientific data and its benefits. A new area of contention concerns digital genomic sequence data (GSD). The largely-forgotten idea of ‘attributive justice’, articulated by Hugo Grotius (1625), helps make sense of otherwise-disparate demands for GSD justice.

At least two kinds of attributive justice claims are made in relation to GSD. One is for attribution of credit to scientists and others involved in medical services or other procurement of samples: a scientist-focused attributive justice. These claims are mobilized especially in efforts to rectify existing power and resource imbalances in science production, both within national societies and by scientists from developing countries. These claims have considerable traction, but not in formal international law.

A second claim relates to demands by developing countries either to control GSD, or at least to receive benefits from commercial use of it when the underlying biological sample originates specifically in their territory. These claims have been pursued in efforts to extend the 2010 Nagoya Protocol. Other existing or pending international treaty regimes embedded in entirely separate institutions also address benefit sharing in relation to oceanic, plant, or human digital sequence sharing, complicating the formation of a coherent or unified set of rules. Contentions about widely used sets of data-governance principles such as Findable, Accessible, Interoperable, and Reusable (FAIR) data also arise in each treaty regime.

Infrastructural regimes for the sharing of sequences have become major sites for both the scientist-relative and state-relative attributive justice claims. The most widely used platform for access to GSD of all kinds is the International Nucleotide Sequence Database Collaboration (INSDC) (including GenBank). A leading alternative is GISAID, which is similar in being free to use but conditions GSD access and sets requirements for attribution of scientific credit. While GISAID goes further than INSDC in supporting scientist-relative attributive justice claims, the two infrastructures are broadly similar with regard to state-relative claims for attribution of GSD and benefit-sharing. The infrastructures have recently begun trying to ensure that metadata accompanying each sequence attributes it to samples taken from a particular country, but not that the GSD is systematically linked to its commercial outcomes. These infrastructures embed norms and ideologies of their original builders, such as a normative commitment to ‘Open Science’, and the economic and epidemic-security interests of richer OECD countries.

Attributive justice entitlements of particular scientists and states will not leverage universal principles of distributive virus- and vaccine-justice but are reinforcing significant shifts toward orders of respect and recognition in global health research and (slowly) in sequencing infrastructures. The contributions of attributive justice have been underestimated.

Professor Benedict Kingsbury published his article 'Attributive Justice in International Law: The Global Law and Infrastructure of Pathogen Genomic Sequence Data-Sharing and Benefit-Sharing' in the Summer 2023 issue of NYU Journal of International Law and Politics.

Unveiling China’s Generative AI Regulation

The Cyberspace Administration of China (CAC) released Draft Measures for the Management of Generative AI Services (the “Draft Measures”) on April 11, 2023. The comment period closed on May 10, 2023. Public statements by industry participants and legal experts provided insight into the likely content of their comments. It is now the turn of the CAC as China’s “cyber super-regulator” to consider these comments and likely produce a revised text.

This blog post analyzes the provisions and implications of the Draft Measures. It covers the Draft Measures’ scope of application, how they apply to the development and deployment lifecycle of generative AI systems, and how they deal with the ability of generative AI systems to “hallucinate” (that is, produce inaccurate or baseless output). It also highlights potential developments and contextual points about the Draft Measures that industry and observers should pay attention to.

Published by the Future of Privacy Forum blog. The blog post builds on insights developed in the context of Guarini Global Law & Tech’s conference on “how (not) to regulate generative AI”.

Three Years of GDPR: Enforcement (or Lack Thereof) and Its Impact on Cross-Border Contracts

The General Data Protection Regulation (GDPR) is widely touted as the greatest shift in data privacy regulation of the century—with protections of users’ rights in commercial use, as well as cross-border transfers, the GDPR establishes fundamental freedoms within digital spaces and codifies the rights of users across the European Union (EU). When the GDPR was introduced, the EU had high expectations of changing practices in relation to data collection, processing and transfer. Despite examples of penalties and fines being imposed on businesses, three years after the GDPR entered into force, the question remains: Has GDPR enforcement (or lack thereof) changed the way cross-border contracting is carried out? This article describes the EU’s initial plans for enforcement under the GDPR, discusses actual instances of enforcement over its three years of existence, and queries whether anything about the GDPR has changed cross-border contracting practices.

Since the GDPR’s inception, EU supervisory authorities have levied approximately 1,034 fines and 1.6 billion Euro in penalties for violations under the GDPR. Notably, since 2021, authorities appear to have ramped up enforcement. Between January and November 2021, DPAs issued 395 fines against companies, totaling over 1 billion Euro (eighty-one percent of all fines issued from the inception of the GDPR to November 2021).

However, very few enforcement actions have addressed cross-border contracting. Companies engaging in cross-border contracting could interpret this lack of interest from regulatory bodies as a sign that things may carry on as they did before the GDPR came into effect. While corporations may have altered attitudes towards customer engagement and data processing based on the GDPR, there is little evidence of changes to nuanced practices associated with cross-border contracts. Businesses seem far more focused on compliance with requirements related to contracts with users than on requirements for contractual relationships internationally. Similar enforcement trends in the narrower context of Chapter V protections against improper and unjustifiable cross-border data transfers remain to be seen.

Published in The Year in Review: An Annual Publication of the ABA International Law Section (vol 56 ABA/ILS YIR 67-72 (2022)). The article was written by Ali Strongwater (JD ’23) and Izak Rosenfeld (Associate General Counsel, Access Now) during Ali’s externship at Access Now in Fall 2021.

Confronting Data Inequality

Control over data conveys significant social, economic, and political power. Unequal control over data, a pervasive form of digital inequality, is a problem for economic development, human agency, and collective self-determination that needs to be addressed. This Article takes steps in this direction by analyzing the extent to which law facilitates unequal control over data and by suggesting ways in which legal interventions could lead to more equal control over data. We use the term "data inequality" to capture unequal control over data, not only in terms of having or not having data, but also in terms of having or not having the "power to datafy" (i.e., deciding what becomes or does not become data). We argue that data inequality is a function of unequal control over the infrastructures that generate, shape, process, store, transfer, and use data. Existing law often regulates data as an object to be transferred, protected, shared, and exploited and is not always attuned to the salience of infrastructural control over data. While there are no easy solutions to the variegated causes and consequences of data inequality, we suggest that retaining flexibility to experiment with different approaches; reclaiming infrastructural control; systematically demanding enhanced transparency; pooling data and bargaining power; and developing differentiated and conditional access to data mechanisms may help in confronting data inequality more effectively going forward.

Published in Columbia Journal of Transnational Law, Volume 60, Issue 3 (2022), pp. 829-956. The paper was initially written as a background paper for the World Development Report 2021: Data for Better Lives. It draws on ideas developed in Guarini Global Law & Tech’s Global Data Law project.