Multilateral Development Banks and the Legal Fabric of Global Data and AI Governance

As countries around the world engage in a “race to AI regulation” to increase trust in and uptake of AI systems, little attention has been paid to the role of multilateral development banks (MDBs) in shaping AI governance in the Global South. This article aims to address this gap in the literature by exploring how MDBs construct an intricate, dense “legal fabric” comprising a range of binding and nonbinding instruments that act at international, regional, and project-specific sites. MDBs provide funding, technical assistance, and drafting advice for the creation of national laws, policies, and regulatory frameworks, but they also exert leverage through a range of other legal, bureaucratic, and organizational instruments, including loan agreements, project appraisal documents, operational manuals, impact and risk assessment frameworks, terms of reference, guiding principles for infrastructure design, and playbooks for policymakers. These documents shape, reinforce, embed, and stabilize norms with direct impacts on AI systems, and are deeply entangled with law and legal concepts from a variety of doctrines, particularly data protection, algorithmic governance, and cybercrime. “Untangling” this legal fabric reveals several difficult normative, political, and distributive questions regarding both the procedural means through which MDBs shape and enact these instruments, and the substantive norms that are “baked” into this legal fabric and the digital infrastructures it encases. This article concludes by arguing that MDBs must ensure that their work on AI regulation reduces digital inequality, reflects the needs and values of the communities in which they operate (rather than a top-down, Global North-driven approach), and incorporates active participation from the various “publics” that may be directly and indirectly impacted by these systems.

Do as I Say, Not as I Code: GitHub's Copilot Prompts IP Litigation with International Implications

The rapid proliferation of Large Language Models (LLMs) in contemporary technological ecosystems has sparked significant legal debates, particularly regarding intellectual property (IP) rights. One particularly notable yet under-reported case is Doe v. GitHub, currently stayed in the Northern District of California after the court certified an order for interlocutory appeal on September 27, 2024. This lawsuit involves OpenAI, GitHub, and GitHub’s parent company Microsoft, focusing on the use of open-source software (OSS) code to train LLMs, specifically GitHub’s Copilot, a programming assistance tool currently powered by OpenAI’s GPT-4 model and previously by Codex, a modified, fine-tuned version of GPT-3 additionally trained on gigabytes of publicly available source code. Although Copilot has been praised for its potential to enhance programming productivity, open-source developers and communities have raised concerns about its tendency to reproduce material from public repositories without properly attributing authorship or adhering to the terms and conditions of the original open-source licenses. The Ninth Circuit’s upcoming decision on whether claims under DMCA § 1202(b)(1) or (b)(3) must meet an “identicality” requirement carries significant implications for the AI industry, particularly in shaping standards for copyright compliance in models trained on open-source data.

Digitalization as Development: Rethinking the IFC’s Risk Assessment and Remedy Frameworks in the Context of Digital Technologies

“Digital transformation” has become an increasingly central pillar of the international development landscape. “Reaping the benefits of digitalization” is seen as a developmental imperative, and new technologies are widely hailed as providing transformative opportunities. For multilateral development banks (MDBs) in particular, digitalization has become a strategic priority, and these institutions are financing a rapidly growing number of projects with digital components. Although digital technologies can be transformative, whether and under what conditions such transformations enhance economic and social well-being in the ways that MDBs proclaim requires close examination.

Focusing on the International Finance Corporation (IFC), a private-sector lending institution of the World Bank Group, this report unpacks the concept of “digital transformation,” posing the question: what exactly is being transformed by digital technologies, for whom, and with what implications? The report then analyzes the IFC’s current framework for assessing the risks and impacts of its investments and for remedying harms arising from its projects, and identifies key challenges that digitalization poses to existing risk and impact assessment frameworks and remedy mechanisms. It then offers forward-looking suggestions for how existing frameworks might be rethought and reformed.

Curb Your Enthusiasm: Why Europe's Digital Reforms May Not Become a Global Standard

The European Union is widely perceived as, and presents itself as, the global vanguard in the struggle to regulate digital corporations. The Union’s regulatory schemes, especially the Digital Services Act (DSA), are widely hailed as the continent’s – and, from the EU’s perspective, the world’s – best shot at taming digital capitalism. The EU designed many of those measures to become a ‘global standard’. Yet, drawing on organization theory and a legal realist analysis of several key provisions of the DSA and their potential implementation, I claim that crucial parts of Europe’s reforms will not become a global normative standard – or, if they do, they will do so in ways fundamentally different from what many would expect. That is for two reasons. First, while the DSA does establish a few concise and objective substantive standards, it also grants extensive discretion to private organizations. Second, if private actors, as we must assume, exercise this discretion in an autonomous (some might say self-serving) manner, many publicly acclaimed provisions of Europe’s digital governance reforms may yield globalized private ordering carrying the legitimizing label of EU supervision. Consequently, some current European reforms may stabilize rather than constrain private power and diffuse, if at all, only European ceremonies and labels, but not necessarily the full substance of EU law.

From In(-)formation to Infrastructural Turns: The Digital Futures of Human Rights Law and Practice

In his book, The Informational Logic of Human Rights: Network Imaginaries in the Cybernetic Age, Joshua Bowsher critiques the human rights movement’s preoccupation with informational practices. Tracing human rights organizations’ informational preoccupation to the rise of cybernetics and its enmeshment with the neoliberal project, Bowsher argues that the resulting practice of creating violations as events, through capturing, cutting, and noise elimination, has defanged and depoliticized rights. The quest for objective, stable, and predictable knowledge has permeated even the turn to algorithms and machine learning. While denouncing the human rights movement’s resistance to critical and political forms of knowledge-making that interrogate subjects, norms, values, and power relations, Bowsher nonetheless sees potential for salvaging the promise of human rights. What is needed, Bowsher argues, is a reconfiguration of human rights information from an assemblage of ‘brute facts’ into a positional, situated knowledge-making practice that would aim to forge connections between structures of oppression and domination and the sufferings of ‘particularly situated human beings’. Picking up on Bowsher’s call for human rights in(-)formation, this review essay examines the infrastructural turn needed to effectuate knowledge-generating practices attuned to the polyvalency of situated perspectives. Focusing particularly on the growing role of digital data and infrastructures, the essay seeks to illuminate promising paths forward for human rights advocates and practitioners.

Sensoring the Oceans: The Argo Floats Array in the Governance of Science Data Infrastructures

What role do governance arrangements, background legal rules, and core infrastructures play in enabling data collection, determining what “ocean data” is produced, and shaping when and how it is made available? We explore this question by focusing on data about oceanic features produced by Argo – an international program, operationalized by state agencies and research institutions, that comprises arrays of autonomous floats for ocean observation. Through examination of annual meeting notes, interviews, and observation of the Argo Steering Committee’s annual meeting, we analyze the techniques and practices involved in planning, testing, calibrating, validating, and error-correcting that ultimately lead to the production, transmission, and dissemination of Argo data. We then position Argo within the institutional governance of oceans, weather, climate, and, most recently, earth systems to illustrate both the evolution of Argo’s role and its shifting and uneasy position within different governance approaches. In the conclusion, we challenge the utility of “ocean data” as an analytical category and highlight the risks of over-coordination and institutionalization of data infrastructures. We suggest that allowing data infrastructures like Argo to develop organically might lead to productive (if unexpected) connections, fusions, or splits, which might in turn reorient the focus of observation towards unexplored interactions between and within earth systems. We hope that our analysis helps bring to the fore some core data-infrastructural features of planetary governance as it now exists and as it will (have to) rapidly evolve.

The final version of this working paper will be published in “Governance by Data: Infrastructures of Algorithmic Rule” (Cambridge University Press), edited by Fleur Johns, Gavin Sullivan, and Dimitri van Den Meerssche.

Datafication, Power, and Publics in India's National Digital Health Ecosystem

While the weaknesses of India’s public healthcare system have long been evident, the COVID-19 pandemic starkly illustrated the need to strengthen it. Since 2017, however, the solution to India’s public health woes has taken the shape of the National Digital Health Ecosystem (NDHE) – a digital system for the generation, use, and ‘frictionless’ circulation of health data across healthcare actors through the use of artefacts such as health IDs, electronic health records, data standards, and federated computing architectures. These artefacts are not neutral technological systems. Rather, together with social practices, they constitute a “data infrastructure”. Seeing the NDHE as a data infrastructure makes visible its regulatory effects, i.e., the ways in which the NDHE creates “communities of the affected” whose access to public health is now mediated by affordances granted by the NDHE. This, in turn, shapes the law and regulation of the NDHE, where legal frameworks for (health) data protection are weakened not by accident but by design. At the same time, the regulatory effects of the NDHE can and should be regulated by law, by channeling law’s commitment to the creation of healthy public spheres to ensure the vitality of a democracy. Accordingly, this paper makes three contributions: first, it provides a brief overview of the political economy and the regulatory effects of the NDHE; second, it analyses the ways in which the regulatory effects of the NDHE shape legal frameworks for health data to disempower the individuals and communities who generate this data; and third, it outlines research and policy suggestions for how the law can intervene to limit the exclusionary data-politics of the NDHE.

This paper originated in the seminar Global Data Law II: Ordering and Power. It will be published by the National Law School of India University’s Socio-Legal Review, Vol 20, Issue 1 (2024).

Zoning Data Flows

This article explores how China is developing a unique location-based deregulation regime for outbound data to mitigate the negative effects of its initial security-driven regulations. A major move is the repurposing of free trade zones through data outbound negative lists. Using an infrastructural-thinking framework, this article examines the evolution of data outbound regulation in China, recent initiatives in the country's free trade zones, and the dynamics between local and central governments. China's data outbound practices are enabled and constrained by its global information and communications technology (ICT) infrastructural connectivity and its domestic distribution. Free trade zones become appealing testing grounds for deregulation due to their overlap with critical ICT hub locations and their role as sites for policy experimentation. The ongoing pilot projects, through the interplay of law and infrastructure, show promise in channeling China's data outbound activities into specific areas, thereby increasing their visibility, making them more amenable to regulation, and fostering both local and national economies.

Published in Tsinghua China Law Review Vol. 16 No. 2 (2024), pp. 191-223. This paper draws insights from Guarini Global Law & Tech’s Global Data Law Project and the Institute for International Law and Justice’s Infrastructure as Regulation Project.

Empowering Law in Earth System Models

This blog explores the power relation between law and science in global environmental governance, drawing on the Global Data Law and Infrastructure as Regulation (InfraReg) projects at NYU Law. The identification and understanding of global environmental crises have predominantly depended on science and, more recently, on data-driven approaches.

Historically, international environmental law has primarily focused on institutional support for environmental science rather than engaging in the substantive processes of its norm creation. However, a paradigm shift is needed. Physical environmental models often serve as preconditions for, or are coupled with, social system models, directing the creation of climate change scenarios, especially those produced by the IPCC. These scenarios are widely embraced by governments and corporations, with an enormous impact on climate governance, while evading scrutiny from international law.

Emerging proposals advocate for examining these processes through the right to science, as enshrined in the ICESCR, and for integrating broader concepts of climate and energy justice. This blog argues that, in addition, an overlooked perspective lies in the inequities of data generation and infrastructure distribution. Given the complexity and chaotic nature of Earth systems, these disparities create profound injustices that cannot be sufficiently addressed through participation and due process reforms alone. Instead, they require the mobilization of various regimes of international law and institutions.

This piece is part of the American Branch’s first blogging symposium, examining the ILW 2024 theme of ‘Powerless law or law for the powerless?’ from an International Environmental and Energy Law perspective. The blog post builds on insights developed in GGLT’s Planetary Futures project.

From Headlines to AI: Narrowing the Bargaining Gaps between News and AI Companies

This paper explores the interplay between litigation, legislation, and infrastructures as regulation in relation to the scraping of news texts by artificial intelligence (AI) companies. It delves into the pivotal role that scraping public texts from news websites plays in training AI, which raises conflicts over data generation and revenue distribution between news and AI companies. The current bargaining imbalances between the parties prevent news companies from receiving adequate compensation, potentially undermining incentives for public news creation. The paper analyzes the limitations of litigation in addressing text-scraping disputes, given its lengthy, costly nature and fragmented US case law. It proposes targeted legislative interventions inspired by the Australian and Canadian models governing the presentation of news on digital advertising platforms. The three proposed regulatory measures are collaborative negotiations by news companies, the integration of AI technologies into future collaborations, and the regulation of robots.txt and AI.txt infrastructures alongside the fair use doctrine. These tools can improve bargaining in the shadow of the law between news and AI companies by tipping the scale in favor of news companies. Despite their challenges, these regulatory measures suggest new avenues for value distribution between news and AI companies in the ever-evolving technological landscape.
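
To illustrate the infrastructure at issue: robots.txt is a plain-text file served at a website’s root that crawlers consult voluntarily, and an AI.txt file would extend the same idea to AI-specific permissions. A minimal sketch follows; the directives are purely illustrative (no AI.txt standard is settled), the site name is hypothetical, and only GPTBot (OpenAI) and CCBot (Common Crawl) are actual crawler user-agents:

    # https://example-news-site.com/robots.txt (hypothetical site)
    User-agent: GPTBot       # OpenAI's crawler
    Disallow: /              # refuse access site-wide

    User-agent: CCBot        # Common Crawl's crawler
    Disallow: /archive/      # block only the archive section

    User-agent: *
    Allow: /                 # conventional search crawlers remain welcome

Because compliance with such files is voluntary, the paper treats them as infrastructures whose regulation could convert an advisory signal into genuine bargaining leverage for news companies.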

This paper was initially written for the Global Data Law course. It won second prize in the Berkeley Technology Law Journal writing competition 2024 and is forthcoming in that journal.