Top 20 AI Governance Papers of 2024 (part 1)
A breakdown of the year's best research | Edition #4
Hey 👋
I’m Oliver Patel, author and creator of Enterprise AI Governance.
This free newsletter delivers practical, actionable and timely insights for AI governance professionals.
My goal is simple: to empower you to understand, implement and master AI governance.
For more frequent updates, follow me on LinkedIn.
This week’s newsletter covers:
✅ Top 20 AI governance papers of 2024 (part 1)
✅ Key takeaways and summary of each paper
✅ Why it matters: the ‘so what’ for AI governance professionals
2024 was a landmark year for AI governance. Across the globe, alongside important AI policy developments, a huge amount of impressive and consequential research was published.
Whilst it is not always easy to carve out time to engage with foundational research, it is the only way to truly understand these complex issues.
This two part series presents an overview of the top 20 AI governance and safety papers of 2024.
Part 1 (below) features papers on AI and cyber security, red teaming, explainability and open source AI.
Part 2 will be covered in next week’s edition of Enterprise AI Governance. It features papers on AI agents and assistants, deliberative reasoning, liability for AI harms and foundation model use policies.
Disclaimer: I am not sponsored by any of the below organisations and I receive nothing in return for promoting these papers 😊
Top 20 AI governance papers of 2024 (part 1)
1. Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations
National Institute of Standards and Technology (NIST), January 2024
Key takeaways: There are many different attacks which malicious actors can deploy to compromise and manipulate AI systems. This includes attacks which aim to exfiltrate sensitive information from models, undermine model performance and even steal model weights. This comprehensive NIST report presents a taxonomy and framework for understanding and classifying these various attacks. It also explains the corresponding AI system vulnerabilities and risk mitigation options for each attack, focusing on generative AI and predictive AI respectively.
Why it matters: If you read one paper on AI and cyber security, it should be this. It is not possible to understand the cyber security risks of AI without understanding the various attacks and threat vectors. And this knowledge is futile without also understanding what can be done in practice, at different stages of the AI lifecycle, to mitigate risk and vulnerability. By providing clarity on key terms, concepts and definitions, NIST standardises how we think about and protect AI from a cyber security perspective.
Image from NIST highlighting the different types of attacks on generative AI systems
2. ASEAN Guide on AI Governance and Ethics
Association of Southeast Asian Nations (ASEAN), February 2024
Key takeaways: ASEAN has 10 member states, with a combined population of over 600 million people. This includes key economic players like Thailand, Malaysia, Singapore and Vietnam. ASEAN’s practical guidance on AI governance is intended to be used by organisations in the region which develop and deploy AI systems in commercial settings. The guidance covers 4 components: i) internal governance structures and measures, ii) human involvement and oversight, iii) operations management (e.g., AI lifecycle and iv) stakeholder interaction and communication.
Why it matters: Do not focus excessively on the EU AI Act and neglect everything else. AI governance went truly global in 2024, and this trend is set to ramp up this year. AI governance professionals, especially those working in multinational companies, need to track important policy and regulatory developments worldwide. This paper is likely a precursor to more official regulatory proposals in some of the ASEAN member states. Therefore, it serves as a useful guide to policy thinking in the region. Although many countries will likely pass AI laws, there is going to be plenty of divergence and fragmentation. In this respect, the Brussels effect could be limited, and cross-jurisdictional compliance will become increasingly complex to navigate.
3. AI Accountability Policy Report
National Telecommunications and Information Administration (NTIA), March 2024
Key takeaways: This report proposes several policy recommendations, for U.S. government agencies, to enhance and strengthen AI governance. This includes:
Creating guidelines for AI audits, information disclosures and liability rules
Investing in tools, standards and research on AI testing and evaluation, including red teaming
Amending regulations to mandate AI audits, testing and evaluation
Why it matters: Accountability is a critical component of AI governance. Exactly who is responsible and accountable, within an organisation, for compliance with AI governance requirements, and whether these people actually fulfil their obligations, is the most important determinant of success. When rolling out your AI governance framework, do not overlook the importance of pre-defined roles, responsibilities and senior leadership accountabilities for AI governance. This should cover specific processes, assessments, reviews, decisions, escalations and risk acceptance procedures. Without a scalable accountability framework and operating model, AI governance cannot succeed.
4. AI Index Report 2024
Stanford University, April 2024
Key takeaways: Perhaps too many to summarise (the report is 500+ pages), but some of the most important takeaways for AI governance are:
Privacy and data governance is ranked as the top AI risk for companies
People across the world are becoming more nervous and concerned about AI
There is a lack of standardisation for LLM evaluation and safety, with each AI company adopting its own approach
The number of AI regulations across the U.S. increases sharply
Why it matters: The annual AI Index is a must read for everyone working in AI governance. AI is becoming an increasingly large, complex and disparate field. And the development, use and risks of AI impacts all industries, countries and parts of society. It’s becoming virtually impossible to keep up with everything, which is why this report is a vital resource to recap on the key trends across the year.
Image from Stanford’s AI Index highlighting the most relevant AI risks for companies, with privacy and data governance considered the top risk worldwide
5. Mapping the Mind of a Large Language Model
Anthropic, May 2024
Key takeaways: It is widely accepted that LLMs are a black box. Anthropic’s research aims to change this. The team evaluated the inner workings of the Claude 3.0 Sonnet model (one of Anthropic’s flagship LLMs) during the process of inference (i.e., whilst the model was performing the calculations required to generate a new prediction). This parallel, real-time evaluation highlighted that the model has ‘features’, which are distinct patterns of neurons in the network, which ‘activate’ and come to the fore in response to specific inputs and queries. By understanding how and when these features activate, and the role they play in the model’s inference process, we can better explain, influence and steer AI model behaviour.
Why it matters: Interpretability and explainability is a perennial challenge for AI governance. Without being able to meaningfully and reliably understand, interpret and explain both how and why a model generates a particular output, certain use cases will be off limits, or at best highly risky from an ethical, legal and compliance standpoint. Although some traditional machine learning models are explainable, such as random forest models based on decision trees, LLMs in production today are not. However, this could begin to change, with frontier research like this.
Image from Anthropic showing which aspects of an input query the model’s ‘gender bias awareness’ feature picks up on
6. AI Governance in Practice Report 2024
IAPP, June 2024
Key takeaways: This report provides a thorough overview of the emerging field of AI governance. It serves as a useful introduction for any organisation or professional beginning their journey. It outlines the main risks of AI, focusing on data, privacy, transparency, bias, copyright and security. It then summarises the most important AI laws, regulations and frameworks, including the EU AI Act, OECD AI Principles and NIST AI Risk Management Framework, all of which are intended to mitigate these risks.
Why it matters: The IAPP is playing an instrumental role in the professionalisation and evolution of AI governance. This report offers sound advice for practitioners. For example, enterprise AI governance programmes should be tailored to the specific needs and context of the organisation. Furthermore, existing governance processes, like privacy, should be leveraged and augmented, where possible.
IAPP chart showing significant increase in private sector investment in AI since 2013
7. Dual-Use Foundation Models with Widely Available Model Weights
National Telecommunications and Information Administration (NTIA), July 2024
Key takeaways: This report, by the U.S. government, provides a detailed overview of open source AI. It assesses the benefits, opportunities, risks and dangers of open source foundation models (i.e., making model weights openly available for download to the public). Some of the main risks cited include the design or production of chemical, biological, radiological or nuclear weapons (CBRN), as well as acceleration of AI development in “countries of concern”.
Why it matters: Every enterprise is currently grappling with its open source AI policy. Data scientists and engineers expect access to the latest technology, especially when it is freely and openly available online. Although AI governance leaders do not want to impede innovation, they are rightfully wary of shadow IT, the build-up of technical debt and various legal and compliance risks from ungoverned open source AI use, such as breaching copyright. Forward thinking organisations must strike a fine balance, to ensure responsible and legally compliant AI innovation.
8. The AI Risk Repository: A Comprehensive Meta-Review, Database, and Taxonomy of Risks From AI
MIT FutureTech, August 2024
Key takeaways: MIT’s AI Risk Repository is a live and evolving database of 700+ risks, drawn from 40+ AI frameworks. These AI risks are categorised into 7 core domains: i) discrimination & toxicity, ii) privacy & security, iii) misinformation, iv) malicious actors & misuse, v) human-computer interaction, vi) socioeconomic and environmental and vii) AI system safety, failures & limitations. This is a comprehensive and unified classification system for AI risk.
Why it matters: The purpose of AI governance is to ensure AI risks are effectively mitigated and managed. Therefore, you cannot implement AI governance without understanding AI risk. There is an overwhelming amount of material out there on AI risks, harms and incidents, which is why the AI Risk Repository is an essential resource. Next time you need to give an important presentation or training about AI risk, with over 700 examples in the repository today, you should no excuses for not thinking of interesting case studies!
9. Historical Analogues That Can Inform AI Governance
RAND, August 2024
Key takeaways: Michael Vermeer analyses four historical case studies of technology governance and highlights the key lessons learned for AI. These are i) nuclear technology, ii) the internet, iii) encryption products and iv) genetic engineering. His key message is that there is a pressing need for international consensus on norms and standards. Also, partnerships between government and the private sector are crucial for effective long-term governance and risk mitigation.
Why it matters: It’s important for AI governance professionals to remember that the ’governance’ is just as crucial as the ‘AI’. Don’t spend all your time and energy thinking and learning about AI, if it comes at the expense of learning lessons from other domains of governance. Ultimately, although AI governance is novel and exciting, technology governance and risk management is nothing new. Enterprise AI governance professionals need not reinvent the wheel. Speak to your colleagues in other risk functions, such as privacy, cybersecurity and financial reporting, as they will undoubtedly have useful insights for your work.
10. Guide to Red Teaming Methodology on AI Safety
Japan AI Safety Institute, September 2024
Key takeaways: Japan’s AI Safety Institute provides a detailed methodology and practical guidance which organisations can use to ‘red team’ AI systems. The intended audience is AI developers and providers. Red teaming is defined as an evaluation method to determine and strengthen the effectiveness of an AI system at withstanding malicious actions. The focus is on red teaming to mimic, and ultimately prevent, malicious attacks carried out by actors seeking to abuse, disrupt, manipulate and destroy AI systems.
Why it matters: Red teaming is one of the most important techniques in AI governance and safety. Taking inspiration from cyber security, AI red teaming consists of simulating attacks, like malicious or harmful inputs, with the goal of identifying vulnerabilities and weaknesses, which can be addressed before an AI model or system is released. Red teaming exercises are now routinely carried by AI developers, especially prior to the release of new foundation models. All AI governance professionals therefore need to be aware of what best practice looks like.
Japan AI Safety Institute diagram showing how to red team an AI model with individual prompts
Thanks for posting this, Oliver. Looking forward to part 2.