How to Govern Externally Provided AI
From paperwork to dynamic oversight | #34
Hey đ
Iâm Oliver Patel, author and creator of Enterprise AI Governance.
Your enterprise increasingly relies on AI products and foundation models developed by other companies. This article tackles two critical challenges for AI governance leaders: i) how to move from âpaper-basedâ vendor due diligence to dynamic and continuous oversight, and ii) how to manage the novel risks of building AI systems with externally provided foundation models.
If you enjoy my work and want to read a comprehensive, step-by-step guide to enterprise AI governance, sign up to secure a 25% discount for my forthcoming book, Fundamentals of AI Governance (2026).
Over the past few weeks, this newsletter has analysed some of the most pertinent challenges facing AI governance leaders in 2025.
Part 1 explored challenges 1-3 (see below). It argued that the democratisation and widespread accessibility of AI is driving intense volume and velocity of AI use cases. This is contributing to an AI risk âvibe shiftâ, overwhelming AI governance functions, and necessitates updating the risk-based approach to AI governance.
We then took a mini detour for a deep dive on challenge 4âprotecting confidential business dataâin which I outlined the PROTECT Framework for Managing Data Risks in the AI Era. The purpose of the PROTECT Framework is to enable organisations to understand, map, and mitigate the most pertinent data risks that are fuelled by widespread adoption of generative AI.
Todayâs article continues the series by analysing challenges 5 and 6, both of which are focused on the way in which enterprises leverage, and increasingly rely on, AI models, products, and services developed and provided by external organisations. It provides practical advice and guidance on how to conduct effective vendor due diligence and oversight, what AI procurement processes should entail, and notable challenges and risks to mitigate when using pre-trained foundation models to develop and deploy customised AI applications.
As a reminder, here are the top 10 challenges this series covers.
Top 10 Challenges for AI Governance Leaders in 2025
The âdemocratisation dilemmaâ. How to maintain robust oversight and promote compliance when the ability to develop, deploy, and use AI is democratised and widely accessible?
Volume and velocity. How to keep up with the sheer volume and rapid pace of enterprise AI initiatives, whilst cutting through the noise and deploying finite resources and expertise on the highest value work?
Refining the risk-based approach. How to respond to the AI risk âvibe shiftâ and effectively target governance on the relatively small proportion of AI systems and use cases that could pose significant risks?
Protecting confidential business data. How to protect confidential business data when there is immense hunger to experiment with and use the latest AI applications that are released on the market?
Ongoing vendor due diligence and oversight. How to move beyond âpaper-basedâ vendor due diligence and apply continuous oversight on the performance, trustworthiness, and safety of externally provided AI applications?
AI engineering: building with foundation models. How to determine your rights, responsibilities, and liabilitiesâas well as the novel risks and tangible mitigationsâwhen building AI systems with foundation models provided by external organisations?
Open-source AI model oversight. How to effectively govern the widespread access and use of open-source AI models, to safeguard your organisation from legal, compliance, and cyber security risks, whilst promoting innovation?
Embedding compliance by design. How to build AI systems that promote compliance by design and default, to make it seamless for your workforce to do the right thing?
Agentic AI governance: taking the human out of the loop. How to promote responsible, meaningful, and empowered human oversight of AI, when the fundamental goal of agentic AI is to take the human out of the loop?
Digital governance silos and inefficiencies. How to effectively streamline and integrate your disparate digital governance and risk management processes and capabilities, to improve the user experience and accelerate AI innovation, whilst also strengthening your compliance posture?
5. Ongoing Vendor Due Diligence and Oversight
How to move beyond âpaper-basedâ vendor due diligence and apply continuous oversight on the performance, trustworthiness, and safety of externally provided AI applications?
Modern enterprises are becoming heavily reliant on AI products, applications, and services developed and provided by external organisations. Core enterprise workflows and domains, including recruitment, learning and development, IT support, and financial analytics are powered by bespoke vendor AI solutions. Furthermore, virtually all legacy enterprise software is being enhanced and updated with novel AI features and capabilities.
For most organisations, it makes more sense to buy rather than build. However, this amplifies various AI risks, such as unauthorised use and sharing of confidential data, liability exposure for AI performance issues and incidents, and copyright breaches.
For these reasons, AI vendor due diligence and oversight is an integral part of enterprise AI governanceâbut it is one of the hardest pillars to get right. This is because, fundamentally, organisations struggle to effectively monitor and control how well these externally provided AI applications work.
There are three core elements to the âpaper-basedâ governance referred to above:
1. Vendor due diligence assessment process. This typically consists of a set of assessments and questions that must be completed by the vendor before they are onboarded, as well as documentation and assurances they must provide.
2. Contracts, clauses and templates. This refers to standardised contractual clauses that can be included in agreements with AI vendors, covering topics like service delivery, data use, regulatory compliance, indemnity protection, and allocation of responsibilities and liabilities.
3. AI policies, covering procurement and partnerships. Finally, enterprise AI policy foundations should outline the principles, requirements, and processes for AI procurement and partnerships.
The challenge is that these core elements are necessary, but not sufficient, for effective AI vendor oversight and risk management. In isolation, they can represent relatively blunt instruments that risk creating the impression of robust governance and risk management without significantly impacting outcomes or mitigating the most serious risks.
For example, pre-vendor onboarding due diligence covers a snapshot in time and does not enable dynamic response to future issues or concerns that may arise. This is especially problematic given the pace at which vendors are updating and evolving their applications, and the importance of monitoring performance, reliability, impacts, and incidents across a large organisation.
Practical solutions
Pre-deployment testing and PoCs: before onboarding new solutions and signing hefty contracts, enterprises should perform testing and evaluation of vendor AI solutions. This testing can focus on both performance (i.e., how well it works for various use cases) as well as safety (i.e., to what extent are guardrails effective). This can include undertaking proof-of-concept (PoC) engagements and leveraging sandbox-type environments. This provides an additional layer of confidence over solely paper-based assessment exercises. Simply put, the best way to determine how well an AI solution will work for your organisation is to rigorously test and evaluate it.
Ongoing monitoring and evaluation: once externally provided AI solutions are deployed in production, they should be monitored and evaluated on an ongoing and dynamic basis. Enterprises should not merely rely on vendorsâ internal monitoring processes and the written assurances they provide pre-onboarding. This should be augmented with independent testing and monitoringâperhaps even involving third partiesâoptimised for their use cases. This approach is especially important for mission critical applications. This can be both quantitative (e.g., automated tracking of output accuracy) and qualitative (e.g., user feedback surveys and incident reports). Ideally, quantifiable performance metrics and thresholds (or baselines) should be contractually guaranteed, with robust response procedures in case of deviations or deficiencies.
New AI features and capabilities: the pre-vendor onboarding due diligence, and the resulting contract that is signed, often fails to account for the myriad ways in which the product could be updated with new AI features or capabilities. Oftentimes, such new AI features or capabilities integrated into an existing product can materially change the risk level, compliance scope, or potential impact of the overall product. Therefore, mechanisms must be in place for formal assessment and evaluation of new AI features and capabilities that trigger certain risk-based criteriaâeven when the vendor and the product they provide is already approved and deployed in production.
Shaping product implementation and roadmaps: the best way to be aware of product enhancements, including the integration of new AI features and capabilities that can amplify risk, is to maintain active and focused engagement with the vendor. Part of this should include working with the vendor to implement the product in a responsible way, taking advantage of all available guardrails, monitoring, and safety featuresâas well as the expertise they should have from countless prior deployments. More broadly, this is an underappreciated facet of enterprise AI governance. The more influence you have on the vendor and their products, the more effective you will be at managing and mitigating risks.
6. AI Engineering: Building with Foundation Models
How to determine your rights, responsibilities, and liabilitiesâas well as the novel risks and tangible mitigationsâwhen building AI systems with foundation models provided by external organisations?
The generative AI boom of the past few years has fundamentally reshaped the nature of enterprise AI. Almost all enterprises are now building AI applications with pre-trained foundation models. And it is becoming less common for organisations to independently train and develop AI models for deployment in production.
Chip Huyen characterises âAI engineeringâ as a new discipline, distinct from traditional machine learning engineering, that centres around building production applications (i.e., AI systems) using readily available foundation models developed and provided by external organisations, rather than training AI models from scratch.
Although AI and data science teams have always used external models, components, packages, toolkits, and libraries, AI engineering has quickly emerged as the dominant modality for enterprise AI development and deployment today.
This is true for both major enterprises and the vendors that serve them. For the latter, many of the vendor-provided AI products discussed above are ultimately software wrappers around advanced foundation models developed by AI labs. For the former, many of their internal AI use cases and AI systems leverage those same models, through services enabling foundation model consumption and use via APIs.
This challenges traditional AI governance frameworks and regulations, many of which were developed primarily during the traditional machine learning era. For example, the AI engineering modality disrupts the binary and outdated distinction between internal AI development of AI systems and procurement and use of externally provided AI solutions. Today, AI development is hybrid, as internal engineering teams build with external models. AI engineering also challenges the notion of AI systems designed and intended for a specific use case (or set of use cases), given the advanced, general-purpose capabilities of these frontier models.
Given that enterprises cannot rely solely on mainstream AI governance frameworks, standards, and regulations to surface and mitigate the myriad risks that building with foundation models entails, AI governance leaders need to do the heavy lifting themselves. Although there exists a broad spectrum of approaches and techniques for AI engineeringâincluding prompt engineering, context management, retrieval augmented generation (RAG), and model fine-tuningâthere are some shared risks and challenges that apply across the board.
Although this is not an exhaustive list, here are five of the most notable AI engineering challenges enterprises are currently grappling with:
EU AI Act obligations for general-purpose AI (GPAI) model providers: it is important for enterprises to verify that the GPAI model providers they work with are complying with their applicable obligations under the AI Act and adhering to the GPAI Code of Practice (if they are a signatory). The most important aspect of this is obtaining (and adhering to) the technical documentation (e.g., instructions for use, usage policy, training data summary etc.) provided by the GPAI model provider, prior to building an AI system which integrates that GPAI model. Furthermore, enterprises must also track scenarios where they are using extremely large amounts of compute for GPAI model modification (i.e., fine-tuning), as this can trigger GPAI model provider status if certain thresholds are breached. See this previous newsletter article for a deep dive on the topic.
Jurisdictional access restrictions: there are various jurisdictions that restrict access to, and use of, certain types of AI models and services. This includes use case type restrictions that apply in specific scenarios, as well as broad restrictions that always apply. For example, Chinaâs Generative AI Services law restricts the types of AI models that can be integrated into âpublic-facingâ AI applications, in order to prevent unauthorised information from being shared to the public. Also, various sanctions and export control regimes, such as the EUâs sanctions against Russia, restrict which AI models, services, and technologies can be provided or made available from one jurisdiction to another.
Fragmentation of vendor terms and acceptable use policies: enterprises typically pay for access to various foundation model platforms, to enable on-demand consumption and use of a buffet of pre-trained foundation models. A key challenge is that different AI model providers have different terms of service and acceptable use policies, at both the platform level and the model level. For example, the overarching terms of service governing an organisationâs use of foundation model platforms provided by organisations like Google, Microsoft, and IBM will always differ, perhaps in important ways. And specific model providers, which enable access to their models for consumption via these platforms, have differing acceptable use policies and âpass throughâ terms that apply to the use of their foundation models. This becomes extremely complex to keep track of, especially for the average engineering team that simply wants to get on with their job.
Black box opacity: it is widely accepted that explainability is difficult to achieve in the context of generative AI, in part due to the complexity and vastness of model architectures and the difficulties even the technical experts who developed those models have in understanding, interpreting, and explaining their outputs. This problem is amplified when using pre-trained, proprietary generative AI models, as you usually do not have access to the model weights and most likely have limited information about its training data sources. Therefore, it is virtually impossible to operationalise explainability in a meaningful way, which is a legal requirement or ethical imperative in certain scenarios.
Compliance and reputational contagion: there is intense media and political scrutiny on every move made by the major AI companies. The more your organisation builds products and applications with the models they provide, the more exposed you may be in case something dramatic goes wrong. The reputational risks of working with certain vendors, as well as their AI and data compliance posture, are key considerations when selecting which foundation models to use.
Practical solutions
It is difficult to outline comprehensive solutions to these novel challenges without proposing a full enterprise AI governance framework, which is clearly beyond the scope of this article. However, that is what you can expect from my forthcoming book.
However, what can be said is that the common thread across these challenges is dependency and complexity. The imperative for AI governance leaders is to:
manage the dependency proactively rather than reactively; and
simplify the complexity of this domain into practical and actionable guidance for the wider organisation.
Foundation model providersâlike AI vendors providing domain-specific AI products and servicesârequire continuous and dynamic oversight and monitoring, as opposed to one-time procurement due diligence and contracts that are left to gather dust. A key part of this is maintaining centralised visibility regarding which models are approved and available for use, which jurisdictions these models can be used in, what the applicable terms of use are, which use cases and AI systems use which foundation models, and how much compute these engineering teams are using. Doing this requires modern tooling, such as an enterprise AI catalogue, which is linked to dynamic AI governance assessment and monitoring processes, as well as role-based training for AI engineering and data science teams, so that they are aware of their responsibilities.
Finally, do not rely solely on vendor assurances about compliance, safety, or performance. Independently test and evaluate foundation model performance for your specific use cases and with your data, as this matters much more than external benchmarks. And implement and document compensating controls if inherent limitations like explainability cannot be overcome.
Good luck!




Strong breakdown especially the distinction between paper-based due diligence and continuous oversight. What we consistently see is that dependency risk becomes acute when organisations cannot reconstruct what changed, when, and under which model or policy state. Foundation model complexity isnât just legal or reputational itâs evidentiary. Ongoing assurance requires tamper-evident decision and model lineage, not just stronger contracts.
Well written Oliver. The attack vector has increased many folds and we need new tools and frameworks to manage risk. Your newsletter and frameworks are in the right direction. We at Secure AI have been implementing AI guardrails and governance platform keeping some of this frameworks in mind. Happy to share more details if needed. You can learn more here. https://secureaillc.com/product.html