This blog sets out the risks associated with AI use in the aid sector and how to mitigate them through due diligence on providers, robust corporate policies, and workforce education.
At MetricsLed we write from a ringside seat. We’ve seen a surge of requests through the summer from government and private sector clients asking us to build AI capabilities into ‘public-facing’ products: for aid recipients, to evaluate grantees and applicants, and to determine, assess and anticipate the spatial demand for critical humanitarian assistance.
The benefits of using AI for these types of function are obvious; the legal, financial and reputational risks are less well understood, particularly in the upper echelons of companies and boards still typically run by analogue-minded Boomers and Gen Xers.
Boards need a clear strategy for AI and its use in delivery and administrative systems. And Boards need to take care in engaging providers and developing tools: it’s the wild west out there, and the market is flooded with new products developed in unknown jurisdictions, trained on data of contested provenance, and managed by agencies who do not always comply with regulations or apply necessary safeguards. Engaging with this sort of AI, and using it, say, to generate recommendations for sensitive Technical Assistance programmes, to draft policy or legislation, to manage public finances, or to diagnose illness, carries risk.
The blog sets out a range of steps and principles that leaders and Boards should adhere to as they consider the use of AI in corporate or delivery settings, advising at a minimum:
- Full alignment with UK AI Playbook, ICO guidance, and EU AI Act principles.
- Data Protection Impact Assessments (DPIAs) undertaken for every high-risk AI project.
- Bias testing and explainability as standard practice.
- Human oversight at all stages of delivery.
The article concludes with a rogues’ gallery of salutary case studies on scandals, fines and reputational damage that should resonate with those leadership cohorts, and demonstrate how easy it is for leaders to miss issues and take their eye off the ball if they lack well-understood and consistently applied approaches to due diligence when buying AI services or developing AI tools, and to human testing and oversight in their application and use.
Introduction
In international aid and development, the pressure to adopt AI is accelerating, whether for complex tasks such as the algorithmic profiling of aid recipients or for more mundane back-office tasks such as bid generation, grant writing and CV formatting.
This summer, LinkedIn has been lit up by a sugar rush of new providers promising all sorts of digital snake oil to the lazy, the credulous and the uninformed.
But this rush is taking many actors into uncharted and risky waters and way out in front of a complex, fast-changing and poorly understood range of regulatory guardrails.
Where personal data and human lives are at stake, AI tools must be deployed responsibly. This is not just a moral imperative; it is now a legal one, with global and UK-specific frameworks setting clear expectations for safe, transparent and fair AI.
Yet many companies, and individuals within them, are ignorant of the risks involved in the use of AI and of the sanctions that can be imposed on those found to have breached the complex and fast-changing set of regulations that underpin it.
In the sections below we set out some of these risks with reference to real world issues. We explain the contours of the regulatory regime and the sanctions that can be imposed on those who breach it. And we set out guidance for Boards on how to undertake due diligence on AI providers and services and point to the need for corporate policies on AI to mitigate these risks and build understanding and compliance within the companies they lead.
Some Real-World AI Missteps in Humanitarian Settings
- IRC’s Chatbot Risks — Misinformation & Data Security: The International Rescue Committee piloted AI-backed chatbots in El Salvador, Kenya, Greece and Italy to support displaced persons. While promising in reach, these systems risked delivering incorrect information in volatile situations and posed potential security threats to vulnerable populations.
- ‘AI for Good’ Backfired: An anti-poaching project using AI to detect wildlife threats mistakenly labelled humans as poachers. Misclassification in such contexts can have real human and reputational consequences.
- Mapping Payoffs for the Wrong Communities: Satellite imagery projects in India aiming to map poverty overestimated deprivation in some communities and underestimated it in others, demonstrating systematic bias and undermining fairness in aid planning.
- Nonprofit AI Bias in Grant Screening: AI tools used to review grant applications inadvertently favoured repeat applicants, perpetuating inequality rather than reducing it.
The Regulatory Map Is Already Here
While the aid sector often operates in challenging jurisdictions, major regulatory frameworks already set the standard for responsible AI:
- UK — A Pro-Innovation Approach to AI Regulation (White Paper, DSIT, 2023) enshrines safety, transparency, fairness, accountability, and contestability as cross-cutting principles.
- UK AI Playbook (Feb 2025): A detailed operational framework for AI in government services, covering governance, bias testing, and explainability.
- ICO Guidance on AI and Data Protection (2023): Mandates Data Protection Impact Assessments (DPIAs) for high-risk AI and sets rules on bias and inference risks.
- EU Artificial Intelligence Act (2024): A risk-based approach that bans certain AI uses and applies extraterritorially to providers outside the EU.
- US AI Bill of Rights (2022): Ethical guidance for AI developers focusing on transparency, fairness, and accountability.
Due Diligence Before You Buy AI Services
Companies face many risks as they seek to navigate the fast-changing world of AI and what it means in humanitarian settings. At one end of the spectrum sits the risk of missing the boat: the Luddites who spurn the advance of tech and the Purists horrified that AI is being used to generate technical writing or proposals.
But the reality is that all companies need to embrace AI for delivery and for back-office tasks if they are to remain relevant and competitive. The risk at the other end of the spectrum is over-enthusiastic or ad hoc adoption of AI with limited oversight or guiding strategy.
Safe adoption of AI must be based on informed oversight and thorough due diligence of both external AI providers and internal AI builds.
Key areas to evaluate include:
- Regulatory Compliance Check: Evidence of alignment with the UK AI Playbook, ICO guidance, and relevant foreign frameworks. Completed DPIAs for any personal or sensitive data processing.
- Data Security & Privacy Review: Clear policies on encryption, storage location, and access control. Assurances on training data confidentiality.
- Bias & Fairness Testing: Demonstrable testing across relevant demographics and geographies. Transparent processes for explaining AI decisions.
- Human Oversight: Human-in-the-loop review for all high-stakes outputs. Documented escalation and correction processes.
- Vendor Transparency: Disclosure of algorithms, training data sources, and update cycles. Service-level agreements covering error correction and audit rights.
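The bias and fairness testing step above can be made concrete with a simple disparate-impact check on a tool’s decisions, for example an AI grant-screening system. This is a minimal sketch, not a substitute for a full fairness audit: the group labels, the decision data, and the use of the “four-fifths rule” threshold are illustrative assumptions, not a prescribed standard.

```python
from collections import defaultdict

def selection_rates(decisions):
    """Approval rate per demographic group.

    `decisions` is a list of (group, approved) pairs, where `approved`
    is True if the AI tool recommended the applicant.
    """
    totals, approved = defaultdict(int), defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        if ok:
            approved[group] += 1
    return {g: approved[g] / totals[g] for g in totals}

def disparate_impact(decisions):
    """Ratio of the lowest group selection rate to the highest.

    A common screening heuristic (the 'four-fifths rule') treats a
    ratio below 0.8 as a signal of possible adverse impact that
    warrants deeper investigation.
    """
    rates = selection_rates(decisions)
    return min(rates.values()) / max(rates.values())

# Hypothetical screening decisions for two applicant groups.
decisions = [
    ("group_a", True), ("group_a", True), ("group_a", False), ("group_a", True),
    ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
]
# group_a approved at 0.75, group_b at 0.25: the ratio falls below 0.8
assert disparate_impact(decisions) < 0.8  # flagged for review
```

A check like this is only a tripwire; a vendor should also be able to show how flagged disparities are investigated and corrected.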
The Risks of Ignoring Due Diligence
- Legal & Financial Sanctions: UK GDPR fines up to £17.5 million or 4% of global turnover. Contract disqualification for public sector and donor-funded projects.
- Funding Loss: Donors and philanthropies are adding AI compliance clauses; a breach can terminate a contract instantly.
- Reputational Harm: Public exposure of bias, data breaches, or harmful recommendations can erode public trust and donor confidence.
- Operational Damage: Misguided outputs can divert aid, exclude vulnerable groups, or cause harm.
MetricsLed’s Commitment: Responsible AI Built for Aid
At MetricsLed, we don’t just innovate; we regulate ourselves to the highest standards and practice what we preach, with a commitment to ensuring that all our products, whether in use, in testing or in development, are underpinned at a minimum by:
- Full alignment with UK AI Playbook, ICO guidance, and EU AI Act principles.
- Mandatory DPIAs for every high-risk AI project.
- Bias testing and explainability as standard practice.
- Human oversight at all stages of delivery.
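The human-oversight principle above can be sketched as a release gate that holds high-stakes outputs until a named human reviewer signs off. This is an illustrative sketch under stated assumptions: the `Recommendation` type, its fields, and the reviewer name are hypothetical, not any particular product’s API.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Recommendation:
    """An AI-generated output awaiting release."""
    subject: str
    high_stakes: bool                  # e.g. affects aid eligibility
    reviewed_by: Optional[str] = None  # human reviewer who signed off

def release(rec: Recommendation) -> bool:
    """Release low-stakes outputs automatically; hold high-stakes
    outputs until a human reviewer has been recorded."""
    if rec.high_stakes and rec.reviewed_by is None:
        return False  # held for escalation, not auto-released
    return True

draft = Recommendation("grant ranking for district X", high_stakes=True)
assert release(draft) is False  # blocked until a human signs off
draft.reviewed_by = "programme.officer"
assert release(draft) is True
```

The point of the pattern is that the audit trail (who reviewed what, and when) is recorded in the data itself, which is what documented escalation and correction processes require.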
When you buy our products and platforms, you are buying AI tools that are custom-built for the aid sector and have been developed, trained and tested in line with these principles.
Do some clients roll their eyes at this stuff? Yes they do. Do we bang on about it pedantically? Yes. And yes, it’s complex, technical, legalistic and it can be dull. But it’s important, and getting more important every day, and the risks of kicking it into the long grass are increasingly existential.
Conclusion
The next big wave in international development will be the use of AI to solve some of the world’s most complex aid, development and humanitarian problems.
Set against an otherwise gloomy global picture, the potential of AI is dizzying: from medical diagnostics without doctors and algorithms to identify corruption and fraud in public procurement to AI tutors in low-bandwidth environments. The pressure on individuals, companies, organisations and governments to adopt AI is building, and many are adopting it in a piecemeal and ad hoc fashion.
Leaders must get their agencies match fit: to understand the upside potential of AI, to set out a clear strategy for its use in administrative and delivery terms, and to act consciously to mitigate the risks inherent in its use.
Case Studies: When AI Goes Wrong — and What It Costs
- Dutch Child-Benefits Scandal (‘Toeslagenaffaire’)
What happened: The Dutch tax authority used algorithmic risk scoring to flag “fraud” in childcare benefits. Thousands of families—disproportionately with dual nationality—were wrongly accused and financially devastated.
Consequence: A national scandal, ministerial resignations, and long‑term reparations.
Lesson for aid: Risk scoring on incomplete or biased data can systemically target protected groups and destroy public trust.
- Netherlands’ SyRI Welfare-Fraud System Struck Down in Court
What happened: The SyRI model pooled multiple government datasets to predict welfare fraud.
Consequence: In 2020, a Dutch court ruled SyRI unlawful under human-rights and privacy law (opaque, overly intrusive, unclear purpose).
Lesson for aid: Black‑box models that lack necessity, proportionality and explainability face litigation risk.
- UK A‑Level Grading Algorithm (2020)
What happened: A standardisation algorithm downgraded ~39% of teacher‑predicted grades, disproportionately harming students from state schools/smaller cohorts.
Consequence: Mass protests, rapid government U‑turn, leadership fallout at Ofqual—significant reputational damage.
Lesson for aid: Even “well‑intended” models can embed structural bias; transparency and stakeholder testing are non‑negotiable.
- South Wales Police — Live Facial Recognition Unlawful (Bridges v. SWP)
What happened: Live facial recognition was deployed in public spaces.
Consequence: UK Court of Appeal ruled the use unlawful (inadequate safeguards, watchlist criteria, and equality assessment).
Lesson for aid: High‑risk biometrics without robust governance invites court defeat and brand damage.
- Clearview AI — EU Sanctions
What happened: Face-scraping without consent; processing biometric data of EU residents.
Consequence: French CNIL imposed a €20M fine and ordered data deletion; similar enforcement pressure across the EU.
Lesson for aid: Scraping/biometrics without a lawful basis is a fast track to EU enforcement and funding ineligibility.
- DeepMind–Royal Free NHS (UK)
What happened: 1.6M patient records shared for an app pilot (AKI detection) without adequate patient information/legal basis.
Consequence: ICO ruled the data sharing unlawful; mandatory remedial actions and intense media scrutiny.
Lesson for aid: “Move fast” with sensitive data—without DPIAs, transparency and clear legal basis—ends in regulatory action.
- Amazon Recruiting AI Biased Against Women
What happened: An experimental CV‑screening model down‑ranked women for technical roles, learning bias from historical hiring data.
Consequence: System scrapped; reputational hit illustrating the perils of proxy discrimination.
Lesson for aid: Training on skewed histories replicates exclusion unless bias testing is rigorous.
- Apple Card Algorithm — Gender Bias Allegations & Regulatory Probe
What happened: Viral claims of unequal credit limits between spouses triggered an NYDFS investigation into algorithmic lending fairness.
Consequence: No disparate impact was ultimately found, but the incident created a high‑profile reputational storm and forced detailed explainability.
Lesson for aid: Even when models pass legal tests, lack of explainability can cause public backlash and regulatory scrutiny.
- COMPAS Risk Scores — Criminal Justice Bias Debate
What happened: ProPublica reported racial bias in recidivism risk scoring; subsequent academic debate on fairness trade-offs ensued.
Consequence: Enduring reputational controversy around opaque, consequential AI; policy pressure for transparency and auditing.
Lesson for aid: In high‑stakes settings, contested “fairness” metrics can still sink trust—publish testing methods and error trade‑offs.
- Generative AI & GDPR — Italian Garante Action on ChatGPT
What happened: Italy’s regulator found GDPR issues (lawful basis, transparency, age checks), temporarily restricting ChatGPT and proposing fines unless remedies adopted.
Consequence: Platform changes + ongoing enforcement exposure (~€15M fine proposal).
Lesson for aid: Using large models without a GDPR‑compliant wrapper (DPIAs, age gates, notices) risks sudden service disruption and penalties.
Endnotes
Politico Europe, “Dutch scandal serves as a warning for Europe…” (child‑benefits scandal).
Amnesty International, “Dutch childcare benefit scandal… ban racist algorithms.”
IAPP, “Digital welfare fraud detection and the Dutch SyRI judgment.”
WIRED, “Everything that went wrong with the botched A‑Levels algorithm.”
LSE Impact Blog, “What the world can learn from the UK’s A‑level grading fiasco.”
Wikipedia overview of the 2020 grading controversy.
Guardian, “South Wales police lose landmark facial recognition case.”
Hunton Andrews Kurth, “UK Court of Appeal Finds Automated Facial Recognition Unlawful (Bridges).”
EDPB/CNIL, “French SA fines Clearview AI EUR 20 million.”
Hunton Privacy Blog, “CNIL fines Clearview AI 20M.”
WIRED, “NHS Trust broke data law sharing with DeepMind” (ICO ruling coverage).
Quartz, “DeepMind was unlawfully given 1.6M NHS records.”
Reuters, “Amazon scraps AI recruiting tool biased against women.”
NYDFS, “Report on Apple Card Investigation.”
ProPublica, “Machine Bias” and methods note.
AP News & Lewis Silkin, “Italy tells OpenAI ChatGPT violates GDPR” / “Garante strikes again.”