I’ve been closely following the evolution of artificial intelligence, and the rise of Agentic AI—systems capable of autonomous decision-making, environmental interaction, and goal-oriented behaviour—marks a thrilling yet daunting leap forward. Unlike traditional AI, which operates within narrowly defined parameters, Agentic AI can independently reason, plan, and act, often with minimal human oversight. While this autonomy promises transformative benefits across industries, it also introduces a host of new risks that demand urgent attention. In this blog, I’ll explore these risks in detail, using real-world examples to illustrate their implications, and look at current methods for auditing and mitigating them. I’ll also share my perspective on what needs to be done, including the intriguing concept of using AI to monitor AI.

What is Agentic AI?
Before diving into the risks, let’s clarify what Agentic AI entails. These are not your run-of-the-mill chatbots or predictive models. Agentic AI systems, such as OpenAI’s rumoured ‘Operator’ or Cognition Labs’ ‘Devin’ (an autonomous software engineer), can perceive their environment, make decisions, and execute complex tasks—think coding, booking travel, or managing customer service inquiries—without constant human input. Their ability to adapt, learn, and optimise strategies in real time makes them powerful collaborators but also potential sources of chaos if not properly governed. As Gartner hails Agentic AI as a top technology trend for 2025, the stakes couldn’t be higher.
New Risks Introduced by Agentic AI
The autonomy and interconnectedness of Agentic AI systems give rise to risks that are distinct from those of traditional AI. Below, I outline the key categories, supported by examples to highlight their real-world implications.
- Unintended or Harmful Decisions (Autonomy Risks)
Agentic AI’s ability to act independently can lead to unintended consequences, especially in high-stakes environments. Because these systems rely on probabilistic models and dynamic learning, they may misinterpret data or prioritise efficiency over safety.
• Example: Klarna’s AI Misstep
Swedish fintech Klarna reported that its AI assistant was handling the work of roughly 700 customer-service agents and leaned on it heavily to boost efficiency. The company later acknowledged that scaling back human oversight had hurt service quality and customer satisfaction, and it began bringing human agents back into the loop. This illustrates how over-reliance on autonomous systems can backfire, particularly when nuanced human judgement is required.
• Implication: In critical sectors like healthcare or finance, such errors could be catastrophic—imagine an AI health assistant prescribing incorrect treatments due to flawed data interpretation or an AI trading system triggering market instability through erratic trades.
- Security Vulnerabilities and Adversarial Attacks
Agentic AI’s integration with external systems and vast datasets expands its attack surface, making it a prime target for cyberattacks, data breaches, and adversarial manipulation.
• Example: SolarWinds Supply Chain Attack
In 2020, the SolarWinds cyberattack compromised the Orion software platform by injecting malware into its update process, affecting thousands of clients, including government agencies. While not an AI incident itself, it exposed how fragile software supply chains can be, and Agentic AI systems, with their complex dependencies on models, data, and third-party tools, inherit and amplify that exposure. Malicious actors could manipulate training data (data poisoning) or exploit coding flaws to trick AI agents into harmful actions.
• Implication: A hacked AI agent managing critical infrastructure could leak sensitive data or execute unauthorised actions, such as shutting down power grids or misdirecting autonomous vehicles.
- Ethical Concerns and Bias Amplification
Agentic AI systems, trained on historical data, can perpetuate biases or make ethically questionable decisions, especially when operating autonomously.
• Example: Healthcare AI Bias
If an AI-powered health assistant is trained on skewed datasets (e.g., underrepresenting certain demographics), it could develop biased treatment plans, putting patients at risk. This is particularly concerning for Agentic AI, which may act on these biases without human intervention.
• Implication: Ethical lapses could erode public trust and lead to legal repercussions, especially under regulations like the EU’s AI Act, which mandates fairness and accountability.
- Lack of Explainability (The Black Box Problem)
The opaque decision-making processes of Agentic AI—often powered by complex neural networks—make it difficult to understand how decisions are reached, complicating accountability and trust.
• Example: Auditing Challenges
Foundation models powering AI agents are notoriously difficult to audit due to their lack of transparency. If an AI agent denies a loan application, regulators or customers may demand an explanation, but the system’s ‘black box’ nature could render this impossible, violating rights under laws like the GDPR.
• Implication: Without explainability, organisations risk non-compliance and public backlash, particularly in sectors requiring transparency, such as finance or public services.
- Failure Cascades in Interconnected Systems
Agentic AI’s integration into larger systems means a single error can trigger a domino effect, destabilising entire workflows or infrastructures.
• Example: Hypothetical Military Scenario
A simple-reflex AI agent used in an autonomous missile defence system could misinterpret a signal as a threat, triggering a counterstrike that escalates into conflict. This risk is heightened if both adversaries rely on similar autonomous systems, creating a feedback loop of errors.
• Implication: Failure cascades could disrupt supply chains, financial markets, or critical infrastructure, with far-reaching societal consequences.
- Over-Reliance and Automation Bias
As organisations delegate more tasks to Agentic AI, there’s a risk of over-reliance, eroding human skills and critical thinking.
• Example: Job Displacement and Skill Degradation
The commoditisation of tasks like coding or sales lead generation by AI agents could displace white-collar jobs, exacerbating socioeconomic inequalities. Moreover, employees may become overly dependent on AI recommendations, losing the ability to make independent decisions.
• Implication: Over-reliance could lead to operational vulnerabilities if AI systems fail or are unavailable, and societal pushback against job losses could fuel resistance to AI adoption.
- Privacy Violations
Agentic AI’s need for vast datasets increases the risk of privacy breaches, especially when handling sensitive personal or corporate information.
• Example: Data Leakage in NLP Models
Natural language processing models, like those used in AI agents, can inadvertently leak sensitive training data (e.g., personal details or trade secrets) when generating outputs, posing significant privacy risks (a crude screen for this kind of verbatim leakage is sketched after this list).
• Implication: Breaches could lead to regulatory fines under laws like the GDPR and damage organisational reputations.
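To make the data-leakage risk above more tangible, here is a minimal Python sketch of the kind of crude screen a team might run over an agent’s outputs: it checks generated text for verbatim word n-gram overlap with known sensitive training records. This is only an illustration under simplifying assumptions; the example strings, the n-gram length, and the very idea of having a list of sensitive records to compare against are all assumptions, and real leakage or membership-inference testing is considerably more involved.

```python
def ngrams(text, n=5):
    """Return the set of word n-grams in a piece of text (case-insensitive)."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def leaked_ngrams(generated, sensitive_records, n=5):
    """Find n-grams of the generated text that also appear verbatim in sensitive records."""
    sensitive = set()
    for record in sensitive_records:
        sensitive |= ngrams(record, n)
    return ngrams(generated, n) & sensitive

# Invented example: one sensitive training record and one model response.
training_records = [
    "customer jane doe account number 12345678 sort code 11-22-33 balance 5000",
]
model_output = (
    "Sure, the reference details are jane doe account number 12345678 sort code 11-22-33."
)

hits = leaked_ngrams(model_output, training_records)
if hits:
    print(f"Possible training-data leakage detected: {sorted(hits)}")
```

In practice such a check would sit alongside stronger measures (differential privacy, output filtering, access controls), but even a simple overlap scan can catch the most blatant regurgitation.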
Current Standards and Practices: A Technical Overview
To manage these risks, several standards and practices have been developed. Below, I provide a detailed look at their technical underpinnings, with examples of their application.
- Standards for AI Risk Management
• NIST AI Risk Management Framework (AI RMF)
◦ Technical Details: The AI RMF, released in January 2023, is organised around four core functions: GOVERN (policy creation), MAP (risk identification), MEASURE (quantitative assessment using metrics like accuracy, fairness, and robustness), and MANAGE (risk mitigation through controls like human-in-the-loop review). Organisations applying it typically use statistical fairness metrics (e.g., demographic parity) to evaluate bias and adversarial robustness tests to assess security.
◦ Example: A healthcare AI provider might use the AI RMF to measure bias in a diagnostic model by calculating the false positive rate across demographic groups, ensuring compliance with ethical standards (a minimal sketch of this kind of check appears after this list of standards).
◦ Limitation: The framework is high-level and lacks specific guidance for Agentic AI’s dynamic planning capabilities, such as auditing the decision trees produced by Monte Carlo Tree Search (MCTS)-based planners.
• ISO/IEC 42001 (AI Management System Standard)
◦ Technical Details: Released in December 2023, ISO/IEC 42001 specifies requirements for an AI management system, focusing on risk assessment, transparency, and accountability. In practice it is implemented through controls such as logging mechanisms that track AI decisions (e.g., storing input-output pairs) and regular audits, which often draw on explainability techniques such as SHAP (SHapley Additive exPlanations).
◦ Example: A financial institution might use ISO/IEC 42001 to log decisions made by an AI trading agent, enabling post hoc analysis to ensure compliance with regulations like MiFID II.
◦ Limitation: Logging alone doesn’t address the real-time reasoning of Agentic AI, where decisions evolve dynamically.
• EU AI Act (2025 Implementation)
◦ Technical Details: The EU AI Act classifies AI systems by risk level (e.g., high-risk for healthcare or finance) and mandates requirements like data quality checks, bias mitigation (e.g., using fairness-aware algorithms), and transparency (e.g., providing decision logs). For high-risk systems, it requires conformity assessments, which in practice may involve adversarial testing and model interpretability methods such as LIME (Local Interpretable Model-agnostic Explanations).
◦ Example: An autonomous vehicle manufacturer might conduct adversarial testing on its AI agent, simulating attacks like sensor spoofing to ensure compliance with the EU AI Act.
◦ Limitation: The Act focuses on static models, not the iterative learning and planning of Agentic AI, which can change behaviour post-deployment.
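To ground the bias-measurement example above, here is a minimal Python sketch of a group-level fairness check of the kind the AI RMF’s MEASURE function encourages: the false positive rate computed separately for each demographic group, plus the gap between groups (demographic parity would compare positive-prediction rates in the same way). The data, the group labels, and what counts as an acceptable gap are illustrative assumptions, not something prescribed by any of these standards.

```python
import numpy as np

def false_positive_rate(y_true, y_pred):
    """FPR = FP / (FP + TN), computed over truly negative cases only."""
    negatives = (y_true == 0)
    if negatives.sum() == 0:
        return float("nan")
    return float(((y_pred == 1) & negatives).sum() / negatives.sum())

def fpr_by_group(y_true, y_pred, groups):
    """Compute the false positive rate separately for each demographic group."""
    return {g: false_positive_rate(y_true[groups == g], y_pred[groups == g])
            for g in np.unique(groups)}

# Illustrative predictions from a hypothetical diagnostic model (1 = flagged as high risk).
y_true = np.array([0, 0, 1, 0, 1, 0, 0, 1, 0, 0])
y_pred = np.array([0, 1, 1, 0, 1, 1, 0, 1, 1, 0])
groups = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])

rates = fpr_by_group(y_true, y_pred, groups)
gap = max(rates.values()) - min(rates.values())
print(rates)                                  # per-group false positive rates on this toy data
print(f"FPR gap between groups: {gap:.2f}")   # a large gap signals unequal treatment
```

A real audit would use far more data, confidence intervals, and several complementary metrics, but the core computation is this simple, which is exactly why regulators can reasonably expect it.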
Current Methods and Measures for Auditing and Mitigating Risks
To address these risks, organisations and regulators are developing frameworks, tools, and practices to ensure Agentic AI is deployed responsibly. Below, I outline key approaches, supported by examples.
- Auditing Frameworks
• NIST AI Risk Management Framework (AI RMF)
Released in January 2023, the NIST AI RMF provides a structured approach to risk management through its four functions: GOVERN (establishing policies), MAP (identifying context and interdependencies), MEASURE (assessing impacts), and MANAGE (responding to risks). As noted in the previous section, it’s widely used to assess AI trustworthiness and compliance.
• ISO/IEC Standards
Standards like ISO/IEC 27001 and ISO 31000 guide organisations in auditing AI systems for security and risk management, incorporating stress testing and bias audits.
• Example: Citibank’s AI Auditing
Citibank uses AI to improve trade compliance by auditing unstructured data for patterns, ensuring regulatory adherence. This demonstrates how AI can enhance audit accuracy but requires skilled auditors to interpret results.
- Mitigation Strategies
• Human-in-the-Loop Oversight
Maintaining human review of AI decisions, especially in high-stakes scenarios, mitigates autonomy risks (a minimal sketch of such an approval gate appears at the end of this section). For instance, DeepMind tested AlphaGo across millions of simulated games to fine-tune its behaviour, with human researchers overseeing how it was evaluated and deployed.
• Explainable AI (XAI)
Researchers are developing XAI methods to make AI decision-making transparent. For example, post hoc explanation techniques help auditors understand AI outputs, aiding compliance with regulations like the Equal Credit Opportunity Act.
• Adversarial Testing and Red-Teaming
Red-teaming exercises, like those conducted by ActiveFence, simulate attacks and misuse scenarios to expose vulnerabilities in AI agents, enabling proactive mitigation.
• Robust Governance Frameworks
Organisations are establishing clear policies for AI deployment, including access controls, continuous monitoring, and ethical guidelines. The EU AI Act, whose obligations begin phasing in from 2025, mandates such frameworks for high-risk AI systems.
- Regulatory Compliance
• EU AI Act
The EU AI Act requires providers of high-risk AI systems to implement risk management measures, such as fine-tuning models to reduce biases or using safety filters to detect harmful outputs.
• Third-Party Audits
Independent audits, as recommended by the US Federal Trade Commission, evaluate AI fairness and performance, ensuring accountability.
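To illustrate the human-in-the-loop oversight discussed above, here is a minimal Python sketch of an approval gate: low-risk actions proposed by an agent are executed automatically, while anything above a risk threshold is escalated to a human reviewer. The Action type, the risk scores, and the 0.7 threshold are hypothetical; a production system would derive the risk score from a dedicated model and route reviews through a proper queue rather than a console prompt.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Action:
    description: str
    risk_score: float  # 0.0 (harmless) to 1.0 (high stakes); assumed to come from a risk model

def human_review(action: Action) -> bool:
    """Stand-in for a real review queue: ask a human operator to approve or reject."""
    answer = input(f"Approve action '{action.description}' "
                   f"(risk {action.risk_score:.2f})? [y/N] ")
    return answer.strip().lower() == "y"

def execute_with_oversight(action: Action,
                           execute: Callable[[Action], None],
                           risk_threshold: float = 0.7) -> None:
    """Execute low-risk actions automatically; escalate high-risk ones to a human."""
    if action.risk_score >= risk_threshold and not human_review(action):
        print(f"Blocked: {action.description}")
        return
    execute(action)

# Example usage with a dummy executor.
execute_with_oversight(
    Action("refund customer £20", risk_score=0.2),
    execute=lambda a: print(f"Executed: {a.description}"),
)
execute_with_oversight(
    Action("transfer £250,000 between accounts", risk_score=0.95),
    execute=lambda a: print(f"Executed: {a.description}"),
)
```

The design point is that the agent never holds the authority to execute high-impact actions on its own; it can only propose them.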
Using AI to Monitor and Guard AI
An innovative approach to managing Agentic AI risks is leveraging AI itself to monitor and safeguard other AI systems. This concept, akin to a digital watchdog, is gaining traction as a scalable solution.
• AI-Driven Fraud Detection
In financial systems, AI agents can monitor other AI agents for unusual behaviour, such as erratic trades or data anomalies, flagging potential fraud or errors in real time (a toy version of such a monitor is sketched below).
• SentinelOne’s Autonomous Threat Detection
SentinelOne applies behavioural AI to detect abnormal activity across endpoints and workloads and to trigger automated responses; the same approach can be pointed at the infrastructure AI agents run on to catch threats like data exfiltration or model backdoors.
• Challenges: AI monitoring AI introduces complexity, as the monitoring system itself must be secure and unbiased. Regular audits and human oversight are essential to prevent recursive errors or blind spots.
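As a deliberately simple illustration of the ‘digital watchdog’ idea, here is a Python sketch that watches a stream of trade sizes produced by another agent and flags any value that deviates sharply from recent behaviour, using a rolling z-score. The window size, the threshold, and the notion of monitoring raw trade sizes are assumptions for illustration; real behavioural monitoring systems model far richer signals.

```python
from collections import deque
import math

class TradeWatchdog:
    """Flags trades that deviate sharply from the monitored agent's recent behaviour."""

    def __init__(self, window: int = 50, z_threshold: float = 4.0):
        self.history = deque(maxlen=window)
        self.z_threshold = z_threshold

    def observe(self, trade_size: float) -> bool:
        """Return True if the trade looks anomalous relative to recent history."""
        flagged = False
        if len(self.history) >= 10:  # need a minimal baseline before judging
            mean = sum(self.history) / len(self.history)
            var = sum((x - mean) ** 2 for x in self.history) / len(self.history)
            std = math.sqrt(var) or 1e-9  # avoid division by zero on a flat history
            flagged = abs(trade_size - mean) / std > self.z_threshold
        self.history.append(trade_size)
        return flagged

# Example: a stream of routine trades followed by one erratic order.
watchdog = TradeWatchdog()
for size in [100, 105, 98, 102, 110, 97, 101, 99, 103, 100, 104, 50_000]:
    if watchdog.observe(size):
        print(f"ALERT: trade of {size} flagged for human review")
```

Even a monitor this crude highlights the recursive problem noted above: someone still has to audit the watchdog’s own thresholds and blind spots.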
Why Current Standards and Practices Are Not Enough
While these standards and practices provide a foundation, they fall well short of addressing the unique challenges posed by Agentic AI’s increased agency, planning, and reasoning. Here’s why, from both a technical and strategic perspective:
1. Static Frameworks vs. Dynamic Systems
Standards like the EU AI Act and NIST AI RMF are designed for static AI models, where behaviour is predictable and auditable at deployment. Agentic AI, however, evolves through continuous learning and dynamic planning. For example, an AI agent using MCTS might simulate 10,000 scenarios to make a single decision, and its reasoning path could change with each new data input (a toy illustration of this kind of data-dependent decision trace appears after this list). Current standards lack mechanisms to audit such dynamic processes in real time, leaving regulators and organisations blind to emerging risks.
2. Scalability Challenges
Practices like human-in-the-loop (HITL) review and manual auditing don’t scale for Agentic AI, which can make millions of decisions daily. For instance, an AI managing a smart city’s traffic system might adjust signals every second, making human oversight of each decision infeasible. Similarly, XAI tools like SHAP are computationally expensive and impractical for real-time analysis of multi-step reasoning, where decisions span thousands of nodes in a decision tree.
3. Inadequate Handling of Emergent Behaviours
Current practices focus on known risks, such as bias or adversarial attacks, but Agentic AI’s emergent behaviours are inherently unpredictable. For example, an AI agent tasked with optimising energy usage might inadvertently cause blackouts by over-optimising for efficiency, a failure mode that standard stress tests might not catch. Existing frameworks lack the foresight to anticipate such emergent risks, which arise from the interplay of agency, planning, and environmental interaction.
4. Lack of Real-Time Explainability
As Agentic AI’s reasoning becomes more complex (e.g., combining large language models (LLMs) with reinforcement learning (RL) and multi-agent systems (MAS)), the “black box” problem worsens. Current XAI tools can explain simple models but fail to handle the multi-step, probabilistic reasoning of Agentic AI. For instance, explaining why an AI agent chose a specific action after evaluating 1,000 future scenarios requires tracing through a probabilistic decision tree—a task beyond the capabilities of tools like LIME or SHAP.
5. Regulatory Lag
Regulations like the EU AI Act are reactive, addressing risks identified in traditional AI (e.g., bias, transparency). They don’t account for Agentic AI’s ability to autonomously set goals, plan long-term strategies, and interact with other agents, which introduces new failure modes like systemic cascades or ethical drift (where the AI’s goals diverge from human values over time).
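To make point 1 more concrete, here is a toy Python sketch of a rollout-based planner that scores candidate actions against simulated outcomes and appends a structured decision trace to an audit log. Everything here is invented for illustration: the random-walk ‘environment’, the file name, and the scoring rule. The point is that the recorded reasoning path depends on the sampled scenarios, so two reviews of the ‘same’ decision can look different unless the full trace (including the random seed) is captured at decision time, which is precisely what static, deploy-time audits miss.

```python
import json
import random
import time

def plan_with_trace(state, candidate_actions, n_rollouts=100, seed=None):
    """Toy rollout planner: score each action by simulated outcomes, log a decision trace."""
    rng = random.Random(seed)
    scores = {}
    for action in candidate_actions:
        # The 'environment model' is just noise around the current state; a real agent
        # would use a learned model, which is what makes its reasoning hard to audit.
        outcomes = [state + action + rng.gauss(0, 1) for _ in range(n_rollouts)]
        scores[action] = sum(outcomes) / n_rollouts
    chosen = max(scores, key=scores.get)

    trace = {
        "timestamp": time.time(),
        "input_state": state,
        "seed": seed,
        "n_rollouts": n_rollouts,
        "scores": scores,
        "chosen_action": chosen,
    }
    with open("decision_trace.jsonl", "a") as log:  # append-only audit trail
        log.write(json.dumps(trace) + "\n")
    return chosen

# The same input state produces different simulated scores under different seeds,
# so the reasoning path is only reconstructible if the trace is logged when the decision is made.
print(plan_with_trace(0.0, [-1.0, 0.0, 1.0], seed=1))
print(plan_with_trace(0.0, [-1.0, 0.0, 1.0], seed=2))
```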
What Needs to Be Done: My Point of View
While current methods are a strong start, the rapid adoption of Agentic AI demands bolder, more proactive measures. Here’s my take on what’s needed:
1. Global Standards for Explainability
The ‘black box’ problem must be tackled head-on. Governments and industry bodies should collaborate to mandate XAI standards, ensuring all AI agents provide clear, auditable decision trails. This would enhance trust and compliance, particularly in regulated sectors.
2. Mandatory Stress Testing
Before deployment, Agentic AI systems should undergo rigorous stress testing in simulated environments to identify failure points. This should be a regulatory requirement, akin to crash tests for cars.
3. Public Education and AI Literacy
To combat over-reliance, organisations and governments must invest in AI literacy programmes, empowering employees and citizens to critically evaluate AI outputs. This would also mitigate automation bias and support ethical AI use.
4. Hybrid AI-Human Governance Models
Rather than pitting AI against humans, we need hybrid models where AI monitors AI under human supervision. For example, an AI watchdog could flag anomalies, but a human auditor would make the final call, balancing efficiency with accountability.
5. Ethical AI by Design
Developers must embed ethical guidelines into AI agents from the outset, aligning their objectives with human values. This could involve ‘ethical circuit breakers’: mechanisms that halt AI actions if they deviate from predefined moral boundaries (a toy sketch of such a breaker follows this list).
6. Cross-Sector Collaboration
The risks of Agentic AI transcend industries. Governments, tech firms, academia, and civil society must work together to develop adaptive governance frameworks, sharing best practices and case studies to address emerging threats.
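As a rough illustration of what an ‘ethical circuit breaker’ could mean in code, here is a minimal Python sketch: a wrapper that checks every proposed action against hard, human-defined constraints and halts the agent the moment one is crossed. The specific limits (a spending cap, a list of off-limits targets) are invented for illustration; encoding real ethical boundaries is far harder than this, which is exactly why such mechanisms need to be designed in from the start rather than bolted on.

```python
from dataclasses import dataclass, field

class CircuitBreakerTripped(Exception):
    """Raised when a proposed action violates a hard constraint."""

@dataclass
class EthicalCircuitBreaker:
    # Illustrative hard limits; not a real policy.
    max_spend_per_action: float = 10_000.0
    forbidden_targets: set = field(default_factory=lambda: {"payroll_db", "patient_records"})
    halted: bool = False

    def check(self, action: dict) -> None:
        """Raise (and latch) if the proposed action crosses a predefined boundary."""
        if self.halted:
            raise CircuitBreakerTripped("agent is halted pending human review")
        if action.get("spend", 0.0) > self.max_spend_per_action:
            self.halted = True
            raise CircuitBreakerTripped(f"spend {action['spend']} exceeds the cap")
        if action.get("target") in self.forbidden_targets:
            self.halted = True
            raise CircuitBreakerTripped(f"target '{action['target']}' is off-limits")

breaker = EthicalCircuitBreaker()
breaker.check({"description": "send reminder email", "spend": 0.0})  # passes silently
try:
    breaker.check({"description": "bulk export", "target": "patient_records"})
except CircuitBreakerTripped as reason:
    print(f"Halted: {reason}")  # further actions stay blocked until a human resets the breaker
```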
Conclusion
Agentic AI is a double-edged sword, offering unprecedented autonomy and efficiency but introducing risks that could destabilise systems, economies, and societies if left unchecked. From unintended decisions and security vulnerabilities to ethical lapses and failure cascades, the challenges are as diverse as they are urgent. Current auditing frameworks like NIST’s AI RMF, mitigation strategies like human oversight, and innovative approaches like AI monitoring AI provide a solid foundation, but they’re not enough on their own.
As we stand at this technological crossroads, we must act decisively—bolstering transparency, enforcing rigorous testing, and fostering collaboration to ensure Agentic AI serves humanity rather than subverts it. By blending human ingenuity with AI’s capabilities, we can harness this transformative technology while keeping its perils at bay. What do you think—how should we balance the promise and peril of Agentic AI? Share your thoughts below, and let’s keep the conversation going.

