The Coming Storm: Why Banks’ Model Risk Management Is Struggling with GenAI
How AI, Regulation, and Complexity Are Outpacing Traditional SR 11-7 Programs
Banks’ SR 11‑7 programs are running into structural limits with opaque, fast‑changing, third‑party AI—especially GenAI and agentic systems. These pain points will only intensify as AI scales across the industry.
Big Trend Lines
· Rapid expansion of AI/ML and GenAI use cases (credit, fraud, operations, customer service, code, policy drafting) is turning “a few hundred models” into “thousands of models and AI services,” stressing inventories, validation capacity, and governance.
· Regulators are reinterpreting SR 11‑7 and layering on AI‑specific expectations (explainability, fairness, continuous monitoring, third‑party assurance, AI governance frameworks) rather than replacing it.
· Firms are moving from periodic, static validation to “continuous model assurance” with near real‑time monitoring, drift detection, and automated testing—often using AI to monitor AI.
Hard Problems That Are Getting Worse
1. Opaque and Third‑Party Foundation Models
· Many critical AI capabilities now rely on external LLMs and agentic platforms (e.g., GPT‑style models), where training data, architecture, and versioning are not transparent.
· Vendors frequently update models unilaterally, breaking reproducibility and undermining SR 11‑7’s assumptions about fixed specifications and controlled change management.
· Banks must attest to model risk controls over systems they neither fully understand nor control, including data handling and security inside third‑party AI platforms.
Why it will worsen: As more workflows embed external GenAI (co‑pilots for bankers, chatbots, automated coding, decision support), banks’ critical paths will hinge on black‑box models whose behavior can shift overnight.
2. Explainability, Fairness, and Regulatory Scrutiny
· Deep ML and GenAI models are inherently hard to explain to business owners, boards, auditors, and regulators. Standard SR 11‑7 “conceptual soundness” and outcome testing do not fully answer “why did this particular decision happen?”
· Regulators expect robust bias and disparate‑impact analysis across sensitive attributes—technically challenging with complex features and non‑deterministic LLM outputs.
· Explainability tools (like SHAP, LIME) help but are expensive, approximate, and difficult to scale, especially for generative models.
Why it will worsen: AI is increasingly used in high‑stakes decisions (pricing, collections, underwriting, surveillance), raising expectations for individualized explanations and fairness proof—not just aggregate statistics.
3. Continuous Drift, Instability, and Behavior Under Attack
· AI models drift faster as data, markets, and user behavior change, and as vendors silently retrain foundation models.
· SR 11‑7’s periodic validation cadence is out of sync with systems whose risk profile can change weekly. Firms are trying to deploy real‑time monitoring, but coverage is uneven.
· Generative models are vulnerable to adversarial prompts, jailbreaks, and prompt‑injection attacks that can bypass business rules or generate non‑compliant content—risks traditional validation never anticipated.
Why it will worsen: As agentic AI chains tools and actions, single‑prompt exploits can cascade across systems; drift and adversarial behavior will be continuous, not episodic.
4. Defining “What Is a Model” and Managing Proliferation
· Banks are struggling to decide what falls under SR 11‑7: small decision engines, RPA scripts, GenAI assistants, scoring APIs, in‑app recommendation engines, and “shadow AI” built by business units.
· Model inventories and tiering schemes break down when hundreds of low‑code/no‑code apps, spreadsheets, and AI micro‑services all arguably qualify as models.
· Controlling EUCs and “citizen‑built” AI (e.g., staff wiring Excel to public LLMs) is increasingly difficult, creating blind spots in model risk and data‑loss risk.
Why it will worsen: GenAI tools make it trivial for non‑technical staff to build quasi‑models. Governance frameworks will be chasing an ever-expanding perimeter.
5. Data Governance, Privacy, and Security Across the Model Estate
· AI models consume and sometimes embed highly sensitive data; with many models and pipelines, the aggregate “model data estate” becomes a major attack surface.
· Public and vendor‑hosted LLMs raise questions about where prompts, logs, and training data reside, how long they are retained, and whether they might leak proprietary or customer information.
· Aligning model risk, operational risk, cybersecurity, privacy, and data‑residency obligations into one coherent control set is proving difficult, especially across jurisdictions.
Why it will worsen: More models, more jurisdictions, more data types, and more cross‑border cloud/AI services mean the data governance and DLP problem grows super‑linearly.
6. Capacity, Skills, and Automation in MRM
· Traditional MRM teams are now expected to cover ML, GenAI, cybersecurity‑adjacent risks, and ethics/AI governance—the skills gap is real.
· Manual validation and documentation can’t keep up with the volume and velocity of AI models, driving adoption of “MRM 2.0” platforms and AI‑assisted validation.
· Regulators will scrutinize any “AI that validates AI,” so firms must prove that automated validation is itself governed, tested, and explainable.
Why it will worsen: Model counts and regulatory expectations are rising faster than headcount; without aggressive automation and better processes, backlog and control gaps will grow.
Where This Is Heading
· Governance is shifting from model‑by‑model compliance to ecosystem‑level assurance: continuous monitoring across all AI systems, immutable audit trails for autonomous decisions, and integrated AI governance frameworks spanning risk, compliance, and technology.
· Expect more explicit AI/ML guidance (OCC “responsible AI,” EBA AI guidelines, EU AI Act, Fed/OCC clarifications) that will layer on top of SR 11‑7 rather than replace it, focusing on transparency, fairness, and cross‑border consistency.

