An official AI intelligence platform for public sector professionals. All content generated and verified by Astra.

AI in federal healthcare (VA, CMS): promise and peril at scale

Executive summary

Federal policy has moved decisively from permissive experimentation to mandatory governance for clinical AI. The Executive Order on AI, OMB M-24-10, NIST’s AI Risk Management Framework, FDA device guidance, and ONC’s HTI-1 rule together establish a regime requiring transparency, safety, postmarket monitoring, and nondiscrimination for AI that touches patient care and coverage decisions [1][2][3][4][5][6][7]. VA is already operating scaled predictive programs such as REACH VET, while CMS is modernizing adjudication and prior authorization pipelines—each now subject to safety-impacting AI controls and HIPAA constraints on data use and tracking technologies [12][8][2][10][9]. Mission owners should treat any AI that recommends or influences diagnosis, treatment, triage, benefits, or payment as safety-impacting, implement pre-deployment testing and independent evaluation, and establish continuous monitoring and incident response per federal policy [2][3][4][7].

Policy baseline for clinical AI at scale

  • Executive Order 14110 directs HHS to advance patient safety, quality improvement, and algorithmic fairness in healthcare AI, and instructs agencies to align AI governance with national standards [1].
  • OMB M-24-10 requires all agencies to inventory AI systems, designate Chief AI Officers, and apply risk management practices. It defines “safety-impacting AI” and mandates rigorous testing, independent evaluation, ongoing monitoring, incident reporting, and protections against algorithmic discrimination—requirements that squarely apply to clinical decision support and benefit adjudication contexts in VA and CMS [2].
  • NIST AI RMF 1.0 sets out governance functions (Map, Measure, Manage, Govern) and risk controls for validity, robustness, bias, and monitoring that agencies are directed to adopt for AI systems [3][2].
  • FDA governs AI/ML-enabled medical devices and clinical decision support. Its CDS guidance delineates non-device CDS (outside medical device regulation) from device CDS requiring clearance/approval, which often covers ML models that process medical images or signals and generate diagnostic/treatment recommendations [4]. FDA’s draft guidance on Predetermined Change Control Plans (PCCPs) provides a path for adaptive ML devices while requiring controls and postmarket oversight [5][6].
  • ONC’s HTI-1 final rule revises EHR certification criteria to require transparency for Decision Support Interventions (DSI), including information on source, logic, training data, performance, limitations, and fairness assessments, to support safe use of algorithms within certified health IT [7].
  • HIPAA applies to AI data pipelines. HHS OCR clarifies de-identification standards for using data outside covered contexts [9] and prohibits impermissible disclosures via online tracking technologies (e.g., pixels) on patient portals or appointment pages that transmit PHI to third parties without authorization [10].

Clinical safety regime: what “safety-impacting AI” means in practice

Under M-24-10, any AI that could materially affect patient safety, diagnosis, treatment, triage, or benefits decisions should be classified as safety-impacting and subjected to enhanced controls [2]. Implementation implications:

  • Pre-deployment testing and independent evaluation: Agencies must test AI for clinically meaningful performance, robustness, and bias, and obtain independent evaluation proportional to risk before use in operations [2][3]. FDA clearance/approval satisfies this for regulated devices; non-device CDS still requires agency-level evaluation and documentation [4][2].
  • Continuous monitoring and incident response: Establish telemetry, drift detection, clinical outcome monitoring, and a process to pause/rollback models if adverse impacts occur. ONC’s HTI-1 transparency artifacts should be integrated into monitoring playbooks. Report incidents per M-24-10 [2][7].
  • Algorithmic discrimination safeguards: Assess disparate performance across protected classes and mitigate per NIST SP 1270 and agency civil rights obligations. Document assessments and mitigations [2][18].
  • Human-in/on-the-loop controls: Maintain clinician oversight and documented fallback procedures for AI recommendations; HTI-1 requires clear articulation of intended use and limitations [7].
  • Change management for adaptive models: For FDA-regulated ML devices, use PCCPs to pre-specify retraining conditions and validation methods; for non-device models, agencies should adopt analogous change-control gates and audits [5][6][2].
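
The continuous-monitoring control above can be made concrete. The sketch below illustrates one common drift signal, the Population Stability Index (PSI), comparing a baseline score distribution against a recent production window; the 0.10/0.25 thresholds are industry rules of thumb, not values set by M-24-10 or NIST, and a production system would monitor clinical outcomes alongside input drift.

```python
import math
from typing import Sequence

def psi(baseline: Sequence[float], current: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index over equal-width bins spanning both samples."""
    lo = min(min(baseline), min(current))
    hi = max(max(baseline), max(current))
    width = (hi - lo) / bins or 1.0  # degenerate case: all values equal

    def frac(sample: Sequence[float]) -> list:
        counts = [0] * bins
        for x in sample:
            counts[min(int((x - lo) / width), bins - 1)] += 1
        n = len(sample)
        # Floor at a tiny value so empty bins don't produce log(0).
        return [max(c / n, 1e-6) for c in counts]

    b, c = frac(baseline), frac(current)
    return sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))

def drift_alert(baseline: Sequence[float], current: Sequence[float]) -> str:
    """Map a PSI score to an illustrative (non-regulatory) action tier."""
    score = psi(baseline, current)
    if score >= 0.25:
        return "significant-drift: pause and review"
    if score >= 0.10:
        return "moderate-drift: investigate"
    return "stable"
```

In practice such a check would run on a schedule against model telemetry, with the "pause and review" tier wired into the rollback procedure M-24-10 expects.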

VA landscape: scaled use and governance needs

  • VA established the National Artificial Intelligence Institute (NAII) to accelerate trustworthy AI research and deployment in veteran care, focusing on safety, ethics, and impact [11].
  • VA’s REACH VET program uses predictive modeling to identify Veterans at statistically elevated risk and proactively engages them with care—an example of AI-enabled risk stratification operating at scale within clinical workflows [12].
  • Governance implications: REACH VET-like models that influence triage or outreach should be treated as safety-impacting AI and subjected to M-24-10 controls (testing, independent evaluation, monitoring, incident reporting) and bias assessments under NIST RMF/SP 1270 [2][3][18]. Where models intersect with diagnostic recommendations or device functionality, FDA CDS/medical device determinations apply [4][6].
  • Data governance: As a HIPAA-covered provider, VHA must ensure any AI training or inference involving PHI complies with HIPAA use/disclosure rules, de-identification standards, and OCR’s restrictions on tracking technologies in patient-facing web apps [9][10].
  • Integration: HTI-1 DSI transparency artifacts should be incorporated into EHR decision support for VA’s certified health IT environments to support clinician understanding, safe use, and auditability [7].

UNVERIFIED: A specific VHA directive enumerating AI governance or “trustworthy AI” requirements could further operationalize M-24-10; we could not locate a primary source directive and recommend reviewer verification.

CMS landscape: adjudication, program integrity, and clinical interfaces

  • CMS’s Fraud Prevention System (FPS) has long used predictive analytics to identify aberrant billing and fraud risk, demonstrating scaled algorithmic use in program integrity [13]. Safety-impacting AI controls from M-24-10 extend to any models that could affect beneficiary access or provider payments [2].
  • CMS’s Interoperability and Prior Authorization final rule requires impacted payers across Medicare Advantage, Medicaid, CHIP, and Exchange plans to implement standardized FHIR APIs that streamline prior authorization, reshaping data flows and enabling algorithmic decision support in payer operations, although it does not mandate AI use [8].
  • Clinical interfaces: ONC’s HTI-1 certification criteria for DSI apply to EHR developers and indirectly to CMS clinical quality programs reliant on certified health IT; transparency artifacts should accompany algorithmic decision support used in Promoting Interoperability contexts [7].
  • HIPAA and civil rights: CMS operations must ensure algorithmic tools do not produce discriminatory outcomes and comply with PHI safeguards, per M-24-10 and HIPAA [2][9][10].

Data governance and privacy constraints

  • PHI use for training/inference: Covered entities must operate within HIPAA's treatment, payment, and healthcare operations permissions, execute Business Associate Agreements where vendors handle PHI, or apply formal de-identification (expert determination or safe harbor) before secondary uses [9].
  • Tracking technologies: OCR’s bulletin prohibits pixel and similar tracking that transmits PHI from patient-facing pages to third parties without authorization. This constrains telemetry and third-party SDKs in model-in-the-loop portals and apps [10].
  • Information blocking: The Cures Act rule requires availability of electronic health information. AI deployments must not be configured to unjustifiably restrict access or export of EHI (e.g., via proprietary black-box outputs that impede required disclosures) [19].

Perils at scale: failure modes to actively manage

  • Performance drift and silent failure: Changes in populations, workflows, devices, or coding schemas degrade model performance; mitigation requires continuous monitoring and retraining controls and, for regulated devices, PCCPs with postmarket surveillance [5][6][2].
  • Bias and inequitable impact: Differential accuracy across subgroups can produce disparate outcomes; agencies must measure and mitigate per NIST SP 1270 and M-24-10 [18][2].
  • Overreliance on opaque CDS: Non-device CDS still demands transparency and clinician interpretability per HTI-1; lack of clarity on logic, data provenance, and limitations increases safety risk [7].
  • Data misuse and leakage: Improper sharing via tracking tech or insufficient de-identification breaches HIPAA; vendor and SDK governance must be tightened in patient-facing apps [10][9].
  • Regulatory misclassification: Misunderstanding FDA CDS vs device boundaries can lead to unapproved clinical functionality in production; early regulatory assessment is essential [4][6].
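
The bias failure mode above is measurable. A minimal sketch of a subgroup performance check follows: it computes sensitivity (recall) per demographic group and flags groups whose gap from the best-performing group exceeds a tolerance. The 0.05 tolerance is an illustrative choice, not a regulatory threshold, and a real assessment would examine multiple metrics per NIST SP 1270.

```python
from collections import defaultdict

def subgroup_sensitivity(rows):
    """rows: iterable of (group, y_true, y_pred) with binary labels."""
    tp, fn = defaultdict(int), defaultdict(int)
    for group, y_true, y_pred in rows:
        if y_true == 1:
            if y_pred == 1:
                tp[group] += 1
            else:
                fn[group] += 1
    return {g: tp[g] / (tp[g] + fn[g]) for g in set(tp) | set(fn)}

def disparity_flags(rows, tolerance=0.05):
    """Flag groups whose sensitivity trails the best group by > tolerance."""
    sens = subgroup_sensitivity(rows)
    best = max(sens.values())
    return {g: best - s > tolerance for g, s in sens.items()}
```

Flagged groups would feed the documented mitigation and stakeholder-review steps required under M-24-10.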

Acquisition, cloud, and tooling alignment

  • Hosting and controls: Azure Government provides cloud services with FedRAMP High authorization and DoD CC SRG authorizations at IL2/IL4/IL5 for applicable workloads; Microsoft also offers environments for IL6 workloads (Azure Government Secret) for classified scenarios [14][15]. Agencies must select environments consistent with data sensitivity and mission needs and apply agency ATOs.
  • Responsible AI implementation: Azure Machine Learning offers documentation and tooling for model interpretability, fairness assessment, error analysis, and monitoring that can help operationalize NIST AI RMF practices and HTI-1 transparency artifacts within MLOps pipelines; agencies must validate and document controls per policy [16][3][7][2].
  • Policy enforcement: Azure Policy enables codified guardrails (e.g., encryption, network isolation, logging) that support FISMA, FedRAMP, and agency governance requirements; map policies to AI system inventories and lifecycle controls required by M-24-10 [17][2].
  • Note: Vendor capabilities do not substitute for regulatory compliance or FDA clearance. Agencies must verify device status, applicable guidance, and certification criteria; vendor claims should not be treated as policy facts [4][5][7].
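
As one concrete instance of policy-as-code, an Azure Policy definition can deny resource configurations that violate a guardrail. The fragment below, a simplified sketch rather than a complete control set, denies storage accounts that permit public blob access, using Azure's documented `allowBlobPublicAccess` alias:

```json
{
  "properties": {
    "displayName": "Deny storage accounts with public blob access",
    "mode": "Indexed",
    "policyRule": {
      "if": {
        "allOf": [
          { "field": "type", "equals": "Microsoft.Storage/storageAccounts" },
          { "field": "Microsoft.Storage/storageAccounts/allowBlobPublicAccess", "equals": true }
        ]
      },
      "then": { "effect": "deny" }
    }
  }
}
```

Comparable definitions can enforce encryption, network isolation, and diagnostic logging across the environments hosting AI workloads.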

Implementation checklist for VA and CMS missions

  • Classify AI systems that influence diagnosis, treatment, triage, outreach, benefits, or payment as safety-impacting; register in the AI inventory and assign accountable owners [2].
  • Conduct pre-deployment clinical validation, robustness testing, and bias assessment; obtain independent evaluation proportional to risk; document results [2][3][18].
  • Determine FDA regulatory status for CDS vs device; if device, ensure clearance/approval and establish PCCP for adaptive models; if non-device, apply HTI-1 DSI transparency and agency governance rigor [4][5][7].
  • Establish continuous monitoring: performance, drift, adverse events; define incident response and rollback procedures; report incidents per policy [2][3].
  • Enforce HIPAA-compliant data pipelines; prohibit impermissible tracking technologies; manage BAAs and de-identification where applicable [10][9].
  • Integrate transparency artifacts into clinician-facing EHR workflows per HTI-1; train users on intended use, limitations, and escalation paths [7].
  • Address civil rights: perform disparate impact analysis and mitigation; include stakeholder review and documentation [2][18].
  • Align cloud controls with FedRAMP/FISMA and agency ATO; implement technical guardrails via policy-as-code; maintain audit trails [15][17].
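
The inventory step in the checklist can be modeled as a simple record. The schema below is hypothetical (M-24-10 does not prescribe a record format); it only shows how a safety-impacting classification can be derived mechanically from a system's declared influences so that the enhanced controls attach consistently.

```python
from dataclasses import dataclass

@dataclass
class AIInventoryEntry:
    """Illustrative inventory record; field names are not an official schema."""
    system_name: str
    accountable_owner: str
    influences: list  # e.g. ["triage", "outreach", "payment"]
    fda_status: str = "not-assessed"  # e.g. "non-device-CDS", "device-cleared"

    # Functions treated as triggering enhanced controls in this sketch.
    SAFETY_IMPACTING_USES = frozenset(
        {"diagnosis", "treatment", "triage", "outreach", "benefits", "payment"}
    )

    @property
    def safety_impacting(self) -> bool:
        return any(u in self.SAFETY_IMPACTING_USES for u in self.influences)
```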

Gaps and source conflicts to surface

  • CDS vs device boundary complexity: FDA’s CDS guidance clarifies criteria, but real-world ML models embedded in imaging or signal analysis frequently cross into device territory. Agencies must reconcile ONC HTI-1 transparency requirements with FDA device controls; these regimes are complementary but distinct [4][7].
  • Adaptive models oversight: FDA’s PCCP approach is still draft guidance for ML-enabled devices; agencies deploying adaptive models outside device regulation should emulate PCCP discipline, but there is no formal federal template for non-device adaptive clinical AI changes [5][2].
  • VA internal AI governance directives: Public materials emphasize trustworthy AI via NAII, but a current, formal VHA directive for operational AI governance was not found in primary sources; reviewer verification is recommended (UNVERIFIED) [11].
  • CMS adjudication AI transparency: CMS policy modernizes APIs for prior authorization but does not set explicit transparency standards for AI used by payers in coverage decisions; agencies should apply M-24-10 safety-impacting AI controls and civil rights reviews pending further HHS/CMS-specific guidance [8][2].

What good looks like in 12 months

  • VA: All clinical algorithms (e.g., REACH VET risk stratifiers, imaging triage tools) are inventoried; each has documented validation, bias assessment, HTI-1 transparency artifacts, monitoring dashboards, incident response playbooks, and periodic independent evaluations; HIPAA tracking tech risks remediated across portals [12][2][7][10].
  • CMS: Program integrity and prior authorization algorithms are inventoried; decision support used in clinical quality programs accompanies HTI-1-compliant transparency; civil rights and bias assessments are logged; beneficiary-facing apps audited for tracking tech compliance [13][8][7][2][10].
  • Cross-cutting: Model change governance aligned to PCCPs where applicable; cloud controls codified with policy-as-code; NIST AI RMF practices embedded in SDLC and operations [5][17][3][2].

Sources

[1] Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence
[2] OMB Memorandum M-24-10: Advancing Governance, Innovation, and Risk Management for Agency Use of Artificial Intelligence
[3] NIST AI Risk Management Framework 1.0
[4] FDA Guidance: Clinical Decision Support Software
[5] FDA Draft Guidance: Predetermined Change Control Plans for Machine Learning–Enabled Device Software Functions
[6] FDA: Artificial Intelligence and Machine Learning in Software as a Medical Device (SaMD)
[7] ONC HTI-1 Final Rule (Federal Register)
[8] CMS Fact Sheet: Policies to Improve Prior Authorization Processes
[9] HHS OCR Guidance on De-identification of PHI under HIPAA
[10] HHS OCR Bulletin: Use of Online Tracking Technologies by HIPAA Regulated Entities
[11] VA National Artificial Intelligence Institute
[12] VA REACH VET Program
[13] GAO-16-76: Medicare Fraud Prevention System
[14] Microsoft Azure Government compliance offerings
[15] Microsoft Compliance Offerings: FedRAMP
[16] Azure Machine Learning: Responsible AI
[17] Azure Policy overview
[18] NIST SP 1270: Identifying and Managing Bias in AI
[19] ONC Cures Act: Information Blocking