Research Report — March 2026

Why Non-Auditable AI Concentrates Liability on the Firm

Your firm has a human in the loop. When the system fails, that human is you.

8 hrs (illustrative): a full workday of AI output review
~23% (illustrative): detection probability after 45 minutes of passive monitoring
Limited or none: independent access to vendor-side inference telemetry in standard SaaS deployments

Oversight Requires the Capacity to Oversee.

Regulators mandate that a human oversee AI output before it reaches a client. Decades of cognitive science research show why that mandate breaks down in practice: AI systems produce errors that look correct on the surface, and at volume the human reviewing that output cannot sustain the cognitive load required to catch them. The longer the review session runs, the more errors pass through.

The Harvard JOLT analysis published in February 2026 identifies the legal consequence. Under the Learned Hand negligence formula, a party is liable when the cost of prevention is less than the expected harm. But when the firm has no independent way to verify what the AI did, the burden of precaution (B) approaches infinity, and the formula stops producing a meaningful human-only oversight standard.

In a cloud deployment, the lawyer has no independent access to inference logs or a firm-controlled audit trail. When something goes wrong, the firm cannot reconstruct what happened without the vendor's cooperation. The duty to verify remains. The ability to verify does not.

Under Harvard's framework, that lawyer becomes a liability sponge: positioned to absorb the blame for a system they had no independent ability to verify.

"When a regulator mandates a task that a human cannot reliably perform, they are effectively mandating a breach of duty and setting the industry up to fail."

— Nanda Min Htin, Harvard JOLT Digest (Feb 2026)

The Human Brain Was Not Built for This Job.

Two phenomena make passive AI oversight structurally unreliable. Both are documented in aviation disasters and confirmed in human factors research.


Vigilance Decrement — Uber Tempe (2018)

The safety driver was streaming The Voice on Hulu. That sounds like negligence until you understand the dynamic: passive monitoring of a reliable system produces vigilance decrement. The human brain disengages — not because the operator is careless, but because that is what human brains do when asked to watch something that appears to be working. Uber reached a civil settlement. The driver was charged with negligent homicide (later resolved via a guilty plea to endangerment in 2023) and became the liability sponge for a systemic design failure.

NTSB/HAR-19/03 (2019). Tempe Police Dept report & Hulu records (June 2018, Reuters).

Automation Bias — Sriwijaya Air SJ182 (2021)

When the autothrottle malfunctioned, the pilots failed to monitor the engine instruments for over a minute. By the time the autopilot disengaged, the aircraft had entered an upset condition from which the crew did not recover. All 62 people aboard died. The pilots were not untrained. They were lulled by a system that had been reliable until it wasn't.

KNKT.21.01.01.04 — Final Report (2022).

Consider a lawyer reviewing AI output for a full workday. Vigilance decrement means error detection materially degrades within the first hour. The remaining seven hours are liability theater: the human is present but cognitively disengaged.

Illustrative detection timeline:
First 30 minutes: high detection (~85–98%)
30–60 minutes: declining (~40–60%)
1–2 hours: critically degraded (~10–25%)
Hours 2–8: vigilance decrement zone
Illustrative model based on vigilance decrement literature (Parasuraman & Manzey, 2010; Warm, Parasuraman & Matthews, 2008). Exact detection rates vary by task complexity, operator training, and system reliability. Not derived from a single primary dataset. See annotated bibliography for primary sources.
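
One way to see the shape this model implies: a minimal sketch that decays detection probability exponentially toward a floor. Every parameter (initial rate, floor, half-life) is an assumption chosen so the 45-minute value lands near the ~23% figure used above; the numbers are illustrative, not measured.

```python
import math

def detection_probability(minutes: float,
                          p_initial: float = 0.95,
                          p_floor: float = 0.12,
                          half_life_min: float = 15.0) -> float:
    """Illustrative vigilance-decrement curve: detection probability
    decays exponentially from an initial level toward a floor.
    Every parameter here is an assumption, not a measurement."""
    decay = math.exp(-math.log(2.0) * minutes / half_life_min)
    return p_floor + (p_initial - p_floor) * decay

for t in (0, 30, 45, 60, 120, 480):
    print(f"{t:>3} min: {detection_probability(t):.0%}")
```

Against this curve, review quality in hours 2 through 8 is indistinguishable from the floor, which is the quantitative content of the "liability theater" claim above.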

The Learned Hand Formula Collapses in Cloud AI.

Under the Hand Formula (United States v. Carroll Towing Co., 1947), a party is negligent if the Burden of precaution (B) is less than the Probability of harm (P) multiplied by the gravity of the Loss (L).

When B — the burden of verifying AI output — approaches infinity because the system's decision logic is not human-verifiable, the formula ceases to produce a meaningful standard. The Harvard JOLT analysis concludes this renders the negligence calculus void as a human-only oversight standard. In a third-party cloud deployment, this opacity is compounded by the lawyer's inability to independently access vendor-side inference telemetry or logs.
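
Stated compactly (the limit reading below is this page's gloss on the JOLT argument, not a quotation from it):

```latex
\[
  \text{breach of duty} \iff B < P \cdot L
  \qquad \text{(United States v. Carroll Towing, 1947)}
\]
\[
  B \to \infty \ \Longrightarrow\ B \not< P \cdot L \ \text{for any finite } P,\, L
  \ \Longrightarrow\ \text{the test can no longer separate due care from negligence}
\]
```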

How Architecture Changes the Formula

Interactive graphic in the original: selecting a deployment architecture changes the Burden term in B (Burden) < P (Probability) × L (Loss), weighed against a High P and a Max L.
Air-Gapped: The firm has physical and logical access to the hardware, inference logs, and model artifacts it controls. The burden (B) of verification is manageable. The Hand Formula holds. The lawyer is a genuine supervisor. Liability is proportionate to actual control.

Calabresi: Who Is the Cheapest Cost Avoider?

Calabresi's rule is simple: if you could have prevented the harm at the lowest cost and you didn't, the liability is yours.

In a cloud deployment, privileged client data sits on infrastructure the firm does not control. If a vendor breach exposes that data, the firm must explain why it placed privileged material on a third party's servers when a locally deployed option existed. The cost of prevention was the air gap. Air-gapped deployment removes the third-party cloud inference attack surface, gives the firm sole custody of client data, and provides a complete audit trail the firm controls — without depending on vendor cooperation to reconstruct what happened when something goes wrong.

Choosing cloud when a locally controlled option exists is like booking a client on an airline with a known safety record problem when a safer carrier flies the same route.

Illustrative representation of relative cost and information positions based on Calabresi's cheapest cost avoider framework.

Architecture Determines Whether Oversight Is Real.

The same lawyer reviewing the same AI output has fundamentally different oversight capacity depending on where the inference runs.

High Risk: SaaS Cloud (Multi-Tenant)
Observability: Limited
Data Control: Vendor
Inference Logs: Vendor-controlled
RAM Exposure: Not auditable
Compulsion Risk: Elevated
Prevention Control: Vendor
Verdict: Liability Sponge

Medium Risk: Firm-Controlled VPC
Observability: Partial
Data Control: Shared
Inference Logs: Request/Response Only
RAM Exposure: Vendor-managed
Compulsion Risk: Reduced
Prevention Control: Shared
Verdict: Residual Exposure

Controlled Risk: On-Premises / Air-Gapped
Observability: Full
Data Control: Firm-controlled
Inference Logs: Full Access
RAM Exposure: Auditable
Compulsion Risk: Minimal
Prevention Control: Law Firm
Verdict: Supervisor

Air-gapped deployment is the most direct way to restore the full oversight capacity the Hand Formula requires. A forward-deployed engineering model means the firm gets that infrastructure without building it from scratch.

Three Pillars to Replace Theater with Verifiable Duties.

The Harvard JOLT framework proposes replacing passive oversight mandates with concrete, verifiable obligations distributed across the AI supply chain. All three require the human to engage with the system in a verifiable way — which a third-party cloud deployment often limits on the deployer side.

Pillar 01 — Developer

Technical Robustness

Developers must build systems capable of being overseen. In legal AI deployments, this pillar can translate to confidence scoring, citation traceability, cryptographic logs, and interpretability endpoints. The cost to implement these at the model level is lower than the aggregate cost of downstream oversight — making the developer the cheapest cost avoider for inference-level failures.
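
As one concrete shape for "cryptographic logs" and "citation traceability", here is a minimal hash-chained inference record. It is a sketch under this page's assumptions; the field names are hypothetical, not a vendor API or a JOLT specification.

```python
import hashlib
import json
import time

def log_inference(chain: list, prompt: str, output: str,
                  cited_sources: list, confidence: float) -> dict:
    """Append a tamper-evident record: each entry's hash covers its
    content plus the previous entry's hash (a simple hash chain)."""
    body = {
        "timestamp": time.time(),
        "prompt_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
        "output_sha256": hashlib.sha256(output.encode()).hexdigest(),
        "cited_sources": cited_sources,   # citation traceability
        "confidence": confidence,         # confidence scoring
        "prev_hash": chain[-1]["entry_hash"] if chain else "0" * 64,
    }
    body["entry_hash"] = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()).hexdigest()
    chain.append(body)
    return body

def verify_chain(chain: list) -> bool:
    """Recompute every hash; editing or dropping any earlier entry
    breaks the chain from that point forward."""
    prev = "0" * 64
    for entry in chain:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        recomputed = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        if entry["prev_hash"] != prev or recomputed != entry["entry_hash"]:
            return False
        prev = entry["entry_hash"]
    return True
```

A firm holding such a chain can demonstrate after the fact that no record was altered or silently dropped, which is precisely the reconstruction capability a standard cloud deployment leaves with the vendor.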

Pillar 02 — Deployer

Human-Systems Integration

Deployers must implement genuine collaboration between human and AI: friction roles that force active engagement, resilience roles with cognitive handrails, and failure-specific training. Training lawyers on how the system works when it's right is insufficient. They must be trained on what happens when it fails.

Pillar 03 — Shared

Post-Market Monitoring

Continuous monitoring and adverse event reporting — modeled on FDA post-market surveillance — to quantify the Probability of Harm (P) in the Hand Formula over time. As failure modes accumulate in centralized databases, the standard of care rises. Under this framework, ignorance of the database ceases to function as a defense.
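
A minimal sketch of how post-market data could feed the Hand calculus: estimate P as a smoothed failure rate from reported adverse events, then recheck B < P × L as reports accumulate. The estimator and every number below are this page's illustrative assumptions, not a method from the JOLT proposal.

```python
def estimate_p(adverse_events: int, total_inferences: int,
               prior_failures: float = 1.0,
               prior_trials: float = 100.0) -> float:
    """Smoothed failure-rate estimate (posterior mean under a Beta
    prior). The priors are illustrative, not calibrated values."""
    return (adverse_events + prior_failures) / (total_inferences + prior_trials)

def precaution_required(burden: float, p: float, loss: float) -> bool:
    """Hand Formula: omitting a precaution is negligent when B < P * L."""
    return burden < p * loss

LOSS = 10_000_000   # hypothetical gravity of loss on a $10M matter
BURDEN = 5_000      # hypothetical cost of one added verification step

for events in (3, 40):  # early vs. later in the reporting database
    p = estimate_p(events, total_inferences=50_000)
    print(f"{events:>3} reported events: P ~ {p:.5f}, "
          f"negligent to skip the control: {precaution_required(BURDEN, p, LOSS)}")
```

With 3 reports the control is optional; with 40 it is required. That is the rising standard of care in executable form.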

The Vendor Arguments. And Why They Don't Restore Verifiability.

Vendor Claim

"We have SOC 2 Type II and ISO 27001."

Third-party audits prove security controls are effective. The law firm doesn't need underlying log access.

Rebuttal

Security ≠ Auditable Inference.

SOC 2 attests that security controls were effective over a review period. It does not prove the AI correctly grounded its output to a specific case citation. It does not provide the evidentiary readiness a lawyer needs when a court examines their methodology.
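
To make "auditable inference" concrete, here is the kind of check a firm can only run with access to the model's output and its cited sources. A minimal sketch with hypothetical inputs; a SOC 2 report attests to none of this.

```python
def citation_is_grounded(quoted_passage: str, source_text: str) -> bool:
    """Crude grounding check: does the passage the AI attributes to a
    source actually appear in that source? A production check would
    normalize citations and fuzzy-match; this is the minimal idea."""
    def norm(s: str) -> str:
        return " ".join(s.lower().split())
    return norm(quoted_passage) in norm(source_text)

print(citation_is_grounded(
    "the burden of adequate precautions",
    "... liability depends upon whether B is less than the burden "
    "of adequate precautions ...",
))
```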

Vendor Claim

"We use Confidential Computing."

Hardware-level isolation means even vendor admins cannot see inference data. The data is mathematically safe.

Rebuttal

Layers of Trust You Cannot Verify.

Confidential computing protects data from the vendor's own employees. It does not protect data from the vendor's hardware supply chain, firmware vulnerabilities, attestation infrastructure, key management operations, or a CLOUD Act order. The firm is trusting a stack of layers it cannot independently verify. If a federal order compels the provider to assist in accessing the data, the encryption is irrelevant. The law supersedes the technology.

Vendor Claim

"We don't train on your data. Zero data retention."

The data is processed and discarded. Nothing is stored. The firm's privileged material is never at risk.

Rebuttal

Retention Policy ≠ Exposure Prevention.

Zero data retention governs what happens after processing. It says nothing about exposure during processing. During inference, privileged content exists as plaintext in vendor-controlled RAM. Retention policy does not eliminate the window of exposure. It just determines how long the vendor keeps a copy afterward.

Vendor Claim

"We have contractual protections (BAA/DPA/MSA)."

Binding agreements prevent the vendor from misusing client data. The firm is legally protected.

Rebuttal

Contracts Don't Control Who Gets the Order.

A subpoena can reach the firm regardless of where data is stored. The difference is who receives it. With air-gapped deployment, the order goes to the firm. The firm asserts privilege and controls the response. Under the CLOUD Act, the government can compel the cloud vendor directly. The vendor may not know what is privileged. The firm may not know the order was issued.

The Architectural Verdict

The question is not whether your firm has a human in the loop.

The question is whether that human has anything real to oversee.

Architecture determines the answer. Not policy. Not contracts. Not vendor assurances.

That is what CloseVector is built to solve.

The Duty Exists.
The Capacity Depends on Architecture.

CloseVector provides air-gapped AI infrastructure designed for deployment inside your firm's walls. Firm-controlled observability. Firm-controlled audit trail. No dependency on vendor-side inference. The lawyer becomes a supervisor, not a sponge.

Schedule Assessment →

Sources Cited

[1] Nanda Min Htin, "Redefining the Standard of Human Oversight for AI Negligence," Harvard Journal of Law & Technology Digest (Feb 9, 2026). Foundation for the liability sponge framework, Hand Formula analysis, and three-pillar verifiable duties proposal.
[2] Guido Calabresi, The Costs of Accidents: A Legal and Economic Analysis (Yale University Press, 1970). Cheapest cost avoider framework applied to deployment architecture analysis.
[3] NTSB/HAR-19/03 (2019). Collision Between Vehicle Controlled by Developmental Automated Driving System and Pedestrian. Primary source for the Tempe crash and the NTSB's analysis of operator monitoring and automation complacency.
[4] KNKT.21.01.01.04 (2022). Aircraft Accident Investigation Report — Sriwijaya Air SJ182. Primary source for the SJ182 crash and the KNKT's analysis of pilot monitoring, automation reliance, and contributing factors.
[5] Crootof, Kaminski & Price, "Humans in the Loop," Vanderbilt Law Review (2023). Taxonomy of human roles in AI oversight; framework for analyzing when HITL mandates serve as liability shields rather than safety mechanisms.
[6] Selbst, "Negligence and AI's Human Users," Boston University Law Review (2020). Analysis of how AI unpredictability breaks standard foreseeability in tort law.
[7] Parasuraman & Manzey, "Complacency and Bias in Human Use of Automation," Human Factors (2010). Empirical foundation for automation bias and vigilance decrement phenomena.
[8] Warm, Parasuraman & Matthews, "Vigilance Requires Hard Mental Work and Is Stressful," Human Factors (2008). Empirical basis for vigilance decrement in sustained monitoring tasks.
[9] United States v. Carroll Towing Co., 159 F.2d 169 (2d Cir. 1947). Origin of the Learned Hand Formula for negligence analysis.

Application of the Harvard JOLT framework and Calabresi's cheapest cost avoider principle to cloud legal AI deployment architecture is the author's own analysis extending these sources. This page does not constitute legal advice.