Prompt Engineering for Anti-Hallucination Evidence Generation
A multi-layer prompt and validation architecture that prevents LLM hallucinations in compliance evidence generation through structured inputs, hard constraint gates, and a 4-phase validation pipeline.
Problem
Naive LLM prompting for compliance classification produces hallucinated AWS service names, inflated confidence scores, and prose-driven misclassifications — failures that silently corrupt downstream audit evidence.
Solution
The system evolved from flat prose prompting to a constrained pipeline in which the LLM handles only semantic interpretation. Five interlocking anti-hallucination mechanisms enforce structured input, service name blocklists, mathematical constraints, deterministic escape hatches, and confidence-gated review. A 4-phase validation pipeline catches what prompt constraints alone cannot prevent.
Impact
- Eliminated AWS service name leakage into abstract classification outputs via hard-coded regex blocklist
- Reduced misclassification of process-only controls through deterministic escape hatches that bypass the LLM entirely
- Established a gold-set validation framework with 10 analyst-authored test cases and 8 codified divergence categories
Architecture
1. Source fields are packaged into a structured JSON input — the LLM never sees raw prose alone
2. Prompt contract explicitly forbids AWS service names and provider-specific resolution
3. Validator enforces split arithmetic, blocklist compliance, and confidence-justification pairing
4. Escape hatch classifier intercepts obvious process-attestation records before any LLM call
5. Gold validation set of 10 analyst-authored cases gates prompt changes before production runs
Capabilities
- Structured LLM input packaging from normalized source fields
- 22-service AWS name blocklist with whole-word regex enforcement
- Mathematical split constraints (technical + process == 1.0)
- Deterministic escape hatches for process-only controls
- Confidence-gated routing with mandatory justification
- 4-phase validation pipeline (schema, gold set, divergence analysis, full run)
Technical Deep Dive
Architecture internals and annotated code from the production system.
Architecture Overview
The evolution is clear: the system moved from 'ask the LLM to produce compliance evidence' to 'give the LLM a tightly scoped classification job, validate every output field, and never let it touch decisions that can be made deterministically.' The LLM handles semantic interpretation. Everything else — routing, mapping readiness, human review flags, escape hatches — is pipeline logic that the model never controls.
Key Architectural Decisions
Structured Input, Not Raw Prose
The LLM never sees free-text KSI descriptions alone. It receives a normalized JSON package of source fields. Raw KSI prose must not be the sole LLM input — this prevents noun-extraction errors and forces the model to reason from evidence signals, not sentence surface patterns.
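A minimal sketch of the packaging step, assuming a normalized record dict; `automation_status` and `validation_method` appear in the source, while the exact field and function names here are illustrative:

```python
import json

def build_llm_input(record: dict) -> str:
    """Package normalized source fields into the structured JSON the LLM sees.

    The prose (title, summary) travels alongside the metadata signals, so the
    model must reconcile them rather than classify on sentence patterns alone.
    """
    package = {
        "title": record["title"],
        "summary": record["summary"],
        # Metadata signals that must be weighed against the prose:
        "automation_status": record["automation_status"],
        "validation_method": record["validation_method"],
    }
    return json.dumps(package, indent=2)
```

The point of the design is that raw prose is never the sole input: the metadata fields ride in the same JSON object, on equal footing with the title and summary.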
AWS Service Name Blocklist
A 22-service blocklist (ec2, s3, iam, vpc, lambda, etc.) enforced via whole-word regex matching. If the LLM leaks a concrete AWS service name into candidate_subjects, the output is rejected programmatically — not just flagged. This forces the model to stay at the abstract resource-class level ('compute instances', not 'EC2').
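A sketch of the whole-word enforcement, using only the five blocklist entries named above (the full 22-service list is not reproduced in the source):

```python
import re

# Illustrative subset of the 22-service blocklist.
AWS_SERVICE_BLOCKLIST = ["ec2", "s3", "iam", "vpc", "lambda"]

_BLOCKLIST_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, AWS_SERVICE_BLOCKLIST)) + r")\b",
    re.IGNORECASE,
)

def find_aws_leaks(candidate_subjects: list[str]) -> list[str]:
    """Return every candidate subject containing a blocklisted service name."""
    return [s for s in candidate_subjects if _BLOCKLIST_RE.search(s)]
```

`find_aws_leaks(["compute instances", "S3 buckets"])` returns `["S3 buckets"]`; any non-empty result rejects the whole output programmatically. The `\b` anchors make matching whole-word, so incidental substrings do not trigger false rejections.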
Mathematical Constraints
technical_split + process_split must equal exactly 1.0. The validator rejects any output where round(technical_split + process_split, 10) != 1.0 — the model can't fabricate a classification where both dimensions are inflated. If the LLM hallucinates about what fraction of a control is machine-testable, the math won't add up.
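The arithmetic check is small enough to show in full; this sketch uses the rounding rule quoted above:

```python
def validate_splits(technical_split: float, process_split: float) -> None:
    """Reject any output whose splits do not sum to exactly 1.0."""
    # round(..., 10) absorbs float representation noise without letting a
    # genuinely inflated pair (e.g. 0.9 + 0.2) slip through.
    if round(technical_split + process_split, 10) != 1.0:
        raise ValueError(
            f"split arithmetic failed: {technical_split} + {process_split} != 1.0"
        )
```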
Escape Hatch for Process-Only Controls
Controls with automation_status: No + validation_method: Manual trigger a deterministic override: candidate_subjects = [], technical_split = 0.0, process_split = 1.0, layer2_action = 'do_not_component_map'. The pipeline sets these — the LLM doesn't even get to guess.
Confidence-Based Gating with Mandatory Justification
When enrichment_confidence = 'low', ambiguity_notes must be non-empty (validated programmatically). Low confidence triggers requires_human_review = true, blocking downstream automation. The LLM can't hand-wave past uncertainty.
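The gating logic can be sketched as a single check over the output dict; the function name is illustrative, the field names are from the source:

```python
def gate_on_confidence(output: dict) -> dict:
    """Enforce the confidence contract: low confidence must be justified
    and always routes the record to human review."""
    if output["enrichment_confidence"] == "low":
        if not output.get("ambiguity_notes"):
            raise ValueError(
                "low enrichment_confidence requires non-empty ambiguity_notes"
            )
        # Blocks downstream automation until an analyst signs off.
        output["requires_human_review"] = True
    return output
```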
Code Showcase 1
Before: Naive Prompting Output
A naive prompt given the KSI title 'Automated Inventory' and summary 'Use authoritative sources to automatically maintain real-time inventories' produces three hallucination failures at once — classification driven by prose, leaked AWS service names, and inflated confidence that ignores contradictory metadata.
```json
{
  "requirement_type": "technical_configuration",
  "candidate_subjects": ["AWS Config", "EC2 instances", "S3 buckets"],
  "technical_split": 0.9,
  "process_split": 0.1,
  "enrichment_confidence": "high"
}
```

| Failure | Detail |
|---|---|
| Failure 1 | Classified on prose, not metadata — the word 'Automated' in the title drove the classification |
| Failure 2 | Leaked AWS service names — 'AWS Config', 'EC2', 'S3' in candidate_subjects |
| Failure 3 | Inflated confidence — ignored that automation_status: No and validation_method: Manual contradict the title |
Code Showcase 2
After: Constrained Prompting Output
With the evolved prompt contract, the structured input includes metadata signals alongside the prose. The prompt explicitly says 'DO NOT use AWS service names' and 'DO NOT resolve to specific cloud providers — that is Layer 2's job.' Source metadata wins over prose.
```json
{
  "requirement_type": "hybrid",
  "candidate_subjects": [
    "resource inventory records",
    "asset discovery configurations",
    "inventory source authorities"
  ],
  "technical_split": 0.4,
  "process_split": 0.6,
  "enrichment_confidence": "medium",
  "ambiguity_notes": "Title/summary describe automated capability but source metadata (automation_status: No, validation_method: Manual) indicates manual validation. Source metadata takes precedence."
}
```

| Outcome | Detail |
|---|---|
| Metadata Wins | Classified as 'hybrid' despite 'Automated' in the title — source metadata (automation_status: No) takes precedence over prose |
| Abstract Subjects | No AWS service names — uses abstract resource classes that Layer 2 will resolve to concrete CloudFormation types |
| Honest Confidence | Medium confidence with mandatory ambiguity notes explaining the title/metadata contradiction |
| Valid Splits | technical_split (0.4) + process_split (0.6) = 1.0 — math constraint satisfied |
Validation Pipeline
What prompt constraints alone cannot prevent, the validation pipeline catches before any output reaches production.
| Phase | What It Catches |
|---|---|
| A. Schema Validation | Structural errors, split arithmetic, escape hatch violations. All blocking. |
| B. Gold Set Testing | 10 analyst-authored test cases with acceptance thresholds (8/10 type match, 10/10 no AWS names, 2/2 escape hatches). |
| C. Divergence Analysis | Categorizes every mismatch into 8 types (type_error, scope_narrowing, aws_name_leak, confidence_inflation, etc.) with specific resolution actions. |
| D. Full Run Verification | End-to-end count verification, cross-layer traceability, human review queue sizing. |
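The Phase B gate can be sketched as a single pass over paired gold and model outputs. Everything here is an assumption except the thresholds, which come from the table: the `is_escape_hatch_case` flag, the function name, and the five-service blocklist subset are illustrative.

```python
import re

# Illustrative subset of the AWS name blocklist.
_AWS_RE = re.compile(r"\b(ec2|s3|iam|vpc|lambda)\b", re.IGNORECASE)

def gold_set_gate(results: list[dict], gold: list[dict]) -> bool:
    """Block a prompt change unless it meets the Phase B thresholds:
    >= 8/10 requirement-type matches, 10/10 outputs free of AWS names,
    and every escape-hatch case classified as pure process."""
    type_matches = sum(
        r["requirement_type"] == g["requirement_type"]
        for r, g in zip(results, gold)
    )
    no_leaks = all(
        not any(_AWS_RE.search(s) for s in r["candidate_subjects"])
        for r in results
    )
    hatches_ok = all(
        r["process_split"] == 1.0 and r["candidate_subjects"] == []
        for r, g in zip(results, gold) if g.get("is_escape_hatch_case")
    )
    return type_matches >= 8 and no_leaks and hatches_ok
```

Running the gate before every production run is what makes prompt changes safe to iterate on: a regression on any threshold blocks the change rather than silently shipping it.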