AI / Compliance Automation / DevOps · Apr 2026

AI Driven Issue Tracking and Analytics Pipeline

An eight stage AI pipeline. It maps FedRAMP 20x controls to a client's actual tech stack, finds the gaps against live Vanta test data, writes the remediation plan, and uploads the whole Epic, Task, and Subtask hierarchy to Jira.

Problem

FedRAMP 20x introduced a new control family structure (KSI, ADS, CCM) that existing compliance tooling doesn't handle. Doing it by hand means analyzing each control against a client's real infrastructure, cross-checking Vanta test coverage, finding gaps, writing remediation tickets, and uploading them to Jira. That is weeks of work per control family. Then you do it again for every new client.

Solution

Built an end to end pipeline in eight sequential notebook stages. Every stage emits CSV artifacts the next stage reads. That gives you a traceable chain from raw control data all the way to the Jira tickets. GPT-5 handles the heavy reasoning: gap analysis, remediation planning, root cause grouping. GPT-4.1 Mini handles structured extraction and formatting where deep thinking doesn't help. Cheap work stays cheap, smart work stays smart.

Impact

  • Ran 3 control families (KSI with 56 controls, ADS with 20, CCM with 3) through the full pipeline end to end
  • Generated 582 Jira tickets (Epics, Tasks, Subtasks) across every family with proper hierarchy and audit ready descriptions
  • Compressed the control to ticket lifecycle from weeks of manual analysis down to a single pipeline run per family
  • Every AI call is logged with the prompt sent and the response received. Full audit trail for compliance review.

Architecture

  1. Part 1 validates Rev5 objective questions and generates Component Examples. GPT-5, 5 parallel workers.
  2. Part 2 maps each KSI by Control by Part to the client's tech stack with GPT-5, then pulls structured CSV with GPT-4.1 Mini
  3. Part 3 hits the Vanta API for live tests, exports inventory, and maps each test to NIST controls with GPT-5
  4. Part 4 joins gap analysis, Vanta mapping, and inventory data into a master compliance scorecard with verdict logic
  5. Part 5 filters the scorecard for gaps, splits into existing tool and missing tool buckets, and generates custom Vanta test definitions
  6. Part 6 groups critical failures by root cause, generates CLI fixes, and produces an 8 to 10 theme Executive Remediation Roadmap
  7. Part 7 generates a Vanta UI configuration manifest that maps KSIs to test configurations
  8. Part 8 builds the Jira CSV hierarchy, rewrites descriptions for audit, scrubs to the AWS stack, splits into subtasks, and uploads through the REST API

Capabilities

  • Rev5 800-53 objective question validation and Component Example generation
  • KSI to client tech stack gap analysis with per control part granularity
  • Live Vanta test inventory pull and NIST control mapping
  • Automated compliance verdict logic (COMPLIANT, LIKELY COMPLIANT, CRITICAL GAP, STRATEGIC GAP, PARTIAL)
  • Custom Vanta test definition generation for uncovered controls
  • Root cause grouping with Executive Remediation Roadmap generation
  • Vanta UI configuration manifest for test to control mapping
  • Jira ticket hierarchy generation (Epic, Task, Subtask) with audit ready formatting
  • Family parameterization. Same pipeline runs for KSI, ADS, or CCM with one config change.
  • Crash safe execution with immediate CSV writes and automatic skip on rerun

Stack

Python, OpenAI GPT-5, OpenAI GPT-4.1 Mini, Vanta API (GraphQL), Jira REST API v3, AWS Secrets Manager, pandas, ThreadPoolExecutor, Jupyter Notebooks

Technical Deep Dive

Architecture internals and annotated code from the production system.

Architecture Overview

The pipeline is deliberately sequential. Each of the 8 parts produces CSV artifacts the next part consumes. That makes every intermediate state inspectable, and it means a failure at Part 6 doesn't force a rerun of Parts 1 through 5. The dual model strategy (GPT-5 for reasoning, GPT-4.1 Mini for extraction) lives at the prompt level. Inside a single notebook, the same row can hit GPT-5 first for analysis and GPT-4.1 Mini right after to extract structured fields from the response.

 Rev5 Objective Questions + KSI Golden Template (source data)
[Part 1] GPT-5 validates objectives and generates Component Examples
[Part 2] GPT-5 maps controls to the client tech stack, then GPT-4.1 Mini extracts structured CSV
[Part 3] Vanta API pulls live tests, GPT-5 maps those tests to NIST controls
[Part 4] Deterministic merge and verdict logic produces the Master Compliance Scorecard
[Part 5] GPT-5 generates custom Vanta test definitions and a strategic gap remediation plan
[Part 6] GPT-5 groups failures by root cause, output is the Executive Remediation Roadmap
[Part 7] GPT-5 generates the Vanta UI configuration manifest
[Part 8a-c] GPT-5 formats for audit, then the Jira REST API uploads the Epic, Task, and Subtask hierarchy

Key Architectural Decisions

01

Why GPT-5 for Reasoning, GPT-4.1 Mini for Extraction

GPT-5 handles anything that requires understanding compliance context. Mapping a KSI control to a client's actual infrastructure. Deciding whether a Vanta test covers a NIST requirement. Grouping failures by root cause. Writing the remediation plan. GPT-4.1 Mini handles the mechanical follow-up. Pulling structured CSV fields out of GPT-5's prose response. Reformatting descriptions for audit style. Scrubbing vendor references. The expensive model only runs where reasoning quality matters. The cheap one does the transformation.

02

Why 8 Parts Instead of One Monolithic Pipeline

Each part writes named CSV files. Those files are checkpoints and audit artifacts at the same time. If Part 6 needs a prompt tweak, you rerun Part 6. Parts 1 through 5 are frozen on disk. A compliance reviewer can also inspect the scorecard (Part 4) before tickets get created (Part 8). The sequential design mirrors how a human analyst would actually work. Understand the controls, find the gaps, plan the remediation, file the tickets.
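The checkpoint behavior can be sketched as a small runner that skips any stage whose output files already exist. This is a minimal illustration under stated assumptions, not the notebooks' actual code: `Stage`, `run_pipeline`, `make_writer`, and the file names are all hypothetical.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, List


@dataclass
class Stage:
    name: str
    outputs: List[str]              # CSV artifacts this stage writes
    run: Callable[[Path], None]     # writes its outputs into the work dir


def run_pipeline(stages: List[Stage], workdir: Path,
                 force: frozenset = frozenset()) -> List[str]:
    """Run stages in order, skipping any whose artifacts are already on disk."""
    executed = []
    for stage in stages:
        done = all((workdir / name).exists() for name in stage.outputs)
        if done and stage.name not in force:
            continue                # frozen on disk: reuse the checkpoint
        stage.run(workdir)
        executed.append(stage.name)
    return executed


def make_writer(filename: str) -> Callable[[Path], None]:
    """Hypothetical stage body that just writes a tiny CSV."""
    def write(workdir: Path) -> None:
        (workdir / filename).write_text("id,value\n1,ok\n")
    return write


stages = [
    Stage("part_4", ["scorecard.csv"], make_writer("scorecard.csv")),
    Stage("part_6", ["remediation.csv"], make_writer("remediation.csv")),
]

workdir = Path(tempfile.mkdtemp())
first_run = run_pipeline(stages, workdir)    # both stages run
rerun = run_pipeline(stages, workdir)        # everything is already on disk
forced = run_pipeline(stages, workdir, force=frozenset({"part_6"}))
```

Forcing `part_6` reruns only that stage, which is the "tweak a prompt, rerun one part" workflow in miniature.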

03

One Prompt Per Control Part, Not Combined

Part 2 sends one GPT-5 prompt per KSI by Control by Part combination. I don't batch multiple parts into a single prompt. That stops cross-contamination cold. The model's read on AC-2 Part (a) can't bleed into its read on AC-2 Part (b). Each response is independently cacheable and independently rerunnable.
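A rough sketch of the one-prompt-per-part idea follows. The row fields (`ksi`, `control`, `part`, `text`), the prompt wording, the placeholder KSI identifier, and the helper names are assumptions made for illustration. The point is only that each combination gets its own prompt and its own stable key.

```python
import hashlib
from typing import Dict, List


def build_prompts(rows: List[Dict[str, str]], tech_stack: str) -> Dict[str, str]:
    """Return one prompt per (KSI, control, part) row, keyed by a stable ID."""
    prompts = {}
    for row in rows:
        key = f"{row['ksi']}__{row['control']}__{row['part']}"
        prompts[key] = (
            f"KSI: {row['ksi']}\n"
            f"Control: {row['control']} Part ({row['part']})\n"
            f"Requirement: {row['text']}\n"
            f"Client tech stack: {tech_stack}\n"
            "Analyze whether the client's infrastructure satisfies this requirement."
        )
    return prompts


def cache_key(prompt: str) -> str:
    """Short hash of the prompt text, usable as a cache or file name."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:16]


# Placeholder rows; the KSI identifier and requirement text are made up.
rows = [
    {"ksi": "KSI-EXAMPLE-01", "control": "AC-2", "part": "a",
     "text": "Example requirement text A"},
    {"ksi": "KSI-EXAMPLE-01", "control": "AC-2", "part": "b",
     "text": "Example requirement text B"},
]
prompts = build_prompts(rows, tech_stack="AWS (example)")
```

Because Part (a) and Part (b) never share a prompt, each response can be cached and rerun on its own key.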

04

Family Parameterization

Every notebook has a family-config cell that sets FAMILY to 'KSI', 'ADS', or 'CCM'. Source data, output paths, and AI response directories all derive from that one variable. The same pipeline code handles every control family. Switching between them is a one-line config change plus a kernel restart.

05

Crash Safe Execution with Skip on Rerun

Every notebook writes results immediately, either as CSV appends or as individual output files. On rerun, completed rows are detected and skipped. So a notebook that died at row 47 of 200 resumes at row 48, not at row 1. Combine that with per-call prompt and response logging and every AI call is reproducible and auditable.

06

Verdict Logic as Deterministic Rules, Not AI

Part 4 merges the AI produced mappings with live Vanta test data and assigns a final verdict to every control part. That verdict logic is pure Python. No LLM call. The rules are explicit, reviewable, and deterministic. AI helps upstream. Once it's time to decide whether a control is covered, an LLM has no business making that call.

07

Every AI Call Logged with Prompt and Response

Every prompt sent and every response received gets written to disk with a timestamp and the row identifier. That's the audit trail. If a compliance reviewer questions a mapping, I can show them the exact prompt, the exact model output, and the deterministic logic that turned that output into a verdict. No black box.
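One possible shape for that logging is sketched below. The directory names `prompts/` and `output/` follow the artifact lists elsewhere on this page. The function signatures, file naming scheme, and timestamp format are assumptions, not the production helpers.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def _write(base_dir: Path, kind: str, row_id: str, text: str) -> Path:
    """Write one text file named after the row ID and a UTC timestamp."""
    folder = base_dir / kind
    folder.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    path = folder / f"{row_id}__{stamp}.txt"
    path.write_text(text, encoding="utf-8")
    return path


def save_prompt(base_dir: Path, row_id: str, prompt: str) -> Path:
    return _write(base_dir, "prompts", row_id, prompt)


def save_output(base_dir: Path, row_id: str, response: str) -> Path:
    return _write(base_dir, "output", row_id, response)


base = Path(tempfile.mkdtemp())
prompt_path = save_prompt(base, "AC-2_a", "Analyze AC-2 Part (a) ...")
output_path = save_output(base, "AC-2_a", "The control is satisfied because ...")
```

Keeping the row identifier in the file name means a reviewer can go from a scorecard row to its prompt and response with a directory listing.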

Code Showcase 1

Dual-Model Strategy. GPT-5 Analysis → GPT-4.1 Mini Extraction

Two pass execution pattern. GPT-5 does the reasoning in pass one, GPT-4.1 Mini extracts structured fields in pass two. Input to pass two is the full prose output of pass one. That's why pass two is cheap. It isn't thinking. It's parsing.

text
Pass 1 — GPT-5 (Reasoning)
─────────────────────────────────────────────────────────
Input:   KSI control + Rev5 objective + client tech stack
Prompt:  "Analyze whether the client's infrastructure
          satisfies this control requirement. Explain
          the match type, coverage, and any gaps."
Output:  Prose analysis (saved to ai_response/output/)
Cost:    ~$0.03-0.05 per control part
Why GPT-5: Requires understanding compliance semantics,
           client infrastructure context, and gap
           identification — not a pattern-matching task.


Pass 2 — GPT-4.1 Mini (Extraction)
─────────────────────────────────────────────────────────
Input:   GPT-5's prose response from Pass 1
Prompt:  "Extract these fields from the analysis:
          match_type, coverage_status, gap_description,
          recommended_action. Return as CSV row."
Output:  Structured CSV (gaps_structured.csv)
Cost:    ~$0.001-0.003 per control part
Why 4.1 Mini: The answer is already in the text — this
              is field extraction, not reasoning. 10-30x
              cheaper and 3-5x faster than GPT-5.
Property | Detail
Cost Optimization | GPT-4.1 Mini extraction is 10-30x cheaper per call than GPT-5. Applied to every row where reasoning is already complete
Speed Optimization | GPT-4.1 Mini responds 3-5x faster, reducing total pipeline runtime on extraction-heavy stages
Quality Boundary | GPT-5 makes every judgment call; GPT-4.1 Mini only touches text where the answer already exists in the prose
Audit Trail | Both prompts and responses are saved to disk. GPT-5 analysis in output/, GPT-4.1 Mini extraction in table_insert/
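The two passes can be sketched in code as follows. This is an illustration under assumptions: `complete` is a hypothetical callable standing in for whatever client wrapper the notebooks use, the model ID strings are assumed, and `fake_complete` exists only so the example runs without network access.

```python
import csv
import io
from typing import Callable, Dict, List

FIELDS = ["match_type", "coverage_status", "gap_description", "recommended_action"]


def two_pass(control_context: str,
             complete: Callable[[str, str], str]) -> Dict[str, str]:
    """Pass 1 reasons in prose; pass 2 extracts fields from that prose."""
    analysis = complete(
        "gpt-5",
        "Analyze whether the client's infrastructure satisfies this "
        f"control requirement.\n\n{control_context}",
    )
    csv_row = complete(
        "gpt-4.1-mini",
        "Extract these fields from the analysis: " + ", ".join(FIELDS)
        + ". Return as CSV row.\n\n" + analysis,
    )
    # Use the csv module so quoted commas inside a field survive parsing.
    values = next(csv.reader(io.StringIO(csv_row)))
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


calls: List[str] = []


def fake_complete(model: str, prompt: str) -> str:
    """Offline stand-in for the model call; records which model was asked."""
    calls.append(model)
    if model == "gpt-5":
        return "The control is directly satisfied by existing IAM policies."
    return 'Direct,Covered,"None identified, pending review",No action'


result = two_pass("AC-2 Part (a) + client tech stack", fake_complete)
```

Validating the field count before building the row keeps a malformed extraction from silently shifting columns in the structured CSV.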

Code Showcase 2

Verdict Logic. Deterministic Classification

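Verdict assignment logic from final_merge.ipynb. No AI involved. Pure Python. Two signals drive it. match_type is GPT-5's Part 2 judgment on whether the client's stack satisfies the control part. The Vanta side, has_vanta_test and vanta_status, says whether a live test maps to the control part and whether that test is passing. A Direct match with a passing test is COMPLIANT. A mapped test that is failing is a CRITICAL GAP. No match and no test is a STRATEGIC GAP. A Direct match with no test is LIKELY COMPLIANT. Everything else falls through to PARTIAL / VERIFICATION REQUIRED.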

python
# Verdict assignment logic (simplified from final_merge.ipynb)
# No AI involved — purely deterministic rules

def assign_verdict(row):
    ai_match = row['match_type']        # from Part 2 (GPT-5)
    vanta_status = row['vanta_status']   # from Part 3 (Vanta API)
    has_vanta = row['has_vanta_test']    # from Part 3 merge

    if ai_match == 'Direct' and vanta_status == 'PASSING':
        return 'COMPLIANT'

    if has_vanta and vanta_status == 'FAILING':
        return 'CRITICAL GAP (Operational Failure)'

    if ai_match == 'No Match' and not has_vanta:
        return 'STRATEGIC GAP (Missing Tool/Policy)'

    if ai_match == 'Direct' and not has_vanta:
        return 'LIKELY COMPLIANT (No Vanta Test)'

    return 'PARTIAL / VERIFICATION REQUIRED'
Property | Detail
No AI | Verdict logic is deterministic if/else. The most consequential classification in the pipeline is fully auditable code
Two Inputs | AI match type (from GPT-5 gap analysis) + Vanta test status (from live API). Combines AI judgment with ground truth
Five Verdicts | COMPLIANT, CRITICAL GAP, STRATEGIC GAP, LIKELY COMPLIANT, PARTIAL. Each maps to a different remediation path
Auditor Friendly | A compliance reviewer can trace any verdict to its two input signals without understanding the AI that produced them
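The rules can be exercised with a handful of hand-written rows, one per branch. `assign_verdict` is repeated from the showcase so the block runs on its own. The sample rows are invented, and the `'Partial'` match type is an assumed value used only to reach the fall-through branch.

```python
from collections import Counter


def assign_verdict(row):
    ai_match = row['match_type']
    vanta_status = row['vanta_status']
    has_vanta = row['has_vanta_test']

    if ai_match == 'Direct' and vanta_status == 'PASSING':
        return 'COMPLIANT'
    if has_vanta and vanta_status == 'FAILING':
        return 'CRITICAL GAP (Operational Failure)'
    if ai_match == 'No Match' and not has_vanta:
        return 'STRATEGIC GAP (Missing Tool/Policy)'
    if ai_match == 'Direct' and not has_vanta:
        return 'LIKELY COMPLIANT (No Vanta Test)'
    return 'PARTIAL / VERIFICATION REQUIRED'


# One invented row per rule, in the same order as the rules.
rows = [
    {'match_type': 'Direct',   'vanta_status': 'PASSING', 'has_vanta_test': True},
    {'match_type': 'Direct',   'vanta_status': 'FAILING', 'has_vanta_test': True},
    {'match_type': 'No Match', 'vanta_status': None,      'has_vanta_test': False},
    {'match_type': 'Direct',   'vanta_status': None,      'has_vanta_test': False},
    {'match_type': 'Partial',  'vanta_status': 'PASSING', 'has_vanta_test': True},
]

verdicts = [assign_verdict(row) for row in rows]
summary = Counter(verdicts)
```

The second row shows that rule order matters: a Direct match with a failing test is still a CRITICAL GAP, because the failing-test rule fires before the rules that look at the match type alone.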

Code Showcase 3

Crash-Safe Batch Execution Pattern

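Crash-safe batch loop used in every AI-calling notebook. On startup it reads the output CSV and skips every requirement_id already recorded there. Each row's prompt is saved before the API call, the response is saved as soon as it returns, and the parsed result is appended to the CSV immediately. Five workers run in parallel, so a crash loses at most the calls that were in flight.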

python
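# Pattern used in every AI-calling notebook
# (df, output_csv, and the helper functions are defined earlier in the notebook)
import os
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed

import pandas as pd

# 1. Load existing results (skip completed work)
if os.path.exists(output_csv):
    done = pd.read_csv(output_csv)
    completed_ids = set(done['requirement_id'])
else:
    completed_ids = set()

# 2. Filter to remaining work
remaining = df[~df['requirement_id'].isin(completed_ids)]
print(f"{len(completed_ids)} already done, "
      f"{len(remaining)} remaining")

# Serialize CSV appends so parallel workers cannot interleave rows
csv_lock = threading.Lock()

# 3. Process with immediate persistence
def process_row(row):
    prompt = build_prompt(row)

    # Save prompt to disk BEFORE calling API
    save_prompt(prompt, row['requirement_id'])

    # call_gpt5 is a stand-in for the OpenAI client call,
    # whose arguments are elided in this excerpt
    response = call_gpt5(prompt)

    # Save response to disk IMMEDIATELY after API returns
    save_output(response, row['requirement_id'])

    # Append to CSV immediately (not batched)
    with csv_lock:
        append_to_csv(output_csv, parse_response(response))

    return response

# 4. Parallel execution with 5 workers
with ThreadPoolExecutor(max_workers=5) as executor:
    futures = {
        executor.submit(process_row, row): row['requirement_id']
        for _, row in remaining.iterrows()
    }
    # Surface per-row failures; a failed row is picked up again on the next run
    for future in as_completed(futures):
        try:
            future.result()
        except Exception as exc:
            print(f"{futures[future]} failed: {exc}")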
Property | Detail
Resume from Failure | Completed rows detected on re-run via CSV check. Interrupted at row 47 resumes from row 48
Immediate Persistence | Each result written to CSV immediately after API response. No in-memory batching that could be lost
Full Audit Trail | Every prompt and response saved as individual files. Reproducible and auditable per row
5 Parallel Workers | ThreadPoolExecutor with 5 workers on all GPT-5 calls. Balances throughput against API rate limits
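The pattern relies on an append helper that is safe to call from several workers at once. A possible implementation is sketched below using only the standard library. The lock, the header handling, and `load_completed_ids` are assumptions about how such a helper could look, not the production code.

```python
import csv
import os
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor

_csv_lock = threading.Lock()


def append_to_csv(path: str, row: dict) -> None:
    """Append one row, writing the header first if the file is new."""
    with _csv_lock:  # one writer at a time, so rows never interleave
        is_new = not os.path.exists(path) or os.path.getsize(path) == 0
        with open(path, "a", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(row.keys()))
            if is_new:
                writer.writeheader()
            writer.writerow(row)
            handle.flush()
            os.fsync(handle.fileno())  # push the row to disk before returning


def load_completed_ids(path: str, id_column: str = "requirement_id") -> set:
    """IDs already present in the CSV, used to skip work on rerun."""
    if not os.path.exists(path):
        return set()
    with open(path, newline="", encoding="utf-8") as handle:
        return {record[id_column] for record in csv.DictReader(handle)}


path = os.path.join(tempfile.mkdtemp(), "results.csv")

with ThreadPoolExecutor(max_workers=5) as pool:
    list(pool.map(
        lambda i: append_to_csv(path, {"requirement_id": f"REQ-{i}", "verdict": "ok"}),
        range(20),
    ))

completed = load_completed_ids(path)
```

With parallel workers, rows land in completion order rather than input order, so the skip check works on the set of IDs rather than on a row position.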

Code Showcase 4

Family Parameterization. One Pipeline, Three Control Families

Family config cell. One variable, FAMILY, drives the source path, the output directory, and the Jira labels in every notebook. Switching from KSI to ADS to CCM is a one-line change plus a kernel restart. The pipeline code is identical across all three families.

python
# family-config cell (present in every notebook)
FAMILY = "KSI"  # or "ADS" or "CCM"

# All paths derived from FAMILY:
SOURCE = f"FILES/REFERENCE/fedramp/20x/TRUE_SOURCE_FILES/{FAMILY}/{FAMILY}_Golden_Template.csv"
OUTPUT = f"code/testing/ai_response/{FAMILY}/"
LABELS = [f"family:{FAMILY.lower()}"]  # family:ksi, family:ads, family:ccm

# Per-family output after full pipeline:
# ┌──────────┬──────────┬───────────────────────────┐
# │ Family   │ Controls │ Final Tickets             │
# ├──────────┼──────────┼───────────────────────────┤
# │ KSI      │ 56       │ 127 (13 Epic + 114 Task)  │
# │ ADS      │ 20       │ 31  (3 Epic + 7 Task +    │
# │          │          │      21 Sub-task)          │
# │ CCM      │ 3        │ 83  (3 Epic + 20 Task +   │
# │          │          │      60 Sub-task)          │
# └──────────┴──────────┴───────────────────────────┘
Property | Detail
Single Config Variable | FAMILY = 'KSI' at the top of every notebook. All paths, labels, and output dirs are derived from it
Zero Code Changes | Switching from KSI to ADS or CCM is a one-line edit + kernel restart. No pipeline code modifications
Scale Difference | KSI produces 127 tickets from 56 controls; CCM produces 83 from just 3 controls. The pipeline handles both scales
Label Tagging | family:ksi / family:ads / family:ccm labels auto-generated for Jira filtering and bulk operations

Code Showcase 5

Jira Upload. Auto-Detecting Instance Configuration

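Jira upload pass. Jira instances differ in the subtask issue type name, the project style, and the Epic Link custom field ID, so the uploader detects all three from the REST API before creating anything. It then creates Epics, Tasks, and Sub-tasks in that order so every parent exists before its children. Error handling retries transient failures and records hard failures for manual review.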

python
# Auto-detect subtask issue type name
# (Sunstone uses 'Subtask', SearchStax uses 'Sub-task')
issue_types = jira.get(f"/rest/api/3/issue/createmeta/...")
subtask_name = next(
    t['name'] for t in issue_types
    if t['name'].lower().replace('-', '') == 'subtask'
)

# Auto-detect project style
# (next-gen = team-managed, classic = company-managed)
project = jira.get(f"/rest/api/3/project/{PROJECT_KEY}")
project_style = project.get('style', 'classic')

# Auto-detect Epic Link custom field ID
# (varies per instance: customfield_10014, customfield_10600)
epic_link_field = None  # stays None for next-gen projects
if project_style == 'classic':
    fields = jira.get("/rest/api/3/field")
    epic_link_field = next(
        f['id'] for f in fields
        if f['name'] == 'Epic Link'
    )

# Upload phases with proper hierarchy:
# Phase 1: Create Epics (root cause themes)
# Phase 2: Create Tasks linked to parent Epics
# Phase 3: Create Sub-tasks linked to parent Tasks
Property | Detail
Subtask Name Detection | Handles 'Subtask' vs 'Sub-task' naming. A common Jira compatibility issue that causes silent upload failures
Project Style Detection | next-gen (team-managed) vs classic (company-managed) determines how Epic linking works
Epic Link Field Detection | Custom field ID for Epic Link varies per instance. Auto-detected from the field metadata API
Three-Phase Upload | Epics first, then Tasks with Epic linkage, then Sub-tasks with Task linkage. Order enforced for parent ID resolution
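The three-phase ordering can be sketched independently of Jira itself. In the illustration below, `create_issue` is a hypothetical callable that would wrap the REST call and return the new issue key. The row fields (`issue_id`, `parent_id`, `issue_type`, `summary`) are assumed names for the Issue ID / Parent ID columns, and `fake_create_issue` only simulates key assignment.

```python
from typing import Callable, Dict, List, Optional, Tuple

PHASES = ["Epic", "Task", "Sub-task"]


def upload_in_phases(
    rows: List[Dict[str, str]],
    create_issue: Callable[[str, str, Optional[str]], str],
) -> Dict[str, str]:
    """Create parents before children; return local issue ID -> Jira key."""
    created: Dict[str, str] = {}
    for issue_type in PHASES:
        for row in rows:
            if row["issue_type"] != issue_type:
                continue
            parent_key = None
            if row["parent_id"]:
                # A KeyError here means the parent row was never created.
                parent_key = created[row["parent_id"]]
            created[row["issue_id"]] = create_issue(
                row["summary"], issue_type, parent_key
            )
    return created


calls: List[Tuple[str, str, Optional[str]]] = []


def fake_create_issue(summary: str, issue_type: str,
                      parent_key: Optional[str]) -> str:
    """Offline stand-in that hands out sequential keys."""
    calls.append((summary, issue_type, parent_key))
    return f"PROJ-{len(calls)}"


# Rows deliberately listed children-first to show the reordering.
rows = [
    {"issue_id": "3", "parent_id": "2", "issue_type": "Sub-task", "summary": "Rotate key"},
    {"issue_id": "2", "parent_id": "1", "issue_type": "Task", "summary": "Fix IAM"},
    {"issue_id": "1", "parent_id": "", "issue_type": "Epic", "summary": "Access control"},
]
created = upload_in_phases(rows, fake_create_issue)
```

The local-to-remote key map is what lets each later phase attach a child to a parent that only received its Jira key a moment earlier.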

Data Lifecycle

End-to-end flow of a control family through the pipeline. Each stage below lists the artifacts it writes to disk, mostly CSV files plus the per-call prompt and response directories, and the next stage reads them.

L1  Data Preparation. Objective Validation + Component Examples
    | rev5_objective_questions.csv (updated in place), ai_response/sandbox_rev5/prompts/ + output/
L2  Gap Analysis. Control-to-Tech-Stack Mapping
    | gaps_results.csv, gaps_structured.csv, prompts/, output/, table_insert/
L3  Vanta Integration. Live Test Inventory + NIST Mapping
    | vanta_tests_inventory.csv, vanta_to_nist_mapping.csv, map_controls/prompts/ + output/
L4  Merge + Scorecard. Deterministic Verdict Logic
    | final_compliance_scorecard.csv
L5  Custom Tests + Strategic Gaps
    | PROMPT_1_CUSTOM_TESTS.csv, PROMPT_2_STRATEGIC_GAPS.csv, payload_custom_tests_LOSSLESS.csv
L6  Remediation Master Guide. Root Cause Grouping
    | payload_remediation.csv, remediation_master_guide.csv, executive_remediation_roadmap.json
L7  Vanta UI Mapping. Configuration Manifest
    | payload_ui_mapping.csv, vanta_ui_manifest.csv
L8  Jira Ticket Pipeline. Build, Format, Upload
    | jira_remediation_import_v2.csv through v5_subtasks.csv, jira_upload_log.csv
Stage 1

Data Preparation. Objective Validation + Component Examples

GPT-5 reads each Rev5 objective question and validates it against the source schema. If it passes, GPT-5 then generates a Component Example showing what compliance looks like in practice. Five parallel workers handle the batch. Results are written back into rev5_objective_questions.csv, which downstream parts consume.

Input: rev5_objective_questions.csv, KSI_Golden_Template.csv
Processing: GPT-5 validates each objective question, generates Component Examples showing what compliance looks like in practice. ThreadPoolExecutor with 5 workers for parallel processing.
Output: rev5_objective_questions.csv (updated in place), ai_response/sandbox_rev5/prompts/ + output/
Model: GPT-5 (5 parallel workers)
Key File: part_1_prompt_sandbox/prompt_sandbox.ipynb
Stage 2

Gap Analysis. Control-to-Tech-Stack Mapping

For every KSI by Control by Part combination, GPT-5 reads the client's tech stack and maps which components satisfy the control. GPT-4.1 Mini runs right after to pull the structured fields out of GPT-5's prose response. Two passes. One for reasoning, one for extraction.

Input: KSI_Golden_Template.csv, rev5_objective_questions.csv, client tech stack context
Processing: Pass 1: GPT-5 maps each control part to client infrastructure (one prompt per part, not combined). Pass 2: GPT-4.1 Mini extracts structured CSV fields from GPT-5's prose responses.
Output: gaps_results.csv, gaps_structured.csv, prompts/, output/, table_insert/
Model: GPT-5 (analysis) → GPT-4.1 Mini (extraction)
Key File: part_2_fedramp_gaps_analysis/fedramp_20x_gaps_analysis.ipynb
Stage 3

Vanta Integration. Live Test Inventory + NIST Mapping

The Vanta GraphQL client paginates the full test inventory. GPT-5 then maps each test to the NIST controls it actually covers. Output is the test coverage dataset the merge stage needs.

Input: Vanta API (SearchStax instance), NIST control metadata
Processing: Vanta API pull (full test inventory) → CSV export → GPT-5 maps each test to NIST controls
Output: vanta_tests_inventory.csv, vanta_to_nist_mapping.csv, map_controls/prompts/ + output/
Model: GPT-5
Key File: part_3_ksi_vanta_merge/ksi_vanta_merge.ipynb
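Paginating a full inventory usually comes down to a cursor loop like the sketch below. The response shape (`nodes`, `pageInfo.hasNextPage`, `pageInfo.endCursor`) follows the common GraphQL cursor convention and is an assumption here, not a confirmed description of the Vanta API. `fetch_page` is a hypothetical callable that would wrap the real request, and `fake_fetch_page` simulates it offline.

```python
from typing import Callable, Dict, List, Optional


def fetch_all(fetch_page: Callable[[Optional[str]], Dict]) -> List[Dict]:
    """Follow cursors until the server reports no further pages."""
    items: List[Dict] = []
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        items.extend(page["nodes"])
        info = page["pageInfo"]
        if not info["hasNextPage"]:
            return items
        cursor = info["endCursor"]


# Offline stand-in: five fake tests served two per page.
DATA = [{"id": f"test-{i}"} for i in range(5)]
cursors_seen: List[Optional[str]] = []


def fake_fetch_page(cursor: Optional[str]) -> Dict:
    cursors_seen.append(cursor)
    start = 0 if cursor is None else int(cursor)
    chunk = DATA[start:start + 2]
    end = start + len(chunk)
    return {
        "nodes": chunk,
        "pageInfo": {"hasNextPage": end < len(DATA), "endCursor": str(end)},
    }


all_tests = fetch_all(fake_fetch_page)
```

Injecting the page fetcher keeps the loop testable without credentials and leaves authentication and query text to the caller.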
Stage 4

Merge + Scorecard. Deterministic Verdict Logic

Deterministic merge. The AI-generated mappings join with the live Vanta coverage. Verdict rules run in pure Python. Every row gets one of five verdicts: COMPLIANT, LIKELY COMPLIANT, CRITICAL GAP, STRATEGIC GAP, or PARTIAL / VERIFICATION REQUIRED. No model involved at the decision layer.

Input: gaps_structured.csv, vanta_to_nist_mapping.csv, vanta_tests_inventory.csv
Processing: pandas merge on control identifiers → deterministic verdict assignment based on AI match type + Vanta test status
Output: final_compliance_scorecard.csv
Model: None. Deterministic logic only
Key File: part_4_final_merge/final_merge.ipynb
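The join feeding the verdict rules can be illustrated with plain dictionaries. This is a standard-library stand-in for the pandas merge, written so it runs anywhere. The column names (`control_part`, `test_id`, `status`) and the rule that any failing test makes the control part FAILING are assumptions for illustration, not the notebook's actual schema or aggregation.

```python
from typing import Dict, List, Optional


def merge_scorecard(gaps: List[Dict], mappings: List[Dict],
                    inventory: List[Dict]) -> List[Dict]:
    """Left-join gap rows to Vanta tests and derive the two Vanta fields."""
    status_by_test = {t["test_id"]: t["status"] for t in inventory}
    tests_by_control: Dict[str, List[str]] = {}
    for mapping in mappings:
        tests_by_control.setdefault(mapping["control_part"], []).append(mapping["test_id"])

    merged = []
    for gap in gaps:
        statuses = [
            status_by_test[test_id]
            for test_id in tests_by_control.get(gap["control_part"], [])
            if test_id in status_by_test
        ]
        vanta_status: Optional[str]
        if not statuses:
            vanta_status = None
        elif "FAILING" in statuses:      # assumed rule: one failure fails the part
            vanta_status = "FAILING"
        elif all(s == "PASSING" for s in statuses):
            vanta_status = "PASSING"
        else:
            vanta_status = "UNKNOWN"
        merged.append({**gap, "has_vanta_test": bool(statuses),
                       "vanta_status": vanta_status})
    return merged


gaps = [
    {"control_part": "AC-2(a)", "match_type": "Direct"},
    {"control_part": "AC-2(b)", "match_type": "No Match"},
    {"control_part": "AU-2(a)", "match_type": "Direct"},
]
mappings = [
    {"test_id": "t1", "control_part": "AC-2(a)"},
    {"test_id": "t2", "control_part": "AU-2(a)"},
    {"test_id": "t3", "control_part": "AU-2(a)"},
]
inventory = [
    {"test_id": "t1", "status": "PASSING"},
    {"test_id": "t2", "status": "PASSING"},
    {"test_id": "t3", "status": "FAILING"},
]
scorecard = merge_scorecard(gaps, mappings, inventory)
```

Every gap row survives the join, including ones with no Vanta test, which is what allows the verdict step to distinguish a missing test from a failing one.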
Stage 5

Custom Tests + Strategic Gaps

For confirmed gaps, GPT-5 writes the custom Vanta test definition and the strategic remediation plan. Output is a pair of artifacts per gap. A test spec Vanta can ingest, and a remediation plan engineering can execute.

Input: final_compliance_scorecard.csv
Processing: Filter for gaps → split by tool availability → GPT-5 generates custom Vanta test definitions (PROMPT_1) and strategic gap remediation plans (PROMPT_2)
Output: PROMPT_1_CUSTOM_TESTS.csv, PROMPT_2_STRATEGIC_GAPS.csv, payload_custom_tests_LOSSLESS.csv
Model: GPT-5
Key File: part_5_custom_tests/custom_tests.ipynb
Stage 6

Remediation Master Guide. Root Cause Grouping

GPT-5 groups failures by root cause and produces an executive roadmap. Thirty gaps shouldn't show up as 30 separate tickets when they're actually three root causes with ten symptoms each. The roadmap collapses them.

Input: final_compliance_scorecard.csv (critical gaps), vanta_tests_inventory.csv (failure details)
Processing: Build payload_remediation.csv → batch ~30 failures to GPT-5 for root cause grouping + CLI fixes → final master analysis for Executive Roadmap
Output: payload_remediation.csv, remediation_master_guide.csv, executive_remediation_roadmap.json
Model: GPT-5
Key File: part_6_payload_remediation/payload_remediation.ipynb
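Batching roughly 30 failures per prompt is a simple chunking step. The helper below is a generic sketch with a hypothetical name (`batched`) and invented failure IDs. The batch size of 30 comes from the processing description above.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most `size` items, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


failures = [f"failure-{i}" for i in range(70)]
batches = list(batched(failures, 30))
batch_sizes = [len(batch) for batch in batches]
```

The last batch is simply smaller, so no failure is dropped or padded when the total is not a multiple of the batch size.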
Stage 7

Vanta UI Mapping. Configuration Manifest

GPT-5 generates the Vanta UI configuration manifest. What gets created in the Vanta UI, what gets linked where, what labels and descriptions go on what tests. Output is the change spec someone can execute by hand or via the Vanta API.

Input: final_compliance_scorecard.csv (rows with Vanta coverage)
Processing: Build payload_ui_mapping.csv → batch ~20 KSIs to GPT-5 for Vanta UI configuration generation
Output: payload_ui_mapping.csv, vanta_ui_manifest.csv
Model: GPT-5
Key File: part_7_vanta_ui_mapping/vanta_tests_control_ui_mapping.ipynb
Stage 8

Jira Ticket Pipeline. Build, Format, Upload

Part 8 runs in three subparts. 8a builds the Epic and Task hierarchy as a CSV. 8b has GPT-5 rewrite every item for audit readability, scrub it to the AWS stack, and split Tasks into Subtasks. 8c uploads the result through the Jira REST API. 582 tickets across all three control families.

Input: remediation_master_guide.csv, executive_remediation_roadmap.json
Processing: 8a: Build v2 CSV (Epics + Tasks with Issue ID / Parent ID hierarchy) → 8b: GPT-5 rewrites to audit format (v3), scrubs to AWS stack (v4), splits into subtasks (v5), generates labels → 8c: Upload to Jira REST API v3 (Phase 1: Epics → Phase 2: Tasks → Phase 3: Sub-tasks)
Output: jira_remediation_import_v2.csv through v5_subtasks.csv, jira_upload_log.csv
Model: GPT-5 (audit formatting + subtask splitting)
Key File: part_8_jira_pipeline/