AI / Compliance Automation / DevOps · Apr 2026

AI-Driven Issue Tracking & Analytics Pipeline

An 8-part AI pipeline that maps FedRAMP 20x controls to a client's technology stack, identifies compliance gaps against live Vanta test data, generates remediation plans, and uploads a fully structured Epic → Task → Sub-task hierarchy to Jira.

Problem

FedRAMP 20x introduced a new control family structure (KSI, ADS, CCM) that doesn't map neatly to existing compliance tooling. Manually analyzing each control against a client's actual infrastructure, cross-referencing Vanta test coverage, identifying gaps, writing remediation tickets, and uploading them to Jira is a multi-week effort per control family — and it has to be repeated for every client engagement.

Solution

Built an end-to-end pipeline split into 8 sequential notebook stages. Each stage produces CSV artifacts that the next stage consumes, creating a traceable chain from raw control data to uploaded Jira tickets. GPT-5 handles all reasoning-intensive work (gap analysis, remediation planning, root cause grouping), while GPT-4.1 Mini handles structured extraction and formatting passes — optimizing for cost and speed where deep reasoning isn't needed.

Impact

  • Processed 3 control families (KSI: 56 controls, ADS: 20 controls, CCM: 3 controls) through the full pipeline end-to-end
  • Generated 582 Jira tickets (Epics + Tasks + Sub-tasks) across all families with proper hierarchy and audit-ready descriptions
  • Reduced the control-to-ticket lifecycle from weeks of manual analysis to a single pipeline run per family
  • Every AI call logged with prompt sent and response received — full audit trail for compliance review

Architecture

  1. Part 1 validates Rev5 objective questions and generates Component Examples using GPT-5 with 5 parallel workers
  2. Part 2 maps each KSI × Control × Part to the client's tech stack via GPT-5, then extracts structured CSV via GPT-4.1 Mini
  3. Part 3 pulls live Vanta tests via API, exports the inventory, and maps each test to NIST controls using GPT-5
  4. Part 4 joins gap analysis + Vanta mapping + inventory data into a master compliance scorecard with verdict logic
  5. Part 5 filters the scorecard for gaps, splits them into existing-tool and missing-tool categories, and generates custom Vanta test definitions
  6. Part 6 groups critical gap failures by root cause, generates CLI fixes, and produces an Executive Remediation Roadmap with 8-10 themes
  7. Part 7 generates a Vanta UI configuration manifest mapping KSIs to test configurations
  8. Part 8 builds the Jira CSV hierarchy, rewrites descriptions to audit format, scrubs to the AWS stack, splits Tasks into subtasks, and uploads via REST API

Capabilities

  • Rev5 800-53 objective question validation and Component Example generation
  • KSI-to-client-tech-stack gap analysis with per-control-part granularity
  • Live Vanta test inventory pull and NIST control mapping
  • Automated compliance verdict logic (COMPLIANT / CRITICAL GAP / STRATEGIC GAP / LIKELY COMPLIANT / PARTIAL)
  • Custom Vanta test definition generation for uncovered controls
  • Root cause grouping with Executive Remediation Roadmap generation
  • Vanta UI configuration manifest for test-to-control mapping
  • Jira ticket hierarchy generation (Epic → Task → Sub-task) with audit-ready formatting
  • Family parameterization — same pipeline runs for KSI, ADS, or CCM with a single config change
  • Crash-safe execution with immediate CSV writes and automatic skip-on-rerun

Stack

Python · OpenAI GPT-5 · OpenAI GPT-4.1 Mini · Vanta API (GraphQL) · Jira REST API v3 · AWS Secrets Manager · pandas · ThreadPoolExecutor · Jupyter Notebooks

Technical Deep Dive

Architecture internals and annotated code from the production system.

Architecture Overview

The pipeline is deliberately sequential — each of the 8 parts produces CSV artifacts that the next part consumes. This makes every intermediate state inspectable and debuggable, and means a failure at Part 6 doesn't require re-running Parts 1-5. The dual-model strategy (GPT-5 for reasoning, GPT-4.1 Mini for extraction) is applied at the prompt level: within a single notebook, the same row may be processed by GPT-5 first for analysis, then by GPT-4.1 Mini to extract structured fields from the response.

 Rev5 Objective Questions + KSI Golden Template (source data)
[Part 1] GPT-5 validates objectives + generates Component Examples
[Part 2] GPT-5 maps controls to client tech stack → GPT-4.1 Mini extracts structured CSV
[Part 3] Vanta API pulls live tests → GPT-5 maps tests to NIST controls
[Part 4] Deterministic merge + verdict logic → Master Compliance Scorecard
[Part 5] GPT-5 generates custom Vanta test definitions + strategic gap remediation
[Part 6] GPT-5 root cause grouping → Executive Remediation Roadmap
[Part 7] GPT-5 generates Vanta UI configuration manifest
[Part 8a-c] GPT-5 audit formatting → Jira REST API upload (Epic → Task → Sub-task)

Key Architectural Decisions

01

Why GPT-5 for Reasoning, GPT-4.1 Mini for Extraction

GPT-5 handles every task that requires understanding compliance context: mapping a KSI control to a client's specific infrastructure, deciding whether a Vanta test covers a NIST requirement, grouping failures by root cause, and writing remediation plans. GPT-4.1 Mini handles the mechanical follow-up: extracting structured CSV fields from GPT-5's prose response, reformatting ticket descriptions to audit style, and scrubbing vendor-specific references. This split means the expensive model only runs where reasoning quality directly affects output correctness, while the cheaper, faster model handles transformation tasks where the answer is already in the text.

02

Why 8 Separate Parts Instead of One Monolithic Pipeline

Each part produces named CSV files that serve as both checkpoints and audit artifacts. If Part 6 (remediation grouping) needs a prompt tweak, you re-run Part 6 only — Parts 1-5 outputs are stable on disk. This also means a compliance reviewer can inspect the intermediate scorecard (Part 4) before the pipeline generates tickets (Part 8). The sequential design mirrors how a human analyst would work: first understand the controls, then find gaps, then plan remediation, then create tickets.

03

One Mini Prompt Per Control Part, Not Combined

Part 2 sends one GPT-5 prompt per KSI × Control × Part combination rather than batching multiple parts into a single prompt. This prevents cross-contamination — the model's analysis of AC-2 Part (a) doesn't bleed into its analysis of AC-2 Part (b). It also means each response is independently cacheable and re-runnable.
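
A minimal sketch of that isolation, assuming one response file per combination as the cache key (directory layout and helper names are illustrative, not the notebook's actual code):

python
# Illustrative sketch: one prompt per KSI × Control × Part, cached
# under its own file so each part is independently re-runnable.
import os

RESPONSE_DIR = "ai_response/KSI/output"  # assumed layout

def part_response_path(ksi_id: str, control: str, part: str) -> str:
    # e.g. ai_response/KSI/output/KSI-IAM_AC-2_a.txt (hypothetical naming)
    return os.path.join(RESPONSE_DIR, f"{ksi_id}_{control}_{part}.txt")

def analyze_part(ksi_id, control, part, build_prompt, call_gpt5):
    path = part_response_path(ksi_id, control, part)
    if os.path.exists(path):                # cached: skip the API call
        with open(path) as f:
            return f.read()
    prompt = build_prompt(ksi_id, control, part)  # context for ONE part only
    response = call_gpt5(prompt)                  # no sibling parts in scope
    with open(path, "w") as f:                    # persist immediately
        f.write(response)
    return response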

04

Family Parameterization

Every notebook has a family-config cell with FAMILY = 'KSI' (or 'ADS' or 'CCM'). Source data, output paths, and AI response directories are all derived from this single variable. The same pipeline code processes all three control families — switching families is a one-line config change plus a kernel restart.

05

Crash-Safe Execution with Skip-on-Rerun

Every notebook writes results immediately (CSV append or individual output files). On re-run, completed rows are detected and skipped automatically. This means a notebook interrupted at row 47 of 200 resumes from row 48, not row 1. Combined with prompt/response logging, every AI call is reproducible and auditable.

06

Verdict Logic as Deterministic Rules, Not AI

Part 4's verdict assignment (COMPLIANT, CRITICAL GAP, STRATEGIC GAP, LIKELY COMPLIANT, PARTIAL) is pure deterministic logic applied to the merged scorecard — not an AI judgment call. If the AI says 'Direct match' and the Vanta test is passing, the verdict is COMPLIANT. If the Vanta test is actively failing, it's CRITICAL GAP. This keeps the most consequential classification in auditable, rule-based code.

07

Auto-Detecting Jira Instance Configuration

Part 8c auto-detects the subtask issue type name (Subtask vs. Sub-task), project style (next-gen vs. classic), and Epic Link custom field ID — all of which vary between Jira instances. This means the same upload code works against both the Sunstone test environment and SearchStax production without manual configuration.

Code Showcase 1

Dual-Model Strategy — GPT-5 Analysis → GPT-4.1 Mini Extraction

Part 2 demonstrates the core dual-model pattern. GPT-5 receives the full control context and produces a detailed prose analysis of whether the client's infrastructure covers the requirement. GPT-4.1 Mini then receives GPT-5's response and extracts structured fields into CSV columns. This pattern repeats across the pipeline wherever an analysis step produces prose that needs to become structured data.

text
Pass 1 — GPT-5 (Reasoning)
─────────────────────────────────────────────────────────
Input:   KSI control + Rev5 objective + client tech stack
Prompt:  "Analyze whether the client's infrastructure
          satisfies this control requirement. Explain
          the match type, coverage, and any gaps."
Output:  Prose analysis (saved to ai_response/output/)
Cost:    ~$0.03-0.05 per control part
Why GPT-5: Requires understanding compliance semantics,
           client infrastructure context, and gap
           identification — not a pattern-matching task.


Pass 2 — GPT-4.1 Mini (Extraction)
─────────────────────────────────────────────────────────
Input:   GPT-5's prose response from Pass 1
Prompt:  "Extract these fields from the analysis:
          match_type, coverage_status, gap_description,
          recommended_action. Return as CSV row."
Output:  Structured CSV (gaps_structured.csv)
Cost:    ~$0.001-0.003 per control part
Why 4.1 Mini: The answer is already in the text — this
              is field extraction, not reasoning. 10-30x
              cheaper and 3-5x faster than GPT-5.
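
Condensed to Python, the two passes look roughly like this. The model identifiers and the OpenAI chat-completions call are real; the prompt wording, client setup, and helper signature are simplified for illustration:

python
# Minimal sketch of the dual-model pass. Prompt text is abridged; the
# real notebooks save every prompt and response to disk (see Showcase 3).
from openai import OpenAI

client = OpenAI()  # key loaded from env / AWS Secrets Manager in practice

def analyze_then_extract(control_context: str, tech_stack: str) -> str:
    # Pass 1 - GPT-5: reasoning over compliance semantics
    analysis = client.chat.completions.create(
        model="gpt-5",
        messages=[{"role": "user", "content":
            "Analyze whether the client's infrastructure satisfies this "
            f"control. Control: {control_context}\nStack: {tech_stack}"}],
    ).choices[0].message.content

    # Pass 2 - GPT-4.1 Mini: mechanical field extraction from the prose
    csv_row = client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[{"role": "user", "content":
            "Extract match_type, coverage_status, gap_description, "
            f"recommended_action as a CSV row from:\n\n{analysis}"}],
    ).choices[0].message.content
    return csv_row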

Cost Optimization: GPT-4.1 Mini extraction is 10-30x cheaper per call than GPT-5 — applied to every row where reasoning is already complete
Speed Optimization: GPT-4.1 Mini responds 3-5x faster, reducing total pipeline runtime on extraction-heavy stages
Quality Boundary: GPT-5 makes every judgment call; GPT-4.1 Mini only touches text where the answer already exists in the prose
Audit Trail: Both prompts and responses are saved to disk — GPT-5 analysis in output/, GPT-4.1 Mini extraction in table_insert/

Code Showcase 2

Verdict Logic — Deterministic Classification

Part 4's verdict assignment is pure rule-based logic, not AI inference. The scorecard merge produces a row for every KSI × Control × Part with an AI match type (from Part 2) and a Vanta test status (from Part 3). The verdict is assigned by deterministic if/else logic that an auditor can trace without understanding AI.

python
# Verdict assignment logic (simplified from final_merge.ipynb)
# No AI involved — purely deterministic rules

def assign_verdict(row):
    ai_match = row['match_type']        # from Part 2 (GPT-5)
    vanta_status = row['vanta_status']   # from Part 3 (Vanta API)
    has_vanta = row['has_vanta_test']    # from Part 3 merge

    if ai_match == 'Direct' and vanta_status == 'PASSING':
        return 'COMPLIANT'

    if has_vanta and vanta_status == 'FAILING':
        return 'CRITICAL GAP (Operational Failure)'

    if ai_match == 'No Match' and not has_vanta:
        return 'STRATEGIC GAP (Missing Tool/Policy)'

    if ai_match == 'Direct' and not has_vanta:
        return 'LIKELY COMPLIANT (No Vanta Test)'

    return 'PARTIAL / VERIFICATION REQUIRED'

No AI: Verdict logic is deterministic if/else — the most consequential classification in the pipeline is fully auditable code
Two Inputs: AI match type (from GPT-5 gap analysis) + Vanta test status (from live API) — combines AI judgment with ground truth
Five Verdicts: COMPLIANT, CRITICAL GAP, STRATEGIC GAP, LIKELY COMPLIANT, PARTIAL — each maps to a different remediation path
Auditor Friendly: A compliance reviewer can trace any verdict to its two input signals without understanding the AI that produced them

Code Showcase 3

Crash-Safe Batch Execution Pattern

Every AI-calling notebook follows the same crash-safe pattern: load existing results, skip completed rows, process remaining rows with immediate CSV persistence, and log every prompt/response pair. This pattern is consistent across all 8 parts.

python
# Pattern used in every AI-calling notebook
import os

import pandas as pd
from concurrent.futures import ThreadPoolExecutor, as_completed

# 1. Load existing results (skip completed work)
if os.path.exists(output_csv):
    done = pd.read_csv(output_csv)
    completed_ids = set(done['requirement_id'])
else:
    done = pd.DataFrame()
    completed_ids = set()

# 2. Filter to remaining work
remaining = df[~df['requirement_id'].isin(completed_ids)]
print(f"{len(completed_ids)} already done, "
      f"{len(remaining)} remaining")

# 3. Process with immediate persistence
def process_row(row):
    prompt = build_prompt(row)

    # Save prompt to disk BEFORE calling API
    save_prompt(prompt, row['requirement_id'])

    response = openai_client.chat(model="gpt-5", ...)

    # Save response to disk IMMEDIATELY after API returns
    save_output(response, row['requirement_id'])

    # Append to CSV immediately (not batched)
    append_to_csv(output_csv, parse_response(response))

    return response

# 4. Parallel execution with 5 workers
with ThreadPoolExecutor(max_workers=5) as executor:
    futures = {
        executor.submit(process_row, row): row
        for _, row in remaining.iterrows()
    }
    # Block until all rows finish; re-raise any worker exception
    for future in as_completed(futures):
        future.result()

Resume from Failure: Completed rows detected on re-run via CSV check — interrupted at row 47 resumes from row 48
Immediate Persistence: Each result written to CSV immediately after API response — no in-memory batching that could be lost
Full Audit Trail: Every prompt and response saved as individual files — reproducible and auditable per row
5 Parallel Workers: ThreadPoolExecutor with 5 workers on all GPT-5 calls — balances throughput against API rate limits

Code Showcase 4

Family Parameterization — One Pipeline, Three Control Families

A single FAMILY variable at the top of every notebook controls source data paths, output directories, and ticket labels. The same pipeline code processes KSI (56 controls → 127 tickets), ADS (20 controls → 31 tickets), and CCM (3 controls → 83 tickets) without code changes.

python
# family-config cell (present in every notebook)
FAMILY = "KSI"  # or "ADS" or "CCM"

# All paths derived from FAMILY:
SOURCE = f"FILES/REFERENCE/fedramp/20x/TRUE_SOURCE_FILES/{FAMILY}/{FAMILY}_Golden_Template.csv"
OUTPUT = f"code/testing/ai_response/{FAMILY}/"
LABELS = [f"family:{FAMILY.lower()}"]  # family:ksi, family:ads, family:ccm

# Per-family output after full pipeline:
# ┌──────────┬──────────┬───────────────────────────┐
# │ Family   │ Controls │ Final Tickets             │
# ├──────────┼──────────┼───────────────────────────┤
# │ KSI      │ 56       │ 127 (13 Epic + 114 Task)  │
# │ ADS      │ 20       │ 31  (3 Epic + 7 Task +    │
# │          │          │      21 Sub-task)         │
# │ CCM      │ 3        │ 83  (3 Epic + 20 Task +   │
# │          │          │      60 Sub-task)         │
# └──────────┴──────────┴───────────────────────────┘

Single Config Variable: FAMILY = 'KSI' at the top of every notebook — all paths, labels, and output dirs are derived from it
Zero Code Changes: Switching from KSI to ADS or CCM is a one-line edit + kernel restart — no pipeline code modifications
Scale Difference: KSI produces 127 tickets from 56 controls; CCM produces 83 from just 3 controls — the pipeline handles both scales
Label Tagging: family:ksi / family:ads / family:ccm labels auto-generated for Jira filtering and bulk operations

Code Showcase 5

Jira Upload — Auto-Detecting Instance Configuration

Part 8c's upload notebook auto-detects three configuration values that vary between Jira instances: the subtask issue type name, the project management style, and the Epic Link custom field ID. This eliminates manual configuration when switching between test (Sunstone) and production (SearchStax) environments.

python
# Auto-detect subtask issue type name
# (Sunstone uses 'Subtask', SearchStax uses 'Sub-task')
issue_types = jira.get("/rest/api/3/issue/createmeta/...")
subtask_name = next(
    t['name'] for t in issue_types
    if t['name'].lower().replace('-', '') == 'subtask'
)

# Auto-detect project style
# (next-gen = team-managed, classic = company-managed)
project = jira.get(f"/rest/api/3/project/{PROJECT_KEY}")
project_style = project.get('style', 'classic')

# Auto-detect Epic Link custom field ID
# (varies per instance: customfield_10014, customfield_10600)
if project_style == 'classic':
    fields = jira.get("/rest/api/3/field")
    epic_link_field = next(
        f['id'] for f in fields
        if f['name'] == 'Epic Link'
    )

# Upload phases with proper hierarchy:
# Phase 1: Create Epics (root cause themes)
# Phase 2: Create Tasks linked to parent Epics
# Phase 3: Create Sub-tasks linked to parent Tasks

Subtask Name Detection: Handles 'Subtask' vs 'Sub-task' naming — a common Jira compatibility issue that causes silent upload failures
Project Style Detection: next-gen (team-managed) vs classic (company-managed) determines how Epic linking works
Epic Link Field Detection: Custom field ID for Epic Link varies per instance — auto-detected from the field metadata API
Three-Phase Upload: Epics first, then Tasks with Epic linkage, then Sub-tasks with Task linkage — order enforced for parent ID resolution

Data Lifecycle

End-to-end flow of a single compliance check through the pipeline. Every arrow below is a file artifact (CSV or JSON) written to disk, and every stage validates its inputs before writing its output.

Stage 1: Data Preparation — Objective Validation + Component Examples
  → rev5_objective_questions.csv (updated in place), ai_response/sandbox_rev5/prompts/ + output/
Stage 2: Gap Analysis — Control-to-Tech-Stack Mapping
  → gaps_results.csv, gaps_structured.csv, prompts/, output/, table_insert/
Stage 3: Vanta Integration — Live Test Inventory + NIST Mapping
  → vanta_tests_inventory.csv, vanta_to_nist_mapping.csv, map_controls/prompts/ + output/
Stage 4: Merge + Scorecard — Deterministic Verdict Logic
  → final_compliance_scorecard.csv
Stage 5: Custom Tests + Strategic Gaps
  → PROMPT_1_CUSTOM_TESTS.csv, PROMPT_2_STRATEGIC_GAPS.csv, payload_custom_tests_LOSSLESS.csv
Stage 6: Remediation Master Guide — Root Cause Grouping
  → payload_remediation.csv, remediation_master_guide.csv, executive_remediation_roadmap.json
Stage 7: Vanta UI Mapping — Configuration Manifest
  → payload_ui_mapping.csv, vanta_ui_manifest.csv
Stage 8: Jira Ticket Pipeline — Build, Format, Upload
  → run_manifest.json
Stage 1

Data Preparation — Objective Validation + Component Examples

Validates Rev5 800-53 objective questions against the KSI Golden Template and generates Component Examples — concrete descriptions of what 'meeting this objective' looks like for each control. GPT-5 with 5 parallel workers processes the full question set.

Input: rev5_objective_questions.csv, KSI_Golden_Template.csv
Processing: GPT-5 validates each objective question and generates Component Examples showing what compliance looks like in practice. ThreadPoolExecutor with 5 workers for parallel processing.
Output: rev5_objective_questions.csv (updated in place), ai_response/sandbox_rev5/prompts/ + output/
Model: GPT-5 (5 parallel workers)
Key File: part_1_prompt_sandbox/prompt_sandbox.ipynb
Stage 2

Gap Analysis — Control-to-Tech-Stack Mapping

The core analysis stage. GPT-5 receives each KSI × Control × Part combination alongside the client's technology stack and determines whether the client's infrastructure satisfies the control requirement. A second pass using GPT-4.1 Mini extracts structured fields from GPT-5's prose into a clean CSV. This is where the dual-model architecture matters most — GPT-5 needs to understand compliance semantics to make the mapping judgment, but extracting 'match_type: Direct' from the response is a mechanical task.

Input: KSI_Golden_Template.csv, rev5_objective_questions.csv, client tech stack context
Processing: Pass 1: GPT-5 maps each control part to client infrastructure (one prompt per part, not combined). Pass 2: GPT-4.1 Mini extracts structured CSV fields from GPT-5's prose responses.
Output: gaps_results.csv, gaps_structured.csv, prompts/, output/, table_insert/
Model: GPT-5 (analysis) → GPT-4.1 Mini (extraction)
Key File: part_2_fedramp_gaps_analysis/fedramp_20x_gaps_analysis.ipynb
Stage 3

Vanta Integration — Live Test Inventory + NIST Mapping

Pulls the client's live Vanta test inventory via API, exports it as a structured CSV, then uses GPT-5 to map each Vanta test to the NIST 800-53 controls it covers. This creates the bridge between 'what Vanta is testing' and 'what FedRAMP requires.'

Input: Vanta API (SearchStax instance), NIST control metadata
Processing: Vanta API pull (full test inventory) → CSV export → GPT-5 maps each test to NIST controls
Output: vanta_tests_inventory.csv, vanta_to_nist_mapping.csv, map_controls/prompts/ + output/
Model: GPT-5
Key File: part_3_ksi_vanta_merge/ksi_vanta_merge.ipynb
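
The inventory pull is a paged API query. A rough sketch of what that can look like with requests (the endpoint, query shape, and field names are assumptions for illustration, not Vanta's documented schema):

python
# Rough sketch of a paged GraphQL pull. Endpoint, query shape, and
# field names are assumed; consult Vanta's API docs for the real schema.
import requests

VANTA_URL = "https://api.vanta.com/graphql"  # assumed endpoint

QUERY = """
query Tests($after: String) {
  tests(first: 100, after: $after) {
    pageInfo { hasNextPage endCursor }
    nodes { id name status }
  }
}
"""

def pull_all_tests(token: str) -> list:
    tests, cursor = [], None
    while True:
        resp = requests.post(
            VANTA_URL,
            json={"query": QUERY, "variables": {"after": cursor}},
            headers={"Authorization": f"Bearer {token}"},
            timeout=30,
        )
        resp.raise_for_status()
        page = resp.json()["data"]["tests"]
        tests.extend(page["nodes"])           # accumulate this page
        if not page["pageInfo"]["hasNextPage"]:
            return tests
        cursor = page["pageInfo"]["endCursor"]  # advance to next page
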
Stage 4

Merge + Scorecard — Deterministic Verdict Logic

Joins the gap analysis output (Part 2) with the Vanta mapping (Part 3) and inventory data to produce the master compliance scorecard. Verdict logic is entirely deterministic: COMPLIANT = AI Direct match + Vanta passing; CRITICAL GAP = Vanta test actively failing; STRATEGIC GAP = no tool exists + no Vanta test; LIKELY COMPLIANT = AI Direct match but no Vanta test to confirm; PARTIAL = everything else.

Input: gaps_structured.csv, vanta_to_nist_mapping.csv, vanta_tests_inventory.csv
Processing: pandas merge on control identifiers → deterministic verdict assignment based on AI match type + Vanta test status
Output: final_compliance_scorecard.csv
Model: None — deterministic logic only
Key File: part_4_final_merge/final_merge.ipynb
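
In pandas terms, the stage is roughly the following (merge keys 'control_id' and 'test_id' are assumed names; assign_verdict is the deterministic function from Code Showcase 2):

python
# Minimal sketch of the merge + verdict pass; column names are
# illustrative. assign_verdict comes from Code Showcase 2.
import pandas as pd

gaps = pd.read_csv("gaps_structured.csv")             # Part 2: AI match types
mapping = pd.read_csv("vanta_to_nist_mapping.csv")    # Part 3: test-to-control map
inventory = pd.read_csv("vanta_tests_inventory.csv")  # Part 3: live test status

scorecard = (
    gaps
    .merge(mapping, on="control_id", how="left")      # attach any mapped tests
    .merge(inventory, on="test_id", how="left")       # attach pass/fail status
)
scorecard["has_vanta_test"] = scorecard["test_id"].notna()
scorecard["verdict"] = scorecard.apply(assign_verdict, axis=1)
scorecard.to_csv("final_compliance_scorecard.csv", index=False)
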
Stage 5

Custom Tests + Strategic Gaps

Filters the scorecard for controls lacking Vanta coverage. Splits into two categories: PROMPT_1 (client has the tool but no Vanta test — generate a custom test definition) and PROMPT_2 (client is missing the tool entirely — generate a strategic gap remediation plan). GPT-5 generates both.

Input: final_compliance_scorecard.csv
Processing: Filter for gaps → split by tool availability → GPT-5 generates custom Vanta test definitions (PROMPT_1) and strategic gap remediation plans (PROMPT_2)
Output: PROMPT_1_CUSTOM_TESTS.csv, PROMPT_2_STRATEGIC_GAPS.csv, payload_custom_tests_LOSSLESS.csv
Model: GPT-5
Key File: part_5_custom_tests/custom_tests.ipynb
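
The split itself is two boolean masks over the gap rows. A sketch, assuming a 'has_existing_tool' flag column (an illustrative name, not the notebook's actual schema):

python
# Sketch of the Stage 5 split; 'has_existing_tool' is an assumed
# column name, and coverage is judged by the Stage 4 has_vanta_test flag.
import pandas as pd

scorecard = pd.read_csv("final_compliance_scorecard.csv")
gaps = scorecard[~scorecard["has_vanta_test"]]  # controls lacking Vanta coverage

# Tool exists but no Vanta test: generate a custom test definition
custom_tests = gaps[gaps["has_existing_tool"]]
# Tool missing entirely: generate a strategic remediation plan
strategic_gaps = gaps[~gaps["has_existing_tool"]]

custom_tests.to_csv("PROMPT_1_CUSTOM_TESTS.csv", index=False)
strategic_gaps.to_csv("PROMPT_2_STRATEGIC_GAPS.csv", index=False)
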
Stage 6

Remediation Master Guide — Root Cause Grouping

Takes all critical gap failures (with Vanta failure details) and sends batches of ~30 failures to GPT-5 for root cause grouping and CLI fix generation. A final master analysis pass produces an Executive Remediation Roadmap with 8-10 themes and a 'Top 5 Power Fixes' — the highest-leverage remediations that resolve the most failures.

Input: final_compliance_scorecard.csv (critical gaps), vanta_tests_inventory.csv (failure details)
Processing: Build payload_remediation.csv → batch ~30 failures to GPT-5 for root cause grouping + CLI fixes → final master analysis for Executive Roadmap
Output: payload_remediation.csv, remediation_master_guide.csv, executive_remediation_roadmap.json
Model: GPT-5
Key File: part_6_payload_remediation/payload_remediation.ipynb
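
The batching is plain chunking over the payload rows. A sketch (the batch size comes from the stage description; column names and prompt wording are illustrative):

python
# Sketch of the ~30-row batching for root cause grouping. Column names
# ('control_id', 'failure_detail') and prompt text are assumed.
import pandas as pd

payload = pd.read_csv("payload_remediation.csv")
BATCH_SIZE = 30

for start in range(0, len(payload), BATCH_SIZE):
    batch = payload.iloc[start:start + BATCH_SIZE]
    failures = "\n".join(
        f"- {row.control_id}: {row.failure_detail}"
        for row in batch.itertuples()
    )
    prompt = (
        "Group these Vanta test failures by root cause and propose "
        "a CLI fix for each group:\n" + failures
    )
    # GPT-5 call + immediate CSV persistence, per the Showcase 3 pattern
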
Stage 7

Vanta UI Mapping — Configuration Manifest

For controls that do have Vanta coverage, generates a configuration manifest showing exactly how each KSI maps to Vanta's UI — which tests to enable, which settings to configure, and current coverage status.

Input: final_compliance_scorecard.csv (rows with Vanta coverage)
Processing: Build payload_ui_mapping.csv → batch ~20 KSIs to GPT-5 for Vanta UI configuration generation
Output: payload_ui_mapping.csv, vanta_ui_manifest.csv
Model: GPT-5
Key File: part_7_vanta_ui_mapping/vanta_tests_control_ui_mapping.ipynb
Stage 8

Jira Ticket Pipeline — Build, Format, Upload

Three sub-stages that transform the remediation guide into actionable Jira tickets. 8a builds the Epic + Task hierarchy CSV. 8b rewrites descriptions to audit-ready format (Definition of Done, Architectural Directive), scrubs non-AWS references, splits Tasks into 3 Sub-tasks each, and auto-generates labels. 8c uploads to Jira via REST API with auto-detection of instance configuration. The upload is crash-safe — every created ticket is logged immediately, and re-runs skip already-uploaded tickets.

Input: remediation_master_guide.csv, executive_remediation_roadmap.json
Processing: 8a: Build v2 CSV (Epics + Tasks with Issue ID / Parent ID hierarchy) → 8b: GPT-5 rewrites to audit format (v3), scrubs to AWS stack (v4), splits into subtasks (v5), generates labels → 8c: Upload to Jira REST API v3 (Phase 1: Epics → Phase 2: Tasks → Phase 3: Sub-tasks)
Output: jira_remediation_import_v2.csv through v5_subtasks.csv, jira_upload_log.csv
Model: GPT-5 (audit formatting + subtask splitting)
Key File: part_8_jira_pipeline/
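
A sketch of the three-phase upload ordering, using the same hypothetical jira client wrapper as Code Showcase 5. The field shapes follow Jira REST API v3 but are simplified for illustration (v3 descriptions, for example, actually require Atlassian Document Format):

python
# Sketch of the phase ordering: each phase resolves parent keys created
# by the previous one. Payload fields are simplified.
def upload_hierarchy(jira, epics, tasks, subtasks,
                     epic_link_field, subtask_name, project_key):
    epic_keys = {}
    for epic in epics:                     # Phase 1: Epics first
        created = jira.post("/rest/api/3/issue", json={"fields": {
            "project": {"key": project_key},
            "issuetype": {"name": "Epic"},
            "summary": epic["summary"]}})
        epic_keys[epic["id"]] = created["key"]

    task_keys = {}
    for task in tasks:                     # Phase 2: Tasks linked to Epics
        created = jira.post("/rest/api/3/issue", json={"fields": {
            "project": {"key": project_key},
            "issuetype": {"name": "Task"},
            "summary": task["summary"],
            # classic projects link via the Epic Link custom field;
            # team-managed projects would use the 'parent' field instead
            epic_link_field: epic_keys[task["parent_id"]]}})
        task_keys[task["id"]] = created["key"]

    for sub in subtasks:                   # Phase 3: Sub-tasks under Tasks
        jira.post("/rest/api/3/issue", json={"fields": {
            "project": {"key": project_key},
            "issuetype": {"name": subtask_name},
            "summary": sub["summary"],
            "parent": {"key": task_keys[sub["parent_id"]]}}})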