Confidence Calibration
Inkog uses Bayesian confidence calibration to improve detection accuracy over time. As you provide feedback on findings, the system learns which patterns are accurate in your specific codebase.
Why Calibration Matters
Static analysis tools often produce false positives - warnings that aren’t actual vulnerabilities. Traditional tools have fixed confidence scores that never improve. Inkog is different:
- Learns from feedback: Each time you mark a finding as true/false positive, calibration improves
- Per-rule learning: Different rules calibrate independently based on your feedback
- Reliability tracking: The system tells you how confident it is in its calibration
How It Works
```
Base Confidence → User Feedback → Bayesian Update → Calibrated Confidence
     (0.85)           (FP)              ↓                  (0.78)
```
Each finding includes both the original and calibrated confidence:
| Field | Description |
|---|---|
| confidence | Base confidence from the pattern definition |
| calibrated_confidence | Adjusted confidence based on feedback |
| calibration_reliability | How stable the calibration is |
| calibration_samples | Number of feedback samples collected |
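For example, a finding might carry calibration fields like these (the values and the rule_id field name are illustrative; see the Feedback API Reference for the exact response shape):

```python
finding = {
    "rule_id": "universal_prompt_injection",  # illustrative; the exact field name may differ
    "confidence": 0.85,                       # base confidence from the pattern definition
    "calibrated_confidence": 0.78,            # adjusted by Bayesian updates from your feedback
    "calibration_reliability": "moderate",
    "calibration_samples": 23,
}
```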
Reliability Levels
| Level | Samples | Meaning |
|---|---|---|
| insufficient | < 5 | Not enough data - use base confidence |
| low | 5-10 | Early calibration, may shift significantly |
| moderate | 11-30 | Reasonably stable, calibration is useful |
| high | 31-100 | Stable calibration, high confidence |
| very_high | > 100 | Very stable, unlikely to change |
When calibration_reliability is “moderate” or higher, use calibrated_confidence for decision-making. It better reflects real-world accuracy in your environment.
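A minimal sketch of that decision rule, assuming findings are dicts with the fields shown above (effective_confidence is a name chosen here for illustration, not part of Inkog's API):

```python
STABLE_LEVELS = {"moderate", "high", "very_high"}

def effective_confidence(finding: dict) -> float:
    """Prefer calibrated confidence once calibration is reasonably stable."""
    if finding.get("calibration_reliability") in STABLE_LEVELS:
        return finding["calibrated_confidence"]
    # insufficient or low reliability: fall back to the base confidence
    return finding["confidence"]
```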
Providing Feedback
You can submit feedback through:
- Dashboard: Click “Mark as False Positive” or “Confirm” on any finding
- API: Use the Feedback API to programmatically submit feedback (see the sketch after this list)
- CLI: Use the inkog feedback command:

```bash
inkog feedback --finding-id <id> --type false_positive
```
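For the API option, the sketch below assumes feedback is submitted as a POST to the /v1/feedback endpoint with finding_id, type, and notes fields; the exact route and payload schema are defined in the Feedback API Reference.

```python
import requests  # third-party HTTP client, assumed available

# Hypothetical payload shape -- confirm against the Feedback API Reference.
resp = requests.post(
    "https://api.inkog.io/v1/feedback",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "finding_id": "<id>",
        "type": "false_positive",
        "notes": "Fixed system prompt; no untrusted input reaches it.",
    },
    timeout=10,
)
resp.raise_for_status()
```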
Feedback Types
| Type | When to Use |
|---|---|
| true_positive | The finding is a real vulnerability that needs fixing |
| false_positive | The finding is not actually vulnerable |
| uncertain | You’re not sure - this still helps calibration |
Example: Improving Prompt Injection Detection
Suppose the universal_prompt_injection pattern flags a legitimate system prompt:
```python
# This is flagged but it's safe - it's a fixed system prompt
SYSTEM_PROMPT = "You are a helpful assistant."
```

- You mark it as false_positive
- The calibrated confidence drops (e.g., 0.85 → 0.78)
- Future scans weight this pattern slightly lower
- After 20+ samples, the system knows this pattern’s real accuracy in your codebase
Calibration is per-organization in cloud deployments. Your feedback trains the model for your specific codebase patterns, not others.
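Inkog's exact prior and update rule aren't spelled out here, but the 0.85 → 0.78 shift above is the kind of result a standard Beta-Binomial update produces. The sketch below is illustrative only; prior_strength is an assumed tuning value, not an Inkog parameter.

```python
def calibrated_confidence(base_confidence: float,
                          true_positives: int,
                          false_positives: int,
                          prior_strength: float = 10.0) -> float:
    """Posterior mean of a Beta prior centered on the base confidence."""
    alpha = base_confidence * prior_strength          # prior pseudo-counts for "real vulnerability"
    beta = (1.0 - base_confidence) * prior_strength   # prior pseudo-counts for "false positive"
    alpha += true_positives                           # confirmed findings push confidence up
    beta += false_positives                           # false positives pull it down
    return alpha / (alpha + beta)

print(round(calibrated_confidence(0.85, 0, 1), 2))   # 0.77 -- a single false positive
print(round(calibrated_confidence(0.85, 15, 5), 2))  # 0.78 -- 20 samples at ~75% observed precision
```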
Viewing Calibration Stats
Dashboard
Navigate to Settings → Calibration to see:
- Calibration status for all rules
- Sample counts and reliability levels
- Recommendations for rules needing more feedback
API
```bash
curl https://api.inkog.io/v1/feedback \
  -H "Authorization: Bearer YOUR_API_KEY"
```

Returns all calibration data, including recommendations for rules that need more feedback samples.
Best Practices
- Be consistent: Same type of code should get the same feedback
- Provide context: Add notes explaining why something is/isn’t vulnerable
- Review regularly: Check calibration stats monthly
- Focus on high-volume rules: Prioritize calibrating rules that fire frequently
Integration with CI/CD
You can set confidence thresholds in your CI pipeline:
```yaml
# inkog.yaml
policy:
  fail_on:
    min_confidence: 0.7  # Use calibrated confidence if available
    severity: HIGH
```

As calibration improves, borderline findings will be handled more accurately, reducing alert fatigue without missing real vulnerabilities.
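One way such a gate might evaluate findings, reusing the effective-confidence rule sketched earlier (repeated here so the snippet stands alone). The severity scale and field names are assumptions, and Inkog's actual policy engine may differ.

```python
SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]  # assumed severity scale

def effective_confidence(finding: dict) -> float:
    # Prefer calibrated confidence once calibration is at least "moderate".
    if finding.get("calibration_reliability") in ("moderate", "high", "very_high"):
        return finding["calibrated_confidence"]
    return finding["confidence"]

def should_fail(findings: list[dict], min_confidence: float = 0.7, min_severity: str = "HIGH") -> bool:
    """Fail the pipeline if any finding at or above min_severity meets the confidence threshold."""
    cutoff = SEVERITY_ORDER.index(min_severity)
    return any(
        SEVERITY_ORDER.index(f["severity"]) >= cutoff
        and effective_confidence(f) >= min_confidence
        for f in findings
    )
```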
Related
- Feedback API Reference - API for submitting feedback
- Security Scoring - How findings are scored
- Configuration - Policy configuration