Confidence flags (PASS / FLAG / FILTER)
memledger applies a three-tier policy to every search result, gating on effective confidence (not declared confidence).
The three tiers
| Flag | Effective confidence | Effect |
|---|---|---|
| 🟢 PASS | ≥ flag_threshold (default 0.6) | Returned normally |
| 🟡 FLAG | ≥ min_threshold and < flag_threshold (default 0.4 to just under 0.6) | Returned with confidence_flag="FLAG"; agent code decides how to use it |
| 🔴 FILTER | < min_threshold (default 0.4) | Excluded from results entirely |
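
The boundaries in the table can be sketched as a small function. This is an illustration only: `classify` and `ConfidenceFlag` are hypothetical names rather than part of memledger's API, and the inclusive lower bound on FLAG is inferred from the PASS and FILTER rows.

```python
from enum import Enum


class ConfidenceFlag(str, Enum):
    PASS = "PASS"
    FLAG = "FLAG"
    FILTER = "FILTER"


def classify(effective_confidence: float,
             min_threshold: float = 0.4,
             flag_threshold: float = 0.6) -> ConfidenceFlag:
    """Hypothetical helper mirroring the table; not memledger's own code."""
    if min_threshold > flag_threshold:
        raise ValueError("min_threshold must not exceed flag_threshold")
    if effective_confidence >= flag_threshold:
        return ConfidenceFlag.PASS      # at or above flag_threshold
    if effective_confidence >= min_threshold:
        return ConfidenceFlag.FLAG      # in [min_threshold, flag_threshold)
    return ConfidenceFlag.FILTER        # below min_threshold
```

With the defaults, a value of exactly 0.6 passes, exactly 0.4 is flagged, and anything below 0.4 is filtered.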
Configuration
result = await ml.search(
    query="connection pool fix",
    namespace="/ops/incidents/payment-svc",
    confidence_policy={
        "min_threshold": 0.4,
        "flag_threshold": 0.6,
    },
)
print(result.metadata["confidence_gating"])
# {'passed': 4, 'flagged': 1, 'filtered': 2,
#  'policy': {'min': 0.4, 'flag': 0.6}, ...}
Set thresholds at the call site, on the instance, or globally in memledger.yaml. When more than one is set, the per-call policy wins.
Per-result inspection
Every record returned by search() carries a confidence_flag attribute set to PASS or FLAG. Filtered records are excluded from the results and cannot currently be retrieved from them; only the gate counts and per-record decisions are exposed, on result.metadata["confidence_gating"].
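
One way agent code might act on the flag is to partition returned records before using them. The sketch below is an assumption-laden illustration: `split_by_flag` is a hypothetical helper, the `id` attribute is invented for the demo, and the stand-in records are built with `SimpleNamespace` because how records are iterated from a search result is not described here.

```python
from types import SimpleNamespace


def split_by_flag(records):
    """Partition returned records into (trusted, needs_review) by confidence_flag."""
    trusted, needs_review = [], []
    for record in records:
        if record.confidence_flag == "FLAG":
            needs_review.append(record)   # usable, but verify before acting
        else:
            trusted.append(record)        # PASS: use directly
    return trusted, needs_review


# Stand-in records for the demo; real ones would come from search().
records = [
    SimpleNamespace(id="a", confidence_flag="PASS"),
    SimpleNamespace(id="b", confidence_flag="FLAG"),
    SimpleNamespace(id="c", confidence_flag="PASS"),
]
trusted, needs_review = split_by_flag(records)
print([r.id for r in trusted])       # ['a', 'c']
print([r.id for r in needs_review])  # ['b']
```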
What gets observed
The confidence_gating block on result.metadata exposes counts plus a per-record breakdown — see Phoenix tracing for charting it over time.
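
For charting, the counts are often easier to read as rates. A minimal sketch, assuming only the `passed`, `flagged` and `filtered` keys shown in the example output above; `gating_rates` is a hypothetical helper, not a memledger function.

```python
def gating_rates(gating: dict) -> dict:
    """Convert the passed/flagged/filtered counts into fractions of all candidates."""
    counts = {key: gating.get(key, 0) for key in ("passed", "flagged", "filtered")}
    total = sum(counts.values())
    if total == 0:
        # No candidates were gated; avoid dividing by zero.
        return {key: 0.0 for key in counts}
    return {key: count / total for key, count in counts.items()}


gating = {"passed": 4, "flagged": 1, "filtered": 2}
rates = gating_rates(gating)
print(round(rates["filtered"], 3))  # 0.286
```

A rising filtered rate for a namespace may be a hint that its thresholds or its stored confidences deserve a second look.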
Why three tiers and not two
Two tiers would force a binary choice — block or allow. In practice the FLAG tier is where the most interesting agent behavior lives: "we have a guess, but you should verify before acting." Removing it collapses signal you want.
Three tiers also map cleanly onto how downstream agents react:
- PASS → use directly in reasoning
- FLAG → use, but mark the derived claim as hedged so its effective_confidence propagates the uncertainty
- FILTER → never seen; cannot contaminate the chain
That separation is what makes weakest-link confidence work end-to-end: a flagged retrieval can still be useful, but its uncertainty is preserved as the chain extends.
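
To make the propagation concrete, here is a minimal sketch that reads "weakest-link" as taking the minimum over a claim's own declared confidence and the effective confidences of its sources. That reading, and the helper name `effective_confidence`, are assumptions for illustration; memledger's actual calculation may differ.

```python
def effective_confidence(declared: float, source_confidences=()) -> float:
    """Assumed weakest-link rule: a claim is no more confident than its weakest input."""
    return min([declared, *source_confidences])


retrieved = 0.5  # a retrieval that lands in the FLAG tier under the default thresholds
derived = effective_confidence(0.9, [retrieved])
print(derived)   # 0.5

# Extending the chain keeps the uncertainty: the next claim is still capped at 0.5.
chained = effective_confidence(0.8, [derived, 0.95])
print(chained)   # 0.5
```

Under this reading, a claim declared at 0.9 but built on a flagged retrieval stays in the FLAG tier itself, which is the behavior the paragraph above describes.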