Kernel entropy

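Measures semantic uncertainty in LLM generations using Kernel Language Entropy (KLE, arXiv:2405.20003). The pipeline samples several responses to a prompt, scores their pairwise similarity with an NLI model, and computes an entropy over the resulting kernel.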

Key files

| File | Purpose |
| --- | --- |
| `pipeline.py` | `compute_kle()`, the main entry point |
| `generation.py` | `HydraGenerator`, seeded sampling via PoE (pure generation, `is_mcq=False`) |
| `nli.py` | `ModernBERTScorer`, pairwise NLI similarity |
| `entropy.py` | `kle_from_similarity()`, KLE math (W → L → K → ρ → VNE) |
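
The `W → L → K → ρ → VNE` chain can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the contents of `entropy.py`: it takes W to be a matrix of pairwise similarities, L the combinatorial graph Laplacian, K = exp(-tL) the heat kernel, ρ = K / tr(K) a unit-trace density matrix, and VNE the von Neumann entropy. The name `kle_sketch` is hypothetical.

```python
import numpy as np


def kle_sketch(W: np.ndarray, t: float = 1.0) -> float:
    """Von Neumann entropy of a unit-trace heat kernel built from similarities W.

    Illustrative only; the project's kle_from_similarity() may differ.
    """
    W = np.asarray(W, dtype=float)
    W = 0.5 * (W + W.T)               # symmetrize the pairwise scores (new array)
    np.fill_diagonal(W, 0.0)          # drop self-similarity before building L
    L = np.diag(W.sum(axis=1)) - W    # combinatorial graph Laplacian: D - W
    lam = np.linalg.eigvalsh(L)       # L is symmetric, so eigvalsh applies
    k = np.exp(-t * lam)              # eigenvalues of K = exp(-t L)
    rho = k / k.sum()                 # eigenvalues of rho = K / tr(K)
    rho = rho[rho > 1e-12]            # treat 0 * log(0) as 0
    return float(-(rho * np.log(rho)).sum())


same = kle_sketch(np.ones((3, 3)))    # every response agrees with every other
apart = kle_sketch(np.eye(3))         # no response resembles any other
```

In this sketch, three mutually dissimilar responses give the maximum value `log(3)`, while three mutually similar responses give a smaller value that shrinks toward zero as `t` grows.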

Commands

```shell
pixi run -e cuda kle "prompt"     # Run the full pipeline
pixi run -e cuda olmo "prompt"    # Test Hydra OLMo generation only
pixi run -e cuda nli "s1" "s2"    # Test NLI scoring only
```

See the project README for environment setup, weights download, and Git LFS. The ModernBERT NLI model is fetched from Hugging Face on first use.

Usage

```python
from kernel_entropy import compute_kle

entropy = compute_kle(
    prompt="What is the capital of France?",
    n_generations=5,       # Number of responses
    temperature=0.98,      # PoE generation temperature
    lengthscale_t=1.0,     # Heat kernel parameter
)
# entropy ≈ 0 → high certainty
# entropy high → low certainty / possible hallucination
```
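
Because higher entropy signals lower certainty, one way to use the score is to rank a batch of prompts and review the least certain ones first. The helper below is a hypothetical sketch, not part of the package: `rank_by_entropy` and the stub scores are invented for illustration, and the scoring function is passed in so the example runs without a GPU or model weights.

```python
from typing import Callable, Iterable


def rank_by_entropy(
    prompts: Iterable[str],
    score_fn: Callable[[str], float],
) -> list[tuple[str, float]]:
    """Score each prompt once and return (prompt, entropy) pairs, highest first."""
    scored = [(p, float(score_fn(p))) for p in prompts]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# With the real pipeline, score_fn would wrap compute_kle, for example:
#   score_fn = lambda p: compute_kle(prompt=p, n_generations=5,
#                                    temperature=0.98, lengthscale_t=1.0)
# The values below are placeholders, not measured entropies.
stub_scores = {"prompt A": 0.05, "prompt B": 1.40, "prompt C": 0.60}
ranking = rank_by_entropy(stub_scores, stub_scores.get)
```

Any cutoff for flagging a response as a possible hallucination would need to be calibrated on your own data, since the scale of the entropy depends on `n_generations` and `lengthscale_t`.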