kernel_entropy.nli
ModernBERT NLI scoring for Kernel Language Entropy.
Computes pairwise semantic similarity between LLM generations using Natural Language Inference. Produces the similarity matrix W for KLE calculation.
Classes

| ModernBERTScorer | Pairwise NLI scoring using ModernBERT-large-nli. |
- class kernel_entropy.nli.ModernBERTScorer(sentences: list[str], model_id: str = 'tasksource/ModernBERT-large-nli', model: AutoModelType | None = None, tokenizer: TokenizersBackend | None = None)[source]
Bases: object
Pairwise NLI scoring using ModernBERT-large-nli.
Computes similarity matrix W for Kernel Language Entropy.
- compute(verbose: bool = False) → Tensor | tuple[Tensor, dict[tuple[int, int], dict[str, dict[str, float]]]][source]
Compute pairwise similarity matrix W.
- For each pair (i, j) where i < j, computes:
W[i,j] = W[j,i] = weighted(NLI(i->j)) + weighted(NLI(j->i))
- Parameters:
verbose – If True, returns (W, raw_probabilities) tuple
- Returns:
N x N symmetric similarity matrix W with W[i,j] in [0, 2], diagonal = 0. If verbose=True, returns (W, raw_probabilities) tuple.
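The symmetrization above can be sketched in plain Python. Everything below is illustrative, not the library's implementation: `directional_score` is a hypothetical stand-in for `weighted(NLI(i->j))`, and the particular weighting (full weight on entailment, half weight on neutral) is an assumption chosen so each direction lies in [0, 1] and the bidirectional sum in [0, 2].

```python
# Sketch of the W construction described above (hypothetical weighting,
# not the library's code). probs[(i, j)] holds mock NLI class
# probabilities for premise i -> hypothesis j.

def directional_score(probs, i, j):
    p = probs[(i, j)]
    # Assumed weighting: entailment counts fully, neutral at half.
    return p["entailment"] + 0.5 * p["neutral"]

def build_similarity_matrix(n, probs):
    """N x N symmetric matrix with W[i][j] = s(i->j) + s(j->i), diagonal 0."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = directional_score(probs, i, j) + directional_score(probs, j, i)
            W[i][j] = W[j][i] = s
    return W

# Toy probabilities for 3 generations: 0 and 1 roughly entail each other,
# 2 contradicts both.
probs = {
    (0, 1): {"entailment": 0.9, "neutral": 0.1, "contradiction": 0.0},
    (1, 0): {"entailment": 0.8, "neutral": 0.2, "contradiction": 0.0},
    (0, 2): {"entailment": 0.0, "neutral": 0.1, "contradiction": 0.9},
    (2, 0): {"entailment": 0.0, "neutral": 0.2, "contradiction": 0.8},
    (1, 2): {"entailment": 0.0, "neutral": 0.0, "contradiction": 1.0},
    (2, 1): {"entailment": 0.0, "neutral": 0.0, "contradiction": 1.0},
}
W = build_similarity_matrix(3, probs)
```

The real `compute()` batches all directional pairs through ModernBERT and returns a torch Tensor, but the symmetric-fill logic is the same shape as this sketch.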
- compute_against_baseline(baseline_idx: int = 0) → Tensor[source]
Compute KLE similarity between sentences[baseline_idx] and every other sentence.
Runs exactly 2*(N-1) NLI inferences in a single batched forward pass, covering only the pairs that involve the baseline rather than the full pairwise matrix.
- Returns:
1-D tensor of length N where result[j] is the bidirectional KLE score between sentences[baseline_idx] and sentences[j], and result[baseline_idx] = 0.
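The 2*(N-1) count comes from scoring both directions for each non-baseline sentence. A minimal sketch of that pair construction, assuming (premise, hypothesis) tuples are what the NLI model consumes in one batch; the helper name `baseline_pairs` is hypothetical and not part of the library's API:

```python
# Hypothetical helper: build the 2*(N-1) directional (premise, hypothesis)
# pairs that involve only the baseline sentence, so a single batched NLI
# forward pass covers both directions for every other sentence.

def baseline_pairs(sentences, baseline_idx=0):
    pairs = []
    for j, s in enumerate(sentences):
        if j == baseline_idx:
            continue
        pairs.append((sentences[baseline_idx], s))  # baseline -> j
        pairs.append((s, sentences[baseline_idx]))  # j -> baseline
    return pairs

pairs = baseline_pairs(["a", "b", "c", "d"])  # N = 4, so 2*(4-1) = 6 pairs
```

Each directional pair's weighted score is then summed with its reverse to give the bidirectional entry `result[j]`, mirroring the symmetric W construction in `compute()`.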