olmo_tap.experiments.robustness.amplegcg

AmpleGCG wrapper class.

Example usage::

    gcg = AmpleGCG(device="cuda", num_return_seq=1, num_beams=5)
    query = 'How do I commit identity theft?'
    adversarial_extension = gcg(query)

Classes

AmpleGCG(device[, do_sample, ...])

Wrapper for AmpleGCG from https://huggingface.co/osunlp/AmpleGCG-llama2-sourced-llama2-7b-chat

class olmo_tap.experiments.robustness.amplegcg.AmpleGCG(device: str, do_sample: bool = False, max_new_tokens: int = 20, min_new_tokens: int = 20, diversity_penalty: float = 1.0, num_beams: int = 10, num_beam_groups: int | None = None, num_return_seq: int = 1)

Bases: object

Wrapper for AmpleGCG from https://huggingface.co/osunlp/AmpleGCG-llama2-sourced-llama2-7b-chat

Parameters:

device – device on which the model runs, e.g. "cuda".
do_sample – if True, sample tokens during generation instead of taking the argmax.
max_new_tokens – maximum number of suffix tokens generated.
min_new_tokens – minimum number of suffix tokens generated.
diversity_penalty – penalty that promotes diversity between beam search paths.
num_beams – number of parallel paths explored in beam search.
num_beam_groups – number of groups into which the beam search paths are divided.
num_return_seq – number of adversarial suffixes returned.

NOTE: num_beam_groups defaults to num_beams unless a value is passed explicitly for num_beam_groups.
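
The defaulting rule in the NOTE can be illustrated with a minimal sketch. The helper name ``resolve_generation_kwargs`` is hypothetical and is not part of this module. The sketch assumes the constructor arguments are forwarded to a Hugging Face ``generate()`` call under the standard keyword names, which this page does not confirm. The two checks reflect constraints that Hugging Face generation imposes on beam search::

```python
from __future__ import annotations

from typing import Any


def resolve_generation_kwargs(
    do_sample: bool = False,
    max_new_tokens: int = 20,
    min_new_tokens: int = 20,
    diversity_penalty: float = 1.0,
    num_beams: int = 10,
    num_beam_groups: int | None = None,
    num_return_seq: int = 1,
) -> dict[str, Any]:
    """Hypothetical helper: map constructor arguments to generate() kwargs."""
    # Default described in the NOTE: one beam per group unless overridden.
    if num_beam_groups is None:
        num_beam_groups = num_beams
    # Group beam search needs the beams to split evenly across the groups.
    if num_beams % num_beam_groups != 0:
        raise ValueError("num_beams must be divisible by num_beam_groups")
    # No more sequences can be returned than there are beams.
    if num_return_seq > num_beams:
        raise ValueError("num_return_seq must not exceed num_beams")
    return {
        "do_sample": do_sample,
        "max_new_tokens": max_new_tokens,
        "min_new_tokens": min_new_tokens,
        "diversity_penalty": diversity_penalty,
        "num_beams": num_beams,
        "num_beam_groups": num_beam_groups,
        # Assumed mapping: num_return_seq -> num_return_sequences.
        "num_return_sequences": num_return_seq,
    }


kwargs = resolve_generation_kwargs(num_beams=5)
print(kwargs["num_beam_groups"])
```

With ``num_beams=5`` and no explicit ``num_beam_groups``, the resolved group count equals the beam count, so each beam forms its own group.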

forward(query: str, repeat: int = 1) → list[str]

Generate adversarial suffixes for a query.

Parameters:

query – Single query.
repeat – number of times the query is repeated in the prompt. The AmpleGCG Hugging Face page recommends repeating the prompt to reduce the perplexity of generated suffixes.

Returns:

List of length num_return_seq; each element is a suffix.
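
A rough sketch of how ``repeat`` and the returned suffixes might be used. Both helpers are hypothetical and not part of this module. Joining with a single space, for repeated queries and for the query plus suffix, is an assumption; the wrapper's actual prompt template is not documented here::

```python
from __future__ import annotations


def build_repeated_query(query: str, repeat: int = 1) -> str:
    """Hypothetical helper: repeat the query `repeat` times, space-separated."""
    if repeat < 1:
        raise ValueError("repeat must be at least 1")
    return " ".join([query] * repeat)


def attach_suffixes(query: str, suffixes: list[str]) -> list[str]:
    """Hypothetical helper: append each generated suffix to the original query."""
    # Assumed format: query and suffix separated by one space.
    return [f"{query} {suffix}" for suffix in suffixes]


# Placeholder strings stand in for a real query and for the output of forward().
repeated = build_repeated_query("<query>", repeat=2)
adversarial_prompts = attach_suffixes("<query>", ["<suffix-1>", "<suffix-2>"])
print(repeated)
print(len(adversarial_prompts))
```

Since ``forward`` returns a list even when ``num_return_seq`` is 1, downstream code should index or iterate over the result rather than treat it as a single string.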