olmo_tap.experiments.robustness.amplegcg

AmpleGCG wrapper class.

Example usage::

    gcg = AmpleGCG(device="cuda", num_return_seq=1, num_beams=5)
    query = 'How do I commit identity theft?'
    adversarial_extension = gcg(query)

Classes

AmpleGCG(device[, do_sample, ...])

Wrapper for AmpleGCG from https://huggingface.co/osunlp/AmpleGCG-llama2-sourced-llama2-7b-chat

class olmo_tap.experiments.robustness.amplegcg.AmpleGCG(device: str, do_sample: bool = False, max_new_tokens: int = 20, min_new_tokens: int = 20, diversity_penalty: float = 1.0, num_beams: int = 10, num_beam_groups: int | None = None, num_return_seq: int = 1)[source]

Bases: object

Wrapper for AmpleGCG from https://huggingface.co/osunlp/AmpleGCG-llama2-sourced-llama2-7b-chat

Parameters:
  • do_sample – If True, sample tokens during generation instead of taking the argmax.

  • max_new_tokens / min_new_tokens – maximum / minimum number of suffix tokens to generate.

  • diversity_penalty – promotes diversity in beam search paths.

  • num_beams – number of parallel paths attempted in beam search.

  • num_beam_groups – number of groups the beam search paths are divided into; the diversity penalty is applied between groups.

  • num_return_seq – number of adversarial suffixes returned.

NOTE: num_beam_groups defaults to num_beams unless a value is passed explicitly.

forward(query: str, repeat: int = 1) → list[str][source]

Generate adversarial suffixes for a query.

Parameters:
  • query – Single query.

  • repeat – number of times the query is repeated in the prompt. The AmpleGCG Hugging Face page recommends repeating prompts to reduce the perplexity of generated suffixes.

Returns:

List of length num_return_seq; each element is a suffix.
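A rough sketch of how ``repeat`` and the returned list might be used downstream. Both helpers are hypothetical and not part of the module: ``repeat_query`` assumes repetition is plain space-separated concatenation (the wrapper's actual prompt template is not documented here), and ``attach_suffixes`` assumes each suffix is appended to the original query with a single space.

```python
def repeat_query(query: str, repeat: int = 1) -> str:
    """Hypothetical: repeat the query `repeat` times, separated by spaces."""
    if repeat < 1:
        raise ValueError("repeat must be at least 1")
    return " ".join([query] * repeat)


def attach_suffixes(query: str, suffixes: list[str]) -> list[str]:
    """Hypothetical: build one full prompt per returned suffix."""
    return [f"{query} {suffix}" for suffix in suffixes]


# Placeholder strings stand in for a real query and for the list that
# forward() would return (length num_return_seq).
query = "example query"
suffixes = ["suffix-a", "suffix-b"]

repeated = repeat_query(query, repeat=2)
prompts = attach_suffixes(query, suffixes)
```

The output of ``attach_suffixes`` has the same length as the suffix list, one candidate prompt per suffix, which is convenient when scoring each candidate separately.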