Releases: Azure/PyRIT

v0.7.0

17 Mar 16:40

What's Changed

Targets:

  • [BREAKING] OpenAIChatTarget has been generalized to support OpenAI-compatible models more broadly. See the blog post "A More Generalized OpenAIChatTarget" (#751) for a description of the changes.
  • If api_version is set to None when instantiating OpenAITarget objects, it will not be added as a query parameter to requests.
  • Added Google Gemini example environment variables to .env_example and added integration tests for Gemini/OpenAIChatTargets
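
The api_version behavior described above can be pictured with a small standard-library sketch. The helper name build_request_url, the api-version query key, and the example URL are assumptions made for illustration; this is not PyRIT's internal code.

```python
from typing import Optional
from urllib.parse import urlencode


def build_request_url(endpoint: str, api_version: Optional[str]) -> str:
    """Hypothetical helper: append api-version only when one is provided."""
    params = {}
    if api_version is not None:
        params["api-version"] = api_version
    # With no parameters, return the endpoint unchanged (no trailing "?").
    return f"{endpoint}?{urlencode(params)}" if params else endpoint


with_version = build_request_url("https://example.invalid/chat/completions", "2024-06-01")
without_version = build_request_url("https://example.invalid/chat/completions", None)
```

Leaving the parameter out entirely, rather than sending an empty value, is what lets the same target talk to endpoints that do not expect a version parameter.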

Converters:

  • [New] AddImageVideoConverter: PyRIT's first video converter! It allows users to add an image to a video at a specified position. More video converters to come!
  • [New] InsertPunctuationConverter: Inserts various punctuation into a prompt to test model robustness to perturbations.
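
As a rough illustration of the punctuation perturbation idea, the sketch below appends a random punctuation mark to a share of the words in a prompt. The function name insert_punctuation, its ratio and seed parameters, and the choice of marks are all hypothetical; the actual InsertPunctuationConverter may work differently.

```python
import random


def insert_punctuation(prompt: str, ratio: float = 0.3, seed: int = 0) -> str:
    """Hypothetical sketch: append a random punctuation mark to a share of the words."""
    rng = random.Random(seed)  # seeded so the perturbation is reproducible
    words = prompt.split()
    if not words:
        return prompt
    # Perturb at least one word, and never more words than exist.
    count = min(len(words), max(1, round(len(words) * ratio)))
    chosen = set(rng.sample(range(len(words)), count))
    marks = [",", ".", "!", "?", ";", ":"]
    return " ".join(
        word + rng.choice(marks) if i in chosen else word
        for i, word in enumerate(words)
    )


original = "tell me about the weather today"
perturbed = insert_punctuation(original)
```

Stripping the inserted marks from the result recovers the original prompt, which makes it easy to compare a model's answers with and without the perturbation.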

Orchestrators:

  • [New] ManyShotJailbreakOrchestrator: Prepend a faux dialogue between a human and an AI assistant within a single prompt for the target.
  • [New] [BREAKING] ContextComplianceOrchestrator: Update the context to prime an objective_chat_target to answer. The context is set using instructions defined in context_description_instructions_path, along with an adversarial_chat to generate the first turns to send.
  • [BREAKING] RolePlayOrchestrator improvements: Refactored for greater code re-use
  • FlipAttackOrchestrator improvement: Allows additional converters to be applied after the flip attack

Memory:

  • Multimodal Seed Prompts Encoding Metadata: Non-text seed prompts added to the database now have metadata populated automatically, including format (png, wav, etc.) and, for audio and video seed prompts, properties such as bitrate and duration.
  • SeedPrompt Duplicates: Duplicate seed prompts within the same dataset (identical dataset_name) will no longer be uploaded to memory.
  • Using Configured Paths for Multimodal Seed Prompts: Multimodal SeedPrompt file paths within .yaml files no longer use relative paths that break based on where the .yaml files are accessed. Instead, configured paths (located in paths.py) are used.
  • [BREAKING] Removed the calls that dispose of memory engines in Orchestrator and Prompt Target objects and replaced them with atexit- and weakref-based cleanup in the Memory interface, which ensures cleanup on process exit. Orchestrators and targets no longer support the context manager protocol.
  • Added get_values() method to the SeedPromptDataset class to simplify prompt values extraction from datasets. Optional filtering to retrieve the first and/or last N values has also been implemented.
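
The move from explicit disposal to atexit- and weakref-based cleanup can be sketched with the standard library's weakref.finalize, which runs a callback when an object is garbage collected or, failing that, at interpreter exit. The classes FakeEngine and MemorySketch are hypothetical stand-ins, and PyRIT may wire atexit and weakref together differently.

```python
import weakref


class FakeEngine:
    """Stand-in for a database engine; only records whether it was disposed."""

    def __init__(self) -> None:
        self.disposed = False

    def dispose(self) -> None:
        self.disposed = True


class MemorySketch:
    """Hypothetical memory object that registers its own cleanup."""

    def __init__(self) -> None:
        self.engine = FakeEngine()
        # The callback is bound to the engine, not to self, so the finalizer
        # does not keep this object alive. It runs on garbage collection or
        # at interpreter exit (atexit=True is the default), whichever is first.
        self._finalizer = weakref.finalize(self, self.engine.dispose)


memory = MemorySketch()
engine = memory.engine
finalizer = memory._finalizer
del memory  # in CPython, dropping the last reference runs the finalizer
```

Because cleanup is tied to the memory object's lifetime, callers no longer need a with block around orchestrators or targets to release the engine.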

Scorers:

  • [New] HumanInTheLoopScorerGradio: Creates scores from manual human input by running the Gradio interface in a separate process, then adds the scores to the database. For now, the possible scores that users can give are "safe" and "unsafe."

Datasets:

  • [New] Added new fetch function for Aya Red-Teaming Dataset
  • [New] Added Pliny's prompts from the l1b3rt4s repo as templates
  • [New] Added the Babelscape ALERT dataset
  • Added support for filtering based on harm categories for PKU-SafeRLHF and AdvBench datasets

Misc:

  • Other changes include various maintenance improvements and bug fixes, addition of integration tests, website enhancements, dependency updates, and doc improvements.

Full list of changes

  • FIX unblock test pipelines by skipping certain tests on Ubuntu and adding Windows additionally by @romanlutz in #727
  • MAINT: Update release version to 0.6.1.dev0 by @nina-msft in #731
  • MAINT: Upgrading DuckDB by @jbolor21 in #712
  • [FEAT][MAINT][4019] Make multi-modal easier to configure in seedprompt files by @shivenchawla in #696
  • FEAT: set favicon for the website by @paulinek13 in #717
  • FEAT: simplify extracting prompt values by @paulinek13 in #718
  • FEAT: add a fetch function for Aya Red-teaming Dataset by @paulinek13 in #713
  • MAINT update Roakey image to have transparent background by @romanlutz in #735
  • FEAT Moonshot Attack Module: Insert Punctuation Attack by @u7780339 in #475
  • FEAT: include scored_prompt_id in orchestrator_identifier of the system prompt by @NicolePell in #725
  • FEAT: Create many shot jailbreak orchestrator by @AdrGav941 in #709
  • MAINT pre-commit hook to remove notebook header from notebooks by @jbolor21 in #737
  • FEAT Add Encoding Data to Multimodal Seed Prompts by @jsong468 in #740
  • FEAT added Pliny's prompts from the l1b3rt4s repo as templates by @joaodunas in #710
  • FEAT Adding babelscape dataset by @Jarro01X in #738
  • FIX: Upgrading Packages by @rlundeen2 in #741
  • FIX: Increasing pipeline timout by @rlundeen2 in #743
  • FEAT PyRIT to not upload duplicate seed-prompts by @shivenchawla in #742
  • MAINT: Azure SQL Integration Test Misc. Updates by @nina-msft in #745
  • FIX Small bug fixes (renaming file, editing MANIFEST) by @jsong468 in #746
  • [BREAKING] FEAT: OpenAI Generalization Improvements by @rlundeen2 in #747
  • FEAT: Add example_count field to ManyShotJailbreakOrchestrator by @nina-msft in #748
  • DOC: Blog: A More Generalized OpenAIChatTarget by @rlundeen2 in #751
  • DOC: Updating git docs by @rlundeen2 in #753
  • FIX: Fixing integration tests broken with OpenAIChatTarget Update by @rlundeen2 in #755
  • FEAT Video Converter: Adding Images to Videos by @jbolor21 in #702
  • FIX: Adding back static js by @rlundeen2 in #761
  • [BREAKING] FEAT: RolePlayOrchestrator Improvements by @rlundeen2 in #758
  • [BREAKING] FIX: Dispose Memory in Memory vs Class Objects by @nina-msft in #752
  • MAINT clean up dependencies by @romanlutz in #757
  • FEAT Adding converter support to many shot jailbreak orchestrator by @AdrGav941 in #760
  • FIX: Default API Version for TTS Target by @jbolor21 in #749
  • [BREAKING] FEAT: Adding Context Compliance Orchestrator by @rlundeen2 in #763
  • DOC: Add Instructions for Tagging Breaking Changes in PR Template by @nina-msft in #765
  • FEAT: support filtering based on harm categories for PKU-SafeRLHF dataset by @paulinek13 in #756
  • DOC Update CCA Documentation for Clarity by @eugeniavkim in #773
  • DOC: Update OpenAI Environment Variable Names in Documentation by @nina-msft in #776
  • FEAT: add harm categories to AdvBench Dataset by @paulinek13 in #732
  • FIX: Allow api_version to be set to None when instantiating OpenAITarget objects by @LeoVrana in #764
  • MAINT standardize Hugging Face token environment variable, add integration tests for Google Gemini and Open AI by @romanlutz in #778
  • FEAT: Gradio HiTL Scorer by @mart123p in #722
  • DOC: clarify OpenAIChatTarget usage with Ollama by @jsdlm in #777
  • FIX: small edits to make integration tests pass by @jsong468 in #780
  • MAINT add notice generation to component governance by @romanlutz in #781
  • MAINT update NOTICE file by @romanlutz in #782

Full Changelog: releases/v0.6.0...releases/v0.7.0

v0.6.0

22 Feb 01:37

What's Changed

  • Cookbooks are live and replace our How To Guide! Each cookbook tackles a problem using the components that work best for it, unlike our typical documentation, which illustrates that many pieces of PyRIT are swappable.

Targets:

  • OllamaChatTarget: Implement ability to forward custom parameters directly to the HTTP client
  • HuggingFaceChatTarget: Adds optional keywords device_map, torch_dtype and attn_implementation
  • [New] PlaywrightTarget: Interact with web applications using Playwright. This is particularly useful for testing interactions with web interfaces like chatbots.
  • [New] RealtimeTarget: Send and receive audio with the Realtime API.
  • [New] GroqChatTarget: Interact with Groq's OpenAI-compatible API.

Converters:

  • [New] ANSI Escape Code Converter: AnsiAttackConverter
  • [New] BinaryConverter: Convert input text into binary with configurable bits per character
  • PDFConverter: Updates to support templated and non-templated PDF generation & enabling text injection into existing PDFs
  • [New] TextToHexConverter: Convert text to hexadecimal encoded utf-8 string
  • Add easier querying for converter-supported input/output types
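
The two encoding converters above are simple to picture. The following sketch shows one way to produce a hexadecimal UTF-8 string and a binary string with a configurable number of bits per character; the function names, the space separator, and the default width are assumptions for illustration, not the converters' actual interfaces.

```python
def text_to_hex(text: str) -> str:
    """Encode text as UTF-8 and return the bytes as a hexadecimal string."""
    return text.encode("utf-8").hex()


def text_to_binary(text: str, bits_per_char: int = 16) -> str:
    """Render each character's code point as a fixed-width binary group."""
    for ch in text:
        if ord(ch) >= 2 ** bits_per_char:
            raise ValueError(f"{ch!r} does not fit in {bits_per_char} bits")
    return " ".join(format(ord(ch), f"0{bits_per_char}b") for ch in text)


hex_value = text_to_hex("Hi")
binary_value = text_to_binary("Hi", bits_per_char=8)
```

The width check matters because a character whose code point exceeds the chosen width would otherwise produce a longer group than its neighbors and make the output ambiguous to decode.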

Orchestrators:

  • RedTeamingOrchestrator & CrescendoOrchestrator now support prepended conversations. You can set a system prompt on the objective target using this feature, or provide conversation history as context to continue execution from a specific point.
  • ScoringOrchestrator: Add ability to score responses using filters.
  • PromptSendingOrchestrator: Set skip criteria to specify which prompts this orchestrator should not send to the target.
  • [New] RolePlayingOrchestrator: Single-turn orchestrator that prepends prompts describing fictional scenarios in an attempt to elicit harmful responses
  • XPIAOrchestrator: Fix to BlobNotFound exception
     
Memory:

  • [BREAKING] All notebooks must explicitly initialize Central Memory through a new initialize_pyrit() function: #616. This puts ownership into the hands of the user to set where prompts will be stored. Read more in the Memory documentation.
  • Ability to add memory labels on a per-prompt level, specifically useful in Multimodal scenarios
  • Conversation Scores now available when exporting Prompt Data
  • Filter Data by various queries (e.g. prompt ID, orchestrator ID, labels, etc) using get_prompt_request_pieces()
  • Consolidated method to Export Conversations using Filters: export_conversations()
  • SeedPrompts: Support for Multimodal Seed Prompts
  • [BREAKING] NormalizerRequestPieces replaced with SeedPrompts: #648

Scorers:

  • Add tasks by default to scorers to improve scorer accuracy

Misc:

  • Other changes include various maintenance improvements and bug fixes, addition of integration tests, new blog posts, and doc improvements.

v0.5.2

03 Dec 23:19

What's Changed

  • Pinned the httpx version to 0.27.2 and refactored the codebase to ensure compatibility.
  • Fixed AzureSQLMemory authentication issues by adding token refresh, pool recycling, and pre-ping mechanisms.
  • Redesigned PAIR attack technique to function as a specialized instance of TAP orchestrator, streamlining architecture.
  • Added support for local Hugging Face model checkpoints.

Full list of changes

  • [DOC] Updating README by @rlundeen2 in #579
  • Fix Azure SQL Authentication Errors: Add Token Refresh, Pool Recycling, and Pre-Ping by @rdheekonda in #576
  • FEAT: add support for local model checkpoints and trust_remote_code in HuggingFaceChatTarget by @KutalVolkan in #574
  • FEAT: Refactor PAIR to be a special instance of TAP by @rlundeen2 in #580
  • FIX: httpx proxy arg fix, pinned httpx version by @jsong468 in #589
  • FIX: Not raising exceptions on None responses by @rlundeen2 in #590
  • Fix Test Prompt Response Error Values by @rdheekonda in #591

Full Changelog: v0.5.0...v0.5.2

v0.5.0

27 Nov 00:28
c3a1a48

What's Changed

  • PyRIT now has a website

  • We've been working on standardizing orchestrators in terms of naming and functionality:

    • The endpoint (of type PromptTarget) that PyRIT attacks will be referred to as objective_target.
    • The endpoint (of type PromptChatTarget) that helps us craft attacks will be referred to as adversarial_chat.
    • Beyond that, we've settled on a common interface for multi-turn orchestrators with a shared result object.
    • Instead of an attack_strategy arg we require a file path called adversarial_chat_system_prompt_path to make the connection to the adversarial_chat target clearer. Some orchestrators have a default for this, of course.
    • The initial prompt to the adversarial_chat is now called adversarial_chat_seed_prompt to also help with clarity and connection to adversarial_chat
    • Sometimes we use multiple scorers. For that reason, objective_scorer will be the scorer that decides if the objective has been achieved. Other scorers have similarly specific names, e.g., on_topic_scorer in the CrescendoOrchestrator
    • The new standard name for all orchestrators to execute an attack is run_attack_async.

    The standardization is not fully completed yet but will continue in future releases. So far, CrescendoOrchestrator, TreeOfAttacksWithPruningOrchestrator, and RedTeamingOrchestrator have been adjusted.

  • Support for a centralized database using Azure SQL as an optional alternative to a local DuckDB database.

  • Introduced (multi-modal) SeedPrompts and SeedPromptDatasets as a starting point for red teaming ops with integration to our databases.

  • New orchestrators and auxiliary attacks:

    • FuzzerOrchestrator with 5 template converters
    • GCG support via Azure ML pipelines to optimize adversarial suffixes
    • FlipAttackOrchestrator
  • New targets:

    • HuggingFaceChatTarget
    • HTTPTarget
    • Open AI and Azure Open AI targets were refactored to simplify the logic. They now share a common interface OpenAITarget and you can decide between Azure vs. Open AI using is_azure_target=True or False.
  • New datasets:

    • HarmBench
    • PKU-SafeRLHF
    • wmdp-bio, wmdp-chem, and wmdp-cyber (now fetchable from the original data source)
    • AdvBench
    • Decoding Trust Stereotypes
    • LLM-LAT/harmful-dataset
    • tdc23 red teaming dataset
    • TrustAIRLab/forbidden_question_set
    • LibrAI 'Do Not Answer' Dataset
  • New converters:

    • QRCodeConverter
    • AzureSpeechAudioToTextConverter
    • URLConverter
    • HumanInTheLoopConverter
    • ColloquialWordswapConverter
    • UnicodeConfusableConverter (updated with new functionality)
    • CharSwapGenerator
    • MaliciousQuestionGeneratorConverter
    • AsciiSmugglerConverter
    • MathPromptConverter
    • AudioFrequencyConverter
    • ZeroWidthConverter
    • DiacriticConverter
  • New scorers:

    • SelfAskRefusalScorer
    • HumanInTheLoopScorer
    • InsecureCodeScorer
  • We generally use a .env file to configure details of endpoints that PyRIT needs to execute. A new .env.local override file allows for further customization.

  • Finally, PyRIT now comes with several extras that you can install using pip install pyrit[<extra>]

    • dev includes developer dependencies that you shouldn't need unless you plan on contributing to the project.
    • torch includes just pytorch which is needed for some targets (e.g. Hugging Face) or auxiliary attacks (e.g., GCG) but not core functionality. This allows you to choose whether you want to install it.
    • gcg includes extra dependencies that are only needed for running GCG. Since this requires dedicated compute (ideally with GPU) you can choose whether it is required for you.
    • all includes all of the above.
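
To make the .env and .env.local relationship mentioned above concrete, here is a minimal sketch of an override merge in which values from the local file replace those from the base file. The parser is deliberately simplified, and the variable names ENDPOINT and API_KEY are placeholders; PyRIT's own loading logic is not shown here and may differ.

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


# Contents that would normally be read from .env and .env.local.
base = parse_env('ENDPOINT="https://example.invalid"\nAPI_KEY=base-key\n')
local = parse_env("# personal override\nAPI_KEY=local-key\n")
# Later entries win, so .env.local values replace those from .env.
settings = {**base, **local}
```

Keys that appear only in the base file pass through untouched, so the override file needs to list just the values that differ.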

v0.4.0

23 Aug 01:36

What's Changed

  1. New Advanced Attack Techniques: Expanded orchestrators with advanced attack techniques, including PAIR, tree of attacks, and crescendo strategies.
  2. New Targets: Crucible target, Prompt Shield Target, Azure OpenAI GPT-4o target
  3. New Converters: Added Tense, Emoji, image to text, and Character Space converters.
  4. New Scorers: Scale Scorer, Prompt Shield, and True/False Inverter Scorer
  5. Automatic Scoring & Memory Labels: Introduced automatic scoring in the PromptSendingOrchestrator. Added support for scoring with user-provided memory labels.
  6. Delegation SAS Authentication: Supported delegation SAS authentication for secure interactions with Azure Blob Storage targets.
  7. Improved Resiliency: Enhanced the resiliency of targets, converters, and orchestrators with robust error handling mechanisms.
  8. Bug Fixes & Performance: Various bug fixes, added support for Python 3.12, and sped up unit tests.
  9. Fetch functionality: Introduced functionality to fetch adversarial datasets, such as SecLists and XSTest.
  10. Updated Demo Codes: Replaced demo code examples with the GPT-4o target.

Full Changelog: v0.3.0...v0.4.0

v0.3.0

28 Jun 21:19

What's Changed

  • New and improved scorers! Many new scorers have been added, and scorers can now be swapped out and made generic.
  • Many new attack techniques and variations have been introduced. These include skeleton key, most of GPTFuzz, adding text to images, repeated token attack, cipherchat, shorten/expand, tone, CodeChameleon, and more. A total of 13 new converters have been added!
  • Framework improvements:
    • Ability to duplicate conversations for reuse (this makes implementation easier for attacks like PAIR/TAP/crescendo).
    • Converters can be added to LLM responses.
    • All framework calls are now async and parallelizable.
    • Error handling and intelligent automatic retries in targets (e.g., for network errors) and converters/scorers (e.g., for JSON deserialization).
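
The automatic retries mentioned in the last item can be sketched as a small wrapper that retries on selected exceptions with exponential backoff. The helper call_with_retries, its parameters, and the simulated flaky_request are hypothetical; they illustrate the general pattern rather than PyRIT's retry implementation.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 3,
    base_delay: float = 0.01,
) -> T:
    """Call func, retrying on the listed exceptions with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("max_attempts must be at least 1")


attempts = {"count": 0}


def flaky_request() -> str:
    """Fail twice with a simulated network error, then succeed."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("simulated network error")
    return "ok"


result = call_with_retries(flaky_request, retry_on=(ConnectionError,))
```

Restricting retries to an explicit tuple of exception types keeps genuine programming errors from being retried silently.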

Full Changelog: v0.2.1...v0.3.0

v0.2.1

01 May 22:53
9e852f1

What's Changed

  • Added user authentication support for AOAI Chat Targets
  • Added request validation in targets
  • Added support for exporting conversations from memory

Full Changelog: v0.2.0...v0.2.1

v0.2.0

29 Apr 22:17

What's Changed

  • Multi-modal support: You can now input/output various multi-modal targets.
  • XPIA support: Enabling easier second order prompt injection attacks.
  • A more robust local (duckDB) database: Allowing querying and inserting previous conversations.

Full list of Changes

  • Added a ChatMessageNormalizer that formats messages in the template specified by a Hugging Face tokenizer by @blakebullwinkel in #128
  • PromptMemoryEntry Table Added for more Extensible Target Logic by @rlundeen2 in #125
  • Added prompt softener prompt converter by @cseifert1 in #132
  • Dataset Organization and Adding Public Jailbreaks by @rlundeen2 in #131
  • Adding Image Target by @jbolor21 in #118
  • Adding more authentication methods, add capital letters converter by @pgrek001 in #139
  • Add cross-domain prompt injection orchestrator by @romanlutz in #127
  • Added support to target an Ollama endpoint as a prompt chat target by @uskr in #141
  • Normalizer multi modal/flexible support refactor by @rlundeen2 in #143
  • Adding Identifiers to Memory by @rlundeen2 in #145
  • Adding Data Type Normalizer Helpers by @rlundeen2 in #147
  • Updating run_jupytext to cache notebooks that previously passed by @rlundeen2 in #148
  • Gandalf through level7 by @jorisdg in #152
  • Adding Multi-Modal Output Support to Converters by @rlundeen2 in #155
  • Adding TTS Target by @rlundeen2 in #161
  • Updating Gandalf Target to be more clear by @rlundeen2 in #153
  • Support python 3.11 by @romanlutz in #168
  • New Converters: Replace Whitespace and Leetspeak by @jbolor21 in #162
  • Refactored SelfAskGptClassifier into SelfAskScore class and added Likert scale scoring by @blakebullwinkel in #154
  • Fix mypy issues, convert Azure completion class to target, fix AOAI and OAI tests, remove clip embedding class by @romanlutz in #172
  • Converter for prompt text to audio by @pgrek001 in #149
  • Updating PromptSendingOrchestrator to handle multi-modal by @rlundeen2 in #174
  • Generalize XPIA orchestrator by @romanlutz in #163
  • Add Several Content Classifiers by @nina-msft in #175
  • Add AzureOpenAIGPTVChatTarget to Support MultiModal by @rdheekonda in #160
  • Refactoring Dalle Target to support database by @jbolor21 in #156

Full Changelog: v0.1.2...v0.2.0

v0.1.2

22 Mar 03:33

What's Changed

Big changes this release include solidifying the orchestrator, converter, target model for attacks, and migrating the local memory storage from a JSON file to a DuckDB instance.

The first two demos have been updated with the new architecture, and two new demos have been added: sending all prompts and using prompt converters.

Full List of Changes

  • FEAT: Adding StringJoinConverter by @rlundeen2 in #70
  • DOC: Add release instructions by @romanlutz in #57
  • FEAT: Chain Prompt Converters in Normalizer by @rlundeen2 in #73
  • FEAT: Adding Support for 1:N PromptConverters by @rlundeen2 in #75
  • FEAT: Adding NoOpTarget by @rlundeen2 in #79
  • FEAT: Added converter for ascii art by @petebryan in #81
  • FEAT: Add rot13 by @pgrek001 in #80
  • FEAT: Adding Batch/Async Processing to PromptTargets by @rlundeen2 in #91
  • FEAT: add support for chat messages dataset by @dlmgary in #90
  • DOC: Release guidelines and PR template update by @romanlutz in #92
  • DOC: Adding Docs for ChatMessageNormalizer by @rlundeen2 in #93
  • FEAT: Prompt Variation Converter by @jbolor21 in #86
  • DOC: Adding Converter Docs and Demos by @rlundeen2 in #100
  • MAINT: Add red teaming orchestrators to replace RedTeamingBot by @romanlutz in #84
  • FEAT: Making prompt_nop_target into a stream target by @rlundeen2 in #99
  • MAINT: Adding orchestrator abstract base class by @rlundeen2 in #102
  • FIX: simplify flow in red teaming orchestrator code by @romanlutz in #105
  • DOC: use google style docstrings by @romanlutz in #104
  • DOC: add short guide on how to handle stale PRs & introduce standardized prefixes by @romanlutz in #101
  • FEAT: Language Translation Converter by @rlundeen2 in #106
  • FEAT: Add scalable and efficient memory by @rdheekonda in #97
  • FEAT: add support for question answering benchmark by @dlmgary in #94
  • FEAT: New prompt target: AzureBlobStorageTarget by @nina-msft in #95
  • FEAT: Add UTR39 confusability converter by @yonatanzunger in #115
  • MAINT: Refactoring AzureOpenAIChat to only be a promptTarget by @rlundeen2 in #114
  • FEAT: Add support to OpenAI API to use official or custom endpoints by @friyin in #65
  • FEAT: Migrating Azure ML to PromptTarget by @rlundeen2 in #113
  • Various bug fixes and smaller documentation updates by the AI Red Team

Full Changelog: v0.1.1...v0.1.2

v0.1.1

11 Mar 03:25
f9b0739

What's Changed

The previous release 0.1.0 did not include the datasets used in the example notebooks. Version 0.1.1 addresses this.

Full Changelog: https://github.com/Azure/PyRIT/commits/v0.1.1