#
AI
Tools and resources for pentesting against API endpoints.
#
Evasion, Poisoning, Extraction
- Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference.
- CounterFit - A CLI that provides a generic automation layer for assessing the security of ML models.
- Foolbox - A Python toolbox to create adversarial examples that fool neural networks in PyTorch, TensorFlow, and JAX.
#
Jailbreaking
- EasyJailbreak - An easy-to-use Python framework to generate adversarial jailbreak prompts.
- JailbreakBench - An Open Robustness Benchmark for Jailbreaking Language Models.
#
LLM & Generative AI Red Teaming
- Garak - The LLM vulnerability scanner.
- PromptBreach - The Python Risk Identification Tool for generative AI.
- PyRIT - The Python Risk Identification Tool for generative AI.
- Promptmap2 - A security scanner for custom LLM applications
- Purple Llama - Set of tools to assess and improve LLM security.
#
Model Extraction & Inference Attacks
- Privacy Meter - Audit data privacy in statistical and machine learning algorithms.
- SecretFlow - A unified framework for privacy-preserving data analysis and machine learning.
- ShadowAttack - Stealthy and Effective Physical-world Adversarial Attack by Natural Phenomenon.
#
Data Poisoning & Supply Chain Attacks
- Backdoor Box - A universal pytorch platform to conduct security researches of image classification in deep learning.
- TrojanZoo - The open-sourced Python toolbox for backdoor attacks and defenses.
#
LLM Safety & Guardrails
- Alibi Detect - Algorithms for outlier, adversarial and drift detection.
- Guardrails AI - Adding guardrails to large language models.
- LLM Guard - A comprehensive tool designed to fortify the security of Large Language Models.
- NeMo Guardrails - Toolkit for easily adding programmable guardrails to LLM-based conversational systems.
- Vigil - Detect prompt injections, jailbreaks, and other potentially risky Large Language Model inputs.