Testing via CLI
The most current instructions for use of the Mindgard CLI will always be available at https://github.com/mindgard/cli
Pre-requisites
Mindgard’s CLI requires a working Python 3.10+ environment and pip, pipx, or an equivalent package manager.
If your organization has custom SSL certificates deployed for traffic inspection, these must be available to the Python certificate store.
Installation
Install the Mindgard CLI with
pip install mindgard
Login to Mindgard
The Mindgard CLI works in conjunction with a Mindgard SaaS deployment. Before you can run tests, you will need to log in with
mindgard login
If you have been provided an instance name (enterprise deployments), log in with:
mindgard login --instance <name>
Replace <name> with the instance name provided by your Mindgard representative. This instance name identifies your SaaS, private tenant, or on-prem deployment.
Bulk Deployment
To perform a bulk deployment of the Mindgard CLI:
- Login and Configure: Log in and configure the Mindgard CLI on a test workstation
- Provision Files: Provision the files contained in the .mindgard/ folder within your home directory to your target instances via your preferred deployment mechanism (a sketch follows the folder listing below).
The .mindgard/ folder contains:
- token.txt: A JWT for authentication.
- instance.txt (enterprise only): Custom instance configuration for your SaaS or private tenant.
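For example, a minimal sketch of provisioning the folder over SSH (the hostname is illustrative; use whatever deployment mechanism your organization prefers):
scp -r ~/.mindgard user@target-workstation:~/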
Testing a Model or Application
External AI models and applications are tested with the test command, which targets your LLMs. The general command form is:
mindgard test <name> --url <url> <other settings>
Config Files
To test a custom API, you will need to specify a few options to tell Mindgard how to interface with your API. These can be specified as command line flags or in a .toml config file. Mindgard will automatically use configuration from a file named mindgard.toml in the current directory, or you can specify a different filename with
mindgard test --config my-file-name.toml
Testing LLMs
To test an LLM you will need to specify a few parameters to help Mindgard interface with your API. Here is an example command to test an inference API:
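(This is a sketch with placeholder values, modeled on the validate example later in this guide; substitute your own endpoint, selector, and request template.)
mindgard test my-model-name \
  --url http://127.0.0.1/infer \
  --selector '["response"]' \
  --request-template '{"prompt": "[INST] {system_prompt} {prompt} [/INST]"}' \
  --system-prompt 'respond with hello'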
--url
The URL is for an API endpoint that accepts HTTP POST requests representing user input to the model or application. Mindgard will POST adversarial inputs to this url.
--selector
The Selector is a JSON Path expression (https://jsonpath.com) that tells Mindgard how to identify your model’s response within the API response.
Your browser devtools may be useful for observing the structure of your API response to determine what this should be set to. For example, given a response like the one below, “$.text” would be used to match the text response from the chatbot.
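A hypothetical response body for illustration (field names are assumptions; your API will differ):
{
  "text": "Hello! How can I help you today?",
  "conversation_id": "abc123"
}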
--request-template
The Request Template tells Mindgard how to format an outbound request to your test target API.
Your browser devtools may be useful to observe the structure of the outbound request.
There are two template placeholders you must include in your Request Template.
{prompt}
Mindgard will replace this placeholder with an adversarial input as part of an attack technique.
{system_prompt}
Mindgard will replace this with the system prompt you specify below. This will allow you to test how the system behaves with different system instructions.
For example, an API that accepts a JSON body containing a system instruction and a user message might require a Request Template like the one below.
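This is a sketch only; the field names and wrapping format depend entirely on your API:
{"system": "{system_prompt}", "prompt": "{prompt}"}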
--system-prompt
This flag specifies the system prompt for the AI model. If you’re testing a model inference API directly, you may wish to include the real system prompt used by your application here to simulate its performance as part of the wider application.
If the system prompt is not relevant to your tests, you may place a benign placeholder here e.g. “Please answer the following question:”
If you include a relevant system prompt here, Mindgard’s testing will include an evaluation of whether the system prompt instructions can be bypassed.
--help
All of the test options are available via
mindgard test --help
--domain
The Domain option allows you to choose to test with datasets and goals relevant to your system domain. For example, the Finance domain covers scenarios like abuse to enable fraud or money laundering, while the SQL Injection option covers scenarios such as abusing an LLM component to bypass a WAF and expose an exploitable SQL Injection vulnerability. More information can be found here
--dataset
Alternatively you may provide your own custom set of prompts relevant to your risks. More information can be found here
--mode
The mode option takes a duration setting. This controls the tradeoff between the speed of the test and the confidence of any scores given. Dynamic testing of inherently non-deterministic AI systems has limitations in the confidence of its results; a mitigation is to run tests for longer with more samples, at the cost of longer test times and increased model hosting costs.
--json
The JSON flag switches from a human readable output to a JSON output, for use in automated workflows and to aid in composition with other tooling. See the workflow integrations section of this guide for more details.
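For example, a sketch capturing the output for downstream tooling (the file name is illustrative):
mindgard test my-target --json <other settings> > results.json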
--preset
The preset is a convenient way to indicate the interaction mechanism with your model, for example openai-compatible or huggingface-openai.
--model-name
The model-name is used to specify the model you want to test against. If this is not provided, it will default based on the preset you have chosen:
Preset | Default Model
---|---
openai | gpt-3.5-turbo
huggingface-openai | tgi
openai-compatible | tgi
anthropic | claude-3-opus-20240229
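For example, a sketch overriding the default model for the anthropic preset (other required settings, such as the API key, are omitted):
mindgard test my-target --preset anthropic --model-name claude-3-haiku-20240307 <other settings>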
--exclude
Each attack name or attack category can be excluded using this option, e.g. --exclude 'EvilConfidant' --exclude 'skeleton_key'
More information can be found here
--include
Each attack name or attack category can be included using this option, e.g. --include 'EvilConfidant' --include 'skeleton_key'
More information can be found here
--prompt-repeats
To configure the number of times the same prompt is repeated, use this option. This is useful for testing models that are not deterministic and may return different results for the same input.
The default is currently set to 1.
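For example, a sketch repeating every prompt five times to smooth out non-deterministic responses:
mindgard test my-target --prompt-repeats 5 <other settings>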
--force-multi-turn
- Multi-turn attacks with stateful LLM applications
Some attacks, such as Crescendo and HouYi, employ multi-turn strategies that rely on chat history continuity. Unless you are using --preset with OpenAI chat completions, the test command will not maintain this history for you and multi-turn attacks will not function. Using this flag will allow the requests through; it is designed for situations where the target application maintains its own history or sessions.
Additionally, this flag forces parallelism to 1, so the attacks run sequentially and chat histories do not overlap.
This can be combined with --include to target a specific attack at a specific session:
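A sketch (the target and attack names are illustrative):
mindgard test my-stateful-app --force-multi-turn --include 'Crescendo' <other settings>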
Validate Configuration
Validate that your API is accessible and your configuration is working before launching tests. A preflight check is run automatically when submitting a new test, but if you want to invoke it manually:
mindgard validate --url <endpoint_url> <other_settings>
e.g.
mindgard validate \
  --url http://127.0.0.1/infer \
  --selector '["response"]' \
  --request-template '' \
  --system-prompt 'respond with hello'
# --url: url to test
# --selector: JSON selector to match the textual response
# --request-template: how to format the prompts in the API request
# --system-prompt: system prompt to test with
Example CLI Configurations
There are examples of what the configuration file (mymodel.toml) might look like here in the examples/ folder in the Mindgard CLI github repo https://github.com/mindgard/cli
Here are two examples:
Targeting OpenAI
This example uses the built-in preset settings for OpenAI. Presets exist for OpenAI, Hugging Face, Anthropic, and OpenAI chat-completions compatible APIs hosted by other vendors.
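A minimal sketch of what such a config file might contain; the key names here mirror the setting names used in this guide, and the examples/ folder in the repo has the authoritative versions:
target = "my-openai-model"
preset = "openai"
api_key = "sk-..."
system-prompt = "Please answer the following question:"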
You will need to substitute your own api_key value.
The target setting is an identifier for the model you are testing within the Mindgard platform; tests for the same target will be grouped and traceable over time.
Altering the system-prompt enables you to compare results with different system prompts in use. Some of Mindgard’s tests assess the efficacy of your system prompt.
Any of these settings can also be passed as command line arguments, e.g. mindgard test my-model-name --system-prompt 'You are…'. This may be useful for passing in a dynamic value for any of these settings.
Custom API Structure
This example shows how you might test OpenAI if the preset did not exist. With the request_template and selector settings you can interface with any JSON API.
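A sketch, assuming OpenAI's chat completions endpoint as the target; the key names mirror the setting names described below, the headers format in particular is an assumption, and the examples/ folder in the repo has the authoritative layout:
target = "my-custom-api"
url = "https://api.openai.com/v1/chat/completions"
request_template = '{"messages": [{"role": "system", "content": "{system_prompt}"}, {"role": "user", "content": "{prompt}"}], "model": "gpt-3.5-turbo"}'
selector = '$.choices[0].message.content'
headers = "Authorization: Bearer sk-..."
system-prompt = "Please answer the following question:"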
The request_template setting specifies how to structure an outgoing message to the model. You will need to specify the placeholders so that Mindgard knows how to pass this information to your custom API.
The url setting should point to an inference endpoint for your model under test. Mindgard will POST messages here formatted by the above request_template setting.
The selector setting is a JSON selector and specifies how to extract the model’s response from the API response. See instructions above for how to format JSON Path selector instructions
The headers setting allows you to specify a custom HTTP header to include with outgoing requests, for example to implement a custom authentication method.
OpenAI chat completions compatible APIs
This example shows how you can test a chat completions compatible API hosted by any provider, using Mistral Small 3 hosted by Mistral.
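A sketch under the assumption that Mistral's chat completions endpoint and the mistral-small-latest model alias are appropriate (verify both against Mistral's current documentation; key names as above):
target = "mistral-small-3"
preset = "openai-compatible"
url = "https://api.mistral.ai/v1/chat/completions"
model-name = "mistral-small-latest"
api_key = "..."
system-prompt = "Please answer the following question:"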
OpenAI compatible Custom APIs
CLI in a CI/CD pipeline
Running the CLI in a pipeline will cause it to exit with a non-zero exit status if the test’s flagged event to total event ratio is >= the threshold. To override the default threshold, pass --risk-threshold 50.
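For example, a sketch of a pipeline step (the target name and threshold are illustrative); a non-zero exit status from the CLI will fail the step:
mindgard test my-target --risk-threshold 50 --json <other settings> > results.json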
See an example of this in action here: https://github.com/Mindgard/mindgard-github-action-example
Viewing Results in Web
After running a test, the CLI will provide a link to the results in the Web UI. The results will also trigger any relevant webhook integrations and be shared with any relevant collaborators for the test target. These are detailed later in this guide.
Managing Request load
To avoid overloading a system under test, there are options to control load.
The --rate-limit flag restricts the number of outbound requests to your API under test.
The --parallelism flag sets the maximum number of requests that will target your model concurrently. This enables you to protect your model from receiving too many requests at once.
Mindgard requires that your model responds within 60s, so set parallelism accordingly (it should be less than the number of requests you can serve per minute).
Then run:
mindgard test --config-file mymodel.toml --parallelism X
Debugging
You can provide the flag --log-level=debug to get more information out of whatever command you’re running. On Unix-like systems you can redirect stderr to a file:
mindgard --log-level=debug test --config=<toml> --parallelism=5 2> stderr.log
If you are having difficulty testing your API, the resulting log file (together with your config file) will help support@mindgard.ai diagnose your problem.
Python SDK
If you have an unusual API that is incompatible with the CLI, or not accessible over the network, the Mindgard Python SDK is another option.
The same configurations used with the CLI can be defined programmatically using Mindgard as a library.
Testing Chatbot Applications
Chatbot API Testing
It’s recommended to test a chatbot through its application API. This will usually be more reliable and faster than testing via the UI.
It may be useful to use the browser devtools to locate the API to test. You can use the browser devtools to create a cURL command that duplicates the API request, which is then straightforward to translate into a Mindgard CLI command.
This will copy a command like the following to your clipboard (edited for brevity):
curl 'https://example.com/chat/conversation/foobarbaz' \
  -H 'accept: */*' \
  -H 'accept-language: en-GB,en-US;q=0.9,en;q=0.8' \
  -H 'content-type: application/json' \
  -H 'cookie: token=foo; token=bar;' \
  -H 'origin: https://example.com' \
  -H 'user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36' \
  --data-raw $''
To translate this into a Mindgard command, replace -H with --header, replace --data-raw with --request-template, and replace curl with mindgard test your-target-name --url.
You will also need to replace your user inputs with a placeholder of {prompt}.
The equivalent Mindgard CLI command for the above curl command would therefore be:
mindgard test your-target-name \
  --url 'https://example.com/chat/conversation/foobarbaz' \
  --header 'accept: */*' \
  --header 'accept-language: en-GB,en-US;q=0.9,en;q=0.8' \
  --header 'content-type: application/json' \
  --header 'cookie: token=foo; token=bar;' \
  --header 'origin: https://example.com' \
  --header 'user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36' \
  --request-template $''
Browser Web UI testing
It is also possible to use the Mindgard CLI to test an AI application through a web UI using browser automation. This approach is generally slower and less reliable than the API testing approach above.
This is provided via a wrapper that exposes a suitable API for testing via browser automation. If you need to use this approach, you may request access to the Mindgard chatbot wrapper via support@mindgard.ai.