The most current instructions for use of the Mindgard CLI will always be available at https://github.com/mindgard/cli

Pre-requisites

Mindgard’s CLI requires a working Python 3.10+ environment and pip, pipx, or an equivalent package manager.

If your organization has custom SSL certificates deployed for traffic inspection, these must be available to the Python certificate store.

Installation

Install the Mindgard CLI with

pip install mindgard
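
If you prefer an isolated installation, pipx should work equally well (this assumes the same mindgard package on PyPI):

pipx install mindgard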

Login to Mindgard

The Mindgard CLI works in conjunction with a Mindgard SaaS deployment. Before you can run tests, you will need to log in with

mindgard login

If you are a Mindgard enterprise private tenant customer, login to your enterprise instance using the command:

mindgard login --instance <name>

Replace <name> with the instance name provided by your Mindgard representative. This instance name identifies your SaaS, private tenant, or on-prem deployment.

Bulk Deployment

To perform a bulk deployment of the Mindgard CLI:

  1. Login and Configure: Login and Configure the Mindgard CLI on a test workstation
  2. Provision Files: Provision the files contained in the .mindgard/ folder within your home directory to your target instances via your preferred deployment mechanism (see the sketch after the folder contents below).

The .mindgard/ folder contains:

  • token.txt: A JWT for authentication.
  • instance.txt (enterprise only): Custom instance configuration for your SaaS or private tenant.
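
A minimal sketch of step 2, assuming SSH access to the target workstations; the hostnames below are placeholders for your own machines:

for host in workstation1 workstation2; do
  scp -r ~/.mindgard "$host":~/
done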

Testing a Model or Application

Testing an external AI model uses the test command, which can target either LLMs or Image Classifiers. The general command form is:

mindgard test <name> --url <url> <other settings>

Config Files

To test a custom API, you will need to specify a few options to tell Mindgard how to interface with your API. These can be specified as command line flags or in a .toml config file. Mindgard will automatically use configuration from a file named mindgard.toml in the current directory, or you can specify a different filename with

mindgard test --config my-file-name.toml
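
As an illustrative sketch only (the values are placeholders and the request template shown is an assumption; the individual settings are explained in the sections below), a minimal config file for an LLM target might look like:

target = "my-model-name"
url = "http://127.0.0.1/infer"
selector = '["response"]'
request_template = '{"prompt": "{system_prompt} {prompt}"}'
system_prompt = 'respond with hello'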

Testing LLMs

To test an LLM you will need to specify a few parameters to help Mindgard interface with your API. Here is an example command to test an inference API:

mindgard test my-model-name \
  --url http://127.0.0.1/infer \ # url to test
  --selector '["response"]' \ # JSON selector to match the textual response
  --request-template '' \ # how to format the system prompt and prompt in the API request
  --system-prompt 'respond with hello' # system prompt to test the model with
--url

The URL is for an API endpoint that accepts HTTP POST requests representing user input to the model or application. Mindgard will POST adversarial inputs to this URL.

--selector

The Selector is a JSONPath expression (https://jsonpath.com) that tells Mindgard how to identify your model’s response within the API response.

Your browser devtools may be useful to observe the structure of your API response to determine what this should be set to. In the example shown in the screenshot below, “$.text” would be used to match the text response from the chatbot.
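
As an illustrative sketch (the response structure here is hypothetical, not taken from the screenshot), if your API returned a body such as:

{ "text": "Hello! How can I help?", "conversation_id": "abc123" }

then a selector of $.text would extract the chatbot’s textual reply.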

--request-template

The Request Template tells Mindgard how to format an outbound request to your test target API.

Your browser devtools may be useful to observe the structure of the outbound request.

There are two template placeholders you must include in your Request Template.

{prompt} - Mindgard will replace this placeholder with an adversarial input as part of an attack technique.

{system_prompt} - Mindgard will replace this placeholder with the system prompt you specify below. This allows you to test how the system behaves with different system instructions.

The API shown in the screenshot above would require a Request Template that embeds both placeholders in the JSON body the endpoint expects.
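
As a sketch only (the field name "message" is an assumption, not taken from the screenshot), such a template might look like:

--request-template '{"message": "{system_prompt} {prompt}"}'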

--system-prompt

This flag specifies the system prompt for the AI model. If you’re testing a model inference API directly, you may wish to include the real system prompt used by your application here to simulate its performance as part of the wider application.

If the system prompt is not relevant to your tests, you may place a benign placeholder here e.g. “Please answer the following question:”

If you include a relevant system prompt here, Mindgard will include in its testing an evaluation of whether the system prompt instructions can be bypassed.

--help

All of the test options are available via

mindgard test --help

--domain

The Domain option allows you to choose to test with datasets and goals relevant to your system domain. For example, the Finance domain covers scenarios like abuse to enable fraud or money laundering. The SQL Injection option covers scenarios such as abuse of an LLM component to bypass a WAF and expose an SQL injection vulnerability for exploitation.
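
The exact identifier strings accepted by --domain may differ from the display names above; as an assumption-laden sketch, selecting the Finance domain might look like:

mindgard test --config mymodel.toml --domain finance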

--mode

The mode option takes a duration setting, which controls a tradeoff between the speed of the test and the confidence of any scores given. Dynamic testing of inherently non-deterministic AI systems has inherent limitations in the confidence of results. A mitigation for this is running tests for longer with more samples to increase confidence; the tradeoff is that tests will take longer and increase your model hosting costs.

--json

The JSON flag switches from a human-readable output to a JSON output, for use in automated workflows and to aid in composition with other tooling. See the workflow integrations section of this guide for more details.
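
For example, in an automated workflow you might capture the JSON output to a file for later processing (the config filename is a placeholder):

mindgard test --config mymodel.toml --json > results.json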

Testing Image Classifiers

Mindgard supports any image model exposed via an API compatible with Hugging Face’s Inference Endpoints for image classifiers.

Image models require a few more parameters than LLMs so we recommend using a configuration file:

target = "my-custom-model" model-type = "image" api_key = "hf_###" url = "https://####.@@@@.aws.endpoints.huggingface.cloud" dataset = "beans" labels = ''''''

After saving as image-model-config.toml, it can be used in the test command as follows:

mindgard test --config=image-model-config.toml

Image Classifier Labels

Many image classifiers do not return probabilities for all classes. A config is required to make sure we’re aware of all the labels and tensor indexes for the classes you’re going to send us.

Below is an example request for a eurosat model:

curl "https://address.com/model" \ -X POST \ --data-binary '@cats.jpg' \ -H "Accept: application/json" \ -H "Content-Type: image/jpeg"

The image in bytes will be sent in the data field of the POST request, and the HTTP response body should include predictions in the form:

[
  { "label": "<class name>", "score": 0.998 },
  { "label": "<class name>", "score": 0.002 }
]
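
As a sketch of how the labels setting from the earlier config might then be populated (the format and label names here are assumptions; the ordering must match your model's tensor indexes):

labels = '''
["angular_leaf_spot", "bean_rust", "healthy"]
'''
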
Image Classifier Datasets

We include a number of datasets to choose from that cover diverse domains such as facial recognition, medical imaging, satellite imagery, and handwritten digit recognition, allowing for a suite of different custom models to be tested.

CLI Dataset     Domain                                                    Source/Name
mri             Classification of Alzheimer's based on MRI scans          HuggingFace Alzheimer_MRI
xray            Classification of pneumonia based on chest x-rays         HuggingFace chest-xray-classification
rvltest_mini    Classification of documents as letter, memo, etc.         HuggingFace rvlTest
eurosat         Classification of satellite images by terrain features    HuggingFace eurosat-demo
mnist           Classification of handwritten digits 0-9                  TorchVision MNIST
beans           Classification of leaves as either healthy or unhealthy   HuggingFace beans

Validate Configuration

Validate that your API is accessible and your configuration is working before launching tests. A preflight check is run automatically when submitting a new test, but if you want to invoke it manually:

mindgard validate --url <endpoint_url> <other_settings>

e.g.

mindgard validate \
  --url http://127.0.0.1/infer \ # url to test
  --selector '["response"]' \ # JSON selector to match the textual response
  --request-template '' \ # how to format the prompts in the API request
  --system-prompt 'respond with hello' # system prompt to test with

Example CLI Configurations

There are examples of what the configuration file (mymodel.toml) might look like in the examples/ folder of the Mindgard CLI GitHub repo: https://github.com/mindgard/cli

Here are two examples:

Targeting OpenAI

This example uses the built-in preset settings for OpenAI. Presets exist for OpenAI, Hugging Face, and Anthropic.

target = "my-model-name" preset = "openai" api_key= "CHANGE_THIS_TO_YOUR_OPENAI_API_KEY" system-prompt = ''' You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information. '''

You will need to substitute your own api_key value.

The target setting is an identifier for the model you are testing within the Mindgard platform; tests for the same model will be grouped and traceable over time.

Altering the system-prompt enables you to compare results with different system prompts in use. Some of Mindgard’s tests assess the efficacy of your system prompt.

Any of these settings can also be passed as command line arguments, e.g. mindgard test my-model-name --system-prompt 'You are…'. This may be useful for passing in a dynamic value for any of these settings.
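
For example, a minimal sketch of supplying the system prompt dynamically from a file (the filename is a placeholder):

mindgard test my-model-name --system-prompt "$(cat system-prompt.txt)"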

Custom API Structure

This example shows how you might test OpenAI if the preset did not exist. With the request_template and selector settings you can interface with any JSON API.

target = "my-model-name" url = "https://api.openai.com/v1/chat/completions" request_template = ''' ''' selector = ''' choices[0].message.content ''' headers = "Authorization: Bearer CHANGE_THIS_TO_YOUR_OPENAI_API_KEY" system_prompt = ''' You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information. '''

The request_template setting specifies how to structure an outgoing message to the model. You will need to include the {system_prompt} and {prompt} placeholders so that Mindgard knows how to pass this information to your custom API.
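
For the OpenAI chat completions endpoint above, the template would likely resemble the following sketch (the message structure follows OpenAI's chat completions API; the model name and any other fields are assumptions you should adapt to your needs):

request_template = '''
{"messages": [{"role": "system", "content": "{system_prompt}"}, {"role": "user", "content": "{prompt}"}], "model": "gpt-3.5-turbo"}
'''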

The url setting should point to an inference endpoint for your model under test. Mindgard will POST messages here formatted by the above request_template setting.

The selector setting is a JSON selector and specifies how to extract the model’s response from the API response. See the instructions above for how to format JSONPath selector expressions.

The headers setting allows you to specify a custom HTTP header to include with outgoing requests, for example to implement a custom authentication method.

CLI in a CI/CD pipeline

The exit code of a test will be non-zero if the test identifies risks above your risk threshold. To override the default risk threshold, pass --risk-threshold 50. This will cause the CLI to exit with a non-zero exit status if any test results in a risk score over 50.
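
A minimal sketch of a pipeline step (the config filename and threshold value are placeholders); if any risk score exceeds the threshold, the non-zero exit code will fail the step:

mindgard test --config mymodel.toml --risk-threshold 50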

See an example of this in action here: https://github.com/Mindgard/mindgard-github-action-example

Viewing Results in Web

After running a test, the CLI will provide a link to the results in the Web UI. The results will also trigger any relevant webhook integrations and be shared with any relevant collaborators for the test target. These are detailed later in this guide.

Managing Request load

To avoid overloading a system under test, there are options to control load.

The --rate-limit flag restricts the number of outbound requests to your API under test.

The --parallelism flag sets the maximum number of requests targeting your model concurrently. This enables you to protect your model from receiving too many requests at once.

Mindgard requires that your model responds within 60s, so set parallelism accordingly (it should be less than the number of requests you can serve per minute).

Then run:

mindgard test --config-file mymodel.toml --parallelism X

Debugging

You can provide the flag --log-level=debug to get more information out of whatever command you’re running. On Unix-like systems this will write stderr to a file:

mindgard --log-level=debug test --config=<toml> --parallelism=5 2> stderr.log

If you are having difficulty testing your API, this log file will help support@mindgard.ai diagnose your problem.

Python SDK

If you have an unusual API that is incompatible with the CLI, or not accessible over the network, the Mindgard Python SDK is another option.

The same configurations used with the CLI can be defined programmatically using Mindgard as a library.

Testing Image Model In-Process

This example uses the Mindgard Python SDK to test an image model using a stub image model wrapper. The wrapper is the programmatic interface to the model and the most likely extensibility point for a custom integration.

import logging
import os
import random
from typing import List

import requests

from mindgard.test import Test, TestConfig, ImageModelConfig
from mindgard.wrappers.image import ImageModelWrapper, LabelConfidence

logging.basicConfig(format='%(asctime)s [%(levelname)-8s] %(name)s: %(message)s', level=logging.DEBUG)

# Get token for Mindgard API.
# Assumes you have previously run the CLI once on this machine and logged in,
# generating a ~/.mindgard/token.txt
config_folder = os.path.join(os.path.expanduser('~'), '.mindgard')
with open(os.path.join(config_folder, "token.txt"), "r") as f:
    refresh_token = f.read().strip()

# Exchange the refresh token for a Mindgard access token
mindgard_access_token = (
    requests.post(
        "https://",  # token exchange endpoint (URL and request payload omitted here)
    )
    .json()
    .get("access_token")
)

# Override the Image Model Wrapper so that instead of calling an API it runs custom code
class ExampleStubImageModelWrapper(ImageModelWrapper):
    def __init__(self, labels: List[str]) -> None:
        super().__init__(url="", labels=labels)

    # Example returning random scores for the provided labels
    def __call__(self, image: bytes) -> List[LabelConfidence]:
        return [LabelConfidence(label=l, score=random.uniform(0, 1)) for l in self.labels]

image_wrapper = ExampleStubImageModelWrapper(labels=["0", "1", "2"])

config = TestConfig(
    api_base="https://api.sandbox.mindgard.ai/api/v1",
    api_access_token=mindgard_access_token,
    target="my-image-model",
    attack_source="user",
    model=ImageModelConfig(
        labels=image_wrapper.labels,
        wrapper=image_wrapper,
        dataset="mnist"
    ),
    parallelism=5
)

test = Test(config)
test.run()

Testing Chatbot Applications

Chatbot API Testing

It’s recommended to test a chatbot through its application API. This will usually be more reliable and faster than testing via the UI.

It may be useful to use the browser devtools to locate the API to test. You can use the browser devtools’ “Copy as cURL” option to create a curl command that duplicates the API request, which is then straightforward to translate into a Mindgard CLI command.

This will copy to your clipboard a command like the following (edited for brevity)

curl 'https://example.com/chat/conversation/foobarbaz' \
  -H 'accept: */*' \
  -H 'accept-language: en-GB,en-US;q=0.9,en;q=0.8' \
  -H 'content-type: application/json' \
  -H 'cookie: token=foo; token=bar;' \
  -H 'origin: https://example.com' \
  -H 'user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36' \
  --data-raw $''

To translate this into a Mindgard command, replace -H with --header, replace --data-raw with --request-template, and replace curl with mindgard test your-target-name --url.

You will also need to replace your user inputs in the request body with a placeholder of {prompt}.

The equivalent Mindgard CLI for the above curl command would therefore be

mindgard test your-target-name \
  --url 'https://example.com/chat/conversation/foobarbaz' \
  --header 'accept: */*' \
  --header 'accept-language: en-GB,en-US;q=0.9,en;q=0.8' \
  --header 'content-type: application/json' \
  --header 'cookie: token=foo; token=bar;' \
  --header 'origin: https://example.com' \
  --header 'user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36' \
  --request-template $''

Browser Web UI testing

It is also possible to use the Mindgard CLI to test an AI application through a web UI using browser automation. This approach is generally slower and less reliable than the API testing approach above.

This is provided via a wrapper that exposes a suitable API for testing via browser automation. If you need to use this approach, you may request access to the Mindgard chatbot wrapper via support@mindgard.ai