Test Results
MITRE Atlas™ Adviser
Available via the top navigation, the MITRE Atlas™ Adviser shows the AI risks identified across all of your testing with Mindgard, grouped according to the MITRE ATLAS™ framework. Where Mindgard has detected exposure, the affected category is highlighted with a pill icon.
Clicking a category will show details of that adversarial technique, along with a list of the tests where exposure has been detected.
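For teams consuming these groupings programmatically, the grouping can be thought of as a mapping from each Mindgard attack category to an ATLAS technique. The sketch below is illustrative only: the `ATLAS_MAPPING`, the `exposures` records, and the field names are assumptions rather than Mindgard's data model, though the ATLAS IDs shown are real entries in the public framework.

```python
# Illustrative sketch: grouping detected exposures by MITRE ATLAS technique.
# The mapping and the `exposures` records are hypothetical; the ATLAS IDs
# (AML.T0051, AML.T0054) are real entries in the public framework.
from collections import defaultdict

# Hypothetical mapping from Mindgard attack categories to ATLAS techniques.
ATLAS_MAPPING = {
    "prompt_injection": "AML.T0051",  # LLM Prompt Injection
    "jailbreak": "AML.T0054",         # LLM Jailbreak
}

exposures = [
    {"test": "test-001", "category": "jailbreak", "detected": True},
    {"test": "test-002", "category": "prompt_injection", "detected": False},
]

by_technique = defaultdict(list)
for e in exposures:
    if e["detected"]:
        by_technique[ATLAS_MAPPING[e["category"]]].append(e["test"])

print(dict(by_technique))  # {'AML.T0054': ['test-001']}
```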
Model Test Results
Accessible via the top navigation, the Model Tests page lists the results of any tests you have run. Your most recent test results are always shown at the top of the page, and testing history for the same test target is grouped together.
Test Overview
Clicking through from the Model Tests page to a specific target will show details of the test and the various risks identified.
Top left shows the overall risk of the system under test, aggregated from the risk scores of each test technique.
Top right shows a risk profile: the risk score (a success percentage) for each category of attack technique. Mindgard runs several attack techniques within each category. In the example provided here, the system under test was particularly susceptible to the Jailbreak and Robustness categories, but was not susceptible at all to the indirect prompt injection or prompt extraction techniques tested.
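To make the scoring concrete, here is a minimal sketch of how per-category scores and the overall risk might be derived, assuming each category's score is the mean success rate of its techniques and the overall risk is the highest category score. The aggregation method and the sample data are assumptions, not Mindgard's documented algorithm.

```python
# Minimal sketch, assuming category score = mean technique success rate
# and overall risk = worst (highest) category. The aggregation method and
# the sample data below are assumptions, not Mindgard's documented algorithm.
technique_results = {
    "jailbreak": [0.8, 0.6],               # success rate per technique
    "robustness": [0.7, 0.5],
    "indirect_prompt_injection": [0.0],
    "prompt_extraction": [0.0, 0.0],
}

category_scores = {
    category: 100 * sum(rates) / len(rates)
    for category, rates in technique_results.items()
}
overall_risk = max(category_scores.values())

print(category_scores)  # {'jailbreak': 70.0, 'robustness': 60.0, ...}
print(overall_risk)     # 70.0
```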
Underneath is the target's cumulative risk score over time, drawn from all security tests of the same target. The result you are currently viewing is highlighted; clicking another data point takes you to that test's results.
This example illustrates the risk score increasing when the system prompt for the application under test was changed to weaken the protection.
The results can also be downloaded as a CSV via the Download Attacks List button, for import into other tools.
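If you want to work with the export programmatically, a sketch like the following will load it. The filename and column names ("attack", "risk_score") are hypothetical, so check the header row of your own export.

```python
# Loading the exported attacks CSV for use in other tools. The filename and
# column names here ("attack", "risk_score") are hypothetical; check the
# header row of your own export, as the actual schema may differ.
import csv

with open("attacks_list.csv", newline="") as f:
    rows = list(csv.DictReader(f))

# e.g. list attacks sorted by risk score, highest first
for row in sorted(rows, key=lambda r: float(r["risk_score"]), reverse=True):
    print(row["attack"], row["risk_score"])
```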
Lower down on the page is a list of the attack techniques tested, each with its respective risk score. Clicking into a single attack row opens a more detailed report.
Each risk overview page has the same overall structure.
AI Risk Score Pane
Within the top left you will see details of the scoring methodology. The score is the percentage of attack attempts against the AI model that were deemed successful for the specific attack technique; the more attack samples deemed successful, the higher the risk score.
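In other words, the score is simply the proportion of successful attempts expressed as a percentage. The counts in this sketch are illustrative.

```python
# The risk score as described above: the percentage of attack attempts
# deemed successful for a given technique. Sample counts are illustrative.
def risk_score(successful_attempts: int, total_attempts: int) -> float:
    """Percentage of attack attempts deemed successful."""
    return 100 * successful_attempts / total_attempts

print(risk_score(13, 20))  # 65.0 -- 13 of 20 attack samples succeeded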
AI Model Threat Landscape Pane
Top middle presents the threat landscape, which shows how the AI model you have selected compares with similar models from Mindgard's threat intelligence.
Target Pane
Top right provides the attack context, with details of the attack that was run.
Provenance Pane
Bottom middle shows the provenance, i.e. the details of the inputs and outputs observed during the test. In this case, as we are testing an LLM, text prompts and responses are shown.
Remediation Pane
The remediation section shows recommendations to reduce the system’s susceptibility to this technique.
Other Examples
Other test results follow a similar structure, but the remediation and provenance content will vary. Here are example results for image and text attacks.
False Positives or Negatives
Mindgard employs many techniques to classify the results of a test as successful or unsuccessful. As with many forms of security testing, this classification is not, and cannot be, 100% accurate. Mindgard continually refines the classification system to improve its accuracy.
If you spot a false positive or false negative, you can tag it as such here to amend your results and flag the error, helping to improve Mindgard's detection system in the future.
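As a purely illustrative sketch, re-tagging results would change the success percentage as follows. Whether and how Mindgard recomputes scores after tagging is an assumption here, not documented behaviour.

```python
# Illustrative arithmetic only: how re-tagging results could change a risk
# score. Whether and how Mindgard recomputes scores after tagging is an
# assumption, not documented behaviour.
def adjusted_risk_score(successes: int, total: int,
                        false_positives: int, false_negatives: int) -> float:
    """Recompute the success percentage after correcting tagged results."""
    corrected = successes - false_positives + false_negatives
    return 100 * corrected / total

print(adjusted_risk_score(13, 20, false_positives=2, false_negatives=0))  # 55.0
```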