Input In Output
Description
When evaluating your application, we detected prompts our system sent to your LLM being repeated in the response sent back from your LLM. If this is intentional, note that our system requires only the new content generated by the LLM in order to evaluate your application effectively. If it is not intentional, this can indicate a misconfiguration.
Impact
Our evaluation systems use the output of your LLM to determine classification. If the input we sent to your LLM is repeated in that output, results can be misclassified.
Solution
Mindgard CLI
If your LLM application returns the input within the output, you can use the Mindgard CLI --selector
parameter to run a regex substitution that removes the input from the content sent to Mindgard, e.g.
mindgard --selector '$.reply.`sub(/.*inst\]/, )`'
The above will first apply the JSON Path selector to extract the contents of the "reply" field, and then remove all content up to and including the last instance of the string inst].
So
{"reply": "[inst]I am the input provided to the LLM[/inst] I am the response from the LLM"}
will be transformed into
I am the response from the LLM
for analysis by Mindgard.
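For clarity, the following is a minimal Python sketch of the equivalent transformation, using the example response body above. The selector itself runs inside the Mindgard CLI; this sketch is only an illustration of the two steps it performs.

import json
import re

# Example raw response body, as shown above.
raw_body = '{"reply": "[inst]I am the input provided to the LLM[/inst] I am the response from the LLM"}'

# Step 1: the JSON Path selector extracts the contents of the "reply" field.
reply = json.loads(raw_body)["reply"]

# Step 2: sub(/.*inst\]/, ) removes everything up to and including the
# last occurrence of "inst]" (".*" is greedy).
cleaned = re.sub(r".*inst\]", "", reply)

print(cleaned.strip())  # -> I am the response from the LLM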
LLM Service Provider
Depending on your LLM hosting provider, there may be built-in options to sanitise outputs. For example, within HuggingFace Inference Endpoints you can set return_full_text to false.
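As an illustration, the sketch below sends a request to a text-generation Inference Endpoint with return_full_text disabled, so only the newly generated text is returned rather than the prompt echoed back. The endpoint URL and token are placeholders, and the exact request format may vary with your provider and model task.

import requests

# Placeholder values (assumptions) - replace with your own endpoint and token.
ENDPOINT_URL = "https://your-endpoint.endpoints.huggingface.cloud"
HF_TOKEN = "hf_your_access_token"

payload = {
    "inputs": "I am the input provided to the LLM",
    "parameters": {
        # With return_full_text set to false, the endpoint returns only the
        # generated continuation, not the input prompt.
        "return_full_text": False,
    },
}

response = requests.post(
    ENDPOINT_URL,
    headers={"Authorization": f"Bearer {HF_TOKEN}"},
    json=payload,
)
print(response.json())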