The following examples show what the configuration file (mymodel.toml) might look like.

Using the OpenAI preset

target = "my-model-name"
preset = "openai"
api_key = "CHANGE_THIS_TO_YOUR_OPENAI_API_KEY"
system-prompt = '''
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
'''

Using the Anthropic preset

target = "Anthropic Claude 3 Sonnet"
model_name = "claude-3-sonnet-20240229"
preset = "anthropic"
api_key = "CHANGE_THIS_TO_YOUR_ANTHROPIC_API_KEY"
system_prompt = '''
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
'''

Custom API support (using the OpenAI API)

target = "my-model-name"
url = "https://api.openai.com/v1/chat/completions"
request_template = '''
{
    "messages": [
        {"role": "system", "content": "{system_prompt}"},
        {"role": "user", "content": "{prompt}"}],
    "model": "gpt-3.5-turbo",
    "temperature": 0.0,
    "max_tokens": 1024
}
'''
selector = '''
choices[0].message.content
'''
headers = "Authorization: Bearer CHANGE_THIS_TO_YOUR_OPENAI_API_KEY"
system_prompt = '''
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
'''
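
To make the roles of request_template and selector concrete, here is a rough sketch of the idea: the {system_prompt} and {prompt} placeholders are replaced with the actual text to form the JSON request body, and the selector names the field of the JSON response that holds the model's reply. This is an illustration only. The helper names fill_template and select_reply are hypothetical, the sample response is hand-written in the shape of a chat completion, and the real tool may perform substitution and selection differently.

```python
import json

# Same template as in the configuration above.
REQUEST_TEMPLATE = '''
{
    "messages": [
        {"role": "system", "content": "{system_prompt}"},
        {"role": "user", "content": "{prompt}"}],
    "model": "gpt-3.5-turbo",
    "temperature": 0.0,
    "max_tokens": 1024
}
'''

# Hand-written stand-in for a response from the endpoint (assumption).
SAMPLE_RESPONSE = {
    "choices": [{"message": {"role": "assistant", "content": "Hello!"}}]
}


def fill_template(template: str, system_prompt: str, prompt: str) -> dict:
    """Substitute both placeholders and parse the result as JSON."""

    def escape(text: str) -> str:
        # json.dumps adds surrounding quotes; drop them but keep the escaping
        # so quotes and newlines in the text do not break the JSON body.
        return json.dumps(text)[1:-1]

    body = template.replace("{system_prompt}", escape(system_prompt))
    body = body.replace("{prompt}", escape(prompt))
    return json.loads(body)


def select_reply(response: dict) -> str:
    """Hand-written equivalent of the selector choices[0].message.content."""
    return response["choices"][0]["message"]["content"]


request_body = fill_template(REQUEST_TEMPLATE, "Be concise.", "What is 2 + 2?")
print(json.dumps(request_body, indent=2))
print(select_reply(SAMPLE_RESPONSE))
```

Plain string replacement is used rather than str.format because the template is itself JSON and contains literal braces.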

Custom API

target = "users_model"
url = "http://gpu-a100:8001/infer"
selector = '["response"]'
request-template = '{"prompt": "[INST] {system_prompt} {prompt} [/INST]"}'
# request-template = '{"prompt": "### System\n {system_prompt}\n ### User\n {prompt} ### Assistant\n"}'
system-prompt = 'respond with hello'

GPT using an Azure deployment

target = "gpt2"
model_name = "gpt2"
tokenizer = "gpt2"
headers = "Authorization: Bearer YOUR_BEARER_TOKEN,Content-Type: application/json"  # Replace YOUR_BEARER_TOKEN with the API key for your endpoint in Azure
url = "DEPLOYMENT_ENDPOINT"  # Your deployed scoring endpoint in Azure, e.g. "https://MY_MODEL_NAME.REGION.inference.ml.azure.com/score"
selector = "[0]['generated_text']"
request_template = "{'inputs': {tokenized_chat_template}}"
system_prompt = '''
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
'''

Llama 3 deployed on Hugging Face

target = "llama-3-8b-instruct"
preset = "huggingface"
url = "https://CHANGE_THIS_TO_YOUR_INFERENCE_ENDPOINT_URL.us-east-1.aws.endpoints.huggingface.cloud"
api_key = "CHANGE_THIS_TO_API_KEY"
# selector = '["response"]'
request_template = '''
{
    "inputs": "<|im_start|>system\n{system_prompt}<|im_end|>\n<|im_start|>user\n{prompt}?<|im_end|>\n<|im_start|>assistant\n",
    "parameters": {
        "do_sample": true,
        "return_full_text": false
    }
}
'''
system_prompt = '''
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
'''
# Not using Hugging Face? Remove the preset setting above; you may also need to specify the following:
# selector = '["response"]'
# headers = 'Authorization: Bearer CHANGE_THIS_TO_API_KEY'

Local model

target = "users_model"
url = "http://localhost:9999/infer"
selector = '["response"]'
request-template = '{"prompt": "[INST] {system_prompt} {prompt} [/INST]"}'
system-prompt = '''
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
'''
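
For reference, an endpoint compatible with this configuration would accept a JSON body of the form {"prompt": "..."} at /infer and return JSON with a "response" field, which is what the selector picks out. The sketch below is a stand-in built on Python's standard http.server module. It assumes the request arrives as an HTTP POST with a JSON body, and it echoes the prompt back rather than calling a model. The names build_response, InferHandler, and serve are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def build_response(raw_body: bytes) -> bytes:
    """Turn a request like {"prompt": "..."} into {"response": "..."}."""
    request = json.loads(raw_body)
    # Placeholder for a real model call: echo the prompt back.
    reply = f"echo: {request['prompt']}"
    return json.dumps({"response": reply}).encode("utf-8")


class InferHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Only the path used in the configuration above is served.
        if self.path != "/infer":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = build_response(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 9999) -> None:
    """Listen on localhost until interrupted (port matches the url above)."""
    HTTPServer(("localhost", port), InferHandler).serve_forever()
```

Calling serve() starts the server on http://localhost:9999/infer; replacing the echo line in build_response with a call into an actual model is the only change needed to return real completions.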

Mistral 7B Instruct v0.2 deployed on Hugging Face

target = "mistral-7b-instruct-v0-2"
preset = "huggingface"
url = "XXX"
api_key = "XXX"
request_template = '''
{
    "inputs": "[INST] {system_prompt} {prompt} [/INST]",
    "parameters": {
        "do_sample": true,
        "return_full_text": false
    }
}
'''
system-prompt = '''
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
'''

# Not using Hugging Face? Remove the preset setting above; you may also need to specify the following:
# selector = '["response"]'
# headers = 'Authorization: Bearer CHANGE_THIS_TO_API_KEY'