The GPT models provided by OpenAI offer a variety of parameters that change the way the language model responds. Below you can find a list of the most important ones, followed by a short example that combines them.
Temperature: Temperature (temperature) is a parameter that controls the randomness of the generated text. Lower temperatures result in more deterministic outputs, where the model tends to choose the most likely tokens at each step. Higher temperatures introduce more randomness, allowing the model to explore less likely tokens and produce more creative outputs. It’s often used to balance between generating safe, conservative responses and more novel, imaginative ones.
Max Tokens: Max Tokens (max_tokens) limits the maximum length of the generated text by specifying the maximum number of tokens (words or subwords) allowed in the output. This parameter helps to control the length of the response and prevent the model from generating overly long or verbose outputs, which may not be suitable for certain applications or contexts.
Top P (Nucleus Sampling): Top P (top_p), also known as nucleus sampling, restricts sampling to the smallest set of most likely tokens whose cumulative probability exceeds the threshold specified by the parameter. This approach ensures diversity in the generated text while still prioritizing tokens with higher probabilities. It’s particularly useful for generating diverse and contextually relevant responses.
Frequency Penalty: Frequency Penalty (frequency_penalty) penalizes tokens based on their frequency in the generated text. Tokens that appear more frequently are assigned higher penalties, discouraging the model from repeatedly generating common or redundant tokens. This helps to promote diversity in the generated text and prevent the model from producing overly repetitive outputs.
Presence Penalty: Presence Penalty (presence_penalty) penalizes tokens that have already appeared in the text so far, including the input prompt. By discouraging the model from simply echoing or replicating text it has already seen, this parameter encourages the generation of responses that go beyond the provided context and move on to new topics. It’s useful for generating more creative and novel outputs that are not directly predictable from the input.
Stop Sequence: Stop Sequence (stop) specifies one or more sequences that, if generated by the model, signal it to stop generating further text. This parameter is commonly used to indicate the desired ending or conclusion of the generated text. It helps to control the length of the response and ensure that the model generates text that aligns with specific requirements or constraints.
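To see how these parameters are passed in practice, here is a minimal sketch that combines several of them in a single request. It assumes that client and MODEL have already been set up as shown in the next section; the concrete values are arbitrary and only for illustration.

# A minimal sketch combining the parameters above in one request.
# Assumes `client` is an already configured OpenAI client and
# `MODEL` holds a valid model name (see the setup code below).
completion = client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user", "content": "Write a tagline for a coffee shop."}],
    temperature=0.7,        # moderate randomness
    max_tokens=50,          # cap the length of the output
    top_p=0.9,              # nucleus sampling threshold
    frequency_penalty=0.5,  # discourage frequently repeated tokens
    presence_penalty=0.3,   # discourage tokens that already appeared
    stop=["\n\n"],          # stop at the first blank line
)
print(completion.choices[0].message.content)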
Roles
In order to cover most tasks you want to perform using a chat format, the OpenAI API lets you define different roles in the chat. The available roles are system, assistant, user, and tool. You should already be familiar with two of them by now: the user role corresponds to the actual user prompting the language model, and all answers are given with the assistant role.
The system role can be used to provide additional general instructions to the language model that are typically not part of the user input, for example, the style in which the model responds. In this case, an example is better than any explanation.
import os
from llm_utils.client import get_openai_client, OpenAIModels

MODEL = OpenAIModels.GPT_4o.value

client = get_openai_client(
    model=MODEL,
    config_path=os.environ.get("CONFIG_PATH")
)

completion = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are an annoyed technician working in a help center for dish washers, who answers in short, unfriendly bursts."},
        {"role": "user", "content": "My dish washer does not clean the dishes, what could be the reason."}
    ]
)
print(completion.choices[0].message.content)
Check for clogs and clean the filters.
Function calling
As we have seen, most interactions with a language model happen in the form of a chat, with almost “free” questions or instructions and answers. While this is the most natural format in most cases, it is not always practical if we want to use a language model for very specific purposes. This happens particularly often when we want to employ a language model in business situations, where we require consistent output from the model.
As an example, let us try to use GPT for sentiment analysis (see also here). Let’s say we want GPT to classify a text into one of the following four categories:
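The code below references a Python list called sentiment_categories that holds these categories. Its original definition is not shown here; the exact values are an assumption (the outputs further down only confirm positive and negative), so a plausible definition could be:

# Assumed definition: only "positive" and "negative" are confirmed by
# the outputs below; "neutral" and "mixed" are plausible guesses for
# the remaining two categories.
sentiment_categories = ["positive", "negative", "neutral", "mixed"]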
messages = []
messages.append(
    {"role": "system",
     "content": f"Classify the given text into one of the following sentiment categories: {sentiment_categories}."}
)
messages.append(
    {"role": "user",
     "content": "I really did not like the movie."}
)

response = client.chat.completions.create(
    messages=messages,
    model=MODEL
)
print(f"Response: '{response.choices[0].message.content}'")
Response: 'Category: Negative'
It is easy to spot the problem: GPT does not necessarily answer in the way we expect or want it to. In this case, instead of simply returning the correct category, it also returns the string Category: alongside it (and a capitalized Negative). So if we were to use the answer in a program or database, we’d again have to use some NLP techniques to parse it in order to eventually retrieve exactly the category we were looking for: negative. What we need instead is a way to constrain GPT to a specific way of answering, and this is where functions or tools come into play (see also Function calling and Function calling (cookbook)).
This concept allows us to specify the exact output format we expect to receive from GPT (it is called functions because, ideally, we want to call a function directly on GPT’s output, so the output has to be in a specific format).
# this looks intimidating but isn't that complicated
tools = [
    {
        "type": "function",
        "function": {
            "name": "analyze_sentiment",
            "description": "Analyze the sentiment in a given text.",
            "parameters": {
                "type": "object",
                "properties": {
                    "sentiment": {
                        "type": "string",
                        "enum": sentiment_categories,
                        "description": "The sentiment of the text."
                    }
                },
                "required": ["sentiment"],
            }
        }
    }
]
messages = []
messages.append(
    {"role": "system",
     "content": f"Classify the given text into one of the following sentiment categories: {sentiment_categories}."}
)
messages.append(
    {"role": "user",
     "content": "I really did not like the movie."}
)

response = client.chat.completions.create(
    messages=messages,
    model=MODEL,
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "analyze_sentiment"}}
)
print(f"Response: '{response.choices[0].message.tool_calls[0].function.arguments}'")
Response: '{
"sentiment": "negative"
}'
We can now easily extract what we need:
import json

# remember that the answer is a string
result = json.loads(response.choices[0].message.tool_calls[0].function.arguments)
print(result["sentiment"])
negative
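To make this reusable, the whole call can be wrapped in a small helper function. This is only a sketch under the assumptions made above, i.e. that client, MODEL, tools, and sentiment_categories are defined as in the previous cells:

import json

def classify_sentiment(text: str) -> str:
    # Classify `text` into one of the predefined sentiment categories.
    response = client.chat.completions.create(
        messages=[
            {"role": "system",
             "content": f"Classify the given text into one of the following sentiment categories: {sentiment_categories}."},
            {"role": "user", "content": text},
        ],
        model=MODEL,
        tools=tools,
        # force the model to call our function instead of answering freely
        tool_choice={"type": "function", "function": {"name": "analyze_sentiment"}},
    )
    arguments = json.loads(response.choices[0].message.tool_calls[0].function.arguments)
    return arguments["sentiment"]

print(classify_sentiment("I really did not like the movie."))  # e.g. 'negative'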
We can also include multiple function parameters if our desired output has multiple components. Let’s add another parameter that captures the reason for the sentiment.
tools = [ {"type": "function","function": {"name": "analyze_sentiment","description": "Analyze the sentiment in a given text.","parameters": {"type": "object","properties": {"sentiment": {"type": "string","enum": sentiment_categories,"description": f"The sentiment of the text." },"reason": {"type": "string","description": "The reason for the sentiment in few words. If there is no information, do not make assumptions and leave blank." } },"required": ["sentiment", "reason"], } } }]
messages = []
messages.append(
    {"role": "system",
     "content": f"Classify the given text into one of the following sentiment categories: {sentiment_categories}. If you can, also extract the reason."}
)
messages.append(
    {"role": "user",
     "content": "I loved the movie, Johnny Depp is a great actor."}
)

response = client.chat.completions.create(
    messages=messages,
    model=MODEL,
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "analyze_sentiment"}}
)
print(f"Response: '{response.choices[0].message.tool_calls[0].function.arguments}'")
Response: '{
"sentiment": "positive",
"reason": "Loved the movie and appreciates Johnny Depp's acting"
}'
Here, again, we could also constrain the possible values for the reason to a certain set, as sketched below. Hence, functions are a great way to obtain more consistent answers from the language model so that we can use it in applications.
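As a sketch of that idea, the reason parameter could get an enum of its own, just like sentiment. The category names below are invented for illustration and would need to fit the actual use case:

# Hypothetical set of allowed reasons; the names are made up for
# illustration only.
reason_categories = ["acting", "story", "visuals", "music", "other"]

# Replace the free-text reason with an enum-constrained version.
tools[0]["function"]["parameters"]["properties"]["reason"] = {
    "type": "string",
    "enum": reason_categories,
    "description": "The main reason for the sentiment.",
}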