Introduction
I’ve been playing around with LangChain and the OpenAI APIs lately, and I decided to build a tiny project to test things out—a command-line chatbot powered by GPT-4o-mini. I named the assistant Agustrama, and it’s designed to be a friendly, helpful companion in the terminal.
Full code is available in this GitHub repo.
What Does This Chatbot Do?
This bot is super simple:
- It greets the user on the first interaction.
- It keeps track of the conversation history.
- It trims the conversation when it gets too long.
- You can exit anytime by typing `q`.
All of this is done using LangChain’s chat message utilities and OpenAI’s model API. Here’s a quick walkthrough of how it works.
Setup
First, we load our environment variables and initialize the model.
Create a `.env` file in the root of the project directory and add a key `OPENAI_API_KEY` with your API key as its value. Once the `.env` file is ready, we can build our chatbot.
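For example, the `.env` file would contain a single line (the key value here is just a placeholder):

```
OPENAI_API_KEY=sk-...your-key-here...
```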
First, let us load the environment variables using `dotenv` and create an instance of `ChatOpenAI`.
```python
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI

load_dotenv()
model = ChatOpenAI(model="gpt-4o-mini")
```
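Optionally, you can sanity-check the setup with a one-off call before building the loop. This isn't part of the chatbot, just a quick test:

```python
# Quick sanity check that the key and model are configured correctly.
print(model.invoke("Say hello!").content)
```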
We also set up an initial system prompt that defines the assistant’s persona:
```python
from langchain_core.messages import BaseMessage, SystemMessage

messages: list[BaseMessage] = []
system_prompt = SystemMessage(content="Your name is 'Agustrama' and you are a helpful assistant responsible for helping the users in the best possible way. On the first conversation make sure to introduce yourself.")
messages.append(system_prompt)
```
To add some additional context: system messages in chatbots are used to define the persona of the assistant/bot. Here we name the bot Agustrama, a hypothetical name. The `messages` list will hold the chat history, so it can either be stored or used to maintain the context of the chat.
Implementing the chatbot
We are using a simple loop which does the following operations:
- Waits for the user’s input.
- Appends it to the message list.
- Sends the messages to the model.
- Prints the model’s response.
```python
from langchain_core.messages import HumanMessage, trim_messages

while True:
    msg = input("User: ")
    if msg == 'q':
        print("Agustrama: Thank you very much for contacting me. Hope to see you soon.")
        break

    # Copy the history so appending the new prompt below doesn't mutate `messages` twice.
    chat_messages = list(messages)
    if len(messages) >= 7:
        # Keep the system message plus only the most recent messages.
        chat_messages = trim_messages(
            messages,
            token_counter=len,  # count messages rather than actual tokens
            max_tokens=5,       # keep at most 5 messages
            start_on="human",
            end_on=("human", "tool"),
            include_system=True,
            allow_partial=False,
        )

    user_prompt = HumanMessage(content=msg)
    messages.append(user_prompt)       # full history
    chat_messages.append(user_prompt)  # context actually sent to the model

    model_response = model.invoke(chat_messages)
    print(f"Agustrama: {model_response.content}")
    messages.append(model_response)
```
Every chat thread is built from a system message, human messages, and AI messages: human messages are the prompts the user sends, while AI messages are the responses from the LLM. LLMs have a limited context window, so to make sure we don't breach that cutoff, we keep trimming the message history. Keeping the last X messages still gives the LLM enough context to follow the chat.
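To make the trimming behaviour concrete, here is a small standalone sketch with a made-up conversation. With `token_counter=len`, every message counts as one "token", so `max_tokens=5` keeps the system message plus the most recent turns:

```python
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage, trim_messages

history = [
    SystemMessage(content="You are Agustrama, a helpful assistant."),
    HumanMessage(content="Hi!"),
    AIMessage(content="Hello! I'm Agustrama."),
    HumanMessage(content="What is LangChain?"),
    AIMessage(content="A framework for building LLM applications."),
    HumanMessage(content="Thanks!"),
    AIMessage(content="You're welcome!"),
]

trimmed = trim_messages(
    history,
    token_counter=len,    # each message counts as one "token"
    max_tokens=5,         # so at most 5 messages survive
    start_on="human",     # the part after the system message starts on a human turn
    include_system=True,  # always keep the system message
)

# Prints the system message plus the last few turns of the conversation.
for m in trimmed:
    print(type(m).__name__, "->", m.content)
```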
As we keep asking questions, we keep appending each question and response to the `messages` list. In a future post, I can cover the process of persisting this chat history in a database.
Things covered in this chatbot
- `messages_to_dict`/`messages_from_dict` to serialize/deserialize the conversation
- `trim_messages` to control the size of the history
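The serialization helpers aren't shown in the walkthrough above, but a minimal sketch of how they could fit together looks like this (the `chat_history.json` file name is hypothetical):

```python
import json
from langchain_core.messages import messages_from_dict, messages_to_dict

# Dump the history to plain dicts and save it as JSON.
with open("chat_history.json", "w") as f:
    json.dump(messages_to_dict(messages), f, indent=2)

# Later: load the dicts back and rebuild the message objects to resume the chat.
with open("chat_history.json") as f:
    messages = messages_from_dict(json.load(f))
```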
Running the experiment
- Set up the virtual environment
- Set up your OpenAI API key
- Execute the notebook and play with the prompts
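A minimal setup could look like the following; the exact dependency list is an assumption, so check the repo's requirements file:

```bash
python -m venv .venv
source .venv/bin/activate            # on Windows: .venv\Scripts\activate
pip install langchain-openai python-dotenv
```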
Next steps
- Adding memory storage (e.g., using a vector store)
- Giving Agustrama a more dynamic personality