Simplest Method to Build a Custom Chatbot with GPT-3.5
Using Embedchain to simplify the chatbot building process
Chatbots powered by Large Language Models (LLMs) are reshaping how we interact with technology. Many businesses now use them for customer support. Interestingly, the domain chat.com was sold for over $10M, a sign of how much chat interfaces are valued. Thanks to advances in AI, these chatbots can handle intricate questions and mimic human conversation.
In this article, we'll explore a new tool that makes building such chatbots easier, leveraging the OpenAI GPT-3.5 Turbo model.
The library we’re talking about is Embedchain.
Supports Multiple Data Types
For a chatbot to be effective, it needs sufficient context. However, an LLM's context window is limited, so we can't pass all of our source material at once. To solve this, the source material is split into chunks and converted into 'embeddings' (numeric vectors that capture meaning). When the chatbot answers a query, the most relevant chunks are retrieved via their embeddings and included in the context, so the reply is grounded in accurate information.
In the Embedchain library, we don’t have to do any of this as it is all automatically handled behind the scenes.
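The retrieval step described above can be sketched in a few lines. This is a toy illustration, not Embedchain's internals: the 3-number "embeddings" are hand-made, and a real system would produce them with an embedding model such as OpenAI's text-embedding endpoint.

```python
import math

def cosine_similarity(a, b):
    """Similarity between two embedding vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy store: chunk text -> hand-made embedding (real ones have ~1536 dims)
store = {
    "Elon Musk is the CEO of Tesla.":  [0.9, 0.1, 0.0],
    "SpaceX builds reusable rockets.": [0.1, 0.9, 0.0],
    "Paris is the capital of France.": [0.0, 0.1, 0.9],
}

def most_relevant(query_embedding, k=1):
    """Return the k chunks whose embeddings best match the query."""
    ranked = sorted(
        store,
        key=lambda text: cosine_similarity(store[text], query_embedding),
        reverse=True,
    )
    return ranked[:k]

print(most_relevant([0.8, 0.2, 0.1]))  # closest to the Tesla chunk
```

The key idea is that "relevant" becomes a geometric question: the chunks whose vectors point in nearly the same direction as the query's vector win.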
Embedchain can parse information from the below sources:
Youtube video
PDF file
Web page
Sitemap
Doc file
Code documentation website
Notion (a fancy text/information management app)
So, How Do We Use It?
I recommend using Google Colab to run the below code as it is one of the easiest ways to try out Python code in your browser.
You might be surprised that the actual code needed is no more than 8 lines.
Let’s install the Embedchain library
Run the below command in the Google Colab notebook to install the Embedchain package
!pip install embedchain
Import a few essentials
Let's import the os package, which we will use to store our OpenAI key as an environment variable so that the Embedchain package can access it. We will also import App from the Embedchain package.
import os
from embedchain import App
Set things up
Add our OpenAI key to the environment and create a bot.
os.environ["OPENAI_API_KEY"] = "sk-xxxxSUPERxxxSECRETxxxKEY"
elon_bot = App()
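Hardcoding a secret key in a notebook is easy to leak if you share the file. A safer pattern is to prompt for the key only when it isn't already set; ensure_api_key below is a hypothetical helper sketch, not part of Embedchain.

```python
import os
from getpass import getpass

def ensure_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key from the environment, prompting only if missing."""
    if env_var not in os.environ:
        # getpass hides the key as you type, so it never lands in the notebook
        os.environ[env_var] = getpass(f"Enter {env_var}: ")
    return os.environ[env_var]
```

Call ensure_api_key() once before creating the App; later cells then pick up the key from the environment without it ever appearing in the notebook's text.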
Add our information sources
We need to specify the sources from which our chatbot can pick information to answer user queries. We can do that with the code below.
elon_bot.add("web_page", "https://en.wikipedia.org/wiki/Elon_Musk")
elon_bot.add("youtube_video", "https://www.youtube.com/watch?v=MxZpaJK74Y4")
The above code, on running, should output something similar to this:
Successfully saved https://en.wikipedia.org/wiki/Elon_Musk. New chunks count: 365
Successfully saved https://www.youtube.com/watch?v=MxZpaJK74Y4. New chunks count: 9
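The "chunks count" in that output comes from splitting each source's text into pieces before embedding. Embedchain handles this internally; a minimal fixed-size chunker with overlap (sizes here are illustrative, not Embedchain's defaults) looks like this:

```python
def chunk_text(text, chunk_size=200, overlap=20):
    """Split text into overlapping fixed-size chunks ready for embedding."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping a small overlap
    return chunks

page = "word " * 200          # stand-in for a scraped web page
pieces = chunk_text(page)
print(len(pieces))            # number of chunks that would be saved
```

The overlap makes sure a sentence cut at a chunk boundary still appears whole in at least one chunk, which helps retrieval quality.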
The bot is ready!
We can start querying the bot, and it will reply with information gathered from the above sources. To query the bot, run the code below:
elon_bot.query("How many companies does Elon Musk run?")
The bot will reply with something like this:
Elon Musk currently runs multiple companies. He is the CEO of Tesla, Inc. and also serves as the CEO of SpaceX. Additionally, he is the founder and CEO of Neuralink and The Boring Company. Therefore, he runs at least four companies.
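Conceptually, query() retrieves the most relevant chunks and packs them into the prompt sent to GPT-3.5 Turbo. Here is a toy sketch of that assembly step; build_prompt is illustrative, not Embedchain's actual template.

```python
def build_prompt(question, chunks, max_chars=1000):
    """Pack retrieved chunks into a context block, respecting a size budget."""
    context, used = [], 0
    for chunk in chunks:
        if used + len(chunk) > max_chars:
            break  # stop once the context budget is spent
        context.append(chunk)
        used += len(chunk)
    return (
        "Answer using only the context below.\n\n"
        "Context:\n" + "\n---\n".join(context) +
        f"\n\nQuestion: {question}\nAnswer:"
    )

prompt = build_prompt(
    "How many companies does Elon Musk run?",
    ["Elon Musk is the CEO of Tesla and SpaceX.", "He also founded Neuralink."],
)
print(prompt)
```

The size budget is why good chunking and retrieval matter: only the chunks that fit get seen by the model, so the most relevant ones must come first.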
Conclusion
As constructing chatbots with LLMs becomes increasingly straightforward, there are two distinct aspects that can elevate your bot above the rest. First, there's the art of information curation: ensuring your chatbot is equipped with relevant, concise, and accurate data. This primes it to deliver top-notch responses. Second, and equally crucial, is the chatbot's tone. An engaging, conversational tone can make interactions feel less robotic and more genuine. A lively and interactive tone not only captures users' attention but also fosters a more enjoyable and memorable user experience. It's not just about answering questions; it's about creating a connection.
Note: If you encounter an error with the blinker package while installing embedchain, use the --ignore-installed flag as below to resolve it:
!pip install embedchain --ignore-installed