Demo: Command Bots with GPT-4
Using the Mineflayer API and OpenAI API to create very fun Minecraft bots
GPT-4 is helping us bridge the gap from boring NPCs with canned dialog to truly dynamic video game characters. Minecraft provides the perfect sandbox for testing this out.
My demo shows how you can leverage GPT-4 to command bots to come to the player, harvest wood from trees, place items in a chest, and answer questions about the contents of their inventory.
I achieve this functionality using a combination of the Mineflayer API and the OpenAI API.
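To give a rough idea of how the two APIs can fit together, here is a minimal sketch of the "glue" layer, not the code from my demo. It assumes GPT-4 has been prompted to reply with a small JSON command, and it maps that reply onto bot actions. The command names, the `BotActions` interface, and the functions `parseCommand` and `dispatch` are all hypothetical and made up for illustration. In a real bot, the action methods would wrap Mineflayer calls such as pathfinding, digging, and chest handling.

```typescript
// Hypothetical command vocabulary the LLM is asked to reply with (an assumption,
// not part of Mineflayer or the OpenAI API).
type BotCommand =
  | { action: "come" }
  | { action: "harvest_wood"; count: number }
  | { action: "deposit"; item: string }
  | { action: "report_inventory" };

// Hypothetical wrapper around the bot. A real implementation would call
// Mineflayer inside each method; here each method just returns a status line.
interface BotActions {
  comeToPlayer(player: string): string;
  harvestWood(count: number): string;
  depositInChest(item: string): string;
  reportInventory(): string;
}

// Turn the raw text of an LLM reply into a typed command.
// Returns null when the reply is not valid JSON or not a known command.
function parseCommand(reply: string): BotCommand | null {
  let data: unknown;
  try {
    data = JSON.parse(reply);
  } catch (_err) {
    return null;
  }
  if (typeof data !== "object" || data === null) {
    return null;
  }
  const obj = data as Record<string, unknown>;
  const action = obj["action"];
  switch (action) {
    case "come":
      return { action: "come" };
    case "harvest_wood": {
      // Fall back to a single log if the count is missing or not a positive integer.
      const raw = obj["count"];
      const valid = typeof raw === "number" && raw > 0 && Math.floor(raw) === raw;
      return { action: "harvest_wood", count: valid ? (raw as number) : 1 };
    }
    case "deposit": {
      const item = obj["item"];
      return typeof item === "string" ? { action: "deposit", item: item } : null;
    }
    case "report_inventory":
      return { action: "report_inventory" };
    default:
      return null;
  }
}

// Route a parsed command to the matching bot action and return its status line.
function dispatch(cmd: BotCommand, player: string, bot: BotActions): string {
  switch (cmd.action) {
    case "come":
      return bot.comeToPlayer(player);
    case "harvest_wood":
      return bot.harvestWood(cmd.count);
    case "deposit":
      return bot.depositInChest(cmd.item);
    case "report_inventory":
      return bot.reportInventory();
  }
}
```

With this shape, a reply like `{"action":"harvest_wood","count":3}` parses into a harvest command with a count of 3, while anything that is not a recognized command comes back as `null`, so the bot can simply treat it as ordinary chat instead of acting on it.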
Demo video is linked below, followed by some ramblings about my dreams for the future of gaming.
Demo
NPCs of the Future
Today I want to talk about one aspect of the future of gaming. You may have seen it coming--I'm talking about bots or NPCs (non-player characters) controlled by more realistic AI, specifically Large Language Models or LLMs.
If you don't know what a large language model is, it's the type of AI model that powers tools like OpenAI's ChatGPT and Google's Gemini.
While I wouldn't put these models on the level of a human just yet, they can be very convincing--especially if you don't know what to look for.
For instance, I frequently see social media posts and comments that I know were written by ChatGPT. I know they were written by ChatGPT because I use ChatGPT all the time, and I know how it talks.
It's...my best friend...Just kidding.
I'll give you a hint. If someone comments on your post talking about how "intriguing" and "insightful" your "perspective" is...it's ChatGPT.
Humans don't really talk like that.
So these models can be used for deception, but on the other hand, it can feel good to talk with LLMs. They're polite, sometimes funny, and they never start scrolling on their phone while you're talking to them.
I do think there are going to be constructive uses for these bots, but we need to "keep them in their place", so to speak. We need transparency. We need to know we're talking to an LLM.
Telling Better Stories in Video Games
Video games provide this transparency. When you're playing Skyrim and you're talking to the Jarl of whatever, you know you're talking to an AI and you're perfectly okay with that. In fact, it's storytelling, it's role-playing, and you really want someone who's going to stay in character. You want the AI to play its role really well.
Up until now, all the non-player characters or NPCs that we encounter in video games use canned dialogue and actions, but true story characters cannot be created with a canned script.
We want characters who have their own goals, ambitions, backstories, and we want them to be able to pursue those goals while responding to changes in the environment.
For example, if you decide to go crazy in the game and kill the Jarl's son, in past games you'd be thrown in prison...maybe banned from the town...but mostly the gameplay stays the same. In future games, if you kill the Jarl's son, the entire rest of the game should be about you dealing with the consequences of that action.
Maybe multiple towns band together to hunt you down. Maybe you have a hard time getting supplies because everyone's out to get you. You can probably forget about that heroic quest you were on. Or even better, maybe you have to craft some unexpected redemption story.
I'm talking about actual (virtual) consequences for your virtual actions. I think this has the potential to finally turn open-world games into stories.
This has always been a point of contention in video games. Games like Final Fantasy might have a great story, but you're stuck on one narrow plot line. There's no flexibility. On the other hand, games like Breath of the Wild offer tons of flexibility and exploration, but they lack story elements because the game can't really react dynamically to the player's choices.
In any story, the main character's choices are what make up the story. You watch them make choices and you watch the consequences. The setting of the story is not as important as we think. The characters can be working normal desk jobs at a defunct paper company in Scranton, PA, and the story can still be highly compelling.
It's all about characters' choices and the consequences of those choices.
Finally, with LLMs, game developers will have an opportunity to make choices stick. You as the player will write the story. You will make the choices. All the game developers will have to do is provide a sufficiently dynamic and compelling simulated environment. That will give rise to great stories.
And then some day everyone will stream about their gameplay and it will be like watching TV or a movie because the environment will respond in such interesting ways. It will be like a blending between fiction and reality TV. The story and visuals will be fictional, but the person or people making the choices that form the structure of the story will be real.
Imagine you were watching Lord of the Rings but the person making choices for Frodo was your favorite streamer. The only thing standing between us and that reality is a lack of truly dynamic video game environments...and that is what LLMs can provide.
Do It Yourself
So you might say...great, Bethesda's next Elder Scrolls game will be awesome. But this actually isn't just for the big video game studios. OpenAI has made their state-of-the-art GPTs publicly available. Anyone can use them through an API. We can start creating LLM-powered NPCs today.
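As a rough illustration of how little it takes to get started, here is a sketch of the request body you might build for an in-character NPC. It follows the general shape of OpenAI's chat completions API (a `model` plus a list of `messages` with roles), but the function name `buildNpcRequest`, the persona text, and the prompt wording are my own assumptions for illustration. Actually sending the request, with your API key, is left out.

```typescript
// One turn of conversation in the chat format: who spoke and what they said.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// The JSON body that would be sent to the chat completions endpoint.
interface ChatRequest {
  model: string;
  messages: ChatMessage[];
}

// Hypothetical helper: combine a character persona, the conversation so far,
// and the player's latest line into a single request body.
function buildNpcRequest(
  persona: string,
  history: ChatMessage[],
  playerLine: string
): ChatRequest {
  const system: ChatMessage = {
    role: "system",
    // The system message is where the character is told to stay in role.
    content: "You are a character in a Minecraft world. Stay in character. " + persona,
  };
  const latest: ChatMessage = { role: "user", content: playerLine };
  return {
    model: "gpt-4",
    // Order matters: persona first, then prior turns, then the newest player line.
    messages: [system].concat(history, [latest]),
  };
}

// Example: a grumpy lumberjack who has already greeted the player once.
const request = buildNpcRequest(
  "You are a grumpy lumberjack who only cares about oak trees.",
  [
    { role: "user", content: "Hello there!" },
    { role: "assistant", content: "Hmph. Mind the saplings." },
  ],
  "Can you bring me some wood?"
);
```

Keeping the persona in the system message and replaying earlier turns on every request is one simple way to get a character with a consistent voice and a short-term memory of the conversation.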
In my opinion, Minecraft is hands-down the best game for this. It's already a sandbox game, meaning everything in the entire Minecraft world is dynamic. You can explore wherever you want, build whatever you want, and destroy whatever you want.
I anticipate that games like Breath of the Wild will continue to move in this direction, but we'll have to wait. Right now our best simulation environment is Minecraft.
I hope my demo gives you an idea of how powerful video games and simulations are going to become. If you enjoy the video, let me know here on Substack or in the YouTube comments directly. It helps me figure out what to focus on in future videos.