Solve the GPT Branching Problem with an Artifact Pipeline

Agentic AI Edition #6

Jan 08, 2024

A vibrant, pastel-colored digital linework art depicts a large, stylized tree with branches representing different aspects of a meal planning app development. Around the tree are floating artifacts like lists and diagrams, connected by delicate lines, with a thoughtful male figure at the base, symbolizing the decision-making process in app development. — “The Branching Problem”, created by the author using ChatGPT

“Create an app to help me meal plan for the week.”

Imagine if you could ask ChatGPT to implement a full app and provide it as a download. As we discussed last week, the technology isn’t powerful enough for that yet, but AI can already automate parts of the workflow, like requirements definition, task planning, and even coding.

When provided with the correct instructions, ChatGPT (or custom GPTs) can help break down the problem recursively. As we divide the problem into sub-problems, we have to keep track of all of the different “branches” that we generate. Branches split into more branches, creating a complex tree structure. I call this The Branching Problem.

We’ll explore two methods for using GPT-based tools to create well-defined output artifacts as part of an app design pipeline. At the end, I’ll explain how this can help overcome the Branching Problem.

GPT Pilot agents
OpenAI's new custom GPTs

The Agent Approach: GPT Pilot

The GPT Pilot project implements a proof-of-concept software development pipeline by creating multiple AI agents, each with a defined role and output.

This image displays a flowchart titled “GPT Pilot Workflow,” outlining a software development process with sticky note icons representing different roles: User writes the description, Product Owner Agent asks for clarifications and breaks down user requirements, Architect Agent breaks down tech requirements, DevOps Agent sets up the environment, Tech Lead Agent breaks down the app into development tasks, and Developer Agent writes the code. — Screenshot from the GPT Pilot Github page (linked above)

They have a role for Product Owner, responsible for creating user stories, an Architect role that determines the technical requirements, a Tech Lead that splits the work into development tasks, and a Developer role that writes the code.

They use the OpenAI API to create one or more agents in each of these roles. Think of an “agent” as a single AI instance. The user is guided through a workflow in which they collaborate with each of the agents to develop a web application.

First, the Product Owner agent asks the user clarifying questions about their app idea. Then the Architect agent suggests a particular set of software frameworks to use. The Tech Lead agent creates a queue of dev tasks that Developer agents then work on, one task at a time.

This strategy seems like a step in the right direction. AI agents can’t do any step in the workflow entirely on their own, but they can speed up the development process considerably by guiding the user through a pre-defined process: performing part of the work, asking the user to provide more info and, eventually, asking them to approve or reject the artifact. If an agent receives a rejection from the user, it continues to iterate. If the agent receives an approval, we can continue to the next step in the development pipeline.
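The approve-or-reject loop described above can be sketched in a few lines of Python. This is a minimal illustration, not GPT Pilot’s actual code: `run_stage`, `draft_fn`, and `review_fn` are hypothetical names, and both the agent and the user are replaced by stub functions so the example runs on its own.

```python
from typing import Callable, Optional, Tuple


def run_stage(
    draft_fn: Callable[[Optional[str], Optional[str]], str],
    review_fn: Callable[[str], Tuple[bool, Optional[str]]],
    max_rounds: int = 5,
) -> str:
    """Iterate on an artifact until the reviewer approves it.

    draft_fn(previous_artifact, feedback) stands in for an agent call.
    review_fn(artifact) stands in for the user: returns (approved, feedback).
    """
    artifact, feedback = None, None
    for _ in range(max_rounds):
        artifact = draft_fn(artifact, feedback)
        approved, feedback = review_fn(artifact)
        if approved:
            return artifact  # hand off to the next pipeline step
    raise RuntimeError("artifact was not approved within max_rounds")


# Stub agent (assumption): appends the user's feedback to the previous draft.
def stub_agent(previous, feedback):
    if previous is None:
        return "- As a user, I want to plan meals for the week."
    return previous + "\n- " + feedback


# Stub user (assumption): rejects once, then approves.
def stub_user(artifact):
    if "grocery" not in artifact:
        return False, "As a user, I want to generate a grocery list."
    return True, None


stories = run_stage(stub_agent, stub_user)
print(stories)
```

The `max_rounds` cap is a design choice for the sketch: without it, a stage that never satisfies the reviewer would loop forever.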

In practice, I’ve found that GPT Pilot is not yet usable. I spent a few minutes using it to set up a React.js app with a Python Flask backend, something that ChatGPT can easily guide a user to accomplish.

The GPT Pilot agents consistently lost track of what directory they were running in. Overall the test cost me about $1.50 in OpenAI API usage. Not much, but it failed early and often enough that I decided it’s not yet competitive with ChatGPT. We’ll give them some time to fix bugs and make the tool more flexible.

The Custom GPT Approach

In the meantime, we can implement a similar strategy — breaking down a task into steps — but rather than using API-based agents to perform each step, we can use custom GPTs.

With a pipeline of custom GPTs, we’re not charged for usage — just a flat fee of $20/month for the ChatGPT Plus plan.

One downside is that Custom GPTs have a usage limit of 50 requests per 3 hours per user.

3 hours (180 minutes) divided by 50 requests equals 3.6 minutes, or roughly one request every 4 minutes.
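The arithmetic is easy to check. The snippet below only restates the figures quoted above; the variable names are my own.

```python
LIMIT_REQUESTS = 50      # requests allowed per window (figure quoted above)
WINDOW_MINUTES = 3 * 60  # the 3-hour window, in minutes

minutes_per_request = WINDOW_MINUTES / LIMIT_REQUESTS
print(f"{minutes_per_request:.1f} minutes per request")  # 3.6 minutes per request
```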

I’ve never personally hit that limit, and I think it’s unlikely as long as you’re not frequently wasting requests with vague or inaccurate prompts. I can integrate ChatGPT-generated code into my codebase in less than 4 minutes, but the bottleneck is figuring out what to ask ChatGPT next. That typically takes longer than 4 minutes.

Humans don’t think very fast…and we need lots of coffee breaks.

Example: Requirements Gathering GPT

The first step in creating a meal planning app is defining the actual requirements for what the application is going to do. There are a number of different ways to define requirements, but the Agile Development Methodology recommends defining requirements from the perspective of the end user. We call these user stories.

Here are some examples of user stories for the meal planner app idea:

As a user, I want to create an account to access and personalize my experience.
As a user, I want to input a list of ingredients and receive meal recipe suggestions.
As a user, I want to add meals to a customizable weekly meal plan.
As a user, I want to adjust the number of servings for each meal in the plan.
As a user, I want to generate a grocery list based on my weekly meal plan.
As a user, I want to view a recipe with its title, description, ingredients, macronutrient stats, and total price.
As a user, I want to edit ingredient quantities and prices in a recipe.
As a user, I want to set dietary restrictions in my profile settings.
As a user, I want to save my favorite recipes and meal plans for future use.

Note: Technically these should also include why the user wants to do these actions, but I left that out for simplicity.

I created a custom GPT for requirements gathering using the following instructions:

You are an experienced software project manager who manages the entire process of creating software applications for clients from the client specifications to the development. You are talking to a client who wants your team to develop an application for them.

// 1. DO be concise and to-the-point.
// 2. DO ensure every word you say has a very specific purpose.
// 3. DO NOT repeat yourself.
// 4. DO NOT use pleasantries and formalities like "good morning" and "hello".
// 5. DO focus on listening to the client.

GOAL: Gather requirements and create a collection of user stories for the application that will be provided to a software architect for system design.

THINK STEP BY STEP to perform the following steps:
1. ASK the client for a description of their app.
2. ITERATE through the following steps in a loop:
    * STEP 1: SUMMARIZE the app requirements so far.
    * STEP 2: ASK the client to either answer 3 clarifying questions about the app OR approve the requirements summary as-is.
    * STEP 3: IF client has chosen to answer the clarifying questions rather than approving, LOOP to SUMMARIZE and ASK questions again.
3. Once the requirements are approved, WRITE user stories into a file in MARKDOWN FORMAT. The file should contain bullet items ONLY.
4. PROVIDE that file as a DOWNLOAD.

EXAMPLE USER STORIES for a to-do list app:
- As a user, I want to press a button to create a new to-do list.
- As a user, I want to enter a new todo list item using a text input.
- As a user, I want to delete any list item at any point.
- ...
  
Conduct the conversation as though you are talking to the client.

1. Provide a persona

You are an experienced software project manager

In its default role of “helpful assistant”, ChatGPT gives pretty boring advice. Avoid this by giving the custom GPT a specific persona.

2. Define the Tone

// 1. DO be concise and to-the-point.
// 2. DO ensure every word you say has a very specific purpose.
// 3. DO NOT repeat yourself.
// 4. DO NOT use pleasantries and formalities like "good morning" and "hello".
// 5. DO focus on listening to the client.

I based this particular prompt format on the actual ChatGPT custom instructions, which were accessible via a hack a few weeks ago. In my experience, the combination of “//” characters, list items, and capital letters helps the model pay closer attention to the directives.

3. Provide the Goal

GOAL: Gather requirements and create a collection of user stories for the application that will be provided to a software architect for system design.

Pretty simple — just tell the model what you’re trying to do.

4. Provide the Method

THINK STEP BY STEP to perform the following steps:
1. ASK the client for a description of their app.
2. ITERATE through the following steps in a loop:
    * STEP 1: SUMMARIZE the app requirements so far.
    * STEP 2: ASK the client to either answer 3 clarifying questions about the app OR approve the requirements summary as-is.
    * STEP 3: IF client has chosen to answer the clarifying questions rather than approving, LOOP to SUMMARIZE and ASK questions again.
3. Once the requirements are approved, WRITE user stories into a file in MARKDOWN FORMAT. The file should contain bullet items ONLY.
4. PROVIDE that file as a DOWNLOAD.

As OpenAI’s Andrej Karpathy said, “The hottest new programming language is English”.

The problem is that English is not great for giving detailed and precise instructions. That’s why we invented programming languages in the first place!

In this example, I devised my own pseudocode to tell the model how to generate user stories. I was fascinated that the model was able to follow the loop that I gave it. This method opens up a lot of prompting possibilities.
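For comparison, here is roughly the control flow that the prompt describes, written as Python. It’s only a sketch to show the structure of the loop: `gather_requirements`, `ask_client`, `summarize`, and `write_stories` are hypothetical stand-ins for conversation turns and model behavior, and a custom GPT does not literally execute code like this.

```python
def gather_requirements(ask_client, summarize, write_stories):
    """Mirror of the prompt's loop: summarize, ask, repeat until approved."""
    notes = [ask_client("Describe your app.")]            # step 1: ASK
    while True:                                            # step 2: ITERATE
        summary = summarize(notes)                         # STEP 1: SUMMARIZE
        reply = ask_client(                                # STEP 2: ASK
            summary + "\nAnswer 3 clarifying questions or reply APPROVE."
        )
        if reply.strip().upper() == "APPROVE":
            break
        notes.append(reply)                                # STEP 3: LOOP
    return write_stories(notes)                            # steps 3 and 4: WRITE, PROVIDE


# Scripted client replies (assumption) so the sketch runs without a model.
replies = iter([
    "A weekly meal planner.",
    "It should build a grocery list.",
    "approve",
])
stories = gather_requirements(
    ask_client=lambda prompt: next(replies),
    summarize=lambda notes: "Summary: " + " ".join(notes),
    write_stories=lambda notes: "\n".join(f"- {n}" for n in notes),
)
print(stories)
```

Writing it out this way makes the exit condition explicit: the only way out of the loop is the client’s approval.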

5. Give Examples

EXAMPLE USER STORIES for a to-do list app:
- As a user, I want to press a button to create a new to-do list.
- As a user, I want to enter a new todo list item using a text input.
- As a user, I want to delete any list item at any point.

If you want to make your GPT consistently output the same format, provide examples! This is the best way to tell it exactly what type of output you’re expecting.

6. Adopt the Persona

Conduct the conversation as though you are talking to the client.

The GPT should jump right into the persona, so I end the instructions with this line about talking directly to the client.

Artifacts Solve the Branching Problem

The Branching Problem

Freeform conversation tends to have a branching, tree-like structure, where each turn in the conversation opens up several different possibilities for where the conversation can go. Usually we start with the big picture and then drill down to the details. Unfortunately, we can only go down one path at a time, and it can be difficult to backtrack and explore the other paths, especially after multiple layers of branching.

For example, I must decide how to begin elaborating user stories for my meal planning app. I can choose to add user stories for user account management, building ingredient lists, generating meal ideas, or generating the grocery list, but I can only explore one feature set at a time. Eventually I will need to backtrack and define the others. I’ll need to traverse the entire tree of functionality.
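One way to picture the backtracking: keep a stack of unexplored branches and pop the next one whenever the current path is exhausted. The sketch below is illustrative only; the feature tree and the `traverse` helper are made up for this example.

```python
# Hypothetical feature tree for the meal planning app.
FEATURES = {
    "meal planner": ["accounts", "ingredient lists", "meal ideas", "grocery list"],
    "accounts": ["sign up", "dietary restrictions"],
    "meal ideas": ["recipe suggestions"],
}


def traverse(root, tree):
    """Depth-first walk: explore one branch at a time, remembering the rest."""
    visited = []
    pending = [root]  # branches we still owe a visit
    while pending:
        topic = pending.pop()
        visited.append(topic)
        # Push children in reverse so the first child is explored first.
        pending.extend(reversed(tree.get(topic, [])))
    return visited


order = traverse("meal planner", FEATURES)
print(order)
```

The `pending` list is exactly the thing that is hard to hold in your head during a freeform chat.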

Artifacts as Shared Memory

We can handle branching by building an artifact.

Since I’m using the GPT to build a specific artifact, a list of user stories, it becomes easier to “go back” to a previous branch that I missed. I can always look through the list for any topics that require more elaboration and then ask the GPT to add more user stories. In this way, the artifact serves as my external memory.

What’s more, this method also forces the GPT to keep the latest version of the artifact fresh in its context, since the artifact is updated and repeated at each iteration. Once I’m done detailing a particular feature, like meal generation, the GPT outputs the entire updated user story list. I can continue immediately with a different topic, like grocery list generation, without needing to re-explain the whole app to the GPT.

The artifact serves as external shared memory for both the user and the custom GPT.
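Here is a minimal sketch of the idea, assuming the artifact is a plain list of user-story strings. The `StoryList` class is hypothetical; in a real session the artifact is simply the Markdown list that the GPT reprints on each turn.

```python
class StoryList:
    """A flat user-story artifact that is re-rendered in full after every change."""

    def __init__(self):
        self.stories = []

    def add(self, story):
        if story not in self.stories:  # avoid duplicates across iterations
            self.stories.append(story)
        return self.render()           # always return the whole artifact

    def render(self):
        return "\n".join(f"- {s}" for s in self.stories)


artifact = StoryList()
artifact.add("As a user, I want to receive meal recipe suggestions.")
# Later, switch topics without re-explaining the app: the artifact carries the context.
latest = artifact.add("As a user, I want to generate a grocery list.")
print(latest)
```

Returning the full list from every `add` call mirrors the GPT repeating the entire updated artifact at each iteration.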

Multi-Level Artifacts

When defining requirements, I’m using a flat list as my output artifact, but you can also mimic a tree structure in your artifact by using multi-level lists. Use bullets for unordered components and numbered lists for sequences.

For example, if I’m outlining the software modules for an application, it may be necessary to split the design into general modules and then have several sub-modules for each high-level module. For instance, I may want a “meal plan” module with sub-modules for “editing the meal plan” and “generating the grocery list”. This example is a bit contrived, but multi-level designs can be useful for more complex projects.
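As a rough sketch, a nested dictionary can stand in for the module tree and be rendered as an indented bullet list. The `render` function and the dictionary layout are assumptions for illustration; the module names come from the example above.

```python
def render(node, depth=0):
    """Render a nested {name: children} dict as an indented Markdown bullet list."""
    lines = []
    for name, children in node.items():
        lines.append("  " * depth + f"- {name}")
        lines.extend(render(children, depth + 1))
    return lines


# Hypothetical module outline, using the example from the paragraph above.
modules = {
    "meal plan": {
        "editing the meal plan": {},
        "generating the grocery list": {},
    },
}
outline = "\n".join(render(modules))
print(outline)
```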

Artifacts Feed the Pipeline

Artifacts provide shared memory between you and the model, but they can also be used to connect different parts of a pipeline. User stories must be defined before determining technical requirements. Technical requirements must be outlined before designing user interfaces or writing code.
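The dependency chain can be sketched as a list of stages, each consuming the previous stage’s artifact. Everything here is a placeholder: `run_pipeline` is a hypothetical helper, and each lambda stands in for a separate custom GPT session whose output you would paste into the next one.

```python
def run_pipeline(stages, initial_input):
    """Feed each stage's output artifact into the next stage, keeping all of them."""
    artifacts = {}
    current = initial_input
    for name, stage in stages:
        current = stage(current)
        artifacts[name] = current
    return artifacts


# Placeholder stages (assumptions); real stages would be custom GPT conversations.
stages = [
    ("user_stories", lambda idea: [f"As a user, I want to {idea}."]),
    ("tech_requirements", lambda stories: [f"Supports: {s}" for s in stories]),
    ("dev_tasks", lambda reqs: [f"Implement -> {r}" for r in reqs]),
]
artifacts = run_pipeline(stages, "plan meals for the week")
print(artifacts["dev_tasks"][0])
```

Keeping every intermediate artifact, rather than only the last one, makes it possible to go back and revise an earlier stage without starting over.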

In summary, don’t just talk to ChatGPT. Create custom GPTs that build specific artifacts. Chain the custom GPTs together to form a pipeline.

Someday AI will be able to perform the entire pipeline on its own. You’ll ask ChatGPT for an app, and it will read your mind with Neuralink and give you a zip file to download.

In the meantime, use AI to automate processes and perform the tedious parts of your workflow. That’s still what computers do best.
