LLM Conversation Branching
The interactive visuals in this post don’t work well on small devices. You can still enjoy it on your phone, but be sure to revisit the visuals later on a bigger screen.
I wrote about Conversation Branching on my main blog and I’ve been prototyping an LLM chat app called Delta that has first-class conversation branching. LLM conversations typically follow a linear path, making it unintuitive to explore alternative directions or recover from miscommunications. This post is an interactive introduction to conversation branching but also an exploration into additional UX patterns that using a canvas can enable.
Here are the basic primitives of conversation branching. Click any node on the left. The conversation on the right updates to show the prompts and responses (since we’re modeling an LLM conversation).
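These primitives can be sketched as a simple tree: each node holds one prompt/response turn, a new message attached to any node creates a branch, and clicking a node replays the path back to the root as a linear conversation. A minimal sketch; the names and structure here are illustrative, not Delta’s actual code:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One conversation turn: a prompt and the model's response."""
    prompt: str
    response: str
    children: list["Node"] = field(default_factory=list)
    parent: "Node | None" = None

def branch(parent: Node, prompt: str, response: str) -> Node:
    """Attach a new turn as a child of `parent`, creating a branch."""
    child = Node(prompt, response, parent=parent)
    parent.children.append(child)
    return child

def thread(node: Node) -> list[Node]:
    """The linear conversation shown on the right: walk from the
    clicked node back to the root, then reverse."""
    path = []
    while node is not None:
        path.append(node)
        node = node.parent
    return list(reversed(path))
```

Any node can have multiple children, which is all "branching" means structurally; the linear view is just the root-to-node path.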
The motivation behind conversation branching is manifold:
Keep the context clean
Language models can be sensitive to typos or ambiguous phrases that send the conversation in the wrong direction. Often, if you notice the model is misunderstanding you, it’s more effective to delete the previous conversation turn or update your message. ChatGPT actually provides subtle support for branching with the message “edit” feature.
Explore parallel threads, maintaining conversation history
While message editing and implicit branching help keep the context clean, making branches visible and navigable adds additional depth. We can now explore multiple lines of thinking from a shared starting premise while maintaining the integrity of the context, keeping the LLM focused on the thing you care about in each branch.
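One way to picture why the context stays clean: when a request is sent for a given branch, only the turns on the path from the root to the active node are included. A rough sketch, assuming a simple dict-based node shape (hypothetical, not Delta’s implementation):

```python
def context_for(node: dict) -> list[dict]:
    """Build the message list for one branch: only turns on the path
    from the root to `node` are sent, so sibling branches never leak
    into each other's context."""
    path = []
    while node is not None:
        path.append(node)
        node = node.get("parent")
    messages = []
    for turn in reversed(path):
        messages.append({"role": "user", "content": turn["prompt"]})
        messages.append({"role": "assistant", "content": turn["response"]})
    return messages
```

However deep or wide the tree grows, each branch's request is just its own root-to-node slice.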
We’re now exploring multiple lines of thought branching off of the original question “What is machine learning?”. As above, we could navigate any of these branches individually and linearly by clicking on a node, and branch off any node with a new message. However, the canvas has a ton of content, making it difficult to navigate or get an at-a-glance sense of what each node contains.
LLMs can help us here. We can summarize each node based on the prompt and response, then show summaries when zoomed out to make the content more digestible at a high level.
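A sketch of what that could look like: one helper builds the summarization request for a node, another picks which text to render at a given zoom level. The prompt wording and the zoom threshold are assumptions, not what the post’s visualizations actually use:

```python
def summary_prompt(prompt: str, response: str) -> list[dict]:
    """Build a chat request (hypothetical wording) asking the model
    to compress one conversation turn into a short label."""
    return [
        {"role": "system", "content": "Summarize this exchange in at most 8 words."},
        {"role": "user", "content": f"Prompt: {prompt}\n\nResponse: {response}"},
    ]

def node_label(summary: str, full_text: str, zoom: float,
               threshold: float = 0.6) -> str:
    """Render the summary when zoomed out; swap in the full content
    once the canvas zoom passes the threshold."""
    return full_text if zoom >= threshold else summary
```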
Try zooming in and out on either of the last two visualizations to see how the content adapts based on zoom level.
Semantic clustering and exploration
With concise summaries, we can now add a new perspective to these conversations. If we generate embeddings of the summaries, which gives us a list of floats for each summary, we can then apply Principal Component Analysis to reduce the many dimensions down to just two. Finally, we use those two floats as coordinates to position each of the conversation nodes, clustering them by their semantic similarity.
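A minimal sketch of that projection step, using an SVD-based PCA so similar summaries land near each other on the canvas. The toy vectors below stand in for real embedding output (which would have hundreds or thousands of dimensions, not three):

```python
import numpy as np

def pca_2d(embeddings: np.ndarray) -> np.ndarray:
    """Project n x d embedding vectors onto their first two
    principal components via SVD of the centered data."""
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are the principal axes, ordered by variance explained.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

# Toy stand-ins for summary embeddings: two rough "topics".
embeddings = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.0],
    [0.0, 0.8, 0.2],
])
coords = pca_2d(embeddings)  # one (x, y) position per conversation node
```

Semantically similar nodes end up with nearby coordinates, which is what produces the clusters below.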
Here are the results:
By my assessment, there are three main categories:
- ML and programming languages/development (bottom left)
- ML and medicine/society (top left)
- PyTorch (top right)
Equidistant from the three clusters is the more generic start message.
I always find it interesting to explore latent space like this. Doing so with a conversation is a bit like blazing a trail and identifying new terrain, seeing where similar pieces fit together and finding unexplored areas in the negative spaces.