Amazon Bedrock On-Demand Throughput Error
I was working with Amazon Bedrock to run LLM inference. AWS has its fair share of complexity -- VPCs, subnets, security groups, etc.
I was reading your blog and had a question about this: "I noticed that my coworker was prompting for specific technical implementations, and Claude was struggling, pulling in too much context and taking an unfocused approach, whereas I would have been much more vague and general to start and...
If you've read any of my writing in the past year, you're probably aware I've heavily adopted agents to build much of the software I write now. What I've done less of is write about the strategies I've used to do this.
Who is finding LLMs useful and who is not? And why is this the case?
Lots of language model providers implement the OpenAI API spec. These look similar in shape but often behave differently in subtle ways. Anthropic's prefill sequences are one such example.
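As a sketch of the prefill idea (illustrative only; the model id is hypothetical and no network call is made): Anthropic's Messages API lets the final message be a partial assistant turn, and the model continues from that prefix, which the OpenAI chat format has no direct equivalent for.

```python
# Sketch: an Anthropic-style "prefill" request body (illustrative only;
# no API call is made). The final assistant message is a partial
# response the model is forced to continue from.
payload = {
    "model": "claude-sonnet-4",  # hypothetical model id
    "max_tokens": 256,
    "messages": [
        {"role": "user", "content": "List three prime numbers as JSON."},
        # Prefill: the model's reply must start with this prefix.
        {"role": "assistant", "content": '{"primes": ['},
    ],
}

# The API returns only the continuation, so the caller reconstructs
# the full text as prefix + completion.
prefill = payload["messages"][-1]["content"]
completion = "2, 3, 5]}"  # stand-in for the model's continuation
full_response = prefill + completion
print(full_response)  # {"primes": [2, 3, 5]}
```

This is one of the subtle behavioral differences: code written against an OpenAI-compatible endpoint may silently drop or mishandle a trailing assistant message instead of treating it as a prefill.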
Today, I ran into an issue where I wanted to use repomix to pack a large codebase into a single file to pass to an LLM, but I couldn't paste the output into any of the UIs I typically use. The React apps all became sluggish as I waited for ~500,000 tokens to paste.
These days I use agents that write code often. When I am trying to build a new feature, I first write a markdown spec, then point the agent at it and send it on its way.
RSS feeds for blogs and things you write or create are great. If you read a lot, you probably also accumulate articles you share with others and occasionally revisit.
A few concepts for LLM chat UIs
To write this post, I was going to take myself through some of the history of different chat interfaces. This is not that post. I was too impatient and decided to go in without any appreciation for prior art (beyond what I'm already aware of), because it seemed more fun at the time.
You need to use models to build software to really understand their limits
I'm on a flight and wanted to write code to work on an idea. After a few moments of shifting mental gears, I popped open Zed, which allows me to code with a local LLM using ollama. My default impulse when writing code is to prompt a model. At first, I felt somewhat negative about this but with...