How can I add videos to Google Gemini as context (is this even what their newest model is called anymore) and why is it so hard to figure it out? https://gemini.google.com only lets me upload...
Research and experimentation with models present different problems than I am used to dealing with on a daily basis. The structure of what you want to try out changes often, so I understand why some...
I spent some more time experimenting with thought partnership with language models. I've previously experimented with this idea when building write-partner. Referring back to this work, the prompts...
While I didn't have much success getting gpt-4o to perform Task 1 - Counting Line Intersection from the Vision Language Models Are Blind paper, I pulled down some code and did a bit of testing with...
We probably are living in a simulation and we’re probably about to create the next one. Martin Casado
VLMs are Blind showed a number of interesting cases where vision language models fail to solve problems that humans can easily solve. I spent some time trying to build examples with additional...
Kent wrote this post on how to engage an audience by switching the first and second slide of a presentation. The audience focuses more as they try to fill in the gaps of what you've introduced them...
I've been chatting with qwen2, a model from Alibaba. I mostly chatted with it in English but it appears to support several other languages and I noticed a bit of Chinese leaking through even though I...
I was inspired by Daniel's post to add sidenotes to this blog. I used claude-3.5-sonnet to generate the CSS and HTML shortcode to do this. I was impressed by how well it turned out[^1]. It was almost...
A nice read by Stuart on Python development tools. This introduced me to the pyproject.toml configuration file, which is more comprehensive than a requirements file. It's something I'll need to...