AI tutorial: I built a browser extension in a few hours - you can, too
Hi readers,
I built a browser extension for Chrome in a few hours using Claude - and you can, too! Customizing your browsing experience gives you more control, is fun, and is more secure because you know exactly what third-party code is invoked. Going through this exercise will help you build intuition for what AI can do for you, and you’ll learn a new tool.
To get started, think of what you find yourself wishing for again and again when you’re online. I often find myself wanting to read articles, blog posts, etc. that are similar to what I’m enjoying and to remember what I read. I like that Substack highlights long-form writing but I’m open to other sites, too. In other words, I want to discover relevant content and remember it.
Next, set up an IDE and an AI chatbot. Because I use Gemini often, I chose Claude so I could learn a different tool. I used Visual Studio Code (free) as my IDE and created a repository on GitHub so I could test Copilot in both applications.
Sketch of what I wanted the extension to look like:
Final experience (GIF may not render in email/mobile, read on desktop instead):
Step 1: Prompt AI chatbot
The first prompt I gave to Claude using Google’s recommended task/persona/format/context prompting framework:
I created a local folder on my Mac, copied in the HTML and JS code Claude generated, and opened the folder in Visual Studio Code. I knew I was going to add features for recommendations but wanted to see what the first version looked like. In Chrome, click on Settings, Extensions, Manage extensions, turn on Developer mode, and click Load unpacked. Point to the local folder.
With little editing (I changed the font color), this is what I was working with:
The latest version of my extension looks and functions very differently, but releasing this first version gave me intuition for what collaborating with AI is like. Before writing the extension, I had not debugged with AI. I’d used it to check my writing, find recommendations, and translate across languages. Collaborating on code showed me that it can help me solve well-defined problems better, because I was still the one driving the overall development.
Step 2: Edit the HTML to show all of the features you want
I added 3 buttons: one to persist notes across browsing sessions, one to download notes to a folder dedicated to reading, and one to ask Gemini for a recommendation. I wrote most of the code but again asked Claude for help with the exact syntax. For example, to give enough space between buttons, I needed to set padding properties:
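For the first of those buttons, here is a minimal sketch of persisting a note across browsing sessions. The post does not say how the notes are stored, so this assumes `chrome.storage.local`, which needs the "storage" permission in manifest.json. The key name `readingNotes` and the element ids in the comments are hypothetical.

```javascript
// Hypothetical storage key; any string works.
const NOTES_KEY = "readingNotes";

// Save the note text. `storageArea` is expected to behave like
// chrome.storage.local (an assumption; the post does not name the storage used).
async function saveNote(storageArea, text) {
  await storageArea.set({ [NOTES_KEY]: text });
}

// Load the saved note, or an empty string if nothing has been saved yet.
async function loadNote(storageArea) {
  const stored = await storageArea.get(NOTES_KEY);
  return stored[NOTES_KEY] ?? "";
}

// In the popup script, the wiring could look like this (ids are hypothetical):
// document.getElementById("save").addEventListener("click", () =>
//   saveNote(chrome.storage.local, document.getElementById("note").value));
```

Passing the storage area in as a parameter keeps the two functions easy to try outside the browser with a stand-in object.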
Step 3: Create an API key in Google AI Studio
I wanted Gemini to give me a recommendation based on the contents of what I was reading. Google AI Studio makes it really easy to send requests to Gemini through its API for free. Click the button that says Get API key:
I received an error when I first loaded the updated code, but I debugged it successfully using Thunder Client in VS Code (detailed at the end of this post).
Step 4: Test the prompt in Gemini and Visual Studio Code
I wanted to know more about how the recommended sites were similar, so after a few tests I landed on this prompt: “list 3 websites with content similar to ${domain} and provide a description less than 20 words of each website and explain why the site is similar in less than 20 words”. I tested the prompt on gemini.google.com, but you can do that in Visual Studio Code, too.
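To make the prompt concrete, here is a rough sketch of how it could be filled in and sent to Gemini's REST API from the extension. The function names are hypothetical, and the model name is a placeholder assumption: check Google AI Studio for the models your key can use. This is one way to do it, not necessarily what my extension does line for line.

```javascript
// Placeholder model name (an assumption); swap in a model your key supports.
const MODEL = "gemini-1.5-flash";
const ENDPOINT = "https://generativelanguage.googleapis.com/v1beta/models";

// Fill the prompt from the post with the site currently being read.
function buildPrompt(domain) {
  return `list 3 websites with content similar to ${domain} and provide a description less than 20 words of each website and explain why the site is similar in less than 20 words`;
}

// Build the URL and fetch options for a generateContent request.
function buildGeminiRequest(apiKey, prompt) {
  return {
    url: `${ENDPOINT}/${MODEL}:generateContent?key=${encodeURIComponent(apiKey)}`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ contents: [{ parts: [{ text: prompt }] }] }),
    },
  };
}

// Send the request and return the text of the first candidate, if any.
async function getRecommendations(apiKey, domain) {
  const { url, options } = buildGeminiRequest(apiKey, buildPrompt(domain));
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Gemini request failed: HTTP ${response.status}`);
  }
  const data = await response.json();
  return data.candidates?.[0]?.content?.parts?.[0]?.text ?? "";
}
```

Keeping the request building separate from the `fetch` call makes it easy to copy the same URL and body into a tool like Thunder Client when something goes wrong.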
Step 5: Add code to use Chrome’s downloads API to download your note
True to the name, I want to save notes so I can recall factoids and other things I read. So I asked Claude to help me prompt the user for a download location. The Chrome browser requires that the user give permission.
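As a hedged sketch of this step: `chrome.downloads.download` accepts a `saveAs` option that asks Chrome to show a Save As dialog, and the extension needs the "downloads" permission in manifest.json. The helper names and the default file name below are hypothetical, and passing the note as a `data:` URL is just one simple way to hand text to the downloads API.

```javascript
// Turn note text into a data: URL that the downloads API can fetch.
function noteToDataUrl(text) {
  return "data:text/plain;charset=utf-8," + encodeURIComponent(text);
}

// Build the options object; saveAs: true asks Chrome to show a Save As dialog
// so the user chooses where the file goes. The default file name is hypothetical.
function buildDownloadOptions(text, filename = "reading-notes.txt") {
  return { url: noteToDataUrl(text), filename, saveAs: true };
}

// Call this from the extension, where the chrome.downloads API is available.
function downloadNote(text) {
  return chrome.downloads.download(buildDownloadOptions(text));
}
```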
Errors I ran into and you might, too
Error: Manifest is not valid JSON
Manifest files are meant to be simple: they hold important metadata and permissions, e.g. for access to the browser’s information. But in changing the Name field, I introduced a typo. This is a common mistake, and Claude rewrote the manifest file without my asking:
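If you hit the same error, a quick check like the one below can confirm whether the file parses as JSON before you reload the extension. The two manifest strings are hypothetical examples, and the trailing comma is only one kind of typo that triggers this error, not necessarily the one I made.

```javascript
// Return null if the text parses as JSON, otherwise the parser's error message.
function manifestJsonError(text) {
  try {
    JSON.parse(text);
    return null;
  } catch (err) {
    return err.message;
  }
}

// Hypothetical manifest with a trailing comma after the name field.
const broken = '{ "manifest_version": 3, "name": "Reading Notes", }';
// The same hypothetical manifest with the comma problem fixed.
const fixed = '{ "manifest_version": 3, "name": "Reading Notes", "version": "1.0" }';

console.log(manifestJsonError(broken)); // prints the parser's error message
console.log(manifestJsonError(fixed)); // prints null
```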
Error: Error fetching recommendations
I couldn’t make requests to Gemini’s API, and Claude suggested I install the Thunder Client extension in VS Code. Thunder Client helps you test the connection with different keys (click into the Query tab) and inspect responses. I tested my API key with different request prompts and finally narrowed the issue down to missing permissions in the manifest.json file.
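A small check like this can list what a manifest is missing. Treat the required entries as a guess: they are assumptions based on the features described in this post (storage for notes, downloads for saving, and host access to the Gemini API), not a record of which permission was actually missing. The function name is hypothetical.

```javascript
// Assumed requirements for the features in this post; adjust to your extension.
const REQUIRED = {
  permissions: ["storage", "downloads"],
  host_permissions: ["https://generativelanguage.googleapis.com/*"],
};

// List required entries that the parsed manifest does not declare.
// This is an exact string comparison, so a broader pattern such as
// "<all_urls>" would still be reported as missing.
function findMissing(manifest, required = REQUIRED) {
  const missing = [];
  for (const [field, values] of Object.entries(required)) {
    const declared = manifest[field] ?? [];
    for (const value of values) {
      if (!declared.includes(value)) missing.push(`${field}: ${value}`);
    }
  }
  return missing;
}

// Hypothetical manifest that only declares the storage permission.
const partialManifest = {
  manifest_version: 3,
  name: "Reading Notes",
  version: "1.0",
  permissions: ["storage"],
};

console.log(findMissing(partialManifest));
```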
Thanks for reading Think clearly, do better! Share this post to support my work.