008 AI for Operators

Anonymous AI Product Lead at a public company, review of ChatGPT Agent, 9 links

Hi there, 

Welcome back to AI for Operators for our first ~anonymous~ interview. Because sometimes it’s easier to get the real download on AI progress when you don’t have to show your face.

This week, here’s what we’ve got:

  • The Operator: An anonymous AI research and product leader at a public company.

  • The Review: ChatGPT Agent, OpenAI’s latest release

  • The Links: 9 links, including an AI case study from Carta, AI-generated Netflix shows, and how to make AI think you’re a jerk.

The Operator

Anonymous

Our anonymous guest is in an AI research and product role at a public company and operates an independent consultancy doing AI advisory work.

In this episode, he discusses what he has learned from deploying AI at companies big and small, and the constraints imposed by enterprise scale.

  • The advantages of minimalism: a slim stack - ChatGPT o3 plus Cursor/Warp - powers 95% of his work.

  • Power in context: he loads meetings, docs, and emails so the model retrieves and reasons like an informed teammate.

  • AI competence limits are real - respect them: AI handles tasks where ‘good enough’ is good enough, but for tasks that involve SQL (where he considers himself near the top of the skill distribution), hand-crafted code is still the way to go.

  • Enterprise guardrails create opportunity costs: strict browser and API bans delay experimentation and limit productivity gains.

  • Vibe coding has its limits: Yes, your CEO and that X influencer you follow are gushing about how much they can build with Cursor, but building and maintaining enterprise-grade systems is still out of reach for AI coding tools, at least without input from experts.

  • Things are about to get better…a lot better: near-term hardware improvements, richer context, and AI browsers will dissolve boundaries between tools, supercharging personal productivity.

  • Don’t skimp on AI tools: pay for ChatGPT Pro, default to o3, and connect your instance to Google Workspace (if your company allows it).

The Review

This is not a sponsored post.

What It Does

ChatGPT agent is a new capability from OpenAI, released last Thursday. Designed to replace and improve on OpenAI’s first agent release, “Operator” (which was novel to test but underwhelming in practice), ChatGPT agent combines the capabilities of Deep Research with the ability to take action in a virtual browser environment. Think of it as a research analyst that can also look through your email inbox…very slowly.

Why Ops Leaders Should Care

Looking up information and taking actions based on your findings sounds like the foundation of what most operators spend their time doing.

Key Features (Pros & Cons)

Pros

  • Access to a virtual computer to take action on any website (including ones where you need to log in).

  • Access to ChatGPT connectors, so that you can search and take actions across your files, calendar, emails, CRM, and more.

  • The ability to create and edit presentations, spreadsheets, and more.

Cons

  • It’s slow: while the capability and accuracy of the agent are much improved since the release of Operator, watching it complete a task can be like watching paint dry.

  • On sites where you log in, it often won’t complete work in the background, which somewhat defeats the purpose of having an agent work on your behalf.

  • Difficult to control and train: while you can observe the agent and stop it whenever you want, there doesn’t appear to be any way to audit the agent’s actions other than watching a replay of them.

An Operator’s Perspective

I gave ChatGPT agent several tasks to test its capabilities across a number of domains.

First, I asked it to create an event on my calendar - a simple task. I had to help it log in, but then it was able to create the event successfully. Unfortunately, it took much longer to complete the task than it would have taken for me to do it myself, and it paused its work whenever I switched over to another tab (due to its handling of sensitive data, apparently). I’d give it a D on this task.

Second, I asked it to read through all of the emails that I had received so far that day and summarize the five most important ones. It began to read through the emails and seemed to understand the context for most of them. However, it was unable to open attachments, which made it somewhat less useful. I got bored of watching it work after five minutes and paused the task. This task gets an Incomplete.

It was pretty good at meeting prep: when I gave it specific instructions to help me prepare for an external meeting I had coming up later that day, its response was detailed and helpful, saving me time. I tried asking it to prep me for all my meetings that day and the quality degraded somewhat - it didn’t pull as much information and it left out some important context on some meetings, despite having access to the relevant emails and my calendar. On meeting prep, I’d give it a B+.

In its announcement post, OpenAI touted its agent’s ability to mimic an investment banking analyst’s work, so I asked it to prepare a three-statement model on a public company for me. It created the spreadsheet and pulled the data, but the model was very high-level and the agent didn’t format the spreadsheet at all (despite my asking it to try). Perhaps it’d be better with a longer prompt, but this result earned it a C- from me.

My final test (for now) was for it to pull a list of the largest influencers in a couple of different categories on LinkedIn. It completed this task quickly and well, responding with a well-formatted answer. This saved me a decent amount of time. A- on this.

Other Options

Bottom Line

This is a step forward for OpenAI from its Operator release, which was, frankly, unusable. With more time and longer prompts, I can see this agent being quite useful for specific tasks, but more as an extended Deep Research than a general-purpose agent. For releases like this, OpenAI would benefit from tempering expectations (is that possible for Sam Altman?) and adding tutorials and templates to help users understand the limits of its tools. I’m looking forward to continuing to explore the limits of what this agent can do, but I expect a general-purpose agent that can really supercharge my productivity as an operator is still at least a few months away.

The Links

News

  • That Netflix show you love? It’s AI. On the entertainment behemoth’s latest earnings call, co-CEO Ted Sarandos touted the inclusion of some AI-generated footage in one of its recent shows. If you’ve scrolled through Netflix recently, you may be surprised to find out that any of its new shows are not AI-generated.

  • The big OpenAI-SoftBank announcement? Getting Stargate off the ground is proving complicated, between massive dollar amounts, OpenAI corporate structuring challenges, and big egos.

Perspectives

Practical

Thanks for reading,

Tom Guthrie

If you liked this edition of AI for Operators, share it with another operator you know!