Welcome to The Hidden Layer. I’m Ian Krietzberg.
So ChatGPT is finally gearing up to test ads on its platform, which I’ll get into later this week. Today, though, we’re talking about vibe coding: the euphoria surrounding Claude’s A.I. coding tools, my own semi-successful experience using them, and what it all portends for the future of software development. Plus, news and notes on an efficient new A.I. audio model, Waymo’s school bus shortcomings, and Grok’s ongoing “nudify” scandal.
Also mentioned in this issue:
Boris Cherny, Ashley St. Clair, Itamar Friedman, Armando Solar-Lezama, Dario Amodei, Fabian Braesemann, Mike Pappas, Elon Musk, Greg Brockman, and many more…
Three Things You Should Know…
- What was that?: Modulate, an A.I. firm that got its start providing live, automated moderation for video games like Call of Duty, today released an entirely new architectural approach to what it calls “conversational intelligence”: the ensemble listening model (E.L.M.). The idea behind the E.L.M. was to create a cheap and accessible model more capable of real-time voice monitoring—i.e., detecting “deception, toxic behavior, synthetic voices, and escalations” in order to keep gamers safe from hate speech and enterprises safe from fraud.
  The new system, dubbed Velma 2.0, is made up of more than 100 purpose-built models that are “dynamically orchestrated in real time,” said Mike Pappas, the C.E.O. and co-founder of Modulate. Pappas told me that Modulate trained each of these component models—which vary in architecture from basic processing algorithms, to stand-alone neural networks, to small language models—from scratch. The system itself is broken into “blocks” that are powered by clusters of these small, specially trained models. According to internal benchmarks prepared by Modulate, Velma 2.0 is “30 percent more accurate than the leading L.L.M.s” when it comes to audio-based processing, and it’s somewhere between 10 and 100 times cheaper.
  The company has raised a total of $30 million, a fraction of the billions typically needed to keep frontier-level developers afloat. “Compared to an OpenAI, we’re able to punch way above our weight class,” Pappas told me. “We’re running into more and more reasons why we’re starting to see diminishing returns on this idea of, Just pump more data and more compute. So I think the world really is ripe right now to be excited about an opportunity to say, Hey, if this is our compute envelope, we can actually get all the intelligence we have today using 1 percent of it.”
- Waymo’s school bus problem: For the past few months, local school districts in Texas and Georgia have challenged Waymo over an apparent software bug that allows its robotaxis to sail on past idling school buses when their stop arms are out and red lights flashing. By November, the Austin Independent School District (A.I.S.D.) had recorded 19 such violations since the start of the semester, despite Waymo issuing a software update to fix the problem.
  For months, the A.I.S.D. has been calling on Waymo—which issued another software update in December—to cease its robotaxi operations while its buses are picking kids up and dropping them off. Waymo hasn’t. As of last week, the A.I.S.D. had recorded a total of 24 violations, but no accidents. The National Highway Traffic Safety Administration already has an open investigation into Waymo for reports of similar school bus blindness in Atlanta. Not surprisingly, Waymo is undaunted. “Our vehicles have 12x fewer crashes involving injuries to pedestrians compared to human benchmarks and we’re invested in demonstrating exceptional driving performance around school bus interactions that exceeds human-driven vehicles,” a Waymo spokesperson told me. “We have seen material improvement in our performance since our software update.”
- Grok gets sued: Less than a month after Grok’s “nudify” scandal first made headlines, we’ve got our first lawsuit—and it’s come from inside Elon Musk’s sprawling family. Last week, Ashley St. Clair, a G.O.P. influencer and estranged mother of one of Musk’s children, filed a lawsuit against xAI accusing the platform of negligence and infliction of emotional distress for enabling its chatbot to create sexualized images of her. In response, xAI filed a countersuit against St. Clair, accusing her of violating her user agreement. The day before the lawsuit was filed, xAI said it would prevent users from generating bikini deepfakes of real people “in jurisdictions where such content is illegal.” xAI did not respond to a request for comment.
“Financially, what will take me to 1 billion?” —OpenAI co-founder Greg Brockman’s diary entry from 2017, which surfaced in discovery as part of the epic Musk v. Altman litigation. The trial will commence in April; Musk is seeking up to $110 billion in damages.
And now for the main event…
A new wave of A.I. coding tools is impressive and empowering enough to make one imagine a future where we’re all coding our own apps and software engineers are a thing of the past. But these days, it still takes a pro (or armies of them) to get it right.
Though there are dozens of A.I. coding tools out there, Anthropic’s Claude Code seems to exist at the center of what feels like an actual revolution—one of the first real examples of the radical impact that A.I. is sure to have on software development, and the subject of fawning headlines like “Move Over, ChatGPT” (The Atlantic) and “Even Non-Nerds Are Blown Away” (The Journal). It’s a boon, of course, for Anthropic, and a sign that its focus on coding and enterprise applications is paying off. Anthropic’s Boris Cherny claimed that his team built the recently released Claude Cowork software in under two weeks using Claude Code.
Last March, Anthropic C.E.O. Dario Amodei predicted that 90 percent of all code would be written by A.I. by the middle of last year. That didn’t quite pan out, but we are entering an era where amateurs with little to no software development expertise are, well, developing software. One woman documented how she used Claude, along with ChatGPT and v0 by Vercel, to build a restaurant-selection app in a week. Others are using it to build personalized news feed apps, or email organization workflows, or custom mobile games for the family to play over the holiday.
Recently, I gave it a try myself. I decided to start slow, with GitHub Codespaces and a boring idea offered up by ChatGPT: “Build a simple but powerful browser-first web app that automatically saves copied content with context (source, time, surrounding text) and makes it searchable and reusable later.” After a few minutes, that 23-word prompt became a modest but functional web app. A quick highlight and a right click, and I can save quotes—complete with surrounding context and a link to the original source, be it tweet or article—on an endless clipboard attached to my browser. Useful, sure, but not a game-changer.
Then I downloaded Claude Code, and asked it to make me a Star Wars–themed online tower defense game, a request it obliged. In less than a half-hour, I was fighting the Empire (I managed to win after 15 rounds), although Claude Code failed to honor my request for a background that looked like planets from Star Wars. We had a bit of a (polite) back-and-forth, but I gave up after a few attempts. Still, I had a workable game after $20 and 20 minutes.
A good start, so I decided to up the ante. I’ve been intent on mastering conversational Spanish for a long time, and have the lapsed Duolingo account and dusty textbooks to prove it. So I asked Claude to build me a “highly competitive, production-level app designed to teach non-Spanish speakers total conversational Spanish.” Claude got to work, building me a web app it called FluenteSpanol, which included vocab practice and scenario-dependent conversational training. This was by far the most complex thing I attempted to build with Claude Code, and it was genuinely remarkable.
And yet, there were immediate issues around context awareness during the conversational practice, as well as some broken links. And thus began my days-long back-and-forth with Claude: I would ask the bot to fix these issues; it would “identify” and “fix” the problem in the app’s codebase; and then… the issues persisted. I went through six rounds of this before giving up. Yesterday, I tried it twice more in a fresh chat, but to no avail.
As I explained to Dr. Armando Solar-Lezama, an MIT computing professor, my amazement at the speed with which Claude built my app didn’t diminish my skepticism of the approach. As a non-coder, I assumed that problems on the front end implied that there would be some on the back end, too, but I’d need a professional to dogfood it for me. (Also, I probably needed a Spanish speaker to vet the results that Claude was spitting out.) This, Solar-Lezama told me, was consistent with his own experience: “At some point, you need to know how to program in order to actually get what you want,” he said. Part of the challenge is simply in identifying where the coding tool went wrong and understanding how to fix it—an insurmountable challenge for most amateurs. “If you are not writing the code, some basic understanding of coding is still essential to write anything that is not trivial,” Solar-Lezama continued. “All the people I know who are leveraging these tools to do useful things know how to code, even if they are not software developers.”
So, A.I. has not made professional developers obsolete, at least not yet. Itamar Friedman, the C.E.O. of code quality and review startup Qodo, told me that the tools are not at a point where non-developers can truly contribute to software development. “With all due respect to everything you’re seeing on Twitter and everyone saying, Wow, Claude Code created this amazing thing, 99 percent of what you’re using right now on your phone and in your computer is basically code that was written four years ago,” he said.
The journey to that stage, he added, would involve developing A.I. systems that have reliable contextual awareness of the individual quirks, standards, and best practices of a given company’s codebase, while also training the underlying models to both generate and review code. Non-developers, he said, can easily create a bug without knowing what they’ve done. (By the way, he fully expects a cyberattack resulting from shoddy, A.I.-generated code, à la the 2024 CrowdStrike outage, to occur soon.) “Let’s admit, it’s still in the early stages,” he noted. But he thinks we’ll get there eventually. Friedman said he’s aware of engineers at enterprises who are now managing sizable teams of A.I. coding agents.
As code review and verification systems improve—and as A.I. coding tools help non-developers act more and more like developers—Friedman envisions a barbell effect, wherein the number of junior and senior developers increases while the number of midlevel developers declines. Despite Amodei’s premonition and increasingly noisy anecdotes about A.I.-related job replacement, the number of software developers around the world has increased dramatically since 2022, according to market research firm SlashData, from 31 million to 47.2 million in 2025. That’s not to mention the persistent lack of clear evidence, pointed out by the Oxford Economics group and others, that causally ties A.I. to job losses in general. And even as A.I. adoption increases, many developers still don’t trust the accuracy of its outputs. Only 15 percent say they perceive A.I. as a threat to their jobs, according to Stack Overflow’s 2025 developer survey.
Of course, there’s little doubt that L.L.M.s are changing the practice of coding itself. But that doesn’t mean the field of software engineering will simply disappear. Instead, said Dr. Fabian Braesemann, a researcher at the Oxford Internet Institute, coding will just become more accessible to the masses—similar to how, in the early days of the internet, anyone with a bit of HTML knowledge could slap together a simple blog or website. “Back then, the internet did not lead to the disappearance of software developers, but instead to the creation of a whole new occupation—web developer—and ever more appealing websites,” he said, adding that he expects the same cycle to repeat itself now. “Instead of developers being pushed off the job market, there is even more demand because many companies in non-tech industries can now afford to also do their fair share of web and software development in-house.”
As for the value of today’s personalized A.I. vibe-coded apps, Braesemann is skeptical. “We will see a very long tail of not-useful tools that are not professional and that will disappear,” he predicted. “Over time, however, new industries and big corporations that leverage A.I. and data to produce novel software products will undoubtedly appear.”
That’s all for today. I’ll see you on Thursday.
Ian
Puck is published by Heat Media LLC. 107 Greenwich St., New York, NY 10006