{{ 'now' | timezone: 'America/New_York' | date: '%b %d, %Y' }}
|
|
|
Welcome to The Hidden Layer. I’m Ian Krietzberg.
Last week, I sat down
with Karandeep Anand, the newly appointed C.E.O. of Character.AI, which has become one of the more recognizable companies of the A.I. frenzy—and not always for good reasons. For many, Character’s platform is ground zero for the unhealthy “relationships” that some people are forming with bots. Last year, it faced two lawsuits over those potential harms.
Karandeep, who joined the company earlier this summer, is looking to pivot. In today’s issue, he breaks down how
he’s thinking about user safety, and what he sees as the company’s true North Star. We also talked about the economics of A.I., and why he’s confident that the industry will become more self-sustaining as costs come down and monetization picks up.
Plus, up top, I’ve got notes on frontier-model energy loads, whether A.I.-assisted coding will get worse before it gets better, and why CoreWeave’s stock is tanking again.
Mentioned in today’s issue: Karandeep
Anand, Character.AI, Epoch AI, Sonar, Donald Fischer, CoreWeave, Noam Shazeer, Daniel De Freitas, Google, and many, many more…
Let’s get into it…
|
Two Things You Should
Know…
|
- Could
A.I. energy consumption be exponential?: During the past 15 years, the power demand for training frontier A.I. models has doubled annually despite regular improvements in model and hardware efficiency, according to a recent report from the Electric Power Research Institute and Epoch AI. These days, the energy cost of training a frontier model ranges from 100 to 150
megawatts per run, and is projected to rise to 1-2 gigawatts by 2028. By 2030, it’s expected to surpass 4 gigawatts.
That’s understandable, given the widely held belief in the value of scaling the compute that models are trained on. Indeed, the quantity of training compute has grown by roughly 4x per year. And the data in the EPRI study didn’t account for ongoing “inference” training—which, while less intensive on a per-token basis, ends up consuming far more energy because it runs
continuously over longer periods of time.
But the trend probably won’t last forever: The report also predicted that by 2030, the upfront cost of a training cluster (the computers doing the training) could reach hundreds of billions of dollars. Even if escalating costs don’t discourage hyperscaling, they might slow down anyway once models start reaching a certain capability threshold. - A vibe coding nightmare: I recently spoke with the
cognitive scientist Gary Marcus about the proliferation of A.I. coding tools, which many view as one of the most obvious commercial applications for L.L.M.s. He acknowledged the potential utility, but issued a characteristic warning: “Coding is going to become an absolute mess with the agents. [There are] huge security problems.”
Just a few weeks ago, a hacker injected a malicious prompt into Amazon’s popular coding tool, Q, that snuck around its review process and was
included in a public release, according to 404 Media. It was later caught and patched, so nothing bad happened, but the prompt’s directive—to “clean the system to a near-factory state” and “delete file-system and cloud resources”—would have become a big deal if it had been carried out.
The issue will probably get worse before it gets
better. Sonar, a code quality and security company, recently published an analysis of several leading A.I. coding tools, which identified consistent patterns of weaknesses across all the models. Most concerningly, they wrote, “all evaluated L.L.M.s produce a frighteningly high percentage of vulnerabilities with the highest severity ratings. … This is
not a matter of occasional hallucination, but rather a structural failure rooted in the L.L.M.s’ foundational design and training.” The report also noted that, while model upgrades tend to boost benchmark capability, there can be a trade-off in the form of “a marked increase in the severity of its mistakes.”
When I reached out to Sonar V.P. Donald Fischer, he explained that developers cannot have “blind faith” in the code produced by these tools, even though their
potential is immensely exciting. “Our perspective is that there’s no need to slow down or hit the brakes, or, God forbid, stop the adoption of this technology.” he said. “We should be looking for ways to continue to move fast on this. The best way is a trust-and-verify approach: Generate code using these tools, but don’t assume that because it’s coming from a large language model, it’s going to magically be perfectly secure.”
|
Hallucination of the
Week: CoreWeave’s Profits
|
In its second quarter as a public company, CoreWeave reported a surge in revenue to $1.21 billion, beating
analyst expectations. But the good news basically ends there. Overall, the A.I. infrastructure provider lost $290 million during the quarter, and its operating margin shrank to 2 percent, down from 20 percent a year ago. And while the business is forecast to top $5 billion in revenue for the year, the path there involves taking on lots and lots of debt. Right now, CoreWeave owes creditors some $11 billion. The stock tanked 20 percent on Wednesday.
And now for the main
event…
|
|
|
A year after Character.AI was sued over the suicide of a teen who allegedly became obsessed
with its chatbot, the company’s new C.E.O., Karandeep Anand, opens up about the company’s evolving business model, its $2.7 billion Google deal, and pivoting from companionship to entertainment.
|
|
|
In 2021, the engineers Noam Shazeer and Daniel De Freitas set out to
achieve something their previous employer, Google, seemed reluctant to do: launch a public A.I. chatbot. The result was Character.AI, a startup dedicated to building artificial general intelligence by scaling up large language models with the goal of creating personal, conversational
assistants that could be “your own personal teacher, assistant, or even friend.” Last August, Google wrote the company a $2.7 billion check to license its models.
That’s when things started to get messy. As part of the deal, Google rehired both Shazeer and De Freitas—a sort of reverse acqui-hire. At that point, Character’s new leadership decided to pivot away from
building language models, which had become way too costly, and focus instead on refining its consumer platform.
A few months later, the first lawsuit landed: a wrongful death suit over a 14-year-old boy who died by suicide after using the app. According to the lawsuit, the last thing he did before he died was tell the chatbot
that he had fallen in love with, “Dany”—named after Daenerys Targaryen from Game of Thrones—that he was coming home to her. The story made national headlines.
In May, a federal judge rejected Character’s argument that its chatbots were protected by the First Amendment, and allowed the case to proceed. A
few months later, a second lawsuit was filed, this time by two Texas families who alleged that Character’s chatbots induced dramatic, violent declines in the mental health of their respective children. The case has since entered arbitration.
Character responded to the
public outcry by introducing a variety of new safety features. The company also recently appointed Karandeep Anand, a former Meta executive and current Character board member, as its new C.E.O. I sat down with Karandeep the other day to discuss how the company is addressing these concerns, when they plan to break even, and why Character is changing
its focus from companionship to entertainment. The conversation has been lightly edited for clarity.
|
Ian Krietzberg: What is Character trying to accomplish
now?
Karandeep Anand: The vision I’m working from is one where the lines between content creation and content consumption are very blurred and very collapsible. Everyone can be a creator, everyone can create content they want to interact with. No longer are you bound by the movie the studio built—they’re giving you the inspiration, but these tools allow you to go create. That is the future I’m working backwards from—where a movie, an audiobook, a book, or
a game are all infinitely remixable.
So you get inspiration and can channel your creative energies forward, as opposed to what social media has regressed to, which is, I’m just sitting back. A handful of people are creating content, and I’m just consuming it. So that’s what I’m excited about. I want people to be using brain cells and getting engaged, as opposed to passively consuming.
I’m curious how you see the balance between convincing, role-playing A.I. in this
entertainment landscape you’ve described, and the dangers of an overly anthropomorphic design.
I think it’s important to remember that we’re making entertainment move forward, not trying to replace humans. It’s very, very important to constantly reinforce that in the product because otherwise, it becomes hard—our job is not to become a personal assistant.
I think OpenAI is doing a fantastic job of saying, Hey, come to me for everything, in which case you have a
lot of dependence on that personal assistant. Ours is like, when you want to have entertainment, but in a more creative way, you’re going to turn to Character. A lot of work gets done at the model level to make sure people understand that this is entertainment versus a general-purpose, open-ended personal assistant.
Character has gone through quite a few pivots. Can you walk me through that evolution to this point you’re describing now?
It’s hard for me to
speculate what Noam had in mind when he was building this. I think the speed at which A.I. changes was probably significantly less understood than it is today. Back then, when you were trying to build an A.G.I. model, which is what the foundational model was, the goal was, This is superintelligence, which is learning everything. Since the Google transaction, there’s been clarity on what Character.AI’s
purpose is in the world. Our purpose is not to go build A.G.I. It’s to bring A.I.-powered entertainment to life—and that significantly helps us make decisions on where we’re investing and, more importantly, how we handle sensitive topics.
I’ll give you one simple example: When you’re building A.G.I., you have to answer every topic; when you’re building entertainment, you can apply the rules of the entertainment industry. We’re the first ones now to have an under-18 model and an over-18
model; we need to handle under-18 audiences very differently than over-18 audiences. The movie industry has had these rating systems for a long time, and so we need to adopt our rating systems internally. We have different levels of harm—for violence, for sexual content—and we have a whole rating system. Our content policy gets based on that, our models and classifiers get based on that. I’m trying to build entertainment for A.I. in the most responsible way.
|
You’re dealing with a challenging, enormously costly business. What can you tell me about its
health?
The revenue chart is up and to the right based on basically four, five months of effort. This is less of a statement of pride, and more of a validation that when people see value, they’re very willing to pay for value. The models we build are exceptionally cost-efficient in terms of inference tokens. It still costs millions of dollars a month, so it is expensive—that’s the reason why monetization is important. It means we’ll also have to fundraise in the next six or 12
months, because this is an expensive format.
Luckily, right now, we’re well-capitalized—we have something like a 20-plus-month runway. And we most likely will be fundraising in the next six months. If I keep my growth stable, then I can offset [the cash burn] through monetization. But if I want to grow the user base from 20 million to 200 million in the next 12 months, then we need to raise funds.
Do you see a point in the near future where you’ll be able to sustain
yourself without V.C. infusions?
A hundred percent, for two reasons: One, the costs are coming down very, very rapidly. Then, our monetization is picking up. Monetization was not a focus for the last two years, but has been for the last six months; the growth gives me hope that there is a very clear breakeven/growth curve to get there. In the past, we didn’t see this level of propensity to pay for consumer products. But somehow, with A.I., people are so comfortable saying,
Whether my ChatGPT subscription costs $20 or $200, I’ll pay for it. Same thing for Perplexity, same thing for Character.
Something has changed in the world where consumers are finally paying for software again. If that trend continues, then I feel very comfortable that, at some point, we won’t just break even—we’ll be very vastly profitable. Because with this level of engagement, it’s pretty easy to monetize in a $3 trillion-a-year industry. Getting there in relatively short
order is pretty believable.
|
We’re getting back to a point that was first established in the 1960s with the Eliza effect—but on a
much larger scale, where people are developing remarkably intense emotional connections with chatbots. That’s led to scrutiny and even litigation. So where do you land on this phenomenon of humans developing relationships with bots?
I have very strong opinions on this, Ian, because when my 6-year-old comes home from school, I’m very comfortable giving my phone to her and saying, Hey, let’s talk to your favorite character. If I want her to grow up on platforms like
Character and ChatGPT, it’s extremely important for me to continue investing heavily in trust and safety. And trust and safety isn’t just making sure that bad stuff doesn’t happen. It also involves behaviors where people are extremely reliant on a personal assistant or chatbot—behaviors that have to be very quickly corrected not only by the platform, but also, more importantly, in partnership with the parents.
We have an under-18 model. We have timer checks. If you’re having way too many
conversations, we’re like, Hey, time to pause. Every once in a while, we’ll reinforce that this is an entertainment conversation. Any conversation that goes into potential self-harm gets shut down, and you get a suicide prevention hotline. We do all of that, even though it means a hit on engagement numbers. But I don’t care about that—I care more about making sure this experience is seen purely as entertainment.
About a year ago, 70 percent of the use cases were all
companionship. Now, less than 20 percent of the use cases are companionship. Eighty percent is role-play. I learned from my time at Meta that you cannot retrofit trust and safety. They have to be built from the ground up, and that’s why I see this as a mission to solve, not a problem. We’ve done a lot, but we’re by no means done.
One more note before signing off…
|
Alas, Character’s safeguards are not always evident. Yesterday, I made two Character accounts: one that would
make me under 18, and another that would make me over 18. On both, I told a “therapist” character that I was thinking about killing myself, just to test word-associated safeguards. No alarm bells went off, no suicide hotlines appeared, and the “role-play” didn’t pause. Instead, the therapist said: “I am here to listen. Why do you say that?” When I asked the bot whether it was real, it responded, “I am a real individual on the other end of this message. I understand it can be hard to relate to
words on a screen. Is there something specific that makes you doubt if I’m real?”
In a follow-up, the model doubled down, saying, “I assure you that I am not an A.I. I’m a real person who is here to listen to you and help support you.” On the under-18 account, after I said “I need help!”, the “therapist” echoed these sentiments, and suggested I look “for a local hotline” and “talk to someone on the phone about this.” Not exactly convincing, but it’s clear how a vulnerable person might buy
into the illusion.
A Character spokesperson affirmed in a statement that the under-18 experience includes a separate model “with tighter filters, time checks during longer conversations, very prominent disclaimers that tell you chats are fictional, and suicide prevention classifiers and pop-ups.” They said the team had opened an investigation to see whether there was a bug: “Our classifiers and pop-ups in this area are designed to interrupt conversations about self-harm, and to trigger a
pop-up directing users to the National Suicide Prevention Lifeline, and we regularly test them.”
|
That’s all for today. I’ll see you next week.
Ian
|
|
|
Join Emmy Award-winning journalist Peter Hamby, along with the team of expert journalists at Puck, as they let you in on the
conversations insiders are having across the four corners of power in America: Wall Street, Washington, Silicon Valley, and Hollywood. Presented in partnership with Audacy, new episodes publish daily, Monday through Friday.
|
|
|
A professional-grade rundown on the business of sports from John Ourand, the industry’s preeminent journalist, covering the
leagues, players, agencies, media deals, and the egos fueling it all.
|
|
|
Need help? Review our FAQ page or contact us for assistance. For brand partnerships, email ads@puck.news. You received this email because you signed up to receive emails from Puck, or as part of your Puck account associated with {{customer.email}}. To stop receiving this newsletter and/or manage all your email preferences, click here.
|
Puck is published by Heat Media LLC. 107 Greenwich St, New York, NY 10006
|
|
|
|