Berkeley's MemGPT Team Launches a Startup to Build an Open-Source Alternative to OpenAI, with Jeff Dean Among the Investors
On Monday, the startup Letta made headlines with its launch of technology that can help AI models remember users and conversations.
Letta was born in UC Berkeley's famed startup-producing labs and has announced $10 million in seed funding led by Felicis' Astasia Myers, a round that values the company at $70 million.
Letta has also received support from a range of angel investors in the field of artificial intelligence, including Jeff Dean of Google, Clem Delangue of Hugging Face, Cristóbal Valenzuela of Runway, and Robert Nishihara of Anyscale.
The highly anticipated AI startup, founded by Berkeley PhD students Sarah Wooders and Charles Packer, is a spinoff of Berkeley’s Sky Computing Lab and the commercial entity for the popular MemGPT open source project.
GitHub link: https://github.com/cpacker/MemGPT
Berkeley’s Sky Computing Lab, led by renowned professor and Databricks co-founder Ion Stoica, is the successor to RISELab and AMPLab, which spawned companies such as Anyscale, Databricks, and SiFive. The Sky Computing Lab in particular has produced many popular open source large language model (LLM) projects, such as Gorilla LLM, vLLM, and the structured generation language SGLang.
“In less than a year, there was a lot of activity happening in the lab. Those were the people I sat next to,” Wooders said. “It was an incredible time.”
MemGPT is one such project, and it proved so popular that it took off before the team had even promoted it.
The project's creators released a white paper on Thursday, October 12, 2023, and planned to publish a more in-depth paper and the code on GitHub the following Monday. But someone stumbled upon the paper and posted it to Hacker News on Sunday. It "went viral on Hacker News before we had a chance to officially release code, a paper, or a tweet about it," Packer said.
Project homepage: https://memgpt.ai/
What’s exciting about MemGPT is that it aims to solve a pain point of large language models (LLMs): in their native form, models like GPT-4 are stateless, meaning they do not retain historical data in long-term memory.
This is a big problem for AI applications that need to gradually understand and learn about users over time. From customer support bots to healthcare symptom-tracking apps, many potential AI applications require giving large models "long-term memory." MemGPT manages data and memory so that LLM agents and chatbots can remember previous users and conversations.
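To make the idea concrete, here is a minimal sketch of externalized "long-term memory" for a stateless model: past exchanges are persisted outside the model and the most relevant ones are injected back into each prompt. This is an illustration of the general pattern only, not MemGPT's actual implementation, and the file name and scoring heuristic are placeholders.

```python
# Sketch: give a stateless LLM "memory" by persisting facts outside the model
# and paging the most relevant ones back into each prompt.
import json
from pathlib import Path

MEMORY_FILE = Path("memory.json")  # placeholder store; a real system might use a database

def load_memory() -> list[dict]:
    return json.loads(MEMORY_FILE.read_text()) if MEMORY_FILE.exists() else []

def save_memory(memory: list[dict]) -> None:
    MEMORY_FILE.write_text(json.dumps(memory, indent=2))

def recall(memory: list[dict], query: str, k: int = 3) -> list[dict]:
    # Naive relevance: count overlapping words. A real system would use
    # embeddings or let the model itself decide what to page back in.
    def score(entry: dict) -> int:
        return len(set(entry["text"].lower().split()) & set(query.lower().split()))
    return sorted(memory, key=score, reverse=True)[:k]

def build_prompt(user_message: str) -> str:
    memory = load_memory()
    relevant = recall(memory, user_message)
    context = "\n".join(f"- {m['text']}" for m in relevant)
    # The stateless model only "remembers" what is placed in the prompt.
    prompt = (
        f"Known facts about this user:\n{context}\n\n"
        f"User: {user_message}\nAssistant:"
    )
    memory.append({"text": user_message})
    save_memory(memory)
    return prompt

print(build_prompt("My knee pain is back, same as I reported in March."))
```

The point of the pattern is that the model itself never changes; only the prompt assembled for each call carries the accumulated history.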
Packer recalled that the MemGPT paper stayed on the front page of Hacker News, the site operated by Y Combinator, for 48 hours, and that he spent several days answering questions there while preparing to release the code. After the MemGPT project landed on GitHub, its link went viral on Hacker News again. Interviews and tutorials on YouTube, Medium posts, 11,000 stars, and 1.2K forks quickly followed.
Myers of VC Felicis also discovered Wooders and Packer while reading about MemGPT and immediately recognized the commercial potential of the technology.
“I saw the paper when it came out,” she says, and immediately reached out to the project team. “Our investment thesis is around AI agent infrastructure, and we realized that a very important component of that is data and memory management to make these conversational chatbots and agents effective.”
Before they found the first firm to take a liking to them, the MemGPT team was making the rounds on Sand Hill Road, talking to venture capital firms over Zoom.
At the same time, Stoica helped the company connect with Jeff Dean, Robert Nishihara, and other well-known Silicon Valley angel investors. Packer recalled that the angel round came together remarkably easily: "Many professors at Berkeley have extensive connections because they work locally. They all pay close attention to the projects the lab is about to commercialize."
Competition and the Threat of OpenAI o1
Although MemGPT has been available since last year, Letta's commercial offering, Letta Cloud, is not yet live. As of Monday, Letta is accepting requests from beta users. It will provide a hosted agent service that lets developers deploy and run stateful agents in the cloud, accessible through REST APIs, with Letta Cloud storing the long-term data those agents need. Letta will also provide developer tools for building AI agents.
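As a rough illustration of what calling a hosted stateful-agent service of this kind might look like: the agent is created once and keeps its memory server-side, so each later request only sends the new message. The base URL, endpoint paths, and field names below are hypothetical placeholders, not Letta's documented API.

```python
# Sketch of a hosted stateful-agent workflow over REST (endpoints are hypothetical).
import requests

BASE_URL = "https://api.example-agent-cloud.dev"   # placeholder, not a real service
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

# Create a stateful agent once; the service persists its memory server-side.
agent = requests.post(
    f"{BASE_URL}/agents",
    headers=HEADERS,
    json={"name": "support-bot", "persona": "A patient support assistant."},
).json()

# Later messages reference the same agent ID, so earlier conversations stay
# available to the model without resending the full history each time.
reply = requests.post(
    f"{BASE_URL}/agents/{agent['id']}/messages",
    headers=HEADERS,
    json={"role": "user", "content": "Last week I mentioned a billing issue. Any update?"},
).json()

print(reply["content"])
```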
Wooders sees a wide range of uses for MemGPT. “The number one use case we see is a highly personalized, very engaging chatbot,” she says. But there are also cutting-edge uses, like a “chatbot for cancer patients,” where patients upload their medical history and then share ongoing symptoms so the AI can learn and provide guidance over time.
It’s worth noting that Letta isn’t the only company working on this. LangChain, perhaps its most notable competitor, already offers commercial options, and the leading model providers offer agent tools of their own, such as OpenAI’s Assistants API.
OpenAI’s new o1 model may also reduce the need for users to manage state themselves. Because it is a multi-step model, it inherently has to maintain some state in order to “think” and verify facts before responding.
But Wooders, Packer, and Myers see some key differences between Letta’s offering and OpenAI’s. Letta claims it can work with any AI model, and expects its users to mix many of them: OpenAI, Anthropic, Mistral, and their own models. OpenAI’s technology currently works only with OpenAI’s own models.
More importantly, Letta builds on the open source MemGPT project and plants itself firmly in the open source camp, arguing that open source is the better choice for AI applications.
“We position ourselves as the open alternative to OpenAI,” Packer said. “Building the best AI applications is very hard, especially when you care about problems like hallucinations.”