Rabbit’s handheld AI device aims to create a post-smartphone experience
Santa Monica-based Rabbit debuted the r1, a small personal AI device that operates multiple apps on the user’s behalf to get things done.
The device, designed by the eclectic Swedish tech firm Teenage Engineering, is the size of a stack of Post-it notes and slips easily into a pocket. It uses a large language model from OpenAI to understand the user’s spoken requests. But Rabbit’s ambition goes well beyond the generative output of LLMs, toward a more active, agentic AI.
The startup’s special sauce is its software, namely a proprietary AI model called the Large Action Model (LAM) that learns to operate the user’s apps on their behalf. For example, a user might tell r1: “Get me an Uber to the office, pick out some pump-up music for the trip, and notify the team I’ll be a little late.” The LAM then interfaces with the relevant apps to fulfill the request.
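Rabbit hasn’t published how the LAM represents or executes these actions, but the general shape of the idea, a planner that turns one spoken request into a sequence of per-app actions, can be sketched in a few lines of Python. Everything below (the AppAction class, plan_actions, execute) is a hypothetical illustration, not Rabbit’s code.

```python
# Illustrative sketch only: the class and function names here are invented,
# and a real system would use an LLM plus a learned action model to plan.
from dataclasses import dataclass


@dataclass
class AppAction:
    app: str        # which app the agent should operate
    intent: str     # what to do inside that app
    params: dict    # arguments extracted from the spoken request


def plan_actions(request: str) -> list[AppAction]:
    """Toy planner: map a natural-language request to per-app actions."""
    return [
        AppAction("uber", "request_ride", {"destination": "office"}),
        AppAction("spotify", "play_playlist", {"mood": "pump-up"}),
        AppAction("slack", "send_message",
                  {"channel": "team", "text": "Running a little late."}),
    ]


def execute(action: AppAction) -> None:
    # In Rabbit's description, the LAM drives each app's own interface;
    # here we just print what would be done.
    print(f"[{action.app}] {action.intent} {action.params}")


if __name__ == "__main__":
    request = ("Get me an Uber to the office, pick out some pump-up music "
               "for the trip, and notify the team I'll be a little late.")
    for step in plan_actions(request):
        execute(step)
```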
LAM’s app coordination abilities, in theory, create a sort of post-smartphone experience. “We’ve come to a point where we have hundreds of apps on our smartphones with complicated UX designs that don’t talk to each other,” says Rabbit founder and CEO Jesse Lyu. “As a result, end users are frustrated with their devices and are often getting lost.”
The model was initially trained by watching thousands of recorded user sessions across a variety of apps. Users grant the Rabbit assistant access to their apps by entering their credentials on a setup page from a laptop or desktop.
Lyu says Rabbit could have connected with various apps by paying for access to their application programming interfaces (APIs). But, he says, the functionality exposed through an API is often limited, and expanding it is rarely among an app maker’s top priorities. That’s one reason Rabbit chose instead to access the user’s apps through the front door, using the user’s own credentials. In theory, the LAM could learn to do nearly everything the user can do with an app.
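Rabbit hasn’t detailed its automation stack, but the distinction Lyu is drawing is roughly the one below: an API call is limited to whatever endpoints the app maker exposes, while “front door” automation logs in with the user’s own credentials and drives the same interface a person would. The endpoint, site, and page selectors here are invented for illustration, and Playwright simply stands in for whatever UI-driving machinery Rabbit actually uses.

```python
# Illustrative sketch only: a hypothetical ride-hailing service, accessed
# two ways. Neither the API endpoint nor the website below is real.
import requests
from playwright.sync_api import sync_playwright


def book_ride_via_api(token: str) -> None:
    # API route: limited to the endpoints the app maker chooses to expose.
    requests.post(
        "https://api.example-rides.com/v1/rides",   # hypothetical endpoint
        headers={"Authorization": f"Bearer {token}"},
        json={"destination": "office"},
        timeout=10,
    )


def book_ride_via_ui(username: str, password: str) -> None:
    # "Front door" route: sign in with the user's own credentials and
    # operate the same web interface a person would.
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto("https://example-rides.com/login")   # hypothetical site
        page.fill("#username", username)
        page.fill("#password", password)
        page.click("text=Sign in")
        page.fill("#destination", "Office")
        page.click("text=Request ride")
        browser.close()
```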
Will app makers feel disintermediated by Rabbit’s approach? After all, Rabbit is proposing that users send a proxy AI agent to use their apps, and that proxy isn’t likely to view ads or be persuaded to make in-app purchases. Lyu says he doesn’t believe app owners will squawk, because the service they’re selling is still being used; it’s just accessed in a different way.
Later this year, users will be able to use a “teach mode,” which will let them directly train the LAM on their go-to apps. Using a web portal called the “rabbit hole,” the user teaches the model by demonstrating how they use their apps. The model “can infer and model human actions on computer interfaces by learning users’ intention and behavior when they use specific apps, and then mimic and perform them both reliably and quickly,” Rabbit explains in a press release. For instance, a user could teach the model to use an image-editing app to remove watermarks from photos.
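As a rough illustration of what a demonstration-taught routine might look like, one can imagine teach mode recording a sequence of UI steps and replaying them later with new parameters. The step format, app names, and fields below are entirely hypothetical; Rabbit hasn’t described how taught routines are stored.

```python
# Illustrative sketch only: a recorded demonstration replayed with new
# parameters. None of these names come from Rabbit's documentation.
from dataclasses import dataclass


@dataclass
class Step:
    action: str               # e.g. "open", "click", "type"
    target: str               # a UI element or app name
    value: str | None = None  # text to enter, if any


# A recorded demonstration: remove a watermark in a hypothetical editor.
demo = [
    Step("open", "photo_editor"),
    Step("click", "File > Open"),
    Step("type", "file_dialog", "{input_path}"),
    Step("click", "Remove watermark"),
    Step("click", "Export"),
    Step("type", "export_dialog", "{output_path}"),
]


def replay(steps: list[Step], **params: str) -> None:
    """Replay a learned routine, substituting the user's parameters."""
    for step in steps:
        value = step.value.format(**params) if step.value else ""
        print(f"{step.action:>5}  {step.target:<15} {value}")


if __name__ == "__main__":
    replay(demo, input_path="vacation.jpg", output_path="vacation_clean.jpg")
```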
Of course, the proof is in the pudding. Ex-Apple executives Imran Chaudhri and Bethany Bongiorno created a lot of buzz with their Humane startup and its lapel-riding device, the Ai Pin. But at the product’s actual launch, it became clear that the Pin’s ability to integrate with apps in useful ways was somewhat limited. OpenAI currently lets ChatGPT users control some apps through app “plugins,” and the company is also said to be working with ex-Apple design guru Jony Ive on a personal AI device.
When we actually get an r1, we’ll be looking closely at how easy it is to set up, the variety of apps the LAM already knows how to use, the flexibility or rigidity with which novel multi-app actions can be called up, and the ease with which the model can be trained on new apps or workflows.
Rabbit says it chose the r1’s shape and features so that it can launch user tasks faster than an advanced smartphone can. The handheld device weighs 115 grams and is roughly the size of a stack of Post-it notes. It has a 2.88-inch touchscreen display and a scroll wheel for navigating through “cards” that provide textual information on the tasks the AI is performing. A rotating camera can perform computer-vision tasks (e.g., “What are the ingredients of this food item?”) and can make video calls. The device has Wi-Fi and cellular connectivity, a MediaTek processor, 4 GB of memory, 128 GB of storage, a USB-C port, and a SIM card slot. Its battery will last all day, Rabbit says.
The new r1 device will cost $199 and will be released by Easter, Lyu says.