New AI assistant can browse, search, and use web apps like a human

Yesterday, California-based AI firm Adept announced Action Transformer (ACT-1), an AI model that can perform actions in software like a human assistant when given high-level written or verbal commands. It can reportedly operate web apps and perform intelligent searches on websites while clicking, scrolling, and typing in the right fields as if it were a person using the computer.

In a demo video tweeted by Adept, the company shows someone typing, "Find me a house in Houston that works for a family of 4. My budget is 600K" into a text entry box. Upon submitting the task, ACT-1 automatically browses Redfin.com in a web browser, clicking the proper regions of the website, typing a search entry, and changing the search parameters until a matching house appears on the screen.

1/7 We built a new model! It’s called Action Transformer (ACT-1) and we taught it to use a bunch of software tools. In this first video, the user simply types a high-level request and ACT-1 does the rest. Read on to see more examples ⬇️ pic.twitter.com/mq7c0Vyd7N

— Adept (@AdeptAILabs) September 14, 2022

Another demonstration video on Adept's website shows ACT-1 operating Salesforce with prompts such as "add Max Nye at Adept as a new lead" and "log a call with James Veel saying that he's thinking about buying 100 widgets." ACT-1 then clicks the right buttons, scrolls, and fills out the proper forms to finish these tasks. Other demo videos show ACT-1 navigating Google Sheets, Craigslist, and Wikipedia through a browser.

An Adept promotional video showing ACT-1 operating Google Sheets, a web-based spreadsheet app.

How is this possible? Adept describes ACT-1 as a "large-scale transformer." In AI, a transformer model is a type of neural network that learns to do something by training on example data, and it builds knowledge of the context and relationships between items in the data set. Transformers have been behind many recent AI innovations, including language models like GPT-3 that can write at a nearly human level.
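To make the idea of "relationships between items" a bit more concrete, here is a toy sketch of self-attention, the core operation inside transformer models in general. It is not Adept's code, and nothing here reflects how ACT-1 itself is built. The function names (`softmax`, `self_attention`) and the three example vectors are made up for illustration, and the sketch leaves out the learned weight matrices a real transformer would apply, so each item is simply compared directly with every other item.

```python
import math

def softmax(scores):
    # Turn raw similarity scores into positive weights that sum to 1.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    # Simplifying assumption: queries, keys, and values are the input
    # vectors themselves (a real transformer learns projections for each).
    dim = len(vectors[0])
    outputs = []
    all_weights = []
    for query in vectors:
        # Scaled dot product: how strongly this item relates to each item.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # Each output is a weighted blend of every item in the sequence.
        mixed = [
            sum(w * value[i] for w, value in zip(weights, vectors))
            for i in range(dim)
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Three hypothetical "items"; the first two point in similar directions.
tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = self_attention(tokens)
```

In this toy example, the first item puts more weight on the second item (which resembles it) than on the third (which does not). That is the sense in which a transformer tracks context: every item's representation is rebuilt from the other items it relates to most.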