Microsoft has announced “computer use,” a new Copilot Studio feature that lets AI agents interact with websites and desktop applications. Users can now create agents that click buttons, select menu items, and enter text into on-screen fields, even if the app or site doesn’t have an open API. This makes it possible to automate routine tasks such as data entry, marketing research, and invoice processing.
AI agents created in Copilot Studio can work with the major browsers: Edge, Chrome, and Firefox. Users don’t need programming skills; describing the desired task in plain language in the Copilot Studio window is enough. Before launch, users can test and adjust the task in a simulator and review the agent’s action history, along with screenshots and the logic behind its actions.
The system can adapt to changes in apps or websites, such as a button moving or a page’s appearance changing. This allows the agent to keep performing its tasks without user intervention, even after the interface is updated. The agents run on Microsoft’s cloud platform, and the data generated during operation is not used to train the model.
Microsoft has also made Copilot Vision free for Edge users. The feature recognizes information on the screen and suggests how to work with applications. It is activated from the browser’s sidebar, and users only need to grant the appropriate permission. Copilot Vision can, for example, help users follow a recipe while cooking or offer tips on preparing for an interview.