Is Microsoft 365 Copilot Agent Mode Ready to Rival Human Accuracy?

Analyst(s): Nick Patience
Publication Date: October 2, 2025

Microsoft has launched Agent Mode in Excel and Word, alongside Office Agent in Copilot chat. These features allow users to steer Copilot through multi-step tasks, producing spreadsheets, documents, and presentations with higher accuracy and interactivity.

What is Covered in this Article:

  • Agent Mode in Excel adds multi-step orchestration, benchmarked at 57.2% accuracy vs 71.3% for humans.
  • Word gains conversational “vibe writing” for drafting, refining, and formatting documents.
  • Office Agent in Copilot chat (Anthropic-powered) builds PowerPoint decks and Word documents through chat-first workflows.
  • Microsoft blends OpenAI and Anthropic models, assigning each to distinct Copilot roles.

The News: Microsoft has rolled out Agent Mode in Excel and Word, along with Office Agent in Copilot chat, introducing what it calls “vibe working.” Agent Mode brings multi-step automation to Excel and Word, while Office Agent lets users generate Word and PowerPoint content straight from chat prompts.

Agent Mode is available now for Microsoft 365 Copilot users and Microsoft 365 Personal or Family subscribers through the Frontier program. Excel and Word will be supported on the web, and PowerPoint will be coming soon. Office Agent, powered by Anthropic models, is also launching in the U.S. for Personal and Family subscribers.

Analyst Take: The release of Agent Mode and Office Agent marks a shift in Microsoft’s Copilot strategy, embedding agent-driven workflows directly into tools people use every day. By adding advanced orchestration and reasoning inside Excel, Word, and PowerPoint, Microsoft aims to make powerful functionality more accessible while creating a new way for teams to collaborate.

Expanding Excel Beyond Experts

Excel has long been essential for everything from simple budgets to corporate finance, but its deeper features have mostly been used by experts. Agent Mode changes this by allowing Copilot to “speak Excel” natively through OpenAI’s latest reasoning models. Instead of just generating results, it can check outputs, fix mistakes, and rerun processes until they are correct. Microsoft benchmarked Agent Mode at 57.2% accuracy on SpreadsheetBench – below the 71.3% human score, but ahead of Shortcut.ai and Claude Files. Although a gap of 14.1 percentage points to the human score remains, this is a big step in making advanced Excel capabilities usable for non-experts.
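
The check, fix, and rerun behavior described above can be pictured as a simple generate-and-validate loop. The Python sketch below is only an illustration under stated assumptions: `run_until_valid`, `toy_generate`, and `toy_validate` are hypothetical names, the spreadsheet data is invented for the example, and nothing here reflects how Microsoft actually implements Agent Mode.

```python
from typing import Callable, Optional, Tuple


def run_until_valid(
    generate: Callable[[Optional[str]], float],
    validate: Callable[[float], Optional[str]],
    max_attempts: int = 3,
) -> Tuple[float, int]:
    """Call generate, check the result, and retry with feedback until it passes.

    Hypothetical helper: generate receives the last validation error (or None
    on the first attempt); validate returns None when the result is acceptable.
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        candidate = generate(feedback)
        feedback = validate(candidate)
        if feedback is None:
            return candidate, attempt
    raise RuntimeError(f"no valid result after {max_attempts} attempts: {feedback}")


# Invented column of values standing in for a spreadsheet range.
values = [120.0, 80.0, 50.0]


def toy_generate(feedback: Optional[str]) -> float:
    # First attempt mimics a range error by dropping the last row;
    # once feedback arrives, the full range is summed.
    if feedback is None:
        return sum(values[:-1])
    return sum(values)


def toy_validate(total: float) -> Optional[str]:
    # Independent check of the result against the full column.
    expected = sum(values)
    if abs(total - expected) > 1e-9:
        return f"total {total} does not match expected {expected}"
    return None


result, attempts = run_until_valid(toy_generate, toy_validate)
print(result, attempts)
```

In this toy run the first attempt fails validation and the second succeeds. The cap on attempts is a design choice for the sketch: a loop that retries without a limit could run indefinitely on a task the model cannot solve.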

Conversational Writing in Word

Agent Mode in Word reimagines document creation as “vibe writing” – a more interactive process where Copilot drafts, refines, and asks questions as you go. Rather than one-off outputs, it stays in conversation, folding in feedback and applying proper formatting. Example uses include summarizing customer reviews, updating reports, or polishing documents to match branding. By mixing drafting with iteration, Microsoft hopes to speed up writing tasks while still keeping users in control.

Office Agent’s Chat-First Creation Workflow

Office Agent brings Copilot into chat to generate complete PowerPoint decks and Word documents using Anthropic models. The process starts by clarifying length, theme, focus areas, and audience details. It then conducts web research with a visible reasoning trail and live slide previews before using code generation and quality checks to deliver polished content. This chat-first setup pulls clarification, research, generation, and revision into one workflow.

Model Composition Across Copilot

Office apps are powered by OpenAI models, with Agent Mode using GPT-5 for step-by-step task execution, while Office Agent in Copilot chat runs on Anthropic models. The company signals an ongoing commitment to OpenAI but is also building a broader model family to match different strengths and needs. Anthropic models are now appearing in Microsoft 365 apps through Copilot chat, showing a clear strategy of mixing providers across the portfolio. In practice, this means selecting models based on their specific role within Copilot, rather than relying on a single source for every task.
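
One way to read this role-based approach is as a lookup from Copilot role to model provider. The Python sketch below encodes only the assignments stated in this article; the role keys, the `ModelChoice` type, and the `select_model` function are hypothetical names chosen for illustration, and the specific Anthropic model is left unspecified because the article does not name it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelChoice:
    provider: str
    model: str  # "unspecified" where the article names no specific model


# Hypothetical role keys; the provider assignments follow the article.
ROLE_TO_MODEL = {
    "agent_mode_excel": ModelChoice("OpenAI", "GPT-5"),
    "agent_mode_word": ModelChoice("OpenAI", "GPT-5"),
    "office_agent_chat": ModelChoice("Anthropic", "unspecified"),
}


def select_model(role: str) -> ModelChoice:
    """Return the model assigned to a role, rejecting unknown roles."""
    try:
        return ROLE_TO_MODEL[role]
    except KeyError:
        raise ValueError(f"unknown Copilot role: {role!r}") from None


print(select_model("agent_mode_excel"))
print(select_model("office_agent_chat"))
```

Keeping the mapping in one table means a provider can be swapped for a single role without touching the others, which is the practical upshot of selecting models by role rather than by vendor.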

What to Watch:

  • User adoption of Agent Mode given its 57.2% accuracy on SpreadsheetBench versus the 71.3% human benchmark.
  • Expansion of Office Agent beyond U.S. Personal/Family subscribers to commercial customers.
  • Integration of PowerPoint into Agent Mode following Excel and Word.
  • The interplay between OpenAI-powered Office apps and Anthropic-powered Copilot chat.

See the complete blog post on the introduction of Agent Mode and Office Agent on the Microsoft website.

Disclosure: Futurum is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.

Analysis and opinions expressed herein are specific to the analyst individually, based on data and other information that might have been provided for validation, and are not those of Futurum as a whole.

Other insights from Futurum:

Microsoft Q4 FY 2025 Earnings Beat Driven by 39% Azure Growth

Microsoft Reimagines Marketplace: A New Battleground for AI Agents?

What Role Will Microsoft 365 Copilot Agents Play in Enterprise Workflows?

Author Information

Nick Patience is VP and Practice Lead for AI Platforms at The Futurum Group. Nick is a thought leader on AI development, deployment, and adoption, an area he has researched for 25 years. Before Futurum, Nick was a Managing Analyst with S&P Global Market Intelligence, responsible for 451 Research’s coverage of Data, AI, Analytics, Information Security, and Risk. Nick became part of S&P Global through its 2019 acquisition of 451 Research, a pioneering analyst firm that Nick co-founded in 1999. He is a sought-after speaker and advisor, known for his expertise in the drivers of AI adoption, industry use cases, and the infrastructure behind its development and deployment. Nick also spent three years as a product marketing lead at Recommind (now part of OpenText), a machine learning-driven eDiscovery software company. Nick is based in London.
