Microsoft Copilot Meets DeepSeek: The Multimodel AI Shift

Quick Summary
Microsoft is exploring DeepSeek inside Copilot Co-Work while selling Western AI to China. Here's what the multimodel pivot means for enterprise AI.
When a Western Tech Giant Opens the Door to a Chinese AI Model
Microsoft embedding a Chinese AI model inside one of its flagship enterprise products would have sounded implausible eighteen months ago. Yet that is precisely what is now on the table. According to reporting by Axios, Microsoft is exploring a fine-tuned version of DeepSeek as a lower-cost model option inside Copilot Co-Work — its most powerful, agentic product to date. This is not a curiosity. It is a signal that the enterprise AI market is entering a fundamentally different phase: multimodel, cost-aware, and geopolitically complicated in ways that no single company can fully untangle.
To understand why this matters, you need to understand what Copilot Co-Work actually does, why AI agents are so expensive to run, and what Microsoft's broader position in the global AI market really looks like.
What Copilot Co-Work Actually Does — and Why It Costs So Much
Copilot Co-Work is not a chatbot. It is Microsoft's agentic AI layer, designed for long-running, multi-step tasks that traditional prompt-response systems cannot handle. Where a standard Copilot session might summarise a meeting or draft an email, Co-Work can ingest thousands of files, reason across internal databases, call external tools, generate structured outputs, and return a finished result — all without a human steering each step.
Microsoft reports that during its three-month preview inside the Frontier programme, more than half of Fortune 500 companies used the product. The use cases are telling. One engineering team used it to edit batch job spreadsheets and auto-generate dependency flowcharts after every change. Another compared nearly 4,000 files across two product versions, work Microsoft estimates would have taken weeks manually. A sales team received a ranked list of at-risk pipeline opportunities, complete with the specific follow-up touchpoints that had gone cold.
Those outcomes sound impressive precisely because they are. But they come at a real computational cost. Agentic AI systems do not issue one model call per task. They iterate. They retrieve context, call tools, check intermediate outputs, spawn sub-tasks, and loop until the job is done. A single Co-Work session can involve dozens of model calls, multiple retrieval steps, and extended cloud runtime. The more useful these systems become, the more tasks users assign them. The more tasks they run, the more compute burns.
Microsoft has acknowledged this directly. Charles Lamanna, the company's executive vice president for Copilot agents and platform, told Axios that some users were running hundreds of tasks per week. The tool was working, and that was the problem: unlimited use was no longer financially viable.
The result is a shift to usage-based billing. Co-Work now runs on a pay-as-you-go model, with pricing measured in Copilot Credits at one cent each. Task cost depends on four variables: model use, context retrieval, tool calls, and runtime. Microsoft has defined three task tiers — light, medium, and heavy — along with four user personas (corporate knowledge workers, management, customer-facing staff, and technical workers), giving enterprises a framework to forecast spend before committing.
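To make the billing model concrete, here is a minimal Python sketch of how a single task's cost could be estimated from the four variables named above. Only the one-cent credit price comes from the reporting; the per-unit credit rates, the TaskUsage structure, and the function names are hypothetical placeholders, not Microsoft's published pricing.

```python
from dataclasses import dataclass

CREDIT_PRICE_USD = 0.01  # from the article: one Copilot Credit costs one cent


@dataclass(frozen=True)
class TaskUsage:
    """Hypothetical usage record covering the four cost variables."""
    model_calls: int
    retrievals: int
    tool_calls: int
    runtime_minutes: float


# ASSUMPTION: illustrative credits charged per unit, not real Microsoft rates.
ASSUMED_RATES = {
    "model_call": 2.0,
    "retrieval": 0.5,
    "tool_call": 1.0,
    "runtime_minute": 0.25,
}


def task_credits(usage, rates=ASSUMED_RATES):
    """Sum the credits consumed across the four cost variables."""
    return (
        usage.model_calls * rates["model_call"]
        + usage.retrievals * rates["retrieval"]
        + usage.tool_calls * rates["tool_call"]
        + usage.runtime_minutes * rates["runtime_minute"]
    )


def task_cost_usd(usage, rates=ASSUMED_RATES):
    """Convert credits to dollars at one cent per credit."""
    return task_credits(usage, rates) * CREDIT_PRICE_USD


usage = TaskUsage(model_calls=40, retrievals=12, tool_calls=8, runtime_minutes=20)
print(f"{task_credits(usage):.0f} credits = ${task_cost_usd(usage):.2f}")
# prints "99 credits = $0.99"
```

Under these made-up rates, model calls account for most of the cost, which is why the choice of model per task matters so much to the final bill.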
The DeepSeek Option: Cost Engineering at the Model Layer
This is precisely where DeepSeek enters. Right now, Co-Work runs primarily on Anthropic models — Opus 4.8 and Sonnet 4.6 — with GPT-4.5 available inside the Frontier programme. Microsoft's own Co-Work 1 model is also imminent: a post-trained, fine-tuned model described as optimised for everyday Co-Work tasks at substantially lower cost.
But Axios reports Microsoft is also evaluating a fine-tuned DeepSeek V4 as an additional lower-cost option. The architecture here is important. DeepSeek's models use a Mixture-of-Experts (MoE) design, activating only a subset of parameters per inference call. That is part of why DeepSeek R1 caused such a stir when it launched — it delivered near-frontier reasoning performance at a fraction of the compute cost of comparable dense models. For high-volume agentic workloads where many tasks are routine rather than frontier-level, routing those jobs to a cheaper, capable model is straightforward cost engineering.
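The core idea behind Mixture-of-Experts can be sketched in a few lines of Python. This is a toy illustration of top-k expert routing under stated assumptions, not DeepSeek's actual implementation: the experts are simple scalar functions, the gate scores are supplied by hand, and the parameter counts in active_fraction are invented to show the arithmetic.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(gate_logits, k):
    """Return [(expert_index, weight)] for the k highest-scoring experts."""
    ranked = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([gate_logits[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_forward(x, experts, gate_logits, k):
    """Run only the selected experts and combine their outputs by gate weight."""
    return sum(w * experts[i](x) for i, w in top_k_route(gate_logits, k))


def active_fraction(num_experts, k, expert_params, shared_params):
    """Share of total parameters touched per call when k of num_experts run."""
    total = shared_params + num_experts * expert_params
    active = shared_params + k * expert_params
    return active / total


# Toy experts: each just scales its input by a different factor.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
gate_logits = [0.0, 3.0, 1.0, 3.0]  # hand-picked scores; experts 1 and 3 win

print(top_k_route(gate_logits, k=2))
print(moe_forward(2.0, experts, gate_logits, k=2))
print(active_fraction(num_experts=8, k=2, expert_params=100, shared_params=200))
```

In the last call, only 400 of 1,000 hypothetical parameters are used per inference, which is the mechanism behind the lower compute cost per call.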
If Microsoft adopts it, DeepSeek would not be the default. It would be an opt-in option, fully hosted on Azure, with customer data remaining inside Microsoft's cloud and covered by Azure's enterprise security, compliance, and data residency controls. Microsoft says it has fine-tuned the model and applied additional safety measures. This is not Microsoft replacing OpenAI. It is Microsoft building a routing layer that selects the right model for each task based on cost, capability, and security requirements — much like a cloud database service might choose between storage tiers depending on access frequency.
Think of it like airline seat classes applied to inference: you pay for GPT-4.5 or Opus 4.8 when the task demands it, and you route lighter workloads to a cheaper capable model when it does not. The output quality target stays roughly constant; the cost per completed task drops significantly at scale.
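A routing layer of this kind could look roughly like the Python sketch below. Everything in it is an assumption for illustration: the model names are generic placeholders, the cost and capability numbers are invented, and the selection rule (cheapest model that meets the task's capability requirement, with opt-in models excluded unless an admin enables them) is one plausible policy rather than Microsoft's documented behaviour.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str
    cost_per_call: float  # hypothetical relative cost
    capability: int       # hypothetical score, 1 (basic) to 5 (frontier)
    opt_in: bool = False  # must be explicitly enabled before use


@dataclass(frozen=True)
class Task:
    required_capability: int


# ASSUMPTION: placeholder catalogue, not a real model lineup or price list.
CATALOGUE = [
    ModelOption("frontier-model", cost_per_call=10.0, capability=5),
    ModelOption("everyday-model", cost_per_call=3.0, capability=3),
    ModelOption("low-cost-opt-in", cost_per_call=1.0, capability=3, opt_in=True),
]


def route(task, catalogue=CATALOGUE, enabled_opt_ins=frozenset()):
    """Pick the cheapest permitted model that is capable enough, or None."""
    eligible = [
        m for m in catalogue
        if m.capability >= task.required_capability
        and (not m.opt_in or m.name in enabled_opt_ins)
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda m: m.cost_per_call)


routine = Task(required_capability=2)
print(route(routine).name)
print(route(routine, enabled_opt_ins=frozenset({"low-cost-opt-in"})).name)
print(route(Task(required_capability=5)).name)
```

The opt-in flag mirrors the reported design: the cheaper model is never selected unless the customer has turned it on, while demanding tasks still go to the most capable option.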
Microsoft's AI Business in China: A Complicated Bridge
The DeepSeek story becomes considerably more interesting when you examine Microsoft's existing position in China's AI market. Bloomberg has reported that Microsoft has quietly built a substantial business selling AI model access — including OpenAI models via Azure — to Chinese technology companies. ByteDance has reportedly been among Microsoft's largest AI customers in China, on track to spend over $1 billion annually on Microsoft AI and cloud services. Ant Group, Meituan, and Tencent are also significant spenders.
This is possible because OpenAI and Anthropic do not sell their models directly in China. Their restrictions reflect concerns about intellectual property and potential model distillation. Microsoft, however, holds a unique partnership arrangement with OpenAI and sets its own resale policies for Azure. Chinese customers access these models over the internet from facilities in third countries such as Singapore, not from Microsoft's China-based data centres, specifically to reduce IP exposure risk.
The scale of the growth is striking. During an internal sales meeting in July 2025, Microsoft's then chief commercial officer reportedly said Azure AI revenue in China had roughly tripled in the fiscal year ending June 2025, following 400% growth the year before. His framing of Microsoft's strategy was explicit: the world's most elite AI solutions are being built on the western coast of the United States and the eastern coast of China, and Microsoft is the company connecting those two places.
That positioning is commercially logical. It is also fragile. OpenAI has reportedly raised concerns privately that Microsoft is not doing enough to prevent Chinese companies from using model outputs to improve their own systems — a process sometimes called distillation. The line between legitimate enterprise use and systematic capability extraction is genuinely blurry. A company with access to frontier model outputs can use them for synthetic training data, evaluation benchmarks, coding assistance, or internal research, all of which can feed back into model development in ways that are difficult to monitor or prevent.
Microsoft uses automated monitoring to flag policy violations, but Bloomberg reports no heightened surveillance applies to Chinese customers specifically. This is a tension that usage policies alone cannot resolve.
Web IQ: Owning the Full Agent Stack
Alongside Co-Work and the multimodel strategy, Microsoft has introduced Web IQ — a Bing-powered grounding system rebuilt specifically for AI agents rather than human searchers. The distinction matters more than it might initially seem.
Conventional search engines optimise for human consumption: ranked links, snippets, images, ads. Agents search differently. They fan out across multiple queries simultaneously, retrieve specific passages, cross-reference sources, and feed results back into reasoning loops — potentially dozens or hundreds of times during a single complex task. They need low latency, token-efficient responses, and fresh data, not a results page designed for a human to skim.
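The fan-out pattern described above can be sketched with Python's asyncio. This is a generic illustration, not the Web IQ API: fake_search is a stand-in that returns canned passages, and the fan_out helper and its max_concurrency parameter are hypothetical names chosen for the example.

```python
import asyncio


async def fake_search(query):
    """ASSUMPTION: stand-in for a real search call; returns a canned passage."""
    await asyncio.sleep(0.01)  # simulate network latency
    return [f"passage about {query}"]


async def fan_out(queries, search=fake_search, max_concurrency=4):
    """Run many searches concurrently, capped at max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(query):
        async with semaphore:
            return query, await search(query)

    pairs = await asyncio.gather(*(bounded(q) for q in queries))
    return dict(pairs)


queries = ["vendor pricing", "release notes", "known outages"]
results = asyncio.run(fan_out(queries, max_concurrency=2))
for query, passages in results.items():
    print(query, "->", passages)
```

Because the queries run concurrently, total wait time is governed by the slowest batch rather than the sum of all calls, which is why per-query latency and compact responses matter more to agents than a polished results page.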
Microsoft claims Web IQ is approximately 2.5 times faster than the next best alternative for agent search workloads. That is a significant claim in a market that already includes Perplexity's API, Brave Search, Tavily, Exa, and Google's agent-oriented search tools. Some scepticism is warranted — in many real agent pipelines, LLM inference time, tool orchestration, and memory management dominate latency far more than search retrieval does. And Web IQ is currently limited to selected Azure customers in early access, so the performance advantage may be most pronounced within Microsoft's own infrastructure.
Still, Web IQ is strategically significant because it reveals Microsoft's intent: to own every layer of the enterprise agent stack. Model selection and routing. Search and grounding. Company memory and file access. Security and compliance controls. Billing and audit infrastructure. Cloud runtime. Co-Work is the product that assembles these layers into something an enterprise can actually deploy — with admin controls, budget caps, usage reporting, audit logs, and data loss prevention baked in.
What the Multimodel Pivot Actually Means for Enterprise AI
The broader pattern across all of these moves is worth naming clearly. Microsoft Copilot is no longer a product powered by one model. It is becoming an enterprise AI routing platform — a layer that matches tasks to models based on cost, capability, latency, security requirements, and data residency rules. OpenAI models for frontier work. Anthropic models for certain reasoning tasks. Microsoft's own Co-Work 1 for cost-sensitive everyday jobs. Potentially DeepSeek for high-volume, lower-complexity workloads where cost efficiency matters most.
This is how mature software markets tend to evolve. Early adopters accept a single vendor's stack because it works. As usage scales and cost becomes material, buyers demand optionality. Vendors who can offer model routing without sacrificing security or compliance will have a structural advantage over those locked to a single model provider.
For enterprise buyers, this creates both opportunity and complexity. Usage-based billing for agentic AI can deliver real ROI when tasks are well-defined and volume is predictable. But it also means that poorly scoped workflows — agents that iterate excessively, retrieve unnecessary context, or spawn redundant sub-tasks — can generate surprise costs quickly. Understanding your task distribution across light, medium, and heavy categories is not just a budgeting exercise. It is essential to deploying these systems responsibly.
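A rough way to reason about task distribution is to see which tier dominates spend. The Python sketch below assumes hypothetical credit costs per tier; only the tier names and the one-cent credit price come from the article, and the function names are illustrative.

```python
# ASSUMPTION: illustrative credits per task by tier, not Microsoft's figures.
ASSUMED_CREDITS_PER_TASK = {"light": 10, "medium": 50, "heavy": 250}
CREDITS_PER_USD = 100  # one credit costs one cent


def monthly_credits_by_tier(tasks_per_month, credits_per_task=ASSUMED_CREDITS_PER_TASK):
    """Credits consumed per tier for a given monthly task mix."""
    return {tier: count * credits_per_task[tier] for tier, count in tasks_per_month.items()}


def monthly_spend_usd(tasks_per_month, credits_per_task=ASSUMED_CREDITS_PER_TASK):
    """Total forecast spend in dollars."""
    return sum(monthly_credits_by_tier(tasks_per_month, credits_per_task).values()) / CREDITS_PER_USD


def spend_share(tasks_per_month, credits_per_task=ASSUMED_CREDITS_PER_TASK):
    """Fraction of total spend attributable to each tier."""
    by_tier = monthly_credits_by_tier(tasks_per_month, credits_per_task)
    total = sum(by_tier.values())
    return {tier: credits / total for tier, credits in by_tier.items()}


mix = {"light": 300, "medium": 80, "heavy": 10}
print(monthly_spend_usd(mix))  # 95.0
print(spend_share(mix))
```

In this invented mix, heavy tasks are under 3% of task volume but more than a quarter of spend, which is the kind of skew a poorly scoped workflow can make much worse.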
The geopolitical dimension will not go away. Microsoft sitting at the intersection of Western AI infrastructure and Chinese AI demand is a commercially powerful position, but it carries regulatory, reputational, and intellectual property risks that will only grow as AI competition between the US and China intensifies. The decision to potentially include a fine-tuned Chinese model inside a Western enterprise product — however carefully hosted and secured — is a preview of the kinds of choices every major AI platform will eventually face.
The infrastructure is being built. The routing logic is being written. Which models get routed where, and under what conditions, is now one of the most consequential decisions in enterprise technology.
Frequently Asked Questions
Is Microsoft replacing OpenAI with DeepSeek in Copilot?
No. Microsoft is not replacing OpenAI with DeepSeek. The reported plan is to offer DeepSeek as an optional lower-cost model within Copilot Co-Work, not as a default. Co-Work currently runs on Anthropic models and GPT-4.5, with Microsoft's own Co-Work 1 model also in development. DeepSeek would be one option among several, hosted on Azure with enterprise security controls applied.
Why is Microsoft moving Copilot Co-Work to usage-based billing?
Because agentic AI tasks are computationally expensive. Unlike a single prompt-response exchange, Co-Work sessions can involve many model calls, tool invocations, retrieval steps, and extended cloud runtime. When the product works well, users run hundreds of tasks per week. Microsoft found that unlimited use at a flat subscription price was not financially sustainable at that level of usage.
How does DeepSeek's architecture make it cost-effective for agentic workloads?
DeepSeek models use a Mixture-of-Experts (MoE) design, which activates only a fraction of total model parameters for each inference call. This reduces compute cost per call compared to dense models of equivalent nominal size. For high-volume agentic workflows where many tasks are routine rather than frontier-level, this architecture can deliver strong performance at significantly lower cost per completed task.
What is Microsoft Web IQ and how does it differ from regular search?
Web IQ is a Bing-powered grounding system designed specifically for AI agents. Unlike conventional search, which returns ranked links and snippets formatted for human readers, Web IQ returns concise, machine-readable passages optimised for low token consumption and low latency. Agents typically run many search queries per task session, so Web IQ is structured to handle high-frequency, parallel retrieval efficiently rather than optimising for a single human search session.
Can enterprise customers control which models Copilot Co-Work uses?
Microsoft is building administrative controls into Co-Work that allow IT and compliance teams to set budgets, track per-user spending, limit access, and generate audit logs. Model routing decisions — which model handles which task type — appear to be managed at the platform level, with DeepSeek being opt-in rather than default. Microsoft has indicated that customers will be able to configure usage parameters as the product matures.
About Zeebrain Editorial
Our editorial team is dedicated to providing clear, well-researched, and high-utility content for the modern digital landscape. We focus on accuracy, practicality, and insights that matter.


