Artificial intelligence applied to radio access networks (AI-RAN) has moved from conference slides to production infrastructure. In Q1 2026, several Tier-1 operators published independent field trial results comparing AI-optimized RAN against conventional configurations on identical hardware. For the first time, the industry has apples-to-apples performance data. The results are uneven: significant gains in some areas, marginal improvements in others, and a few cases where AI actually degraded performance. This report compiles and analyzes every publicly available benchmark.
Methodology: What We Measured and How
This report aggregates data from seven sources: T-Mobile US (March 2026 field trial, Denver metro), Rakuten Mobile (Q4 2025 production data, Tokyo), Deutsche Telekom (February 2026 trial, Berlin), SK Telecom (January 2026 trial, Seoul), Vodafone UK (March 2026 trial, London), NVIDIA Aerial platform benchmarks (lab conditions), and Nokia AirFrame AI benchmarks (lab + field). All results compare AI-enabled versus AI-disabled configurations on the same hardware, eliminating equipment variability.
Key metrics tracked: downlink throughput (Mbps per cell), uplink throughput, average latency (ms), 99th percentile latency, spectral efficiency (bps/Hz), energy consumption (kWh per TB transported), handover success rate, and call drop rate. Where operators reported different metrics, we normalize to common units and note methodology differences.
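The normalization step can be sketched as follows. This is a minimal illustration, not the pipeline used for this report: the input unit names (`Wh/GB`, `J/bit`) are assumptions about how an operator might report energy intensity, and decimal units are assumed (1 TB = 10^12 bytes).

```python
# Conversion factors into kWh per TB (decimal units: 1 TB = 1e12 bytes).
# The input unit names are illustrative assumptions, not operator formats.
JOULES_PER_KWH = 3.6e6
BITS_PER_TB = 8e12

TO_KWH_PER_TB = {
    "kWh/TB": 1.0,
    "Wh/GB": 1.0,  # 1 Wh/GB = 0.001 kWh / 0.001 TB
    "J/bit": BITS_PER_TB / JOULES_PER_KWH,
}


def to_kwh_per_tb(value: float, unit: str) -> float:
    """Convert an energy-intensity figure to kWh per TB transported."""
    return value * TO_KWH_PER_TB[unit]


def relative_change_pct(baseline: float, with_ai: float) -> float:
    """Signed percentage change of the AI-enabled figure against the baseline."""
    return (with_ai - baseline) / baseline * 100.0
```

With both configurations expressed in the same unit, every comparison in this report reduces to `relative_change_pct(baseline, with_ai)`: negative values are reductions (latency, energy) and positive values are gains (throughput, spectral efficiency).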
Throughput: +3-30% Gains, Highly Scenario-Dependent
The headline number most operators led with was throughput improvement. Across all seven sources, AI-RAN delivered:
- Dense urban (>1,000 users/km²): +15-22% average downlink throughput improvement. T-Mobile reported +18% in Denver's downtown core (from 285 Mbps to 336 Mbps average per cell). SK Telecom measured +22% in Gangnam district during peak hours (from 310 Mbps to 378 Mbps).
- Suburban (<500 users/km²): +8-12% improvement. Deutsche Telekom saw +11% in Berlin's outer districts. Vodafone reported +8% across suburban London sites.
- Rural (<50 users/km²): +3-5% — statistically significant but operationally minor. The AI scheduler has fewer users to optimize across, limiting multiuser diversity gains.
- Indoor (enterprise/stadium): +25-30% in high-density venues. Rakuten's deployment at Tokyo Dome showed +28% during a sold-out baseball game (45,000 concurrent devices).
The pattern is clear: AI-RAN's throughput advantage scales with user density. This aligns with the underlying mechanism — AI schedulers exploit multiuser diversity more effectively than proportional-fair algorithms when there are many users with diverse channel conditions to choose from.
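Why density matters can be illustrated with a toy simulation. The sketch below does not model any operator's AI scheduler, and it does not compare AI against proportional fair. It compares a channel-aware scheduler (serve whichever user has the best instantaneous rate) with channel-blind round robin, under assumed i.i.d. Rayleigh fading and an assumed mean SNR. All names and parameter values are hypothetical.

```python
import math
import random


def cell_throughput(num_users, slots=2000, mean_snr=10.0, seed=0):
    """Return (round_robin, channel_aware) mean rate in bits/s/Hz per slot."""
    rng = random.Random(seed)
    rr_total = 0.0
    best_total = 0.0
    for t in range(slots):
        # Rayleigh fading: instantaneous SNR is exponentially distributed.
        rates = [
            math.log2(1.0 + mean_snr * rng.expovariate(1.0))
            for _ in range(num_users)
        ]
        rr_total += rates[t % num_users]  # channel-blind: next user in turn
        best_total += max(rates)          # channel-aware: best user this slot
    return rr_total / slots, best_total / slots


def diversity_gain_pct(num_users, **kwargs):
    """Throughput gain of channel-aware over round-robin scheduling, in percent."""
    rr, best = cell_throughput(num_users, **kwargs)
    return (best / rr - 1.0) * 100.0


for n in (1, 2, 8, 32):
    print(n, round(diversity_gain_pct(n), 1))
```

The gain is zero with a single user and grows as users are added, because the chance that at least one user is on a fading peak rises with the number of candidates. Any scheduler that exploits channel knowledge, learned or not, has more to work with in a dense cell than in a rural one.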
Latency: The Most Surprising Result
Latency results were the most counterintuitive. Average latency improved modestly (5-15%), but 99th percentile (tail) latency — the metric that matters most for real-time applications — showed dramatic improvement in some trials and degradation in others.
| Operator | Avg Latency Change | P99 Latency Change | Notes |
|---|---|---|---|
| T-Mobile US | -12% | -35% | Denver metro, 5G NR n41 |
| Rakuten Mobile | -8% | -42% | Tokyo, O-RAN 4G+5G |
| Deutsche Telekom | -15% | +8% | Berlin, Nokia AirScale |
| SK Telecom | -10% | -28% | Seoul, Samsung vRAN |
| Vodafone UK | -6% | +12% | London, Ericsson RAN |
The divergence in P99 results is significant. Operators using O-RAN-based architectures (Rakuten, T-Mobile) saw large tail latency reductions, as did SK Telecom on Samsung vRAN, while those on traditional vendor stacks (Deutsche Telekom on Nokia, Vodafone on Ericsson) saw P99 increase. The likely explanation: O-RAN's Near-RT RIC keeps AI inference inside a control loop that can run as fast as 10 ms, while proprietary architectures add an extra inference hop that occasionally exceeds the latency budget under load. This looks like an architecture problem rather than an AI problem, and it suggests that AI-RAN benefits are greatest on disaggregated, O-RAN-compliant infrastructure.
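How average latency can improve while tail latency worsens is easy to show with synthetic numbers. The latencies below are invented for illustration and are not trial data. They assume that 2% of requests in the AI-enabled case take an extra inference hop that overruns the budget.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Synthetic latencies in ms (assumed values, not measurements).
baseline = [10.0] * 980 + [20.0] * 20
# AI-enabled: the typical request is faster, but the slow 2% get slower.
with_ai = [8.5] * 980 + [22.0] * 20

for name, data in (("baseline", baseline), ("with AI", with_ai)):
    print(name, round(statistics.mean(data), 2), percentile(data, 99))
```

In this example the mean falls from 10.2 ms to 8.77 ms (about -14%) while P99 rises from 20 ms to 22 ms (+10%), the same sign pattern as the Deutsche Telekom and Vodafone rows in the table.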
Energy Efficiency: The Business Case Driver
If throughput and latency improvements are incremental, energy savings are transformational — and this is where AI-RAN makes its strongest business case.
Every operator reported energy reductions, with remarkably consistent results:
- Sleep mode optimization: AI-controlled cell sleep/wake cycles reduced energy consumption by 18-25% during low-traffic periods (midnight to 6 AM). T-Mobile reported 22% overnight savings across 1,200 Denver sites. Deutsche Telekom achieved 25% — the highest figure — by allowing AI to shut down individual MIMO layers rather than entire cells.
- Beamforming optimization: AI-driven beam management reduced power amplifier consumption by 8-12% during peak hours. SK Telecom's Samsung vRAN deployment showed 11% peak-hour savings by predicting user movement and pre-steering beams rather than reactively adjusting.
- Cooling reduction: Lower RF power output translated to 5-8% cooling cost savings at tower sites. Vodafone reported 7% cooling reduction across 200 London macro sites.
- Total energy savings: Across all operators, AI-RAN reduced per-site energy consumption by 15-22% on average, translating to $8,000-15,000 annual savings per macro site depending on local electricity costs.
At scale, these numbers reshape the investment case. T-Mobile estimates $120 million in annual energy savings across its US network if AI-RAN is deployed nationwide. For Deutsche Telekom's European footprint, the projected saving is €200 million per year. Energy savings alone would pay back AI-RAN deployment within two to three years for most Tier-1 operators, even without counting throughput and latency gains.
Spectral Efficiency: +10-18% in Real Conditions
Spectral efficiency — bits per second per Hertz — is the fundamental measure of how well a radio system uses its allocated spectrum. AI-RAN improvements here have direct financial value, as they are equivalent to having more spectrum without buying it.
Results across operators:
- SK Telecom: +18% spectral efficiency improvement (from 7.2 bps/Hz to 8.5 bps/Hz average) on 3.5 GHz n78 band. This is the highest reported figure and reflects Samsung's aggressive AI scheduler, which uses a transformer-based model trained on 18 months of network data.
- T-Mobile: +14% (from 6.8 to 7.8 bps/Hz) on n41 (2.5 GHz). The AI model coordinates scheduling across three component carriers simultaneously.
- Rakuten: +12% across their 4G/5G shared spectrum. Notably, the AI scheduler dynamically reallocated spectrum between 4G and 5G in real-time based on demand — something their traditional scheduler could not do.
- Vodafone: +10% (from 5.9 to 6.5 bps/Hz) on 3.5 GHz. The lowest improvement, attributed to Vodafone's already-optimized Ericsson configuration leaving less headroom for AI gains.
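The "more spectrum without buying it" framing can be made concrete with a short calculation from the before/after figures above. The 100 MHz carrier width in the usage note is an assumption chosen for illustration. The operators' actual bandwidths are not stated here.

```python
def gain_pct(before, after):
    """Relative spectral-efficiency gain in percent."""
    return (after - before) / before * 100.0


def equivalent_extra_mhz(bandwidth_mhz, before, after):
    """Extra spectrum that would add the same capacity at the old efficiency."""
    return bandwidth_mhz * (after / before - 1.0)


# (before, after) in bps/Hz, as reported in the list above.
reported = {
    "SK Telecom": (7.2, 8.5),
    "T-Mobile": (6.8, 7.8),
    "Vodafone": (5.9, 6.5),
}
for operator, (before, after) in reported.items():
    print(f"{operator}: {gain_pct(before, after):.1f}%")
```

This prints 18.1%, 14.7%, and 10.2%, which agrees with the headline percentages to within about a point, as expected when the bps/Hz endpoints are themselves rounded. Under the assumed 100 MHz carrier, `equivalent_extra_mhz(100, 7.2, 8.5)` comes to roughly 18 MHz of additional spectrum.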
Handover and Mobility: Mixed Results
Handover performance — how smoothly users transition between cells — showed the most inconsistent results across trials. This is a critical metric for connected vehicles and mobile VR/AR applications.
T-Mobile reported handover success rate improvement from 98.2% to 99.1% with AI-assisted handover prediction. The AI model predicts which cell a user will move to 2-3 seconds before the handover event, allowing pre-preparation of target cell resources. However, false predictions (approximately 8% of cases) caused unnecessary resource reservation, slightly increasing interference on neighboring cells.
Rakuten's results were more dramatic: handover failure rate dropped from 1.5% to 0.4% in their Tokyo deployment. Their advantage: Rakuten's fully cloud-native, O-RAN architecture allows the AI model to access data from all neighboring cells simultaneously, rather than relying on measurement reports from the user device alone.
Deutsche Telekom and Vodafone reported no statistically significant change in handover performance. Both operators noted that their existing handover algorithms were already heavily tuned for European urban environments, and the AI models did not have sufficient training data to outperform decades of manual optimization.
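Success rates close to 100% hide how large these changes are, so it helps to restate them as reductions in failure rate. The sketch below uses only the T-Mobile and Rakuten figures quoted above. The function name is illustrative.

```python
def failure_reduction_pct(fail_before_pct, fail_after_pct):
    """Relative reduction in handover failure rate, in percent."""
    return (fail_before_pct - fail_after_pct) / fail_before_pct * 100.0


# T-Mobile reported success rates, so convert to failure rates first.
tmobile = failure_reduction_pct(100.0 - 98.2, 100.0 - 99.1)
# Rakuten reported failure rates directly.
rakuten = failure_reduction_pct(1.5, 0.4)

print(f"T-Mobile: {tmobile:.0f}% fewer failed handovers")
print(f"Rakuten: {rakuten:.0f}% fewer failed handovers")
```

On this view, T-Mobile's 0.9-point improvement corresponds to roughly half as many failed handovers, and Rakuten's result to about 73% fewer.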
NVIDIA Aerial: Lab vs Field Reality
NVIDIA's Aerial platform — a GPU-accelerated software RAN running on NVIDIA converged accelerators — published lab benchmarks claiming 40% throughput improvement and 50% energy reduction compared to traditional DSP-based RAN. These numbers have been widely cited in industry presentations.
The field trial data tells a more nuanced story. T-Mobile's Denver deployment, which used NVIDIA Aerial on a subset of sites, showed 18% throughput improvement — less than half the lab claim. The gap is explained by real-world factors absent from lab testing: inter-cell interference, non-ideal propagation, device diversity (lab tests use reference devices), and the computational overhead of running AI inference alongside real-time signal processing on shared GPU resources.
NVIDIA acknowledged the gap in a March 2026 blog post, noting that "lab benchmarks represent theoretical ceiling performance" and that "field deployments typically achieve 40-60% of lab gains depending on environment complexity." This is an honest assessment, and the 18% field result falls within NVIDIA's stated range.
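The range check is simple arithmetic, sketched here with the figures quoted above. The function name is illustrative.

```python
def field_fraction(lab_gain_pct, field_gain_pct):
    """Share of the lab gain that was realized in the field."""
    return field_gain_pct / lab_gain_pct


fraction = field_fraction(40.0, 18.0)          # T-Mobile field vs NVIDIA lab throughput
within_stated_range = 0.40 <= fraction <= 0.60

# Field gains implied by NVIDIA's 40-60% guidance for a 40% lab figure.
expected_field_range = (40.0 * 0.40, 40.0 * 0.60)
```

The Denver result realizes 45% of the lab throughput gain, inside the 16-24% field range that NVIDIA's guidance implies for a 40% lab figure.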
What NVIDIA Aerial does deliver consistently is operational flexibility. The platform supports over-the-air model updates, allowing operators to retrain AI models monthly as network conditions change. Traditional RAN vendors require software upgrades with downtime; Aerial updates models in the background without service interruption.
Nokia AirFrame AI: The Incumbent Approach
Nokia's approach differs fundamentally from NVIDIA's. Rather than replacing the RAN processor with GPUs, Nokia adds an AI inference accelerator alongside existing baseband hardware. The AirFrame AI module plugs into Nokia's AirScale platform and runs trained models that optimize scheduling, power control, and beamforming.
Nokia's published benchmarks show 12-15% throughput improvement and 18-20% energy savings, more conservative than NVIDIA's claims but closer to what operators actually measured in field trials. The Deutsche Telekom Berlin trial, running Nokia AirFrame AI, achieved 11% throughput improvement, just below Nokia's stated range, and 25% energy savings during low-traffic hours, above it.
The trade-off: Nokia's approach requires operator lock-in to the AirScale ecosystem. AI models are trained on Nokia's cloud platform and deployed through Nokia's management system. Operators cannot bring their own models or use third-party AI frameworks. For operators committed to Nokia's platform, this is acceptable. For those pursuing multi-vendor O-RAN strategies, it is a non-starter.
Samsung vRAN AI: Training on Operator Data
Samsung's AI-RAN strategy emphasizes training models on operator-specific data rather than generic network simulations. SK Telecom's trial used a Samsung vRAN model trained on 18 months of SKT's own network telemetry — 4.2 TB of scheduling decisions, channel measurements, and user mobility patterns.
The result: SK Telecom achieved the highest spectral efficiency improvement (+18%) of any operator in our dataset. Samsung attributes this to the model's deep familiarity with SKT's specific propagation environment, user behavior patterns, and traffic profiles. A generic model trained on synthetic data achieved only +7% in the same environment during a controlled comparison, which suggests that operator-specific training data is a major factor in AI-RAN performance, and possibly the most important one.
Samsung's approach has a significant scaling limitation: each operator deployment requires months of data collection and custom model training. Samsung is addressing this with a federated learning framework that allows multiple operators to jointly train models without sharing raw data, but this system is still in pilot phase with only SKT and KDDI participating as of Q1 2026.
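The core idea of training jointly without sharing raw data can be sketched with federated averaging (FedAvg), a widely used aggregation rule. Samsung's framework is not described in enough detail to say whether it uses this rule, so the code below is a generic illustration with hypothetical operators and weights, not Samsung's implementation.

```python
def federated_average(updates):
    """Sample-weighted mean of locally trained model weights.

    updates: list of (weights, num_samples) pairs, one per participant.
    Only weights and sample counts are exchanged, never raw telemetry.
    """
    total_samples = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [
        sum(weights[i] * n for weights, n in updates) / total_samples
        for i in range(dim)
    ]


# Two hypothetical operators, each contributing a locally trained model.
operator_a = ([0.20, -1.00, 0.50], 3000)
operator_b = ([0.40, -0.60, 0.10], 1000)
global_weights = federated_average([operator_a, operator_b])
```

Each participant trains on its own telemetry and uploads only the resulting weights. The aggregator weights each contribution by sample count, so the operator with more data has proportionally more influence on the shared model.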
The O-RAN ML Framework: Standardizing AI-RAN
The O-RAN Alliance's ML framework, specified in O-RAN WG2 technical specifications, aims to standardize how AI models are trained, deployed, and managed across multi-vendor RAN environments. As of Release 3 (finalized Q4 2025), the framework defines:
- Model hosting on Near-RT RIC (control loops from 10 ms to 1 second) and Non-RT RIC (control loops longer than 1 second)
- Standard interfaces for model-to-RAN communication: A1 (Non-RT RIC to Near-RT RIC) and E2 (Near-RT RIC to RAN nodes)
- Model lifecycle management: training, validation, deployment, monitoring, rollback
- Performance monitoring and drift detection — automatically flagging when a deployed model's performance degrades
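Drift detection of the kind listed above can be as simple as comparing a rolling KPI average against the value measured at validation time. The sketch below is a hypothetical illustration of that idea. The class name, threshold, and window size are assumptions, and it does not reproduce any interface defined in the O-RAN specifications.

```python
from collections import deque


class DriftMonitor:
    """Flags drift when the rolling mean of a KPI falls too far below its baseline."""

    def __init__(self, baseline, tolerance_pct=5.0, window=50):
        # Lowest acceptable rolling mean before the model is flagged.
        self.floor = baseline * (1.0 - tolerance_pct / 100.0)
        self.samples = deque(maxlen=window)

    def observe(self, kpi):
        """Record one KPI sample; return True if the model should be flagged."""
        self.samples.append(kpi)
        window_full = len(self.samples) == self.samples.maxlen
        rolling_mean = sum(self.samples) / len(self.samples)
        return window_full and rolling_mean < self.floor


# Example: spectral efficiency (bps/Hz) validated at 7.8, tiny window for brevity.
monitor = DriftMonitor(baseline=7.8, tolerance_pct=5.0, window=3)
flags = [monitor.observe(x) for x in (7.8, 7.7, 7.6, 7.0, 7.0)]
```

Here `flags` ends with `True` only once the three-sample average drops below 7.41 bps/Hz. In a real deployment the flag would feed the lifecycle step above it, for example by triggering a rollback to the previous model version.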
Rakuten's deployment is the only large-scale production implementation of the O-RAN ML framework. Their results validate the architecture: the Near-RT RIC achieved inference latency under 5 ms in 99.7% of scheduling intervals, confirming that real-time AI optimization is feasible within O-RAN's control loop constraints.
However, interoperability remains a challenge. In a February 2026 O-RAN Alliance plugfest, only 3 out of 8 vendor combinations successfully ran a common AI model across different RIC implementations. The specification leaves enough ambiguity in data formats and model interfaces that vendor-specific adaptations are still required.
Cost Analysis: What AI-RAN Actually Costs
Deploying AI-RAN is not free. Operators face three cost categories:
- Hardware: GPU accelerators (NVIDIA A100/H100 for Aerial) or AI coprocessors (Nokia AirFrame AI module) add $5,000-15,000 per site. For a Tier-1 operator with 30,000 sites, hardware cost ranges from $150 million to $450 million.
- Training infrastructure: Cloud computing for model training costs $2-5 million per operator per year. Samsung's operator-specific approach requires more training compute than generic models.
- Personnel: AI-RAN requires data scientists and ML engineers that most operators do not currently employ. Typical team size is 15-30 specialists, costing $3-8 million annually in salaries.
Against these costs, the energy savings alone ($120-200 million annually for a Tier-1 operator) produce a positive ROI within 2-3 years. Add revenue uplift from improved throughput and capacity (estimated at 3-5% increase in ARPU from better user experience), and the payback period drops to 18 months.
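The payback claim can be checked against the ranges quoted in this section. The sketch below is a simple undiscounted calculation that assumes savings begin in full on day one. It treats Deutsche Telekom's euro figure as dollars, as the text above does, and covers energy savings only.

```python
def payback_years(capex_musd, opex_musd_per_year, savings_musd_per_year):
    """Undiscounted payback period in years; infinite if savings never cover opex."""
    net_per_year = savings_musd_per_year - opex_musd_per_year
    if net_per_year <= 0:
        return float("inf")
    return capex_musd / net_per_year


# All figures in millions of dollars, taken from the ranges above.
# Opex = training infrastructure + personnel.
best = payback_years(150, 2 + 3, 200)          # cheapest hardware, highest savings
worst = payback_years(450, 5 + 8, 120)         # costliest hardware, lowest savings
midpoint = payback_years(300, 3.5 + 5.5, 160)  # midpoints of each range

print(f"best {best:.1f} y, midpoint {midpoint:.1f} y, worst {worst:.1f} y")
```

With these inputs the energy-only payback spans roughly 0.8 to 4.2 years, with the midpoint case at about 2 years. The outcome depends heavily on where an operator falls within the hardware cost range.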
Outlook: What Changes in 2027
Three developments will reshape AI-RAN benchmarks over the next 12 months:
1. Foundation models for RAN: Both NVIDIA and Ericsson have announced "foundation model" approaches — large AI models pre-trained on diverse network data that can be fine-tuned for specific operators with minimal additional data. If successful, this would eliminate Samsung's data collection bottleneck and democratize high-performance AI-RAN.
2. AI-native design on the road to 6G: The 3GPP Study Item on AI/ML for NR (Release 19, part of 5G-Advanced, expected Q3 2026) will standardize AI model exchange formats and performance requirements. This creates a common baseline for comparing AI-RAN implementations across vendors, something that does not exist today.
3. Edge inference hardware: Next-generation inference chips (NVIDIA Blackwell, Qualcomm Cloud AI 200, Intel Gaudi 3) will reduce per-site hardware cost by 40-60% while doubling inference throughput. This addresses the primary barrier to universal AI-RAN deployment: site-level economics.
The 2026 benchmarks establish a baseline. AI-RAN delivers measurable, reproducible benefits — but the magnitude varies significantly by architecture, vendor, and environment. The winners will be operators who invest in data collection infrastructure and O-RAN-compliant architectures that maximize AI's ability to optimize in real time. The technology works. The question is no longer whether AI improves RAN — it is how much, where, and at what cost.