AnyAi.fyi - Discover ANY AI to make more online for less.

venturebeat

Inside Ring-1T: Ant engineers solve reinforcement learning bottlenecks at trillion scale

China’s Ant Group, an affiliate of Alibaba, detailed technical information around its new model, Ring-1T, which the company said is “the first open-source reasoning model with one trillion total parameters.”Ring-1T aims to compete with other reasoning models like GPT-5 and the o-series from OpenAI, as well as Google’s Gemini 2.5. With the new release of the latest model, Ant extends the geopolitical debate over who will dominate the AI race: China or the US. Ant Group said Ring-1T is optimized for mathematical and logical problems, code generation and scientific problem-solving. “With approximately 50 billion activated parameters per token, Ring-1T achieves state-of-the-art performance across multiple challenging benchmarks — despite relying solely on natural language reasoning capabilities,” Ant said in a paper.Ring-1T, which was first released on preview in September, adopts the same architecture as Ling 2.0 and trained on the Ling-1T-base model the company released earlier this month. Ant said this allows the model to support up to 128,000 tokens.To train a model as large as Ring-1T, researchers had to develop new methods to scale reinforcement learning (RL).New methods of training
Ant Group developed three “interconnected innovations” to support the RL and training of Ring-1T, a challenge given the model's size and the typically large compute requirements it entails. These three are IcePop, C3PO++ and ASystem.IcePop removes noisy gradient updates to stabilize training without slowing inference. It helps eliminate catastrophic training-inference misalignment in RL. The researchers noted that when training models, particularly those using a mixture-of-experts (MoE) architecture like Ring-1T, there can often be a discrepancy in probability calculations. “This problem is particularly pronounced in the training of MoE models with RL due to the inherent usage of the dynamic routing mechanism. Additionally, in long CoT settings, these discrepancies can gradually accumulate across iterations and become further amplified,” the researchers said. IcePop “suppresses unstable training updates through double-sided masking calibration.”The next new method the researchers had to develop is C3PO++, an improved version of the C3PO system that Ant previously established. The method manages how Ring-1T and other extra-large parameter models generate and process training examples, or what they call rollouts, so GPUs don’t sit idle. The way it works would break work in rollouts into pieces to process in parallel. One group is the inference pool, which generates new data, and the other is the training pool, which collects results to update the model. C3PO++ creates a token budget to control how much data is processed, ensuring GPUs are used efficiently.The last new method, ASystem, adopts a SingleController+SPMD (Single Program, Multiple Data) architecture to enable asynchronous operations. Benchmark resultsAnt pointed Ring-1T to benchmarks measuring performance in mathematics, coding, logical reasoning and general tasks. They tested it against models such as DeepSeek-V3.1-Terminus-Thinking, Qwen-35B-A22B-Thinking-2507, Gemini 2.5 Pro and GPT-5 Thinking. In benchmark testing, Ring-1T performed strongly, coming in second to OpenAI’s GPT-5 across most benchmarks. Ant said that Ring-1T showed the best performance among all the open-weight models it tested. The model posted a 93.4% score on the AIME 25 leaderboard, second only to GPT-5. In coding, Ring-1T outperformed both DeepSeek and Qwen.“It indicates that our carefully synthesized dataset shapes Ring-1T’s robust performance on programming applications, which forms a strong foundation for future endeavors on agentic applications,” the company said. Ring-1T shows how much Chinese companies are investing in models Ring-1T is just the latest model from China aiming to dethrone GPT-5 and Gemini. Chinese companies have been releasing impressive models at a quick pace since the surprise launch of DeepSeek in January. Ant's parent company, Alibaba, recently released Qwen3-Omni, a multimodal model that natively unifies text, image, audio and video. DeepSeek has also continued to improve its models and earlier this month, launched DeepSeek-OCR. This new model reimagines how models process information. With Ring-1T and Ant’s development of new methods to train and scale extra-large models, the battle for AI dominance between the US and China continues to heat up.

Discover Copy

Rating

Innovation

Pricing

Technology

Usability

We have discovered similar tools to what you are looking for. Check out our suggestions for similar AI tools.

Oura Ring 4 long-term review: Out ahead of its rivals

Smart rings have been a niche inside a niche in the wearables world for more than a decade. But in the last few years, they’ve enjoyed a renaissance as more attention and hype brought bigge [...]

More Copy

Match Score: 147.60

venturebeat

Self-improving language models are becoming reality with MIT's updated

Researchers at the Massachusetts Institute of Technology (MIT) are gaining renewed attention for developing and <a href="https://github.com/Continual-Intelligence/SEAL/blob/main/LICEN [...]

More Copy

Match Score: 89.87

The best smart scales for 2025

The New Year is here and there’s no better time to kickstart those health and fitness goals. Whether you’re looking to shed a few holiday pounds, track your muscle gains or simply stay on [...]

More Copy

Match Score: 85.17

venturebeat

Research finds that 77% of data engineers have heavier workloads despite AI

Data engineers should be working faster than ever. AI-powered tools promise to automate pipeline optimization, accelerate data integration and handle the repetitive grunt work that has define [...]

More Copy

Match Score: 82.13

The best smart rings for 2025

It’s getting increasingly difficult to say smart rings are just a niche in [...]

More Copy

Match Score: 75.02

venturebeat

Cerebras says its chips run a trillion-parameter AI model nearly 7 times fa

Less than a week after completing the largest tech IPO of 2026, <a href="https://www.cerebras.ai/">Cerebras Systems</a> is making its most aggressive play yet to dominat [...]

More Copy

Match Score: 66.30

venturebeat

MIT's new fine-tuning method lets LLMs learn new skills without losing

When enterprises fine-tune LLMs for new tasks, they risk breaking everything the models already know. This forces companies to maintain separate models for every skill.Rese [...]

More Copy

Match Score: 63.58

venturebeat

Thinking Machines challenges OpenAI's AI scaling strategy: 'First

While the world's leading artificial intelligence companies race to build ever-larger models, betting billions that scale alone will unlock artificial general intelligence, a researc [...]

More Copy

Match Score: 62.01

venturebeat

Nvidia researchers boost LLMs reasoning skills by getting them to 'thi

Researchers at Nvidia have developed a new technique that flips the script on how large language models (LLMs) learn to reason. The method, called <a href="https:// [...]

More Copy

Match Score: 61.62