Grok-3 exemplifies uncompromising scale, powered by 200,000 NVIDIA H100 GPUs in pursuit of cutting-edge advances. In contrast, DeepSeek-R1 achieves comparable performance using a fraction of the computational resources, showcasing how architectural innovation and data curation can effectively rival sheer processing power.
Since February, DeepSeek has captured global attention by open-sourcing its flagship reasoning model, DeepSeek-R1, which has demonstrated performance on par with some of the world’s leading AI systems.
“What sets it apart isn’t just its elite capabilities, but the fact that it was trained using only 2,000 NVIDIA H800 GPUs, a scaled-down, export-compliant alternative to the H100, making its achievement a masterclass in efficiency,” said Wei Sun, principal analyst in AI at Counterpoint.
Musk’s xAI has unveiled Grok-3, its most advanced model to date, which slightly outperforms DeepSeek-R1, OpenAI’s GPT-o1 and Google’s Gemini 2. “Unlike DeepSeek-R1, Grok-3 is proprietary and was trained using a staggering 200,000 H100 GPUs on xAI’s supercomputer Colossus, representing a giant leap in computational scale,” said Sun.
Grok-3 embodies the brute-force strategy: massive compute scale (representing billions of dollars in GPU costs) driving incremental performance gains. It’s a route only the wealthiest tech giants or governments can realistically pursue.
“In contrast, DeepSeek-R1 demonstrates the power of algorithmic ingenuity by leveraging techniques like Mixture-of-Experts (MoE) and reinforcement learning for reasoning, combined with curated and high-quality data, to achieve comparable results with a fraction of the compute,” explained Sun.
Grok-3 proves that throwing 100x more GPUs at a model can yield marginal performance gains rapidly. But it also highlights rapidly diminishing returns on investment (ROI), as most real-world users see minimal benefit from incremental improvements. In essence, DeepSeek-R1 is about achieving elite performance with minimal hardware overhead, while Grok-3 is about pushing boundaries by any computational means necessary, said the report. (With IANS Inputs)







