Nvidia’s new Llama-3.1 Nemotron Ultra outperforms DeepSeek R1 at half the size

Technology

Editorial Board | Published April 8, 2025 | Last updated: April 8, 2025, 6:35 pm
Even as Meta fends off questions and criticism of its new Llama 4 model family, graphics processing unit (GPU) leader Nvidia has released a new, fully open source large language model (LLM) based on Meta's older Llama-3.1-405B-Instruct model, and it is claiming near-top performance on a variety of third-party benchmarks, outperforming the vaunted rival open source reasoning model DeepSeek R1.

Llama-3.1-Nemotron-Ultra-253B-v1 is a dense 253-billion-parameter model designed to support advanced reasoning, instruction following, and AI assistant workflows. It was first mentioned at Nvidia's annual GPU Technology Conference (GTC) in March.

The release reflects Nvidia's continued focus on performance optimization through architectural innovation and targeted post-training.

Announced on the evening of April 7, 2025, the model code is now publicly available on Hugging Face, with open weights and post-training data. It is designed to operate efficiently in both "reasoning on" and "reasoning off" modes, allowing developers to toggle between high-complexity reasoning tasks and more straightforward outputs based on system prompts.
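As a rough illustration of that toggle, the sketch below shows how the two modes could be selected purely through the system prompt. The exact control strings ("detailed thinking on" / "detailed thinking off") follow the convention used on the model's Hugging Face card and are an assumption here, not something spelled out in this article.

```python
# Minimal sketch: choosing "reasoning on" vs. "reasoning off" via the system prompt.
# The control strings below are assumed from the model-card convention, not from this article.

def build_messages(user_prompt: str, reasoning: bool) -> list[dict]:
    system_prompt = "detailed thinking on" if reasoning else "detailed thinking off"
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

# Reasoning mode for a harder, multi-step task; plain mode for a simple lookup.
hard_task = build_messages("Prove that the sum of two even integers is even.", reasoning=True)
simple_task = build_messages("What is the capital of France?", reasoning=False)
```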

Designed for efficient inference

Llama-3.1-Nemotron-Ultra-253B builds on Nvidia's earlier work in inference-optimized LLM development. Its architecture, customized through a Neural Architecture Search (NAS) process, introduces structural variations such as skipped attention layers, fused feedforward networks (FFNs), and variable FFN compression ratios.

This architectural overhaul reduces memory footprint and computational demands without severely impacting output quality, enabling deployment on a single 8x H100 GPU node.
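As a back-of-the-envelope check on that claim (an illustration, not a figure from Nvidia), the weights of a dense 253-billion-parameter model occupy roughly 506 GB in BF16 or about 253 GB in FP8, against the 640 GB of combined memory on an 8x H100 (80 GB) node. The sketch below does only that arithmetic and ignores KV cache and activation overhead.

```python
# Rough arithmetic only: weight memory of a dense 253B-parameter model vs. an
# 8x H100 80GB node. Ignores KV cache, activations, and framework overhead.
PARAMS = 253e9
BYTES_PER_PARAM = {"BF16": 2, "FP8": 1}
NODE_MEMORY_GB = 8 * 80  # eight H100 80GB GPUs

for precision, nbytes in BYTES_PER_PARAM.items():
    weights_gb = PARAMS * nbytes / 1e9
    print(f"{precision}: ~{weights_gb:.0f} GB of weights "
          f"({weights_gb / NODE_MEMORY_GB:.0%} of {NODE_MEMORY_GB} GB node memory)")
```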

The result, according to Nvidia, is a model that offers strong performance while being more cost-effective to deploy in data center environments. Additional hardware compatibility includes support for Nvidia's B100 and Hopper microarchitectures, with configurations validated in both BF16 and FP8 precision modes.

Post-training for reasoning and alignment

Nvidia enhanced the base model through a multi-phase post-training pipeline. This included supervised fine-tuning across domains such as math, code generation, chat, and tool use, followed by reinforcement learning with Group Relative Policy Optimization (GRPO) to further boost instruction-following and reasoning performance.

The model underwent a knowledge distillation phase over 65 billion tokens, followed by continual pretraining on an additional 88 billion tokens.

Training datasets included sources such as FineWeb, Buzz-V1.2, and Dolma. Post-training prompts and responses were drawn from a combination of public corpora and synthetic generation methods, including datasets that taught the model to differentiate between its reasoning modes.

Improved performance across numerous domains and benchmarks

Evaluation results show notable gains when the model operates in reasoning-enabled mode. For instance, on the MATH500 benchmark, performance increased from 80.40% in standard mode to 97.00% with reasoning enabled.

Similarly, results on the AIME25 benchmark rose from 16.67% to 72.50%, and LiveCodeBench scores more than doubled, jumping from 29.03% to 66.31%.

Performance gains were also observed in tool-based tasks such as BFCL V2 and function composition, as well as in general question answering (GPQA), where the model scored 76.01% in reasoning mode versus 56.60% without.

These benchmarks were run with a maximum sequence length of 32,000 tokens, and each test was repeated up to 16 times to ensure accuracy.

Compared to DeepSeek R1, a state-of-the-art mixture-of-experts (MoE) model with 671 billion total parameters, Llama-3.1-Nemotron-Ultra-253B shows competitive results despite having less than half the number of parameters, outperforming it on tasks such as GPQA (76.01 vs. 71.5), IFEval instruction following (89.45 vs. 83.3), and LiveCodeBench coding tasks (66.31 vs. 65.9).

Meanwhile, DeepSeek R1 holds a clear advantage on certain math evaluations, notably AIME25 (79.8 vs. 72.50), and slightly edges out the new model on MATH500 (97.3 vs. 97.00).

These results suggest that despite being a dense model, Nvidia's offering matches or exceeds MoE alternatives on reasoning and general instruction-alignment tasks, while trailing slightly in math-heavy categories.

Usage and integration

The model is compatible with the Hugging Face Transformers library (version 4.48.3 recommended) and supports input and output sequences of up to 128,000 tokens.

Developers can control reasoning behavior via system prompts and select decoding strategies based on task requirements.

For reasoning tasks, Nvidia recommends temperature sampling (0.6) with a top-p value of 0.95. For deterministic outputs, greedy decoding is preferred.
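Putting those recommendations together, here is a minimal sketch of what loading and querying the model through Transformers might look like. The repository ID, the trust_remote_code flag for the NAS-customized architecture, and the "detailed thinking on" system prompt are assumptions based on common conventions for this model family; the sampling settings are the ones Nvidia recommends above.

```python
# Minimal sketch of querying the model via Hugging Face Transformers.
# Repo ID, trust_remote_code, and the system-prompt string are assumptions;
# sampling settings follow Nvidia's guidance (temperature 0.6, top-p 0.95).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "nvidia/Llama-3_1-Nemotron-Ultra-253B-v1"  # assumed repository name

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,  # may be required for the NAS-customized architecture
)

messages = [
    {"role": "system", "content": "detailed thinking on"},  # toggles reasoning mode
    {"role": "user", "content": "A train leaves at 3pm traveling 60 mph. How far does it travel by 5:30pm?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Reasoning tasks: sample with temperature 0.6 and top-p 0.95.
# For deterministic outputs, switch to greedy decoding (do_sample=False).
output_ids = model.generate(
    input_ids, max_new_tokens=1024, do_sample=True, temperature=0.6, top_p=0.95
)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```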

Llama-3.1-Nemotron-Ultra-253B supports multilingual applications, with capabilities in English and several additional languages, including German, French, Italian, Portuguese, Hindi, Spanish, and Thai.

It is also suited to common LLM use cases such as chatbot development, AI agent workflows, retrieval-augmented generation (RAG), and code generation.

Licensed for commercial use

Released under the Nvidia Open Model License and governed by the Llama 3.1 Community License Agreement, the model is ready for commercial use.

Nvidia has emphasized the importance of responsible AI development, encouraging teams to evaluate the model's alignment, safety, and bias profiles for their specific use cases.

Oleksii Kuchaiev, Director of AI Model Post-Training at Nvidia, shared the announcement on X, saying the team was excited about the open release and describing it as a dense 253B model with toggleable ON/OFF reasoning, released with open weights and data.

