NEW YORK DAWN™
Nvidia launches fully open source transcription AI model Parakeet-TDT-0.6B-V2 on Hugging Face

Last updated: May 5, 2025 9:30 pm
Editorial Board Published May 5, 2025

Nvidia has become one of the most valuable companies in the world in recent years, thanks to the stock market noticing how much demand there is for graphics processing units (GPUs), the powerful chips Nvidia makes that are used to render graphics in video games but also, increasingly, to train AI large language and diffusion models.

But Nvidia does far more than just make hardware and the software to run it. As the generative AI era wears on, the Santa Clara-based company has also been steadily releasing more and more of its own AI models, mostly open source and free for researchers and developers to download, modify and use commercially. The latest among them is Parakeet-TDT-0.6B-v2, an automatic speech recognition (ASR) model that can, in the words of Hugging Face’s Vaibhav “VB” Srivastav, “transcribe 60 minutes of audio in 1 second [mind blown emoji].”

This is the new generation of the Parakeet model Nvidia first unveiled back in January 2024 and updated again in April of that year. Version two is powerful enough that it currently tops the Hugging Face Open ASR Leaderboard with an average “Word Error Rate” (WER, the share of spoken words the model transcribes incorrectly) of just 6.05%.
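For readers unfamiliar with the metric, word error rate is conventionally computed as the number of word substitutions, deletions and insertions needed to turn the model's output into the reference transcript, divided by the number of words in the reference. The sketch below is a minimal illustration of that standard definition, not the leaderboard's own scoring code; the function name is made up for this example, and it only lowercases the text, whereas real evaluations typically apply heavier text normalization first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the word-level edit distance between the reference words
    # seen so far (minus the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One wrong word out of six reference words gives a WER of 1/6.
print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

On this scale, a 6.05% average WER corresponds to roughly six word-level errors per hundred reference words.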

To put that in perspective, it nears proprietary transcription models such as OpenAI’s GPT-4o-transcribe (with a WER of 2.46% in English) and ElevenLabs Scribe (3.3%).

And it offers all this while remaining freely available under a commercially permissive Creative Commons CC-BY-4.0 license, making it an attractive proposition for commercial enterprises and indie developers looking to build speech recognition and transcription services into their paid applications.

Performance and benchmark standing

The model has 600 million parameters and combines the FastConformer encoder and TDT decoder architectures.

It is capable of transcribing an hour of audio in just one second, provided it is running on Nvidia’s GPU-accelerated hardware.

That speed is reported as an RTFx (inverse real-time factor, the seconds of audio transcribed per second of processing) of 3386.02 with a batch size of 128, placing it at the top of current ASR benchmarks maintained by Hugging Face.
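The "hour in a second" claim follows directly from the RTFx figure quoted above. As a quick sanity check, assuming RTFx means seconds of audio processed per second of compute (the helper name here is purely illustrative):

```python
def processing_seconds(audio_seconds: float, rtfx: float) -> float:
    """Estimated compute time for a clip, given an inverse real-time factor."""
    if rtfx <= 0:
        raise ValueError("rtfx must be positive")
    return audio_seconds / rtfx


RTFX = 3386.02       # figure reported for a batch size of 128
ONE_HOUR = 60 * 60   # seconds of audio

# 3600 / 3386.02 is a little over one second.
print(f"{processing_seconds(ONE_HOUR, RTFX):.2f} s")  # prints "1.06 s"
```

This is a throughput figure measured with batching on benchmark hardware, so a single file on a smaller GPU should not be expected to hit the same number.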

Use cases and availability

Released globally on May 1, 2025, Parakeet-TDT-0.6B-v2 is aimed at developers, researchers, and industry teams building applications such as transcription services, voice assistants, subtitle generators, and conversational AI platforms.

The model supports punctuation, capitalization, and detailed word-level timestamping, offering a full transcription package for a wide range of speech-to-text needs.
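Word-level timestamps are what make the subtitle use case practical. The sketch below shows one way such output could be grouped into SubRip (SRT) cues. It assumes each word arrives as a dictionary with `word`, `start` and `end` keys (times in seconds); that layout and both function names are assumptions made for illustration, not a documented output format.

```python
def srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words, max_words=7):
    """Group timestamped words into numbered SRT cues of up to max_words each."""
    cues = []
    for number, i in enumerate(range(0, len(words), max_words), start=1):
        chunk = words[i:i + max_words]
        text = " ".join(w["word"] for w in chunk)
        start = srt_time(chunk[0]["start"])   # first word opens the cue
        end = srt_time(chunk[-1]["end"])      # last word closes it
        cues.append(f"{number}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)


# Hypothetical word timings for a two-word utterance.
sample = [
    {"word": "Hello", "start": 0.0, "end": 0.4},
    {"word": "world.", "start": 0.5, "end": 0.9},
]
print(words_to_srt(sample))
```

Splitting on a fixed word count is the simplest possible policy; a production subtitle tool would more likely break on punctuation or pauses.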

Access and deployment

Developers can deploy the model using Nvidia’s NeMo toolkit. The setup process is compatible with Python and PyTorch, and the model can be used directly or fine-tuned for domain-specific tasks.
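In practice, loading the model through NeMo takes only a few lines. The sketch below follows the usage pattern shown on the model's Hugging Face page as best recalled here (`ASRModel.from_pretrained` followed by `transcribe`), so treat the exact return type as an assumption that may vary between NeMo versions. The `.wav`/`.flac` check and both helper names are illustrative additions, and NeMo itself has to be installed separately along with PyTorch.

```python
from pathlib import Path

MODEL_NAME = "nvidia/parakeet-tdt-0.6b-v2"   # model ID on Hugging Face
SUPPORTED_SUFFIXES = {".wav", ".flac"}       # assumed input formats


def check_audio_paths(paths):
    """Return the paths as strings, rejecting unexpected file extensions."""
    checked = []
    for p in paths:
        p = Path(p)
        if p.suffix.lower() not in SUPPORTED_SUFFIXES:
            raise ValueError(f"unsupported audio format: {p.name}")
        checked.append(str(p))
    return checked


def transcribe_files(paths, model_name=MODEL_NAME):
    """Transcribe audio files with NeMo and return one text string per file."""
    # Imported lazily so this module still loads where NeMo is not installed.
    import nemo.collections.asr as nemo_asr

    model = nemo_asr.models.ASRModel.from_pretrained(model_name=model_name)
    results = model.transcribe(check_audio_paths(paths))
    # Assumes each result exposes the transcript as a .text attribute.
    return [r.text for r in results]


# Example call (downloads the model weights on first use):
#   print(transcribe_files(["meeting.wav"]))
```

The first call downloads the checkpoint, so it is worth loading the model once and reusing it rather than reloading it per file as this simplified helper does.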

The open-source license (CC-BY-4.0) also allows for commercial use, making it appealing to startups and enterprises alike.

Training data and model development

Parakeet-TDT-0.6B-v2 was trained on a diverse, large-scale corpus called the Granary dataset. This includes around 120,000 hours of English audio, composed of 10,000 hours of high-quality human-transcribed data and 110,000 hours of pseudo-labeled speech.

Sources range from well-known datasets like LibriSpeech and Mozilla Common Voice to YouTube-Commons and Librilight.

Nvidia plans to make the Granary dataset publicly available following its presentation at Interspeech 2025.

Evaluation and robustness

The model was evaluated across multiple English-language ASR benchmarks, including AMI, Earnings22, GigaSpeech, and SPGISpeech, and showed strong generalization performance. It remains robust under varied noise conditions and performs well even with telephony-style audio formats, with only modest degradation at lower signal-to-noise ratios.
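Signal-to-noise ratio (SNR) is usually expressed in decibels as 10·log10 of signal power over noise power, and noise-robustness tests are commonly built by mixing noise into clean speech at a chosen SNR. The sketch below illustrates that general technique with plain Python lists; it is not Nvidia's evaluation procedure, and the function names and the synthetic test tone are assumptions made for the example.

```python
import math
import random


def power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def mix_at_snr(signal, noise, target_snr_db):
    """Scale noise so 10*log10(P_signal / P_noise) equals target_snr_db, then add it."""
    if len(signal) != len(noise):
        raise ValueError("signal and noise must have the same length")
    target_noise_power = power(signal) / (10 ** (target_snr_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    return [s + gain * n for s, n in zip(signal, noise)]


def measure_snr_db(signal, noise):
    """SNR in decibels for a given clean signal and noise component."""
    return 10 * math.log10(power(signal) / power(noise))


# Synthetic example: a 440 Hz tone sampled at 16 kHz plus Gaussian noise at 10 dB SNR.
rng = random.Random(0)
tone = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
hiss = [rng.gauss(0.0, 1.0) for _ in range(16000)]
noisy = mix_at_snr(tone, hiss, target_snr_db=10)
```

Lower SNR values mean the noise is louder relative to the speech, which is where the article says the model's accuracy starts to slip modestly.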

Hardware compatibility and efficiency

Parakeet-TDT-0.6B-v2 is optimized for Nvidia GPU environments, supporting hardware such as the A100, H100, T4, and V100 boards.

While high-end GPUs maximize performance, the model can still be loaded on systems with as little as 2GB of RAM, allowing for broader deployment scenarios.
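A rough back-of-envelope calculation shows why a 600-million-parameter model can fit in a small memory budget. The sketch below counts weights only and ignores activations, audio buffers and framework overhead; the article does not say which numeric precision the 2GB figure assumes, so the precisions listed are illustrative.

```python
PARAMS = 600_000_000  # parameter count reported for the model

# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_gigabytes(n_params: int, dtype: str) -> float:
    """Approximate size of the weights alone, in gigabytes (10**9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_gigabytes(PARAMS, dtype):.1f} GB")
```

By this estimate the weights take about 2.4 GB in 32-bit floats and about 1.2 GB in 16-bit floats, so a 2GB footprint is plausible only at reduced precision.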

Ethical considerations and responsible use

Nvidia notes that the model was developed without the use of personal data and adheres to its responsible AI framework.

Although no specific measures were taken to mitigate demographic bias, the model passed internal quality standards and includes detailed documentation on its training process, dataset provenance, and privacy compliance.

The release drew attention from the machine learning and open-source communities, especially after being publicly highlighted on social media. Commentators noted the model’s ability to outperform commercial ASR alternatives while remaining fully open source and commercially usable.

Developers interested in trying the model can access it via Hugging Face or through Nvidia’s NeMo toolkit. Installation instructions, demo scripts, and integration guidance are readily available to facilitate experimentation and deployment.
