Hidden costs in AI deployment: Why Claude models may be 20-30% more expensive than GPT in enterprise settings
Editorial Board · Published May 1, 2025 · Last updated May 1, 2025, 9:16 pm

It’s a well-known fact that different model families can use different tokenizers. However, there has been limited analysis of how the process of “tokenization” itself varies across these tokenizers. Do all tokenizers produce the same number of tokens for a given input text? If not, how different are the generated tokens? How significant are the differences?

In this article, we explore these questions and examine the practical implications of tokenization variability. We present a comparative story of two frontier model families: OpenAI’s ChatGPT vs. Anthropic’s Claude. Although their advertised “cost-per-token” figures are highly competitive, experiments reveal that Anthropic models can be 20-30% more expensive than GPT models.

API Pricing — Claude 3.5 Sonnet vs GPT-4o

As of June 2024, the pricing structure for these two advanced frontier models is highly competitive. Both Anthropic’s Claude 3.5 Sonnet and OpenAI’s GPT-4o have identical costs for output tokens, while Claude 3.5 Sonnet offers a 40% lower cost for input tokens.

[Figure: API pricing comparison, Claude 3.5 Sonnet vs. GPT-4o. Source: Vantage]

The hidden “tokenizer inefficiency”

Despite the Anthropic model’s lower input token rates, we observed that the total cost of running experiments (on a given set of fixed prompts) with GPT-4o is much lower than with Claude 3.5 Sonnet.

Why?

The Anthropic tokenizer tends to break the same input into more tokens than OpenAI’s tokenizer does. As a result, for identical prompts, Anthropic models produce considerably more tokens than their OpenAI counterparts. So while the per-token cost of Claude 3.5 Sonnet’s input may be lower, the increased tokenization can offset those savings, leading to higher overall costs in practical use cases.

This hidden cost stems from the way Anthropic’s tokenizer encodes information, often using more tokens to represent the same content. The token-count inflation has a significant impact on costs and context-window utilization.
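To see how a per-token discount can be eaten by token inflation, here is a minimal budgeting sketch in Python. It uses the June 2024 list prices cited above (GPT-4o at $5 input / $15 output per million tokens; Claude 3.5 Sonnet at $3 / $15) and the ~30% code-domain overhead reported below; the request sizes are illustrative assumptions, not measurements.

Python

# Illustrative only: June 2024 list prices ($ per 1M tokens) and the
# ~30% token overhead this article reports for Python code.
GPT_IN, GPT_OUT = 5.00, 15.00
CLAUDE_IN, CLAUDE_OUT = 3.00, 15.00
OVERHEAD = 1.30  # Claude tokens per GPT token on code-heavy text

def cost(tokens_in, tokens_out, rate_in, rate_out):
    """Dollar cost of one request, given token counts and $/1M rates."""
    return (tokens_in * rate_in + tokens_out * rate_out) / 1_000_000

# The same hypothetical code-review request: 2,000 input and 1,500
# output tokens under GPT-4o's tokenizer, inflated ~30% under Claude's.
gpt = cost(2_000, 1_500, GPT_IN, GPT_OUT)
claude = cost(2_000 * OVERHEAD, 1_500 * OVERHEAD, CLAUDE_IN, CLAUDE_OUT)

print(f"GPT-4o:            ${gpt:.4f}")
print(f"Claude 3.5 Sonnet: ${claude:.4f} ({claude / gpt - 1:+.0%})")

In this mix the cheaper input rate is fully offset, and the more a workload tilts toward output tokens (billed at the same rate for both models), the closer the gap climbs toward the raw 20-30% token overhead.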

Domain-dependent tokenization inefficiency

Different types of domain content are tokenized differently by Anthropic’s tokenizer, leading to varying levels of increased token counts compared to OpenAI’s models. The AI research community has noted similar tokenization differences. We tested our findings on three popular domains, namely: English articles, code (Python) and math.

Domain           | GPT Tokens | Claude Tokens | % Token Overhead
English articles |         77 |            89 | ~16%
Code (Python)    |         60 |            78 | ~30%
Math             |        114 |           138 | ~21%

% token overhead of the Claude 3.5 Sonnet tokenizer (relative to GPT-4o). Source: Lavanya Gupta

When comparing Claude 3.5 Sonnet to GPT-4o, the degree of tokenizer inefficiency varies significantly across content domains. For English articles, Claude’s tokenizer produces approximately 16% more tokens than GPT-4o for the same input text. This overhead increases sharply with more structured or technical content: for mathematical equations, the overhead stands at 21%, and for Python code, Claude generates 30% more tokens.

This variation arises because some content types, such as technical documents and code, often contain patterns and symbols that Anthropic’s tokenizer fragments into smaller pieces, leading to a higher token count. In contrast, more natural-language content tends to exhibit a lower token overhead.
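The per-domain numbers above can be approximated with a short script. The sketch below assumes the open-source tiktoken package for the GPT-4o side and an anthropic SDK version that still exposes the messages.count_tokens endpoint discussed later in this article (plus an ANTHROPIC_API_KEY in the environment); the sample strings are arbitrary stand-ins for each domain.

Python

import tiktoken
import anthropic

# Arbitrary stand-in strings for the three domains tested above.
samples = {
    "english": "The quick brown fox jumps over the lazy dog.",
    "python": "def add(a: int, b: int) -> int:\n    return a + b",
    "math": "x^2 + 2*x*y + y^2 = (x + y)^2",
}

enc = tiktoken.get_encoding("o200k_base")  # GPT-4o's encoding
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY

for domain, text in samples.items():
    gpt_tokens = len(enc.encode(text))
    # Note: count_tokens includes Claude's message framing, so treat
    # the resulting overhead as approximate rather than a pure
    # tokenizer-to-tokenizer comparison.
    claude_tokens = client.messages.count_tokens(
        model="claude-3-5-sonnet-20241022",
        messages=[{"role": "user", "content": text}],
    ).input_tokens
    print(f"{domain:8}  GPT={gpt_tokens:3}  Claude={claude_tokens:3}  "
          f"overhead={claude_tokens / gpt_tokens - 1:+.0%}")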

Other practical implications of tokenizer inefficiency

Beyond the direct implication for costs, there is also an indirect impact on context-window utilization. While Anthropic models claim a larger context window of 200K tokens, versus OpenAI’s 128K tokens, the effective usable token space may be smaller for Anthropic models because of this verbosity. Hence, there could be a small or large difference between the “advertised” context window sizes and the “effective” context window sizes.
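As a rough illustration using the figures above: at the ~30% overhead measured for code, text that fills Claude’s 200K-token window corresponds to only about 200K / 1.3 ≈ 154K GPT-4o tokens, so for code-heavy workloads the practical headroom over OpenAI’s 128K window is considerably narrower than the headline numbers suggest.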

Implementation of tokenizers

GPT models use Byte Pair Encoding (BPE), which merges frequently co-occurring character pairs to form tokens. Specifically, the latest GPT models use the open-source o200k_base tokenizer. The exact encodings used by each GPT model, including GPT-4o, are published in the open-source tiktoken library.

Python

# Model-to-encoding mapping used by OpenAI's tiktoken library
{
    # reasoning
    "o1-xxx": "o200k_base",
    "o3-xxx": "o200k_base",

    # chat
    "chatgpt-4o-": "o200k_base",
    "gpt-4o-xxx": "o200k_base",  # e.g., gpt-4o-2024-05-13
    "gpt-4-xxx": "cl100k_base",  # e.g., gpt-4-0314, etc., plus gpt-4-32k
    "gpt-3.5-turbo-xxx": "cl100k_base",  # e.g., gpt-3.5-turbo-0301, -0401, etc.
}
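For readers who want to inspect this mapping directly, the snippet below (a minimal sketch; it requires only pip install tiktoken) resolves a model name to its encoding and prints the individual tokens:

Python

import tiktoken

# Resolve a model name to its encoding and look at the actual tokens.
enc = tiktoken.encoding_for_model("gpt-4o")
print(enc.name)  # -> "o200k_base"

ids = enc.encode("Tokenization varies across model families.")
print(ids)  # token ids
print([enc.decode([i]) for i in ids])  # the corresponding text pieces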

Unfortunately, not much can be said about Anthropic’s tokenizers, as their tokenizer is not as directly and easily available as GPT’s. Anthropic released their Token Counting API in December 2024. However, it was quickly deprecated in later 2025 versions.

Latenode reports that “Anthropic uses a unique tokenizer with only 65,000 token variations, compared to OpenAI’s 100,261 token variations for GPT-4.” This Colab notebook contains Python code to analyze the tokenization differences between GPT and Claude models. Another tool that enables interfacing with some common, publicly available tokenizers validates our findings.

The ability to proactively estimate token counts (without invoking the actual model API) and budget costs is crucial for AI enterprises.

Key Takeaways

Anthropic’s competitive pricing comes with hidden costs: While Anthropic’s Claude 3.5 Sonnet offers 40% lower input token costs compared to OpenAI’s GPT-4o, this apparent cost advantage can be misleading because of differences in how input text is tokenized.

Hidden “tokenizer inefficiency”: Anthropic models are inherently more verbose. For businesses that process large volumes of text, understanding this discrepancy is crucial when evaluating the true cost of deploying models.

Domain-dependent tokenizer inefficiency: When choosing between OpenAI and Anthropic models, evaluate the nature of your input text. For natural-language tasks, the cost difference may be minimal, but technical or structured domains could lead to significantly higher costs with Anthropic models.

Effective context window: Because of the verbosity of Anthropic’s tokenizer, its larger advertised 200K context window may offer less effective usable space than OpenAI’s 128K, leading to a potential gap between advertised and actual context window sizes.

Anthropic did not respond to VentureBeat’s requests for comment by press time. We’ll update the story if they respond.
