Nvidia’s “AI factory” narrative faces a reality check at Transform 2025

By [email protected]


The gloves came off Tuesday at VB Transform 2025 as alternative chip makers directly challenged Nvidia’s dominance narrative during a panel on inference, exposing a fundamental contradiction: how can AI inference be a commoditized “factory” and still command 70% gross margins?

Jonathan Ross, CEO of Groq, did not mince words about Nvidia’s carefully crafted messaging. “AI factory is just a marketing way to make AI sound less scary,” Ross said during the panel. Sean Lie, CTO of Cerebras, a competitor, was equally direct: “I don’t think Nvidia minds having all of the service providers fighting it out for every last penny while they’re sitting there comfortable with 70 points.”

Hundreds of billions of dollars in infrastructure investment and the future architecture of enterprise AI are at stake. For CISOs and AI leaders currently locked in weekly negotiations with OpenAI and other providers for more capacity, the panel exposed uncomfortable truths about why their AI initiatives keep hitting roadblocks.

The capacity crisis no one talks about

“Anyone who’s actually a big user of these gen AI models knows that you can go to OpenAI, or whoever it is, and they won’t actually be able to serve you enough tokens,” said Dylan Patel, founder of SemiAnalysis. “There are weekly meetings between some of the biggest AI users and their model providers to try to persuade them to allocate more capacity. Then there are weekly meetings between those model providers and their hardware providers.”

Panelists also pointed to the token shortage as exposing a fundamental flaw in the factory analogy. Traditional manufacturing responds to demand signals by adding capacity. However, when enterprises need 10 times more inference capacity, they discover that the supply chain cannot flex. GPUs have two-year lead times. Data centers need permits and power agreements. The infrastructure was not built for exponential scaling, forcing providers to ration access through API limits.

According to Patel, Anthropic jumped from $2 billion to $3 billion in ARR in just six months. Cursor went from essentially zero to $500 million in ARR. OpenAI crossed $10 billion. Yet enterprises still cannot get the tokens they need.

Why “factory” thinking breaks AI economics

Jensen Huang’s “AI factory” concept implies standardization, commoditization and efficiency gains that drive down costs. But the panel revealed three fundamental ways the metaphor breaks down:

First, inference is not uniform. “Even today, for inference of, say, DeepSeek, there’s a number of providers along the curve of sort of how fast they provide at what cost,” Patel noted. DeepSeek serves its own model at the lowest cost but delivers only 20 tokens per second. “Nobody wants to use a model at 20 tokens a second. I talk faster than 20 tokens a second.”
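To make the speed gap concrete, here is a minimal sketch of how long a response takes to stream at a given throughput. The 20 tokens per second figure comes from the panel; the 1,000-token response length and the 200 tokens per second comparison point are assumptions chosen for illustration, and the calculation ignores queueing and time to first token.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to stream a response, ignoring queueing and time to first token."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


# 20 tokens/s is the figure quoted by Patel; the other numbers are assumptions.
slow = generation_seconds(1_000, 20)    # 50.0 seconds
fast = generation_seconds(1_000, 200)   # 5.0 seconds
print(f"slow: {slow:.0f}s, fast: {fast:.0f}s")
```

Under these assumptions the same answer arrives in under ten seconds or in nearly a minute, which is the difference between an interactive product and a batch job.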

Second, quality varies wildly. Ross drew a historical parallel to Standard Oil: “When Standard Oil started, oil had varying quality. You could buy oil from one vendor and it might set your house on fire.” Today’s AI inference market faces similar quality variations, with providers using various techniques to reduce costs that inadvertently compromise output quality.

Third, and most critically, the economics are inverted. “One of the things that’s unusual about AI is that you can’t spend more to get better results,” Ross explained. “You can’t just have a software application, say, I’m going to spend twice as much to host my software, and applications can get better.”

When Ross mentioned that Mark Zuckerberg praised Groq for being “the only ones who launched it with the full quality,” he inadvertently revealed the industry’s quality crisis. This was not just recognition. It was an indictment of every other provider.

Ross spelled out the mechanics: “A lot of people do a lot of tricks to reduce the quality, not intentionally, but to lower their cost, improve their speed.” The techniques sound technical, but the impact is straightforward. Quantization reduces precision. Pruning removes parameters. Each optimization degrades the model in ways enterprises may not detect until production fails.
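As a rough illustration of why quantization costs precision, the sketch below applies generic symmetric int8 quantization to a handful of made-up weights and measures the rounding error. This is a textbook scheme shown under stated assumptions, not a description of what any particular provider does.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(quantized, scale):
    """Map the integers back to approximate float values."""
    return [q * scale for q in quantized]


weights = [0.8123, -0.0041, 0.3307, -1.2700, 0.0009]  # made-up values
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(quantized, f"max error {max_error:.4f}")
```

The two smallest weights collapse to zero after rounding. Each individual error is tiny, but such errors accumulate across billions of parameters, which is why the loss tends to show up only in measured output quality.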

The Standard Oil parallel Ross drew illuminates the stakes. Today’s inference market faces the same quality variance problem. Providers betting that enterprises will not notice the difference between 95% and 100% accuracy are betting against companies like Meta that have the sophistication to measure degradation.

This creates immediate imperatives for enterprise buyers.

  1. Establish quality benchmarks before selecting providers.
  2. Audit existing inference partners for undisclosed optimizations.
  3. Accept that premium pricing for full model fidelity is now a permanent market feature. The era of assuming functional equivalence across inference providers ended when Zuckerberg called out the difference.
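The first two steps can be sketched as a simple benchmark check: run the same fixed prompt set against each provider and flag any whose accuracy trails a full-fidelity reference. Everything here is hypothetical, including the provider names, the 20-prompt benchmark, the exact-match scoring and the 2-point tolerance; a real audit would use task-appropriate scoring and a much larger prompt set.

```python
def accuracy(answers, expected):
    """Fraction of answers that exactly match the expected answer."""
    if not expected or len(answers) != len(expected):
        raise ValueError("answers and expected must be non-empty and equal length")
    return sum(a == e for a, e in zip(answers, expected)) / len(expected)


def flag_degraded(provider_answers, expected, reference_accuracy, tolerance=0.02):
    """Return providers whose accuracy trails the reference by more than tolerance."""
    flagged = {}
    for name, answers in provider_answers.items():
        score = accuracy(answers, expected)
        if reference_accuracy - score > tolerance:
            flagged[name] = score
    return flagged


# Hypothetical benchmark: 20 prompts with known answers.
expected = ["A"] * 20
provider_answers = {
    "provider_x": ["A"] * 20,          # matches the reference on every prompt
    "provider_y": ["A"] * 19 + ["B"],  # one miss out of 20: 95% accuracy
}
flagged = flag_degraded(provider_answers, expected, reference_accuracy=1.0)
print(flagged)
```

Rerunning the same check on a schedule is one way to catch an optimization that a provider rolls out after the contract is signed.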

The $1 million token paradox

The most revealing moment came when the panel discussed pricing. Lie highlighted an uncomfortable truth for the industry: “If these million tokens are as valuable as we believe they can be, right? That’s not about moving words. You don’t charge $1 for moving words. I pay my lawyer $800 for an hour to write a two-page memo.”

This observation cuts to the heart of AI’s price discovery problem. The industry is racing to drive token costs below $1.50 per million while claiming those tokens will transform every aspect of business. The panelists implicitly agreed that the math does not add up.

“Pretty much everyone is spending, like all of these fast-growing startups, the amount that they’re spending on tokens as a service almost matches their revenue one to one,” Ross revealed. This 1:1 ratio of AI token spend to revenue is an unsustainable business model that the “factory” narrative conveniently ignores.
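The arithmetic behind both points is short enough to write out. The $800 and $1.50 per million figures come from the panel; the startup's $10 million in spend and revenue is a made-up number used only to show what a 1:1 ratio means.

```python
def tokens_for_budget(budget_usd: float, price_per_million_usd: float) -> float:
    """How many tokens a budget buys at a given price per million tokens."""
    if price_per_million_usd <= 0:
        raise ValueError("price_per_million_usd must be positive")
    return budget_usd / price_per_million_usd * 1_000_000


def spend_to_revenue_ratio(token_spend_usd: float, revenue_usd: float) -> float:
    """Token spend as a multiple of revenue; 1.0 means spend equals revenue."""
    if revenue_usd <= 0:
        raise ValueError("revenue_usd must be positive")
    return token_spend_usd / revenue_usd


# One $800 hour buys roughly 533 million tokens at $1.50 per million.
tokens = tokens_for_budget(800, 1.50)
# Hypothetical startup spending as much on tokens as it earns.
ratio = spend_to_revenue_ratio(10_000_000, 10_000_000)
print(f"{tokens:,.0f} tokens, spend/revenue = {ratio:.1f}")
```

At a ratio of 1.0, nothing is left for salaries or any other cost, which is why the panel treated the pattern as unsustainable.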

Performance changes everything

Cerebras and Groq are not competing only on price; they are also competing on performance. They are fundamentally changing what is possible in terms of inference speed. “With the wafer-scale technology that we’ve built, we’re enabling 10 times, sometimes 50 times, faster performance than even the fastest GPUs today,” Lie said.

This is not an incremental improvement. It enables entirely new use cases. “We have customers who have agentic workflows that might take 40 minutes, and they want these things to run in real time,” Lie explained. “These things just aren’t even possible, even if you’re willing to pay top dollar.”

The speed differential creates a bifurcated market that defies factory standardization. Enterprises that need real-time inference for customer-facing applications cannot use the same infrastructure as those running overnight batch processes.

The real bottleneck: power and data centers

While everyone focuses on chip supply, the panel revealed the actual constraint throttling AI deployment. “Data center capacity is a big problem. You can’t really find data center space in the U.S.,” Patel said. “Power is a big problem.”

The infrastructure challenge goes beyond chip manufacturing to fundamental resource constraints. As Patel explained, “TSMC in Taiwan is able to make over $200 million worth of chips, right? It’s not even … it’s the speed at which they scale up is ridiculous.”

But chip production means nothing without infrastructure. “The reason we see these big Middle East deals, and partially why both of these companies have big presences in the Middle East, is power,” Patel revealed. The global scramble for compute has enterprises “going across the world to get wherever power does exist, wherever data center capacity exists, wherever there are electricians who can build these electrical systems.”

Google’s “success disaster” becomes everyone’s reality

Ross shared a telling anecdote from Google’s history: “There was a term that became very popular at Google in 2015 called Success Disaster. Some of the teams had built AI applications that began to work better than human beings for the first time, and the demand for compute was so high, they were going to need to double or triple the global data center footprint quickly.”

This pattern now repeats across every enterprise AI deployment. Applications either fail to gain traction or experience hockey-stick growth that immediately hits infrastructure limits. There is no middle ground, no smooth scaling curve of the kind factory economics would predict.

What this means for enterprise AI strategy

For CIOs, CISOs and AI leaders, the panel’s revelations demand strategic recalibration:

Capacity planning requires new models. Traditional IT forecasting assumes linear growth. AI workloads break this assumption. When successful applications increase token consumption by 30% monthly, annual capacity plans become obsolete within quarters. Enterprises must shift from static procurement cycles to dynamic capacity management. Build contracts with burst provisions. Monitor usage weekly, not quarterly. Accept that AI scaling patterns resemble viral adoption curves, not traditional enterprise software rollouts.

Speed premiums are permanent. The idea that inference will commoditize to uniform pricing ignores the massive performance gaps between providers. Enterprises need to budget for speed where it matters.

Architecture beats optimization. Groq and Cerebras are not winning by doing GPUs better. They are winning by rethinking the fundamental architecture of AI compute. Enterprises that bet everything on GPU-based infrastructure may find themselves stuck in the slow lane.

Power infrastructure is strategic. The constraint is not chips or software but kilowatts and cooling. Smart enterprises are already locking in power capacity and data center space for 2026 and beyond.
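The capacity-planning point above rests on compounding, which a short sketch makes visible. The 30% monthly growth rate is the article's figure; the starting demand of 100 units and the plan with 2x headroom are assumptions for illustration.

```python
def projected_demand(current: float, monthly_growth: float, months: int) -> float:
    """Demand after compounding monthly growth for the given number of months."""
    return current * (1 + monthly_growth) ** months


def months_until_exceeded(current: float, capacity: float, monthly_growth: float) -> int:
    """Number of whole months until compounding demand exceeds planned capacity."""
    if current <= 0 or monthly_growth <= 0:
        raise ValueError("current and monthly_growth must be positive")
    months, demand = 0, current
    while demand <= capacity:
        demand *= 1 + monthly_growth
        months += 1
    return months


# Assumed plan: 100 units of demand today, capacity provisioned at 2x headroom.
print(months_until_exceeded(100, 200, 0.30))       # months until the plan is exhausted
print(round(projected_demand(100, 0.30, 12)))      # demand after one year
```

Under these assumptions a plan with double the current capacity is exhausted in three months, and demand after a year is more than 23 times the starting level, which is why weekly monitoring matters more than an annual forecast.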

The infrastructure reality enterprises cannot ignore

The panel revealed a fundamental truth: the AI factory metaphor is not only wrong but also dangerous. Enterprises building strategies around commodity inference pricing and standardized delivery are planning for a market that does not exist.

The real market operates on three brutal realities.

  1. Capacity scarcity creates power inversions, where suppliers dictate terms and enterprises beg for allocations.
  2. Quality variance, the difference between 95% and 100% accuracy, determines whether AI applications succeed or fail catastrophically.
  3. Infrastructure constraints, not technology, set the binding limits on AI transformation.

The path forward requires CISOs and AI leaders to abandon factory thinking entirely. Lock in power capacity now. Audit inference providers for hidden quality degradation. Build vendor relationships based on architectural advantages, not marginal cost savings. Most importantly, accept that paying 70% margins for reliable, high-quality inference may be your smartest investment.

The alternative chip makers at Transform did not just challenge Nvidia’s narrative. They revealed that enterprises face a choice: pay for quality and performance, or join the weekly negotiation meetings. The panel’s consensus was clear: success requires matching specific workloads to the appropriate infrastructure rather than pursuing one-size-fits-all solutions.


