However, the true cost of developing DeepSeek’s models remains unknown, because a single figure quoted in one research paper may not capture the full picture of its costs. “I don’t think it’s $6 million, but even if it’s $60 million, it’s a game changer,” says Umesh Padval, managing director of Thomvest Ventures, a company that has invested in Cohere and other AI firms. “It will put pressure on the profitability of companies that are focused on consumer AI.”
Shortly after DeepSeek revealed the details of its latest model, Ghodsi of Databricks says, customers began asking whether they could use it, as well as DeepSeek’s underlying techniques, to cut costs at their own organizations. He adds that one approach employed by DeepSeek’s engineers, known as distillation, which involves using the output from one large language model to train another model, is relatively cheap and straightforward.
Padval says that the existence of models like DeepSeek’s will ultimately benefit companies looking to spend less on AI, but he says that many firms may have reservations about relying on a Chinese model for sensitive tasks. So far, at least one prominent AI firm, Perplexity, has publicly announced that it is using DeepSeek’s R1 model, but it says the model is hosted “completely independent of China.”
Amjad Masad, the CEO of Replit, a startup that provides AI coding tools, told WIRED that he thinks DeepSeek’s latest models are impressive. While he still finds Anthropic’s Sonnet model better at many computer engineering tasks, he has found that R1 is especially good at turning text commands into code that can be executed on a computer. “We’re exploring using it, especially for agent reasoning,” he adds.
DeepSeek’s two latest offerings, DeepSeek R1 and DeepSeek R1-Zero, are capable of the same kind of simulated reasoning as the most advanced systems from OpenAI and Google. They all work by breaking problems into constituent parts in order to tackle them more effectively, a process that requires a considerable amount of additional training to ensure that the AI reliably reaches the correct answer.
A paper posted by DeepSeek researchers last week outlines the approach the company used to create its R1 models, which it claims perform about as well on some benchmarks as OpenAI’s leading reasoning model, known as o1. The tactics DeepSeek used include a more automated method for learning how to solve problems correctly, as well as a strategy for transferring skills from larger models to smaller ones.
One of the hottest topics of speculation about DeepSeek is the hardware it may have used. The question is especially noteworthy because the US government has introduced a series of export controls and other trade restrictions over the past few years aimed at limiting China’s ability to acquire and manufacture the cutting-edge chips needed to build advanced AI.
In a research paper from August 2024, DeepSeek indicated that it has access to a cluster of 10,000 NVIDIA A100 chips, which were placed under US restrictions announced in October 2022. In a separate paper from June of that year, DeepSeek stated that an earlier model called DeepSeek-V2 was developed using clusters of NVIDIA H800 computer chips, a less capable component developed by NVIDIA to comply with US export controls.
A source at one AI company that trains large AI models, who asked to remain anonymous to protect their professional relationships, estimates that DeepSeek likely used around 50,000 NVIDIA chips to build its technology.
NVIDIA declined to comment directly on which of its chips DeepSeek may have used. “DeepSeek is an excellent AI advancement,” an NVIDIA spokesperson said in a statement, adding that the startup’s approach “requires significant numbers of NVIDIA GPUs and high-performance networking.”
However DeepSeek’s models were built, they appear to show that a less closed approach to developing AI is gaining momentum. In December, Clem Delangue, the CEO of Hugging Face, a platform that hosts AI models, predicted that a Chinese company would take the lead in AI because of the speed of innovation happening in open source models, which China has largely embraced. “This went faster than I thought,” he says.