Anna Barclay | Getty Images News | Getty Images
Chinese startup DeepSeek's latest experimental model promises to increase efficiency and improve AI's ability to handle large amounts of information at a fraction of the cost, but questions remain about how effective the architecture is.
DeepSeek caused a frenzy when it launched its first model, R1, out of nowhere last year, showing that it is possible to train large language models (LLMs) quickly, on less powerful chips, using fewer resources.
The company released DeepSeek-V3.2-Exp on Monday, an experimental version of its current model, DeepSeek-V3.1-Terminus, which builds on its mission to increase efficiency in AI systems, according to a post on the AI forum Hugging Face.
“DeepSeek V3.2 continues the focus on efficiency, cost reduction, and open-source sharing,” Adina Yakefu, Chinese community lead at Hugging Face, told CNBC. “The big improvement is a new feature called DSA (DeepSeek Sparse Attention), which makes the AI better at handling long documents and conversations. It also cuts the cost of running the AI in half compared to the previous version.”
“It is significant because it should make the model faster and more cost-effective to use without a noticeable drop in performance,” said Nick Patience, vice president and practice lead. “This makes powerful AI more accessible to developers, researchers and smaller companies, which may lead to a wave of new and innovative applications.”
The pros and cons of sparse attention
An AI model makes decisions based on its training data and new information, such as a prompt. Say an airline wants to find the best route from A to B: there are many options, but not all of them are feasible. By filtering out the less viable routes, you greatly reduce the time, fuel and, ultimately, money needed to make the journey. That is exactly what sparse attention does: it only factors in the data it thinks is important given the task at hand, unlike other models, which have so far crunched all of the data in the model.
“You cut out things that you think are not important,” said Ekaterina Almasque, founder and managing partner of a new venture capital fund.
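The filtering idea can be illustrated with a toy example. The sketch below is not DeepSeek's DSA implementation, whose details this article does not describe; it is a minimal top-k variant of attention, written under the assumption that "unimportant" means "low similarity score". The function names (`scaled_scores`, `dense_attention`, `top_k_sparse_attention`) and the parameter `k` are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_scores(query, keys):
    """Dot product of the query with every key, scaled by sqrt(dimension)."""
    scale = math.sqrt(len(query))
    return [sum(q * x for q, x in zip(query, key)) / scale for key in keys]


def dense_attention(query, keys, values):
    """Standard attention for one query: every token contributes."""
    weights = softmax(scaled_scores(query, keys))
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def top_k_sparse_attention(query, keys, values, k):
    """Hypothetical top-k variant: only the k best-scoring tokens contribute."""
    scores = scaled_scores(query, keys)
    # Keep the indices of the k highest scores; all other tokens are discarded.
    kept = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    kept.sort()
    weights = softmax([scores[i] for i in kept])
    dim = len(values[0])
    output = [sum(w * values[i][d] for w, i in zip(weights, kept)) for d in range(dim)]
    return output, kept


# Made-up data: four tokens, two of which point away from the query.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.5, 0.5]]

sparse_output, kept = top_k_sparse_attention(query, keys, values, k=2)
dense_output = dense_attention(query, keys, values)
```

With this data, `kept` holds the indices of the first two tokens, and the other two never reach the output. That is also where the concern about lost nuance comes from: anything the selection step ranks low is ignored entirely. Note that this sketch still scores every token before discarding most of them, so it shows the selection, not the cost saving; a practical system would need a cheaper way to decide what to keep.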
Sparse attention is a boon for efficiency and for the ability to scale AI, given that fewer resources are needed, but one concern is that it could make models less reliable because of the lack of oversight over how and why they discount information.
“The reality is that they [sparse attention models] have lost a lot of nuances,” said Almasque, who was an early supporter of Dataiku and Darktrace, and an investor in Graphcore. “And then the real question is, do they have the right mechanism to exclude unimportant data, or is there a mechanism excluding really important data, so that the outcome is much less relevant?”
The investor noted that this could be particularly problematic for AI safety and inclusivity, adding that it may not be “the optimal model or the safest model” to use compared with competitors or traditional architectures.
DeepSeek, however, says the experimental model performs on par with its V3.1-Terminus. Despite speculation that a bubble is forming, AI remains at the center of geopolitical competition, with the United States and China still vying for the winning position. Yakefu noted that DeepSeek's models work “right out of the box” with Chinese-made AI chips, such as Ascend and Cambricon, meaning they can run locally on domestic hardware without any extra setup.

DeepSeek also shared the actual programming code and tools needed to use the experimental model, she said. “This means that others can learn from it and build their own improvements.”
But for Almasque, the very nature of this means that the technology may not be defensible. “This approach is not very new,” she said, noting that the industry has been “talking about sparse models since 2015” and that DeepSeek is unable to patent its technology because it is open source. DeepSeek's competitive edge, she added, must therefore lie in how it decides what information to include.
The company itself acknowledges that V3.2-Exp is “an intermediate step toward the next-generation architecture,” according to the Hugging Face post.
As Patience pointed out, “This is DeepSeek's value proposition all over: efficiency is becoming as important as raw power.”
“DeepSeek is playing the long game to keep the community invested in its progress,” Yakefu added. “People will always go for what is cheap, reliable and effective.”