OpenAI's first open-weight LLMs in six years are here

For the first time since GPT-2 in 2019, OpenAI is releasing new open-weight large language models. It's a major moment for a company that has been increasingly accused of straying from its stated mission of “ensuring that artificial general intelligence benefits all of humanity.” Now, after multiple delays for additional safety testing, those models have finally arrived.

Before going further, it's worth taking a moment to clarify what exactly OpenAI is doing here. The company is not releasing new open-source models that would include the underlying code and the data used to train them. Instead, it is sharing the weights (that is, the numerical values the models learned to assign to inputs during training) that define the new systems. According to Benjamin C. Lee, professor of engineering and computer science at the University of Pennsylvania, open-weight and open-source models serve two very different purposes.

“An open-weight model provides the values that were learned during the training of a large language model, and those essentially allow you to use the model and build on top of it,” Lee said. If a commercial model is a total black box and an open-source system allows full customization and modification, open-weight AIs sit somewhere in the middle.

OpenAI has not released open-source models, likely because a rival could use the training code and data to reverse engineer its technology. “An open-source model is more than just the weights. It would potentially also include the code used to run the training process,” Lee said. And practically speaking, the average person won't get much use out of an open-source model unless they have a farm of high-end NVIDIA GPUs running up their electricity bill. (Open-source releases are mainly useful to researchers who want to learn more about the data a company used to train its models, and there are a handful of them out there, such as Mistral NeMo and Mistral Small 3.)
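To make the open-weight versus open-source distinction concrete, here is a minimal sketch in PyTorch of what a weights release gives you. The tiny network, the checkpoint filename and the file layout are all stand-ins for illustration, not OpenAI's actual release format.

```python
# Illustrative sketch of what "open weights" means: a release of learned
# tensors you can load, inspect and build on, without the training code
# or data. The tiny model below is a stand-in, not OpenAI's architecture.
import torch
import torch.nn as nn

# Stand-in network; an open-weight release ships a checkpoint like this.
model = nn.Linear(4, 2)
torch.save(model.state_dict(), "model_weights.pt")  # hypothetical filename

# Anyone with the weights file can load the learned values...
state_dict = torch.load("model_weights.pt")
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))  # the "values" Lee describes

# ...and fine-tune from them. But without the training code and data
# (the open-source parts), you could not reproduce them from scratch.
model.load_state_dict(state_dict)
```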

With that out of the way, the primary difference between gpt-oss-120b and gpt-oss-20b is the number of parameters each offers. If you're not familiar with the term, parameters are the settings a large language model can tweak to provide you with an answer. The naming is slightly confusing here, but gpt-oss-120b is a 117-billion-parameter model, while its smaller sibling has 21 billion.
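Those parameter counts translate directly into memory needs, which explains the hardware recommendations below. A rough back-of-envelope calculation (an illustration, not OpenAI's published figures) shows how weight precision changes the picture:

```python
# Rough memory math: bytes needed just to hold the weights at different
# precisions. Illustrative only; real requirements also depend on
# activations, context length and runtime overhead.
models = {"gpt-oss-120b": 117e9, "gpt-oss-20b": 21e9}

for name, params in models.items():
    for bits in (16, 8, 4):
        gb = params * (bits / 8) / 1e9
        print(f"{name} at {bits}-bit: ~{gb:.0f} GB")

# At 4 bits per weight, the 21-billion-parameter model lands near 10.5 GB,
# which is consistent with running on a machine with 16 GB of RAM.
```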

In practice, this means gpt-oss-120b requires more powerful hardware to run, with OpenAI recommending a single GPU with 80GB of memory for efficient use. The good news is that the company says any modern computer with 16GB of RAM can run gpt-oss-20b. As a result, you can use the smaller model to do something like vibe code on your own computer without an internet connection. What's more, OpenAI is making the models available under the Apache 2.0 license, giving people a great deal of flexibility to modify the systems to their needs.
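As a sketch of what local use could look like, the snippet below loads the smaller model through Hugging Face Transformers. The openai/gpt-oss-20b repository identifier is an assumption based on the model's name; check the official release for the correct ID and hardware guidance.

```python
# Minimal local-inference sketch using Hugging Face Transformers.
# The model ID "openai/gpt-oss-20b" is assumed from the model's name,
# not confirmed here; substitute the identifier from the actual release.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="openai/gpt-oss-20b",  # assumed repository identifier
    device_map="auto",           # place weights on whatever hardware exists
)

messages = [
    {"role": "user", "content": "Write a haiku about offline coding."},
]

output = generator(messages, max_new_tokens=128)
print(output[0]["generated_text"])
```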

Even though these aren't new commercial releases, OpenAI says the new models are comparable to its proprietary systems in many ways. The one limitation of the gpt-oss models is that they don't offer multimodal input, meaning they can't process images, video or voice. For those capabilities, you'll still need to turn to the cloud and OpenAI's commercial models, which both new systems can be configured to connect to. Beyond that, however, they offer many of the same capabilities, including chain-of-thought reasoning and tool use. That means the models can tackle more complex problems by breaking them down into smaller steps, and if they need extra assistance, they know how to use the web and coding languages like Python.
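That tool-use loop generally works as follows: the model emits a structured request, the host application executes it, and the result is fed back into the conversation. The sketch below illustrates the pattern with a made-up JSON format; OpenAI's actual tool-calling schema for these models may differ.

```python
# Hypothetical sketch of a tool-use loop: the model emits a structured
# tool request, the host runs it, and the output is fed back. The JSON
# format here is illustrative, not OpenAI's actual schema.
import json
import subprocess
import sys

def run_python_tool(code: str) -> str:
    """Execute model-requested Python in a subprocess and capture output."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, timeout=10,
    )
    return proc.stdout or proc.stderr

# Suppose the model answered a math question with a tool call like this:
model_reply = '{"tool": "python", "code": "print(sum(range(10)))"}'

call = json.loads(model_reply)
if call.get("tool") == "python":
    observation = run_python_tool(call["code"])
    # In a real loop, the observation is appended to the conversation and
    # the model continues its chain of thought with the new information.
    print("tool output:", observation)  # -> tool output: 45
```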

In addition, OpenAI says it trained the models using techniques it had previously employed in developing o3 and its other recent frontier systems. In competition-level coding, gpt-oss-120b earned a score just short of o3, OpenAI's current state-of-the-art reasoning model, while gpt-oss-20b landed between o3-mini and o4-mini. Of course, we'll have to wait for more real-world testing to see how the two new models compare to OpenAI's commercial offerings and those of its competitors.

The launch of gpt-oss-120b and gpt-oss-20b, and OpenAI's decision to double down on open models, comes after Mark Zuckerberg signaled that Meta would release fewer such systems to the public. Open-sourcing was previously central to Zuckerberg's messaging about his company's AI efforts, with the CEO once notably calling closed systems a “curse.” At least among the sect of tech enthusiasts eager to tinker with LLMs, the timing is somewhat embarrassing for Meta.

“One could argue that open-weight models democratize access to the largest, most capable models for people who don't have these massive, hyperscale data centers with lots of GPUs,” Professor Lee said. “They allow people to use the outputs or the product of a months-long training process in a massive data center without having to invest in that infrastructure themselves. From the perspective of someone who just wants a really capable model to start with, and then wants to build on it for some application, I think open-weight models can be really useful.”

OpenAI is already working with a few different institutions to deploy their own versions of these models, including AI Sweden, the country's national center for applied AI. In a press briefing held ahead of today's announcement, the team that worked on gpt-oss-120b and gpt-oss-20b said they view the two models as an experiment; the more people use them, the likelier it is that OpenAI will release additional open-weight models in the future.


