TikTok may be making headlines again today as the White House joined the popular social media app, but its parent company ByteDance, the Chinese web giant, also had a big announcement of its own.
The company's Seed Team of AI researchers today released Seed-OSS-36B on the AI code-sharing site Hugging Face.
Seed-OSS-36B is a new line of open-source large language models (LLMs) designed for advanced reasoning and developer-friendly usability, with a longer token context — the amount of information the models can accept as input and then output in a single exchange — than many competing LLMs from U.S. tech companies, even leaders like OpenAI and Anthropic.
The collection includes three main variants:
- Seed-OSS-36B-Base with synthetic data
- Seed-OSS-36B-Base without synthetic data
- Seed-OSS-36B-Instruct
In releasing both synthetic and non-synthetic versions of Seed-OSS-36B-Base, the Seed Team sought to balance practical performance with research flexibility.
The synthetic-data variant, trained with additional instruction data, consistently delivers stronger scores on standard benchmarks and is intended as the higher-performing general-purpose option.
The non-synthetic model, by contrast, omits these augmentations, creating a cleaner foundation that avoids potential bias or distortion introduced by synthetic instruction data.
By providing both, the team gives applied users access to improved results while ensuring researchers retain a neutral baseline for studying post-training methods.
Seed-OSS-36B-Instruct, meanwhile, differs in that it is post-trained on instruction data to prioritize task execution and instruction following, rather than serving purely as a base model.
All three models are released under the Apache-2.0 license, allowing free use, modification, and distribution by researchers and developers working at enterprises.
This means they can be used to power commercial applications, whether internal to a company or external/customer-facing, without paying any licensing fees or application programming interface (API) usage costs.
It continues a summer 2025 trend of Chinese companies shipping powerful open-source models, with OpenAI attempting to catch up via its own open-source gpt-oss duo released earlier this month.
The Seed Team positions Seed-OSS for international applications, emphasizing versatility across reasoning, agent-like task execution, and multilingual settings.
The Seed Team, formed in 2023, has focused on building foundation models that can serve both research and applied use cases.
Design and core features
The architecture behind Seed-OSS-36B combines familiar design choices such as causal language modeling, grouped query attention, SwiGLU activation, RMSNorm, and RoPE positional encoding.
Each model carries 36 billion parameters across 64 layers and supports a vocabulary of 155,000 tokens.
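RoPE, one of the components named above, can be sketched in a few lines of plain Python. This is a minimal illustration of the technique, not the model's actual implementation; the head dimension and values used are assumptions for demonstration only:

```python
import math

def rope(vec, pos, base=10000.0):
    """Apply rotary position embedding (RoPE) to one attention-head vector.

    Pairs of dimensions (2i, 2i+1) are rotated by an angle that depends on
    the token position and the pair index, so relative position information
    emerges naturally in dot products between rotated query/key vectors.
    """
    d = len(vec)
    assert d % 2 == 0, "head dimension must be even"
    out = [0.0] * d
    for i in range(d // 2):
        theta = pos * base ** (-2 * i / d)  # lower pairs rotate faster
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[2 * i], vec[2 * i + 1]
        out[2 * i] = x * c - y * s
        out[2 * i + 1] = x * s + y * c
    return out
```

Because each pair is a pure rotation, the vector's norm is preserved, and position 0 leaves the vector unchanged — properties that make the encoding well-suited to long-context extension.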
One defining feature is native long-context capability, with a maximum length of 512,000 tokens, designed to process extended documents and reasoning chains without performance loss.
That's twice the length supported by OpenAI's new GPT-5 model family, and roughly equivalent to 1,600 pages of text — the length of a Christian Bible.
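A quick back-of-envelope check of that pages figure, using common rules of thumb (roughly 1.3 tokens per English word and 250 words per printed page — both generic assumptions, not properties of this model's tokenizer):

```python
def tokens_to_pages(tokens, tokens_per_word=1.3, words_per_page=250):
    """Rough conversion from a token budget to printed pages.

    The conversion rates are rules of thumb for English prose,
    not measured properties of any particular tokenizer.
    """
    words = tokens / tokens_per_word
    return words / words_per_page

print(round(tokens_to_pages(512_000)))  # about 1,575 pages under these assumptions
```

That lands close to the article's 1,600-page comparison; the exact figure depends on the tokenizer and typesetting assumed.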
Another distinctive element is the introduction of a "thinking budget," which lets developers specify how much reasoning the model should perform before delivering an answer.
It's something we've seen from other recent open-source models as well, including Nvidia's new Nemotron-Nano-9B-v2, also available on Hugging Face.
In practice, this means teams can tune performance depending on task complexity and deployment efficiency requirements.
Recommended budgets are in multiples of 512 tokens, with 0 providing a direct-response mode.
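The budget mechanics aren't fully specified here, but a client-side helper that snaps a requested budget onto the recommended 512-token grid might look like the sketch below. The function name and the round-up behavior are illustrative assumptions, not the model's actual API:

```python
def snap_thinking_budget(requested: int, step: int = 512) -> int:
    """Snap a requested thinking budget to the recommended 512-token grid.

    0 means "direct response" (no visible reasoning); any positive
    request is rounded up to the next multiple of `step`, so the model
    gets at least as much reasoning room as was asked for.
    """
    if requested <= 0:
        return 0
    return ((requested + step - 1) // step) * step

# snap_thinking_budget(1000) -> 1024; snap_thinking_budget(0) -> 0
```

A team might call this before every request, passing 0 for latency-sensitive queries and a larger snapped value for complex reasoning tasks.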
Competitive performance on third-party benchmarks
Benchmarks published with the release place Seed-OSS-36B among the top-performing large open-source models. The Instruct variant, in particular, posts state-of-the-art results in multiple areas.
- Math and reasoning: Seed-OSS-36B-Instruct achieves 91.7 percent on AIME24 and 65 on BeyondAIME, both representing open-source state of the art.
- Coding: On LiveCodeBench v6, the Instruct model records 67.4, another SOTA score.
- Long-context handling: On RULER at 128K context length, it reaches 94.6, marking the highest open-source score reported.
- Base model performance: The synthetic-data variant delivers 65.1 on MMLU-Pro and 81.7 on MATH, both state-of-the-art results in their categories.
The no-synthetic-data base version, while slightly behind on many measures, proves competitive in its own right.
It outperforms its synthetic counterpart on GPQA-D, providing researchers with a cleaner, bias-free baseline for experimentation.
For enterprises comparing open options, these results suggest Seed-OSS offers strong potential across math, coding, and long-context workloads while still providing flexibility for research use cases.
Access and deployment
Beyond performance, the Seed Team highlights accessibility for developers and practitioners. The models can be deployed using Hugging Face Transformers, with quantization support in both 4-bit and 8-bit formats to reduce memory requirements.
They also integrate with vLLM for scalable serving, including configuration examples and API server instructions.
To lower barriers further, the team includes scripts for inference, prompt customization, and tool integration.
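As a minimal sketch of that Transformers deployment path — assuming the Hugging Face model ID `ByteDance-Seed/Seed-OSS-36B-Instruct` and an installed `bitsandbytes` package, both of which are assumptions here rather than details from this article — loading the Instruct model in 4-bit could look like this. Imports are deferred inside the function so the snippet can be read (and the helper defined) without triggering a multi-gigabyte download:

```python
def load_seed_oss_4bit(model_id: str = "ByteDance-Seed/Seed-OSS-36B-Instruct"):
    """Load the Instruct model in 4-bit precision to cut memory requirements.

    NOTE: the model ID above is an assumption for illustration; calling this
    downloads the full checkpoint and requires transformers + bitsandbytes.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant = BitsAndBytesConfig(load_in_4bit=True)  # or load_in_8bit=True
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant,
        device_map="auto",  # spread layers across available GPUs
    )
    return tokenizer, model
```

For serving at scale, the release also points to vLLM; the exact launch flags depend on the published instructions rather than anything shown here.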
For technical leads running small teams or working under budget constraints, these provisions make experimenting with a 36-billion-parameter model far more approachable.
Licensing and considerations for enterprise decision-makers
With the models offered under Apache-2.0, enterprises can adopt them without restrictive licensing terms, an important factor for teams balancing legal and operational concerns.
For decision-makers evaluating the open-source landscape, the release brings three key takeaways:
- State-of-the-art benchmark results across math, coding, and long-context reasoning.
- A balance between high-performance synthetic-data models and clean research baselines.
- Accessibility features that reduce operational overhead for lean engineering teams.
By pairing strong performance and flexible deployment with an open license, ByteDance's Seed Team has added compelling new options for enterprises, researchers, and developers alike.