Latam-GPT: Free, open, and collaborative artificial intelligence for Latin America


Latam-GPT is a new large language model being developed in Latin America. The project, led by Chile's nonprofit National Center for Artificial Intelligence (Cenia), aims to help the region achieve technological independence by developing an open-source AI model trained on Latin American languages and contexts.

“This work cannot be done by one group or only one country in Latin America: it is a challenge that requires everyone’s participation,” said Alvaro Soto, Cenia’s director, in an interview with Wired en Español. “Latam-GPT is a project that seeks to create an open, free, and, above all, collaborative AI model. We have worked for two years with a bottom-up process, bringing together citizens from different countries who want to cooperate. More recently, there have also been some top-down initiatives, with governments taking an interest and starting to participate in the project.”

The project stands out for its collaborative spirit. “We are not looking to compete with OpenAI, DeepSeek, or Google. We want a model for Latin America and the Caribbean that is aware of the cultural requirements and challenges this entails, such as understanding different accents, the history of the region, and its unique cultural aspects,” explains Soto.

Thanks to 33 strategic partnerships with institutions across Latin America and the Caribbean, the project has gathered a corpus of more than eight terabytes of text, equivalent to millions of books. This corpus has enabled the development of a language model with 50 billion parameters, a scale that makes it comparable to GPT-3.5 and gives it a medium-to-high capacity for complex tasks such as reasoning, translation, and associations.

Latam-GPT is trained on a regional database that gathers information from 20 Latin American countries and Spain, with a total of 255500 documents. The distribution of the data shows a strong concentration in the region's largest countries, with Brazil leading at 685,000 documents, followed by Mexico with 385,000, Spain with 325,000, Colombia with 220,000, and Argentina with 210,000. The numbers reflect the size of these markets, their digital development, and the availability of structured content.

“Initially, we will launch a language model. We expect its performance on general tasks to be close to that of large commercial models, but with superior performance on topics specific to Latin America. The idea is that if we ask it about topics relevant to our region, its knowledge will be much deeper.”

The first model is the starting point for developing a family of more advanced technologies in the future, including ones that handle images and video, and for scaling up to larger models. “Since this is an open project, we want other institutions to be able to use it. A group in Colombia could adapt it to the school education system, or it could be adapted to the health sector.”


