CecilIA is not only the name of a 19th-century Cuban novel. In mid-2025, it is also the name of an effort by a group of scientists, professors and students at the University of Havana to develop a Cuban artificial intelligence language model: designed in Cuba, trained on Cuban data and placed at the service of the country. In a global context dominated by technologies shaped by cultural biases, corporate interests and hegemonic languages, CecilIA emerges as a sovereign alternative with its own identity.
The model, initially trained on Cuban literary texts, the national press, political speeches and the Official Gazette, has already begun to show results. Its development is led by the Artificial Intelligence and Data Science Group of the Faculty of Mathematics and Computer Science (Matcom), and its ultimate goal is for AI in Cuba to speak and understand “in Cuban”.
At its second public presentation, held this time at the headquarters of the National Union of Jurists of Cuba, the team shared its technical advances but, above all, insisted on one fundamental point: without Cuban data, there is no truly Cuban AI. For this reason, the developers are inviting institutions, media outlets, jurists and artists to contribute documents, scripts, songs, news and legal texts to feed the model.
A collective project, from Cuba and for Cuba.
CecilIA belongs to what specialists call “small language models” (SLMs), which require fewer computing resources and are therefore well suited to developing countries. Starting from the Salamandra base model (which also covers Spanish), the Cuban team continued its pretraining on its own corpus, and is now designing Cuban-specific instructions to fine-tune how the model interacts and to better adapt its responses.
Dr. Yudivián Almeida, one of the project's leaders, announced that the team expects to build a corpus of at least 10,000 specific instructions, many of which can be openly proposed by anyone interested. The aim is a collaborative AI, rooted in the knowledge of the people.
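To give a sense of what an openly contributed instruction might look like in practice, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the field names (`instruction`, `response`, `contributor`, `domain`), the validation rules and the JSON Lines output are hypothetical, and the article does not describe the format the CecilIA team actually uses.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class InstructionRecord:
    """One contributed instruction/response pair (hypothetical schema)."""
    instruction: str
    response: str
    contributor: str = "anonymous"
    domain: str = "general"


def validate(record):
    """Return a list of problems; an empty list means the record is usable."""
    problems = []
    if not record.instruction.strip():
        problems.append("empty instruction")
    if not record.response.strip():
        problems.append("empty response")
    return problems


def deduplicate(records):
    """Keep the first record for each instruction, ignoring case and spacing."""
    seen = set()
    unique = []
    for record in records:
        key = " ".join(record.instruction.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def to_jsonl(records):
    """Serialize valid, de-duplicated records as one JSON object per line."""
    valid = [r for r in deduplicate(records) if not validate(r)]
    # ensure_ascii=False keeps accented Spanish characters readable in the file.
    return "\n".join(json.dumps(asdict(r), ensure_ascii=False) for r in valid)


# Illustrative contributions only; not taken from the real corpus.
contributions = [
    InstructionRecord(
        instruction="Explica qué significa la palabra 'guagua' en Cuba.",
        response="En Cuba, 'guagua' es la forma habitual de llamar al autobús.",
        domain="language",
    ),
    InstructionRecord(
        instruction="explica qué significa  la palabra 'guagua' en Cuba.",
        response="Un duplicado con otra grafía.",
    ),
    InstructionRecord(instruction="   ", response="Sin instrucción."),
]

jsonl = to_jsonl(contributions)
print(jsonl)
```

A flat format of this kind would let anyone propose entries with a text editor, while simple checks filter out empty or repeated submissions before human review.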
But the challenge is not only technical. The team stresses the importance of ethics, bias prevention and explainability, as well as the urgent need to digitize heritage documents. Libraries, publishers and archives still hold valuable information only on paper, which limits access to it and its use for AI training.
Beyond software development, the CecilIA project has a deep cultural dimension: protecting Cuban identity in a world where algorithms often ignore the nuances of the Global South. The possibility of creating AI applications that recognize the country's speech, concepts and cultural references represents a strategic tool for information sovereignty.
During the meeting in Havana, jurists, linguists, sociologists and scientists agreed that all disciplines need to contribute to projects of this kind, not only to guarantee technical quality, but also so that the final result is aligned with the social, cultural and ethical values of the nation.
CecilIA is an example of what can be achieved from a public university and a committed scientific system. But its success will depend on society as a whole understanding that digital transformation is not a luxury, but a national necessity. While the work continues, the model is already inspiring other teams across the country. As they said at the close of the presentation: “we grew in everything”.