A team of engineers, researchers, and a Silicon Valley chip company collaborated to create advanced Arabic language software that can power generative AI applications. The new large language model, Jais, has 13 billion parameters and was built from a large batch of data combining Arabic and English, some of which is computer code. The group took on the project in part because few large bilingual language models exist.
The new language model was developed using supercomputers manufactured by Silicon Valley-based Cerebras Systems, which designs dinner plate-sized chips that compete with Nvidia's powerful AI hardware. Nvidia's chips are in short supply, prompting businesses all over the world to look for alternatives.
Jais is a collaboration between Cerebras, the Mohamed bin Zayed University of Artificial Intelligence, and Inception, an AI-focused subsidiary of Abu Dhabi-based tech conglomerate G42.
Because there isn't enough Arabic data to train a model of Jais' size, the computer code within the English-language data helped train the model's reasoning ability, according to Timothy Baldwin, a professor at Mohamed bin Zayed University of Artificial Intelligence.
"(Code) gives the model a big leg up in terms of reasoning abilities, because it spells out the (logical) steps," Baldwin explained. Jais will be made available under an open source licence.