Apple Unveils OpenELM, Its Open-Source AI Language Model

Apple has released OpenELM, a new family of open-source large language models designed to be more efficient and accurate than existing models, while requiring less training data.

OpenELM uses a "layer-wise scaling strategy" that allocates parameters more efficiently across the layers of the transformer architecture: instead of repeating one fixed layer configuration, it varies the number of attention heads and the feed-forward dimensions from layer to layer. According to Apple's researchers, the 1.1-billion-parameter version of OpenELM achieves 2.36% higher accuracy than OLMo, a comparably sized open language model, while needing only half as many pre-training tokens.
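To make the idea concrete, here is a minimal sketch of layer-wise scaling. The linear interpolation and all parameter names below are illustrative assumptions for this article, not Apple's actual configuration; the real scaling functions are defined in the OpenELM paper and its released code.

```python
# Illustrative sketch of layer-wise scaling (not Apple's actual code).
# OpenELM varies per-layer width rather than repeating one fixed layer
# configuration; a simple linear schedule is assumed here for clarity.

def layer_wise_config(num_layers, min_heads, max_heads, min_ffn_mult, max_ffn_mult):
    """Return a hypothetical per-layer (attention heads, FFN multiplier) schedule."""
    configs = []
    for i in range(num_layers):
        t = i / max(num_layers - 1, 1)  # 0.0 at the first layer, 1.0 at the last
        heads = round(min_heads + t * (max_heads - min_heads))
        ffn_mult = min_ffn_mult + t * (max_ffn_mult - min_ffn_mult)
        configs.append((heads, ffn_mult))
    return configs

# Early layers get fewer heads and narrower FFNs, later layers more,
# so the fixed parameter budget is spent where it helps accuracy most.
for layer, (heads, ffn) in enumerate(layer_wise_config(8, 4, 16, 1.0, 4.0)):
    print(f"layer {layer}: {heads} heads, FFN multiplier {ffn:.2f}")
```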

Unlike the common practice of releasing only model weights and inference code, Apple has taken a more complete "open" approach, providing the full training and evaluation framework used to develop OpenELM. This includes training logs, multiple checkpoints, pre-training configurations, and code to run OpenELM on Apple's ML acceleration frameworks, such as MLX, for on-device inference.
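As a rough illustration of what on-device inference can look like with an MLX-based stack, here is a short sketch. The community `mlx-lm` package and the checkpoint name used below are assumptions for this example, not part of Apple's announcement; a checkpoint converted to an MLX-compatible format may be required.

```python
# Hypothetical example: generating text with an OpenELM checkpoint on
# Apple silicon via the community mlx-lm package (pip install mlx-lm).
# The repo ID "apple/OpenELM-270M-Instruct" is assumed for illustration.
from mlx_lm import load, generate

model, tokenizer = load("apple/OpenELM-270M-Instruct")

prompt = "Summarize the benefits of on-device language models:"
text = generate(model, tokenizer, prompt=prompt, max_tokens=128)
print(text)
```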

The release aims to "empower and strengthen the open research community," according to the Apple researchers. OpenELM was pre-trained on publicly available datasets, including subsets of RedPajama and Dolma, totaling around 1.8 trillion tokens.

Apple is releasing eight versions of OpenELM in total: four pre-trained and four instruction-tuned models, in sizes of 270 million, 450 million, 1.1 billion, and 3 billion parameters. The models can be used for text generation tasks such as question answering, summarization, and analysis.

About the author

Kelvin Maina

Kelvin Maina is a dedicated content creator. He holds a BSc in Computer Science and has worked as a financial research analyst for companies such as Investingcube.com and cryptopolitan.com. At Shortfi, he mostly focuses on the latest technologies, gadgets, and technology companies advancing humanity through innovation.
