Bookmark: Eagle 7B : Soaring past Transformers with 1 Trillion Tokens Across 100+ Languages (RWKV-v5)

lqdev👽01/29/2024

https://blog.rwkv.com/p/eagle-7b-soaring-past-transformers

Eagle 7B is a 7.52B parameter model that:

Is built on the RWKV-v5 architecture (a linear transformer with 10-100x+ lower inference cost)

Ranks as the world’s greenest 7B model (per token)

Was trained on 1.1 Trillion Tokens across 100+ languages

Outperforms all 7B class models in multi-lingual benchmarks

Approaches Falcon (1.5T), LLaMA2 (2T), Mistral (>2T?) level of performance in English evals

Trades blows with MPT-7B (1T) in English evals

All while being an “Attention-Free Transformer”

Is a foundation model, with a very small instruct tune - further fine-tuning is required for various use cases!
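To make the "linear transformer" and "attention-free" points a bit more concrete, here is a minimal sketch of a generic linear-attention recurrence in plain Python. It is not the RWKV-v5 formulation from the linked post; the function names, the exponential feature map, and the toy dimensions are all assumptions chosen for illustration. The idea it shows is that each new token only updates a fixed-size state, so the per-token cost does not grow with the number of tokens already seen.

```python
import math


def feature_map(vector):
    # Positive feature map (an assumption for this sketch, not RWKV-v5's choice).
    return [math.exp(x) for x in vector]


def linear_attention_step(state, norm, query, key, value):
    """Process one token with a generic linear-attention recurrence.

    state: d_k x d_v matrix, running sum of outer(feature_map(key), value)
    norm:  d_k vector, running sum of feature_map(key)
    Both are updated in place and never grow with sequence length.
    """
    fk = feature_map(key)
    fq = feature_map(query)

    # Fold the new token into the fixed-size state.
    for i, k_i in enumerate(fk):
        norm[i] += k_i
        for j, v_j in enumerate(value):
            state[i][j] += k_i * v_j

    # Read out a normalized, weighted average of all values seen so far.
    denom = sum(q_i * n_i for q_i, n_i in zip(fq, norm))
    return [
        sum(fq[i] * state[i][j] for i in range(len(fq))) / denom
        for j in range(len(value))
    ]


# Toy usage: key/query dimension 2, value dimension 2.
state = [[0.0, 0.0], [0.0, 0.0]]
norm = [0.0, 0.0]
first = linear_attention_step(state, norm, [0.1, 0.2], [0.3, -0.1], [1.0, 2.0])
second = linear_attention_step(state, norm, [0.5, -0.2], [0.0, 0.4], [3.0, 4.0])
```

In standard softmax attention, each decoding step would instead revisit every stored key and value, so the work per token grows with context length. The sketch above keeps only `state` and `norm`, whose sizes depend on the model dimensions alone.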

We are releasing RWKV-v5 Eagle 7B under the Apache 2.0 license, under the Linux Foundation. It can be used personally or commercially without restrictions.

Download from HuggingFace

Permalink: /feed/eagle-7b-rkwv/

Tags: #ai #llm #rwkv #deeplearning #neuralnetwork
