https://www.inceptionlabs.ai/news
We are announcing the Mercury family of diffusion large language models (dLLMs), a new generation of LLMs that push the frontier of fast, high-quality text generation.
Mercury is up to 10x faster than frontier speed-optimized LLMs. Our models run at over 1000 tokens/sec on NVIDIA H100s, a speed previously possible only using custom chips.
Back to feed