| # | Model | Description | Sizes | Pulls | Tags | Updated |
|---|-------|-------------|-------|-------|------|---------|
| 0 | llama3.1 | Llama 3.1 is a new state-of-the-art model from Meta, available in 8B, 70B, and 405B parameter sizes. | 8B, 70B, 405B | 5.3M | 94 | 6 days ago |
| 1 | gemma2 | Google Gemma 2 is a high-performing and efficient model available in three sizes: 2B, 9B, and 27B. | 2B, 9B, 27B | 1.2M | 94 | 6 days ago |
| 2 | qwen2.5 | Qwen2.5 models are pretrained on Alibaba's latest large-scale dataset, encompassing up to 18 trillion tokens. | 0.5B, 1.5B, 3B, 7B, 14B, 32B, 72B | 289.1K | 133 | 3 days ago |
| 3 | phi3.5 | A lightweight AI model with 3.8 billion parameters, with performance surpassing similarly sized and larger models. | 3B | 53.9K | 17 | 4 weeks ago |
| 4 | nemotron-mini | A commercial-friendly small language model by NVIDIA optimized for roleplay, RAG QA, and function calling. | — | 7,650 | 17 | 2 days ago |
| 5 | mistral-small | Mistral Small is a lightweight model designed for cost-effective use in tasks like translation and summarization. | 22B | 5,816 | 17 | 4 days ago |
| 6 | mistral-nemo | A state-of-the-art 12B model with 128k context length, built by Mistral AI in collaboration with NVIDIA. | 12B | 202.2K | 17 | 4 hours ago |
| 7 | deepseek-coder-v2 | An open-source Mixture-of-Experts code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks. | 16B, 236B | 294.4K | 65 | 3 months ago |
| 8 | mistral | The 7B model released by Mistral AI, updated to version 0.3. | 7B | 3.4M | 84 | 4 months ago |
| 9 | mixtral | A set of Mixture-of-Experts (MoE) models with open weights by Mistral AI, in 8x7B and 8x22B parameter sizes. | 8x7B, 8x22B | 415.8K | 69 | 5 months ago |
| 10 | codegemma | CodeGemma is a collection of powerful, lightweight models that can perform a variety of coding tasks, including fill-in-the-middle code completion, code generation, natural language understanding, mathematical reasoning, and instruction following. | 2B, 7B | 282.5K | 85 | 5 months ago |
| 11 | command-r | Command R is a large language model optimized for conversational interaction and long-context tasks. | 35B | 203.5K | 32 | 3 weeks ago |
| 12 | command-r-plus | Command R+ is a powerful, scalable large language model purpose-built to excel at real-world enterprise use cases. | 104B | 95.8K | 21 | 3 weeks ago |
| 13 | llava | 🌋 LLaVA is a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding. | 7B, 13B, 34B | 1.1M | 98 | 7 months ago |
| 14 | llama3 | Meta Llama 3: the most capable openly available LLM to date. | 8B, 70B | 6.2M | 68 | 4 months ago |
| 15 | gemma | Gemma is a family of lightweight, state-of-the-art open models built by Google DeepMind. | 2B, 7B | 4.1M | 102 | 5 months ago |
| 16 | qwen | Qwen 1.5 is a series of large language models by Alibaba Cloud spanning from 0.5B to 110B parameters. | 0.5B, 1.8B, 4B, 32B, 72B, 110B | 4M | 379 | 3 months ago |
| 17 | qwen2 | Qwen2 is a new series of large language models from the Alibaba group. | 0.5B, 1.5B, 7B, 72B | 3.7M | 97 | 3 months ago |
| 18 | llama2 | Llama 2 is a collection of foundation language models ranging from 7B to 70B parameters. | 7B, 13B, 70B | 2.1M | 102 | 7 months ago |