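Llama-3-Smaug-8B-GGUF - GGUF files for the Llama-3-Smaug-8B model at multiple quantization levels

Llama-3-Smaug-8B-GGUF project overview

Llama-3-Smaug-8B-GGUF is a project by MaziyarPanahi that quantizes abacusai's Llama-3-Smaug-8B model. It provides GGUF-format files for the original Llama-3-Smaug-8B model so that the model can run on a wider range of platforms and devices.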

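Features

Quantized versions: the project provides 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and 8-bit quantizations to suit different hardware and performance requirements.
GGUF format: the files use GGUF, a format introduced by the llama.cpp team on August 21, 2023 as a replacement for GGML, which llama.cpp no longer supports.
Text generation: the model is intended primarily for text generation and can be used in a variety of natural language processing applications.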
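
To give a feel for how the quantization level trades file size against precision, the rough arithmetic is size ≈ parameters × bits per weight / 8. The sketch below applies it to the bit widths listed above. The figure of 8 billion parameters is a rounded assumption, and real GGUF files are typically somewhat larger than these estimates because quantized formats also store scaling data and may keep some tensors at higher precision.

```python
# Back-of-the-envelope size estimate; not a measurement of the actual files.
N_PARAMS = 8e9  # assumption: roughly 8 billion weights for an "8B" model


def estimate_size_gb(n_params, bits_per_weight):
    """Return the approximate weight storage in gigabytes (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    for bits in (2, 3, 4, 5, 6, 8):
        print(f"{bits}-bit: about {estimate_size_gb(N_PARAMS, bits):.1f} GB")
```

A lower bit width fits on smaller devices at the cost of output quality, so a reasonable approach is to pick the largest file that fits comfortably in the available memory.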

Usage

To use the Llama-3-Smaug-8B-GGUF model, you must follow the prompt template provided by Llama-3. The following is a command-line example using llama.cpp:

./llama.cpp/main -m Llama-3-Smaug-8B.Q2_K.gguf -r '<|eot_id|>' --in-prefix "\n<|start_header_id|>user<|end_header_id|>\n\n" --in-suffix "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n" -p "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are a helpful, smart, kind, and efficient AI assistant. You always fulfill the user's requests to the best of your ability.<|eot_id|>\n<|start_header_id|>user<|end_header_id|>\n\nHi! How are you?<|eot_id|>\n<|start_header_id|>assistant<|end_header_id|>\n\n" -n 1024

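This command sets the format of the system prompt, the user input, and the assistant reply so that the model can correctly interpret and respond to the input. The -p option supplies the initial prompt, --in-prefix and --in-suffix wrap each later user turn in the same template, -r '<|eot_id|>' hands control back to the user when the model emits the end-of-turn token, and -n 1024 limits the number of generated tokens.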
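
When calling the model from code rather than the shell, the same prompt string has to be assembled by hand. The following is a minimal sketch that reproduces the layout of the -p argument in the command above from a list of messages. The function name build_prompt and the message dictionary shape are illustrative assumptions, not part of any library.

```python
def build_prompt(messages):
    """Format chat messages following the template used in the command above.

    `messages` is a list of {"role": ..., "content": ...} dicts
    (a hypothetical shape chosen for this sketch).
    """
    turns = [
        f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
        for m in messages
    ]
    # End with an open assistant header so the model writes the reply next.
    return (
        "<|begin_of_text|>"
        + "\n".join(turns)
        + "\n<|start_header_id|>assistant<|end_header_id|>\n\n"
    )


if __name__ == "__main__":
    prompt = build_prompt([
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hi! How are you?"},
    ])
    print(prompt)
```

Here the \n sequences are real newline characters, whereas in the shell command they are written as escapes inside the quoted argument.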

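Tools and libraries that support GGUF

GGUF is widely supported. The following clients and libraries can load GGUF files:

llama.cpp: the source project for GGUF; offers a CLI and a server option.
text-generation-webui: a feature-rich web UI with GPU acceleration.
KoboldCpp: a fully featured web UI, well suited to storytelling.
GPT4All: a free, open-source GUI that runs locally on Windows, Linux, and macOS.
LM Studio: an easy-to-use, powerful local GUI for Windows and macOS.
LoLLMS Web UI: a web UI with distinctive features, including a full model library.
Faraday.dev: a character-based chat GUI for Windows and macOS.
llama-cpp-python: a Python library with GPU acceleration and LangChain support.
candle: a Rust ML framework focused on performance and ease of use.
ctransformers: a Python library with GPU acceleration and an OpenAI-compatible AI server.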

These tools and libraries give users several ways to run the Llama-3-Smaug-8B-GGUF model, from command-line interfaces to graphical ones and from local use to server deployment.
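
Because these tools expect GGUF rather than the older GGML files, it can be useful to confirm that a downloaded file really is GGUF before loading it. The sketch below relies on the GGUF header layout: the file begins with the four ASCII bytes "GGUF", followed by a 32-bit version number, assumed here to be little-endian as in standard files. The function names are illustrative.

```python
import struct

GGUF_MAGIC = b"GGUF"  # the first four bytes of every GGUF file


def parse_gguf_header(header):
    """Return the GGUF version from the first 8 bytes, or None if not GGUF."""
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return None
    # Assumption: little-endian uint32 version, as in standard GGUF files.
    (version,) = struct.unpack("<I", header[4:8])
    return version


def read_gguf_version(path):
    """Read only the header of the file at `path` and return its GGUF version."""
    with open(path, "rb") as f:
        return parse_gguf_header(f.read(8))
```

For example, read_gguf_version("Llama-3-Smaug-8B.Q2_K.gguf") should return an integer for a valid file and None for a truncated download or a file in another format.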