Llama3.1-70B-Chinese-Chat

Project Overview: Llama3.1-70B-Chinese-Chat

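Llama3.1-70B-Chinese-Chat is an instruction-tuned language model for Chinese and English users, with capabilities such as role-playing and tool use. It is built on Meta-Llama-3.1-70B-Instruct and was developed by Shenzhi Wang, Yaowei Zheng, Guoyin Wang, Shiji Song, and Gao Huang.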

Model Summary

Llama3.1-70B-Chinese-Chat is obtained by instruction-tuning Meta-Llama-3.1-70B-Instruct so that it interacts well with Chinese and English users. Key details:

License: Llama-3.1 License
Base model: Meta-Llama-3.1-70B-Instruct
Model size: 70.6B parameters
Context length: 128K (as reported for Meta-Llama-3.1-70B-Instruct; not yet tested on this Chinese model)

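Background

The model is presented as the first one fine-tuned specifically for Chinese and English users on the basis of Meta-Llama-3.1-70B-Instruct. Fine-tuning uses the ORPO algorithm [1], which performs preference optimization without requiring a separate reference model.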
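To make the reference-free idea concrete, here is a minimal sketch of the ORPO objective for a single preference pair, following the formulation in the ORPO paper: a standard supervised loss on the chosen response plus a weighted odds-ratio term. The function names and the example log-probabilities are hypothetical, and the inputs are assumed to be length-averaged token log-probabilities under the model being trained. This is an illustration, not the authors' training code.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp), with 0 < p < 1."""
    # log1p(-exp(x)) computes log(1 - p) in a numerically stable way.
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(avg_logp_chosen: float, avg_logp_rejected: float,
              beta: float = 0.05) -> float:
    """ORPO loss for one (chosen, rejected) pair.

    beta weights the odds-ratio term; 0.05 is the value listed in the
    training details. No reference model appears anywhere in the loss.
    """
    # Supervised fine-tuning term: negative log-likelihood of the chosen answer.
    sft_term = -avg_logp_chosen
    # Log odds ratio between the chosen and the rejected answer.
    log_ratio = log_odds(avg_logp_chosen) - log_odds(avg_logp_rejected)
    # -log(sigmoid(x)) written as log(1 + exp(-x)).
    odds_ratio_term = math.log1p(math.exp(-log_ratio))
    return sft_term + beta * odds_ratio_term


# Hypothetical values: the chosen answer is more likely than the rejected one.
print(orpo_loss(avg_logp_chosen=-0.5, avg_logp_rejected=-2.0))
```

Because both terms depend only on the policy being trained, the loss can be computed in a single forward pass per response, which is what distinguishes it from methods that compare against a frozen reference model.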

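Training Details

Epochs: 3
Learning rate: 1.5e-6
Learning rate scheduler type: cosine
Warmup ratio: 0.1
Cutoff length: 8192
ORPO beta (λ in the ORPO paper): 0.05
Global batch size: 128
Fine-tuning type: full parameters
Optimizer: paged_adamw_32bit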
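As a rough illustration of how the learning rate, warmup ratio, and cosine scheduler interact, the sketch below computes the learning rate at a given optimizer step, assuming linear warmup from zero followed by cosine decay to zero (the common convention for this scheduler type). The total step count is a hypothetical value, since the dataset size is not stated here, and the exact rounding of the warmup length may differ from the training framework actually used.

```python
import math

PEAK_LR = 1.5e-6      # learning rate from the list above
WARMUP_RATIO = 0.1    # fraction of all steps spent warming up


def lr_at(step: int, total_steps: int,
          peak_lr: float = PEAK_LR,
          warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at `step` under linear warmup plus cosine decay."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed so far, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine decay from the peak down to 0.
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Hypothetical run length of 1000 optimizer steps.
for s in (0, 50, 100, 550, 1000):
    print(s, lr_at(s, total_steps=1000))
```

With these settings the first 10% of steps ramp the rate up to 1.5e-6, after which it decays smoothly for the remainder of the three epochs.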

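Usage

1. Using the BF16 model

Upgrade the transformers package to version 4.43.0 or later, which is required to support this model. The model can then be downloaded with the following Python script: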

from huggingface_hub import snapshot_download
snapshot_download(repo_id="shenzhi-wang/Llama3.1-70B-Chinese-Chat", ignore_patterns=["*.gguf"])

Once the download has finished, run inference with the following code:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "/Your/Local/Path/to/Llama3.1-70B-Chinese-Chat"
dtype = torch.bfloat16

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" spreads the weights across all visible GPUs (this needs the
# accelerate package); a 70B model in BF16 rarely fits on a single device.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=dtype,
)

# The prompt asks: "Write a poem about machine learning."
chat = [
    {"role": "user", "content": "写一首关于机器学习的诗。"},
]
# Apply the chat template and append the header that starts the assistant turn.
input_ids = tokenizer.apply_chat_template(
    chat, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=8192,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
# Drop the prompt tokens so that only the newly generated reply is decoded.
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
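Before loading the BF16 weights it helps to estimate how much accelerator memory they need. The sketch below is a back-of-the-envelope calculation that assumes roughly 70 billion parameters, as the model name suggests, and 2 bytes per BF16 parameter. The helper name is hypothetical, and the estimate covers the weights only; the KV cache and activations need additional memory.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed for the model weights, in GiB."""
    return num_params * bytes_per_param / 1024**3


# Assumed parameter count of about 70 billion, stored in BF16 (2 bytes each).
print(f"BF16 weights: about {weight_memory_gib(70e9):.0f} GiB")
# The same weights in FP32 (4 bytes each) would need twice as much.
print(f"FP32 weights: about {weight_memory_gib(70e9, bytes_per_param=4):.0f} GiB")
```

An estimate of this size indicates that the BF16 model generally has to be split across several GPUs or partially offloaded, which is one reason the quantized GGUF files may be more practical on smaller machines.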

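2. Using the GGUF models

Download the GGUF models from the gguf_models folder.
Run the GGUF models with LM Studio.
Alternatively, follow the instructions in ggerganov/llama.cpp to run the GGUF models.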

Citation

If this model is helpful to you, please cite it as follows:

@misc {shenzhi_wang_2024,
	author       = { Wang, Shenzhi and Zheng, Yaowei and Wang, Guoyin and Song, Shiji and Huang, Gao },
	title        = { Llama3.1-70B-Chinese-Chat },
	year         = 2024,
	url          = { https://huggingface.co/shenzhi-wang/Llama3.1-70B-Chinese-Chat },
	doi          = { 10.57967/hf/2780 },
	publisher    = { Hugging Face }
}

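Llama3.1-70B-Chinese-Chat aims to give its users strong language generation and interaction capabilities in both Chinese and English, and its release supports the wider use of language models in multilingual settings.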