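Project Introduction: LongCite-llama3.1-8b
LongCite-llama3.1-8b is a language model trained from Meta-Llama-3.1-8B. It is designed to generate fine-grained citations in long-context question answering, and it supports a context window of up to 128K tokens, which lets it work over long documents.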
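Background and Dataset
The project uses the LongCite-45k dataset released by THUDM (Tsinghua University's data mining group). LongCite-45k is a question-answering dataset with a large number of examples, built for training and evaluating citation generation over complex texts.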
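Model Implementation and Runtime Environment
LongCite-llama3.1-8b is implemented with the transformers library and requires transformers>=4.43.0. The model can be loaded and called from Python. The following is a brief usage example: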
import json
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
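# trust_remote_code=True is required: query_longcite is not part of transformers itself and is loaded from the model repository.
tokenizer = AutoTokenizer.from_pretrained('THUDM/LongCite-llama3.1-8b', trust_remote_code=True)
# device_map='auto' places the weights on the available devices (this relies on the accelerate package).
model = AutoModelForCausalLM.from_pretrained('THUDM/LongCite-llama3.1-8b', torch_dtype=torch.bfloat16, trust_remote_code=True, device_map='auto')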
context = '''
W. Russell Todd, 94, United States Army general (b. 1928). February 13. Tim Aymar, 59, heavy metal singer (Pharaoh) (b. 1963). Marshall \"Eddie\" Conway, 76, Black Panther Party leader (b. 1946). Roger Bonk, 78, football player (North Dakota Fighting Sioux, Winnipeg Blue Bombers) (b. 1944). Conrad Dobler, 72, football player (St. Louis Cardinals, New Orleans Saints, Buffalo Bills) (b. 1950). Brian DuBois, 55, baseball player (Detroit Tigers) (b. 1967). Robert Geddes, 99, architect, dean of the Princeton University School of Architecture (1965–1982) (b. 1923). Tom Luddy, 79, film producer (Barfly, The Secret Garden), co-founder of the Telluride Film Festival (b. 1943). David Singmaster, 84, mathematician (b. 1938).
'''
query = "What was Robert Geddes' profession?"
result = model.query_longcite(context, query, tokenizer=tokenizer, max_input_length=128000, max_new_tokens=1024)
print("Answer:\n{}\n".format(result['answer']))
print("Statement with citations:\n{}\n".format(
    json.dumps(result['statements_with_citations'], indent=2, ensure_ascii=False)))
print("Context (divided into sentences):\n{}\n".format(result['splited_context']))
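The example above prints the citations as raw JSON. A minimal sketch of turning them into a compact, readable form follows. It assumes that each entry in result['statements_with_citations'] is a dict with a "statement" string and a "citation" list whose items carry "start_sentence_idx" and "end_sentence_idx"; these field names are an assumption about the output layout, not something stated in this document, so check them against the JSON printed above before relying on them. The helper format_citations and the mock_result data are hypothetical and exist only so the sketch runs without loading the model.

```python
from typing import Any, Dict, List


def format_citations(result: Dict[str, Any]) -> List[str]:
    """Render each statement followed by the sentence ranges cited for it.

    The field names used here ("statement", "citation", "start_sentence_idx",
    "end_sentence_idx") are assumed, not confirmed by the model card.
    """
    lines = []
    for item in result.get("statements_with_citations", []):
        statement = item.get("statement", "").strip()
        spans = []
        for cite in item.get("citation", []):
            start = cite.get("start_sentence_idx")
            end = cite.get("end_sentence_idx")
            # A single-sentence citation is shown as [i], a range as [i-j].
            spans.append(f"[{start}]" if start == end else f"[{start}-{end}]")
        marker = "".join(spans) if spans else "(no citation)"
        lines.append(f"{statement} {marker}")
    return lines


# Hand-written stand-in for the model output (not real model output).
mock_result = {
    "answer": "Robert Geddes was an architect.",
    "statements_with_citations": [
        {
            "statement": "Robert Geddes was an architect.",
            "citation": [
                {"start_sentence_idx": 5, "end_sentence_idx": 5},
            ],
        }
    ],
}

for line in format_citations(mock_result):
    print(line)
```

With a real model, the same function could be called as format_citations(result) after the query_longcite call, provided the assumed field names match the actual output.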
Other Implementation Options
Besides deploying through the transformers library, the model can also be run with vllm. The file vllm_inference.py contains example code for reference.
License
Use of LongCite-llama3.1-8b is governed by the Llama-3.1 license.
Citation
If this project helps your research, please cite:
@article{zhang2024longcite,
title={LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA},
author={Jiajie Zhang and Yushi Bai and Xin Lv and Wanjun Gu and Danqing Liu and Minhao Zou and Shulin Cao and Lei Hou and Yuxiao Dong and Ling Feng and Juanzi Li},
journal={arXiv preprint arXiv:2409.02897},
year={2024}
}
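With this project, users can answer questions over long documents more accurately and efficiently, with each statement in the answer backed by citations to the source text.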