llama_cpp-rs

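Safe, high-level Rust bindings to the C++ project of the same name, designed to be as user-friendly as possible. Run GGUF-based large language models directly on your CPU in fifteen lines of code, no machine learning experience required!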

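// Import paths assume a recent release of the `llama_cpp` crate.
use std::io::{self, Write};

use llama_cpp::standard_sampler::StandardSampler;
use llama_cpp::{LlamaModel, LlamaParams, SessionParams};

// Create a model from anything that implements `AsRef<Path>`:
let model = LlamaModel::load_from_file("path_to_model.gguf", LlamaParams::default()).expect("Could not load model");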

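// A `LlamaModel` holds the weights shared across many sessions; while your model may be
// several gigabytes large, a session is typically a few dozen to a hundred megabytes!
let mut ctx = model.create_session(SessionParams::default()).expect("Failed to create session");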

// You can feed anything that implements `AsRef<[u8]>` into the model's context.
ctx.advance_context("This is the story of a man named Stanley.").unwrap();

// LLMs are typically used to predict the next word in a sequence. Let's generate some tokens!
let max_tokens = 1024;
let mut decoded_tokens = 0;

// `ctx.start_completing_with` creates a worker thread that generates tokens. When the completion
// handle is dropped, token generation stops!
let completions = ctx.start_completing_with(StandardSampler::default(), 1024).into_strings();

for completion in completions {
    print!("{completion}");
    let _ = io::stdout().flush();
    
    decoded_tokens += 1;
    
    if decoded_tokens >= max_tokens {
        break;
    }
}
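
The comment about the worker thread describes a common pattern: a background producer that stops once its consumer goes away. The sketch below illustrates that pattern using only the standard library. It is not how `llama_cpp` is implemented; `CompletionHandle`, `start_completing`, and `complete_capped` are hypothetical names invented for this illustration, and the "tokens" are fake strings rather than model output.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Hypothetical stand-in for a completion handle: iterating yields tokens,
/// and dropping it closes the channel, which tells the worker to stop.
pub struct CompletionHandle {
    rx: Receiver<String>,
}

impl Iterator for CompletionHandle {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        // `recv` returns `Err` once the worker has finished and hung up.
        self.rx.recv().ok()
    }
}

/// Spawns a worker that "generates" up to `max_predictions` fake tokens.
pub fn start_completing(max_predictions: usize) -> CompletionHandle {
    // A bounded channel keeps the worker from running far ahead of the consumer.
    let (tx, rx) = sync_channel(1);

    thread::spawn(move || {
        for i in 0..max_predictions {
            // `send` fails once the receiver (the handle) has been dropped,
            // so the worker exits instead of producing tokens nobody reads.
            if tx.send(format!("tok{i} ")).is_err() {
                break;
            }
        }
    });

    CompletionHandle { rx }
}

/// Collects at most `limit` tokens; the handle is dropped on return.
pub fn complete_capped(max_predictions: usize, limit: usize) -> String {
    start_completing(max_predictions).take(limit).collect()
}
```

Because the handle is an ordinary iterator, a token cap can be expressed with `take` instead of a manual counter, and breaking out of a `for` loop early has the same effect: the handle is dropped and the worker winds down on its next `send`.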

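This repository hosts the high-level bindings (crates/llama_cpp) as well as automatically generated bindings to llama.cpp's low-level C API (crates/llama_cpp_sys). Contributions are welcome; just keep the UX clean!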