Starwhale: An Open-Source Platform Reshaping MLOps and LLMOps

Starwhale: An Open-Source Platform Reshaping Machine Learning Operations

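Managing machine learning projects, models, and datasets efficiently has become a major challenge as AI development accelerates. Starwhale is an MLOps/LLMOps platform built to address it: it brings efficiency and standardization to machine learning operations across the model development lifecycle.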

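Core strengths of Starwhale

Starwhale is designed to simplify and streamline the key stages of a machine learning workflow: model building, evaluation, release, and fine-tuning. It offers three deployment configurations to suit different needs:

🐥 Standalone: runs in a local development environment and is managed through the swcli command-line tool, covering development and debugging needs.
🦅 Server: deployed in a private data center on a Kubernetes cluster, providing a centralized, web-based, secure service.
🦉 Cloud: hosted on a public cloud at https://cloud.starwhale.cn and maintained by the Starwhale team; it is ready to use after registration.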

Starwhale product architecture

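At its core, Starwhale treats Model, Runtime, and Dataset as first-class citizens, which forms the basis for streamlined operations. It also provides features tailored to common workflow scenarios:

🔥 Model evaluation: run robust, production-scale evaluations with minimal code through the Python SDK.
🌟 Live demo: evaluate model performance interactively through a user-friendly web interface.
🌊 LLM fine-tuning: an end-to-end toolchain covering efficient fine-tuning, comparative benchmarking, and publishing.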

Key concepts in Starwhale

🐘 Starwhale Dataset

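Starwhale Dataset provides efficient data storage, loading, and visualization, and is a data management tool built specifically for machine learning and deep learning. It supports multiple data formats and integrates with commonly used machine learning frameworks.

Starwhale Dataset overview

With Starwhale Dataset you can create, manage, and share datasets: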

import torch
from starwhale import dataset, Image

# Build a dataset on the Starwhale Cloud instance
with dataset("https://cloud.starwhale.cn/project/starwhale:public/dataset/test-image", create="empty") as ds:
    for i in range(100):
        # Each row pairs an image file with its integer label
        ds.append({"image": Image(f"{i}.png"), "label": i})
    ds.commit()

# Load the dataset and inspect the first row
ds = dataset("https://cloud.starwhale.cn/project/starwhale:public/dataset/test-image")
print(len(ds))
print(ds[0].features.image.to_pil())
print(ds[0].features.label)

# Convert to a PyTorch dataset and read one batch of five rows
torch_ds = ds.to_pytorch()
torch_loader = torch.utils.data.DataLoader(torch_ds, batch_size=5)
print(next(iter(torch_loader)))
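
To make the batching step above concrete, here is a minimal, framework-free sketch of what grouping rows into batches of five looks like. It assumes each row is a plain dict with `image` and `label` keys, mirroring the records appended above. The helper names `make_rows` and `batch_rows` are hypothetical, and plain Python lists stand in for the tensors a real `DataLoader` would return.

```python
from typing import Dict, Iterator, List


def make_rows(count: int) -> List[Dict[str, object]]:
    """Build stand-in rows shaped like the records appended to the dataset."""
    return [{"image": f"{i}.png", "label": i} for i in range(count)]


def batch_rows(
    rows: List[Dict[str, object]], batch_size: int
) -> Iterator[Dict[str, list]]:
    """Yield batches as dicts of lists, one list per feature name."""
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        # Collate: turn a list of row dicts into one dict of per-feature lists.
        yield {key: [row[key] for row in chunk] for key in chunk[0]}


batches = list(batch_rows(make_rows(100), batch_size=5))
first = batches[0]
print(len(batches), first["label"])
```

With 100 rows and a batch size of 5, the sketch yields 20 batches, and the first batch carries labels 0 through 4.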

🐇 Starwhale Model

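Starwhale Model is a standard packaging format for machine learning models that serves multiple purposes, including fine-tuning, evaluation, and online serving. A Starwhale Model contains the model file, inference code, configuration files, and any other files needed to run the model.

Starwhale Model overview

With Starwhale Model you can build, copy, and run models: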

# Build a model
swcli model build . --module mnist.evaluate --runtime pytorch/version/v1 --name mnist

# Copy the model from Standalone to Cloud
swcli model cp mnist https://cloud.starwhale.cn/project/starwhale:public

# Run the model
swcli model run --uri mnist --runtime pytorch --dataset mnist
swcli model run --workdir . --module mnist.evaluator --handler mnist.evaluator:MNISTInference.cmp

🐌 Starwhale Runtime

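Starwhale Runtime aims to provide a reproducible, shareable runtime environment for Python programs. Users can share their working environment with teammates or outside collaborators, and reuse environments shared by others. Programs can also run on Starwhale Server or Starwhale Cloud without worrying about dependencies.

Starwhale Runtime overview

Starwhale Runtime can be built in several ways and integrates with models and datasets: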

# Build from runtime.yaml, a conda environment, a Docker image, or the current shell
swcli runtime build --yaml runtime.yaml
swcli runtime build --conda pytorch --name pytorch-runtime --cuda 11.4
swcli runtime build --docker pytorch/pytorch:1.9.0-cuda11.1-cudnn8-runtime
swcli runtime build --shell --name pytorch-runtime

# Activate a runtime
swcli runtime activate pytorch

# Use a runtime with models and datasets
swcli model run --uri test --runtime pytorch
swcli model build . --runtime pytorch
swcli dataset build --runtime pytorch
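
Reproducibility comes down to recording exactly which interpreter and package versions a program ran with. As a rough illustration of that idea (not Starwhale's own runtime format or implementation), the following standard-library sketch captures such a snapshot for the current Python environment. The function name `snapshot_environment` and the dict layout are assumptions made for this example.

```python
import sys
from importlib import metadata
from typing import Dict, List, Union


def snapshot_environment() -> Dict[str, Union[str, List[str]]]:
    """Record the interpreter version and pinned package versions."""
    pins = set()
    for dist in metadata.distributions():
        try:
            name = dist.metadata["Name"]
        except Exception:
            # Unreadable metadata: skip this distribution.
            continue
        if name:
            pins.add(f"{name}=={dist.version}")
    python_version = ".".join(str(part) for part in sys.version_info[:3])
    # Sort case-insensitively so the snapshot is stable across runs.
    return {"python": python_version, "packages": sorted(pins, key=str.lower)}


snapshot = snapshot_environment()
print(snapshot["python"], len(snapshot["packages"]))
```

Comparing two such snapshots shows at a glance where two machines differ, which is the kind of drift a shared runtime is meant to remove.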

🐄 Starwhale Evaluation

Starwhale Evaluation lets users evaluate sophisticated, production-grade distributed models with just a few lines of code written against the Starwhale Python SDK.

import typing as t
import gradio
from starwhale import evaluation, Image
from starwhale.api.service import api

def model_generate(image):
    # Placeholder: run inference and produce the two values returned below
    ...
    return predict_value, probability_matrix

# Prediction step: request one GPU and run four replicas
@evaluation.predict(
    resources={"nvidia.com/gpu": 1},
    replicas=4,
)
def predict_image(data: dict, external: dict) -> t.Any:
    return model_generate(data["image"])

# Evaluation step: consumes the logged predictions and records a summary
@evaluation.evaluate(use_predict_auto_log=True, needs=[predict_image])
def evaluate_results(predict_result_iter: t.Iterator):
    for _data in predict_result_iter:
        ...
    evaluation.log_summary({"accuracy": 0.95, "benchmark": "test"})

# Web handler for the interactive demo
@api(gradio.File(), gradio.Label())
def predict_view(file: t.Any) -> t.Any:
    with open(file.name, "rb") as f:
        data = Image(f.read(), shape=(28, 28, 1))
    # Assumption: an empty dict is acceptable for the unused `external` argument
    _, prob = predict_image({"image": data}, {})
    return {i: p for i, p in enumerate(prob)}
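
The body of `evaluate_results` is elided above. As a rough illustration of what such a step might compute, the sketch below derives an accuracy figure from an iterator of prediction records. The record layout (`{"label": ..., "predicted": ...}`) and the function name `summarize_accuracy` are assumptions made for this example, not the Starwhale SDK's actual result schema. In a real handler, a dict like the one returned here is the kind of value passed to `evaluation.log_summary`.

```python
from typing import Dict, Iterable, Union


def summarize_accuracy(
    records: Iterable[Dict[str, int]]
) -> Dict[str, Union[float, int]]:
    """Count matching predictions in a single pass over the records."""
    total = 0
    correct = 0
    for record in records:
        total += 1
        if record["predicted"] == record["label"]:
            correct += 1
    # Guard against an empty iterator to avoid dividing by zero.
    accuracy = correct / total if total else 0.0
    return {"accuracy": accuracy, "total": total}


sample = [
    {"label": 7, "predicted": 7},
    {"label": 2, "predicted": 2},
    {"label": 1, "predicted": 7},
    {"label": 0, "predicted": 0},
]
summary = summarize_accuracy(iter(sample))
print(summary)
```

Processing the records in one pass keeps memory use flat, which matters when the predictions come from several replicas and cannot all be held in memory at once.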

🦍 Starwhale Fine-tuning

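Starwhale Fine-tuning provides a complete workflow for tuning large language models (LLMs), including batch model evaluation, live demos, and model publishing. The Starwhale Fine-tuning Python SDK is simple to use.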

import typing as t
from starwhale import finetune, Dataset
from transformers import Trainer

@finetune(
    resources={"nvidia.com/gpu": 4, "memory": "32G"},
    require_train_datasets=True,
    require_validation_datasets=True,
    model_modules=["evaluation", "finetune"],
)
def lora_finetune(train_datasets: t.List[Dataset], val_datasets: t.List[Dataset]) -> None:
    # Initialize the model and tokenizer here (elided in this example)
    trainer = Trainer(
        model=model, tokenizer=tokenizer,
        train_dataset=train_datasets[0].to_pytorch(),  # convert the Starwhale Dataset to a PyTorch Dataset
        eval_dataset=val_datasets[0].to_pytorch())
    trainer.train()
    trainer.save_state()
    trainer.save_model()
    # Once the weights are saved, the Starwhale SDK packages them into a Starwhale Model

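Starwhale use cases

Starwhale is used across a range of machine learning tasks. Some typical examples:

🚀 LLM applications:
- OpenSource LLMs Leaderboard: evaluation and ranking of open-source LLMs.
- Llama2: quickly deploy and run the Llama2 chat model.
- Stable Diffusion: deployment and fine-tuning of a text-to-image generation model.
- LLAMA: model evaluation and fine-tuning.
- Text-to-Music: a demo of a text-to-music generation model.
- Code Generation: deployment and use of code generation models.
🌋 Model fine-tuning:
- Baichuan2 and ChatGLM3: fine-tuning Chinese large language models.
- Stable Diffusion: fine-tuning an image generation model.
🦦 Image classification:
- MNIST: evaluation and deployment of a handwritten digit recognition model.
- CIFAR10: training and evaluation of a general object classification model.
- Vision Transformer (ViT): applying the Vision Transformer model.
🐃 Image segmentation:
- Segment Anything (SAM): deployment and use of a general-purpose image segmentation model.
🐦 Object detection:
- YOLO: deployment and evaluation of a real-time object detection model.
- Pedestrian Detection: applying a pedestrian detection model.

Starwhale also supports tasks such as video recognition, machine translation, text classification, and speech recognition, covering a broad range of MLOps needs for researchers and developers.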

Installing Starwhale and getting started

Starwhale is simple to install and comes with quick-start guides to help users get going:

Starwhale Standalone installation:
```
python3 -m pip install starwhale
```
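Starwhale Server: Starwhale Server is delivered as a Docker image, which can be run directly with Docker or deployed to a Kubernetes cluster. On a laptop, the swcli server start command is a suitable choice; it depends on Docker and Docker-Compose.
Quick start: Starwhale uses the MNIST dataset as its Hello World example to demonstrate the basic Starwhale Model workflow. Users can follow the Standalone quick-start documentation in their own Python environment, or work through the Jupyter notebook example in Google Colab.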
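
After the pip step, it can help to confirm from Python that the package is actually visible to the interpreter you intend to use. The following is a small standard-library sketch. The helper name `installed_version` is hypothetical, and the check only reads package metadata, so it says nothing about whether swcli itself is on the PATH.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("starwhale")
if version is None:
    print("starwhale is not installed; run: python3 -m pip install starwhale")
else:
    print(f"starwhale {version} is installed")
```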

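Community and support

Starwhale has an active open-source community and several support channels:

Official website: Starwhale homepage
Documentation: official documentation
Community chat: Slack
Issue reporting: GitHub Issues
Social media: Twitter @starwhaleai
Contributing: Starwhale welcomes community contributions; see the contribution guide for details.

Starwhale is open source under the Apache-2.0 license, which makes it a flexible, customizable MLOps platform. Its framework design emphasizes clarity and ease of use, so developers can build MLOps features tailored to their own needs.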

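Conclusion

As an MLOps/LLMOps platform, Starwhale is changing how machine learning and deep learning systems are developed and deployed. With a full toolchain and flexible deployment options, it helps teams manage the entire lifecycle of a machine learning project more efficiently. For researchers, developers, and enterprise users alike, it combines capable features with a convenient user experience, and it is positioned to keep bringing efficiency and new ideas to machine learning operations as the field advances.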