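simpletransformers is an open-source tool built on Hugging Face Transformers. Its simplified API lets users build and tune Transformer models quickly with very little code. The library supports a range of NLP tasks, including text classification, named entity recognition, and question answering, giving researchers and developers a convenient way to apply these models. With an intuitive interface and a broad feature set, simpletransformers suits many natural language processing scenarios and lowers the barrier to using Transformer models.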

Simple Transformers

This library is based on the Transformers library by HuggingFace. Simple Transformers lets you quickly train and evaluate Transformer models. Only 3 lines of code are needed to initialize, train, and evaluate a model.

Supported Tasks:

  • Information Retrieval (Dense Retrieval)
  • (Large) Language Models (Training, Fine-tuning, and Generation)
  • Encoder Model Training and Fine-tuning
  • Sequence Classification
  • Token Classification (NER)
  • Question Answering
  • Language Generation
  • T5 Model
  • Seq2Seq Tasks
  • Multi-Modal Classification
  • Conversational AI

With Conda

  1. Install the Anaconda or Miniconda package manager.
  2. Create a new virtual environment and install packages.
$ conda create -n st python pandas tqdm
$ conda activate st

Using CUDA:

$ conda install "pytorch>=1.6" cudatoolkit=11.0 -c pytorch

Without using CUDA:

$ conda install pytorch cpuonly -c pytorch
  3. Install simpletransformers.
$ pip install simpletransformers

  4. Optionally, install Weights and Biases (wandb) for tracking and visualizing training in a web browser.
$ pip install wandb
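
After installation, a quick check from Python can confirm that the packages resolve in the active environment. The sketch below is not part of Simple Transformers: `missing_packages` is a hypothetical helper that relies only on the standard library's `importlib.util.find_spec`, and the module names listed are assumed to match the packages installed above.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Top-level module names for the packages installed above.
    # wandb is left out because it is only needed for experiment tracking.
    required = ["torch", "pandas", "tqdm", "simpletransformers"]
    missing = missing_packages(required)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Running the script inside the `st` environment reports any package that still needs to be installed.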


All documentation is now live at

Simple Transformers models are built with a particular Natural Language Processing (NLP) task in mind. Each model comes equipped with features and functionality designed to best fit its intended task. The high-level process of using Simple Transformers models follows the same pattern:

  1. Initialize a task-specific model
  2. Train the model with train_model()
  3. Evaluate the model with eval_model()
  4. Make predictions on (unlabelled) data with predict()

However, the models necessarily differ so that each one suits its intended task. The key differences are typically the input/output data formats and any task-specific features or configuration options. These are all described in the documentation section for each task.
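
Because the method names are shared, the four steps can be written once against any object that exposes `train_model()`, `eval_model()`, and `predict()`. The following is a minimal sketch of that idea, not library code: `run_workflow` is a hypothetical helper, and `EchoModel` is a made-up stand-in so the example runs without downloading a real model. Real models expect task-specific inputs and return task-specific outputs, so the return values are passed through untouched.

```python
def run_workflow(model, train_data, eval_data, inputs):
    """Train, evaluate, and predict with an already initialized model (step 1)."""
    model.train_model(train_data)              # step 2
    eval_output = model.eval_model(eval_data)  # step 3; shape depends on the task
    predict_output = model.predict(inputs)     # step 4
    return eval_output, predict_output


class EchoModel:
    """Hypothetical stand-in that mimics the three method names only."""

    def __init__(self):
        self.trained = False

    def train_model(self, train_data):
        self.trained = True

    def eval_model(self, eval_data):
        return {"n_examples": len(eval_data)}

    def predict(self, inputs):
        return [len(text) for text in inputs]


eval_output, predict_output = run_workflow(
    EchoModel(), ["a", "b"], ["c"], ["hello", "hi"]
)
print(eval_output, predict_output)
```

Swapping `EchoModel()` for a real task-specific model keeps the call sequence the same; only the data formats change.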

The currently implemented task-specific Simple Transformer models, along with their task, are given below.

Task | Model
Binary and multi-class text classification | ClassificationModel
Conversational AI (chatbot training) | ConvAIModel
Language generation | LanguageGenerationModel
Language model training/fine-tuning | LanguageModelingModel
Multi-label text classification | MultiLabelClassificationModel
Multi-modal classification (text and image data combined) | MultiModalClassificationModel
Named entity recognition | NERModel
Question answering | QuestionAnsweringModel
Sentence-pair classification | ClassificationModel
Text representation generation | RepresentationModel
Document retrieval | RetrievalModel

  • Please refer to the relevant section in the docs for more information on how to use these models.
  • Example scripts can be found in the examples directory.
  • See the Changelog for up-to-date changes to the project.
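
For scripts that choose a task at run time, the task-to-model pairing above can be kept in a small lookup table. This is only a sketch: the task keys are invented labels, `load_model_class` is a hypothetical helper, and the module paths are assumptions based on the package's usual import layout that should be checked against the documentation. Only four of the models are listed to keep it short.

```python
from importlib import import_module

# Task label -> (module path, class name).
# The labels are made up for this sketch; the module paths are assumptions.
TASK_MODELS = {
    "text-classification": ("simpletransformers.classification", "ClassificationModel"),
    "multi-label-classification": ("simpletransformers.classification", "MultiLabelClassificationModel"),
    "ner": ("simpletransformers.ner", "NERModel"),
    "question-answering": ("simpletransformers.question_answering", "QuestionAnsweringModel"),
}


def load_model_class(task):
    """Import and return the model class registered for `task`.

    Raises KeyError for an unknown task and ImportError if
    simpletransformers is not installed.
    """
    module_path, class_name = TASK_MODELS[task]
    return getattr(import_module(module_path), class_name)
```

Importing lazily inside `load_model_class` means the table itself can be inspected even on a machine where the library is not installed.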

A quick example

from simpletransformers.classification import ClassificationModel, ClassificationArgs
import pandas as pd
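import logging

# Show progress messages, but keep the transformers library's own logging quiet
logging.basicConfig(level=logging.INFO)
transformers_logger = logging.getLogger("transformers")
transformers_logger.setLevel(logging.WARNING)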

# Preparing train data
train_data = [
    ["Aragorn was the heir of Isildur", 1],
    ["Frodo was the heir of Isildur", 0],
]
train_df = pd.DataFrame(train_data)
train_df.columns = ["text", "labels"]

# Preparing eval data
eval_data = [
    ["Theoden was the king of Rohan", 1],
    ["Merry was the king of Rohan", 0],
]
eval_df = pd.DataFrame(eval_data)
eval_df.columns = ["text", "labels"]

# Optional model configuration
model_args = ClassificationArgs(num_train_epochs=1)

# Create a ClassificationModel
model = ClassificationModel(
    "roberta", "roberta-base", args=model_args
)

# Train the model
model.train_model(train_df)

# Evaluate the model
result, model_outputs, wrong_predictions = model.eval_model(eval_df)

# Make predictions with the model
predictions, raw_outputs = model.predict(["Sam was a Wizard"])
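
The `predict()` call above returns `raw_outputs` alongside the predicted labels. Assuming those raw outputs are unnormalized per-class scores (logits), which is an assumption to verify in the classification docs, a softmax turns each row into probabilities. The sketch below uses made-up numbers rather than real model output, and `softmax` is a plain-Python helper written for illustration.

```python
import math


def softmax(logits):
    """Convert one row of raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw outputs for two inputs and two classes (made-up values).
raw_outputs = [[0.0, 0.0], [2.0, -1.0]]
probabilities = [softmax(row) for row in raw_outputs]

# The predicted class is the index of the largest probability in each row.
predicted = [row.index(max(row)) for row in probabilities]
print(probabilities, predicted)
```

Probabilities are handy when a prediction should only be accepted above some confidence threshold.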

Experiment Tracking with Weights and Biases

  • Weights and Biases makes it easy to keep track of all your experiments. A Colab notebook demonstrates the integration.
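
As a rough sketch of how tracking might be switched on, model options can be collected in a plain dictionary that names a Weights and Biases project. Treat the option name `wandb_project` as an assumption to confirm in the docs; `my-experiments` is a made-up project name.

```python
# Model options as a plain dictionary.
# "wandb_project" is the assumed option name; "my-experiments" is invented.
model_args = {
    "num_train_epochs": 1,
    "wandb_project": "my-experiments",
}

# With Simple Transformers installed, the dictionary would be passed when
# creating the model, mirroring the quick example:
# model = ClassificationModel("roberta", "roberta-base", args=model_args)
```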

Current Pretrained Models

For a list of pretrained models, see Hugging Face docs.

The model_types available for each task can be found under their respective section. Any pretrained model of that type found in the Hugging Face docs should work. To use one, pass the correct model_type and model_name when creating the model, as with "roberta" and "roberta-base" in the quick example above.
