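twitter-roberta-base-emotion-multilabel-latest - A multi-label classification model for identifying emotions in tweets

Project overview: twitter-roberta-base-emotion-multilabel-latest

As social media has grown, sentiment analysis has become an active research area. twitter-roberta-base-emotion-multilabel-latest addresses the problem of identifying multiple, overlapping emotions in a single text. It is a fine-tuned version of cardiffnlp/twitter-roberta-base-2022-154m, trained for the multi-label classification subtask of SemEval 2018 - Task 1 Affect in Tweets.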

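Model performance

On the test set, the model achieves the following scores:

Micro-averaged F1: 0.7169
Macro-averaged F1: 0.5464
Sample-wise Jaccard index: 0.5970

The micro-averaged F1 shows that the model recovers most emotion labels across tweets. The lower macro-averaged F1 suggests that performance varies between individual emotion classes.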
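To make the three metrics concrete, here is a minimal pure-Python sketch of how they can be computed for multi-label predictions. The function names and the toy data are illustrative assumptions and are not taken from the model's evaluation code. Treating two empty label sets as a perfect Jaccard match is also an assumption.

```python
from typing import List, Set


def micro_f1(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """Pool true/false positives and false negatives over all labels, then compute F1."""
    tp = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        tp += len(truth & pred)
        fp += len(pred - truth)
        fn += len(truth - pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: List[Set[str]], y_pred: List[Set[str]], labels: List[str]) -> float:
    """Compute F1 separately for each label, then take the unweighted mean."""
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if label in t and label in p)
        fp = sum(1 for t, p in zip(y_true, y_pred) if label not in t and label in p)
        fn = sum(1 for t, p in zip(y_true, y_pred) if label in t and label not in p)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def samples_jaccard(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """Average, over samples, of |intersection| / |union| of the label sets."""
    total = 0.0
    for truth, pred in zip(y_true, y_pred):
        union = truth | pred
        # Assumption: two empty label sets count as a perfect match.
        total += len(truth & pred) / len(union) if union else 1.0
    return total / len(y_true)


# Toy data (made up for illustration, not model output).
labels = ["anger", "joy", "optimism"]
y_true = [{"joy", "optimism"}, {"anger"}, {"joy"}]
y_pred = [{"joy"}, {"anger"}, {"joy", "optimism"}]

print(micro_f1(y_true, y_pred))          # 0.75
print(macro_f1(y_true, y_pred, labels))  # about 0.667
print(samples_jaccard(y_true, y_pred))   # about 0.667
```

In the toy data, "optimism" is never predicted correctly, so it pulls the macro average down while barely affecting the micro average. The same effect can explain a gap between the two F1 scores.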

Usage

The model can be used in two main ways:

Method 1: tweetnlp

First install tweetnlp with pip:

pip install tweetnlp

Then load the model in Python:

import tweetnlp

model = tweetnlp.load_model('topic_classification', model_name='cardiffnlp/twitter-roberta-base-emotion-multilabel-latest')

result = model.predict("I bet everything will work out in the end :)")
print(result)  # Output: {'label': ['joy', 'optimism']}

Method 2: pipeline

First make sure the specified version of tensorflow is installed:

pip install -U tensorflow==2.10

Then use the pipeline provided by transformers:

from transformers import pipeline

pipe = pipeline("text-classification", model="cardiffnlp/twitter-roberta-base-emotion-multilabel-latest", return_all_scores=True)

results = pipe("I bet everything will work out in the end :)")
print(results)

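This method returns a score for every emotion label, including anger, anticipation, disgust, fear, joy, and optimism, among others. Recent transformers releases deprecate return_all_scores=True in favor of top_k=None.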
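Because the pipeline returns a score per label rather than a final label set, a common follow-up step is to keep every label whose score clears a threshold. The sketch below assumes each entry is a dict with "label" and "score" keys, as the text-classification pipeline produces. The select_labels helper, the 0.5 threshold, and the example scores are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List


def select_labels(scores: List[Dict[str, Any]], threshold: float = 0.5) -> List[str]:
    """Return the labels whose score is at or above the threshold (hypothetical helper)."""
    return [entry["label"] for entry in scores if entry["score"] >= threshold]


# Made-up scores in the pipeline's output shape; these are not real model output.
example_scores = [
    {"label": "anger", "score": 0.02},
    {"label": "joy", "score": 0.81},
    {"label": "optimism", "score": 0.93},
    {"label": "sadness", "score": 0.04},
]

print(select_labels(example_scores))  # ['joy', 'optimism']
```

With real output, the per-text list would be passed in the same way, for example select_labels(results[0]) when the pipeline returns one list of scores per input text. The threshold is a tunable choice: raising it trades recall for precision.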

Reference

For more detail on the implementation and use cases, see the following reference:

@inproceedings{camacho-collados-etal-2022-tweetnlp,
    title = {{T}weet{NLP}: {C}utting-{E}dge {N}atural {L}anguage {P}rocessing for {S}ocial {M}edia},
    author = {Camacho-Collados, Jose and Rezaee, Kiamehr and Riahi, Talayeh and Ushio, Asahi and Loureiro, Daniel and Antypas, Dimosthenis and Boisson, Joanne and Espinosa-Anke, Luis and Liu, Fangyu and Mart{\'\i}nez-C{\'a}mara, Eugenio and others},
    booktitle = {Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations},
    month = nov,
    year = {2022},
    address = {Abu Dhabi, U.A.E.},
    publisher = {Association for Computational Linguistics}
}

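The model gives researchers and developers a tool for exploring and analyzing emotional expression on social media, and for building natural language processing applications on top of it.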