Pixiu Paper | FinBen Leaderboard
Disclaimer
This repository and its contents are provided for academic and educational purposes only. None of the material constitutes financial, legal, or investment advice. No warranties, express or implied, are offered regarding the accuracy, completeness, or utility of the content. The authors and contributors are not responsible for any errors, omissions, or any consequences arising from the use of the information herein. Users should exercise their own judgment and consult professionals before making any financial, legal, or investment decisions. The use of the software and information contained in this repository is entirely at the user's own risk.
By using or accessing the information in this repository, you agree to indemnify, defend, and hold harmless the authors, contributors, and any affiliated organizations or persons from any and all claims or damages.
📢 Update (Date: 09-22-2023)
🚀 We're thrilled to announce that our paper, "PIXIU: A Comprehensive Benchmark, Instruction Dataset and Large Language Model for Finance", has been accepted by NeurIPS 2023 Track Datasets and Benchmarks!
📢 Update (Date: 10-08-2023)
🌏 We're proud to share the enhanced versions of FinBen, which now support both Chinese and Spanish!
📢 Update (Date: 02-20-2024)
🌏 We're delighted to share that our paper, "The FinBen: An Holistic Financial Benchmark for Large Language Models", is now available at FinBen.
📢 Update (Date: 05-02-2024)
🌏 We're pleased to invite you to attend the IJCAI2024-challenge, "Financial Challenges in Large Language Models - FinLLM". The starter kit is available at Starter-kit.
Checkpoints:
Languages
Papers
- PIXIU: A Comprehensive Benchmark, Instruction Dataset and Large Language Model for Finance
- The FinBen: An Holistic Financial Benchmark for Large Language Models
- No Language is an Island: Unifying Chinese and English in Financial Large Language Models, Instruction Data, and Benchmarks
- Dólares or Dollars? Unraveling the Bilingual Prowess of Financial LLMs Between Spanish and English
Evaluations:
- English Evaluation Datasets (more details in the FinBen section)
- Spanish Evaluation Datasets
- Chinese Evaluation Datasets
Sentiment Analysis
Classification
- Headlines (flare_headlines)
- FinArg ECC Task1 (flare_finarg_ecc_auc)
- FinArg ECC Task2 (flare_finarg_ecc_arc)
- CFA (flare_cfa)
- MultiFin EN (flare_multifin_en)
- M&A (flare_ma)
- MLESG EN (flare_mlesg)
Knowledge Extraction
- NER (flare_ner)
- Finer Ord (flare_finer_ord)
- FinRED (flare_finred)
- FinCausal20 Task1 (flare_causal20_sc)
- FinCausal20 Task2 (flare_cd)
Number Understanding
Text Summarization
Credit Scoring
- German (flare_german)
- Australian (flare_australian)
- Lendingclub (flare_cra_lendingclub)
- Credit Card Fraud (flare_cra_ccf)
- ccFraud (flare_cra_ccfraud)
- Polish (flare_cra_polish)
- Taiwan Economic Journal (flare_cra_taiwan)
- PortoSeguro (flare_cra_portoseguro)
- Travel Insurance (flare_cra_travelinsurance)
Forecasting
- BigData22 for Stock Movement (flare_sm_bigdata)
- ACL18 for Stock Movement (flare_sm_acl)
- CIKM18 for Stock Movement (flare_sm_cikm)
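The identifiers in parentheses above are the dataset names used to refer to each task. As a minimal sketch of how they might be grouped and selected programmatically, the snippet below collects them by category and joins a selection into a comma-separated string. The dictionary `TASKS_BY_CATEGORY`, the helper `select_tasks`, and the comma-separated format are assumptions made for illustration, not part of the project's code. Categories whose entries are not listed above (Sentiment Analysis, Number Understanding, Text Summarization) are left out.

```python
# Hypothetical grouping; the identifiers are copied from the lists above.
TASKS_BY_CATEGORY = {
    "classification": [
        "flare_headlines", "flare_finarg_ecc_auc", "flare_finarg_ecc_arc",
        "flare_cfa", "flare_multifin_en", "flare_ma", "flare_mlesg",
    ],
    "knowledge_extraction": [
        "flare_ner", "flare_finer_ord", "flare_finred",
        "flare_causal20_sc", "flare_cd",
    ],
    "credit_scoring": [
        "flare_german", "flare_australian", "flare_cra_lendingclub",
        "flare_cra_ccf", "flare_cra_ccfraud", "flare_cra_polish",
        "flare_cra_taiwan", "flare_cra_portoseguro",
        "flare_cra_travelinsurance",
    ],
    "forecasting": ["flare_sm_bigdata", "flare_sm_acl", "flare_sm_cikm"],
}


def select_tasks(*categories):
    """Return a comma-separated string of task identifiers (hypothetical helper)."""
    names = []
    for category in categories:
        # Raises KeyError for a category that is not defined above.
        names.extend(TASKS_BY_CATEGORY[category])
    return ",".join(names)


if __name__ == "__main__":
    print(select_tasks("forecasting"))
    # prints: flare_sm_bigdata,flare_sm_acl,flare_sm_cikm
```

Keeping the grouping in one place makes it easy to evaluate a single category, or several at once, without retyping individual identifiers.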
Overview
Welcome to the PIXIU project! This project is designed to support the development, fine-tuning, and evaluation of Large Language Models (LLMs) in the financial domain, as a step towards understanding and harnessing the power of LLMs for finance.
Structure of the Repository
The repository is organized into several key components, each serving a unique purpose in the financial NLP pipeline:
- FinBen: Our Financial Language Understanding and Prediction Evaluation Benchmark. FinBen serves as the evaluation suite for financial LLMs, with a focus on understanding and prediction tasks across various financial contexts.
- FIT: Our Financial Instruction Dataset. FIT is a multi-task and multi-modal instruction dataset specifically tailored for financial tasks. It serves as the training ground for fine-tuning LLMs for these tasks.
- FinMA: Our Financial Large Language Model (LLM). FinMA is the core of our project, providing the learning and prediction power for our financial tasks.
Key Features
- Open resources: PIXIU openly provides the financial LLM, instruction tuning data, and datasets included in the evaluation benchmark to encourage open research and transparency.
- Multi-task: The instruction tuning data and benchmark in PIXIU cover a diverse set of financial tasks, including four financial NLP tasks and one financial prediction task.
- Multi-modality: PIXIU's instruction tuning data and benchmark consist of multi-modality financial data, including time series data from the stock movement prediction task. It covers various