Awesome-LLM

A comprehensive roundup of research progress and resources on large language models

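The Awesome-LLM project gathers core resources in the field of large language models (LLMs), including key papers, open-source models, training frameworks, and application examples. It systematically traces the technical evolution from GPT to the latest LLMs, giving researchers and developers a comprehensive learning reference. Its content covers the history of LLMs, frontier breakthroughs, and practical applications, making it an important repository for understanding and exploring LLM technology.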

🔥 Large Language Models (LLMs) have taken the NLP community, the AI community, and the whole world by storm. Here is a curated list of papers about large language models, especially those relating to ChatGPT. It also covers frameworks for LLM training, tools for deploying LLMs, courses and tutorials about LLMs, and all publicly available LLM checkpoints and APIs.

Trending LLM Projects

  • Deep-Live-Cam - real time face swap and one-click video deepfake with only a single image (uncensored).
  • MiniCPM-V 2.6 - A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
  • GPT-SoVITS - 1 min voice data can also be used to train a good TTS model! (few shot voice cloning).

Milestone Papers

| Date | Keywords | Institute | Paper |
|---|---|---|---|
| 2017-06 | Transformers | Google | Attention Is All You Need |
| 2018-06 | GPT 1.0 | OpenAI | Improving Language Understanding by Generative Pre-Training |
| 2018-10 | BERT | Google | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding |
| 2019-02 | GPT 2.0 | OpenAI | Language Models are Unsupervised Multitask Learners |
| 2019-09 | Megatron-LM | NVIDIA | Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism |
| 2019-10 | T5 | Google | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer |
| 2019-10 | ZeRO | Microsoft | ZeRO: Memory Optimizations Toward Training Trillion Parameter Models |
| 2020-01 | Scaling Law | OpenAI | Scaling Laws for Neural Language Models |
| 2020-05 | GPT 3.0 | OpenAI | Language models are few-shot learners |
| 2021-01 | Switch Transformers | Google | Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity |
| 2021-08 | Codex | OpenAI | Evaluating Large Language Models Trained on Code |
| 2021-08 | Foundation Models | Stanford | On the Opportunities and Risks of Foundation Models |
| 2021-09 | FLAN | Google | Finetuned Language Models are Zero-Shot Learners |
| 2021-10 | T0 | HuggingFace et al. | Multitask Prompted Training Enables Zero-Shot Task Generalization |
| 2021-12 | GLaM | Google | GLaM: Efficient Scaling of Language Models with Mixture-of-Experts |
| 2021-12 | WebGPT | OpenAI | WebGPT: Browser-assisted question-answering with human feedback |
| 2021-12 | Retro | DeepMind | Improving language models by retrieving from trillions of tokens |
| 2021-12 | Gopher | DeepMind | Scaling Language Models: Methods, Analysis & Insights from Training Gopher |
| 2022-01 | COT | Google | Chain-of-Thought Prompting Elicits Reasoning in Large Language Models |
| 2022-01 | LaMDA | Google | LaMDA: Language Models for Dialog Applications |
| 2022-01 | Minerva | Google | Solving Quantitative Reasoning Problems with Language Models |
| 2022-01 | Megatron-Turing NLG | Microsoft&NVIDIA | Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model |
| 2022-03 | InstructGPT | OpenAI | Training language models to follow instructions with human feedback |
| 2022-04 | PaLM | Google | PaLM: Scaling Language Modeling with Pathways |
| 2022-04 | Chinchilla | DeepMind | An empirical analysis of compute-optimal large language model training |
| 2022-05 | OPT | Meta | OPT: Open Pre-trained Transformer Language Models |
| 2022-05 | UL2 | Google | Unifying Language Learning Paradigms |
| 2022-06 | Emergent Abilities | Google | Emergent Abilities of Large Language Models |
| 2022-06 | BIG-bench | Google | Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models |
| 2022-06 | METALM | Microsoft | Language Models are General-Purpose Interfaces |
| 2022-09 | Sparrow | DeepMind | Improving alignment of dialogue agents via targeted human judgements |
| 2022-10 | Flan-T5/PaLM | Google | Scaling Instruction-Finetuned Language Models |
| 2022-10 | GLM-130B | Tsinghua | GLM-130B: An Open Bilingual Pre-trained Model |
| 2022-11 | HELM | Stanford | Holistic Evaluation of Language Models |
| 2022-11 | BLOOM | BigScience | BLOOM: A 176B-Parameter Open-Access Multilingual Language Model |
| 2022-11 | Galactica | Meta | Galactica: A Large Language Model for Science |
| 2022-12 | OPT-IML | Meta | OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization |
| 2023-01 | Flan 2022 Collection | Google | The Flan Collection: Designing Data and Methods for Effective Instruction Tuning |
| 2023-02 | LLaMA | Meta | LLaMA: Open and Efficient Foundation Language Models |
| 2023-02 | Kosmos-1 | Microsoft | Language Is Not All You Need: Aligning Perception with Language Models |
| 2023-03 | LRU | DeepMind | Resurrecting Recurrent Neural Networks for Long Sequences |
| 2023-03 | PaLM-E | Google | PaLM-E: An Embodied Multimodal Language Model |
| 2023-03 | GPT 4 | OpenAI | GPT-4 Technical Report |
| 2023-04 | LLaVA | UW–Madison&Microsoft | Visual Instruction Tuning |
| 2023-04 | Pythia | EleutherAI et al. | Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling |
| 2023-05 | Dromedary | CMU et al. | Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision |
| 2023-05 | PaLM 2 | Google | PaLM 2 Technical Report |
| 2023-05 | RWKV | Bo Peng | RWKV: Reinventing RNNs for the Transformer Era |
| 2023-05 | DPO | Stanford | Direct Preference Optimization: Your Language Model is Secretly a Reward Model |
| 2023-05 | ToT | Google&Princeton | Tree of Thoughts: Deliberate Problem Solving with Large Language Models |
| 2023-07 | LLaMA2 | Meta | Llama 2: Open Foundation and Fine-Tuned Chat Models |
| 2023-10 | Mistral 7B | Mistral | Mistral 7B |
| 2023-12 | Mamba | CMU&Princeton | Mamba: Linear-Time Sequence Modeling with Selective State Spaces |
| 2024-01 | DeepSeek-v2 | DeepSeek | DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model |
| 2024-03 | Jamba | AI21 Labs | Jamba: A Hybrid Transformer-Mamba Language Model |
| 2024-05 | Mamba2 | CMU&Princeton | Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality |
| 2024-05 | Llama3 | Meta | The Llama 3 Herd of Models |

Other Papers

If you're interested in LLMs, the milestone papers listed above are a good way to explore the field's history and state of the art. However, each research direction within LLMs offers its own insights and contributions, and these are essential to understanding the field as a whole. For a detailed list of papers in various subfields, please refer to the following link:
