awesome-efficient-aigc

A collection of techniques and resources for efficient AIGC

This project collects the latest resources for improving the efficiency of AI-Generated Content (AIGC), covering optimization methods for both Large Language Models (LLMs) and Diffusion Models (DMs). It gathers cutting-edge research papers, code implementations, and surveys, with a focus on efficiency techniques such as quantization and fine-tuning. This continuously updated collection offers a comprehensive reference for research and development in the AIGC field and helps advance these techniques toward practical deployment.

Awesome Efficient AIGC

This repo collects efficient approaches for AI-Generated Content (AIGC) to cope with its huge demand for computing resources, including efficient Large Language Models (LLMs), Diffusion Models (DMs), etc. We are continuously improving the project. Pull requests adding works (papers, repositories) missing from this repo are welcome. Special thanks to Xingyu Zheng, Xudong Ma, Yifu Ding, and all researchers who have contributed to this project!

Survey

  • [arXiv] Efficient Prompting Methods for Large Language Models: A Survey
  • [arXiv] Efficient Diffusion Models for Vision: A Survey
  • [arXiv] Faster and Lighter LLMs: A Survey on Current Challenges and Way Forward [code]
  • [arXiv] A Survey on Knowledge Distillation of Large Language Models [code]
  • [arXiv] Model Compression and Efficient Inference for Large Language Models: A Survey
  • [arXiv] A Survey on Transformer Compression
  • [arXiv] A Comprehensive Survey of Compression Algorithms for Language Models
  • [arXiv] Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding [code] [Blog]
  • [arXiv] Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security [code]
  • [arXiv] A Survey on Hardware Accelerators for Large Language Models
  • [arXiv] A Survey of Resource-efficient LLM and Multimodal Foundation Models [code]
  • [arXiv] Beyond Efficiency: A Systematic Survey of Resource-Efficient Large Language Models [code]
  • [arXiv] Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems
  • [arXiv] Efficient Large Language Models: A Survey [code]
  • [arXiv] The Efficiency Spectrum of Large Language Models: An Algorithmic Survey [code]
  • [arXiv] A Survey on Model Compression for Large Language Models
  • [arXiv] A Comprehensive Survey on Knowledge Distillation of Diffusion Models
  • [TACL] Compressing Large-Scale Transformer-Based Models: A Case Study on BERT
  • [JSA] A Survey of Techniques for Optimizing Transformer Inference
  • [arXiv] Understanding LLMs: A Comprehensive Overview from Training to Inference

Language

2024

Quantization

  • [arXiv] How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study [code] [HuggingFace]
  • [arXiv] Accurate LoRA-Finetuning Quantization of LLMs via Information Retention [code]
  • [arXiv] BiLLM: Pushing the Limit of Post-Training Quantization for LLMs [code]
  • [arXiv] DB-LLM: Accurate Dual-Binarization for Efficient LLMs
  • [arXiv] Extreme Compression of Large Language Models via Additive Quantization
  • [arXiv] Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models
  • [arXiv] FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design
  • [arXiv] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
  • [arXiv] EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge [code]
  • [arXiv] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
  • [arXiv] LQER: Low-Rank Quantization Error Reconstruction for LLMs
  • [arXiv] KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache [code]
  • [arXiv] QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks [code]
  • [arXiv] L4Q: Parameter Efficient Quantization-Aware Training on Large Language Models via LoRA-wise LSQ
  • [arXiv] TP-Aware Dequantization
  • [arXiv] ApiQ: Finetuning of 2-Bit Quantized Large Language Model
  • [arXiv] BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation [code]
  • [arXiv] OneBit: Towards Extremely Low-bit Large Language Models
  • [arXiv] WKVQuant: Quantising Weight and Key/Value Cache for Large Language Models Gains More
  • [arXiv] GPTVQ: The Blessing of Dimensionality for LLM Quantization [code]
  • [DAC] APTQ: Attention-aware Post-Training Mixed-Precision Quantization for Large Language Models
  • [DAC] A Comprehensive Evaluation of Quantization Strategies for Large Language Models
  • [arXiv] No Token Left Behind: Reliable KV Cache Compression via Importance-Aware Mixed Precision Quantization
  • [arXiv] Evaluating Quantized Large Language Models
  • [arXiv] FlattenQuant: Breaking Through the Inference Compute-bound for Large Language Models with Per-tensor Quantization
  • [arXiv] LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization
  • [arXiv] IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact
  • [arXiv] On the Compressibility of Quantized Large Language Models
  • [arXiv] EasyQuant: An Efficient Data-free Quantization Algorithm for LLMs
  • [arXiv] QAQ: Quality Adaptive Quantization for LLM KV Cache [code]
  • [arXiv] GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM
  • [arXiv] What Makes Quantization for Large Language Models Hard? An Empirical Study from the Lens of Perturbation
  • [arXiv] SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression [code]
  • [ICLR] AffineQuant: Affine Transformation Quantization for Large Language Models [code]
  • [ICLR Practical ML for Low Resource Settings Workshop] Oh! We Freeze: Improving Quantized Knowledge Distillation via Signal Propagation Analysis for Large Language Models
  • [arXiv] Accurate Block Quantization in LLMs with Outliers
  • [arXiv] QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs [code]
  • [arXiv] Minimize Quantization Output Error with Bias Compensation [code]
  • [arXiv] Cherry on Top: Parameter Heterogeneity and Quantization in Large Language Models
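
Most post-training methods above are compared against a simple round-to-nearest (RTN) baseline: scale each weight row so its values fit an integer grid, round, and store the integers plus one scale per row. A minimal NumPy sketch of that baseline (illustrative only; function names and the symmetric per-channel scheme are our assumptions, not taken from any listed paper):

```python
import numpy as np

def quantize_rtn(w: np.ndarray, bits: int = 4):
    """Symmetric per-output-channel round-to-nearest weight quantization.

    w: weight matrix of shape (out_features, in_features).
    Returns integer codes and per-channel scales with w ~ codes * scales.
    """
    qmax = 2 ** (bits - 1) - 1                   # e.g. 7 for int4
    scales = np.abs(w).max(axis=1, keepdims=True) / qmax
    scales = np.where(scales == 0, 1.0, scales)  # guard all-zero rows
    codes = np.clip(np.round(w / scales), -qmax - 1, qmax).astype(np.int8)
    return codes, scales

def dequantize(codes: np.ndarray, scales: np.ndarray) -> np.ndarray:
    return codes.astype(np.float32) * scales

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 16)).astype(np.float32)
codes, scales = quantize_rtn(w, bits=4)
w_hat = dequantize(codes, scales)
err = np.abs(w - w_hat).max()   # bounded by half a quantization step
```

The per-element error is bounded by half a scale step; methods such as GPTQ, BiLLM, or QuaRot improve on this baseline with error compensation, binarization-aware structure, or rotations that remove outliers.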

Fine-tuning

  • [arXiv] BitDelta: Your Fine-Tune May Only Be Worth One Bit [code]
  • [AAAI EIW Workshop 2024] QDyLoRA: Quantized Dynamic Low-Rank Adaptation for Efficient Large Language Model Tuning
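
BitDelta above rests on the observation that the fine-tuning delta W_ft - W_base is highly compressible: a sign matrix plus a single scale can be enough. A rough NumPy illustration of that idea (not the paper's code; using the mean absolute delta as the scale is one simple choice we assume here):

```python
import numpy as np

def compress_delta(w_base: np.ndarray, w_ft: np.ndarray):
    """Compress a fine-tuning delta to 1 bit per weight plus one scale.

    Stores sign(delta) and a single scalar scale per tensor, so each
    fine-tune costs ~1 bit/weight on top of the shared base model.
    """
    delta = w_ft - w_base
    scale = np.abs(delta).mean()          # one simple scale choice
    signs = np.sign(delta).astype(np.int8)
    return signs, scale

def reconstruct(w_base: np.ndarray, signs: np.ndarray, scale: float):
    return w_base + scale * signs.astype(np.float32)

rng = np.random.default_rng(1)
w_base = rng.normal(size=(4, 4)).astype(np.float32)
w_ft = w_base + 0.01 * rng.normal(size=(4, 4)).astype(np.float32)
signs, scale = compress_delta(w_base, w_ft)
w_hat = reconstruct(w_base, signs, scale)
```

In BitDelta itself the scale is further calibrated by distillation; the sketch only shows why a sign-plus-scale representation is a sensible starting point.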

Other

  • [arXiv] FlightLLM: Efficient Large Language Model Inference with a Complete Mapping Flow on FPGA
  • [arXiv] Inferflow: An Efficient and Highly Configurable Inference Engine for Large Language Models

2023

Quantization

  • [ICLR] GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers [code]
  • [NeurIPS] QLoRA: Efficient Finetuning of Quantized LLMs [code]
  • [NeurIPS] Memory-Efficient Fine-Tuning of Compressed Large Language Models via sub-4-bit Integer Quantization
  • [ICML] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models [code]
  • [ICML] FlexRound: Learnable Rounding based on Element-wise Division for Post-Training Quantization
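
SmoothQuant above makes activations easier to quantize by migrating their outlier magnitude into the weights with a per-input-channel factor s_j = max|X_j|^a / max|W_j|^(1-a), since (X diag(1/s))(diag(s) W) = XW exactly. A hedged NumPy sketch of just that equivalence (a = 0.5, toy shapes; not the authors' implementation):

```python
import numpy as np

def smooth_scales(x: np.ndarray, w: np.ndarray, alpha: float = 0.5):
    """Per-input-channel smoothing factors, SmoothQuant-style.

    x: activations (tokens, channels); w: weights (channels, out).
    s_j = max|x_j|^alpha / max|w_j|^(1 - alpha), clipped away from zero.
    """
    act_max = np.abs(x).max(axis=0)
    w_max = np.abs(w).max(axis=1)
    s = (act_max ** alpha) / (w_max ** (1 - alpha))
    return np.maximum(s, 1e-5)

rng = np.random.default_rng(2)
x = rng.normal(size=(32, 16))
x[:, 3] *= 50.0                  # simulate one activation-outlier channel
w = rng.normal(size=(16, 8))

s = smooth_scales(x, w)
x_s = x / s                      # activations become flatter
w_s = w * s[:, None]             # weights absorb the migrated scale
y_ref = x @ w
y_smooth = x_s @ w_s             # same product, up to float rounding
```

Because the transformation is exact, both factors can then be quantized with a standard per-tensor scheme while the layer output is unchanged in full precision.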