Retrieval-Augmented Generation for AI-Generated Content: A Survey
This repo collects and categorizes papers on retrieval-augmented generation (RAG), following the taxonomy of our survey paper, Retrieval-Augmented Generation for AI-Generated Content: A Survey. Because the field is growing rapidly, we will keep updating both the paper and this repo.
Overview
Catalogue
Methods Taxonomy
RAG Foundations
- Query-based RAG
REALM: Retrieval-Augmented Language Model Pre-Training
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
REPLUG: Retrieval-Augmented Black-Box Language Models
In-Context Retrieval-Augmented Language Models
When Language Model Meets Private Library
DocPrompting: Generating Code by Retrieving the Docs
Retrieval-based prompt selection for code-related few-shot learning
InferFix: End-to-end program repair with LLMs
Make-An-Audio: Text-to-audio generation with prompt-enhanced diffusion models
ReACC: A retrieval-augmented code completion framework
Uni-Parser: Unified semantic parser for question answering on knowledge base and database
RNG-KBQA: Generation augmented iterative ranking for knowledge base question answering
End-to-end case-based reasoning for commonsense knowledge base completion
RetrieveGAN: Image synthesis via differentiable patch retrieval
Retrieval-Augmented Score Distillation for Text-to-3D Generation
- Latent Representation-based RAG
Leveraging passage retrieval with generative models for open domain question answering
BashExplainer: Retrieval-augmented bash code comment generation based on fine-tuned CodeBERT
EditSum: A Retrieve-and-Edit Framework for Source Code Summarization
Retrieve and Refine: Exemplar-based Neural Comment Generation
RACE: retrieval-augmented commit message generation
A Retrieve-and-Edit Framework for Predicting Structured Outputs
DecAF: Joint Decoding of Answers and Logical Forms for Question Answering over Knowledge Bases
Bridging the KB-text gap: Leveraging structured knowledge-aware pre-training for KBQA
Retrieval-enhanced generative model for large-scale knowledge graph completion
Case-based reasoning for natural language queries over knowledge bases
Improving language models by retrieving from trillions of tokens
ReMoDiffuse: Retrieval-augmented motion diffusion model
Retrieval augmented convolutional encoder-decoder networks for video captioning
Retrieval-augmented egocentric video captioning
Re-Imagen: Retrieval-augmented text-to-image generator
KNN-Diffusion: Image generation via large-scale retrieval
Retrieval-augmented diffusion models
Text-guided synthesis of artistic images with retrieval-augmented diffusion models
Memory-driven text-to-image generation
Mention memory: incorporating textual knowledge into transformers through entity mention attention
Unlimiformer: Long-range transformers with unlimited length input
Entities as experts: Sparse memory access with entity supervision
AMD: Anatomical motion diffusion with interpretable motion decomposition and fusion
Retrieval-augmented text-to-audio generation
Concept-aware video captioning: Describing videos with effective prior information
- Logit-based RAG
Generalization through memorization: Nearest neighbor language models
Syntax-Aware Retrieval Augmented Code Generation
Memory-augmented image captioning
Retrieval-based neural source code summarization
Efficient nearest neighbor language models
Nonparametric masked language modeling
EditSum: A retrieve-and-edit framework for source code summarization
- Speculative RAG
RAG Enhancements
- Input Enhancement
  - Query Transformations
Query2doc: Query Expansion with Large Language Models
Tree of Clarifications: Answering Ambiguous Questions with Retrieval-Augmented Large Language Models
  - Data Augmentation
LESS: selecting influential data for targeted instruction tuning
Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models
- Retriever Enhancement
  - Recursive Retrieve
Query Expansion by Prompting Large Language Models
RAT: Retrieval augmented thoughts elicit context-aware reasoning in long-horizon generation
ReAct: Synergizing reasoning and acting in language models
Chain-of-thought prompting elicits reasoning in large language models
ActiveRAG: Revealing the Treasures of Knowledge via Active Learning
Retrieval-Augmented Thought Process as Sequential Decision Making
In search of needles in a 10M haystack: Recurrent memory finds what LLMs miss
  - Chunk Optimization
RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
  - Finetune Retriever
C-Pack: Packaged Resources To Advance General Chinese Embedding
LM-Cocktail: Resilient Tuning of Language Models via Model Merging
Retrieve Anything To Augment Large Language Models
REPLUG: Retrieval-Augmented Black-Box Language Models
When Language Model Meets Private Library
EditSum: A Retrieve-and-Edit Framework for Source Code Summarization
Synchromesh: Reliable Code Generation from Pre-trained Language Models
Retrieval Augmented Convolutional Encoder-decoder Networks for Video Captioning
Reinforcement learning for optimizing RAG for domain chatbots
  - Hybrid Retrieve
RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic Program Repair
ReACC: A Retrieval-Augmented Code Completion Framework
Retrieval-based neural source code summarization
BashExplainer: Retrieval-Augmented Bash Code Comment Generation based on Fine-tuned CodeBERT
Retrieval-Augmented Score Distillation for Text-to-3D Generation
Corrective Retrieval Augmented Generation
Retrieval augmented generation with rich answer encoding
UniMS-RAG: A unified multi-source retrieval-augmented generation for personalized dialogue systems
  - Re-ranking
Re2G: Retrieve, Rerank, Generate
AceCoder: Utilizing Existing Code to Enhance Code Generation
A Fine-tuning Enhanced RAG System with Quantized Influence Measure as AI Judge
UDAPDR: Unsupervised Domain Adaptation via LLM Prompting and Distillation of Rerankers
Learning to Retrieve In-Context Examples for Large Language Models
  - Retrieval Transformation
Learning to filter context for retrieval-augmented generation
FiD-Light: Efficient and effective retrieval-augmented text generation
  - Others
Generate rather than retrieve: Large language models are strong context generators
Generator-retriever-generator: A novel approach to open-domain question answering
- Generator Enhancement
  - Prompt Engineering
Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models
[Active Prompting with Chain-of-Thought for Large Language
-