Awesome-LLM-related-Papers-Comprehensive-Topics
We collect papers and repositories covering a broad range of LLM-related topics, listed below.
CoT / VLM / Quantization / Grounding / Text2IMG&VID / Prompt Engineering / Reasoning / Robot / Agent / Planning / Reinforcement-Learning / Feedback / In-Context-Learning / InstructionTuning / PEFT / RLHF / RAG / Embodied / VQA / Hallucination / Diffusion / Scaling / Context-Window / WorldModel / Memory / Zero-Shot / RoPE / Speech / Perception / Survey / Segmentation / Large Action Model / Foundation / LoRA
We strongly recommend checking our Notion table for an interactive experience.
Category | Title | Links | Date |
---|---|---|---|
3D, GPT4, VLM | GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation | ArXiv | |
3D, Open-source, Perception, Robot | 3D-LLM: Injecting the 3D World into Large Language Models | ArXiv | 2023/07/24 |
AGI, Agent | OpenAGI: When LLM Meets Domain Experts | ArXiv, GitHub | 2023/04/10 |
AGI, Awesome Repo, Survey | Awesome-LLM-Papers-Toward-AGI | GitHub | |
AGI, Brain | When Brain-inspired AI Meets AGI | ||
AGI, Brain | Divergences between Language Models and Human Brains | ||
AGI, Survey | Levels of AGI: Operationalizing Progress on the Path to AGI | ||
APIs, Agent, Tool | Gorilla: Large Language Model Connected with Massive APIs | ArXiv | |
Action-Generation, Generation, Prompting | Prompt a Robot to Walk with Large Language Models | ||
Action-Model, Agent, LAM | LaVague | GitHub | |
Agent | LLM as OS, Agents as Apps: Envisioning AIOS, Agents and the AIOS-Agent Ecosystem | ArXiv | |
Agent | AIOS: LLM Agent Operating System | ArXiv | |
Agent | Cognitive Architectures for Language Agents | ArXiv | |
Agent | PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization | ||
Agent | AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn | ||
Agent | ScreenAgent: A Vision Language Model-driven Computer Control Agent | ||
Agent | swarms | GitHub | |
Agent | MindAgent: Emergent Gaming Interaction | ||
Agent | InfiAgent: A Multi-Tool Agent for AI Operating Systems | ||
Agent | Predictive Minds: LLMs As Atypical Active Inference Agents | ||
Agent | LLM-Powered Hierarchical Language Agent for Real-time Human-AI Coordination | ||
Agent | AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors | ArXiv | |
Agent | Agents: An Open-source Framework for Autonomous Language Agents | ArXiv, GitHub | |
Agent | AutoAgents: A Framework for Automatic Agent Generation | GitHub | |
Agent | DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines | ArXiv | |
Agent | AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation | ||
Agent | CAMEL: Communicative Agents for “Mind” Exploration of Large Language Model Society | ||
Agent | XAgent: An Autonomous Agent for Complex Task Solving | ArXiv | |
Agent | Generative Agents: Interactive Simulacra of Human Behavior | ArXiv | |
Agent | LLM+P: Empowering Large Language Models with Optimal Planning Proficiency | ArXiv | 2023/04/22 |
Agent | AgentSims: An Open-Source Sandbox for Large Language Model Evaluation | ArXiv | 2023/08/08 |
Agent, Awesome Repo | Awesome LLM-Powered Agent | GitHub | |
Agent, Awesome Repo | LLM Agents Papers | GitHub | |
Agent, Awesome Repo | Awesome Large Multimodal Agents | GitHub | |
Agent, Awesome Repo | Awesome-Papers-Autonomous-Agent | GitHub | |
Agent, Awesome Repo | Autonomous Agents | GitHub | |
Agent, Awesome Repo | Awesome AI Agents | GitHub | |
Agent, Awesome Repo, Embodied, Grounding | XLang Paper Reading | GitHub | |
Agent, Awesome Repo, LLM | CoALA: Awesome Language Agents | GitHub | |
Agent, Awesome Repo, LLM | Awesome-Embodied-Agent-with-LLMs | GitHub | |
Agent, Blog | LLM Powered Autonomous Agents | ArXiv | |
Agent, Code-LLM | TaskWeaver: A Code-First Agent Framework | ||
Agent, Code-LLM, Code-as-Policies, Survey | If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents | ArXiv | |
Agent, Code-as-Policies | Executable Code Actions Elicit Better LLM Agents | ArXiv | 2024/01/24 |
Agent, Embodied | Embodied Task Planning with Large Language Models | ||
Agent, Embodied | Octopus: Embodied Vision-Language Programmer from Environmental Feedback | ||
Agent, Embodied | Embodied Multi-Modal Agent trained by an LLM from a Parallel TextWorld | ArXiv | |
Agent, Embodied | LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models | ||
Agent, Embodied | OpenAgents: An Open Platform for Language Agents in the Wild | ArXiv, GitHub | |
Agent, Embodied, Robot | OPEx: A Component-Wise Analysis of LLM-Centric Agents in Embodied Instruction Following | ||
Agent, Embodied, Robot | AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents | ArXiv | |
Agent, Embodied, Survey | Application of Pretrained Large Language Models in Embodied Artificial Intelligence | ArXiv | |
Agent, End2End, Game, Robot | An Interactive Agent Foundation Model | ArXiv | |
Agent, Feedback, Reinforcement-Learning | AdaRefiner: Refining Decisions of Language Models with Adaptive Feedback | ArXiv | 2023/09/29 |
Agent, Feedback, Reinforcement-Learning, Robot | Accelerating Reinforcement Learning of Robotic Manipulations via Feedback from Large Language Models | ArXiv | 2023/11/04 |
Agent, GPT4, Web | GPT-4V(ision) is a Generalist Web Agent, if Grounded | ||
Agent, GUI | SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents | ||
Agent, GUI | ScreenAgent: A Computer Control Agent Driven by Visual Language Large Model | GitHub | |
Agent, GUI | CogAgent: A Visual Language Model for GUI Agents | ||
Agent, GUI, MobileApp | You Only Look at Screens: Multimodal Chain-of-Action Agents | ||
Agent, GUI, MobileApp | Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception | ||
Agent, GUI, MobileApp | AppAgent: Multimodal Agents as Smartphone Users | ||
Agent, GUI, Web | "What’s important here?": Opportunities and Challenges of Using LLMs in Retrieving Information from Web Interfaces | ||
Agent, Game | Learning Embodied Vision-Language Programming from Instruction, Exploration, and Environmental Feedback | ||
Agent, Instruction-Tuning | AgentTuning: Enabling Generalized Agent Abilities For LLMs | ArXiv | |
Agent, LLM, Planning | LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models | ||
Agent, Memory, Minecraft | JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models | ArXiv | 2023/11/10 |
Agent, Memory, RAG | RAP: Retrieval-Augmented Planning with Contextual Memory for Multimodal LLM Agents | ArXiv | 2024/02/06 |
Agent, Minecraft | Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory | ||
Agent, Minecraft | S-Agents: Self-organizing Agents in Open-ended Environment | ||
Agent, Minecraft | Steve-Eye: Equipping LLM-based Embodied Agents with Visual Perception in Open Worlds | ||
Agent, Minecraft | LARP: Language-Agent Role Play for Open-World Games | ||
Agent, Minecraft | Voyager: An Open-Ended Embodied Agent with Large Language Models | ArXiv | 2023/05/25 |
Agent, Minecraft | Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents | ArXiv | 2023/02/03 |
Agent, Minecraft, Reinforcement-Learning | RLAdapter: Bridging Large Language Models to Reinforcement Learning in Open Worlds | ||
Agent, MobileApp | You Only Look at Screens: Multimodal Chain-of-Action Agents | GitHub | |
Agent, Multi | War and Peace (WarAgent): Large Language Model-based Multi-Agent Simulation of World Wars | ArXiv | |
Agent, Multimodal, Robot | A Generalist Agent | ArXiv | 2022/05/12 |
Agent, Reasoning | Agent Instructs Large Language Models to be General Zero-Shot Reasoners | ||
Agent, Reasoning | Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning | ||
Agent, Reasoning, Zero-shot | Agent Instructs Large Language Models to be General Zero-Shot Reasoners | ArXiv | 2023/10/05 |
Agent, Reinforcement-Learning | STARLING: Self-supervised Training of Text-based Reinforcement Learning Agent with Large Language Models | ||
Agent, Reinforcement-Learning | Language Instructed Reinforcement Learning for Human-AI Coordination | ArXiv | 2023/04/13 |
Agent, Reinforcement-Learning | Eureka: Human-Level Reward Design via Coding Large Language Models | ArXiv | 2023/10/19 |
Agent, Reinforcement-Learning | Guiding Pretraining in Reinforcement Learning with Large Language Models | ArXiv | 2023/02/13 |
Agent, Reinforcement-Learning | Language to Rewards for Robotic Skill Synthesis | ArXiv | 2023/06/14 |
Agent, Reinforcement-Learning, Reward | EAGER: Asking and Answering Questions for Automatic Reward Shaping in Language-guided RL | ArXiv | 2022/06/20 |
Agent, Reinforcement-Learning, Reward | Reward Design with Language Models | ArXiv | 2023/02/27 |
Agent, Reinforcement-Learning, Reward | Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning | ArXiv | 2023/09/20 |
Agent, Soft-Dev | Communicative Agents for Software Development | GitHub | |
Agent, Soft-Dev | MetaGPT: |