
LLM-Tool-Survey

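A Survey of Tool Learning with Large Language Models

This study systematically surveys how large language models (LLMs) enhance their ability to solve complex problems through tool learning. It reviews the existing literature from two angles, the benefits of tool learning and how it is implemented, summarizes benchmarks and evaluation methods, and discusses current challenges and future directions, offering insights for related research and development.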

Survey: Tool Learning with Large Language Models

Recently, tool learning with large language models (LLMs) has emerged as a promising paradigm for augmenting the capabilities of LLMs to tackle highly complex problems.

This is the collection of papers related to tool learning with LLMs. These papers are organized according to our survey paper "Tool Learning with Large Language Models: A Survey".

Chinese: We have noticed that PaperAgent and 旺知识 have provided a brief introduction and a comprehensive introduction in Chinese, respectively. We greatly appreciate their assistance.

Please feel free to contact us if you have any questions or suggestions!

Contribution

:tada::+1: Please feel free to open an issue or make a pull request! :tada::+1:

Citation

If you find our work helpful for your research, please cite our paper:

@article{qu2024toolsurvey,
    author={Qu, Changle and Dai, Sunhao and Wei, Xiaochi and Cai, Hengyi and Wang, Shuaiqiang and Yin, Dawei and Xu, Jun and Wen, Ji-Rong},
    title={Tool Learning with Large Language Models: A Survey},
    journal={arXiv preprint arXiv:2405.17935},
    year={2024}
}

📋 Contents

🌟 Introduction

Recently, tool learning with large language models (LLMs) has emerged as a promising paradigm for augmenting the capabilities of LLMs to tackle highly complex problems. Despite growing attention and rapid advancements in this field, the existing literature remains fragmented and lacks systematic organization, posing barriers to entry for newcomers. This gap motivates us to conduct a comprehensive survey of existing work on tool learning with LLMs. In this survey, we review the existing literature from two primary aspects: (1) why tool learning is beneficial and (2) how tool learning is implemented, enabling a comprehensive understanding of tool learning with LLMs. We first explore the “why” by reviewing both the benefits of tool integration and the inherent benefits of the tool learning paradigm from six specific aspects. In terms of “how”, we systematically review the literature according to a taxonomy of four key stages in the tool learning workflow: task planning, tool selection, tool calling, and response generation. Additionally, we provide a detailed summary of existing benchmarks and evaluation methods, categorizing them according to their relevance to different stages. Finally, we discuss current challenges and outline potential future directions, aiming to inspire both researchers and industrial developers to further explore this emerging and promising area.

The overall workflow for tool learning with large language models.
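
The four stages of the workflow (task planning, tool selection, tool calling, and response generation) can be illustrated with a toy pipeline. This is a minimal sketch under stated assumptions, not an implementation from the survey or from any listed paper: the single `calculator` tool, the keyword and regex heuristics, and every function name below are hypothetical stand-ins for what an LLM would do at each stage.

```python
import operator
import re
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical arithmetic pattern: "<number> <op> <number>".
_ARITH = r"\d+(?:\.\d+)?\s*[+\-*/]\s*\d+(?:\.\d+)?"
_OPS: Dict[str, Callable[[float, float], float]] = {
    "+": operator.add, "-": operator.sub,
    "*": operator.mul, "/": operator.truediv,
}


def calculator(expression: str) -> str:
    """Hypothetical tool: evaluate a single 'a op b' expression."""
    match = re.fullmatch(
        r"\s*(\d+(?:\.\d+)?)\s*([+\-*/])\s*(\d+(?:\.\d+)?)\s*", expression
    )
    if match is None:
        return "error: unsupported expression"
    left, op, right = float(match.group(1)), match.group(2), float(match.group(3))
    if op == "/" and right == 0:
        return "error: division by zero"
    return f"{_OPS[op](left, right):g}"


# Hypothetical tool registry: tool name -> callable.
TOOLS: Dict[str, Callable[[str], str]] = {"calculator": calculator}


def plan_tasks(query: str) -> List[str]:
    """Stage 1, task planning: split the query into sub-tasks (toy rule: ' and ')."""
    return [part.strip() for part in query.split(" and ") if part.strip()]


def select_tool(subtask: str) -> Optional[str]:
    """Stage 2, tool selection: choose a tool name, or None if no tool applies."""
    return "calculator" if re.search(_ARITH, subtask) else None


def call_tool(tool_name: str, subtask: str) -> str:
    """Stage 3, tool calling: extract the parameters and invoke the tool."""
    found = re.search(_ARITH, subtask)
    if found is None:
        return "error: no parameters found"
    return TOOLS[tool_name](found.group(0))


def generate_response(observations: List[Tuple[str, str]]) -> str:
    """Stage 4, response generation: merge tool outputs into one answer."""
    if not observations:
        return "No tool was needed."
    return "; ".join(f"{subtask} -> {output}" for subtask, output in observations)


def answer(query: str) -> str:
    """Run the four stages in order for one user query."""
    observations: List[Tuple[str, str]] = []
    for subtask in plan_tasks(query):
        tool_name = select_tool(subtask)
        if tool_name is not None:
            observations.append((subtask, call_tool(tool_name, subtask)))
    return generate_response(observations)


if __name__ == "__main__":
    print(answer("compute 12 * 7 and compute 10 / 4"))
```

In a real system each stage would typically be handled by an LLM (or a retriever, for tool selection) rather than by string rules; the sketch only shows how the stages hand results to one another.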

📄 Paper List

Why Tool Learning?

Benefit of Tools.

  • Knowledge Acquisition.

    • Search Engine

      Internet-Augmented Dialogue Generation, ACL 2022. [Paper]

      WebGPT: Browser-assisted question-answering with human feedback, Preprint 2021. [Paper]

      Internet-augmented language models through few-shot prompting for open-domain question answering, Preprint 2022. [Paper]

      REPLUG: Retrieval-Augmented Black-Box Language Models, Preprint 2023. [Paper]

      Toolformer: Language Models Can Teach Themselves to Use Tools, NeurIPS 2023. [Paper]

      ART: Automatic multi-step reasoning and tool-use for large language models, Preprint 2023. [Paper]

      ToolCoder: Teach Code Generation Models to use API search tools, Preprint 2023. [Paper]

      CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing, ICLR 2024. [Paper]

    • Database & Knowledge Graph

      LaMDA: Language Models for Dialog Applications, Preprint 2022. [Paper]

      Gorilla: Large Language Model Connected with Massive APIs, Preprint 2023. [Paper]

      ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings, NeurIPS 2023. [Paper]

      ToolQA: A Dataset for LLM Question Answering with External Tools, NeurIPS 2023. [Paper]

      Syntax Error-Free and Generalizable Tool Use for LLMs via Finite-State Decoding, NeurIPS 2023. [Paper]

      Middleware for LLMs: Tools are Instrumental for Language Agents in Complex Environments, Preprint 2024. [Paper]

    • Weather or Map

      On the Tool Manipulation Capability of Open-source Large Language Models, NeurIPS 2023. [Paper]

      ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases, Preprint 2023. [Paper]

      Tool Learning with Foundation Models, Preprint 2023. [Paper]

  • Expertise Enhancement.

    • Mathematical Tools

      Training verifiers to solve math word problems, Preprint 2021. [Paper]

      MRKL Systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning, Preprint 2021. [Paper]

      Chaining Simultaneous Thoughts for Numerical Reasoning, EMNLP 2022. [Paper]

      Calc-X and Calcformers: Empowering Arithmetical Chain-of-Thought through Interaction with Symbolic Systems, EMNLP 2023. [Paper]

      Solving math word problems by combining language models with symbolic solvers, NeurIPS 2023. [Paper]

      Evaluating and improving tool-augmented computation-intensive math reasoning, NeurIPS 2023. [Paper]

      ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving, ICLR 2024. [Paper]

      MATHSENSEI: A Tool-Augmented Large Language Model for Mathematical Reasoning, Preprint 2024. [Paper]

      Calc-CMU at SemEval-2024 Task 7: Pre-Calc -- Learning to Use the Calculator Improves Numeracy in Language Models, NAACL 2024. [Paper]

      MathViz-E: A Case-study in Domain-Specialized Tool-Using Agents, Preprint 2024. [Paper]

    • Python Interpreter

      PAL: Program-aided Language Models, ICML 2023. [Paper]

      Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks, TMLR 2023. [Paper]

      Fact-Checking Complex Claims with Program-Guided Reasoning, ACL 2023. [Paper]

      Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models, NeurIPS 2023. [Paper]

      LeTI: Learning to Generate from Textual Interactions, NAACL 2024. [Paper]

      MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback, ICLR 2024. [Paper]

      Executable Code Actions Elicit Better LLM Agents, ICML 2024. [Paper]

      CodeNav: Beyond tool-use to using real-world codebases with LLM agents, Preprint 2024. [Paper]

      APPL: A Prompt Programming Language for Harmonious Integration of Programs and Large Language Model Prompts, Preprint 2024. [Paper]

      BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions, Preprint 2024. [Paper]

      CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges, ACL 2024. [Paper]

    • Others

      Chemical: MultiTool-CoT: GPT-3 Can Use Multiple External Tools with Chain of Thought Prompting, ACL 2023. [Paper]

      ChemCrow: Augmenting large-language models with chemistry tools, Nature Machine Intelligence 2024. [Paper]

      A Review of Large Language Models and Autonomous Agents in Chemistry, Preprint 2024. [Paper]

      Biomedical: GeneGPT: Augmenting Large Language Models with Domain Tools for Improved Access to Biomedical Information, ISMB 2024. [Paper]

      Financial: Equipping Language Models with Tool Use Capability for Tabular Data Analysis in Finance, EACL 2024. [Paper]

      Financial: Simulating Financial Market via Large Language Model based Agents, Preprint 2024. [Paper]

      Medical: AgentMD: Empowering Language Agents for Risk Prediction with Large-Scale Clinical Tool Learning, Preprint 2024. [Paper]

      MMedAgent: Learning to Use Medical Tools with Multi-modal Agent, Preprint 2024. [Paper]

      Recommendation: Let Me Do It For You: Towards LLM Empowered Recommendation via Tool Learning, SIGIR 2024. [Paper]

      Gas Turbines: Domain-Specific ReAct for Physics-Integrated Iterative Modeling: A Case Study of LLM Agents for Gas Path Analysis of Gas Turbines, Preprint 2024. [Paper]

      WORLDAPIS: The World Is Worth How Many APIs? A Thought Experiment, ACL 2024 Workshop. [Paper]

  • Automation and Efficiency.

    • Schedule Tools

      ToolQA: A Dataset for LLM Question Answering with External Tools, NeurIPS 2023. [Paper]

    • Set Reminders

      ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs, ICLR 2024. [Paper]

    • Filter Emails

      ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs, ICLR 2024. [Paper]

    • Project Management

      ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs, ICLR 2024. [Paper]

    • Online Shopping Assistants

      WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents, NeurIPS 2022. [Paper]

  • Interaction Enhancement.

    • Multi-modal Tools

      ViperGPT: Visual Inference via Python Execution for Reasoning, ICCV 2023. [Paper]

      MM-REACT: Prompting ChatGPT for Multimodal Reasoning and Action, Preprint 2023. [Paper]

      InternGPT: Solving Vision-Centric Tasks by Interacting with ChatGPT Beyond Language, Preprint 2023. [Paper]

      AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn, Preprint 2023. [Paper]

      CLOVA: A closed-loop visual assistant with tool usage and update, CVPR 2024. [Paper]

      DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model, CVPR 2024. [Paper]

      MLLM-Tool: A Multimodal Large Language Model For Tool Agent Learning, Preprint 2024. [Paper]

      m&m's: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks, Preprint 2024. [Paper]

      From the Least to the Most: Building a Plug-and-Play Visual Reasoner via Data Synthesis, Preprint 2024. [Paper]

    • Machine Translator

      Toolformer: Language Models Can Teach Themselves to Use Tools, NeurIPS 2023. [Paper]

      Tool Learning with Foundation Models, Preprint 2023. [Paper]

    • Natural Language Processing Tools

      HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face, NeurIPS 2023. [Paper]

      GitAgent: Facilitating Autonomous Agent with GitHub by Tool Extension, Preprint 2023. [Paper]

Benefit of Tool Learning.

  • Enhanced Interpretability and User Trust.
  • Improved Robustness and Adaptability.

How
