Skip to content

A generalized information-seeking agent system with Large Language Models (LLMs).

License

Notifications You must be signed in to change notification settings

KwaiKEG/KwaiAgents

Repository files navigation

English |Trung văn|Nhật bổn ngữ




📚Dataset| 📚Benchmark| 🤗Models| 📑Paper

KwaiAgents is a series of Agent-related works open-sourced by theKwaiKEGfromKuaishou Technology.The open-sourced content includes:

  1. KAgentSys-Lite:a lite version of the KAgentSys in the paper. While retaining some of the original system's functionality, KAgentSys-Lite has certain differences and limitations when compared to its full-featured counterpart, such as: (1) a more limited set of tools; (2) a lack of memory mechanisms; (3) slightly reduced performance capabilities; and (4) a different codebase, as it evolves from open-source projects like BabyAGI and Auto-GPT. Despite these modifications, KAgentSys-Lite still delivers comparable performance among numerous open-source Agent systems available.
  2. KAgentLMs:a series of large language models with agent capabilities such as planning, reflection, and tool-use, acquired through the Meta-agent tuning proposed in the paper.
  3. KAgentInstruct:over 200k Agent-related instructions finetuning data (partially human-edited) proposed in the paper.
  4. KAgentBench:over 3,000 human-edited, automated evaluation data for testing Agent capabilities, with evaluation dimensions including planning, tool-use, reflection, concluding, and profiling.

Type Models Training Data Benchmark Data
Qwen Qwen-7B-MAT
Qwen-14B-MAT
Qwen-7B-MAT-cpp
Qwen1.5-14B-MAT
KAgentInstruct KAgentBench
Baichuan Baichuan2-13B-MAT



News

  • 2024.4.19 - Qwen1.5-14B-MAT model[link]released.
  • 2024.4.9 - Benchmark results have been refreshed.
  • 2024.1.29 - Qwen-14B-MAT model[link]released.
  • 2023.1.5 - Training data[link]released.
  • 2023.12.27 - 🔥🔥🔥 KwaiAgents have been reported on many sites.[ cơ khí chi tâm ][Medium][InfoQ],etc.
  • 2023.12.13 - The benchmark and evaluation code[link]released.
  • 2023.12.08 - Technical report[link]release.
  • 2023.11.17 - Initial release.

Evaluation

  1. Benchmark Results
Scale Planning Tool-use Reflection Concluding Profile Overall Score
GPT-3.5-turbo - 18.55 26.26 8.06 37.26 35.42 25.63
Llama2 13B 0.15 0.44 0.14 16.60 17.73 5.30
ChatGLM3 6B 7.87 11.84 7.52 30.01 30.14 15.88
Qwen 7B 13.34 18.00 7.91 36.24 34.99 21.17
Baichuan2 13B 6.70 16.10 6.76 24.97 19.08 14.89
ToolLlama 7B 0.20 4.83 1.06 15.62 10.66 6.04
AgentLM 13B 0.17 0.15 0.05 16.30 15.22 4.88
Qwen-MAT 7B 31.64 43.30 33.34 44.85 44.78 39.85
Baichuan2-MAT 13B 37.27 52.97 37.00 48.01 41.83 45.34
Qwen-MAT 14B 43.17 63.78 32.14 45.47 45.22 49.94
Qwen1.5-MAT 14B 42.42 64.62 30.58 46.51 45.95 50.18
  1. Human evaluation. Each result cell shows the pass rate (%) and the average score (in parentheses)
Scale NoAgent ReACT Auto-GPT KAgentSys
GPT-4 - 57.21% (3.42) 68.66% (3.88) 79.60% (4.27) 83.58% (4.47)
GPT-3.5-turbo - 47.26% (3.08) 54.23% (3.33) 61.74% (3.53) 64.18% (3.69)
Qwen 7B 52.74% (3.23) 51.74% (3.20) 50.25% (3.11) 54.23% (3.27)
Baichuan2 13B 54.23% (3.31) 55.72% (3.36) 57.21% (3.37) 58.71% (3.54)
Qwen-MAT 7B - 58.71% (3.53) 65.67% (3.77) 67.66% (3.87)
Baichuan2-MAT 13B - 61.19% (3.60) 66.67% (3.86) 74.13% (4.11)

User Guide

Prebuild environment

Install miniconda for build environment first. Then create build env first:

conda create -n kagent python=3.10
conda activate kagent
pip install -r requirements.txt

Using AgentLMs

Serving byvLLM(GPU)

We recommend usingvLLMandFastChatto deploy the model inference service. First, you need to install the corresponding packages (for detailed usage, please refer to the documentation of the two projects):

  1. For Qwen-7B-MAT, install the corresponding packages with the following commands
pip install vllm
pip install"fschat[model_worker,webui]"
  1. For Baichuan-13B-MAT, install the corresponding packages with the following commands
pip install"fschat[model_worker,webui]"
pip install vllm==0.2.0
pip install transformers==4.33.2

To deploy KAgentLMs, you first need to start the controller in one terminal.

python -m fastchat.serve.controller

Secondly, you should use the following command in another terminal for single-gpu inference service deployment:

python -m fastchat.serve.vllm_worker --model-path$model_path--trust-remote-code

Where$model_pathis the local path of the model downloaded. If the GPU does not support Bfloat16, you can add--dtype halfto the command line.

Thirdly, start the REST API server in the third terminal.

python -m fastchat.serve.openai_api_server --host localhost --port 8888

Finally, you can use the curl command to invoke the model same as the OpenAI calling format. Here's an example:

curl http://localhost:8888/v1/chat/completions \
-H"Content-Type: application/json"\
-d'{ "model": "kagentlms_qwen_7b_mat", "messages": [{ "role": "user", "content": "Who is Andy Lau" }]}'

Here, changekagentlms_qwen_7b_matto the model you deployed.

Serving byLamma.cpp(CPU)

llama-cpp-python offers a web server which aims to act as a drop-in replacement for the OpenAI API. This allows you to use llama.cpp compatible models with any OpenAI compatible client (language libraries, services, etc). The converted model can be found inkwaikeg/kagentlms_qwen_7b_mat_gguf.

To install the server package and get started:

pip install"llama-cpp-python[server]"
python3 -m llama_cpp.server --model kagentlms_qwen_7b_mat_gguf/ggml-model-q4_0.gguf --chat_format chatml --port 8888

Finally, you can use the curl command to invoke the model same as the OpenAI calling format. Here's an example:

curl http://localhost:8888/v1/chat/completions \
-H"Content-Type: application/json"\
-d'{ "messages": [{ "role": "user", "content": "Who is Andy Lau" }]}'

Using KAgentSys-Lite

Download and install the KwaiAgents, recommended Python>=3.10

git clone [email protected]:KwaiKEG/KwaiAgents.git
cdKwaiAgents
python setup.py develop
  1. ChatGPT usage Declare some environment variables
export OPENAI_API_KEY=sk-xxxxx
export WEATHER_API_KEY=xxxxxx

The WEATHER_API_KEY is not mandatory, but you need to configure it when asking weather-related questions. You can obtain the API key fromthis website(Same for local model usage).

kagentsys --query="Who is Andy Lau's wife?"--llm_name="gpt-3.5-turbo"--lang="en"
  1. Local model usage

To use a local model, you need to deploy the corresponding model service as described in the previous chapter

kagentsys --query="Who is Andy Lau's wife?"--llm_name="kagentlms_qwen_7b_mat"\
--use_local_llm --local_llm_host="localhost"--local_llm_port=8888 --lang="en"

Full command arguments:

options:
-h, --help show this help message and exit
--id ID ID of this conversation
--query QUERY User query
--history HISTORY History of conversation
--llm_name LLM_NAME the name of llm
--use_local_llm Whether to use local llm
--local_llm_host LOCAL_LLM_HOST
The host of local llm service
--local_llm_port LOCAL_LLM_PORT
The port of local llm service
--tool_names TOOL_NAMES
the name of llm
--max_iter_num MAX_ITER_NUM
the number of iteration of agents
--agent_name AGENT_NAME
The agent name
--agent_bio AGENT_BIO
The agent bio, a short description
--agent_instructions AGENT_INSTRUCTIONS
The instructions of how agent thinking, acting, or talking
--external_knowledge EXTERNAL_KNOWLEDGE
The link of external knowledge
--lang {en,zh} The language of the overall system
--max_tokens_num Maximum length of model input

Note:

  1. If you need to use thebrowse_websitetool, you need to configure thechromedriveron your server.
  2. If the search fails multiple times, it may be because the network cannot access duckduckgo_search. You can solve this by setting thehttp_proxy.

Using Custom tools

Custom tools usage can be found inexamples/custom_tool_example.py

Using KAgentBench Evaluation

We only need two lines to evaluate the agent capabilities like:

cdbenchmark
python infer_qwen.py qwen_benchmark_res.jsonl
python benchmark_eval.py./benchmark_eval.jsonl./qwen_benchmark_res.jsonl

The above command will give the results like

plan: 31.64, tooluse: 43.30, reflextion: 33.34, conclusion: 44.85, profile: 44.78, overall: 39.85

Please refer tobenchmark/for more details.

Citation

@article{pan2023kwaiagents,
author = {Haojie Pan and
Zepeng Zhai and
Hao Yuan and
Yaojia Lv and
Ruiji Fu and
Ming Liu and
Zhongyuan Wang and
Bing Qin
},
title = {KwaiAgents: Generalized Information-seeking Agent System with Large Language Models},
journal = {CoRR},
volume = {abs/2312.04889},
year = {2023}
}

Releases

No releases published

Packages

No packages published

Languages