Agents Paper

From the Kaggle paper on Agents.

Agents are composed of three components:

Orchestration Layer

This layer covers things like:

Instructions
Agent Profiles
Agent goals and objectives
Memory (short- and long-term)
Model-based reasoning and planning, etc.

Tools

Types of tools:

Extensions
Functions
Data Stores (Vector DBs)

Functions are executed on the client side, while extensions (what we’d call plugins) are executed agent-side.
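
To make the client-side point concrete, here is a minimal sketch of the function pattern: the model only proposes a call (a name plus arguments), and the client application is the one that actually runs it. Everything named here (`get_weather`, `fake_model`, `run_on_client`, the JSON shape) is a hypothetical stand-in for illustration, not an API from the paper or from any library.

```python
import json

def get_weather(city: str) -> str:
    """Hypothetical client-side function; a real one would call an API."""
    return f"Sunny in {city}"

# Registry of functions the client is willing to run on the model's behalf.
FUNCTIONS = {"get_weather": get_weather}

def fake_model(user_query: str) -> str:
    """Stub for the model: it proposes a call as JSON and executes nothing."""
    return json.dumps({"name": "get_weather", "args": {"city": "Paris"}})

def run_on_client(model_output: str) -> str:
    """The client parses the proposed call and executes it itself."""
    call = json.loads(model_output)
    func = FUNCTIONS[call["name"]]
    return func(**call["args"])

result = run_on_client(fake_model("What's the weather in Paris?"))
print(result)
```

With an extension, the equivalent of `run_on_client` would live inside the agent instead, so the calling application never sees the intermediate call.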

Model

The model/LLM must be capable of following instruction-based reasoning and logic frameworks, like ReAct, Chain-of-Thought or Tree-of-Thoughts. It can be general-purpose, multimodal or fine-tuned as needed. Tip: it helps to fine-tune the model on examples that show the specific tools or reasoning steps being used in various contexts.

Reasoning Frameworks

ReAct

This is a prompt engineering framework that lets the model reason about a user query and take action on it, with or without in-context examples.
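
A rough sketch of what a ReAct-style loop can look like: the model alternates a thought with an action, the loop runs the action and appends the observation, and this repeats until the model emits a final answer. The model here is a scripted stub and the tool, its data, and the `Action: name[input]` text format are all made up for illustration; real prompts and parsers vary.

```python
import re

def lookup_population(city: str) -> str:
    """Hypothetical tool with made-up data; a real agent would query a service."""
    return {"Exampleville": "12,345"}.get(city, "unknown")

TOOLS = {"lookup_population": lookup_population}

def scripted_model(prompt: str) -> str:
    """Stub for an LLM: acts first, then answers once it sees an observation."""
    if "Observation:" not in prompt:
        return "Thought: I need the population.\nAction: lookup_population[Exampleville]"
    observation = prompt.rsplit("Observation: ", 1)[1].strip()
    return f"Thought: I have what I need.\nFinal Answer: Exampleville has {observation} people."

def react(question: str, max_steps: int = 5) -> str:
    prompt = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = scripted_model(prompt)
        prompt += reply + "\n"
        final = re.search(r"Final Answer: (.*)", reply)
        if final:
            return final.group(1)
        action = re.search(r"Action: (\w+)\[(.*?)\]", reply)
        if action is None:
            break  # the model produced neither an action nor an answer
        tool_name, tool_input = action.groups()
        # Run the tool and feed the result back as an observation.
        prompt += f"Observation: {TOOLS[tool_name](tool_input)}\n"
    return "No answer found."

answer = react("How many people live in Exampleville?")
print(answer)
```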

Chain-of-Thought

Reasoning through intermediate steps. Flavors include:

Self-consistency
Active-prompt
Multimodal CoT
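
Of these flavors, self-consistency is the easiest to sketch: sample several reasoning chains for the same question, keep only each chain's final answer, and return the most common one. In the sketch below `sample_answer` is a hypothetical stub that replays canned answers; in practice it would be a model call with sampling turned on.

```python
from collections import Counter
from itertools import cycle

# Canned final answers standing in for independently sampled reasoning chains.
_samples = cycle(["42", "42", "41", "42", "40"])

def sample_answer(question: str) -> str:
    """Stub for one sampled chain of thought; returns only its final answer."""
    return next(_samples)

def self_consistency(question: str, n_samples: int = 5) -> str:
    """Majority vote over the final answers of several sampled chains."""
    answers = [sample_answer(question) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

best = self_consistency("What is 6 * 7?")
print(best)
```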

Tree-of-Thoughts

Suited for exploration or strategic look-ahead. This generalizes over CoT prompting and lets the model explore various thought chains.
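
One way to picture the exploration is as a small beam search over partial thought chains: at each level, propose several next thoughts per chain, score them, and keep only the most promising few. The sketch below uses a toy problem (pick steps of 1, 2 or 3 that sum to a target), and `propose` and `score` are hypothetical stubs for what would normally be model calls. Beam search is only one possible search strategy here.

```python
from typing import List, Tuple

Thought = Tuple[int, ...]  # a partial chain of steps

def propose(thought: Thought) -> List[Thought]:
    """Stub for the model proposing next thoughts: extend the chain by 1, 2 or 3."""
    return [thought + (step,) for step in (1, 2, 3)]

def score(thought: Thought, target: int) -> int:
    """Stub for the model evaluating a state: closer to the target is better."""
    return -abs(target - sum(thought))

def tree_of_thoughts(target: int, depth: int = 3, beam_width: int = 2) -> Thought:
    frontier: List[Thought] = [()]
    for _ in range(depth):
        # Expand every surviving chain, then keep only the best few.
        candidates = [c for t in frontier for c in propose(t)]
        candidates.sort(key=lambda t: score(t, target), reverse=True)
        frontier = candidates[:beam_width]
    return frontier[0]

best_chain = tree_of_thoughts(target=7)
print(best_chain, sum(best_chain))
```

Plain CoT corresponds to `beam_width=1` with a single proposal per step, which is why ToT can be seen as a generalization of it.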
