Introduction
In this guide, you will gain a deep understanding of the ECM architecture implemented in this repository. We will explore the main modules and core approaches to solving the ECM problem. It is recommended that you first read the Theoretical Fundamentals and the Architecture Overview guides.
The primary challenge of an ECM lies in synchronizing multiple layers. Although an ECM can be approached with various architectures, this repository divides all modules into four layers:
- /cognition_layer: This directory contains all the AI agents responsible for thinking, planning, and reasoning, along with all the components necessary to fully build and deploy those agents, generally adhering to the Agent Protocol Standard.
- /ecm: This directory holds all the mediators, communicators, and middleware between the execution layer and the cognition layer. Only templates and virtual functions reside here, allowing the cognition layer to interact with the execution layer through any implementation using templates such as an Interpreter. Additionally, tools that can be used in both the execution and cognition layers, such as the Exelent Parser or the Item Registry, are found here.
- /execution_layer: This directory includes all modules that assist agents in executing commands, typically communicated using the Exelent language. Here, you can find thread managers, registries, callbacks, etc.
- /action_space: This directory defines all the actions that agents can take. By importing these modules, you add new interactions to your agent, with actions resolved by the execution layer. Modules for keyboard interaction, window management, etc., can be found here.
Base Execution Diagram for the ECM
Each layer in this architecture plays a crucial role in ensuring the efficiency, coherence, and functionality of the AI system. The following diagram illustrates the main execution flow of the ECM. The upper layers focus on cognitive interactions and decision-making, while the lower layers handle practical execution and management of specific functions. This separation of responsibilities allows for better organization and facilitates maintenance and scalability of the system.
Cognition Layer Agent
The Cognition Layer Agent is responsible for interacting with the LLM. Its primary function is to handle problems by receiving a user query and generating a response to it, generally returning Exelent code. For this implementation, the agent can use tools, multiple steps, requests to the LLM, etc. Regardless of the agent's internal workings, the external interface behaves as an iterator, where each step of the agent is controlled by lower layers.
Some implementations of this layer can be found in the /cognition_layer/[module]/agents directory, where each agent follows different architectures. Here are some examples:
- /planex: Planex uses three different agents (Planification, Reduction, Translation) executed sequentially. Each agent contains a function .plan(), .reduce(), or .translate() to chain the data generated by the LLM, returning a string with the generated plan (a rough sketch of this chaining follows the list).
- /planexv2: PlanexV2 reuses the code of the original Planex agent but implements a new Blamer agent that checks whether the code is correct. If not, it executes the failed agent again until obtaining a valid plan.
- /RePlan: RePlan uses the ReAct architecture to control the execution of the generated code. It uses PlanexV2 to generate Exelent files and, once generated, manages the interpreters to approve, control, and check the execution of the plan.
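As a rough illustration of the Planex-style chaining described above, the flow can be pictured as follows. This is only a sketch: the class names and the pipeline function are hypothetical, and only .plan(), .reduce(), and .translate() come from the description of the agents.

# Illustrative sketch only: the real Planex agents live under /cognition_layer/planex
# and may be structured differently.
def run_planex(query: str, planner, reducer, translator) -> str:
    """Chain the three Planex-style agents sequentially."""
    plan = planner.plan(query)                     # raw plan generated with the LLM
    reduced = reducer.reduce(plan)                 # plan condensed into simpler steps
    exelent_plan = translator.translate(reduced)   # final plan returned as a string
    return exelent_plan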
Note that all these agents depend on a set of prompts usually found in the directory /cognition_layer/[agent]/agents/prompts.py.
Some implementations of these agents in action can be found in the /tests/sandbox directory, where you can experiment with different queries.
Note also how some of these agents implement an iter() function, which returns a generator allowing you to easily execute all the steps of the agent. The messages returned may vary between different agents, but they can implement a Response model as a dataclass, where you can obtain information about each step.
# RePlan and ReplanResponse come from the RePlan agent module in the cognition layer
from colorama import Style  # resets terminal colors after the (possibly colored) step name

agent = RePlan()
step: ReplanResponse
# In this case the generator is async, but this doesn't need to be true for all agents
async for step in agent.iter(query="Open the terminal in linux"):
    print(f"[{step.name}]" + Style.RESET_ALL)
    print(step.content)
Further information about each agent can be found in the same file or in the wiki, if the agent has already been published.
FastAgentProtocol
The FastAgentProtocol is a simplified and faster alternative to the previous AgentProtocol, which required server setups and remote API configurations. The new approach eliminates unnecessary complexity, enabling direct and efficient communication between agents and the main system.
Note: This protocol can be found at /cognition_layer/protocols/fast_ap.py
- Simplification: No more server setup or port management.
- Performance: Local execution reduces latency.
- Standardization: Provides a unified, consistent framework for agent integration.
The FastAgentProtocol acts as a mediator between the main system and cognitive agents. It leverages an iterator-based approach, where the agent defines an iterative process that yields step-by-step progress. Each step includes information such as the step's name, content, and whether it's the final step.
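A minimal sketch of this iterator-based contract is shown below. The field and function names are illustrative assumptions; the real definitions live in /cognition_layer/protocols/fast_ap.py and may differ.

from dataclasses import dataclass
from typing import Iterator

# Illustrative sketch only: the real protocol classes may use different names.
@dataclass
class AgentStep:
    name: str       # name of the step (e.g. "Planification")
    content: str    # content produced by the agent at this step
    is_last: bool   # whether this is the agent's final step

def consume(steps: Iterator[AgentStep]) -> None:
    """Drive an agent through the protocol, printing progress until the final step."""
    for step in steps:
        print(f"[{step.name}] {step.content}")
        if step.is_last:
            break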
More information can be found in the FastAgentProtocol section of the wiki.
ECM Core
The Main Module, or ECM Core, can be found at /ecm/core and contains the main execution loop of the system. This component is the operational core where all modules are managed, and it is where you can test all agents. These agents are implemented at the top of the file and run until stopped by the user.
Here are some properties of this module:
- This module does not depend on the implementations or internals of the cognition/execution layers; it uses only the virtual interfaces/APIs from the middlewares.
- By default, all agent actions are deactivated on the host for safety reasons. Instead of executing, the actions only log a simulation of each action (keyboard, mouse control, etc.); this applies only to Action Space actions, not to CognitionLayer steps. If you want to run an agent on the host machine with full integration, use /ecm/core/run_in_host.py.
You can run this module using the following command:
python ecm/core/main.py --debug --agent [Planex|PlanexV2|RePlan...]
You can also use -h for more information about the arguments.
python ecm/core/main.py -h
Interpreter
The Interpreter unifies all executors within a single framework, allowing the system to execute functions and read commands from Exelent files. All interpreters must satisfy the attributes defined in the virtual class found at /ecm/mediator/Interpreter.py. This means all interpreters must contain the following properties:
- Contain and inherit all the methods and attributes defined in the virtual Interpreter class.
- Define a supports class, inheriting from the InterpreterSupports class, that declares which properties the interpreter supports (at least the contained properties, extendable as needed).
- If the Interpreter supports returning feedback, it must return a Feedback object: a subclass of the Feedback class defined in /ecm/mediator/feedback.py that implements all specified methods/properties. Note that all feedback generated must always contain an _exec_code defining the status of the execution, using the ExecutionStatus(Enum) class defined in /ecm/mediator/feedback.py (a sketch follows this list).
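A minimal sketch of an interpreter following these rules might look like the code below. All class and method names are stand-ins for illustration; in real code you would inherit from the base classes in /ecm/mediator/Interpreter.py and /ecm/mediator/feedback.py rather than redefining them.

from enum import Enum

# Illustrative stand-ins: inherit from the real base classes instead of redefining them.
class ExecutionStatus(Enum):          # stand-in for the real ExecutionStatus(Enum)
    SUCCESS = "success"
    FAILURE = "failure"

class MyFeedback:                     # would inherit from Feedback
    _exec_code: ExecutionStatus = ExecutionStatus.SUCCESS  # required on every feedback object

class MyInterpreterSupports:          # would inherit from InterpreterSupports
    feedback: bool = True             # declare which capabilities this interpreter provides

class MyInterpreter:                  # would inherit from the virtual Interpreter class
    supports = MyInterpreterSupports

    def run(self, exelent_code: str) -> MyFeedback:
        # ... hand the Exelent code to the execution layer here ...
        return MyFeedback()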
Execution Layer Module
The Execution Layer Module manages the execution of functions within the system. This component coordinates and oversees the operation of various threads, functions, callbacks, etc., needed for building the interpreter and achieving the full execution of multiple tasks and different behaviors defined in the Exelent language. This module can vary significantly, making it challenging to establish a standard implementation. However, you can explore the ROSA Insights to further investigate the key problems of this layer.
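Purely as orientation, and not as the repository's actual implementation, one possible minimal shape for this layer is a small executor that resolves registered actions and dispatches them, optionally on worker threads. Every name below is hypothetical.

import threading
from typing import Callable, Dict

# Hypothetical sketch of a minimal executor; real implementations vary widely.
class SimpleExecutor:
    def __init__(self, actions: Dict[str, Callable[..., str]]):
        self.actions = actions  # maps action names to registered callables

    def execute(self, name: str, *args: str) -> str:
        """Resolve and run a single action synchronously."""
        return self.actions[name](*args)

    def execute_async(self, name: str, *args: str) -> threading.Thread:
        """Run an action on a worker thread, as a thread manager might."""
        thread = threading.Thread(target=self.execute, args=(name, *args))
        thread.start()
        return thread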
Action Space
The Action Space consists of executable functions that can be run on the host. It represents the set of actions and commands available for the AI to carry out in response to requests and tasks assigned. It should contain multiple functions to facilitate the actions of the Agents. These functions have the following properties:
- They can raise feedback if the interpreter that calls them enables it, but it is better to return information directly to the agent and to save state between calls.
- They must always receive str or list[str] as arguments and return str, because they will be managed by an AI that works with words.
- They should be as simple and straightforward as possible. Avoid more than 2 arguments per function.
- They must always contain a docstring explaining their functionality and how to use them. This docstring will be shown to the agents.
These actions can be defined anywhere, but it is standard to define them in the /action_space directory. To define them, you just have to use the Item Registry as follows:
from ecm.tools.registry import ItemRegistry

@ItemRegistry.register(type="action")
def hello_world():
    """Says hello world.\nUsage: hello_world()"""  # noqa
    print("Hello World!")
If you want the Agents to have multiple names for your function, you can use the alias function of the Item Registry. This makes it easier and more likely for the AI to use your tool.
@ItemRegistry.alias([
    "say_hello_world",
    "greet",
])
@ItemRegistry.register(type="action")
def hello_world():
    """Says hello world.\nUsage: hello_world()"""  # noqa
    print("Hello World!")
Note: Don't use more than 3 aliases unless necessary; each alias results in more tokens being parsed by the Agent.
Finally, note that all these functions must be imported in order to be registered in the ItemRegistry. Therefore, in multiple files, you will find imports like the following, even though they are not used in the file:
import action_space.talk.greet # noqa
Note: More information about the ItemRegistry can be found on its own section at the wiki.
Conclusion
This guide has provided a comprehensive overview of the ECM architecture implemented in this repository, detailing its layered structure and key components.
Each layer and module plays a vital role in the overall architecture. By adhering to the defined standards and protocols, the system achieves a high level of integration and functionality, capable of synchronizing both the cognition and execution layers.