Tag: ai agent

Never Start From Scratch: Persistent Browser Sessions for AI Agents
Building AI agents that interact with the web presents unique challenges. One of the most frustrating is the lack of a persistent browser session. Imagine an AI assistant that has to log in to a website every time it needs to perform a task. This repetition is slow, breaks the flow of information between tasks, and can lead to errors. Fortunately, there’s a solution: maintaining persistent browser sessions for your AI agents.

The Problem with Stateless AI Web Interactions

Without a persistent browser session, each interaction with a website is treated as a brand new visit. This means your AI agent loses all previous context, including login credentials, cookies, and browsing history. This “stateless” approach forces the agent to start from scratch each time, leading to:
- Repetitive Logins: Constant login prompts hinder automation and slow down processes.
- Loss of Context: Crucial information from previous interactions is lost, impacting the agent’s ability to perform complex tasks.
- Inefficient Resource Use: Repeatedly loading websites and resources consumes unnecessary time and computing power.
The Power of Persistent Browser Sessions for AI

A persistent browser session lets your agent keep its browser state for a website across multiple interactions instead of starting fresh each time. This lets you:
- Eliminate Repetitive Logins: Your AI agent stays logged in, ready to perform tasks without interruption.
- Preserve Context: Retain crucial information like cookies, browsing history, and form data for seamless task execution.
- Streamline Workflow: Enable complex, multi-step automation without constantly restarting the process. This is crucial for tasks like web scraping, data extraction, and automated testing.
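To make the difference concrete, here is a minimal sketch that counts how many logins an agent needs with and without a persistent session. Everything in it (`FakeSite`, `runTasks`, the cookie name) is a hypothetical in-memory model for illustration, not part of Browser-Use.

```typescript
// Hypothetical in-memory model of a site that requires a session cookie.
type Cookies = Map<string, string>;

class FakeSite {
  loginCount = 0;

  // Stores a session cookie and counts how many logins were needed.
  login(cookies: Cookies): void {
    this.loginCount += 1;
    cookies.set("session", `token-${this.loginCount}`);
  }

  // A task succeeds only when the caller presents a session cookie.
  runTask(cookies: Cookies): boolean {
    return cookies.has("session");
  }
}

// Runs `taskCount` tasks and returns the number of logins performed.
// When `persistent` is true, one cookie jar is reused across all tasks.
function runTasks(site: FakeSite, taskCount: number, persistent: boolean): number {
  let cookies: Cookies = new Map();
  for (let i = 0; i < taskCount; i++) {
    if (!persistent) cookies = new Map(); // stateless: fresh browser state per task
    if (!cookies.has("session")) site.login(cookies);
    site.runTask(cookies);
  }
  return site.loginCount;
}

const statelessLogins = runTasks(new FakeSite(), 3, false);
const persistentLogins = runTasks(new FakeSite(), 3, true);
console.log(statelessLogins, persistentLogins); // 3 1
```

With three tasks, the stateless run logs in three times while the persistent run logs in once.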
How Browser-Use Enables Persistent Sessions

Browser-Use offers a way to manage persistent browser context for AI agents. Its web UI can keep the browser open between tasks, so your agents do not have to rebuild their session each time. This is especially useful for long-running browser sessions that require continuous interaction with web applications.

Installation Guide

Prerequisites
- Python 3.11 or higher
- Git (for cloning the repository)
Option 1: Local Installation

Read the quickstart guide or follow the steps below to get started.

Step 1: Clone the Repository
```
git clone https://github.com/browser-use/web-ui.git
cd web-ui
```
Step 2: Set Up Python Environment

We recommend using uv for managing the Python environment.

Using uv (recommended):
```
uv venv --python 3.11
```
Activate the virtual environment:
- Windows (Command Prompt):
```
.venv\Scripts\activate
```
- Windows (PowerShell):
```
.\.venv\Scripts\Activate.ps1
```
- macOS/Linux:
```
source .venv/bin/activate
```
Step 3: Install Dependencies

Install Python packages:
```
uv pip install -r requirements.txt
```
Install Playwright:
```
playwright install
```
Step 4: Configure Environment
1. Create a copy of the example environment file:
- Windows (Command Prompt):
```
copy .env.example .env
```
- macOS/Linux/Windows (PowerShell):
```
cp .env.example .env
```
2. Open .env in your preferred text editor and add your API keys and other settings.
Option 2: Docker Installation

Prerequisites
- Docker and Docker Compose installed
  - Docker Desktop (For Windows/macOS)
  - Docker Engine and Docker Compose (For Linux)
Installation Steps
1. Clone the repository:
```
git clone https://github.com/browser-use/web-ui.git
cd web-ui
```
2. Create and configure environment file:
- Windows (Command Prompt):
```
copy .env.example .env
```
- macOS/Linux/Windows (PowerShell):
```
cp .env.example .env
```
Edit .env with your preferred text editor and add your API keys
3. Run with Docker:
```
# Build and start the container with default settings (browser closes after AI tasks)
docker compose up --build
```
```
# Or run with persistent browser (browser stays open between AI tasks)
CHROME_PERSISTENT_SESSION=true docker compose up --build
```
4. Access the Application:
- Web Interface: Open http://localhost:7788 in your browser
- VNC Viewer (for watching browser interactions): Open http://localhost:6080/vnc.html
  - Default VNC password: “youvncpassword”
  - Can be changed by setting VNC_PASSWORD in your .env file
Docker Setup

Environment Variables:
- All configuration is done through the .env file
- Available environment variables:
```
# LLM API Keys
OPENAI_API_KEY=your_key_here
ANTHROPIC_API_KEY=your_key_here
GOOGLE_API_KEY=your_key_here

# Browser Settings
CHROME_PERSISTENT_SESSION=true   # Set to true to keep browser open between AI tasks
RESOLUTION=1920x1080x24         # Custom resolution format: WIDTHxHEIGHTxDEPTH
RESOLUTION_WIDTH=1920           # Custom width in pixels
RESOLUTION_HEIGHT=1080          # Custom height in pixels

# VNC Settings
VNC_PASSWORD=your_vnc_password  # Optional, defaults to "vncpassword"
```
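The RESOLUTION value packs three numbers into one WIDTHxHEIGHTxDEPTH string. If you want to sanity-check a value before starting the container, a small parser like the one below can help. The `parseResolution` helper is a hypothetical illustration; how the container itself reads this variable is not shown here.

```typescript
interface Resolution {
  width: number;
  height: number;
  depth: number;
}

// Parses a WIDTHxHEIGHTxDEPTH string such as "1920x1080x24".
// Throws when the value does not match that format.
function parseResolution(value: string): Resolution {
  const match = /^(\d+)x(\d+)x(\d+)$/.exec(value.trim());
  if (match === null) {
    throw new Error(`Expected WIDTHxHEIGHTxDEPTH, got "${value}"`);
  }
  return {
    width: Number(match[1]),
    height: Number(match[2]),
    depth: Number(match[3]),
  };
}

const parsed = parseResolution("1920x1080x24");
console.log(parsed.width, parsed.height, parsed.depth); // 1920 1080 24
```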
Platform Support:
- Supports both AMD64 and ARM64 architectures
- For ARM64 systems (e.g., Apple Silicon Macs), the container will automatically use the appropriate image

Browser Persistence Modes:
- Default Mode (CHROME_PERSISTENT_SESSION=false):
  - Browser opens and closes with each AI task
  - Clean state for each interaction
  - Lower resource usage
- Persistent Mode (CHROME_PERSISTENT_SESSION=true):
  - Browser stays open between AI tasks
  - Maintains history and state
  - Allows viewing previous AI interactions
  - Set in .env file or via environment variable when starting container
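The two modes come down to one decision at the end of each task: close the browser or leave it open. The sketch below models that decision. `BrowserStub`, `isPersistentSession`, and `finishTask` are hypothetical names, and treating only the string "true" as enabled is an assumption, not necessarily how the web UI parses the variable.

```typescript
// Assumption: only the string "true" (case-insensitive) enables persistence;
// an unset variable falls back to the default, non-persistent mode.
function isPersistentSession(env: Record<string, string | undefined>): boolean {
  const raw = env["CHROME_PERSISTENT_SESSION"] ?? "false";
  return raw.trim().toLowerCase() === "true";
}

// Hypothetical stand-in for a real browser handle.
class BrowserStub {
  open = true;

  close(): void {
    this.open = false;
  }
}

// Closes the browser after a task unless persistent mode is on.
// Returns whether the browser is still open afterwards.
function finishTask(browser: BrowserStub, env: Record<string, string | undefined>): boolean {
  if (!isPersistentSession(env)) {
    browser.close();
  }
  return browser.open;
}

console.log(finishTask(new BrowserStub(), { CHROME_PERSISTENT_SESSION: "true" })); // true
console.log(finishTask(new BrowserStub(), {})); // false
```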

Viewing Browser Interactions:
- Access the noVNC viewer at http://localhost:6080/vnc.html
- Enter the VNC password (default: “vncpassword” or what you set in VNC_PASSWORD)
- Direct VNC access available on port 5900 (mapped to container port 5901)
- You can now see all browser interactions in real-time

Persistent browser sessions are essential for building efficient and robust AI agents that interact with the web. By eliminating repetitive logins, preserving context, and streamlining workflows, you can unlock the true potential of AI web automation. Explore Browser-Use and discover how its persistent session management can revolutionize your AI development process. Start building smarter, more efficient AI agents today!
February 15, 2025
How to add custom actions and skills in Eliza AI?
Eliza is a versatile multi-agent simulation framework, built in TypeScript, that allows you to create sophisticated, autonomous AI agents. These agents can interact across multiple platforms while maintaining consistent personalities and knowledge. A key feature that enables this flexibility is the ability to define custom actions and skills. This article will delve into how you can leverage this feature to make your Eliza agents even more powerful.

Understanding Actions in Eliza

Actions are the fundamental building blocks that dictate how Eliza agents respond to and interact with messages. They allow agents to go beyond simple text replies, enabling them to:
- Interact with external systems.
- Modify their behavior dynamically.
- Perform complex tasks.
Each action in Eliza consists of several key components:
- name: A unique identifier for the action.
- similes: Alternative names or triggers that can invoke the action.
- description: A detailed explanation of what the action does.
- validate: A function that checks if the action is appropriate to execute in the current context.
- handler: The implementation of the action’s behavior – the core logic that the action performs.
- examples: Sample conversations that demonstrate proper usage patterns.
- suppressInitialMessage: When set to true, it prevents the initial message from being sent before processing the action.
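To illustrate how name and similes relate, the sketch below looks up an action by either its name or one of its similes. `ActionLike` and `findAction` are hypothetical and use simplified local types. How Eliza actually matches triggers is not covered here, so treat the case-insensitive comparison as an assumption.

```typescript
// Simplified local shape; the real Action interface has more fields.
interface ActionLike {
  name: string;
  similes: string[];
}

// Finds an action whose name or any simile matches the requested trigger.
// Assumption: matching ignores case and surrounding whitespace.
function findAction<T extends ActionLike>(actions: T[], requested: string): T | undefined {
  const key = requested.trim().toUpperCase();
  return actions.find(
    (action) =>
      action.name.toUpperCase() === key ||
      action.similes.some((simile) => simile.toUpperCase() === key),
  );
}

const registeredActions: ActionLike[] = [
  { name: "PROCESS_DOCUMENT", similes: ["READ_DOCUMENT", "ANALYZE_DOCUMENT"] },
  { name: "NONE", similes: [] },
];

console.log(findAction(registeredActions, "read_document")?.name); // PROCESS_DOCUMENT
```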
Built-in Actions

Eliza includes several built-in actions to manage basic conversation flow and external integrations:
- CONTINUE: Keeps a conversation going when more context is required.
- IGNORE: Gracefully disengages from a conversation.
- NONE: Default action for standard conversational replies.
- TAKE_ORDER: Records and processes user purchase orders (primarily for Solana integration).
Creating Custom Actions: Expanding Eliza’s Capabilities

The power of Eliza truly shines when you start implementing custom actions and skills. Here’s how to create them:
1. Create a custom_actions directory: This is where you’ll store your action files.
2. Add your action files: Each action is defined in its own TypeScript file, implementing the Action interface.
3. Configure in elizaConfig.yaml: Point to your custom actions by adding entries under the actions key.
```
actions:
    - name: myCustomAction
      path: ./custom_actions/myAction.ts
```
Action Configuration Structure

Here’s an example of how to structure your action file:
```
import { Action, IAgentRuntime, Memory } from "@elizaos/core";

export const myAction: Action = {
    name: "MY_ACTION",
    similes: ["SIMILAR_ACTION", "ALTERNATE_NAME"],
    validate: async (runtime: IAgentRuntime, message: Memory) => {
        // Validation logic here
        return true;
    },
    description: "A detailed description of your action.",
    handler: async (runtime: IAgentRuntime, message: Memory) => {
        // The actual logic of your action
        return true;
    },
};
```
Implementing a Custom Action
- Validation: Before an action executes, the validate function is called to determine whether it can proceed. It checks that all the prerequisites for that specific action are met.
- Handler: The handler function contains the core logic of the action. It interacts with the agent runtime and memory and performs the desired tasks, such as calling external APIs, processing data, or generating output.
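The validate-then-handle order can be sketched as a small dispatcher. This uses simplified local types (`Message`, `SimpleAction`) rather than the `@elizaos/core` interfaces, and `runAction` is a hypothetical helper, not Eliza's actual runtime code.

```typescript
// Simplified local types for illustration only.
interface Message {
  text: string;
}

interface SimpleAction {
  name: string;
  validate: (message: Message) => Promise<boolean>;
  handler: (message: Message) => Promise<string>;
}

// Runs the handler only when validation passes; returns null otherwise.
async function runAction(action: SimpleAction, message: Message): Promise<string | null> {
  const canRun = await action.validate(message);
  if (!canRun) {
    return null;
  }
  return action.handler(message);
}

const echoAction: SimpleAction = {
  name: "ECHO",
  // Prerequisite: the message must contain non-whitespace text.
  validate: async (message) => message.text.trim().length > 0,
  handler: async (message) => `echo: ${message.text}`,
};

runAction(echoAction, { text: "hello" }).then((result) => console.log(result)); // echo: hello
runAction(echoAction, { text: "   " }).then((result) => console.log(result)); // null
```

Keeping validate cheap and side-effect free means it can be called on every message without cost, while the handler does the expensive work only when it is actually needed.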
Examples of Custom Actions

Here are some examples to illustrate the possibilities:

Basic Action Template:
```
const customAction: Action = {
    name: "CUSTOM_ACTION",
    similes: ["SIMILAR_ACTION"],
    description: "Action purpose",
    validate: async (runtime: IAgentRuntime, message: Memory) => {
        // Validation logic
        return true;
    },
    handler: async (runtime: IAgentRuntime, message: Memory) => {
        // Implementation
    },
    examples: [],
};
```
Advanced Action Example: Processing Documents:
```
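// Assumes Action, IDocumentService, ServiceType, and generateId are imported or defined elsewhere in your project
const complexAction: Action = {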
    name: "PROCESS_DOCUMENT",
    similes: ["READ_DOCUMENT", "ANALYZE_DOCUMENT"],
    description: "Process and analyze uploaded documents",
    validate: async (runtime, message) => {
        // Only accept messages whose first attachment has a supported type
        const firstAttachment = message.content.attachments?.[0];
        const supportedTypes = ["pdf", "txt", "doc"];
        return (
            firstAttachment !== undefined &&
            supportedTypes.includes(firstAttachment.type)
        );
    },
    handler: async (runtime, message, state) => {
        const attachment = message.content.attachments[0];

        // Process document
        const content = await runtime
            .getService<IDocumentService>(ServiceType.DOCUMENT)
            .processDocument(attachment);

        // Store in memory
        await runtime.documentsManager.createMemory({
            id: generateId(),
            content: { text: content },
            userId: message.userId,
            roomId: message.roomId,
        });

        return true;
    },
};
```
Best Practices for Custom Actions
- Single Responsibility: Ensure each action has a single, well-defined purpose.
- Robust Validation: Always validate inputs and preconditions before executing an action.
- Clear Error Handling: Implement error catching and provide informative error messages.
- Detailed Examples: Include examples in the examples field to show the action’s usage.
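As one way to apply the error-handling advice, a handler's core logic can be wrapped so that failures come back as informative results instead of uncaught exceptions. `HandlerResult` and `withErrorHandling` are hypothetical names for this sketch and are not part of Eliza.

```typescript
type HandlerResult =
  | { ok: true; value: string }
  | { ok: false; error: string };

// Runs the given logic and converts any thrown error into a
// descriptive failure result that names the action.
async function withErrorHandling(
  actionName: string,
  run: () => Promise<string>,
): Promise<HandlerResult> {
  try {
    return { ok: true, value: await run() };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return { ok: false, error: `${actionName} failed: ${reason}` };
  }
}

withErrorHandling("FETCH_PRICE", async () => {
  throw new Error("upstream API timed out");
}).then((result) => console.log(result));
// { ok: false, error: 'FETCH_PRICE failed: upstream API timed out' }
```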
Testing Your Actions

Eliza provides a built-in testing framework to validate your actions:
```
// Assumes runtime, user, roomId, and handleMessage are provided by your test setup
test("Validate action behavior", async () => {
    const message: Memory = {
        userId: user.id,
        content: { text: "Test message" },
        roomId,
    };

    const response = await handleMessage(runtime, message);
    // Verify response
});
```
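Validation logic can also be exercised on its own, without a runtime. The sketch below pulls an attachment check like the one in the document-processing example into a plain function and runs it against a few cases. The local types and the `hasSupportedAttachment` name are assumptions made for illustration.

```typescript
// Simplified local types; the real Memory content has more fields.
interface Attachment {
  type: string;
}

interface MessageContent {
  text: string;
  attachments?: Attachment[];
}

const SUPPORTED_TYPES = ["pdf", "txt", "doc"];

// True when the first attachment exists and has a supported type.
function hasSupportedAttachment(content: MessageContent): boolean {
  const first = content.attachments?.[0];
  return first !== undefined && SUPPORTED_TYPES.includes(first.type);
}

// Each case pairs an input with the result the check should produce.
const validationCases: Array<[MessageContent, boolean]> = [
  [{ text: "see attached", attachments: [{ type: "pdf" }] }, true],
  [{ text: "a photo", attachments: [{ type: "png" }] }, false],
  [{ text: "empty list", attachments: [] }, false],
  [{ text: "no attachments" }, false],
];

for (const [content, expected] of validationCases) {
  if (hasSupportedAttachment(content) !== expected) {
    throw new Error(`Unexpected validation result for "${content.text}"`);
  }
}
console.log("all validation cases passed");
```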
Custom actions and skills are crucial for unlocking the full potential of Eliza. By creating your own actions, you can tailor Eliza to specific use cases, whether it’s automating complex workflows, integrating with external services, or creating unique, engaging interactions. The flexibility and power provided by this system allow you to push the boundaries of what’s possible with autonomous AI agents.

January 22, 2025
