Category: Agents

  • OpenManus: FULLY FREE Manus Alternative

    OpenManus: FULLY FREE Manus Alternative

    Manus has been billed as the first general-purpose AI agent, but access is gated behind an invite code and, eventually, a paid plan. Pricing hasn’t been announced yet, but it clearly won’t be free. So what do we do now? Well, let’s turn to our saviours, the open-source community.

    Well, guess what? OpenManus is like the answer to your prayers! It’s basically a free version of Manus that you can just download and use right now. It does all that cool AI agent stuff like figuring things out on its own, working with other programs, and automating tasks. And the best part? You don’t have to wait in line or pay anything, and you can see exactly how it’s built. Pretty awesome, huh?

    OpenManus is an open-source project designed to allow users to create and utilize their own AI agents without requiring an invite code, unlike the proprietary Manus platform. It’s developed by a team including members from MetaGPT and aims to democratize access to AI agent creation.

    Key Features

    • No Invite Code Required: Unlike Manus, OpenManus eliminates the need for an invite code, making it accessible to everyone.
    • Open-Source Implementation: The project is fully open-source, encouraging community contributions and improvements.
    • Integration with OpenManus-RL: Collaborates with researchers from UIUC on reinforcement learning tuning methods for LLM agents.
    • Active Development: The team is actively working on enhancements including improved planning capabilities, standardized evaluation metrics, model adaptation, containerized deployment, and expanded example libraries.

    Technical Setup and Run Steps

    Installation

    Method 1: Using Conda

    Create and activate a new conda environment:

    conda create -n open_manus python=3.12
    conda activate open_manus

    Clone the repository:

    git clone https://github.com/mannaandpoem/OpenManus.git
    cd OpenManus

    Install dependencies:

    pip install -r requirements.txt

    Method 2: Using uv (Recommended)

    Install uv:

    curl -LsSf https://astral.sh/uv/install.sh | sh

    Clone the repository:

    git clone https://github.com/mannaandpoem/OpenManus.git
    cd OpenManus

    Create and activate a virtual environment:

    uv venv
    source .venv/bin/activate  # On Unix/macOS
    # Or on Windows:
    # .venv\Scripts\activate

    Install dependencies:

    uv pip install -r requirements.txt

    Configuration:

    Create a config.toml file in the config directory by copying the example:

    cp config/config.example.toml config/config.toml

    Edit config/config.toml to add your API keys and customize settings:

    # Global LLM configuration
    [llm]
    model = "gpt-4o"
    base_url = "https://api.openai.com/v1"
    api_key = "sk-..."  # Replace with your actual API key
    max_tokens = 4096
    temperature = 0.0
    
    # Optional configuration for specific LLM models
    [llm.vision]
    model = "gpt-4o"
    base_url = "https://api.openai.com/v1"
    api_key = "sk-..."  # Replace with your actual API key

    Running OpenManus

    After completing the installation and configuration steps, you can run OpenManus with a single command. The specific command may vary depending on your setup, but generally, you can execute:

    python main.py

    Then input your idea via the terminal when prompted.

    For the unstable version, you might need to use a different command as specified in the project documentation.

  • Never Start From Scratch: Persistent Browser Sessions for AI Agents

    Never Start From Scratch: Persistent Browser Sessions for AI Agents

    Building AI agents that interact with the web presents unique challenges. One of the most frustrating is the lack of a persistent browser session for AI. Imagine an AI assistant that has to log in to a website every time it needs to perform a task. This repetitive process is not only time-consuming but also disrupts the flow of information and can lead to errors. Fortunately, there’s a solution: maintaining persistent browser sessions for your AI agents.

    The Problem with Stateless AI Web Interactions

    Without a persistent browser session, each interaction with a website is treated as a brand new visit. This means your AI agent loses all previous context, including login credentials, cookies, and browsing history. This “stateless” approach forces the agent to start from scratch each time, leading to:

    • Repetitive Logins: Constant login prompts hinder automation and slow down processes.
    • Loss of Context: Crucial information from previous interactions is lost, impacting the agent’s ability to perform complex tasks.
    • Inefficient Resource Use: Repeatedly loading websites and resources consumes unnecessary time and computing power.

    The Power of Persistent Browser Sessions for AI

    A persistent browser session for AI allows your agent to maintain a continuous connection with a website, preserving its state across multiple interactions (see the sketch after this list). This means:

    • Eliminate Repetitive Logins: Your AI agent stays logged in, ready to perform tasks without interruption.
    • Preserve Context: Retain crucial information like cookies, browsing history, and form data for seamless task execution.
    • Streamline Workflow: Enable complex, multi-step automation without constantly restarting the process. This is crucial for tasks like web scraping, data extraction, and automated testing.
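
    To make the difference concrete, here is a minimal Python sketch of a persistent browser profile using Playwright (Playwright is used here purely for illustration and is an assumption, not something Browser-Use requires; the URL and profile directory are placeholders):

    # Minimal sketch: throwaway vs. persistent browser state with Playwright.
    # Assumes `pip install playwright` and `playwright install chromium`.
    from playwright.sync_api import sync_playwright
    
    with sync_playwright() as p:
        # Throwaway browser: cookies and login state vanish when it closes.
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto("https://example.com")
        browser.close()
    
        # Persistent context: state is written to user_data_dir and survives
        # restarts, so an agent can stay logged in across tasks.
        context = p.chromium.launch_persistent_context(
            user_data_dir="./agent-profile",
            headless=True,
        )
        page = context.new_page()
        page.goto("https://example.com")
        context.close()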

    How Browser-Use Enables Persistent Sessions

    Browser-Use offers a powerful solution for managing a persistent browser context for AI. By leveraging its features, you can easily create and maintain browser sessions, allowing your AI agents to operate with maximum efficiency. This functionality is especially beneficial for long-running AI browser sessions that require continuous interaction with web applications.

    Installation Guide

    Prerequisites

    • Python 3.11 or higher
    • Git (for cloning the repository)

    Option 1: Local Installation

    Read the quickstart guide or follow the steps below to get started.

    Step 1: Clone the Repository

    git clone https://github.com/browser-use/web-ui.git
    cd web-ui

    Step 2: Set Up Python Environment

    We recommend using uv for managing the Python environment.

    Using uv (recommended):

    uv venv --python 3.11

    Activate the virtual environment:

    • Windows (Command Prompt):
    .venv\Scripts\activate
    • Windows (PowerShell):
    .\.venv\Scripts\Activate.ps1
    • macOS/Linux:
    source .venv/bin/activate

    Step 3: Install Dependencies

    Install Python packages:

    uv pip install -r requirements.txt

    Install Playwright:

    playwright install

    Step 4: Configure Environment

    1. Create a copy of the example environment file:
    • Windows (Command Prompt):
    copy .env.example .env
    • macOS/Linux/Windows (PowerShell):
    cp .env.example .env
    2. Open .env in your preferred text editor and add your API keys and other settings

    Option 2: Docker Installation

    Prerequisites

    • Docker and Docker Compose installed

    Installation Steps

    1. Clone the repository:
    git clone https://github.com/browser-use/web-ui.git
    cd web-ui
    2. Create and configure environment file:
    • Windows (Command Prompt):
    copy .env.example .env
    • macOS/Linux/Windows (PowerShell):
    cp .env.example .env

    Edit .env with your preferred text editor and add your API keys

    3. Run with Docker:
    # Build and start the container with default settings (browser closes after AI tasks)
    docker compose up --build
    # Or run with persistent browser (browser stays open between AI tasks)
    CHROME_PERSISTENT_SESSION=true docker compose up --build
    4. Access the Application:
    • Web Interface: Open http://localhost:7788 in your browser
    • VNC Viewer (for watching browser interactions): Open http://localhost:6080/vnc.html
      • Default VNC password: “youvncpassword”
      • Can be changed by setting VNC_PASSWORD in your .env file

    Docker Setup

    Environment Variables:

    All configuration is done through the .env file

    Available environment variables:

    # LLM API Keys
    OPENAI_API_KEY=your_key_here
    ANTHROPIC_API_KEY=your_key_here
    GOOGLE_API_KEY=your_key_here
    
    # Browser Settings
    CHROME_PERSISTENT_SESSION=true   # Set to true to keep browser open between AI tasks
    RESOLUTION=1920x1080x24         # Custom resolution format: WIDTHxHEIGHTxDEPTH
    RESOLUTION_WIDTH=1920           # Custom width in pixels
    RESOLUTION_HEIGHT=1080          # Custom height in pixels
    
    # VNC Settings
    VNC_PASSWORD=your_vnc_password  # Optional, defaults to "vncpassword"

    Platform Support:

    Supports both AMD64 and ARM64 architectures

    For ARM64 systems (e.g., Apple Silicon Macs), the container will automatically use the appropriate image

    Browser Persistence Modes:

    Default Mode (CHROME_PERSISTENT_SESSION=false):

    • Browser opens and closes with each AI task
    • Clean state for each interaction
    • Lower resource usage

    Persistent Mode (CHROME_PERSISTENT_SESSION=true):

    • Browser stays open between AI tasks
    • Maintains history and state
    • Allows viewing previous AI interactions
    • Set in the .env file or via environment variable when starting the container

    Viewing Browser Interactions:

    • Access the noVNC viewer at http://localhost:6080/vnc.html
    • Enter the VNC password (default: “vncpassword”, or whatever you set in VNC_PASSWORD)
    • Direct VNC access is available on port 5900 (mapped to container port 5901)
    • You can now watch all browser interactions in real time

    Persistent browser sessions are essential for building efficient and robust AI agents that interact with the web. By eliminating repetitive logins, preserving context, and streamlining workflows, you can unlock the true potential of AI web automation. Explore Browser-Use and discover how its persistent session management can revolutionize your AI development process. Start building smarter, more efficient AI agents today!

  • Build Your Own and Free AI Health Assistant, Personalized Healthcare

    Build Your Own and Free AI Health Assistant, Personalized Healthcare

    Imagine having a 24/7 health companion that analyzes your medical history, tracks real-time vitals, and offers tailored advice—all while keeping your data private. This is the reality of AI health assistants, open-source tools merging artificial intelligence with healthcare to empower individuals and professionals alike. Let’s dive into how these systems work, their transformative benefits, and how you can build one using platforms like OpenHealthForAll.

    What Is an AI Health Assistant?

    An AI health assistant is a digital tool that leverages machine learning, natural language processing (NLP), and data analytics to provide personalized health insights. For example:

    • OpenHealth consolidates blood tests, wearable data, and family history into structured formats, enabling GPT-powered conversations about your health.
    • Aiden, another assistant, uses WhatsApp to deliver habit-building prompts based on anonymized data from Apple Health or Fitbit.

    These systems prioritize privacy, often running locally or using encryption to protect sensitive information.


    Why AI Health Assistants Matter: 5 Key Benefits

    1. Centralized Health Management
      Integrate wearables, lab reports, and EHRs into one platform. OpenHealth, for instance, parses blood tests and symptoms into actionable insights using LLMs like Claude or Gemini.
    2. Real-Time Anomaly Detection
      Projects like Kavya Prabahar’s virtual assistant use RNNs to flag abnormal heart rates or predict fractures from X-rays.
    3. Privacy-First Design
      Tools like Aiden anonymize data via Evervault and store records on blockchain (e.g., NearestDoctor’s smart contracts) to ensure compliance with regulations like HIPAA.
    4. Empathetic Patient Interaction
      Assistants like OpenHealth use emotion-aware AI to provide compassionate guidance, reducing anxiety for users managing chronic conditions.
    5. Cost-Effective Scalability
      Open-source frameworks like Google’s Open Health Stack (OHS) help developers build offline-capable solutions for low-resource regions, accelerating global healthcare access.

    Challenges and Ethical Considerations

    While promising, AI health assistants face hurdles:

    • Data Bias: Models trained on limited datasets may misdiagnose underrepresented groups.
    • Interoperability: Bridging EHR systems (e.g., HL7 FHIR) with AI requires standardization efforts like OHS.
    • Regulatory Compliance: Solutions must balance innovation with safety, as highlighted in Nature’s call for mandatory feedback loops in AI health tech.

    Build Your Own AI Health Assistant: A Developer’s Guide

    Step 1: Choose Your Stack

    • Data Parsing: Use OpenHealth’s Python-based parser (migrating to TypeScript soon) to structure inputs from wearables or lab reports.
    • AI Models: Integrate LLaMA or GPT-4 via APIs, or run Ollama locally for privacy (a minimal local-call sketch follows below).
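
    If you go the local route, a minimal sketch of querying a locally running Ollama server over its REST API might look like this (it assumes Ollama is installed and a model such as llama3 has already been pulled; the model name and prompt are illustrative):

    # Minimal sketch: query a local Ollama server instead of a hosted API.
    # Assumes Ollama is running on its default port and `ollama pull llama3` was run.
    import requests
    
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3",  # illustrative model name
            "prompt": "Summarize these blood test results: ...",
            "stream": False,
        },
        timeout=120,
    )
    resp.raise_for_status()
    print(resp.json()["response"])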

    Step 2: Prioritize Security

    • Encrypt user data with Supabase or Evervault.
    • Implement blockchain for audit trails, as seen in NearestDoctor’s medical records system.

    Step 3: Start the setup

    Clone the Repository:

    git clone https://github.com/OpenHealthForAll/open-health.git
    cd open-health

    Setup and Run:

    # Copy environment file
    cp .env.example .env
    
    # Add API keys to .env file:
    # UPSTAGE_API_KEY - For parsing (You can get $10 credit without card registration by signing up at https://www.upstage.ai)
    # OPENAI_API_KEY - For enhanced parsing capabilities
    
    # Start the application using Docker Compose
    docker compose --env-file .env up

    For existing users, use:

    docker compose --env-file .env up --build
    Access OpenHealth: Open your browser and navigate to http://localhost:3000 to begin using OpenHealth.

    The Future of AI Health Assistants

    1. Decentralized AI Marketplaces: Platforms like Ocean Protocol could let users monetize health models securely.
    2. AI-Powered Diagnostics: Google’s Health AI Developer Foundations aim to simplify building diagnostic tools for conditions like diabetes.
    3. Global Accessibility: Initiatives like OHS workshops in Kenya and India are democratizing AI health tech.

    Your Next Step

    • Contribute to OpenHealth’s GitHub repo to enhance its multilingual support.
  • Deploy an uncensored DeepSeek R1 model on Google Cloud Run

    Deploy an uncensored DeepSeek R1 model on Google Cloud Run

    DeepSeek R1 Distill: Complete Tutorial for Deployment & Fine-Tuning

    Are you eager to explore the capabilities of the DeepSeek R1 Distill model? This guide provides a comprehensive, step-by-step approach to deploying the uncensored DeepSeek R1 Distill model to Google Cloud Run with GPU support, and also walks you through a practical fine-tuning process. The tutorial is broken down into the following sections:

    • Environment Setup
    • FastAPI Inference Server
    • Docker Configuration
    • Google Cloud Run Deployment
    • Fine-Tuning Pipeline

    Let’s dive in and get started.

    1. Environment Setup

    Before deploying and fine-tuning, make sure you have the required tools installed and configured.

    1.1 Install Required Tools

    • Python 3.9+
    • pip: For Python package installation
    • Docker: For containerization
    • Google Cloud CLI: For deployment

    Install Google Cloud CLI (Ubuntu/Debian):
    Follow the official Google Cloud CLI installation guide to install gcloud.

    1.2 Authenticate with Google Cloud

    Run the following commands to initialize and authenticate with Google Cloud:

    gcloud init
    gcloud auth application-default login

    Ensure you have an active Google Cloud project with Cloud Run, Compute Engine, and Container Registry/Artifact Registry enabled.

    2. FastAPI Inference Server

    We’ll create a minimal FastAPI application that serves two main endpoints:

    • /v1/inference: For model inference.
    • /v1/finetune: For uploading fine-tuning data (JSONL).

    Create a file named main.py with the following content:

    # main.py
    from fastapi import FastAPI, File, UploadFile
    from fastapi.responses import JSONResponse
    from pydantic import BaseModel
    import json
    
    import litellm  # Minimalistic LLM library
    
    app = FastAPI()
    
    class InferenceRequest(BaseModel):
        prompt: str
        max_tokens: int = 512
    
    @app.post("/v1/inference")
    async def inference(request: InferenceRequest):
        """
        Inference endpoint using deepseek-r1-distill-7b (uncensored).
        """
        response = litellm.completion(
            model="deepseek/deepseek-r1-distill-7b",
            messages=[{"role": "user", "content": request.prompt}],
            max_tokens=request.max_tokens
        )
        return JSONResponse(content=response)
    
    @app.post("/v1/finetune")
    async def finetune(file: UploadFile = File(...)):
        """
        Fine-tune endpoint that accepts a JSONL file.
        """
        if not file.filename.endswith('.jsonl'):
            return JSONResponse(
                status_code=400,
                content={"error": "Only .jsonl files are accepted for fine-tuning"}
            )
    
        # Read lines from uploaded file
        data = [json.loads(line) for line in file.file]
    
        # Perform or schedule a fine-tuning job here (simplified placeholder)
        # You can integrate with your training pipeline below.
        
        return JSONResponse(content={"status": "Fine-tuning request received", "samples": len(data)})

    3. Docker Configuration

    To containerize the application, create a requirements.txt file:

    fastapi
    uvicorn
    litellm
    pydantic
    transformers
    datasets
    accelerate
    trl
    torch

    And create a Dockerfile:

    # Dockerfile
    FROM nvidia/cuda:12.0.0-base-ubuntu22.04
    
    # Install basic dependencies
    RUN apt-get update && apt-get install -y python3 python3-pip
    
    # Create app directory
    WORKDIR /app
    
    # Copy requirements and install
    COPY requirements.txt .
    RUN pip3 install --upgrade pip
    RUN pip3 install --no-cache-dir -r requirements.txt
    
    # Copy code
    COPY . .
    
    # Expose port 8080 for Cloud Run
    EXPOSE 8080
    
    # Start server
    CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080"]

    4. Deploy to Google Cloud Run with GPU

    4.1 Enable GPU on Cloud Run

    Make sure your Google Cloud project has a GPU quota available, such as nvidia-l4.

    4.2 Build and Deploy

    Run this command from your project directory to deploy the application to Cloud Run:

    gcloud run deploy deepseek-uncensored \
        --source . \
        --region us-central1 \
        --platform managed \
        --gpu 1 \
        --gpu-type nvidia-l4 \
        --memory 16Gi \
        --cpu 4 \
        --allow-unauthenticated

    This command builds the Docker image, deploys it to Cloud Run with one nvidia-l4 GPU, allocates 16 GiB memory and 4 CPU cores, and exposes the service publicly (no authentication).

    5. Fine-Tuning Pipeline

    This section will guide you through a basic four-stage fine-tuning pipeline similar to DeepSeek R1’s training approach.

    5.1 Directory Structure Example

    .
    ├── main.py
    ├── finetune_pipeline.py
    ├── cold_start_data.jsonl
    ├── reasoning_data.jsonl
    ├── data_collection.jsonl
    ├── final_data.jsonl
    ├── requirements.txt
    └── Dockerfile

    Replace the .jsonl files with your actual training data.
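
    Each line in these files should be a JSON object with “prompt” and “completion” keys, since that is the format the tokenization and reward steps below expect. The two lines here are illustrative examples only:

    {"prompt": "What is the capital of France?", "completion": "The capital of France is Paris."}
    {"prompt": "Explain photosynthesis in one sentence.", "completion": "Photosynthesis is how plants convert light into chemical energy."}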

    5.2 Fine-Tuning Code: finetune_pipeline.py

    Create a finetune_pipeline.py file with the following code:

    # finetune_pipeline.py
    
    import json
    import os
    import torch
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              Trainer, TrainingArguments)
    from datasets import load_dataset
    
    from trl import PPOTrainer, PPOConfig, AutoModelForCausalLMWithValueHead
    
    
    # 1. Cold Start Phase
    def cold_start_finetune(
        base_model="deepseek-ai/deepseek-r1-distill-7b",
        train_file="cold_start_data.jsonl",
        output_dir="cold_start_finetuned_model"
    ):
        # Load model and tokenizer
        model = AutoModelForCausalLM.from_pretrained(base_model)
        tokenizer = AutoTokenizer.from_pretrained(base_model)
    
        # Load dataset
        dataset = load_dataset("json", data_files=train_file, split="train")
    
        # Simple tokenization function
        def tokenize_function(example):
            return tokenizer(
                example["prompt"] + "\n" + example["completion"],
                truncation=True,
                max_length=512
            )
    
        dataset = dataset.map(tokenize_function, batched=True)
        dataset = dataset.shuffle()
    
        # Define training arguments
        training_args = TrainingArguments(
            output_dir=output_dir,
            num_train_epochs=1,
            per_device_train_batch_size=2,
            gradient_accumulation_steps=4,
            save_steps=50,
            logging_steps=50,
            learning_rate=5e-5
        )
    
        # Trainer
        trainer = Trainer(
            model=model,
            args=training_args,
            train_dataset=dataset
        )
    
        trainer.train()
        trainer.save_model(output_dir)
        tokenizer.save_pretrained(output_dir)
        return output_dir
    
    
    # 2. Reasoning RL Training
    def reasoning_rl_training(
        cold_start_model_dir="cold_start_finetuned_model",
        train_file="reasoning_data.jsonl",
        output_dir="reasoning_rl_model"
    ):
        # Config for PPO
        config = PPOConfig(
            batch_size=16,
            learning_rate=1e-5,
            log_with=None,  # or 'wandb'
            mini_batch_size=4
        )
    
        # Load model and tokenizer
        model = AutoModelForCausalLMWithValueHead.from_pretrained(cold_start_model_dir)
        tokenizer = AutoTokenizer.from_pretrained(cold_start_model_dir)
    
        # Create a PPO trainer
        ppo_trainer = PPOTrainer(
            config,
            model,
            tokenizer=tokenizer,
        )
    
        # Load dataset
        dataset = load_dataset("json", data_files=train_file, split="train")
    
        # Simple RL loop (pseudo-coded for brevity)
        for sample in dataset:
            prompt = sample["prompt"]
            desired_answer = sample["completion"]  # For reward calculation
    
            # Generate response
            query_tensors = tokenizer.encode(prompt, return_tensors="pt")
            response_tensors = ppo_trainer.generate(query_tensors, max_new_tokens=50)
            response_text = tokenizer.decode(response_tensors[0], skip_special_tokens=True)
    
            # Calculate reward (simplistic: measure overlap or correctness)
            reward = 1.0 if desired_answer in response_text else -1.0
    
            # Run a PPO step
            ppo_trainer.step([query_tensors[0]], [response_tensors[0]], [reward])
    
        model.save_pretrained(output_dir)
        tokenizer.save_pretrained(output_dir)
        return output_dir
    
    
    # 3. Data Collection
    def collect_data(
        rl_model_dir="reasoning_rl_model",
        num_samples=1000,
        output_file="data_collection.jsonl"
    ):
        """
        Example data collection: generate completions from the RL model.
        This is a simple version that just uses random prompts or a given file of prompts.
        """
        tokenizer = AutoTokenizer.from_pretrained(rl_model_dir)
        model = AutoModelForCausalLM.from_pretrained(rl_model_dir)
    
        # Suppose we have some random prompts:
        prompts = [
            "Explain quantum entanglement",
            "Summarize the plot of 1984 by George Orwell",
            # ... add or load from a prompt file ...
        ]
    
        collected = []
        for i in range(num_samples):
            prompt = prompts[i % len(prompts)]
            inputs = tokenizer(prompt, return_tensors="pt")
            outputs = model.generate(**inputs, max_new_tokens=50)
            completion = tokenizer.decode(outputs[0], skip_special_tokens=True)
            collected.append({"prompt": prompt, "completion": completion})
    
        # Save to JSONL
        with open(output_file, "w") as f:
            for item in collected:
                f.write(f"{item}\n")
    
        return output_file
    
    
    # 4. Final RL Phase
    def final_rl_phase(
        rl_model_dir="reasoning_rl_model",
        final_data="final_data.jsonl",
        output_dir="final_rl_model"
    ):
        """
        Another RL phase using a new dataset or adding human feedback.
        This is a simplified approach similar to the reasoning RL training step.
        """
        config = PPOConfig(
            batch_size=16,
            learning_rate=1e-5,
            log_with=None,
            mini_batch_size=4
        )
    
        model = AutoModelForCausalLMWithValueHead.from_pretrained(rl_model_dir)
        tokenizer = AutoTokenizer.from_pretrained(rl_model_dir)
        ppo_trainer = PPOTrainer(config, model, tokenizer=tokenizer)
    
        dataset = load_dataset("json", data_files=final_data, split="train")
    
        for sample in dataset:
            prompt = sample["prompt"]
            desired_answer = sample["completion"]
            query_tensors = tokenizer.encode(prompt, return_tensors="pt")
            response_tensors = ppo_trainer.generate(query_tensors, max_new_tokens=50)
            response_text = tokenizer.decode(response_tensors[0], skip_special_tokens=True)
    
            reward = 1.0 if desired_answer in response_text else 0.0
            ppo_trainer.step([query_tensors[0]], [response_tensors[0]], [reward])
    
        model.save_pretrained(output_dir)
        tokenizer.save_pretrained(output_dir)
        return output_dir
    
    
    # END-TO-END PIPELINE EXAMPLE
    if __name__ == "__main__":
        # 1) Cold Start
        cold_start_out = cold_start_finetune(
            base_model="deepseek-ai/deepseek-r1-distill-7b",
            train_file="cold_start_data.jsonl",
            output_dir="cold_start_finetuned_model"
        )
    
        # 2) Reasoning RL
        reasoning_rl_out = reasoning_rl_training(
            cold_start_model_dir=cold_start_out,
            train_file="reasoning_data.jsonl",
            output_dir="reasoning_rl_model"
        )
    
        # 3) Data Collection
        data_collection_out = collect_data(
            rl_model_dir=reasoning_rl_out,
            num_samples=100,
            output_file="data_collection.jsonl"
        )
    
        # 4) Final RL Phase
        final_rl_out = final_rl_phase(
            rl_model_dir=reasoning_rl_out,
            final_data="final_data.jsonl",
            output_dir="final_rl_model"
        )
    
        print("All done! Final model stored in:", final_rl_out)

    Usage Overview

    1. Upload Your Data:
      • Prepare cold_start_data.jsonl, reasoning_data.jsonl, final_data.jsonl, etc.
      • Each line should be a JSON object with “prompt” and “completion” keys.
    2. Run the Pipeline Locally:
    python3 finetune_pipeline.py

    This creates directories like cold_start_finetuned_model, reasoning_rl_model, and final_rl_model.

    3. Deploy:
      • Build and push via gcloud run deploy.
    4. Inference:
      • After deployment, send a POST request to your Cloud Run service:
    import requests
    
    url = "https://<YOUR-CLOUD-RUN-URL>/v1/inference"
    data = {"prompt": "Tell me about quantum physics", "max_tokens": 100}
    response = requests.post(url, json=data)
    print(response.json())

    Fine-Tuning via Endpoint:

    • Upload new data for fine-tuning:
    import requests
    
    url = "https://<YOUR-CLOUD-RUN-URL>/v1/finetune"
    with open("new_training_data.jsonl", "rb") as f:
        r = requests.post(url, files={"file": ("new_training_data.jsonl", f)})
    print(r.json())

    This tutorial has provided an end-to-end pipeline for deploying and fine-tuning the DeepSeek R1 Distill model. You’ve learned how to:

    • Deploy a FastAPI server with Docker and GPU support on Google Cloud Run.
    • Fine-tune the model in four stages: Cold Start, Reasoning RL, Data Collection, and Final RL.
    • Use TRL (PPO) for basic RL-based training loops.

    Disclaimer: Deploying uncensored models has ethical and legal implications. Make sure to comply with relevant laws, policies, and usage guidelines.

    This comprehensive guide should equip you with the knowledge to start deploying and fine-tuning the DeepSeek R1 Distill model.

  • How to add custom actions and skills in Eliza AI?

    How to add custom actions and skills in Eliza AI?

    Eliza is a versatile multi-agent simulation framework, built in TypeScript, that allows you to create sophisticated, autonomous AI agents. These agents can interact across multiple platforms while maintaining consistent personalities and knowledge. A key feature that enables this flexibility is the ability to define custom actions and skills. This article will delve into how you can leverage this feature to make your Eliza agents even more powerful.

    Understanding Actions in Eliza

    Actions are the fundamental building blocks that dictate how Eliza agents respond to and interact with messages. They allow agents to go beyond simple text replies, enabling them to:

    • Interact with external systems.
    • Modify their behavior dynamically.
    • Perform complex tasks.

    Each action in Eliza consists of several key components:

    • name: A unique identifier for the action.
    • similes: Alternative names or triggers that can invoke the action.
    • description: A detailed explanation of what the action does.
    • validate: A function that checks if the action is appropriate to execute in the current context.
    • handler: The implementation of the action’s behavior – the core logic that the action performs.
    • examples: Demonstrations of proper usage patterns.
    • suppressInitialMessage: When set to true, it prevents the initial message from being sent before processing the action.

    Built-in Actions

    Eliza includes several built-in actions to manage basic conversation flow and external integrations:

    • CONTINUE: Keeps a conversation going when more context is required.
    • IGNORE: Gracefully disengages from a conversation.
    • NONE: Default action for standard conversational replies.
    • TAKE_ORDER: Records and processes user purchase orders (primarily for Solana integration).

    Creating Custom Actions: Expanding Eliza’s Capabilities

    The power of Eliza truly shines when you start implementing custom actions and skills. Here’s how to create them:

    1. Create a custom_actions directory: This is where you’ll store your action files.
    2. Add your action files: Each action is defined in its own TypeScript file, implementing the Action interface.
    3. Configure in elizaConfig.yaml: Point to your custom actions by adding entries under the actions key.
    actions:
        - name: myCustomAction
          path: ./custom_actions/myAction.ts

    Action Configuration Structure

    Here’s an example of how to structure your action file:

    import { Action, IAgentRuntime, Memory } from "@elizaos/core";
    
    export const myAction: Action = {
        name: "MY_ACTION",
        similes: ["SIMILAR_ACTION", "ALTERNATE_NAME"],
        validate: async (runtime: IAgentRuntime, message: Memory) => {
            // Validation logic here
            return true;
        },
        description: "A detailed description of your action.",
        handler: async (runtime: IAgentRuntime, message: Memory) => {
            // The actual logic of your action
            return true;
        },
    };

    Implementing a Custom Action

    • Validation: Before an action executes, its validate function is called to determine whether it can proceed; it checks that all prerequisites for the action are met.
    • Handler: The handler function contains the core logic of the action. It interacts with the agent runtime and memory and performs the desired tasks, such as calling external APIs, processing data, or generating output.

    Examples of Custom Actions

    Here are some examples to illustrate the possibilities:

    Basic Action Template:

    const customAction: Action = {
        name: "CUSTOM_ACTION",
        similes: ["SIMILAR_ACTION"],
        description: "Action purpose",
        validate: async (runtime: IAgentRuntime, message: Memory) => {
            // Validation logic
            return true;
        },
        handler: async (runtime: IAgentRuntime, message: Memory) => {
            // Implementation
        },
        examples: [],
    };

    Advanced Action Example: Processing Documents:

    const complexAction: Action = {
        name: "PROCESS_DOCUMENT",
        similes: ["READ_DOCUMENT", "ANALYZE_DOCUMENT"],
        description: "Process and analyze uploaded documents",
        validate: async (runtime, message) => {
            const hasAttachment = message.content.attachments?.length > 0;
            const supportedTypes = ["pdf", "txt", "doc"];
            return (
                hasAttachment &&
                supportedTypes.includes(message.content.attachments[0].type)
            );
        },
        handler: async (runtime, message, state) => {
            const attachment = message.content.attachments[0];
    
            // Process document
            const content = await runtime
                .getService<IDocumentService>(ServiceType.DOCUMENT)
                .processDocument(attachment);
    
            // Store in memory
            await runtime.documentsManager.createMemory({
                id: generateId(),
                content: { text: content },
                userId: message.userId,
                roomId: message.roomId,
            });
    
            return true;
        },
    };

    Best Practices for Custom Actions

    • Single Responsibility: Ensure each action has a single, well-defined purpose.
    • Robust Validation: Always validate inputs and preconditions before executing an action.
    • Clear Error Handling: Implement error catching and provide informative error messages.
    • Detailed Examples: Include examples in the examples field to show the action’s usage.

    Testing Your Actions

    Eliza provides a built-in testing framework to validate your actions:

    test("Validate action behavior", async () => {
        const message: Memory = {
            userId: user.id,
            content: { text: "Test message" },
            roomId,
        };
    
        const response = await handleMessage(runtime, message);
        // Verify response
    });

    Custom actions and skills are crucial for unlocking the full potential of Eliza. By creating your own actions, you can tailor Eliza to specific use cases, whether it’s automating complex workflows, integrating with external services, or creating unique, engaging interactions. The flexibility and power provided by this system allow you to push the boundaries of what’s possible with autonomous AI agents.


  • AI Agents by Google: Revolutionizing AI with Reasoning and Tools

    AI Agents by Google: Revolutionizing AI with Reasoning and Tools

    Artificial Intelligence is rapidly changing, and AI Agents by Google are at the forefront. These aren’t typical AI models. Instead, they are complex systems. They can reason, make logical decisions, and interact with the world using tools. This article explores what makes them special. Furthermore, it will examine how they are changing AI applications.

    Understanding AI Agents


    Essentially, AI Agents by Google are applications that aim to achieve goals. They do this by observing their environment and using the tools available to them. Unlike basic AI, agents are autonomous. They act independently. Moreover, they proactively make decisions. This helps them meet objectives, even without direct instructions. This is possible through their cognitive architecture, which includes three key parts:

    • The Model: This is the core language model. It is the central decision-maker. It uses reasoning frameworks like ReAct. Also, it uses Chain-of-Thought and Tree-of-Thoughts.
    • The Tools: These are crucial for external interaction. They allow the agent to connect to real-time data and services. For example, APIs can be used. They bridge the gap between internal knowledge and outside resources.
    • The Orchestration Layer: This layer manages the agent’s process. It determines how it takes in data. Then, it reasons internally. Finally, it informs the next action or decision in a continuous cycle. (A rough sketch of this cycle appears below.)
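
    As a rough illustration of how these parts fit together, the sketch below shows a generic ReAct-style loop: the model decides on an action, a tool is called, and the observation is fed back until a final answer is produced. This is an illustrative toy, not Google’s implementation; the fake_model function and the get_weather tool are placeholders standing in for a real LLM and real APIs.

    # Generic ReAct-style agent loop (illustrative sketch only).
    import json
    
    def fake_model(prompt: str) -> str:
        # Placeholder for an LLM call: decide the next step from the context so far.
        if "Observation" in prompt:
            return "FINAL It is sunny in Paris."
        return 'ACTION get_weather {"city": "Paris"}'
    
    TOOLS = {
        "get_weather": lambda args: {"city": args["city"], "forecast": "sunny"},
    }
    
    def run_agent(task: str, max_steps: int = 5) -> str:
        history = f"Task: {task}"
        for _ in range(max_steps):
            decision = fake_model(history)                         # reason
            if decision.startswith("FINAL"):
                return decision[len("FINAL"):].strip()             # final answer
            _, tool_name, raw_args = decision.split(" ", 2)
            observation = TOOLS[tool_name](json.loads(raw_args))   # act via a tool
            history += f"\nObservation: {observation}"             # observe
        return "Stopped after max_steps."
    
    print(run_agent("What is the weather in Paris?"))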

    AI Agents vs. Traditional AI Models

    Traditional AI models have limitations. They are restricted by training data. They perform single inferences. In contrast, AI Agents by Google overcome these limits. They do this through several capabilities:

    • External System Access: They connect to external systems via tools. Thus, they interact with real-time data.
    • Session History Management: Agents track and manage session history. This enables multi-turn interactions with context.
    • Native Tool Implementation: They include built-in tools. This allows seamless execution of external tasks.
    • Cognitive Architectures: They utilize advanced frameworks. For instance, they use CoT and ReAct for reasoning.

    The Role of Tools: Extensions, Functions, and Data Stores

    AI Agents by Google interact with the outside world through three key tools:

    Extensions

    These tools bridge agents and APIs. Through examples, they teach the agent how to use an API to carry out actions. For instance, they can use the Google Flights API. Extensions run on the agent side. They are designed to make integrations scalable and robust.

    Functions

    Functions are self-contained code modules. Models use them for specific tasks. Unlike Extensions, these run on the client side. They don’t directly interact with APIs. This gives developers greater control over data flow and system execution.
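
    To illustrate the client-side nature of Functions, here is a small sketch (again illustrative, not a Google API): the model only sees a declarative description and returns a structured call, while the developer’s own code executes the function. The list_flights function, its schema, and the simulated model output are all placeholders.

    # Illustrative sketch of client-side function execution (not a Google API).
    import json
    
    # Declarative description shared with the model (schema is illustrative).
    FUNCTION_DECLARATION = {
        "name": "list_flights",
        "description": "List flights between two cities on a date.",
        "parameters": {"origin": "string", "destination": "string", "date": "string"},
    }
    
    def list_flights(origin: str, destination: str, date: str) -> list:
        # Client-side implementation; the model never executes this directly.
        return [{"flight": "XY123", "origin": origin, "destination": destination, "date": date}]
    
    # Pretend the model returned this structured call after seeing the declaration.
    model_output = json.dumps({
        "name": "list_flights",
        "args": {"origin": "SFO", "destination": "JFK", "date": "2025-07-01"},
    })
    
    call = json.loads(model_output)
    if call["name"] == FUNCTION_DECLARATION["name"]:
        result = list_flights(**call["args"])  # executed by the developer's code, on the client side
        print(result)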

    Data Stores

    Data Stores enable agents to access diverse data. This includes structured and unstructured data from various sources. For instance, they can access websites, PDFs, and databases. This dynamic interaction with current data enhances the model’s knowledge. Furthermore, it aids applications using Retrieval Augmented Generation (RAG).

    Improving Agent Performance

    To get the best results, AI Agents need targeted learning. These methods include:

    • In-context learning: Examples provided during inference let the model learn “on-the-fly.”
    • Retrieval-based in-context learning: External memory enhances this process. It provides more relevant examples.
    • Fine-tuning based learning: Pre-training the model is key. This improves its understanding of tools. Moreover, it improves its ability to know when to use them.

    Getting Started with AI Agents

    If you’re interested in building with AI Agents, consider using libraries like LangChain. Also, you might use platforms such as Google’s Vertex AI. LangChain helps users ‘chain’ sequences of logic and tool calls. Meanwhile, Vertex AI offers a managed environment. It supports building and deploying production-ready agents.

    AI Agents by Google are transforming AI. They go beyond traditional limits. They can reason, use tools, and interact with the external world. Therefore, they are a major step forward. They create more flexible and capable AI systems. As these agents evolve, their ability to solve complex problems will also grow. In addition, their capacity to drive real-world value will expand.

    Read more in the AI Agents whitepaper by Google.