The Engine Room: Deploying to the Cloud with Bedrock Agentcore Runtime
In Part 2, we built and tested the "brain" of our customer support agent using the Strands SDK. It can understand user requests, reason about which tool to use, and execute actions, all on our local machine. Now it's time to move this agent from our terminal to the cloud. This is where Bedrock Agentcore Runtime shines: a secure, serverless engine designed specifically for agentic workloads, letting us deploy our agent with minimal code changes and a single command.

Introduction to Agentcore Runtime: The Secure, Serverless Engine

Amazon Bedrock Agentcore Runtime is the foundational compute layer of the Agentcore suite. It is not a generic container service; it is purpose-built infrastructure that addresses the unique demands of AI agents. Its core value propositions are:

- Serverless and Scalable: You deploy your agent's code, and Runtime handles everything else: provisioning infrastructure, managing capacity, and automatically scaling based on demand. You pay only for the compute time you consume, eliminating the need for idle servers.  

- Secure by Design: Security is paramount, especially when agents handle sensitive customer data. Runtime provides complete session isolation by running each user's conversation in its own dedicated microVM with isolated CPU, memory, and filesystem resources. This robust separation prevents data leakage and cross-session contamination, a critical requirement for multi-tenant applications.  

- Flexible and Resilient: Runtime is framework-agnostic, supporting agents built with Strands, LangGraph, CrewAI, or custom logic. It also supports long-running, asynchronous tasks for up to 8 hours, making it suitable for complex workflows that might involve multi-step reasoning or batch processing.  

Adapting the Strands Agent for Cloud Deployment

One of the most compelling aspects of the Strands and Agentcore combination is how little you need to change your agent code to make it cloud-ready. We don't need to write a web server using FastAPI or Flask, manage API routing, or even build a Dockerfile manually. The bedrock-agentcore SDK provides a simple, declarative way to expose our agent.

We will create a new file, app.py, which will serve as the entry point for Agentcore Runtime. This file imports our support_agent from main.py and wraps its invocation logic with a few lines of code.

 

```python
# app.py
from bedrock_agentcore import BedrockAgentCoreApp
from main import support_agent  # Import the agent we already built

# 1. Instantiate the AgentCore App
app = BedrockAgentCoreApp()

# 2. Define the entrypoint for the runtime
@app.entrypoint
def invoke_agent(payload: dict, context) -> dict:
    """
    Main entry point for the Agentcore Runtime.
    Receives the invocation payload and returns the agent's response.
    """
    try:
        user_message = payload.get("prompt")
        if not user_message:
            return {"error": "Prompt not provided."}

        # Call our existing Strands agent
        result = support_agent(user_message)

        return {"response": result.message}

    except Exception as e:
        print(f"Error invoking agent: {e}")
        return {"error": "An internal error occurred."}

# 3. Add a run block for local testing (optional but recommended)
if __name__ == "__main__":
    app.run()
```

That's it. The @app.entrypoint decorator is the key. It transforms our invoke_agent function into a standardized endpoint that the Agentcore Runtime service knows how to call. This simple pattern abstracts away all the underlying web server and networking complexity. We also need a requirements.txt file so the deployment toolkit knows which packages to install in the container.

```shell
# Create the requirements.txt file
echo "strands-agents" > requirements.txt
echo "strands-agents-tools" >> requirements.txt
echo "boto3" >> requirements.txt
echo "bedrock-agentcore" >> requirements.txt
```

One-Command Deployment: The Magic of the Starter Toolkit

With our code prepared, we can now deploy it using the bedrock-agentcore-starter-toolkit, a powerful command-line interface (CLI) that acts as a domain-specific Infrastructure-as-Code (IaC) tool for agentic workloads. It bridges the gap between our Python code and the required cloud infrastructure, automating a series of complex steps into two simple commands.

Step 1: Configure the Deployment

First, we run agentcore configure. This command inspects our project and interactively prompts us for any necessary configuration details, which it then saves to a local .bedrock_agentcore.yaml file.

```shell
# Make sure you have the toolkit installed: pip install bedrock-agentcore-starter-toolkit
agentcore configure --entrypoint app.py
```

The CLI will guide you through the setup, asking for an agent name and confirming the AWS resources it will create.
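For reference, the generated .bedrock_agentcore.yaml looks roughly like the sketch below. Treat the field names as illustrative: the toolkit owns the exact schema, and the values here (agent name, region) are assumptions for this walkthrough.

```yaml
# .bedrock_agentcore.yaml -- illustrative sketch, not the authoritative schema
default_agent: customer_support_agent
agents:
  customer_support_agent:
    name: customer_support_agent
    entrypoint: app.py
    platform: linux/arm64        # Agentcore Runtime targets ARM64 images
    aws:
      region: us-east-1
      execution_role: null       # filled in automatically on first launch
      ecr_repository: null       # filled in automatically on first launch
```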

Step 2: Launch to the Cloud

Next, we run the agentcore launch command. This single command triggers a fully automated deployment pipeline:

 

```shell
agentcore launch
```

Here's what's happening "under the hood" while you wait:

1. Containerization: The toolkit uses AWS CodeBuild to build an ARM64 container image from your code and requirements.txt. Because the build happens in the cloud, you don't even need Docker installed locally.
2. Infrastructure Provisioning: If this is your first deployment, the toolkit creates the necessary AWS resources, including an Amazon ECR repository to store your container images and an IAM execution role with the permissions your agent needs to run and access services like Amazon Bedrock.
3. Deployment: The container image is pushed to the ECR repository, and a new Agentcore Runtime is provisioned and started using your image.
4. Logging: CloudWatch Log groups are automatically configured, so you can immediately monitor your agent's logs.

This process encapsulates best practices for container-based deployments on AWS, saving developers from the steep learning curve of manually managing these resources.

Invoking and Interacting with the Cloud Agent

Once the launch command completes, it will output the ARN (Amazon Resource Name) of your deployed agent. You can now interact with it from anywhere.

Invocation via the CLI

The starter toolkit provides a simple invoke command for quick testing. This is a great way to verify that the agent is live and responsive.

```shell
agentcore invoke '{"prompt": "What is the return policy for apparel?"}'
```

You should receive a JSON response containing the agent's answer, just like you did locally.

Programmatic Invocation via the AWS SDK (Boto3)

For real-world applications, you'll invoke the agent programmatically. The following Python snippet shows how to do this using Boto3. The most important parameter is the session ID. Agentcore Runtime uses it to route all requests for a given conversation to the same isolated microVM, preserving the conversation's state (in-memory variables, temporary files, etc.) for its duration. To carry on a conversation, simply pass the same session ID on each subsequent call.

 

```python
import boto3
import json
import uuid

# Configuration
AGENT_ARN = "arn:aws:bedrock-agentcore:us-east-1:123456789012:runtime/YOUR_AGENT_ID"  # Replace with your agent's ARN
AWS_REGION = "us-east-1"

# Create a client for the Agentcore data plane
agentcore_client = boto3.client(
    "bedrock-agentcore",
    region_name=AWS_REGION
)

# Generate a unique session ID for a new conversation.
# A UUID4 string (36 characters) satisfies the service's minimum session ID length.
session_id = str(uuid.uuid4())
print(f"Starting new session: {session_id}")

def ask_agent(prompt: str, session_id: str):
    """Invokes the agent and returns the response."""
    response = agentcore_client.invoke_agent_runtime(
        agentRuntimeArn=AGENT_ARN,
        runtimeSessionId=session_id,
        payload=json.dumps({"prompt": prompt}).encode("utf-8")
    )

    # The response body is returned as a streaming object
    response_body = json.loads(response["response"].read().decode("utf-8"))
    return response_body.get("response")

# Start the conversation
response1 = ask_agent("Hi, what is the status of my order 67890?", session_id)
print(f"Agent Response 1: {response1}")

# Continue the same conversation
response2 = ask_agent("Great, can you also tell me the return policy for electronics?", session_id)
print(f"Agent Response 2: {response2}")
```

With our agent now running securely and scalably in the cloud, we have successfully bridged the gap from prototype to a production-grade service. However, our agent still suffers from amnesia between sessions and its tools are simple functions bundled with its code. In the final part of our series, we will elevate our solution to a truly enterprise-ready state by integrating Agentcore Memory, Gateway, Identity, and Observability.
