AWS Official Blog

Agent-Driven Testing Transformation: Making Agents a First Principle of Testing, Part 2: Browser Use and AgentCore Browser Tool

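Preface: From Instruction Execution to Intelligent Autonomy

In the previous article, we introduced how Amazon Nova Act and AgentCore Browser Tool work together. The combination lets a test agent understand natural language and carry out operations in the browser automatically, which significantly reduces the cost of writing and maintaining test scripts.

In more complex test scenarios, however, a test flow may span multiple systems, pages, and data sources, with dependencies and uncertainty between steps. The agent then needs stronger autonomous judgment and the ability to adapt.

This is where the combination of Browser Use and AgentCore Browser Tool opens up new possibilities. Compared with Nova Act, it is better suited to complex, dynamically changing test tasks. Browser Use does not rely on fixed steps; instead, the agent plans its own path through the browser environment, calls tools, analyzes feedback, and revises its strategy.

Testers no longer describe fixed operation steps in detail. They define only the goal and its constraints, and the agent explores, judges, and executes within those bounds.

In this article, we introduce the core mechanisms of Browser Use with AgentCore Browser Tool, compare its design with that of Nova Act, and discuss why, in more complex scenarios, this autonomous approach complements the combination of Amazon Nova Act and AgentCore Browser Tool.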

Nova Act + AgentCore Browser Tool vs. Browser Use + AgentCore Browser Tool: A Comparison

1. Technical Architecture and Module Integration

Nova Act

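Overview:
Nova Act is a browser automation SDK from Amazon. It relies on the Nova Act service provided by AWS and on a Playwright-driven browser.

The workflow is as follows:

  1. The developer first creates an AgentCore browser session;
  2. Obtains the CDP (Chrome DevTools Protocol) WebSocket URL and authorization headers through tools.browser_client;
  3. Then passes these parameters, the API key, and the starting page to the NovaAct constructor.

Sample code: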

with NovaAct(
    cdp_endpoint_url=ws_url,
    cdp_headers=headers,
    nova_act_api_key=NOVA_ACT_API_KEY,
    starting_page="https://www.amazon.com"
) as nova_act:
    result = nova_act.act("Search for coffee makers and get the details of the lowest-priced one on the first page")

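Execution mechanism:
When nova_act.act() is called, the SDK sends the natural-language instruction and the current page state to the Nova model.
The model returns a series of executable low-level browser operations (such as click, type, and scroll),
which Playwright then executes on the AgentCore browser instance.

The developer orchestrates the whole flow explicitly in Python code,
while Nova Act is responsible for accurately translating each natural-language instruction into executable operations.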
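
The explicit, developer-driven orchestration described above can be sketched without the SDK. This is a minimal sketch under stated assumptions: `run_steps` is a hypothetical helper and `fake_act` is a stub standing in for `nova_act.act()`; neither is part of the Nova Act SDK.

```python
from typing import Callable, List, Tuple


def run_steps(act: Callable[[str], bool], steps: List[str]) -> Tuple[int, bool]:
    """Run each instruction in order and stop at the first one that fails.

    Returns (number of steps that succeeded, whether all steps succeeded).
    """
    completed = 0
    for instruction in steps:
        if not act(instruction):
            return completed, False
        completed += 1
    return completed, True


def fake_act(instruction: str) -> bool:
    """Stub standing in for nova_act.act(); fails on an empty instruction."""
    return bool(instruction.strip())


# Each entry is one small, concrete instruction, as the text recommends.
steps = [
    "search for coffee maker",
    "sort results by price, low to high",
    "open the first result",
]
completed, ok = run_steps(fake_act, steps)
print(completed, ok)
```

The point of the design is that the Python code, not the model, decides the order of steps and when to stop.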

Browser Use

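Overview:
Browser Use is an open-source browser agent framework designed as a general-purpose AI agent.
It can be integrated with the AgentCore browser and uses an LLM (such as Claude) to plan and execute web tasks.

The workflow is as follows:

  1. Start BrowserClient and generate the CDP WebSocket address and authorization headers;
  2. Use BrowserSession to connect to that browser session;
  3. Construct BrowserUseAgent (the core agent class of Browser Use), passing in the task description, the chosen LLM instance, and the browser session object.

Sample code: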

client = BrowserClient(region=region)
client.start()

ws_url, headers = client.generate_ws_headers()

browser_session = BrowserSession(
    cdp_url=ws_url,
    browser_profile=BrowserProfile(headers=headers),
    keep_alive=True
)
# The awaits below must run inside an async function (for example under asyncio.run)
await browser_session.start()

browser_use_agent = BrowserUseAgent(
    task=instruction,
    llm=bedrock_chat,
    browser_session=browser_session
)
result = await browser_use_agent.run()

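Execution mechanism:
In this mode, Browser Use has a built-in multi-turn agent loop.
It uses an LLM (such as Claude) for task planning and provides a rich set of "tools" (actions) to drive the browser or call APIs.

Unlike Nova Act:

  • Browser Use is closer to a complete agent platform;
  • It decouples natural language processing, LLM decision-making, and browser execution;
  • Developers can extend the agent by configuring and registering custom tools (actions).

Design Philosophy Differences

Nova Act is designed for reliability and predictability, and asks users to break complex flows into specific, step-by-step operations. Its "command-execute" style keeps the AI focused on the current step, reduces ambiguity, and offers features such as structured information extraction based on Pydantic models. Browser Use is positioned as a general-purpose browser agent that supports multi-turn sessions and custom tools. It provides an open tool architecture in which functions can be registered inside the agent (with the @tools.action decorator) to add new capabilities such as form filling, data saving, and API calls. In addition, Nova Act is currently tied to Amazon's Nova model family, whereas Browser Use is open source and supports multiple LLMs (for example Claude or GPT), so it can be configured flexibly as needed.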

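2. Support for Complex Tasks

Task Reasoning and Multi-Turn Interaction

Nova Act excels at multi-step tasks that the developer has defined in advance: each act() call performs one logically clear action (such as clicking a button or filling in a form), and user code decides when to move to the next step. Nova Act needs instructions to be as concrete as possible, so a complex goal has to be broken down manually into sub-goals. Its model is highly accurate on basic UI operations, but it works best with clearly decomposed goals rather than fully open-ended, high-level ones, and that decomposition is what keeps its end-to-end execution stable. It is therefore not a fully autonomous reasoning agent, and is closer to a bounded, controlled automation tool.

Browser Use, by contrast, integrates a genuine multi-turn agent: given a high-level goal, it iteratively plans and executes steps through the LLM in the background. A task can be triggered by a single initial prompt, after which the Browser Use agent "talks to itself" and keeps using tools to interact with the browser until the task is done. It supports persistent sessions, multi-tab browsing, and error-handling strategies, and it can ask the user for input or call a fallback tool during execution. In practice, Browser Use is often used for tasks that require combined reasoning. For example, an engineer used Strands and Browser Use on the AgentCore platform to ask "how do I create an S3 lifecycle policy"; the agent searched automatically, summarized the key information from the official documentation, and produced a long, detailed answer. For Nova Act, this kind of highly automated multi-step research task would require many step-by-step calls, whereas Browser Use needs only a single task instruction to trigger the whole flow.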
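
The iterative plan, act, and observe cycle described above can be illustrated with a small loop. This is a minimal sketch, not Browser Use's implementation: `agent_loop`, `scripted_planner`, and `fake_browser` are hypothetical names, and the planner is a fixed script where a real agent would call an LLM.

```python
from typing import Callable, Dict, List

Action = Dict[str, str]


def agent_loop(
    plan_next: Callable[[str, List[str]], Action],
    execute: Callable[[Action], str],
    goal: str,
    max_steps: int = 10,
) -> List[str]:
    """Ask the planner for one action at a time until it returns 'done'."""
    history: List[str] = []
    for _ in range(max_steps):
        action = plan_next(goal, history)
        if action["name"] == "done":
            break
        observation = execute(action)
        # The observation is fed back to the planner on the next iteration.
        history.append(f"{action['name']} -> {observation}")
    return history


def scripted_planner(goal: str, history: List[str]) -> Action:
    """Stub planner; a real agent would send the goal and history to an LLM."""
    script = [
        {"name": "navigate", "arg": "https://amazon.com"},
        {"name": "input", "arg": "MacBook"},
        {"name": "done", "arg": ""},
    ]
    return script[min(len(history), len(script) - 1)]


def fake_browser(action: Action) -> str:
    """Stub executor standing in for real browser actions."""
    return f"ok({action['arg']})"


history = agent_loop(scripted_planner, fake_browser, "search for MacBook")
print(history)
```

The `max_steps` bound is a design choice in this sketch: it keeps a planner that never returns `done` from looping forever.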

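Handling High-Level Instructions

Nova Act's interaction centers on element-level operations, controlling page elements through fine-grained prompts. It also supports extracting page content into Pydantic models for downstream processing. It does not have built-in reasoning for compound instructions, however, so all complex logic must be managed in the developer's code. Browser Use lets the LLM decompose a high-level goal and, where necessary, call external capabilities (such as file operations or web API calls) through its tool chain. Its tool extension mechanism makes it more flexible for structured or multi-step tasks.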

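3. Plugin Ecosystem and Extensibility

Nova Act

Nova Act is currently provided as a closed offering. Developers can insert arbitrary Python code into their scripts (such as parallel threads, API calls, and debugger breakpoints) and use Pydantic models to define the structure of data extracted from web pages. At the "plugin" level, however, there is currently no public third-party plugin system. Nova Act can be used alongside other AWS agent APIs (such as search or speech-to-text). In other words, developers can call AWS services or custom code from a Nova Act workflow, but not through a plugin mechanism built into Nova Act.

Browser Use

As an open-source framework, Browser Use is itself a highly extensible platform. It provides a Tools architecture: developers can register any function with the @tools.action decorator and expose it to the LLM as a callable tool (action). Custom tools can declare parameter types (Pydantic data models are supported as input) and can access the current browser_session, the DevTools client, and so on. For example, you can define an "ask a human for help" tool that prompts the user for an answer when needed, or a "save data to a file" tool, and the agent can call these tools during execution. Browser Use also has documentation and examples for integrating with lower-level browser libraries such as Playwright. In short, the Browser Use "plugin ecosystem" consists mainly of its open Tools system and community-driven extensions; any developer can write a new tool module and integrate it into the agent.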
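
The decorator-based registration pattern described above can be illustrated with the standard library alone. This is a minimal sketch of the general pattern, not Browser Use's actual `Tools` class: `ToolRegistry`, its `call` method, and `rows_to_csv` are hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, List


class ToolRegistry:
    """Hypothetical registry; illustrates the pattern, not Browser Use's API."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., str]] = {}
        self.descriptions: Dict[str, str] = {}

    def action(self, description: str) -> Callable[[Callable[..., str]], Callable[..., str]]:
        """Return a decorator that registers a function under its own name."""
        def decorator(func: Callable[..., str]) -> Callable[..., str]:
            self._tools[func.__name__] = func
            self.descriptions[func.__name__] = description
            return func
        return decorator

    def call(self, name: str, **kwargs) -> str:
        """Dispatch by name, as an agent would after the model picks a tool."""
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](**kwargs)


tools = ToolRegistry()


@tools.action("Format extracted rows as CSV text")
def rows_to_csv(rows: List[List[str]]) -> str:
    return "\n".join(",".join(str(cell) for cell in row) for row in rows)


csv_text = tools.call(
    "rows_to_csv",
    rows=[["MacBook Air", "849.99"], ["MacBook Pro", "1349.00"]],
)
print(csv_text)
```

The description string is what a model would typically see when deciding which tool to call, which is why it is stored alongside the function.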

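4. Suitability for Use Cases

Where Nova Act Fits

Nova Act suits web automation tasks with a clear flow that can be split into explicit steps, such as e-commerce ordering, form filling, and single-page data scraping. Typical examples include using Nova Act to search for a list of hotels on vegas.com, then switching to Google Maps to compute the walking distance for each hotel and processing the results in parallel; or finding a specific product on the Amazon website and adding it to the cart automatically. These tasks can be written into code step by step, with each act() call corresponding to one segment of the operation. Because Nova Act needs relatively specific control instructions, it is better suited to tasks with a stable business flow and fine-grained goals.

Where Browser Use Fits

Browser Use targets more complex, open-ended tasks, such as combined search, multi-site data aggregation, and navigation of dynamic pages. Within a single agent session it can browse multiple pages, search for information, and summarize an answer, while the developer supplies only the initial goal. Browser Use is particularly suited to scenarios that need context tracking or decision branches (such as research searches, complex form filling, and multi-step policy formulation). Overall, if the task is a simple query or a fixed sequence of steps, Nova Act may be more direct; for multi-step information retrieval or web research tasks, Browser Use usually has the edge thanks to its autonomous planning and extensibility. In the AWS documentation example mentioned earlier, Browser Use and the AgentCore browser tool performed retrieval and summarization across multiple pages; doing the same with Nova Act would require a script with much more finely detailed steps.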

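5. Performance, Efficiency, and Resource Consumption

Resource Management

The Amazon AgentCore browser tool uses a serverless architecture and scales browser instances automatically. Both Nova Act and Browser Use connect to this cloud browser over CDP, so browser compute is managed by AWS and billed on demand. Developers do not need to build their own browser cluster.

Response Speed

Nova Act's response time is bounded mainly by the inference speed of the Nova model and by Playwright executing the operations. The first run needs to install browser components; subsequent calls start faster. Browser Use's performance depends on the chosen LLM (such as Claude) and on network latency, and using a cloud browser adds cross-service communication. In practice, a longer search task may run for several minutes. Overall, both have to wait for model output and browser interaction; the difference is that Nova Act depends on Amazon's Nova model family, while Browser Use lets you choose the model.

Error Recovery

Nova Act provides debugging tools for troubleshooting: after each act() call it generates an HTML trace file containing that step's screenshots and recognized elements, for replay and for locating problems, and full-session video recording can also be enabled to review the entire flow. Error handling, however, is left to the developer, who writes code to catch ActError exceptions or adjusts the prompt and retries. Browser Use can recover through its tool mechanism; for example, a "human input" tool can pause on an error and ask the user for help, and its multi-turn agent loop lets it self-correct and retry to some extent. Overall, Browser Use comes with flexible error-handling strategies, while Nova Act leans on logs and traces to locate the problem, followed by manual adjustment.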
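
The "retry, then switch approach" style of recovery can be sketched in a few lines. This is a minimal sketch under stated assumptions: `try_with_fallback` and the two stub actions are hypothetical and belong to neither library; the stubs imitate the run log later in this article, where clicking the cart icon fails three times before the agent navigates to the cart URL directly.

```python
from typing import Callable, Sequence, Tuple

Strategy = Tuple[str, Callable[[], bool], int]


def try_with_fallback(strategies: Sequence[Strategy]) -> Tuple[str, int]:
    """Try each (name, action, max_attempts) strategy in order.

    Returns the name of the strategy that succeeded and the total number of
    attempts made. Raises RuntimeError if every strategy is exhausted.
    """
    total_attempts = 0
    for name, action, max_attempts in strategies:
        for _ in range(max_attempts):
            total_attempts += 1
            if action():
                return name, total_attempts
    raise RuntimeError("all strategies failed")


def click_cart_icon() -> bool:
    """Stub: the click never navigates, so it always reports failure."""
    return False


def open_cart_url() -> bool:
    """Stub: direct navigation to the cart page succeeds."""
    return True


winner, attempts = try_with_fallback([
    ("click_cart_icon", click_cart_icon, 3),
    ("open_cart_url", open_cart_url, 1),
])
print(winner, attempts)
```

With Nova Act this kind of policy would live in the developer's own code; with Browser Use the agent loop can arrive at a similar fallback on its own.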

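Cost Efficiency

As part of AWS services, Nova Act is billed by usage and is currently free during its preview period. Industry analyses consider it cost-efficient. Browser Use itself is free and open source, but at run time it depends on an LLM API (possibly billed per call) and on browser instance resources. For enterprises, the open-source flexibility of Browser Use means bearing the infrastructure costs themselves, while Nova Act benefits from AWS integration and pricing.

6. Why We Recommend Browser Use with AgentCore Browser Tool for More Complex Scenarios

Taken together, when a task is highly complex, multi-step, and requires autonomous planning, the combination of Browser Use and AgentCore Browser Tool has the advantage. Its open agent architecture lets the model decide its own execution path and call external capabilities through an extensible tool chain. In the AWS documentation query example, a single Browser Use agent traverses multiple web pages and summarizes the answer, whereas with Nova Act each browsing and retrieval action would have to be split out manually and scripted. Practical experience with the open-source project also shows that the Browser Use community provides a wealth of examples and tools that help in handling unexpected situations. Nova Act, by comparison, is better suited to relatively predictable workflows; when it meets high-level goals or ambiguous instructions, its execution efficiency suffers. Therefore, for complex tasks that require understanding content across multiple pages, dynamic decisions, or embedded custom tools, the combination of Browser Use and AgentCore Browser Tool makes better use of flexibility and extensibility, and complements the combination of Amazon Nova Act and AgentCore Browser Tool.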

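Test Scenario

Next, we use a relatively complex test scenario to show how the combination of Browser Use and AgentCore Browser Tool carries out a complex testing task from a high-level test goal.

Environment Setup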

git clone https://github.com/awslabs/amazon-bedrock-agentcore-samples.git 
pip install bedrock-agentcore browser-use boto3 rich

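Scenario Example

Test scenario:

This test case verifies an AI agent's autonomous decision-making and end-to-end execution in a realistic e-commerce test. The task is: search for MacBook on Amazon.com, extract the information for the first two products, add them to the shopping cart, calculate the total, and verify that it matches the amount displayed in the cart.
By combining Browser Use with AgentCore Browser Tool, the AI only needs a high-level goal. It then works out the concrete steps on its own (page navigation, element identification, data extraction, cart operations, numerical calculation, and result verification), which demonstrates autonomous planning and execution from a high-level goal down to a low-level execution plan.

Core capabilities:

  • Autonomous task planning – the AI decomposes complex tasks automatically
  • Intelligent web interaction – automatic search, extraction, form filling, and clicking
  • Multi-step operation – automation of the complete shopping flow
  • Data calculation and verification – performs the arithmetic and verifies the result automatically
  • Real-time visualization – watch the operation live over the DCV protocol

The scenario covers search, extraction, add-to-cart, calculation, and verification, and shows the autonomous decision-making and execution capabilities of Browser Use.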
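
The "calculate and verify" step of this scenario is simple arithmetic, and it can be written down as a deterministic check to compare against the agent's answer. This is a minimal sketch: `parse_price` and `cart_matches` are hypothetical helpers, and the prices are the ones reported in the run log later in this article.

```python
from decimal import Decimal
from typing import Iterable


def parse_price(text: str) -> Decimal:
    """Convert a display price such as '$1,349.00' to a Decimal."""
    return Decimal(text.replace("$", "").replace(",", "").strip())


def cart_matches(item_prices: Iterable[str], displayed_subtotal: str) -> bool:
    """Return True when the item prices add up to the displayed subtotal."""
    expected = sum((parse_price(p) for p in item_prices), Decimal("0"))
    return expected == parse_price(displayed_subtotal)


# Prices of the two items and the cart subtotal, as shown in the run log.
print(cart_matches(["$849.99", "$1,349.00"], "$2,198.99"))
```

`Decimal` is used instead of `float` so that currency values are compared exactly rather than through binary floating-point rounding.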

live_view_browser_use.py

#!/usr/bin/env python3
"""
Live Browser Viewer with Browser-Use and Amazon Bedrock AgentCore
Demonstrates AI-driven browser automation with real-time viewing capability
"""

import asyncio
import argparse
import sys
from contextlib import suppress
from pathlib import Path

from browser_use import Agent, Browser, BrowserProfile
from browser_use.llm import ChatAnthropicBedrock
from bedrock_agentcore.tools.browser_client import BrowserClient
from boto3.session import Session
from rich.console import Console
from rich.panel import Panel

# Add interactive_tools to path if using BrowserViewerServer
sys.path.append(str(Path(__file__).parent / "01-tutorials" / "05-AgentCore-tools" / "02-Agent-Core-browser-tool" / "interactive_tools"))

try:
    from browser_viewer import BrowserViewerServer
    HAS_VIEWER = True
except ImportError:
    HAS_VIEWER = False
    print("Warning: BrowserViewerServer not available. Live viewing disabled.")

console = Console()


async def run_browser_task(browser_session: Browser, bedrock_chat: ChatAnthropicBedrock, task: str) -> None:
    """Execute browser automation task using Browser-Use agent"""
    try:
        console.print(f"\n[bold blue]🤖 Executing task:[/bold blue] {task}")
        
        agent = Agent(task=task, llm=bedrock_chat, browser_session=browser_session)
        
        with console.status("[bold green]Running browser automation...[/bold green]", spinner="dots"):
            await agent.run()
        
        console.print("[bold green]✅ Task completed successfully![/bold green]")
    
    except Exception as e:
        console.print(f"[bold red]❌ Error during task execution:[/bold red] {str(e)}")
        raise


async def main(prompt: str, region: str = "us-west-2", enable_viewer: bool = True):
    """
    Main workflow:
    1. Create Bedrock AgentCore browser client
    2. Start live viewer server (optional)
    3. Initialize browser session with CDP
    4. Execute AI-driven tasks
    5. Cleanup resources
    """
    console.print(Panel(
        "[bold cyan]Browser Live Viewer with Browser-Use[/bold cyan]\n\n"
        "Features:\n"
        "• Live browser viewing with DCV\n"
        "• AI automation with Browser-Use\n"
        "• Claude 3.7 Sonnet model\n"
        "• Real-time task execution\n\n"
        "[yellow]Press Ctrl+C to stop[/yellow]",
        title="Amazon Bedrock AgentCore Browser Tool",
        border_style="blue"
    ))
    
    client = None
    browser_session = None
    viewer = None
    
    try:
        # Step 1: Create browser client
        console.print("\n[cyan]Step 1: Creating browser client...[/cyan]")
        client = BrowserClient(region)
        client.start()
        ws_url, headers = client.generate_ws_headers()
        console.print(f"[green]✅ Browser client started in {region}[/green]")
        
        # Step 2: Start viewer (optional)
        if enable_viewer and HAS_VIEWER:
            console.print("\n[cyan]Step 2: Starting live viewer...[/cyan]")
            viewer = BrowserViewerServer(client, port=8000)
            viewer_url = viewer.start(open_browser=True)
            console.print(f"[green]✅ Viewer available at: {viewer_url}[/green]")
        
        # Step 3: Wait for browser initialization
        console.print("\n[cyan]Step 3: Waiting for browser initialization...[/cyan]")
        await asyncio.sleep(10)  # Required delay for browser to be ready
        console.print("[green]✅ Browser initialized[/green]")
        
        # Step 4: Initialize browser session with headers
        console.print("\n[cyan]Step 4: Creating browser session...[/cyan]")
        browser_session = Browser(
            cdp_url=ws_url,
            browser_profile=BrowserProfile(headers=headers, timeout=150000),
            keep_alive=True
        )
        await browser_session.start()
        console.print("[green]✅ Browser session ready[/green]")
        
        # Step 5: Create LLM and execute task
        console.print("\n[cyan]Step 5: Executing browser task...[/cyan]")
        bedrock_chat = ChatAnthropicBedrock(
            model="us.anthropic.claude-3-7-sonnet-20250219-v1:0",
            aws_region=region
        )
        
        await run_browser_task(browser_session, bedrock_chat, prompt)
        
    except KeyboardInterrupt:
        console.print("\n[yellow]⚠️ Interrupted by user[/yellow]")
    except Exception as e:
        console.print(f"\n[red]❌ Error: {e}[/red]")
        raise
    finally:
        # Cleanup
        console.print("\n[yellow]Shutting down...[/yellow]")
        if browser_session:
            with suppress(Exception):
                await browser_session.close()
            console.print("✅ Browser session closed")
        if client:
            client.stop()
            console.print("✅ Browser client stopped")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Live browser automation with Browser-Use and Bedrock AgentCore")
    parser.add_argument("--prompt", required=True, help="Natural language task for the browser agent")
    parser.add_argument("--region", default="us-west-2", help="AWS region (default: us-west-2)")
    parser.add_argument("--no-viewer", action="store_true", help="Disable live viewer")
    
    args = parser.parse_args()
    
    asyncio.run(main(args.prompt, args.region, enable_viewer=not args.no_viewer))

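live_view_browser_use.py imports the browser_use and bedrock_agentcore dependencies and implements the following functional modules:

run_browser_task function:

  • Executes the browser automation task
  • Uses the Claude 3.7 model to process natural-language instructions
  • Provides real-time feedback on task execution status

main function (main workflow):

  • Creates the browser client (BrowserClient)
  • Starts the live viewer server (optional)
  • Initializes the browser session
  • Executes the AI-driven task
  • Manages resource cleanup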

patch_browser_use.py

#!/usr/bin/env python3
"""Auto-detect and patch browser_use session.py"""

import os
import shutil
import sys
from pathlib import Path

def find_browser_use_path():
    """Automatically find the browser_use installation path"""
    try:
        import browser_use
        browser_use_path = Path(browser_use.__file__).parent
        session_file = browser_use_path / "browser" / "session.py"
        return str(session_file)
    except ImportError:
        print("❌ browser_use not installed. Install with: pip install browser-use")
        return None

def patch_browser_use():
    file_path = find_browser_use_path()
    if not file_path:
        return False
    
    if not os.path.exists(file_path):
        print(f"❌ File not found: {file_path}")
        return False
    
    print(f"📁 Found browser_use at: {file_path}")
    
    # Create backup
    backup_path = file_path + ".backup"
    if not os.path.exists(backup_path):
        shutil.copy2(file_path, backup_path)
        print(f"💾 Created backup: {backup_path}")
    else:
        print(f"📋 Backup already exists: {backup_path}")
    
    # Read file
    with open(file_path, 'r', encoding='utf-8') as f:
        content = f.read()
    
    # Replacement 1: Add headers check after cdp_url check
    old1 = "if not cdp_url:\n\t\t\tprofile_kwargs['is_local'] = True"
    new1 = "if not cdp_url:\n\t\t\tprofile_kwargs['is_local'] = True\n\n\t\tif headers:\n\t\t\tprofile_kwargs['headers'] = headers"
    
    if old1 in content and "if headers:\n\t\t\tprofile_kwargs['headers'] = headers" not in content:
        content = content.replace(old1, new1)
        print("✅ Added headers check")
    elif "if headers:\n\t\t\tprofile_kwargs['headers'] = headers" in content:
        print("✅ Headers check already exists")
    else:
        print("⚠️ Headers check pattern not found")
    
    # Replacement 2: Add headers to CDPClient
    old2 = "self._cdp_client_root = CDPClient(self.cdp_url)"
    new2 = "self._cdp_client_root = CDPClient(self.cdp_url, additional_headers=self.browser_profile.headers)"
    
    if old2 in content:
        content = content.replace(old2, new2)
        print("✅ Added headers to CDPClient")
    elif "additional_headers=self.browser_profile.headers" in content:
        print("✅ CDPClient headers already exists")
    else:
        print("⚠️ CDPClient pattern not found")
    
    # Write back
    with open(file_path, 'w', encoding='utf-8') as f:
        f.write(content)
    
    print("🎉 Patching complete!")
    return True

if __name__ == "__main__":
    success = patch_browser_use()
    sys.exit(0 if success else 1)

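The main purpose of patch_browser_use.py is to add custom request header support to browser_use, so that the WebSocket connection is authenticated correctly when the CDP client is initialized. It also reflects the flexibility of browser_use: in scenarios such as real web page interaction, content extraction, and consolidation of complex sources, browser_use offers stronger interaction and cross-platform integration capabilities.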
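
Because the patch script edits an installed library file in place and keeps a `.backup` copy next to it, it can be useful to know how to undo the change. This is a minimal sketch, assuming the backup naming used by the script above (`session.py.backup`); `restore_backup` is a hypothetical helper that is not part of the sample, and the demo runs on a throwaway file rather than on the real browser_use installation.

```python
import shutil
import tempfile
from pathlib import Path


def restore_backup(target: Path) -> bool:
    """Copy '<target>.backup' over target; return False if no backup exists."""
    backup = target.with_name(target.name + ".backup")
    if not backup.exists():
        return False
    shutil.copy2(backup, target)
    return True


# Demo on a temporary file that imitates the patched session.py.
with tempfile.TemporaryDirectory() as tmp:
    session_file = Path(tmp) / "session.py"
    session_file.write_text("original", encoding="utf-8")
    shutil.copy2(session_file, Path(str(session_file) + ".backup"))
    session_file.write_text("patched", encoding="utf-8")

    restored = restore_backup(session_file)
    restored_text = session_file.read_text(encoding="utf-8")

print(restored, restored_text)
```

Reinstalling or upgrading browser-use with pip also replaces the patched file, so the patch may need to be reapplied after an upgrade.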

setup_and_run.py

#!/usr/bin/env python3
"""
Setup script: Patches browser-use and runs the live viewer
Usage: python setup_and_run.py --prompt "Your task here"
"""

import subprocess
import sys
import argparse

def main():
    parser = argparse.ArgumentParser(description="Setup and run browser automation")
    parser.add_argument("--prompt", required=True, help="Natural language task for the browser agent")
    parser.add_argument("--region", default="us-west-2", help="AWS region")
    parser.add_argument("--no-viewer", action="store_true", help="Disable live viewer")
    args = parser.parse_args()
    
    print("=" * 80)
    print("STEP 1: Patching browser-use library...")
    print("=" * 80)
    result = subprocess.run([sys.executable, "patch_browser_use.py"], capture_output=False)
    
    if result.returncode != 0:
        print("\n❌ Patching failed. Please check the error above.")
        sys.exit(1)
    
    print("\n" + "=" * 80)
    print("STEP 2: Running browser automation...")
    print("=" * 80)
    
    cmd = [sys.executable, "live_view_browser_use.py", "--prompt", args.prompt, "--region", args.region]
    if args.no_viewer:
        cmd.append("--no-viewer")
    
    subprocess.run(cmd)

if __name__ == "__main__":
    main()

setup_and_run.py is the main entry point of the project.

Example invocation:

python setup_and_run.py --prompt "Search for MacBook on amazon.com and extract the first two items, add them to the shopping cart, and calculate the total amount to see if it matches the amount shown in the shopping cart."

STEP 1: Patching browser-use library...

📁 Found browser_use at: D:\Users\yz\AppData\Local\Programs\Python\Python313\Lib\site-packages\browser_use\browser\session.py
📋 Backup already exists: D:\Users\yz\AppData\Local\Programs\Python\Python313\Lib\site-packages\browser_use\browser\session.py.backup
✅ Headers check already exists
✅ CDPClient headers already exists
🎉 Patching complete!


STEP 2: Running browser automation...

╭──────────────── Amazon Bedrock AgentCore Browser Tool ────────────────╮
│ Browser Live Viewer with Browser-Use                                  │
│                                                                       │
│ Features:                                                             │
│ • Live browser viewing with DCV                                       │
│ • AI automation with Browser-Use                                      │
│ • Claude 3.7 Sonnet model                                             │
│ • Real-time task execution                                            │
│                                                                       │
│ Press Ctrl+C to stop                                                  │
╰───────────────────────────────────────────────────────────────────────╯
Step 1: Creating browser client...
INFO [botocore.credentials] Found credentials in shared credentials file: ~/.aws/credentials
INFO [bedrock_agentcore.tools.browser_client] Starting browser session...
INFO [bedrock_agentcore.tools.browser_client] Generating websocket headers...
INFO [botocore.credentials] Found credentials in shared credentials file: ~/.aws/credentials
✅ Browser client started in us-west-2

Step 2: Starting live viewer...
✅ DCV SDK found (1,938,399 bytes)

✅ Viewer server running at: http://localhost:8000
Check browser console (F12) for detailed debug information

Opening browser...
✅ Viewer available at: http://localhost:8000

Step 3: Waiting for browser initialization...
INFO [bedrock_agentcore.tools.browser_client] Generating live view url...
INFO [botocore.credentials] Found credentials in shared credentials file: ~/.aws/credentials

Generated presigned URL:
https://bedrock-agentcore.us-west-2.amazonaws.com/browser-streams/aws.browser.v1/sessions/01K7XGD9WT28PQ458T5XHCD5D0/live-view?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAJX774AJ4ZSH5JZDQ%2F20251019%2Fus-west-2%2Fbedrock-agentcore%2Faws4_request&X-Amz-Date=20251019T055224Z&X-Amz-Expires=300&X-Amz-SignedHeaders=host&X-Amz-Signature=d6eb4383ae0762d6cb6c38ff4709be545da63862a8b3ae9f866775278fb71c0f

✅ Browser initialized

Step 4: Creating browser session...
✅ Browser session ready

Step 5: Executing browser task...

🤖 Executing task: Search for MacBook on amazon.com and extract the first two items, add them to the shopping cart, and calculate the total amount to see if it matches the amount shown in
the shopping cart.
INFO [service] Using anonymized telemetry, see https://docs.browser-use.com/development/telemetry.
INFO [Agent] 🔗 Found URL in task: https://amazon.com, adding as initial action...
INFO [Agent] 🎯 Task: Search for MacBook on amazon.com and extract the first two items, add them to the shopping cart, and calculate the total amount to see if it matches the amount shown in the shopping cart.
⠹ Running browser automation...INFO [Agent] Starting a browser-use agent with version 0.8.1, with provider=anthropic_bedrock and model=us.anthropic.claude-3-7-sonnet-20250219-v1:0
INFO [Agent] ▶️ navigate: url: https://amazon.com, new_tab: False
⠙ Running browser automation...INFO [tools] 🔗 Navigated to https://amazon.com
⠇ Running browser automation...WARNING [bubus] ================================================================================
WARNING [bubus] ⏱️ TIMEOUT ERROR - Handling took more than 8.0s for EventBus_54b77007.browser_use.browser.watchdog_base.ScreenshotWatchdog.on_ScreenshotEvent(?▶ ScreenshotEvent#0f9d ✅)
WARNING [bubus] ================================================================================
WARNING [bubus] 📣 BrowserStateRequestEvent#1fe9 10s
WARNING [bubus] ☑️ browser_use.browser.watchdog_base.DownloadsWatchdog.on_BrowserStateRequestEvent(#1fe9) 0s/30s ✓
WARNING [bubus] 📣 NavigationCompleteEvent#e41a 9s
WARNING [bubus] ☑️ browser_use.browser.watchdog_base.DownloadsWatchdog.on_NavigationCompleteEvent(#e41a) 0s/30s ✓
WARNING [bubus] ☑️ browser_use.browser.watchdog_base.SecurityWatchdog.on_NavigationCompleteEvent(#e41a) 0s/30s ✓
WARNING [bubus] ➡️ browser_use.browser.watchdog_base.DOMWatchdog.on_BrowserStateRequestEvent(#1fe9) ⏳ 10s/30s
WARNING [bubus] 📣 ScreenshotEvent#0f9d 8s
WARNING [bubus] ⏰ browser_use.browser.watchdog_base.ScreenshotWatchdog.on_ScreenshotEvent(#0f9d) ⌛️ 8s/ 8s ⬅️ TIMEOUT HERE ⏰
WARNING [bubus]


WARNING [BrowserSession] 📸 Clean screenshot timed out after 6 seconds - no handler registered or slow page?
⠹ Running browser automation...WARNING [BrowserSession] 🔍 DOMWatchdog.on_BrowserStateRequestEvent: Clean screenshot failed: Event handler browser_use.browser.watchdog_base.ScreenshotWatchdog.on_ScreenshotEvent#0512(?▶ ScreenshotEvent#0f9d 🏃) timed out after 8.0s
⠦ Running browser automation...INFO [Agent]

INFO [Agent] 📍 Step 1:
⠋ Running browser automation...INFO [botocore.credentials] Found credentials in shared credentials file: ~/.aws/credentials
⠦ Running browser automation...INFO [Agent] 👍 Eval: The page has just loaded for the first time, so there is no previous goal to evaluate. Amazon.com homepage is successfully loaded. Verdict: Success.
INFO [Agent] 🧠 Memory: I'm on the Amazon.com homepage. I need to search for MacBooks, extract details of the first two items, add them to the cart, and verify the total price calculation. This is step 1 of the process.
INFO [Agent] 🎯 Next goal: Search for MacBook by entering the keyword in the search box and clicking the search button.
INFO [Agent] ▶️ [1/2] input: index: 9, text: MacBook
⠏ Running browser automation...INFO [BrowserSession] ⌨️ Typed "MacBook" into element with index 9
⠙ Running browser automation...INFO [Agent] ▶️ [2/2] click: index: 12
⠋ Running browser automation...INFO [tools] 🖱️ Clicked element
⠋ Running browser automation...WARNING [bubus] ================================================================================
WARNING [bubus] ⏱️ TIMEOUT ERROR - Handling took more than 8.0s for EventBus_54b77007.browser_use.browser.watchdog_base.ScreenshotWatchdog.on_ScreenshotEvent(?▶ ScreenshotEvent#f6d1 ✅)
WARNING [bubus] ================================================================================
WARNING [bubus] 📣 BrowserStateRequestEvent#0dd3 10s
WARNING [bubus] ☑️ browser_use.browser.watchdog_base.DownloadsWatchdog.on_BrowserStateRequestEvent(#0dd3) 0s/30s ✓
WARNING [bubus] 📣 NavigationCompleteEvent#0a6d 9s
WARNING [bubus] ☑️ browser_use.browser.watchdog_base.DownloadsWatchdog.on_NavigationCompleteEvent(#0a6d) 1s/30s ✓
WARNING [bubus] ☑️ browser_use.browser.watchdog_base.SecurityWatchdog.on_NavigationCompleteEvent(#0a6d) 0s/30s ✓
WARNING [bubus] ➡️ browser_use.browser.watchdog_base.DOMWatchdog.on_BrowserStateRequestEvent(#0dd3) ⏳ 10s/30s
WARNING [bubus] 📣 ScreenshotEvent#f6d1 8s
WARNING [bubus] ⏰ browser_use.browser.watchdog_base.ScreenshotWatchdog.on_ScreenshotEvent(#f6d1) ⌛️ 8s/ 8s ⬅️ TIMEOUT HERE ⏰
WARNING [bubus]


⠙ Running browser automation...WARNING [BrowserSession] 📸 Clean screenshot timed out after 6 seconds - no handler registered or slow page?
⠧ Running browser automation...WARNING [BrowserSession] 🔍 DOMWatchdog.on_BrowserStateRequestEvent: Clean screenshot failed: Event handler browser_use.browser.watchdog_base.ScreenshotWatchdog.on_ScreenshotEvent#0512(?▶ ScreenshotEvent#f6d1 🏃) timed out after 8.0s
⠹ Running browser automation...INFO [Agent]

INFO [Agent] 📍 Step 2:
⠇ Running browser automation...INFO [Agent] 👍 Eval: Successfully searched for MacBook on Amazon and the search results page is now displayed showing various MacBook models. Verdict: Success.
INFO [Agent] 🧠 Memory: I'm on the Amazon search results page for MacBook. I can see multiple MacBook models with their prices. The first one is a 2025 MacBook Air 13-inch with M4 chip priced at $849.99, and the second is a 2024 MacBook Pro with M4 chip priced at $1,349.00. I need to add both to cart and verify the total.
INFO [Agent] 🎯 Next goal: Click the "Add to cart" button for the first MacBook (2025 MacBook Air) to begin the shopping cart process.
INFO [Agent] ▶️ click: index: 129
⠏ Running browser automation...INFO [tools] 🖱️ Clicked element
⠹ Running browser automation...INFO [Agent]

INFO [Agent] 📍 Step 3:
⠸ Running browser automation...INFO [Agent] 👍 Eval: Successfully added the first MacBook (2025 MacBook Air) to the cart as indicated by the "1 item in cart" [20] text in the navigation bar. Verdict: Success.
INFO [Agent] 🧠 Memory: I'm on the Amazon search results page for MacBook. I've already added the first MacBook (2025 MacBook Air at $849.99) to the cart. Now I need to add the second MacBook (2024 MacBook Pro at $1,349.00) to the cart, then view the cart to verify the total amount calculation.
INFO [Agent] 🎯 Next goal: Click the "Add to cart" button for the second MacBook (2024 MacBook Pro) to complete adding both items to the shopping cart.
INFO [Agent] ▶️ click: index: 163
⠏ Running browser automation...INFO [tools] 🖱️ Clicked element
⠧ Running browser automation...INFO [Agent]

INFO [Agent] 📍 Step 4:
⠼ Running browser automation...INFO [Agent] 👍 Eval: Successfully added the second MacBook (2024 MacBook Pro) to the cart as indicated by the "2 items in cart" [20] text in the navigation bar. Verdict: Success.
INFO [Agent] 🧠 Memory: I've successfully added both MacBooks to the cart: the 2025 MacBook Air at $849.99 and the 2024 MacBook Pro at $1,349.00. The expected total should be $2,198.99. Now I need to check the cart to verify if the actual total matches this calculation.
INFO [Agent] 🎯 Next goal: Click on the cart icon to view the cart contents and verify if the total price matches our calculation of $2,198.99.
INFO [Agent] ▶️ click: index: 20
⠦ Running browser automation...WARNING [BrowserSession] No visible quad found, using first quad
⠏ Running browser automation...INFO [tools] 🖱️ Clicked element
⠇ Running browser automation...INFO [Agent]

INFO [Agent] 📍 Step 5:
⠼ Running browser automation...INFO [Agent] ⚠️ Eval: I clicked on the cart icon in the previous step, but I'm still on the search results page. The action might not have been registered correctly. Verdict: Failure.
INFO [Agent] 🧠 Memory: I've searched for MacBooks on Amazon and added two items to the cart: the 2025 MacBook Air at $849.99 and the 2024 MacBook Pro at $1,349.00. The expected total is $2,198.99. I attempted to view the cart but need to try again.
INFO [Agent] 🎯 Next goal: Click on the cart icon [20] again to view the cart contents and verify if the total price matches our calculated total of $2,198.99.
INFO [Agent] ▶️ click: index: 20
⠦ Running browser automation...WARNING [BrowserSession] No visible quad found, using first quad
⠇ Running browser automation...INFO [tools] 🖱️ Clicked element
⠴ Running browser automation...INFO [Agent]

INFO [Agent] 📍 Step 6:
⠇ Running browser automation...INFO [Agent] 👍 Eval: Successfully clicked the cart icon in the previous step, but the action did not result in navigating to the cart page as expected. Verdict: Failure.
INFO [Agent] 🧠 Memory: I've searched for MacBooks on Amazon and successfully added two items to the cart: the 2025 MacBook Air at $849.99 and the 2024 MacBook Pro at $1,349.00. The expected total should be $2,198.99. I've attempted to navigate to the cart twice by clicking on the cart icon [20], but remain on the search results page.
INFO [Agent] 🎯 Next goal: Click on the cart icon [20] with text "Cart" again, using a more deliberate approach to navigate to the shopping cart page and verify the total price.
INFO [Agent] ▶️ click: index: 20
⠹ Running browser automation...WARNING [BrowserSession] No visible quad found, using first quad
⠴ Running browser automation...INFO [tools] 🖱️ Clicked element
⠼ Running browser automation...INFO [Agent]

INFO [Agent] 📍 Step 7:
⠋ Running browser automation...INFO [Agent] 👍 Eval: I clicked on the cart icon [20] in the previous step, but remained on the search results page. Three attempts to navigate to the cart via the icon have been unsuccessful. Verdict: Failure.
INFO [Agent] 🧠 Memory: I've searched for MacBooks on Amazon and successfully added two items to the cart: the 2025 MacBook Air at $849.99 and the 2024 MacBook Pro at $1,349.00. The expected total should be $2,198.99. I've attempted to click the cart icon three times without success in navigating to the cart page.
⠙ Running browser automation...INFO [Agent] 🎯 Next goal: Try navigating directly to the Amazon cart URL to view the cart contents and verify the total price calculation.
INFO [Agent] ▶️ navigate: url: https://www.amazon.com/gp/cart/view.html
⠼ Running browser automation...INFO [tools] 🔗 Navigated to https://www.amazon.com/gp/cart/view.html
⠦ Running browser automation...INFO [Agent]

INFO [Agent] 📍 Step 8:
⠸ Running browser automation...INFO [Agent] 👍 Eval: Successfully navigated to the Amazon cart page, which shows both MacBooks and allows verification of the total price calculation. Verdict: Success.
INFO [Agent] 🧠 Memory: I've completed the entire task. I searched for MacBooks on Amazon, added two items to the cart (2025 MacBook Air at $849.99 and 2024 MacBook Pro at $1,349.00), and verified the total matches my calculation of $2,198.99.
INFO [Agent] 🎯 Next goal: Complete the task by confirming the total price calculation is correct and reporting the findings.
INFO [Agent] ▶️ done: text: I've completed the search for MacBooks on Amazon and verified the shopping cart calculation. Here's what I found:

1. First item: Apple 2025 MacBook A..., success: True
⠴ Running browser automation...INFO [Agent]
📄 Final Result:
I've completed the search for MacBooks on Amazon and verified the shopping cart calculation. Here's what I found:

1. First item: Apple 2025 MacBook Air 13-inch with M4 chip
- Price: $849.99
- Specifications: 16GB Unified Memory, 256GB SSD Storage
- Color: Midnight

2. Second item: Apple 2024 MacBook Pro Laptop with M4 chip
- Price: $1,349.00
- Specifications: 10‑core CPU and 10‑core GPU, 16GB Unified Memory, 512GB SSD Storage
- Color: Space Black

3. Cart Total Calculation:
- MacBook Air: $849.99
- MacBook Pro: $1,349.00
- Expected subtotal: $2,198.99
- Actual subtotal shown on Amazon: $2,198.99

The calculation is correct - the subtotal displayed in the shopping cart ($2,198.99) exactly matches the sum of the individual item prices.


INFO [Agent] ✅ Task completed successfully
✅ Task completed successfully!

Shutting down...
✅ Browser session closed
INFO [bedrock_agentcore.tools.browser_client] Stopping browser session...
✅ Browser client stopped

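Browser Viewer screenshot:

At this point, under GenAI Observability in the CloudWatch monitoring service, you can see the corresponding Bedrock model invocation metrics and the AgentCore Browser Use records:

Conclusion

From Nova Act to Browser Use, testing shows a complementary evolution from "relatively precise execution" to "more autonomous decision-making": the former ensures controllability and stability, the latter strengthens flexibility and intelligence, and together they form the complete picture of agent-driven testing.

When testing spans multiple pages and uncertain flows, fixed scripts alone can no longer meet the need. The combination of Browser Use and AgentCore Browser Tool lets a test agent plan and adjust flexibly according to its goal, making the testing process more efficient, resilient, and sustainable. This marks a gradual shift in testing from human-driven to AI-driven work.

*The specific Amazon Web Services generative AI services mentioned above are currently available in Amazon Web Services regions outside China. Cloud services in the Amazon Web Services China Regions are operated by NWCD and Sinnet; for details, refer to the official website for the China Regions.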

About the Authors

张羽

Solutions Architect, Amazon Web Services

赵安蓓

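AWS Solutions Architect, responsible for consulting on and designing solutions based on the AWS cloud platform, and a member of the machine learning TFC. Has extensive hands-on experience in data processing and modeling, with a particular focus on the engineering and application of machine learning in healthcare.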