Browser Automation — Controlling Webview Tabs with AI


This guide describes how the AI controls browser tabs in Flowork. The Electron GUI manages BrowserView instances, and the AI captures screenshots (for vision), reads the DOM, clicks elements, types text, imports cookies, and executes scripts, all through the IPC bridge.

Category: browser. Language: JavaScript.

Architecture Overview

Browser automation uses three layers: (1) the AI sends tool calls from ai-builder.html; (2) agent_engine.js translates them into IPC calls through the floworkDesktop API; (3) main.js (Electron) executes them on BrowserView instances. Screenshots use webContents.capturePage(), DOM reads use webContents.executeJavaScript(), and cookie management uses the session.cookies API.
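As a rough illustration of the middle layer, the sketch below turns a tool call into an IPC request description. It is not Flowork's actual code: the function name `toIpcRequest` and the channel names (`browser:capture` and so on) are hypothetical and chosen only for this example.

```javascript
// Hypothetical IPC channel names; the real channel names in Flowork may differ.
const CHANNELS = new Map([
  ["capture_browser", "browser:capture"],
  ["read_dom", "browser:read-dom"],
  ["click_element", "browser:click"],
  ["type_text", "browser:type"],
]);

// Layer 2 sketch: turn one AI tool call into an IPC request description.
function toIpcRequest(toolCall) {
  const channel = CHANNELS.get(toolCall.action);
  if (channel === undefined) {
    throw new Error(`Unsupported action: ${toolCall.action}`);
  }
  // Everything except `action` travels as the payload.
  const { action, ...payload } = toolCall;
  return { channel, payload };
}

// Example: the request that would be handed to the preload bridge.
const request = toIpcRequest({ action: "read_dom", tab_id: "tab-1", selector: "form" });
console.log(request.channel, request.payload);
```

Keeping the translation a pure function makes it easy to check without an Electron runtime; the actual IPC send would happen in a separate step.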

Key Patterns

  • capture_browser(tab_id) → Takes a Base64-encoded screenshot via Electron's capturePage()
  • read_dom(tab_id, selector) → Reads DOM elements using executeJavaScript on the BrowserView
  • click_element(tab_id, selector) → Simulates a click via DOM querySelector().click()
  • type_text(tab_id, selector, text) → Focuses the element and dispatches keyboard events
  • import_cookies(tab_id, cookie_json) → Imports cookies via session.cookies.set()
  • execute_browser_script(tab_id, script) → Runs arbitrary JS in the webview context
  • open_browser(url, tab_name) → Creates a new BrowserView tab
  • CRITICAL PATTERN: Always capture_browser FIRST → analyze screenshot → THEN execute action
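The handlers listed above could be built along the following lines. This is a minimal sketch under stated assumptions, not the code in main.js: the helper names are hypothetical, and `buildTypeScript` sets the value and dispatches `input` and `change` events rather than individual key events, which is a simplification of the keyboard-event approach the list describes. The `capturePage()`, `toPNG()`, and `executeJavaScript()` calls are standard Electron APIs.

```javascript
// Build the script a click_element handler might pass to
// webContents.executeJavaScript(). JSON.stringify embeds the selector
// as a properly escaped JavaScript string literal.
function buildClickScript(selector) {
  const sel = JSON.stringify(selector);
  return `(() => {
  const el = document.querySelector(${sel});
  if (!el) return { ok: false, error: "No element matches " + ${sel} };
  el.click();
  return { ok: true };
})()`;
}

// Build the script for type_text (simplified: sets the value directly
// and fires input/change events instead of per-key events).
function buildTypeScript(selector, text) {
  const sel = JSON.stringify(selector);
  const value = JSON.stringify(text);
  return `(() => {
  const el = document.querySelector(${sel});
  if (!el) return { ok: false, error: "No element matches " + ${sel} };
  el.focus();
  el.value = ${value};
  el.dispatchEvent(new Event("input", { bubbles: true }));
  el.dispatchEvent(new Event("change", { bubbles: true }));
  return { ok: true };
})()`;
}

// Capture a tab as a Base64 PNG. `webContents` is the Electron
// webContents of the target BrowserView, supplied by the caller.
async function captureAsBase64(webContents) {
  const image = await webContents.capturePage();
  return image.toPNG().toString("base64");
}
```

In the main process, a handler would then call something like `webContents.executeJavaScript(buildClickScript(selector))` and return the resulting object to the renderer.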

Project Structure

├── main.js (Electron BrowserView manager)
├── preload.js (IPC bridge)
└── renderer_modules/agent_engine.js (tool handlers)

Implementation Details

Capture + Act Pattern

// Step 1: Capture screenshot for visual analysis
{ "action": "capture_browser", "tab_id": "tab-1" }

// Step 2: Read DOM to find elements
{ "action": "read_dom", "tab_id": "tab-1", "selector": "input[type='email']" }

// Step 3: Type into the element
{ "action": "type_text", "tab_id": "tab-1", "selector": "input[type='email']", "text": "user@example.com" }

// Step 4: Click submit
{ "action": "click_element", "tab_id": "tab-1", "selector": "button[type='submit']" }
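One way to enforce the capture-first rule is to check a planned sequence before running it. The sketch below is illustrative only: `findUncapturedActions` and the set of actions treated as state-changing are assumptions, not part of Flowork.

```javascript
// Actions assumed (for this sketch) to change page state, so they
// should only run after the tab has been captured.
const MUTATING_ACTIONS = new Set(["click_element", "type_text", "execute_browser_script"]);

// Return the indexes of steps that act on a tab before any
// capture_browser step has been issued for that tab.
function findUncapturedActions(steps) {
  const capturedTabs = new Set();
  const violations = [];
  steps.forEach((step, index) => {
    if (step.action === "capture_browser") {
      capturedTabs.add(step.tab_id);
    } else if (MUTATING_ACTIONS.has(step.action) && !capturedTabs.has(step.tab_id)) {
      violations.push(index);
    }
  });
  return violations;
}

// The four-step sequence from this section passes the check.
const loginSteps = [
  { action: "capture_browser", tab_id: "tab-1" },
  { action: "read_dom", tab_id: "tab-1", selector: "input[type='email']" },
  { action: "type_text", tab_id: "tab-1", selector: "input[type='email']", text: "user@example.com" },
  { action: "click_element", tab_id: "tab-1", selector: "button[type='submit']" },
];
console.log(findUncapturedActions(loginSteps).length === 0);
```

A check like this only verifies ordering; analyzing the screenshot between capture and action remains the AI's job.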

Cookie Import

// Export cookies from browser extension as JSON string
{ "action": "import_cookies", "tab_id": "tab-1", "cookies": "[{\"name\":\"session_id\",\"value\":\"abc123\",\"domain\":\".example.com\",\"path\":\"/\"}]" }
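Electron's session.cookies.set() requires a `url` field in addition to the cookie's name and value, so an import handler has to derive one. The sketch below shows one possible conversion; the helper names are hypothetical, and using `https` for every cookie is an assumption made for simplicity.

```javascript
// Convert one exported cookie into the details object accepted by
// Electron's session.cookies.set(). The `url` is derived from the
// cookie's domain and path; https is assumed for all cookies.
function toElectronCookie(cookie) {
  const host = cookie.domain.replace(/^\./, ""); // drop the leading dot
  const path = cookie.path || "/";
  return {
    url: `https://${host}${path}`,
    name: cookie.name,
    value: cookie.value,
    domain: cookie.domain,
    path,
  };
}

// Parse the JSON string carried by import_cookies and convert each entry.
function parseCookieJson(cookieJson) {
  const parsed = JSON.parse(cookieJson);
  if (!Array.isArray(parsed)) {
    throw new Error("Cookie JSON must be an array of cookie objects");
  }
  return parsed.map(toElectronCookie);
}

const details = parseCookieJson(
  '[{"name":"session_id","value":"abc123","domain":".example.com","path":"/"}]'
);
console.log(details[0].url); // https://example.com/
```

In the main process, each resulting object could then be passed to `session.cookies.set(details)`, which returns a Promise.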

Troubleshooting

  • ⚠️ NEVER execute browser actions without capturing first — you're blind without screenshots
  • ⚠️ Cookie domain must include the leading dot (.example.com) for subdomain matching
  • ⚠️ CSS selectors change frequently — always read_dom to find current selectors
  • ⚠️ Some sites detect automation — use stealth-preload.js to bypass detection
  • ⚠️ Tab IDs are assigned by Electron — use list_browsers to get current tab IDs

Summary

This article covers browser patterns for Flowork OS. Generated by Flowork AI from verified system architecture.