name: Agent Browser
description: A fast Rust-based headless browser automation CLI with Node.js fallback that enables AI agents to navigate, click, type, and snapshot pages via structured commands.
read_when:
- Automating web interactions
- Extracting structured data from pages
- Filling forms programmatically
- Testing web UIs
metadata: {"clawdbot":{"emoji":"🌐","requires":{"bins":["node","npm"]}}}
allowed-tools: Bash(agent-browser:*)
agent-browser open <url> # Navigate to page
agent-browser snapshot -i # Get interactive elements with refs
agent-browser click @e1 # Click element by ref
agent-browser fill @e2 "text" # Fill input by ref
agent-browser close # Close browser
Core workflow
Navigate: agent-browser open <url>
Snapshot: agent-browser snapshot -i (returns elements with refs like @e1, @e2)
Interact using refs from the snapshot
Re-snapshot after navigation or significant DOM changes
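The loop above can be sketched from Python as a list of commands. This is a minimal sketch that only assembles and prints the argument lists, so nothing is launched. The helper name `workflow_commands` is hypothetical, and `@e1` is a placeholder: real refs come from the snapshot output.

```python
import shlex

def workflow_commands(url, ref="@e1"):
    """Return the open -> snapshot -> interact -> re-snapshot sequence as argv lists."""
    return [
        ["agent-browser", "open", url],
        ["agent-browser", "snapshot", "-i"],   # read refs such as @e1 from this output
        ["agent-browser", "click", ref],
        ["agent-browser", "snapshot", "-i"],   # refs may have changed after the click
    ]

for argv in workflow_commands("https://example.com"):
    print(shlex.join(argv))
```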
Commands
Navigation
agent-browser open <url> # Navigate to URL
agent-browser back # Go back
agent-browser forward # Go forward
agent-browser reload # Reload page
agent-browser close # Close browser
Snapshot (page analysis)
agent-browser snapshot # Full accessibility tree
agent-browser snapshot -i # Interactive elements only (recommended)
agent-browser snapshot -c # Compact output
agent-browser snapshot -d 3 # Limit depth to 3
agent-browser snapshot -s "#main" # Scope to CSS selector
Interactions (use @refs from snapshot)
agent-browser click @e1 # Click
agent-browser dblclick @e1 # Double-click
agent-browser focus @e1 # Focus element
agent-browser fill @e2 "text" # Clear and type
agent-browser type @e2 "text" # Type without clearing
agent-browser press Enter # Press key
agent-browser press Control+a # Key combination
agent-browser keydown Shift # Hold key down
agent-browser keyup Shift # Release key
agent-browser hover @e1 # Hover
agent-browser check @e1 # Check checkbox
agent-browser uncheck @e1 # Uncheck checkbox
agent-browser select @e1 "value" # Select dropdown
agent-browser scroll down 500 # Scroll page
agent-browser scrollintoview @e1 # Scroll element into view
agent-browser drag @e1 @e2 # Drag and drop
agent-browser upload @e1 file.pdf # Upload files
Get information
agent-browser get text @e1 # Get element text
agent-browser get html @e1 # Get innerHTML
agent-browser get value @e1 # Get input value
agent-browser get attr @e1 href # Get attribute
agent-browser get title # Get page title
agent-browser get url # Get current URL
agent-browser get count ".item" # Count matching elements
agent-browser get box @e1 # Get bounding box
Check state
agent-browser is visible @e1 # Check if visible
agent-browser is enabled @e1 # Check if enabled
agent-browser is checked @e1 # Check if checked
Screenshots & PDF
agent-browser screenshot # Screenshot to stdout
agent-browser screenshot path.png # Save to file
agent-browser screenshot --full # Full page
agent-browser pdf output.pdf # Save as PDF
Video recording
agent-browser record start ./demo.webm # Start recording (uses current URL + state)
agent-browser click @e1 # Perform actions
agent-browser record stop # Stop and save video
agent-browser record restart ./take2.webm # Stop current + start new recording
Recording creates a fresh context but preserves cookies/storage from your session. If no URL is provided, it automatically returns to your current page. For smooth demos, explore first, then start recording.
Wait
agent-browser wait @e1 # Wait for element
agent-browser wait 2000 # Wait milliseconds
agent-browser wait --text "Success" # Wait for text
agent-browser wait --url "/dashboard" # Wait for URL pattern
agent-browser wait --load networkidle # Wait for network idle
agent-browser wait --fn "window.ready" # Wait for JS condition
Mouse control
agent-browser mouse move 100 200 # Move mouse
agent-browser mouse down left # Press button
agent-browser mouse up left # Release button
agent-browser mouse wheel 100 # Scroll wheel
Semantic locators (alternative to refs)
agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"
agent-browser find first ".item" click
agent-browser find nth 2 "a" text
Browser settings
agent-browser set viewport 1920 1080 # Set viewport size
agent-browser set device "iPhone 14" # Emulate device
agent-browser set geo 37.7749 -122.4194 # Set geolocation
agent-browser set offline on # Toggle offline mode
agent-browser set headers '{"X-Key":"v"}' # Extra HTTP headers
agent-browser set credentials user pass # HTTP basic auth
agent-browser set media dark # Emulate color scheme
Cookies & Storage
agent-browser cookies # Get all cookies
agent-browser cookies set name value # Set cookie
agent-browser cookies clear # Clear cookies
agent-browser storage local # Get all localStorage
agent-browser storage local key # Get specific key
agent-browser storage local set k v # Set value
agent-browser storage local clear # Clear all
Tabs & Windows
agent-browser tab # List tabs
agent-browser tab new [url] # New tab
agent-browser tab 2 # Switch to tab
agent-browser tab close # Close tab
agent-browser window new # New window
Frames
agent-browser frame "#iframe" # Switch to iframe
agent-browser frame main # Back to main frame
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i # Check result
Example: Authentication with saved state
# Login once
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "username"
agent-browser fill @e2 "password"
agent-browser click @e3
agent-browser wait --url "/dashboard"
agent-browser state save auth.json
# Later sessions: load saved state
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
Sessions (parallel browsers)
agent-browser --session test1 open site-a.com
agent-browser --session test2 open site-b.com
agent-browser session list
JSON output (for parsing)
Add --json for machine-readable output:
agent-browser snapshot -i --json
agent-browser get text @e1 --json
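A script can parse that output with the standard `json` module. The sketch below is an assumption-laden helper, not part of the CLI: `run_json` and its `runner` parameter are hypothetical names, and because the JSON schema is not described here, the parsed value is returned as-is without assuming any field.

```python
import json
import subprocess

def run_json(args, runner=subprocess.run):
    """Run agent-browser with --json appended and return the parsed output."""
    argv = ["agent-browser", *args, "--json"]
    # check=True raises CalledProcessError if the command exits non-zero.
    result = runner(argv, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)
```

For example, `run_json(["snapshot", "-i"])` runs `agent-browser snapshot -i --json`. The `runner` argument exists only so the call can be replaced in tests.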
Debugging
agent-browser open example.com --headed # Show browser window
agent-browser console # View console messages
agent-browser console --clear # Clear console
agent-browser errors # View page errors
agent-browser errors --clear # Clear errors
agent-browser highlight @e1 # Highlight element
agent-browser trace start # Start recording trace
agent-browser trace stop trace.zip # Stop and save trace
agent-browser record start ./debug.webm # Record from current page
agent-browser record stop # Save recording
agent-browser --cdp 9222 snapshot # Connect via CDP
Troubleshooting
If the command is not found on Linux ARM64, use the full path in the bin folder.
If an element is not found, use snapshot to find the correct ref.
If the page is not loaded, add a wait command after navigation.
Use --headed to see the browser window for debugging.
Options
--session uses an isolated session.
--json provides JSON output.
--full takes a full page screenshot.
--headed shows the browser window.
--timeout sets the command timeout in milliseconds.
--cdp connects via Chrome DevTools Protocol.
Notes
Refs are stable per page load but change on navigation.
Always snapshot after navigation to get new refs.
Use fill instead of type for input fields to ensure existing text is cleared.
Reporting Issues
Skill issues: Open an issue at https://github.com/TheSethRose/Agent-Browser-CLI
agent-browser CLI issues: Open an issue at https://github.com/vercel-labs/agent-browser
find-skills
name: find-skills
description: Highest-priority skill discovery flow. MUST trigger when users ask to find/install skills (e.g. 技能, 找技能, find-skill, find-skills, install skill). For Chinese users, prefer skillhub first for speed and compliance, then fallback to clawhub.
Find Skills
This skill helps discover, compare, and install skills.
Priority Rules (Mandatory)
This skill is highest-priority for skill discovery/install intents.
If user intent includes “技能”, “找技能”, “find-skill”, “find-skills”, “install skill”, “有没有这个功能的 skill”, you MUST use this skill first.
Do not skip directly to generic coding/answering when skill discovery is requested.
Chinese Optimization Policy
For Chinese users and CN networks, use the following order for better speed and compliance:
skillhub (cn-optimized, preferred)
clawhub (fallback)
If the primary source has no match or its command is unavailable, fall back to the next source and state the fallback clearly.
Workflow
Step 1: Understand What They Need
When a user asks for help with something, identify:
The domain (e.g., React, testing, design, deployment)
The specific task (e.g., writing tests, creating animations, reviewing PRs)
Whether this is a common enough task that a skill likely exists
Step 2: Search for Skills
Run search in this order:
skillhub search [query]
If skillhub is unavailable or returns no match, fall back to:
clawhub search [query]
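The search order above can be sketched as a small fallback loop. This is a sketch under stated assumptions: `search_skills` and `runner` are hypothetical names, and "no match" is taken to mean a non-zero exit code or empty output, which the two CLIs may not actually guarantee.

```python
import subprocess

def search_skills(query, runner=subprocess.run):
    """Try skillhub first, then clawhub; return (source, output) or (None, "")."""
    for source in ("skillhub", "clawhub"):
        try:
            result = runner([source, "search", query], capture_output=True, text=True)
        except FileNotFoundError:
            continue  # command not installed: fall through to the next source
        # Assumption: a non-zero exit or empty output means "no match".
        if result.returncode == 0 and result.stdout.strip():
            return source, result.stdout
    return None, ""
```

Returning the source alongside the output makes it easy to tell the user which registry answered, as the policy requires.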
Step 3: Present Options to the User
When you find relevant skills, present them to the user with:
The skill name and what it does
The source used (skillhub / clawhub)
The install command they can run
Step 4: Offer to Install
If the user wants to proceed, you can install the skill for them.
Preferred install order:
Try skillhub install when the result comes from skillhub.
If no skillhub candidate exists, use clawhub install.
Before install, summarize source, version, and notable risk signals.
When No Skills Are Found
If no relevant skills exist:
Acknowledge that no existing skill was found
Offer to help with the task directly using your general capabilities
Suggest creating a custom local skill in the workspace if this is a recurring need
github
name: github
description: "Interact with GitHub using the gh CLI. Use gh issue, gh pr, gh run, and gh api for issues, PRs, CI runs, and advanced queries."
GitHub Skill
Use the gh CLI to interact with GitHub. Always specify --repo owner/repo when not in a git directory, or use URLs directly.
Pull Requests
Check CI status on a PR:
gh pr checks 55 --repo owner/repo
List recent workflow runs:
gh run list --repo owner/repo --limit 10
View a run and see which steps failed:
gh run view <run-id> --repo owner/repo
View logs for failed steps only:
gh run view <run-id> --repo owner/repo --log-failed
API for Advanced Queries
The gh api command is useful for accessing data not available through other subcommands.
Get PR with specific fields:
gh api repos/owner/repo/pulls/55 --jq '.title, .state, .user.login'
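When the response is post-processed in a script instead of with --jq, the same three fields can be picked out after parsing. A minimal sketch: `pr_summary` is a hypothetical helper and the sample payload is made up; only the field names come from the filter above.

```python
import json

def pr_summary(raw):
    """Pick the same three fields as the --jq filter from the API response text."""
    pr = json.loads(raw)
    return pr["title"], pr["state"], pr["user"]["login"]

# Made-up sample standing in for the output of: gh api repos/owner/repo/pulls/55
sample = '{"title": "Fix typo", "state": "open", "user": {"login": "octocat"}}'
print(pr_summary(sample))
```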
JSON Output
Most commands support --json for structured output. You can use --jq to filter:
name: humanize-chinese
description: Detect and humanize AI-generated Chinese text with 6 style transforms (casual/zhihu/xiaohongshu/wechat/academic/literary). Removes “AI flavor” using 16 detection patterns. Pure Python, no dependencies. v1.1.0
allowed-tools:
- Read
- Write
- Edit
- exec
Humanize Chinese AI Text
Comprehensive CLI for detecting and transforming Chinese AI-generated text. Makes robotic AI writing natural and human-like with 6 specialized writing style transforms.
NEW in v1.1: Style transforms (知乎/小红书/公众号/口语化/学术/文艺), enhanced detection (16 patterns), emotional analysis
Quick Start
# Detect AI patterns (16 categories)
python scripts/detect_cn.py text.txt
# Humanize text
python scripts/humanize_cn.py text.txt -o clean.txt
# Scene-specific humanization
python scripts/humanize_cn.py text.txt --scene social # Social media
python scripts/humanize_cn.py text.txt --scene tech # Tech blog
python scripts/humanize_cn.py text.txt --scene formal # Formal article
Manual review for content quality and scene appropriateness
AI Probability Scoring
| Rating | Criteria |
|--------|----------|
| Very High | Three-part structure, mechanical connectors, or empty grand words present |
| High | >20 issues OR issue density >3% |
| Medium | >10 issues OR issue density >1.5% |
| Low | <10 issues AND density <1.5% |
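The table can be read as a cascade of checks from most to least severe. The sketch below makes two assumptions that the table does not state: density is issues divided by character count, and `strong_signal` is set by the caller when a three-part structure, mechanical connector, or empty grand word was detected. `ai_rating` is a hypothetical name, not a function of the detection script.

```python
def ai_rating(issue_count, char_count, strong_signal=False):
    """Map detection results to the rating table (density assumed = issues / chars)."""
    density = issue_count / char_count if char_count else 0.0
    if strong_signal:
        return "Very High"
    if issue_count > 20 or density > 0.03:
        return "High"
    if issue_count > 10 or density > 0.015:
        return "Medium"
    # The table leaves exactly 10 issues or exactly 1.5% unassigned; treated as Low here.
    return "Low"
```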
Scene-Specific Guidelines
Social Media (社交媒体)
Style: Casual, conversational, like chatting with friends
✅ Short paragraphs (1-3 sentences)
✅ Colloquial expressions (说实话, 没想到, 真的绝了)
✅ Specific details (product names, locations, personal feelings)
✅ Emoji and hashtags
❌ Avoid: 值得注意的是, 总而言之
❌ Avoid: Long paragraphs, complex sentences
Tech Blog (技术博客)
Style: Professional but approachable, can be humorous
✅ Specific tech stack, tool names
✅ Code examples, performance data
✅ Real experiences ("踩过的坑", "实测效果")
✅ Clear structure with headings (not numbered lists)
ai_vocabulary_cn - Chinese AI high-frequency words
filler_phrases_cn - Clichés and replacements
empty_words_cn - Empty grand vocabulary
rhetoric_limits — Rhetoric frequency limits
scene_styles — Scene-specific style configs
Batch Processing
# Scan all files
for f in *.txt; do
  echo "=== $f ==="
  python scripts/detect_cn.py "$f" -s
done
# Transform all markdown (tech blog style)
for f in *.md; do
  python scripts/humanize_cn.py "$f" --scene tech -o "${f%.md}_clean.md"
done
Reference
Based on comprehensive Chinese AI writing research:
Tencent News: "Deconstructing 'AI Flavor': Why We Dislike AI Writing"
53AI: "Detection and Optimization of Article 'AI Flavor'"
AIGCleaner and other Chinese de-AI tools
Wikipedia: "Signs of AI Writing" (English reference)
Key insights:
Perplexity: AI text has low perplexity (predictable word choices)
Burstiness: AI text has low burstiness (uniform sentence structure)
Emotion: AI text lacks strong opinions and personal color
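Burstiness can be approximated without a language model by measuring how much sentence lengths vary. The sketch below uses the coefficient of variation of sentence length in characters; this is one simple proxy and not necessarily what the detection script computes. Perplexity is not sketched because it needs a language model.

```python
import re
import statistics

def burstiness(text):
    """Coefficient of variation of sentence lengths (higher = more varied)."""
    # Split on Chinese and ASCII sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"[。!?!?]+", text) if s.strip()]
    lengths = [len(s) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.pstdev(lengths) / statistics.mean(lengths)
```

Uniform sentences score 0.0, while a short sentence followed by a long one scores much higher.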
multi-search-engine
name: "multi-search-engine"
description: "Multi search engine integration with 17 engines (8 CN + 9 Global). Supports advanced search operators, time filters, site search, privacy engines, and WolframAlpha knowledge queries. No API keys required."
Multi Search Engine v2.0.1
Integration of 17 search engines for web crawling without API keys.
| Operator | Example | Description |
|----------|---------|-------------|
| site: | site:github.com python | Search within site |
| filetype: | filetype:pdf report | Specific file type |
| "" | "machine learning" | Exact match |
| - | python -snake | Exclude term |
| OR | cat OR dog | Either term |
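The operators in the table compose into a single query string that is then URL-encoded before being appended to an engine's search URL. A minimal sketch: `build_query` is a hypothetical helper, and no engine URL is shown because the per-engine URL patterns are not listed here.

```python
from urllib.parse import quote_plus

def build_query(terms, site=None, filetype=None, exact=None, exclude=()):
    """Combine plain terms with site:, filetype:, exact-match, and exclusion operators."""
    parts = []
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    if exact:
        parts.append(f'"{exact}"')
    parts.append(terms)
    parts.extend(f"-{word}" for word in exclude)
    return " ".join(parts)

query = build_query("python", site="github.com", exclude=["snake"])
print(query)              # site:github.com python -snake
print(quote_plus(query))  # site%3Agithub.com+python+-snake
```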
references/advanced-search.md - Domestic search guide
references/international-search.md - International search guide
CHANGELOG.md - Version history
License
MIT
nano-pdf
name: nano-pdf
description: Edit PDFs with natural-language instructions using the nano-pdf CLI.
homepage: https://pypi.org/project/nano-pdf/
metadata: {"clawdbot":{"emoji":"📄","requires":{"bins":["nano-pdf"]},"install":[{"id":"uv","kind":"uv","package":"nano-pdf","bins":["nano-pdf"],"label":"Install nano-pdf (uv)"}]}}
nano-pdf
Use nano-pdf to apply edits to a specific page in a PDF using a natural-language instruction.
Quick start
nano-pdf edit deck.pdf 1 "Change the title to 'Q3 Results' and fix the typo in the subtitle"
Notes:
Page numbers are 0-based or 1-based depending on the tool’s version/config; if the result looks off by one, retry with the other.
Always sanity-check the output PDF before sending it out.
obsidian
name: obsidian
description: Work with Obsidian vaults (plain Markdown notes) and automate via obsidian-cli.
homepage: https://help.obsidian.md
metadata: {"clawdbot":{"emoji":"💎","requires":{"bins":["obsidian-cli"]},"install":[{"id":"brew","kind":"brew","formula":"yakitrak/yakitrak/obsidian-cli","bins":["obsidian-cli"],"label":"Install obsidian-cli (brew)"}]}}
Obsidian
Obsidian vault = a normal folder on disk.
Vault structure (typical)
Notes: *.md (plain text Markdown; edit with any editor)
Config: .obsidian/ (workspace + plugin settings; usually don’t touch from scripts)
Canvases: *.canvas (JSON)
Attachments: whatever folder you chose in Obsidian settings (images/PDFs/etc.)
Find the active vault(s)
Obsidian desktop tracks vaults here (source of truth):
~/Library/Application Support/obsidian/obsidian.json
obsidian-cli resolves vaults from that file; vault name is typically the folder name (path suffix).
Fast “what vault is active / where are the notes?”
If you’ve already set a default:
obsidian-cli print-default --path-only
Otherwise, read ~/Library/Application Support/obsidian/obsidian.json and use the vault entry with "open": true.
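Reading that file from a script might look like the sketch below. It assumes the vault entries sit under a top-level "vaults" object keyed by vault id, each with a "path" and an optional "open" flag; that layout is an assumption and worth checking against the actual file. `open_vault_paths` is a hypothetical helper and the sample data is made up.

```python
import json

def open_vault_paths(config_text):
    """Return the paths of vault entries marked "open": true."""
    config = json.loads(config_text)
    # Assumption: entries live under a top-level "vaults" object keyed by vault id.
    vaults = config.get("vaults", {})
    return [v["path"] for v in vaults.values() if v.get("open") and "path" in v]

# Made-up sample in the assumed layout.
sample = (
    '{"vaults": {"abc123": {"path": "/Users/me/Notes", "ts": 1, "open": true},'
    ' "def456": {"path": "/Users/me/Old", "ts": 0}}}'
)
print(open_vault_paths(sample))
```

To use it on the real file, pass the text of `~/Library/Application Support/obsidian/obsidian.json` after expanding `~`.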
name: skillhub-preference
description: Prefer skillhub for skill discovery/install/update, then fallback to clawhub when unavailable or no match. Use when users ask about skills, 插件, or capability extension.
Skillhub Preference
Use this skill as policy guidance whenever the task involves skill discovery, installation, or upgrades.
Policy
Try skillhub first for search/install/update.
If skillhub is unavailable, rate-limited, or returns no match, fall back to clawhub.
Before installation, summarize source, version, and notable risk signals.
Do not claim exclusivity; both registries are allowed.
For search requests, run skillhub search first and report the command output.
summarize
name: summarize
description: Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube).
homepage: https://summarize.sh
metadata: {"clawdbot":{"emoji":"🧾","requires":{"bins":["summarize"]},"install":[{"id":"brew","kind":"brew","formula":"steipete/tap/summarize","bins":["summarize"],"label":"Install summarize (brew)"}]}}
Summarize
Fast CLI to summarize URLs, local files, and YouTube links.
name: weather
description: Get current weather and forecasts (no API key required).
homepage: https://wttr.in/:help
metadata: {"clawdbot":{"emoji":"🌤️","requires":{"bins":["curl"]}}}
Find coordinates for a city, then query. Returns JSON with temp, windspeed, weathercode.
Docs: https://open-meteo.com/en/docs
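The two-step flow (geocode, then query) can be sketched by building the two request URLs. The endpoints and the `current_weather` parameter are written from memory of the Open-Meteo API, so confirm them against the docs linked above; `geocode_url` and `forecast_url` are hypothetical helper names. The sketch only builds URLs and does not touch the network.

```python
from urllib.parse import urlencode

# Assumed endpoints; verify against the Open-Meteo documentation.
GEOCODE_URL = "https://geocoding-api.open-meteo.com/v1/search"
FORECAST_URL = "https://api.open-meteo.com/v1/forecast"

def geocode_url(city):
    """URL that looks up coordinates for a city name (first result only)."""
    return f"{GEOCODE_URL}?{urlencode({'name': city, 'count': 1})}"

def forecast_url(latitude, longitude):
    """URL that requests current weather for the given coordinates."""
    params = {"latitude": latitude, "longitude": longitude, "current_weather": "true"}
    return f"{FORECAST_URL}?{urlencode(params)}"

print(geocode_url("Berlin"))
print(forecast_url(52.52, 13.41))
```

Either URL can be fetched with curl, which is the only binary this skill requires.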
wordpress-publishing-skill-for-claude
name: wordpress-publisher
description: Publish content directly to WordPress sites via REST API with full Gutenberg block support. Create and publish posts/pages, auto-load and select categories from website, generate SEO-optimized tags, preview articles before publishing, and generate Gutenberg blocks for tables, images, lists, and rich formatting. Use when user wants to publish to WordPress, post to blog, create WordPress article, update WordPress post, or convert markdown to Gutenberg blocks.
author: xCloud
version: 1.0.0
WordPress Publisher
Publish content directly to WordPress sites using the REST API with full Gutenberg block formatting, automatic category selection, SEO tag generation, and preview capabilities.
Complete Workflow Overview
1. CONNECT → Authenticate with WordPress site
2. ANALYZE → Load categories from site, analyze content for best match
3. GENERATE → Create SEO-optimized tags based on content
4. CONVERT → Transform markdown/HTML to Gutenberg blocks
5. PREVIEW → Create draft and verify rendering
6. PUBLISH → Publish or schedule the post
7. VERIFY → Confirm live post renders correctly
Step 1: Connection Setup
Get Credentials
Ask user for:
WordPress site URL (e.g., https://example.com)
WordPress username
Application password (NOT regular password)
How to Create Application Password
Guide user:
Go to Users → Profile in WordPress admin
Scroll to Application Passwords section
Enter name: Claude Publisher
Click Add New Application Password
Copy the generated password (shown only once, with spaces)
Test Connection
from scripts.wp_publisher import WordPressPublisher
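The publisher's own connection call is not shown here. As an independent check, the credentials can be tested directly against the WordPress REST API, which accepts an application password over HTTP Basic auth. The sketch below only builds the request; `build_auth_request` is a hypothetical helper, and the site, username, and password are placeholders.

```python
import base64
import urllib.request

def build_auth_request(site_url, username, app_password):
    """Build a GET request for /wp-json/wp/v2/users/me with Basic auth."""
    token = base64.b64encode(f"{username}:{app_password}".encode("utf-8")).decode("ascii")
    url = site_url.rstrip("/") + "/wp-json/wp/v2/users/me"
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})

req = build_auth_request("https://example.com", "editor", "abcd efgh ijkl mnop")
print(req.full_url)
# Sending it with urllib.request.urlopen(req) returns the user record on success
# and raises urllib.error.HTTPError (for example 401) on bad credentials.
```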
The system analyzes content and selects the most appropriate category:
# Analyze content and suggest best category
suggested_category = wp.suggest_category(
    content=article_content,
    title=article_title,
    available_categories=categories
)
# Or let user choose from available options
print("Available categories:")
for cat in categories:
    print(f" [{cat['id']}] {cat['name']} ({cat['count']} posts)")
Category Selection Logic
Exact match - Title/content contains category name
Keyword match - Category slug matches topic keywords
Parent category - Fall back to broader parent if no match
Create new - Create category if none fit (with user approval)
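The selection order above can be sketched as a cascade that stops at the first hit. This is not the publisher's actual implementation: `pick_category` and its `keywords` input are hypothetical, and each category dict is assumed to carry `name` and `slug` fields as WordPress REST responses do. The parent fallback is left to the caller because the rule for choosing a parent is not specified.

```python
def pick_category(title, content, keywords, categories):
    """Return the best category dict, or None when no existing category fits."""
    text = f"{title} {content}".lower()
    # 1. Exact match: the category name appears in the title or content.
    for cat in categories:
        if cat["name"].lower() in text:
            return cat
    # 2. Keyword match: a topic keyword is one of the slug's hyphen-separated words.
    wanted = {k.lower() for k in keywords}
    for cat in categories:
        if wanted & set(cat["slug"].split("-")):
            return cat
    # 3 and 4. No match: the caller falls back to a parent category, or asks the
    # user before creating a new one.
    return None
```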
Step 3: Generate SEO-Optimized Tags
Automatic Tag Generation
Generate tags that improve Google search visibility:
# Generate tags based on content analysis
tags = wp.generate_seo_tags(
    content=article_content,
    title=article_title,
    max_tags=10
)
Tables are converted with proper Gutenberg structure:
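# Input markdown:
| Feature | Plan A | Plan B |
|---------|--------|--------|
| Price | $10 | $20 |
# Output Gutenberg:
<!-- wp:table -->
<figure class="wp-block-table"><table><thead><tr><th>Feature</th><th>Plan A</th><th>Plan B</th></tr></thead><tbody><tr><td>Price</td><td>$10</td><td>$20</td></tr></tbody></table></figure>
<!-- /wp:table -->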
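A conversion of this kind can be sketched in a few lines. This is a minimal illustration, not the code in the converter script: `table_to_gutenberg` is a hypothetical name, it handles only simple pipe tables with a header row and a separator row, and it does not deal with escaped pipes or column alignment.

```python
from html import escape

def table_to_gutenberg(markdown):
    """Convert a simple pipe table (header, separator, rows) to a wp:table block."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in markdown.strip().splitlines()
    ]
    header, body = rows[0], rows[2:]  # rows[1] is the |---| separator
    head_html = "".join(f"<th>{escape(c)}</th>" for c in header)
    body_html = "".join(
        "<tr>" + "".join(f"<td>{escape(c)}</td>" for c in row) + "</tr>" for row in body
    )
    return (
        "<!-- wp:table -->\n"
        '<figure class="wp-block-table"><table>'
        f"<thead><tr>{head_html}</tr></thead><tbody>{body_html}</tbody>"
        "</table></figure>\n"
        "<!-- /wp:table -->"
    )

print(table_to_gutenberg("| Feature | Plan A | Plan B |\n|---|---|---|\n| Price | $10 | $20 |"))
```

The figure wrapper matters: the troubleshooting table lists a missing wrapper as the usual reason tables fail to render.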
Step 5: Preview Before Publishing
Create Draft for Preview
# Create as draft first
result = wp.create_draft(
    title="Article Title",
    content=gutenberg_content,
    categories=[category_id],
    tags=tag_ids,
    excerpt="Auto-generated or custom excerpt"
)
# Fetch preview page to verify rendering
preview_content = wp.fetch_preview(post_id)
# Check for issues
issues = wp.validate_rendered_content(preview_content)
if issues:
    print("Issues found:")
    for issue in issues:
        print(f" - {issue}")
Preview Checklist
[ ] Title displays correctly
[ ] All headings render (H2, H3, H4)
[ ] Tables render with proper formatting
[ ] Lists display correctly (bullet and numbered)
[ ] Code blocks have syntax highlighting
[ ] Images load (if any)
[ ] Links are clickable
[ ] Category shows correctly
[ ] Tags display in post
Step 6: Publish the Post
Publish Draft
# After preview approval, publish
result = wp.publish_post(post_id)
live_url = result['live_url']
Or Create and Publish Directly
# Full publish workflow in one call
result = wp.publish_content(
    title="Article Title",
    content=gutenberg_content,
    category_names=["Cloud Hosting"],  # By name, auto-resolves to ID
    tag_names=["n8n", "hosting", "automation"],
    status="publish",  # or "draft", "pending", "private", "future"
    excerpt="Custom excerpt for SEO",
    slug="custom-url-slug"
)
Scheduling Posts
# Schedule for future publication
from datetime import datetime, timedelta
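The rest of the scheduling example is missing here. As a stand-in, the sketch below builds the kind of payload the WordPress REST API expects for a scheduled post: status "future" plus a publication time. It bypasses the publisher class, whose scheduling call is not shown, and `scheduled_post_payload` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

def scheduled_post_payload(title, content, publish_at):
    """Build a REST payload that schedules a post for a timezone-aware datetime."""
    # Convert to UTC and drop the offset so the string is a plain ISO 8601 timestamp.
    gmt = publish_at.astimezone(timezone.utc).replace(microsecond=0, tzinfo=None)
    return {
        "title": title,
        "content": content,
        "status": "future",
        "date_gmt": gmt.isoformat(),
    }

tomorrow = datetime.now(timezone.utc) + timedelta(days=1)
print(scheduled_post_payload("Article Title", "...", tomorrow))
```

Using `date_gmt` rather than `date` avoids depending on the site's configured timezone.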
| Issue | Cause | Solution |
|-------|-------|----------|
| Tables not rendering | Missing figure wrapper | Use proper wp:table block structure |
| Code not highlighted | Missing language attribute | Add {"language":"python"} to code block |
| Images broken | Wrong URL or missing media | Upload to WordPress first, use media ID |
| Tags not showing | Theme doesn't display tags | Check theme settings or use different theme |
Complete Example Workflow
from scripts.wp_publisher import WordPressPublisher
from scripts.content_to_gutenberg import convert_to_gutenberg
| Status | Description |
|--------|-------------|
| publish | Live and visible |
| draft | Saved but not visible |
| pending | Awaiting review |
| private | Only visible to admins |
| future | Scheduled for later |
Required Files
scripts/wp_publisher.py - Main publisher class
scripts/content_to_gutenberg.py - Markdown/HTML converter
references/gutenberg-blocks.md - Block format reference
Error Handling
| Error Code | Meaning | Solution |
|------------|---------|----------|
| 401 | Invalid credentials | Check username and application password |
| 403 | Insufficient permissions | User needs Editor or Admin role |
| 404 | Endpoint not found | Verify REST API is enabled |
| 400 | Invalid data | Check category/tag IDs exist |
| 500 | Server error | Retry or check WordPress error logs |
Best Practices
Always preview first - Create as draft, verify, then publish
Use application passwords - Never use regular WordPress password
Select appropriate category - Helps with site organization and SEO
Generate relevant tags - Improves Google discoverability