diff --git a/.gitignore b/.gitignore index 8e8be59..666c383 100644 --- a/.gitignore +++ b/.gitignore @@ -161,4 +161,6 @@ cython_debug/ #.idea/ .czrc -.ruff_cache/ \ No newline at end of file +.ruff_cache/ +.idea +/.idea/ diff --git a/README-zh.md b/README-zh.md new file mode 100644 index 0000000..8000502 --- /dev/null +++ b/README-zh.md @@ -0,0 +1,367 @@ +

+ Pydoll Logo

+

+ +

+ + + + Tests + Ruff CI + Release + MyPy CI + Ask DeepWiki +

+ +

+ Documentation • + Getting Started • + Advanced Features • + Contributing • + Support • + License +

+ +- [英文介绍](README.md) + +## 核心特性 + +🔹 **无需Webdriver!** 从此告别webdriver兼容性地狱 +🔹 **原生验证码绕过!** 平滑处理Cloudflare Turnstile和reCAPTCHA v3验证码 +🔹 **异步性能加持** 闪电般快速的自动化操作 +🔹 **模拟真人交互** 模拟真实用户行为 +🔹 **强大的事件系统** 响应式自动化 +🔹 **多浏览器支持** 支持Chrome以及Edge + +## 为什么选择Pydoll + +想象一下这样的场景: 你正在尝试写一个浏览器自动化任务,比如测试你的网站,从网站抓取数据,或者自动化一些重复性任务。通常来说,这需要解决外部驱动,复杂的配置,和一些无缘无故的兼容性问题。 +不仅如此,目前还有一个更重要的问题: **网站反爬保护系统**,例如 Cloudflare Turnstile 验证码、reCAPTCHA v3 以及能识别传统自动化工具的机器人检测算法。即使你的自动化脚本没有任何bug,这些网站也会直接拦截你。 + +**Pydoll的诞生就是为了解决此类问题!** + +Pydoll从头开始构建,直接通过Chrome DevTools Protocol (CDP协议)连接到浏览器,完全消除了传统自动化框架(例如Selenium)对外部驱动的依赖。更重要的是,它能更逼真地模拟真人操作,并具备更智能的验证码绕过能力,使你的自动化任务与真人行为几乎无法区分。 + +强大的自动化框架不需要复杂配置并且可以更方便地绕过反爬系统。在Pydoll的加持下,你只需要专注于业务逻辑,而不是复杂的底层设计以及绕过反爬系统。 + +## 特点 + +- **智能验证码绕过**: 内置Cloudflare Turnstile与reCAPTCHA v3验证码的自动破解能力,无需依赖外部服务、API密钥或复杂配置。即使遭遇防护系统,您的自动化流程仍可畅行无阻。 +- **模拟真人交互**: 通过先进算法模拟真实人类行为特征:从随机操作间隔,到鼠标移动轨迹、页面滚动模式乃至输入速度,皆可骗过最严苛的反爬虫系统。 +- **极简哲学**: 无需浪费太多时间在配置驱动或解决兼容问题上。Pydoll开箱即用。 +- **原生异步性能**: 基于`asyncio`库深度设计, Pydoll不仅支持异步操作,更为高并发而生,可同时进行多个受防护站点的数据采集。 +- **强大的网络监控**: 轻松实现请求拦截、流量篡改与响应分析,完整掌控网络通信链路,轻松突破层层防护体系。 +- **事件驱动架构**: 实时响应页面事件、网络请求与用户交互,构建能动态适应防护系统的智能自动化流。 +- **直观的元素定位**: 使用符合人类直觉的定位方法 `find()` 和 `query()` ,面对动态加载的防护内容,定位依然精准。 +- **强类型安全**: 完备的类型系统为复杂自动化场景提供更好的IDE支持,并更有效地预防运行时错误。 + +## 安装 + +```bash +pip install pydoll-python +``` + +无需额外的驱动下载,无需复杂的配置,开箱即用。 + +## 快速开始 + +### 开始第一个自动化应用 + +这是一个简单的例子,主要包括打开一个浏览器,访问网站以及和网页元素交互: + +```python +import asyncio +from pydoll.browser import Chrome + +async def my_first_automation(): + # 创建浏览器实例 + async with Chrome() as browser: + # 启动浏览器并获取一个标签 + tab = await browser.start() + + # 访问网站 + await tab.go_to('https://example.com') + + # 直观地查找元素 + button = await tab.find(tag_name='button', class_name='submit') + await button.click() + + # 或者直接使用CSS selectors/XPath表达式 + link = await tab.query('a[href*="contact"]') + await link.click() + +# 运行自动化程序 +asyncio.run(my_first_automation()) +``` + +### 定制配置 + +Pydoll提供了灵活的配置选项 + +```python +import asyncio +from 
pydoll.browser import Chrome +from pydoll.browser.options import ChromiumOptions + +async def custom_automation(): + # 配置浏览器命令行参数 + options = ChromiumOptions() + options.add_argument('--proxy-server=username:password@ip:port') + options.add_argument('--window-size=1920,1080') + options.add_argument('--disable-web-security') + options.binary_location = '/path/to/your/browser' + + async with Chrome(options=options) as browser: + tab = await browser.start() + + # Your automation code here + await tab.go_to('https://example.com') + + # The browser is now using your custom settings + +asyncio.run(custom_automation()) +``` + +## 进阶特性 + +### 智能验证码绕过 + +Pydoll最具代表性的功能之一,是能自动处理现代验证码系统,这些系统通常会检测并拦截自动化工具。这不仅仅是绕过验证码,更是让您的自动化操作在防护系统面前完全隐形。 + +**Supported Captcha Types:** +- **Cloudflare Turnstile** - reCAPTCHA 的现代替代方案 +- **reCAPTCHA v3** - 谷歌的无感验证系统 +- **自定义实现** - 可扩展框架,适配新型验证码 + +```python +import asyncio +from pydoll.browser import Chrome +from pydoll.constants import By + +async def advanced_captcha_bypass(): + async with Chrome() as browser: + tab = await browser.start() + + # Method 1: Context manager (waits for captcha completion) + async with tab.expect_and_bypass_cloudflare_captcha(): + await tab.go_to('https://site-with-cloudflare.com') + print("Cloudflare Turnstile automatically solved!") + + # Continue with your automation - captcha is handled + # find() is a coroutine: await it first, then call methods on the element + username = await tab.find(id='username') + await username.type_text('user@example.com') + password = await tab.find(id='password') + await password.type_text('password123') + login_button = await tab.find(tag_name='button', text='Login') + await login_button.click() + + # Method 2: Background processing (non-blocking) + await tab.enable_auto_solve_cloudflare_captcha() + await tab.go_to('https://another-protected-site.com') + # Captcha solved automatically in background while code continues + + # Method 3: Custom captcha selector for specific implementations + await tab.enable_auto_solve_cloudflare_captcha( + custom_selector=(By.CLASS_NAME, 'custom-captcha-widget'), + time_before_click=3, # Wait 3 seconds before solving + time_to_wait_captcha=10 # 
Timeout after 10 seconds + ) + + await tab.disable_auto_solve_cloudflare_captcha() + +asyncio.run(advanced_captcha_bypass()) +``` + +### 高级元素定位 + +Pydoll 提供多种直观的元素定位方式,无论您的使用习惯如何,总有一种方案适合您: + +```python +import asyncio +from pydoll.browser import Chrome + +async def element_finding_examples(): + async with Chrome() as browser: + tab = await browser.start() + await tab.go_to('https://example.com') + + # 属性匹配 + submit_btn = await tab.find( + tag_name='button', + class_name='btn-primary', + text='Submit' + ) + + # id匹配 + username_field = await tab.find(id='username') + + # 多个元素匹配 + all_links = await tab.find(tag_name='a', find_all=True) + + # CSS selectors 和 XPath表达式 + nav_menu = await tab.query('nav.main-menu') + specific_item = await tab.query('//div[@data-testid="item-123"]') + + # 超时和错误处理 + delayed_element = await tab.find( + class_name='dynamic-content', + timeout=10, + raise_exc=False # Returns None if not found + ) + + # 自定义属性字段匹配 + custom_element = await tab.find( + data_testid='submit-button', + aria_label='Submit form' + ) + +asyncio.run(element_finding_examples()) +``` + +### 并行自动化 + +由于Pydoll是基于异步设计的,能够更好地同时处理多个任务: + +```python +import asyncio +from pydoll.browser import Chrome + +async def scrape_page(url): + """Extract data from a single page""" + async with Chrome() as browser: + tab = await browser.start() + await tab.go_to(url) + + title = await tab.execute_script('return document.title') + links = await tab.find(tag_name='a', find_all=True) + + return { + 'url': url, + 'title': title, + 'link_count': len(links) + } + +async def concurrent_scraping(): + urls = [ + 'https://example1.com', + 'https://example2.com', + 'https://example3.com' + ] + + # Process all URLs simultaneously + tasks = [scrape_page(url) for url in urls] + results = await asyncio.gather(*tasks) + + for result in results: + print(f"{result['url']}: {result['title']} ({result['link_count']} links)") + +asyncio.run(concurrent_scraping()) +``` + +### 事件驱动的自动化 + +Pydoll 
支持实时响应页面事件与用户交互,从而实现更智能、响应更迅捷的自动化流程: + +```python +import asyncio +from pydoll.browser import Chrome +from pydoll.protocol.page.events import PageEvent + +async def event_driven_automation(): + async with Chrome() as browser: + tab = await browser.start() + + # Enable page events + await tab.enable_page_events() + + # React to page load + async def on_page_load(event): + print("Page loaded! Starting automation...") + # Perform actions after page loads + search_box = await tab.find(id='search-box') + await search_box.type_text('automation') + + # React to navigation + async def on_navigation(event): + url = event['params']['url'] + print(f"Navigated to: {url}") + + await tab.on(PageEvent.LOAD_EVENT_FIRED, on_page_load) + await tab.on(PageEvent.FRAME_NAVIGATED, on_navigation) + + await tab.go_to('https://example.com') + await asyncio.sleep(5) # Let events process + +asyncio.run(event_driven_automation()) +``` + +### 处理 iframe 内容 + +Pydoll 通过`get_frame()`方法提供无缝的 iframe 交互能力,尤其适用于处理嵌入式内容: + +```python +import asyncio +from pydoll.browser.chromium import Chrome + +async def iframe_interaction(): + async with Chrome() as browser: + tab = await browser.start() + await tab.go_to('https://example.com/page-with-iframe') + + # 查找iframe元素 + iframe_element = await tab.query('.hcaptcha-iframe', timeout=10) + + # 从iframe中获取一个tab实例 + frame = await tab.get_frame(iframe_element) + + # Now interact with elements inside the iframe + submit_button = await frame.find(tag_name='button', class_name='submit') + await submit_button.click() + + # You can use all Tab methods on the frame + form_input = await frame.find(id='captcha-input') + await form_input.type_text('verification-code') + + # Find elements by various methods + links = await frame.find(tag_name='a', find_all=True) + specific_element = await frame.query('#specific-id') + +asyncio.run(iframe_interaction()) +``` + +## 文档 + +如需完整文档、详细示例以及想要深入了解Pydoll的特性,请访问[官方文档](https://autoscrape-labs.github.io/pydoll/) + +文档包含以下内容: +- **入门指南** 
- 手把手教程 +- **API参考** - 完整的方法文档说明 +- **高级技巧** - 网络请求拦截、事件处理、性能优化 +- **故障排查** - 常见问题及解决方案 +- **最佳实践** - 构建稳定自动化流程的推荐方案 + +## 贡献 + +诚邀您携手,共铸 Pydoll 更佳体验!请参阅[贡献指南](CONTRIBUTING.md)开启协作。无论是修复问题、添加新功能,还是完善文档,所有贡献我们都非常欢迎! + +请务必遵循以下规范: +- 为新功能或 Bug 修复编写测试 +- 遵守代码风格与项目规范 +- 提交 Pull Request 时使用约定式提交(Conventional Commits) +- 提交前运行代码检查(Lint)和测试 + +## 赞助我们 + +如果你觉得本项目对你有帮助,可以考虑[赞助我们](https://github.com/sponsors/thalissonvs)。 +你将获得独家优先支持、定制需求以及更多福利! + +现在不能赞助?无妨,你可以通过以下方式支持我们: +- ⭐ Star 本项目 +- 📢 社交平台分享 +- ✍️ 撰写教程或博文 +- 🐛 反馈建议或提交issues + +点滴相助,铭记于心,诚谢! + +## 许可 + +Pydoll 是基于 [MIT License](LICENSE) 发布的开源软件。 + +

+ Pydoll — Making browser automation magical! +

diff --git a/README.md b/README.md index b78746a..e95d8fd 100644 --- a/README.md +++ b/README.md @@ -23,6 +23,8 @@

+- [简体中文介绍](README-zh.md) + ## Key Features - **Zero Webdrivers!** Say goodbye to webdriver compatibility nightmares diff --git a/docs/api/commands/browser.md b/docs/api/commands/browser.md deleted file mode 100644 index 5ebe4a5..0000000 --- a/docs/api/commands/browser.md +++ /dev/null @@ -1,41 +0,0 @@ -# Browser Commands - -Browser commands provide low-level control over browser instances and their configuration. - -## Overview - -The browser commands module handles browser-level operations such as version information, target management, and browser-wide settings. - -::: pydoll.commands.browser_commands - options: - show_root_heading: true - show_source: false - heading_level: 2 - filters: - - "!^_" - - "!^__" - -## Usage - -Browser commands are typically used internally by browser classes to manage browser instances: - -```python -from pydoll.commands.browser_commands import get_version -from pydoll.connection.connection_handler import ConnectionHandler - -# Get browser version information -connection = ConnectionHandler() -version_info = await get_version(connection) -``` - -## Available Commands - -The browser commands module provides functions for: - -- Getting browser version and user agent information -- Managing browser targets (tabs, windows) -- Controlling browser-wide settings and permissions -- Handling browser lifecycle events - -!!! note "Internal Usage" - These commands are primarily used internally by the `Chrome` and `Edge` browser classes. Direct usage is recommended only for advanced scenarios. \ No newline at end of file diff --git a/docs/api/commands/dom.md b/docs/api/commands/dom.md deleted file mode 100644 index a61baef..0000000 --- a/docs/api/commands/dom.md +++ /dev/null @@ -1,58 +0,0 @@ -# DOM Commands - -DOM commands provide comprehensive functionality for interacting with the Document Object Model of web pages. 
- -## Overview - -The DOM commands module is one of the most important modules in Pydoll, providing all the functionality needed to find, interact with, and manipulate HTML elements on web pages. - -::: pydoll.commands.dom_commands - options: - show_root_heading: true - show_source: false - heading_level: 2 - filters: - - "!^_" - - "!^__" - -## Usage - -DOM commands are used extensively by the `WebElement` class and element finding methods: - -```python -from pydoll.commands.dom_commands import query_selector, get_attributes -from pydoll.connection.connection_handler import ConnectionHandler - -# Find element and get its attributes -connection = ConnectionHandler() -node_id = await query_selector(connection, selector="#username") -attributes = await get_attributes(connection, node_id=node_id) -``` - -## Key Functionality - -The DOM commands module provides functions for: - -### Element Finding -- `query_selector()` - Find single element by CSS selector -- `query_selector_all()` - Find multiple elements by CSS selector -- `get_document()` - Get the document root node - -### Element Interaction -- `click_element()` - Click on elements -- `focus_element()` - Focus elements -- `set_attribute_value()` - Set element attributes -- `get_attributes()` - Get element attributes - -### Element Information -- `get_box_model()` - Get element positioning and dimensions -- `describe_node()` - Get detailed element information -- `get_outer_html()` - Get element HTML content - -### DOM Manipulation -- `remove_node()` - Remove elements from DOM -- `set_node_value()` - Set element values -- `request_child_nodes()` - Get child elements - -!!! tip "High-Level APIs" - While these commands provide powerful low-level access, most users should use the higher-level `WebElement` class methods like `click()`, `type_text()`, and `get_attribute()` which use these commands internally. 
\ No newline at end of file diff --git a/docs/api/commands/input.md b/docs/api/commands/input.md deleted file mode 100644 index ad0157e..0000000 --- a/docs/api/commands/input.md +++ /dev/null @@ -1,75 +0,0 @@ -# Input Commands - -Input commands handle mouse and keyboard interactions, providing human-like input simulation. - -## Overview - -The input commands module provides functionality for simulating user input including mouse movements, clicks, keyboard typing, and key presses. - -::: pydoll.commands.input_commands - options: - show_root_heading: true - show_source: false - heading_level: 2 - filters: - - "!^_" - - "!^__" - -## Usage - -Input commands are used by element interaction methods and can be used directly for advanced input scenarios: - -```python -from pydoll.commands.input_commands import dispatch_mouse_event, dispatch_key_event -from pydoll.connection.connection_handler import ConnectionHandler - -# Simulate mouse click -connection = ConnectionHandler() -await dispatch_mouse_event( - connection, - type="mousePressed", - x=100, - y=200, - button="left" -) - -# Simulate keyboard typing -await dispatch_key_event( - connection, - type="keyDown", - key="Enter" -) -``` - -## Key Functionality - -The input commands module provides functions for: - -### Mouse Events -- `dispatch_mouse_event()` - Mouse clicks, movements, and wheel events -- Mouse button states (left, right, middle) -- Coordinate-based positioning -- Drag and drop operations - -### Keyboard Events -- `dispatch_key_event()` - Key press and release events -- `insert_text()` - Direct text insertion -- Special key handling (Enter, Tab, Arrow keys, etc.) 
-- Modifier keys (Ctrl, Alt, Shift) - -### Touch Events -- Touch screen simulation -- Multi-touch gestures -- Touch coordinates and pressure - -## Human-like Behavior - -The input commands support human-like behavior patterns: - -- Natural mouse movement curves -- Realistic typing speeds and patterns -- Random micro-delays between actions -- Pressure-sensitive touch events - -!!! tip "Element Methods" - For most use cases, use the higher-level element methods like `element.click()` and `element.type_text()` which provide a more convenient API and handle common scenarios automatically. \ No newline at end of file diff --git a/docs/api/commands/network.md b/docs/api/commands/network.md deleted file mode 100644 index 4e1b947..0000000 --- a/docs/api/commands/network.md +++ /dev/null @@ -1,100 +0,0 @@ -# Network Commands - -Network commands provide comprehensive control over network requests, responses, and browser networking behavior. - -## Overview - -The network commands module enables request interception, response modification, cookie management, and network monitoring capabilities. 
- -::: pydoll.commands.network_commands - options: - show_root_heading: true - show_source: false - heading_level: 2 - filters: - - "!^_" - - "!^__" - -## Usage - -Network commands are used for advanced scenarios like request interception and network monitoring: - -```python -from pydoll.commands.network_commands import enable, set_request_interception -from pydoll.connection.connection_handler import ConnectionHandler - -# Enable network monitoring -connection = ConnectionHandler() -await enable(connection) - -# Enable request interception -await set_request_interception(connection, patterns=[{"urlPattern": "*"}]) -``` - -## Key Functionality - -The network commands module provides functions for: - -### Request Management -- `enable()` / `disable()` - Enable/disable network monitoring -- `set_request_interception()` - Intercept and modify requests -- `continue_intercepted_request()` - Continue or modify intercepted requests -- `get_request_post_data()` - Get request body data - -### Response Handling -- `get_response_body()` - Get response content -- `fulfill_request()` - Provide custom responses -- `fail_request()` - Simulate network failures - -### Cookie Management -- `get_cookies()` - Get browser cookies -- `set_cookies()` - Set browser cookies -- `delete_cookies()` - Delete specific cookies -- `clear_browser_cookies()` - Clear all cookies - -### Cache Control -- `clear_browser_cache()` - Clear browser cache -- `set_cache_disabled()` - Disable browser cache -- `get_response_body_for_interception()` - Get cached responses - -### Security & Headers -- `set_user_agent_override()` - Override user agent -- `set_extra_http_headers()` - Add custom headers -- `emulate_network_conditions()` - Simulate network conditions - -## Advanced Use Cases - -### Request Interception -```python -# Intercept and modify requests -await set_request_interception(connection, patterns=[ - {"urlPattern": "*/api/*", "requestStage": "Request"} -]) - -# Handle intercepted request -async def 
handle_request(request): - if "api/login" in request.url: - # Modify request headers - headers = request.headers.copy() - headers["Authorization"] = "Bearer token" - await continue_intercepted_request( - connection, - request_id=request.request_id, - headers=headers - ) -``` - -### Response Mocking -```python -# Mock API responses -await fulfill_request( - connection, - request_id=request_id, - response_code=200, - response_headers={"Content-Type": "application/json"}, - body='{"status": "success"}' -) -``` - -!!! warning "Performance Impact" - Network interception can impact page loading performance. Use selectively and disable when not needed. \ No newline at end of file diff --git a/docs/api/commands/page.md b/docs/api/commands/page.md deleted file mode 100644 index 065505b..0000000 --- a/docs/api/commands/page.md +++ /dev/null @@ -1,98 +0,0 @@ -# Page Commands - -Page commands handle page navigation, lifecycle events, and page-level operations. - -## Overview - -The page commands module provides functionality for navigating between pages, managing page lifecycle, handling JavaScript execution, and controlling page behavior. 
- -::: pydoll.commands.page_commands - options: - show_root_heading: true - show_source: false - heading_level: 2 - filters: - - "!^_" - - "!^__" - -## Usage - -Page commands are used extensively by the `Tab` class for navigation and page management: - -```python -from pydoll.commands.page_commands import navigate, reload, enable -from pydoll.connection.connection_handler import ConnectionHandler - -# Navigate to a URL -connection = ConnectionHandler() -await enable(connection) # Enable page events -await navigate(connection, url="https://example.com") - -# Reload the page -await reload(connection) -``` - -## Key Functionality - -The page commands module provides functions for: - -### Navigation -- `navigate()` - Navigate to URLs -- `reload()` - Reload current page -- `go_back()` - Navigate back in history -- `go_forward()` - Navigate forward in history -- `stop_loading()` - Stop page loading - -### Page Lifecycle -- `enable()` / `disable()` - Enable/disable page events -- `get_frame_tree()` - Get page frame structure -- `get_navigation_history()` - Get navigation history - -### Content Management -- `get_resource_content()` - Get page resource content -- `search_in_resource()` - Search within page resources -- `set_document_content()` - Set page HTML content - -### Screenshots & PDF -- `capture_screenshot()` - Take page screenshots -- `print_to_pdf()` - Generate PDF from page -- `capture_snapshot()` - Capture page snapshots - -### JavaScript Execution -- `add_script_to_evaluate_on_new_document()` - Add startup scripts -- `remove_script_to_evaluate_on_new_document()` - Remove startup scripts - -### Page Settings -- `set_lifecycle_events_enabled()` - Control lifecycle events -- `set_ad_blocking_enabled()` - Enable/disable ad blocking -- `set_bypass_csp()` - Bypass Content Security Policy - -## Advanced Features - -### Frame Management -```python -# Get all frames in the page -frame_tree = await get_frame_tree(connection) -for frame in frame_tree.child_frames: - 
print(f"Frame: {frame.frame.url}") -``` - -### Resource Interception -```python -# Get resource content -content = await get_resource_content( - connection, - frame_id=frame_id, - url="https://example.com/script.js" -) -``` - -### Page Events -The page commands work with various page events: -- `Page.loadEventFired` - Page load completed -- `Page.domContentEventFired` - DOM content loaded -- `Page.frameNavigated` - Frame navigation -- `Page.frameStartedLoading` - Frame loading started - -!!! tip "Tab Class Integration" - Most page operations are available through the `Tab` class methods like `tab.go_to()`, `tab.reload()`, and `tab.screenshot()` which provide a more convenient API. \ No newline at end of file diff --git a/docs/api/commands/runtime.md b/docs/api/commands/runtime.md deleted file mode 100644 index a667504..0000000 --- a/docs/api/commands/runtime.md +++ /dev/null @@ -1,110 +0,0 @@ -# Runtime Commands - -Runtime commands provide JavaScript execution capabilities and runtime environment management. - -## Overview - -The runtime commands module enables JavaScript code execution, object inspection, and runtime environment control within browser contexts. 
- -::: pydoll.commands.runtime_commands - options: - show_root_heading: true - show_source: false - heading_level: 2 - filters: - - "!^_" - - "!^__" - -## Usage - -Runtime commands are used for JavaScript execution and runtime management: - -```python -from pydoll.commands.runtime_commands import evaluate, enable -from pydoll.connection.connection_handler import ConnectionHandler - -# Enable runtime events -connection = ConnectionHandler() -await enable(connection) - -# Execute JavaScript -result = await evaluate( - connection, - expression="document.title", - return_by_value=True -) -print(result.value) # Page title -``` - -## Key Functionality - -The runtime commands module provides functions for: - -### JavaScript Execution -- `evaluate()` - Execute JavaScript expressions -- `call_function_on()` - Call functions on objects -- `compile_script()` - Compile JavaScript for reuse -- `run_script()` - Run compiled scripts - -### Object Management -- `get_properties()` - Get object properties -- `release_object()` - Release object references -- `release_object_group()` - Release object groups - -### Runtime Control -- `enable()` / `disable()` - Enable/disable runtime events -- `discard_console_entries()` - Clear console entries -- `set_custom_object_formatter_enabled()` - Enable custom formatters - -### Exception Handling -- `set_async_call_stack_depth()` - Set call stack depth -- Exception capture and reporting -- Error object inspection - -## Advanced Usage - -### Complex JavaScript Execution -```python -# Execute complex JavaScript with error handling -script = """ -try { - const elements = document.querySelectorAll('.item'); - return Array.from(elements).map(el => ({ - text: el.textContent, - href: el.href - })); -} catch (error) { - return { error: error.message }; -} -""" - -result = await evaluate( - connection, - expression=script, - return_by_value=True, - await_promise=True -) -``` - -### Object Inspection -```python -# Get detailed object properties 
-properties = await get_properties( - connection, - object_id=object_id, - own_properties=True, - accessor_properties_only=False -) - -for prop in properties: - print(f"{prop.name}: {prop.value}") -``` - -### Console Integration -Runtime commands integrate with browser console: -- Console messages and errors -- Console API method calls -- Custom console formatters - -!!! note "Performance Considerations" - JavaScript execution through runtime commands can be slower than native browser execution. Use judiciously for complex operations. \ No newline at end of file diff --git a/docs/api/commands/storage.md b/docs/api/commands/storage.md deleted file mode 100644 index 420bcfb..0000000 --- a/docs/api/commands/storage.md +++ /dev/null @@ -1,131 +0,0 @@ -# Storage Commands - -Storage commands provide comprehensive browser storage management including cookies, localStorage, sessionStorage, and IndexedDB. - -## Overview - -The storage commands module enables management of all browser storage mechanisms, providing functionality for data persistence and retrieval. 
- -::: pydoll.commands.storage_commands - options: - show_root_heading: true - show_source: false - heading_level: 2 - filters: - - "!^_" - - "!^__" - -## Usage - -Storage commands are used for managing browser storage across different mechanisms: - -```python -from pydoll.commands.storage_commands import get_cookies, set_cookies, clear_data_for_origin -from pydoll.connection.connection_handler import ConnectionHandler - -# Get cookies for a domain -connection = ConnectionHandler() -cookies = await get_cookies(connection, urls=["https://example.com"]) - -# Set a new cookie -await set_cookies(connection, cookies=[{ - "name": "session_id", - "value": "abc123", - "domain": "example.com", - "path": "/", - "httpOnly": True, - "secure": True -}]) - -# Clear all storage for an origin -await clear_data_for_origin( - connection, - origin="https://example.com", - storage_types="all" -) -``` - -## Key Functionality - -The storage commands module provides functions for: - -### Cookie Management -- `get_cookies()` - Get cookies by URL or domain -- `set_cookies()` - Set new cookies -- `delete_cookies()` - Delete specific cookies -- `clear_cookies()` - Clear all cookies - -### Local Storage -- `get_dom_storage_items()` - Get localStorage items -- `set_dom_storage_item()` - Set localStorage item -- `remove_dom_storage_item()` - Remove localStorage item -- `clear_dom_storage()` - Clear localStorage - -### Session Storage -- Session storage operations (similar to localStorage) -- Session-specific data management -- Tab-isolated storage - -### IndexedDB -- `get_database_names()` - Get IndexedDB databases -- `request_database()` - Access database structure -- `request_data()` - Query database data -- `clear_object_store()` - Clear object stores - -### Cache Storage -- `request_cache_names()` - Get cache names -- `request_cached_response()` - Get cached responses -- `delete_cache()` - Delete cache entries - -### Application Cache (Deprecated) -- Legacy application cache support -- 
Manifest-based caching - -## Advanced Features - -### Bulk Operations -```python -# Clear all storage types for multiple origins -origins = ["https://example.com", "https://api.example.com"] -for origin in origins: - await clear_data_for_origin( - connection, - origin=origin, - storage_types="cookies,local_storage,session_storage,indexeddb" - ) -``` - -### Storage Quotas -```python -# Get storage quota information -quota_info = await get_usage_and_quota(connection, origin="https://example.com") -print(f"Used: {quota_info.usage} bytes") -print(f"Quota: {quota_info.quota} bytes") -``` - -### Cross-Origin Storage -```python -# Manage storage across different origins -await set_cookies(connection, cookies=[{ - "name": "cross_site_token", - "value": "token123", - "domain": ".example.com", # Applies to all subdomains - "sameSite": "None", - "secure": True -}]) -``` - -## Storage Types - -The module supports various storage mechanisms: - -| Storage Type | Persistence | Scope | Capacity | -|--------------|-------------|-------|----------| -| Cookies | Persistent | Domain/Path | ~4KB per cookie | -| localStorage | Persistent | Origin | ~5-10MB | -| sessionStorage | Session | Tab | ~5-10MB | -| IndexedDB | Persistent | Origin | Large (GB+) | -| Cache API | Persistent | Origin | Large | - -!!! warning "Privacy Considerations" - Storage operations can affect user privacy. Always handle storage data responsibly and in compliance with privacy regulations. \ No newline at end of file diff --git a/docs/api/elements/mixins.md b/docs/api/elements/mixins.md deleted file mode 100644 index 43e15d1..0000000 --- a/docs/api/elements/mixins.md +++ /dev/null @@ -1,39 +0,0 @@ -# Element Mixins - -The mixins module provides reusable functionality that can be mixed into element classes to extend their capabilities. - -## Find Elements Mixin - -The `FindElementsMixin` provides element finding capabilities to classes that include it. 
- -::: pydoll.elements.mixins.find_elements_mixin - options: - show_root_heading: true - show_source: false - heading_level: 2 - filters: - - "!^_" - - "!^__" - -## Usage - -Mixins are typically used internally by the library to compose functionality. The `FindElementsMixin` is used by classes like `Tab` and `WebElement` to provide element finding methods: - -```python -# These methods come from FindElementsMixin -element = await tab.find(id="username") -elements = await tab.find(class_name="item", find_all=True) -element = await tab.query("#submit-button") -``` - -## Available Methods - -The `FindElementsMixin` provides several methods for finding elements: - -- `find()` - Modern element finding with keyword arguments -- `query()` - CSS selector and XPath queries -- `find_element()` - Legacy element finding method -- `find_elements()` - Legacy method for finding multiple elements - -!!! tip "Modern vs Legacy" - The `find()` method is the modern, recommended approach for finding elements. The `find_element()` and `find_elements()` methods are maintained for backward compatibility. \ No newline at end of file diff --git a/docs/api/index.md b/docs/api/index.md deleted file mode 100644 index 48acf3a..0000000 --- a/docs/api/index.md +++ /dev/null @@ -1,138 +0,0 @@ -# API Reference - -Welcome to the Pydoll API Reference! This section provides comprehensive documentation for all classes, methods, and functions available in the Pydoll library. - -## Overview - -Pydoll is organized into several key modules, each serving a specific purpose in browser automation: - -### Browser Module -The browser module contains classes for managing browser instances and their lifecycle. 
- -- **[Chrome](browser/chrome.md)** - Chrome browser automation -- **[Edge](browser/edge.md)** - Microsoft Edge browser automation -- **[Options](browser/options.md)** - Browser configuration options -- **[Tab](browser/tab.md)** - Tab management and interaction -- **[Managers](browser/managers.md)** - Browser lifecycle managers - -### Elements Module -The elements module provides classes for interacting with web page elements. - -- **[WebElement](elements/web_element.md)** - Individual element interaction -- **[Mixins](elements/mixins.md)** - Reusable element functionality - -### Connection Module -The connection module handles communication with the browser through the Chrome DevTools Protocol. - -- **[Connection Handler](connection/connection.md)** - WebSocket connection management -- **[Managers](connection/managers.md)** - Connection lifecycle managers - -### Commands Module -The commands module provides low-level Chrome DevTools Protocol command implementations. - -- **[Commands Overview](commands/index.md)** - CDP command implementations by domain - -### Protocol Module -The protocol module implements the Chrome DevTools Protocol commands and events. - -- **[Commands](protocol/commands.md)** - CDP command implementations -- **[Events](protocol/events.md)** - CDP event handling - -### Core Module -The core module contains fundamental utilities, constants, and exceptions. 
- -- **[Constants](core/constants.md)** - Library constants and enums -- **[Exceptions](core/exceptions.md)** - Custom exception classes -- **[Utils](core/utils.md)** - Utility functions - -## Quick Navigation - -### Most Common Classes - -| Class | Purpose | Module | -|-------|---------|--------| -| `Chrome` | Chrome browser automation | `pydoll.browser.chromium` | -| `Edge` | Edge browser automation | `pydoll.browser.chromium` | -| `Tab` | Tab interaction and control | `pydoll.browser.tab` | -| `WebElement` | Element interaction | `pydoll.elements.web_element` | -| `ChromiumOptions` | Browser configuration | `pydoll.browser.options` | - -### Key Enums and Constants - -| Name | Purpose | Module | -|------|---------|--------| -| `By` | Element selector strategies | `pydoll.constants` | -| `Key` | Keyboard key constants | `pydoll.constants` | -| `PermissionType` | Browser permission types | `pydoll.constants` | - -### Common Exceptions - -| Exception | When Raised | Module | -|-----------|-------------|--------| -| `ElementNotFound` | Element not found in DOM | `pydoll.exceptions` | -| `WaitElementTimeout` | Element wait timeout | `pydoll.exceptions` | -| `BrowserNotStarted` | Browser not started | `pydoll.exceptions` | - -## Usage Patterns - -### Basic Browser Automation - -```python -from pydoll.browser.chromium import Chrome - -async with Chrome() as browser: - tab = await browser.start() - await tab.go_to("https://example.com") - element = await tab.find(id="my-element") - await element.click() -``` - -### Element Finding - -```python -# Using the modern find() method -element = await tab.find(id="username") -element = await tab.find(tag_name="button", class_name="submit") - -# Using CSS selectors or XPath -element = await tab.query("#username") -element = await tab.query("//button[@class='submit']") -``` - -### Event Handling - -```python -await tab.enable_page_events() -await tab.on('Page.loadEventFired', handle_page_load) -``` - -## Type Hints - -Pydoll is 
fully typed and provides comprehensive type hints for better IDE support and code safety. All public APIs include proper type annotations. - -```python -from typing import Optional, List -from pydoll.elements.web_element import WebElement - -# Methods return properly typed objects -element: Optional[WebElement] = await tab.find(id="test", raise_exc=False) -elements: List[WebElement] = await tab.find(class_name="item", find_all=True) -``` - -## Async/Await Support - -All Pydoll operations are asynchronous and must be used with `async`/`await`: - -```python -import asyncio - -async def main(): - # All Pydoll operations are async - async with Chrome() as browser: - tab = await browser.start() - await tab.go_to("https://example.com") - -asyncio.run(main()) -``` - -Browse the sections below to explore the complete API documentation for each module. \ No newline at end of file diff --git a/docs/deep-dive/cdp.md b/docs/deep-dive/cdp.md deleted file mode 100644 index 6f2f4e4..0000000 --- a/docs/deep-dive/cdp.md +++ /dev/null @@ -1,185 +0,0 @@ -# Chrome DevTools Protocol (CDP) - -The Chrome DevTools Protocol (CDP) is the foundation that enables Pydoll to control browsers without traditional webdrivers. Understanding how CDP works provides valuable insight into Pydoll's capabilities and internal architecture. - - -## What is CDP? - -The Chrome DevTools Protocol is a powerful interface developed by the Chromium team that allows programmatic interaction with Chromium-based browsers. It's the same protocol used by Chrome DevTools when you inspect a webpage, but exposed as a programmable API that can be leveraged by automation tools. - -At its core, CDP provides a comprehensive set of methods and events for interfacing with browser internals. This allows for fine-grained control over every aspect of the browser, from navigating between pages to manipulating the DOM, intercepting network requests, and monitoring performance metrics. - -!!! 
info "CDP Evolution" - The Chrome DevTools Protocol has been continuously evolving since its introduction. Google maintains and updates the protocol with each Chrome release, regularly adding new functionality and improving existing features. - - While the protocol was initially designed for Chrome's DevTools, its comprehensive capabilities have made it the foundation for next-generation browser automation tools like Puppeteer, Playwright, and of course, Pydoll. - -## WebSocket Communication - -One of the key architectural decisions in CDP is its use of WebSockets for communication. When a Chromium-based browser is started with the remote debugging flag enabled, it opens a WebSocket server on a specified port: - -``` -chrome --remote-debugging-port=9222 -``` - -Pydoll connects to this WebSocket endpoint to establish a bidirectional communication channel with the browser. This connection: - -1. **Remains persistent** throughout the automation session -2. **Enables real-time events** from the browser to be pushed to the client -3. **Allows commands** to be sent to the browser -4. **Supports binary data** for efficient transfer of screenshots, PDFs, and other assets - -The WebSocket protocol is particularly well-suited for browser automation because it provides: - -- **Low latency communication** - Necessary for responsive automation -- **Bidirectional messaging** - Essential for event-driven architecture -- **Persistent connections** - Eliminating connection setup overhead for each operation - -Here's a simplified view of how Pydoll's communication with the browser works: - -```mermaid -sequenceDiagram - participant App as Pydoll Application - participant WS as WebSocket Connection - participant Browser as Chrome Browser - - App ->> WS: Command: navigate to URL - WS ->> Browser: Execute navigation - - Browser -->> WS: Send page load event - WS -->> App: Receive page load event -``` - -!!! 
info "WebSocket vs HTTP" - Earlier browser automation protocols often relied on HTTP endpoints for communication. CDP's switch to WebSockets represents a significant architectural improvement that enables more responsive automation and real-time event monitoring. - - HTTP-based protocols require continuous polling to detect changes, creating overhead and delays. WebSockets allow the browser to push notifications to your automation script exactly when events occur, with minimal latency. - -## Key CDP Domains - -CDP is organized into logical domains, each responsible for a specific aspect of browser functionality. Some of the most important domains include: - - -| Domain | Responsibility | Example Use Cases | -|--------|----------------|------------------| -| **Browser** | Control of the browser application itself | Window management, browser context creation | -| **Page** | Interaction with page lifecycle | Navigation, JavaScript execution, frame management | -| **DOM** | Access to page structure | Query selectors, attribute modification, event listeners | -| **Network** | Network traffic monitoring and control | Request interception, response examination, caching | -| **Runtime** | JavaScript execution environment | Evaluate expressions, call functions, handle exceptions | -| **Input** | Simulating user interactions | Mouse movements, keyboard input, touch events | -| **Target** | Managing browser contexts and targets | Creating tabs, accessing iframes, handling popups | -| **Fetch** | Low-level network interception | Modifying requests, simulating responses, authentication | - -Pydoll maps these CDP domains to a more intuitive API structure while preserving the full capabilities of the underlying protocol. - -## Event-Driven Architecture - -One of CDP's most powerful features is its event system. The protocol allows clients to subscribe to various events that the browser emits during normal operation. 
These events cover virtually every aspect of browser behavior: - -- **Lifecycle events**: Page loads, frame navigation, target creation -- **DOM events**: Element changes, attribute modifications -- **Network events**: Request/response cycles, WebSocket messages -- **Execution events**: JavaScript exceptions, console messages -- **Performance events**: Metrics for rendering, scripting, and more - - -When you enable event monitoring in Pydoll (e.g., with `page.enable_network_events()`), the library sets up the necessary subscriptions with the browser and provides hooks for your code to react to these events. - -```python -from pydoll.events.network import NetworkEvents -from functools import partial - -async def on_request(page, event): - url = event['params']['request']['url'] - print(f"Request to: {url}") - -# Subscribe to network request events -await page.enable_network_events() -await page.on(NetworkEvents.REQUEST_WILL_BE_SENT, partial(on_request, page)) -``` - -This event-driven approach allows automation scripts to react immediately to browser state changes without relying on inefficient polling or arbitrary delays. - -## Performance Advantages of Direct CDP Integration - -Using CDP directly, as Pydoll does, offers several performance advantages over traditional webdriver-based automation: - -### 1. Elimination of Protocol Translation Layer - -Traditional webdriver-based tools like Selenium use a multi-layered approach: - -```mermaid -graph LR - AS[Automation Script] --> WC[WebDriver Client] - WC --> WS[WebDriver Server] - WS --> B[Browser] -``` - -Each layer adds overhead, especially the WebDriver server, which acts as a translation layer between the WebDriver protocol and the browser's native APIs. 
- -Pydoll's approach streamlines this to: - -```mermaid -graph LR - AS[Automation Script] --> P[Pydoll] - P --> B[Browser via CDP] -``` - -This direct communication eliminates the computational and network overhead of the intermediate server, resulting in faster operations. - -### 2. Efficient Command Batching - -CDP allows for the batching of multiple commands in a single message, reducing the number of round trips required for complex operations. This is particularly valuable for operations that require several steps, such as finding an element and then interacting with it. - -### 3. Asynchronous Operation - -CDP's WebSocket-based, event-driven architecture aligns perfectly with Python's asyncio framework, enabling true asynchronous operation. This allows Pydoll to: - -- Execute multiple operations concurrently -- Process events as they occur -- Avoid blocking the main thread during I/O operations - -```mermaid -graph TD - subgraph "Pydoll Async Architecture" - EL[Event Loop] - - subgraph "Concurrent Tasks" - T1[Task 1: Navigate] - T2[Task 2: Wait for Element] - T3[Task 3: Handle Network Events] - end - - EL --> T1 - EL --> T2 - EL --> T3 - - T1 --> WS[WebSocket Connection] - T2 --> WS - T3 --> WS - - WS --> B[Browser] - end -``` - -!!! info "Async Performance Gains" - The combination of asyncio and CDP creates a multiplicative effect on performance. In benchmark tests, Pydoll's asynchronous approach can process multiple pages in parallel with near-linear scaling, while traditional synchronous tools see diminishing returns as concurrency increases. - - For example, scraping 10 pages that each take 2 seconds to load might take over 20 seconds with a synchronous tool, but just over 2 seconds with Pydoll's async architecture (plus some minimal overhead). - -### 4. Fine-Grained Control - -CDP provides more granular control over browser behavior than the WebDriver protocol. 
This allows Pydoll to implement optimized strategies for common operations: - -- More precise waiting conditions (vs. arbitrary timeouts) -- Direct access to browser caches and storage -- Targeted JavaScript execution in specific contexts -- Detailed network control for request optimization - - -## Conclusion - -The Chrome DevTools Protocol forms the foundation of Pydoll's zero-webdriver approach to browser automation. By leveraging CDP's WebSocket communication, comprehensive domain coverage, event-driven architecture, and direct browser integration, Pydoll achieves superior performance and reliability compared to traditional automation tools. - -In the following sections, we'll dive deeper into how Pydoll implements specific CDP domains and transforms the low-level protocol into an intuitive, developer-friendly API. \ No newline at end of file diff --git a/docs/deep-dive/connection-layer.md b/docs/deep-dive/connection-layer.md deleted file mode 100644 index dc48e96..0000000 --- a/docs/deep-dive/connection-layer.md +++ /dev/null @@ -1,473 +0,0 @@ -# Connection Handler - -The Connection Handler is the foundational layer of Pydoll's architecture, serving as the bridge between your Python code and the browser's Chrome DevTools Protocol (CDP). This component manages the WebSocket connection to the browser, handles command execution, and processes events in a non-blocking, asynchronous manner. - -```mermaid -graph TD - A[Python Code] --> B[Connection Handler] - B <--> C[WebSocket] - C <--> D[Browser CDP Endpoint] - - subgraph "Connection Handler" - E[Command Manager] - F[Events Handler] - G[WebSocket Client] - end - - B --> E - B --> F - B --> G -``` - -## Asynchronous Programming Model - -Pydoll is built on Python's `asyncio` framework, which enables non-blocking I/O operations. This design choice is critical for high-performance browser automation, as it allows multiple operations to occur concurrently without waiting for each to complete. 
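To make the non-blocking claim concrete before any browser is involved, here is a minimal sketch that uses only the standard library. `simulated_io` and `run_concurrently` are hypothetical names, and `asyncio.sleep` stands in for a real wait such as a page load; nothing here is Pydoll API.

```python
import asyncio
import time


async def simulated_io(label: str, delay: float) -> str:
    # asyncio.sleep stands in for an I/O wait such as a page load.
    await asyncio.sleep(delay)
    return label


async def run_concurrently() -> tuple[list[str], float]:
    started = time.perf_counter()
    # gather schedules both coroutines on the event loop at once,
    # so their waits overlap instead of adding up.
    results = await asyncio.gather(
        simulated_io("first", 0.2),
        simulated_io("second", 0.2),
    )
    return results, time.perf_counter() - started


if __name__ == "__main__":
    labels, elapsed = asyncio.run(run_concurrently())
    print(labels, f"{elapsed:.2f}s")
```

Both waits last 0.2 seconds, yet the total stays close to 0.2 seconds rather than 0.4, because the event loop switches to the other coroutine whenever one is waiting.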
- -### Understanding Async/Await - - -To understand how async/await works in practice, let's examine a more detailed example with two concurrent operations: - -```python -import asyncio -from pydoll.browser.chrome import Chrome - -async def fetch_page_data(url): - print(f"Starting fetch for {url}") - browser = Chrome() - await browser.start() - page = await browser.get_page() - - # Navigation takes time - this is where we yield control - await page.go_to(url) - - # Get page title - title = await page.execute_script("return document.title") - - # Extract some data - description = await page.execute_script( - "return document.querySelector('meta[name=\"description\"]')?.content || ''" - ) - - await browser.stop() - print(f"Completed fetch for {url}") - return {"url": url, "title": title, "description": description} - -async def main(): - # Start two page operations concurrently - task1 = asyncio.create_task(fetch_page_data("https://example.com")) - task2 = asyncio.create_task(fetch_page_data("https://github.com")) - - # Wait for both to complete and get results - result1 = await task1 - result2 = await task2 - - return [result1, result2] - -# Run the async function -results = asyncio.run(main()) -``` - -This example demonstrates how we can fetch data from two different websites concurrently, potentially cutting the overall execution time nearly in half compared to sequential execution. - -#### Async Execution Flow Diagram - -Here's what happens in the event loop when executing the code above: - -```mermaid -sequenceDiagram - participant A as Main Code - participant B as Task 1
(example.com) - participant C as Task 2
(github.com) - participant D as Event Loop - - A->>B: Create task1 - B->>D: Register in loop - A->>C: Create task2 - C->>D: Register in loop - D->>B: Execute until browser.start() - D->>C: Execute until browser.start() - D-->>B: Resume after WebSocket connected - D-->>C: Resume after WebSocket connected - D->>B: Execute until page.go_to() - D->>C: Execute until page.go_to() - D-->>B: Resume after page loaded - D-->>C: Resume after page loaded - B-->>A: Return result - C-->>A: Return result -``` - -This sequence diagram illustrates how Python's asyncio manages the two concurrent tasks in our example code: - -1. The main function creates two tasks for fetching data from different websites -2. Both tasks are registered in the event loop -3. The event loop executes each task until it hits an `await` statement (like `browser.start()`) -4. When async operations complete (like a WebSocket connection being established), tasks resume -5. The loop continues to switch between tasks at each `await` point -6. When each task completes, it returns its result back to the main function - -In the `fetch_page_data` example, this allows both browser instances to work concurrently - while one is waiting for a page to load, the other can be making progress. This is significantly more efficient than sequentially processing each website, as I/O wait times don't block the execution of other tasks. - -!!! info "Cooperative Multitasking" - Asyncio uses cooperative multitasking, where tasks voluntarily yield control at `await` points. This differs from preemptive multitasking (threads), where tasks can be interrupted at any time. Cooperative multitasking can provide better performance for I/O-bound operations but requires careful coding to avoid blocking the event loop. - -## Connection Handler Implementation - -The `ConnectionHandler` class is designed to manage both command execution and event processing, providing a robust interface to the CDP WebSocket connection. 
- -### Class Initialization - -```python -def __init__( - self, - connection_port: int, - page_id: str = 'browser', - ws_address_resolver: Callable[[int], str] = get_browser_ws_address, - ws_connector: Callable = websockets.connect, -): - # Initialize components... -``` - -The ConnectionHandler accepts several parameters: - -| Parameter | Type | Description | -|-----------|------|-------------| -| `connection_port` | `int` | Port number where the browser's CDP endpoint is listening | -| `page_id` | `str` | Identifier for the specific page/target (use 'browser' for browser-level connections) | -| `ws_address_resolver` | `Callable` | Function to resolve the WebSocket URL from the port number | -| `ws_connector` | `Callable` | Function to establish the WebSocket connection | - -### Internal Components - -The ConnectionHandler orchestrates three primary components: - -1. **WebSocket Connection**: Manages the actual WebSocket communication with the browser -2. **Command Manager**: Handles sending commands and receiving responses -3. 
**Events Handler**: Processes events from the browser and triggers appropriate callbacks - -```mermaid -classDiagram - class ConnectionHandler { - -_connection_port: int - -_page_id: str - -_ws_connection - -_command_manager: CommandManager - -_events_handler: EventsHandler - +execute_command(command, timeout) async - +register_callback(event_name, callback) async - +remove_callback(callback_id) async - +ping() async - +close() async - -_receive_events() async - } - - class CommandManager { - -_pending_commands: dict - +create_command_future(command) - +resolve_command(id, response) - +remove_pending_command(id) - } - - class EventsHandler { - -_callbacks: dict - -_network_logs: list - -_dialog: dict - +register_callback(event_name, callback, temporary) - +remove_callback(callback_id) - +clear_callbacks() - +process_event(event) async - } - - ConnectionHandler *-- CommandManager - ConnectionHandler *-- EventsHandler -``` - -## Command Execution Flow - -When executing a command through the CDP, the ConnectionHandler follows a specific pattern: - -1. Ensure an active WebSocket connection exists -2. Create a Future object to represent the pending response -3. Send the command over the WebSocket -4. Await the Future to be resolved with the response -5. 
Return the response to the caller - -```python -async def execute_command(self, command: dict, timeout: int = 10) -> dict: - # Validate command - if not isinstance(command, dict): - logger.error('Command must be a dictionary.') - raise exceptions.InvalidCommand('Command must be a dictionary') - - # Ensure connection is active - await self._ensure_active_connection() - - # Create future for this command - future = self._command_manager.create_command_future(command) - command_str = json.dumps(command) - - # Send command and await response - try: - await self._ws_connection.send(command_str) - response: str = await asyncio.wait_for(future, timeout) - return json.loads(response) - except asyncio.TimeoutError as exc: - self._command_manager.remove_pending_command(command['id']) - raise exc - except websockets.ConnectionClosed as exc: - await self._handle_connection_loss() - raise exc -``` - -!!! warning "Command Timeout" - Commands that don't receive a response within the specified timeout period will raise a `TimeoutError`. This prevents automation scripts from hanging indefinitely due to missing responses. The default timeout is 10 seconds, but can be adjusted based on expected response times for complex operations. - -## Event Processing System - -The event system is a key architectural component that enables reactive programming patterns in Pydoll. It allows you to register callbacks for specific browser events and have them executed automatically when those events occur. - -### Event Flow - -The event processing flow follows these steps: - -1. The `_receive_events` method runs as a background task, continuously receiving messages from the WebSocket -2. Each message is parsed and classified as either a command response or an event -3. Events are passed to the EventsHandler for processing -4. 
The EventsHandler identifies registered callbacks for the event and invokes them - -```mermaid -flowchart TD - A[WebSocket Message] --> B{Is Command Response?} - B -->|Yes| C[Resolve Command Future] - B -->|No| D[Process as Event] - D --> E[Find Matching Callbacks] - E --> F[Execute Callbacks] - F --> G{Is Temporary?} - G -->|Yes| H[Remove Callback] - G -->|No| I[Keep Callback] -``` - -### Callback Registration - -The ConnectionHandler provides methods to register, remove, and manage event callbacks: - -```python -# Register a callback for a specific event -callback_id = await connection.register_callback( - 'Page.loadEventFired', - handle_page_load -) - -# Remove a specific callback -await connection.remove_callback(callback_id) - -# Remove all callbacks -await connection.clear_callbacks() -``` - -!!! tip "Temporary Callbacks" - You can register a callback as temporary, which means it will be automatically removed after being triggered once. This is useful for one-time events like dialog handling: - - ```python - await connection.register_callback( - 'Page.javascriptDialogOpening', - handle_dialog, - temporary=True - ) - ``` - -### Asynchronous Callback Execution - -Callbacks can be either synchronous functions or asynchronous coroutines. 
The ConnectionHandler handles both types properly: - -```python -# Synchronous callback -def synchronous_callback(event): - print(f"Event received: {event['method']}") - -# Asynchronous callback -async def asynchronous_callback(event): - await asyncio.sleep(0.1) # Perform some async operation - print(f"Event processed asynchronously: {event['method']}") - -# Both can be registered the same way -await connection.register_callback('Network.requestWillBeSent', synchronous_callback) -await connection.register_callback('Network.responseReceived', asynchronous_callback) -``` - -For asynchronous callbacks, the ConnectionHandler wraps them in a task that runs in the background, allowing the event processing loop to continue without waiting for the callback to complete. - -## Connection Management - -The ConnectionHandler implements several strategies to ensure robust connections: - -### Lazy Connection Establishment - -Connections are established only when needed, typically when the first command is executed or when explicitly requested. This lazy initialization approach conserves resources and allows for more flexible connection management. - -### Automatic Reconnection - -If the WebSocket connection is lost or closed unexpectedly, the ConnectionHandler will attempt to re-establish it automatically when the next command is executed. This provides resilience against transient network issues. - -```python -async def _ensure_active_connection(self): - """ - Guarantees that an active connection exists before proceeding. - """ - if self._ws_connection is None or self._ws_connection.closed: - await self._establish_new_connection() -``` - -### Resource Cleanup - -The ConnectionHandler implements both explicit cleanup methods and Python's asynchronous context manager protocol (`__aenter__` and `__aexit__`), ensuring resources are properly released when no longer needed: - -```python -async def close(self): - """ - Closes the WebSocket connection and clears all callbacks. 
- """ - await self.clear_callbacks() - if self._ws_connection is not None: - try: - await self._ws_connection.close() - except websockets.ConnectionClosed as e: - logger.info(f'WebSocket connection has closed: {e}') - logger.info('WebSocket connection closed.') -``` - -!!! info "Context Manager Usage" - Using the ConnectionHandler as a context manager is the recommended pattern for ensuring proper resource cleanup: - - ```python - async with ConnectionHandler(9222, 'browser') as connection: - # Work with the connection... - await connection.execute_command(...) - # Connection is automatically closed when exiting the context - ``` - -## Message Processing Pipeline - -The ConnectionHandler implements a sophisticated message processing pipeline that handles the continuous stream of messages from the WebSocket connection: - -```mermaid -sequenceDiagram - participant WS as WebSocket - participant RCV as _receive_events - participant MSG as _process_single_message - participant PARSE as _parse_message - participant CMD as _handle_command_message - participant EVT as _handle_event_message - - loop While connected - WS->>RCV: message - RCV->>MSG: raw_message - MSG->>PARSE: raw_message - PARSE-->>MSG: parsed JSON or None - - alt Is command response - MSG->>CMD: message - CMD->>CMD: resolve command future - else Is event notification - MSG->>EVT: message - EVT->>EVT: process event & trigger callbacks - end - end -``` - -This pipeline ensures efficient processing of both command responses and asynchronous events, allowing Pydoll to maintain responsive operation even under high message volume. 
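The branching in the pipeline above can be sketched in a few lines. This is an illustration under stated assumptions, not the library's implementation: `parse_message` and `route_message` are hypothetical names, and the only protocol fact relied on is that a CDP response carries the `id` of the command it answers, while an event carries a `method` and no `id`.

```python
import asyncio
import json
from typing import Any, Callable, Optional


def parse_message(raw: str) -> Optional[dict[str, Any]]:
    """Return the decoded JSON object, or None if the payload is unusable."""
    try:
        message = json.loads(raw)
    except json.JSONDecodeError:
        return None
    return message if isinstance(message, dict) else None


def route_message(
    raw: str,
    pending: dict[int, asyncio.Future],
    on_event: Callable[[dict[str, Any]], None],
) -> str:
    """Classify one raw message and hand it to the matching handler."""
    message = parse_message(raw)
    if message is None:
        return "ignored"
    # A response echoes the id of a command that is still pending.
    if message.get("id") in pending:
        pending.pop(message["id"]).set_result(message)
        return "command"
    # Anything else that names a method is treated as an event.
    if "method" in message:
        on_event(message)
        return "event"
    return "ignored"


async def demo() -> tuple[list[str], dict[str, Any], list[dict[str, Any]]]:
    future = asyncio.get_running_loop().create_future()
    pending = {1: future}
    events: list[dict[str, Any]] = []
    kinds = [
        route_message('{"id": 1, "result": {}}', pending, events.append),
        route_message('{"method": "Page.loadEventFired", "params": {}}', pending, events.append),
        route_message("not json", pending, events.append),
    ]
    return kinds, await future, events


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Resolving the future inside the router is what lets a caller that is awaiting a command response resume while events keep flowing through the same loop.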
- -## Advanced Usage - -The ConnectionHandler is usually used indirectly through the Browser and Page classes, but it can also be used directly for advanced scenarios: - -### Direct Event Monitoring - -For specialized use cases, you might want to bypass the higher-level APIs and directly monitor specific CDP events: - -```python -from pydoll.connection.connection import ConnectionHandler - -async def monitor_network(): - connection = ConnectionHandler(9222) - - async def log_request(event): - url = event['params']['request']['url'] - print(f"Request: {url}") - - await connection.register_callback( - 'Network.requestWillBeSent', - log_request - ) - - # Enable network events via CDP command - await connection.execute_command({ - "id": 1, - "method": "Network.enable" - }) - - # Keep running until interrupted - try: - while True: - await asyncio.sleep(1) - finally: - await connection.close() -``` - -### Custom Command Execution - -You can execute arbitrary CDP commands directly: - -```python -async def custom_cdp_command(connection, method, params=None): - command = { - "id": random.randint(1, 10000), - "method": method, - "params": params or {} - } - return await connection.execute_command(command) - -# Example: Get document HTML without using Page class -async def get_html(connection): - result = await custom_cdp_command( - connection, - "Runtime.evaluate", - {"expression": "document.documentElement.outerHTML"} - ) - return result['result']['result']['value'] -``` - -!!! warning "Advanced Interface" - Direct use of the ConnectionHandler requires a deep understanding of the Chrome DevTools Protocol. For most use cases, the higher-level Browser and Page APIs provide a more intuitive and safer interface. 
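The `custom_cdp_command` helper above draws its `id` from `random.randint(1, 10000)`, which also requires `import random`. If the caller is responsible for the `id` field, as those snippets suggest, two in-flight commands could in principle draw the same value. A minimal alternative, assuming ids only need to be unique per connection, is a simple counter. `CommandIdAllocator` is a hypothetical helper, not part of Pydoll.

```python
import itertools
from typing import Any, Optional


class CommandIdAllocator:
    """Builds CDP-style command dictionaries with strictly increasing ids."""

    def __init__(self, start: int = 1) -> None:
        self._counter = itertools.count(start)

    def build(
        self, method: str, params: Optional[dict[str, Any]] = None
    ) -> dict[str, Any]:
        # next() never repeats a value, so two pending commands
        # built by the same allocator cannot share an id.
        return {
            "id": next(self._counter),
            "method": method,
            "params": params or {},
        }


if __name__ == "__main__":
    allocator = CommandIdAllocator()
    print(allocator.build("Network.enable"))
    print(allocator.build("Runtime.evaluate", {"expression": "document.title"}))
```

Each dictionary returned by `build()` has the same shape as the commands passed to `execute_command` in the examples above.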
- - -## Advanced Concurrency Patterns - -The ConnectionHandler's asynchronous design enables sophisticated concurrency patterns: - -### Parallel Command Execution - -Execute multiple commands concurrently and wait for all results: - -```python -async def get_page_metrics(connection): - commands = [ - {"id": 1, "method": "Performance.getMetrics"}, - {"id": 2, "method": "Network.getResponseBody", "params": {"requestId": "..."}}, - {"id": 3, "method": "DOM.getDocument"} - ] - - results = await asyncio.gather( - *(connection.execute_command(cmd) for cmd in commands) - ) - - return results -``` - -## Conclusion - -The ConnectionHandler serves as the foundation of Pydoll's architecture, providing a robust, efficient interface to the Chrome DevTools Protocol. By leveraging Python's asyncio framework and WebSocket communication, it enables high-performance browser automation with elegant, event-driven programming patterns. - -Understanding the ConnectionHandler's design and operation provides valuable insights into Pydoll's internal workings and offers opportunities for advanced customization and optimization in specialized scenarios. - -For most use cases, you'll interact with the ConnectionHandler indirectly through the higher-level Browser and Page APIs, which provide a more intuitive interface while leveraging the ConnectionHandler's powerful capabilities. \ No newline at end of file diff --git a/docs/deep-dive/event-system.md b/docs/deep-dive/event-system.md deleted file mode 100644 index 0317f14..0000000 --- a/docs/deep-dive/event-system.md +++ /dev/null @@ -1,838 +0,0 @@ -# Event System - -The event system is a foundational component of Pydoll's architecture, providing a powerful mechanism for responding to browser activities in real-time. This asynchronous notification system enables your automation code to react to various browser events as they occur, creating dynamic and responsive interactions. 
- -## WebSocket Communication and CDP - -At the core of Pydoll's event system is the Chrome DevTools Protocol (CDP), which provides a structured way to interact with and monitor browser activities over WebSocket connections. This bidirectional communication channel allows your code to both send commands to the browser and receive events back. - -```mermaid -sequenceDiagram - participant Client as Pydoll Code - participant Connection as ConnectionHandler - participant WebSocket - participant Browser - - Client->>Connection: Register callback for event - Connection->>Connection: Store callback in registry - - Client->>Connection: Enable event domain - Connection->>WebSocket: Send CDP command to enable domain - WebSocket->>Browser: Forward command - Browser-->>WebSocket: Acknowledge domain enabled - WebSocket-->>Connection: Forward response - Connection-->>Client: Domain enabled - - Browser->>WebSocket: Event occurs, sends CDP event message - WebSocket->>Connection: Forward event message - Connection->>Connection: Look up callbacks for this event - Connection->>Client: Execute registered callback -``` - -### WebSocket Communication Model - -The WebSocket connection between Pydoll and the browser follows this pattern: - -1. **Connection Establishment**: When the browser starts, a WebSocket server is created, and Pydoll establishes a connection to it -2. **Bidirectional Messaging**: Both Pydoll and the browser can send messages at any time -3. 
**Message Types**: - - **Commands**: Sent from Pydoll to the browser (e.g., navigation, DOM manipulation) - - **Command Responses**: Sent from the browser to Pydoll in response to commands - - **Events**: Sent from the browser to Pydoll when something happens (e.g., page load, network activity) - -### Chrome DevTools Protocol Structure - -CDP organizes its functionality into domains, each responsible for a specific area of browser functionality: - -| Domain | Responsibility | Typical Events | -|--------|----------------|----------------| -| Page | Page lifecycle | Load events, navigation, dialogs | -| Network | Network activity | Request/response monitoring, WebSockets | -| DOM | Document structure | DOM changes, attribute modifications | -| Fetch | Request interception | Request paused, authentication required | -| Runtime | JavaScript execution | Console messages, exceptions | -| Browser | Browser management | Window creation, tabs, contexts | - -Each domain must be explicitly enabled before it will emit events, which helps manage performance by only processing events that are actually needed. - -## Event Domains and Enabling - -Pydoll organizes events into logical domains that correspond to the CDP domains. Each domain must be explicitly enabled before it will emit events, which is handled through specific enabling methods. - -```python -# Enable page events to monitor page load, navigation, dialogs, etc. -await tab.enable_page_events() - -# Enable network events to monitor requests, responses, etc. -await tab.enable_network_events() - -# Enable DOM events to monitor DOM changes -await tab.enable_dom_events() - -# Enable fetch events to intercept and modify requests -await tab.enable_fetch_events() -``` - -!!! info "Domain Ownership" - Events belong to specific domains based on their functionality. For example, page load events belong to the Page domain, while network request events belong to the Network domain. 
Some domains are only available at certain levels - for instance, Page events are available on the Tab instance but not directly at the Browser level. - -### Why Enable/Disable is Required - -The explicit enable/disable pattern serves several important purposes: - -1. **Performance Optimization**: By only enabling domains you're interested in, you reduce the overhead of event processing -2. **Resource Management**: Some event domains (like Network or DOM monitoring) can generate large volumes of events that consume memory -3. **Clarity of Intent**: Explicit enabling makes the automation code's intentions clear and self-documenting -4. **Controlled Cleanup**: Explicitly disabling domains ensures proper cleanup when events are no longer needed - -```mermaid -stateDiagram-v2 - [*] --> Disabled: Initial State - Disabled --> Enabled: enable_xxx_events() - Enabled --> Disabled: disable_xxx_events() - Enabled --> [*]: Tab Closed - Disabled --> [*]: Tab Closed -``` - -!!! warning "Event Leak Prevention" - Failing to disable event domains when they're no longer needed can lead to memory leaks and performance degradation, especially in long-running automation. Always disable event domains when you're done with them, particularly for high-volume events like network monitoring. 
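The warning above can be made mechanical by pairing every enable call with a guaranteed disable. The following is a minimal sketch rather than Pydoll's own API: `FakeTab` and the `network_events` helper are hypothetical names introduced for illustration, and only the `enable_network_events()` / `disable_network_events()` method names come from this document.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeTab:
    """Hypothetical stand-in for a Tab; it only records enable/disable calls."""

    def __init__(self):
        self.network_events_enabled = False
        self.calls = []

    async def enable_network_events(self):
        self.network_events_enabled = True
        self.calls.append("enable")

    async def disable_network_events(self):
        self.network_events_enabled = False
        self.calls.append("disable")


@asynccontextmanager
async def network_events(tab):
    """Enable the Network domain for the duration of the block, then disable it."""
    await tab.enable_network_events()
    try:
        yield tab
    finally:
        # Runs on normal exit and also when the body raises
        await tab.disable_network_events()


async def demo():
    tab = FakeTab()
    try:
        async with network_events(tab):
            raise RuntimeError("scraping step failed")
    except RuntimeError:
        pass
    return tab


tab = asyncio.run(demo())
print(tab.calls, tab.network_events_enabled)
```

With a real tab, the same helper would wrap whatever work needs network monitoring, so the domain is disabled even if that work fails partway through.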
- -### Domain-Specific Enabling Methods - -Different domains are enabled through specific methods on the appropriate objects: - -| Domain | Enable Method | Disable Method | Available On | -|--------|--------------|----------------|--------------| -| Page | `enable_page_events()` | `disable_page_events()` | Tab | -| Network | `enable_network_events()` | `disable_network_events()` | Tab | -| DOM | `enable_dom_events()` | `disable_dom_events()` | Tab | -| Fetch | `enable_fetch_events()` | `disable_fetch_events()` | Tab, Browser | -| File Chooser | `enable_intercept_file_chooser_dialog()` | `disable_intercept_file_chooser_dialog()` | Tab | - -## Registering Event Callbacks - -The central method for subscribing to events is the `on()` method, available on both Tab and Browser instances: - -```python -async def on( - self, event_name: str, callback: callable, temporary: bool = False -) -> int: - """ - Registers an event listener for the tab. - - Args: - event_name (str): The event name to listen for. - callback (callable): The callback function to execute when the - event is triggered. - temporary (bool): If True, the callback will be removed after it's - triggered once. Defaults to False. - - Returns: - int: The ID of the registered callback. - """ -``` - -This method returns a callback ID that can be used to remove the callback later if needed. 
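To make the role of the returned ID concrete, here is a toy registry that hands out IDs on registration and uses them for removal. It is an illustrative sketch, not Pydoll's implementation: `CallbackRegistry`, `register`, `remove`, and `dispatch` are hypothetical names, and the only detail taken from CDP is that an event message carries `method` and `params` keys.

```python
import asyncio
import inspect


class CallbackRegistry:
    """Toy registry: maps callback IDs to (event_name, callback, temporary)."""

    def __init__(self):
        self._callbacks = {}
        self._next_id = 0

    def register(self, event_name, callback, temporary=False):
        self._next_id += 1
        self._callbacks[self._next_id] = (event_name, callback, temporary)
        return self._next_id

    def remove(self, callback_id):
        # True if a callback with this ID existed and was removed
        return self._callbacks.pop(callback_id, None) is not None

    async def dispatch(self, event):
        # Iterate over a snapshot so removal during dispatch is safe
        for cb_id, (name, callback, temporary) in list(self._callbacks.items()):
            if name != event["method"]:
                continue
            result = callback(event)
            if inspect.isawaitable(result):
                await result  # supports async callbacks as well as sync ones
            if temporary:
                self._callbacks.pop(cb_id, None)


async def demo():
    registry = CallbackRegistry()
    seen = []
    keep_id = registry.register("Page.loadEventFired", lambda e: seen.append("permanent"))
    registry.register("Page.loadEventFired", lambda e: seen.append("once"), temporary=True)

    event = {"method": "Page.loadEventFired", "params": {}}
    await registry.dispatch(event)  # both callbacks fire; the temporary one is dropped
    await registry.dispatch(event)  # only the permanent callback fires
    registry.remove(keep_id)        # the stored ID is what makes removal possible
    await registry.dispatch(event)  # nothing is registered any more
    return seen


seen = asyncio.run(demo())
print(seen)
```

Keeping the ID is only necessary for permanent callbacks; temporary ones clean themselves up after the first matching event.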
- -### Callback Types and Parameters - -Callbacks can be either synchronous functions or asynchronous coroutines: - -```python -import time - -# Synchronous callback example -def handle_page_load(event): - print(f"Page loaded at: {time.time()}") - -# Asynchronous callback example -async def handle_network_request(event): - request_url = event['params']['request']['url'] - print(f"Request sent to: {request_url}") - # Can perform async operations here (save_request_details is a user-defined coroutine) - await save_request_details(request_url) - -# Register the callbacks -await tab.on('Page.loadEventFired', handle_page_load) -await tab.on('Network.requestWillBeSent', handle_network_request) -``` - -!!! tip "Asynchronous Callbacks" - Using async callbacks provides greater flexibility, allowing you to perform other async operations within the callback, such as issuing additional CDP commands or waiting for conditions. - -### Using Partial for Tab Access in Callbacks - -A useful technique is to use `functools.partial` to pass the Tab instance to your callbacks, allowing the callback to interact with the tab: - -```python -from functools import partial -from pydoll.protocol.page.events import PageEvent - -# Define a callback that needs access to the tab -async def handle_navigation(tab, event): - # The callback can now use the tab object - print(f"Navigation occurred to: {await tab.current_url}") - - # Access tab methods directly (find_all=True returns a list of matches) - elements = await tab.find(tag_name="a", find_all=True) - print(f"Found {len(elements)} links on the new page") - -# Register with partial to bind the tab parameter -await tab.enable_page_events() -await tab.on(PageEvent.FRAME_NAVIGATED, partial(handle_navigation, tab)) -``` - -This technique is essential when: -1. Your callback needs to interact with the tab (finding elements, executing scripts) -2. You want to maintain state between events -3. You need to coordinate actions across different event types - -!!! info "Why Use Partial?" - The event system only passes the event data to callbacks. 
Using `partial` lets you pre-configure callbacks with additional parameters (like the tab object) without modifying the callback signature expected by the event system. - -### Temporary Callbacks - -For events you only want to handle once, you can use the `temporary` flag: - -```python -# This callback will automatically be removed after the first time it fires -await tab.on('Page.loadEventFired', handle_first_load, temporary=True) -``` - -This is particularly useful for: -- One-time setup operations -- Waiting for a specific event before continuing -- Handling the first occurrence of an event differently - -## Event Flow and Lifecycle - -Understanding the event flow is crucial for effective event handling: - -```mermaid -flowchart TD - A[Browser Activity] -->|Generates| B[CDP Event] - B -->|Sent via WebSocket| C[ConnectionHandler] - C -->|Filters by Event Name| D{Registered Callbacks?} - D -->|Yes| E[Process Event] - D -->|No| F[Discard Event] - E -->|For Each Callback| G[Execute Callback] - G -->|If Temporary| H[Remove Callback] - G -->|If Permanent| I[Retain for Future Events] -``` - -The event lifecycle follows these steps: - -1. Something happens in the browser (page loads, request sent, DOM changes) -2. Browser generates a CDP event message -3. Message is sent over WebSocket to Pydoll -4. The ConnectionHandler receives the event -5. ConnectionHandler checks its registry for callbacks matching the event name -6. If callbacks exist, each is executed with the event data -7. 
If a callback was registered as temporary, it's removed after execution - -## Predefined Event Constants - -Pydoll provides a comprehensive set of predefined event constants in the `protocol` package, making it easier to reference common events without remembering exact CDP event strings: - -```python -from pydoll.protocol.page.events import PageEvent -from pydoll.protocol.network.events import NetworkEvent -from pydoll.protocol.dom.events import DomEvent -from pydoll.protocol.fetch.events import FetchEvent - -# Using predefined events -await tab.on(PageEvent.LOAD_EVENT_FIRED, handle_page_load) -await tab.on(NetworkEvent.REQUEST_WILL_BE_SENT, handle_request) -await tab.on(DomEvent.DOCUMENT_UPDATED, handle_dom_update) -await tab.on(FetchEvent.REQUEST_PAUSED, handle_fetch_intercept) -``` - -!!! info "Custom CDP Events" - While Pydoll provides constants for common events, you can use any valid CDP event string directly. This is useful for less common events that don't have predefined constants: - - ```python - # Using a direct CDP event string - await tab.on('Security.certificateError', handle_cert_error) - ``` - -### Common Event Types - -Here are some of the most useful events for automation and scraping: - -#### Page Events - -| Constant | CDP Event | Description | -|----------|-----------|-------------| -| `PageEvent.LOAD_EVENT_FIRED` | `Page.loadEventFired` | Fired when the page load event is triggered | -| `PageEvent.DOM_CONTENT_EVENT_FIRED` | `Page.domContentEventFired` | Fired when DOM content has been loaded | -| `PageEvent.FILE_CHOOSER_OPENED` | `Page.fileChooserOpened` | Fired when a file picker dialog is shown | -| `PageEvent.JAVASCRIPT_DIALOG_OPENING` | `Page.javascriptDialogOpening` | Fired when a JavaScript dialog is shown | -| `PageEvent.FRAME_NAVIGATED` | `Page.frameNavigated` | Fired when a frame has navigated to a new URL | - -#### Network Events - -| Constant | CDP Event | Description | -|----------|-----------|-------------| -| 
`NetworkEvent.REQUEST_WILL_BE_SENT` | `Network.requestWillBeSent` | Fired when a request is about to be sent | -| `NetworkEvent.RESPONSE_RECEIVED` | `Network.responseReceived` | Fired when an HTTP response is received | -| `NetworkEvent.LOADING_FAILED` | `Network.loadingFailed` | Fired when a request fails to load | -| `NetworkEvent.LOADING_FINISHED` | `Network.loadingFinished` | Fired when a request has finished loading | -| `NetworkEvent.WEB_SOCKET_FRAME_SENT` | `Network.webSocketFrameSent` | Fired when a WebSocket frame is sent | - -#### DOM Events - -| Constant | CDP Event | Description | -|-----------|------------|-----------| -| `DomEvent.DOCUMENT_UPDATED` | `DOM.documentUpdated` | Fired when the document has been completely updated; previously obtained node IDs are no longer valid | -| `DomEvent.SET_CHILD_NODES` | `DOM.setChildNodes` | Fired when child nodes are set | -| `DomEvent.ATTRIBUTE_MODIFIED` | `DOM.attributeModified` | Fired when an element's attribute is modified | -| `DomEvent.ATTRIBUTE_REMOVED` | `DOM.attributeRemoved` | Fired when an element's attribute is removed | - -## Advanced Event Patterns - -### Event-Driven Scraping - -Events allow you to create reactive scrapers that respond to page changes in real time: - -```python -import asyncio -import json -from functools import partial -from pydoll.browser.chromium import Chrome -from pydoll.constants import By -from pydoll.protocol.network.events import NetworkEvent -from pydoll.protocol.page.events import PageEvent - -async def scrape_dynamic_content(): - async with Chrome() as browser: - tab = await browser.start() - - # Create a data storage container - scraped_data = [] - data_complete = asyncio.Event() - - # Set up a callback to extract data when AJAX responses are received - async def extract_data_from_response(tab, event): - if 'api/products' in event['params']['response']['url']: - # Extract the response body - request_id = event['params']['requestId'] - body = await tab.get_network_response_body(request_id) - - # Process the data - products = json.loads(body) - for 
product in products: - scraped_data.append({ - 'id': product['id'], - 'name': product['name'], - 'price': product['price'] - }) - - print(f"Extracted {len(products)} products") - - # If we've collected enough data, signal completion - if len(scraped_data) >= 100: - data_complete.set() - - # Set up navigation monitoring - async def handle_page_load(tab, event): - print(f"Page loaded: {await tab.current_url}") - - # Now that the page is loaded, trigger the infinite scroll - await tab.execute_script(""" - function scrollDown() { - window.scrollTo(0, document.body.scrollHeight); - setTimeout(scrollDown, 1000); - } - scrollDown(); - """) - - # Enable events and register callbacks - await tab.enable_network_events() - await tab.enable_page_events() - await tab.on(NetworkEvent.RESPONSE_RECEIVED, partial(extract_data_from_response, tab)) - await tab.on(PageEvent.LOAD_EVENT_FIRED, partial(handle_page_load, tab)) - - # Navigate to the page with dynamic content - await tab.go_to("https://example.com/products") - - # Wait for data collection to complete or timeout after 60 seconds - try: - await asyncio.wait_for(data_complete.wait(), timeout=60) - except asyncio.TimeoutError: - print("Timeout reached, continuing with data collected so far") - - # Process the collected data - print(f"Total products collected: {len(scraped_data)}") - - return scraped_data -``` - -### Parallel Scraping with Events - -Events are particularly powerful when combined with concurrent execution for maximum efficiency. Pydoll excels at managing multiple Tabs simultaneously, which is one of its greatest advantages for high-performance automation. 
- -#### Multi-Tab Single Browser Approach - -A more efficient approach is to use multiple tabs within a single browser instance: - -```python -import asyncio -from functools import partial -import json -from pydoll.browser.chromium import Chrome -from pydoll.constants import By -from pydoll.protocol.network.events import NetworkEvent - -async def multi_tab_scraping(): - # Create a single browser instance for all tabs - async with Chrome() as browser: - tab = await browser.start() - - # Categories to scrape - categories = ['electronics', 'clothing', 'books', 'home'] - base_url = 'https://example.com/products' - - # Track results for each category - results = {category: [] for category in categories} - completion_events = {category: asyncio.Event() for category in categories} - - # Create a callback for processing category data - async def process_category_data(tab, category, event): - if f'api/{category}' in event['params'].get('response', {}).get('url', ''): - request_id = event['params']['requestId'] - body, _ = await tab.get_network_response_body(request_id) - data = json.loads(body) - - # Add results to the appropriate category - results[category].extend(data['items']) - print(f"Added {len(data['items'])} items to {category}, total: {len(results[category])}") - - # Signal completion if we have enough data - if len(results[category]) >= 20 or data.get('isLastPage', False): - completion_events[category].set() - - # Prepare tabs, one for each category - tabs = {} - for category in categories: - # Create a new tab - new_tab = await browser.new_tab() - tabs[category] = new_tab - - # Setup event monitoring for this tab - await new_tab.enable_network_events() - await new_tab.on( - NetworkEvent.RESPONSE_RECEIVED, - partial(process_category_data, new_tab, category) - ) - - # Start navigation (don't await here to allow parallel loading) - asyncio.create_task(new_tab.go_to(f"{base_url}/{category}")) - - # Wait for all categories to complete or timeout - try: - await 
asyncio.wait_for( - asyncio.gather(*(event.wait() for event in completion_events.values())), - timeout=45 - ) - except asyncio.TimeoutError: - print("Some categories timed out, proceeding with collected data") - - # Display results - total_items = 0 - for category, items in results.items(): - count = len(items) - total_items += count - print(f"{category}: collected {count} items") - - # Show sample items - for item in items[:2]: - print(f" - {item['name']}: ${item['price']}") - - print(f"Total items across all categories: {total_items}") - - return results - -# Run the multi-tab scraper -asyncio.run(multi_tab_scraping()) -``` - -#### Dynamic Tab Creation with Events - -You can even create new tabs dynamically in response to events: - -```python -import asyncio -import json -from functools import partial -from pydoll.browser.chromium import Chrome -from pydoll.constants import By -from pydoll.protocol.page.events import PageEvent -from pydoll.protocol.network.events import NetworkEvent - -async def dynamic_tab_creation(): - async with Chrome() as browser: - main_tab = await browser.start() - - # Store results from all product pages - all_results = [] - # Count open category tabs to know when we're done - active_tabs = 0 # The main tab is not counted; it is never closed by a callback - # Event that signals all work is complete - all_done = asyncio.Event() - - # This callback processes category links and creates a new tab for each - async def process_category_links(main_tab, event): - # Check if this is the categories response - if 'api/categories' not in event['params'].get('response', {}).get('url', ''): - return - - # Extract categories from the response - request_id = event['params']['requestId'] - body = await main_tab.get_network_response_body(request_id) - categories = json.loads(body) - - print(f"Found {len(categories)} categories to process") - nonlocal active_tabs - active_tabs += len(categories) # Update tab counter - - # Create a new tab for each category - for category in categories: - # Create a new tab - 
new_tab = await browser.new_tab() - - # Setup a callback for this tab - async def process_product_data(tab, category_name, event): - if 'api/products' not in event['params'].get('response', {}).get('url', ''): - return - - # Process the product data - request_id = event['params']['requestId'] - body = await tab.get_network_response_body(request_id) - products = json.loads(body) - - # Add to results - nonlocal all_results - all_results.extend(products) - print(f"Added {len(products)} products from {category_name}") - - # Close this tab when done - nonlocal active_tabs - await tab.close() - active_tabs -= 1 - - # If this was the last tab, signal completion - if active_tabs == 0: - all_done.set() - - # Enable network events on the new tab - await new_tab.enable_network_events() - await new_tab.on( - NetworkEvent.RESPONSE_RECEIVED, - partial(process_product_data, new_tab, category['name']) - ) - - # Navigate to the category page - asyncio.create_task(new_tab.go_to(f"https://example.com/products/{category['id']}")) - - # Set up the main tab to find categories - await main_tab.enable_network_events() - await main_tab.on( - NetworkEvent.RESPONSE_RECEIVED, - partial(process_category_links, main_tab) - ) - - # Navigate to the main categories page - await main_tab.go_to("https://example.com/categories") - - # Wait for all tabs to complete their work - try: - await asyncio.wait_for(all_done.wait(), timeout=60) - except asyncio.TimeoutError: - print("Timeout reached, continuing with data collected so far") - - # Process results - print(f"Total products collected: {len(all_results)}") - - return all_results -``` - -### Key Advantages of Multi-Tab Automation - -Using multiple tabs in a single browser instance offers several significant advantages: - -1. **Resource Efficiency**: A single browser instance uses fewer system resources than multiple browsers -2. **Shared Session**: All tabs share the same session, cookies, and cache -3. 
**Reduced Startup Time**: Opening new tabs is much faster than starting new browser instances -4. **Dynamic Workflows**: Create tabs in response to discoveries on other tabs -5. **Memory Efficiency**: Better memory utilization compared to multiple browser instances - -!!! tip "Tab Management Best Practices" - - Keep track of all tab references to avoid orphaned tabs - - Consider implementing a tab pool pattern for large-scale operations - - Close tabs when they're no longer needed to free up resources - - Use tab IDs to identify and organize tabs - - Consider adding timeouts to prevent hanging tabs - -This multi-tab approach is ideal for scenarios like: -- Category-based scraping where each category needs its own tab -- Processing search results where each result needs detailed exploration -- Following multiple user journeys simultaneously -- Spreading requests across multiple tabs to run them concurrently -- Maintaining different user sessions, with each tab opened in its own browser context (tabs in the same context share cookies) - -### Coordinating with Event-Driven Actions - -Events can be used to coordinate actions in response to browser behavior: - -```python -# Assumes asyncio and NetworkEvent are imported as in the earlier examples -async def wait_for_network_idle(tab): - network_idle = asyncio.Event() - in_flight_requests = 0 - - async def track_request(event): - nonlocal in_flight_requests - in_flight_requests += 1 - - async def track_response(event): - nonlocal in_flight_requests - in_flight_requests -= 1 - if in_flight_requests <= 0: - network_idle.set() - - await tab.enable_network_events() - await tab.on(NetworkEvent.REQUEST_WILL_BE_SENT, track_request) - await tab.on(NetworkEvent.LOADING_FINISHED, track_response) - await tab.on(NetworkEvent.LOADING_FAILED, track_response) - - try: - await network_idle.wait() - finally: - # Clean up, even if the wait is cancelled - await tab.disable_network_events() -``` - -### Using Async Context Managers - -Pydoll implements context managers for some common event patterns, like file uploads: - -```python -async with tab.expect_file_chooser(files="path/to/file.pdf"): - # Trigger the file chooser dialog 
- upload_button = await tab.find(id="upload-button") - await upload_button.click() - # Context manager handles waiting for and responding to the file chooser event -``` - -!!! tip "Creating Custom Context Managers" - You can create custom context managers for common event patterns in your own code: - - ```python - import asyncio - from contextlib import asynccontextmanager - - @asynccontextmanager - async def wait_for_navigation(tab): - navigation_complete = asyncio.Event() - - async def on_navigation(event): - navigation_complete.set() - - # Enable events if not already enabled - was_enabled = tab.page_events_enabled - if not was_enabled: - await tab.enable_page_events() - - # Register temporary callback - await tab.on(PageEvent.FRAME_NAVIGATED, on_navigation, temporary=True) - - try: - yield - # Wait for navigation to complete - await navigation_complete.wait() - finally: - # Clean up if we enabled events - if not was_enabled: - await tab.disable_page_events() - ``` - -## Domain-Specific Event Features - -### Page Domain Events - -The Page domain provides events for page lifecycle and JavaScript dialogs: - -```python -from functools import partial - -# Enable page events -await tab.enable_page_events() - -# Handle page load -async def handle_page_load(tab, event): - print(f"Page loaded: {await tab.current_url}") - # Perform actions after page load (await the lookup before using the element) - search_box = await tab.find(id="search") - await search_box.type_text("pydoll") - -await tab.on(PageEvent.LOAD_EVENT_FIRED, partial(handle_page_load, tab)) - -# Handle JavaScript dialogs -async def handle_dialog(tab, event): - if await tab.has_dialog(): - message = await tab.get_dialog_message() - print(f"Dialog message: {message}") - await tab.handle_dialog(accept=True) - -await tab.on(PageEvent.JAVASCRIPT_DIALOG_OPENING, partial(handle_dialog, tab)) -``` - -### Network Domain Events and Logging - -The Network domain provides comprehensive request monitoring and logging: - -```python -from functools import partial - -# Enable network events -await tab.enable_network_events() - -# Monitor request activity -async 
def log_request(tab, event): - url = event['params']['request']['url'] - method = event['params']['request']['method'] - print(f"{method} request to: {url}") - - # You can trigger actions based on specific requests - if 'api/login' in url and method == 'POST': - print("Login request detected, waiting for response...") - -await tab.on(NetworkEvent.REQUEST_WILL_BE_SENT, partial(log_request, tab)) - -# After performing actions, retrieve logs -api_logs = await tab.get_network_logs(filter="api") - -# Get response bodies for specific requests by filtering logs first -api_logs = await tab.get_network_logs(filter="api/data") -for log in api_logs: - request_id = log['params']['requestId'] - body = await tab.get_network_response_body(request_id) -``` - -### DOM Events for Structure Monitoring - -The DOM domain provides events for monitoring document structure changes: - -```python -from functools import partial - -# Enable DOM events -await tab.enable_dom_events() - -# Track attribute changes -async def track_attribute_change(tab, event): - node_id = event['params']['nodeId'] - name = event['params']['name'] - value = event['params']['value'] - print(f"Attribute changed on node {node_id}: {name}={value}") - - # You can react to specific attribute changes - if name == 'data-status' and value == 'loaded': - element = await tab.find(css_selector=f"[data-id='{node_id}']") - await element.click() - -await tab.on(DomEvent.ATTRIBUTE_MODIFIED, partial(track_attribute_change, tab)) -``` - -## Browser-Level vs. 
Tab-Level Events - -Pydoll's event system operates at both the browser and tab levels, with important distinctions: - -```mermaid -graph TD - Browser[Browser Instance] -->|"Global Events (e.g., Target events)"| BrowserCallbacks[Browser-Level Callbacks] - Browser -->|"Creates"| Tab1[Tab Instance 1] - Browser -->|"Creates"| Tab2[Tab Instance 2] - Tab1 -->|"Tab-Specific Events"| Tab1Callbacks[Tab 1 Callbacks] - Tab2 -->|"Tab-Specific Events"| Tab2Callbacks[Tab 2 Callbacks] -``` - -### Browser-Level Events - -Browser-level events operate globally across all tabs: - -```python -# Register a browser-level event -await browser.on('Target.targetCreated', handle_new_target) -``` - -Browser-level event domains are limited, and trying to use tab-specific events will raise an exception: - -```python -# This would raise an EventNotSupported exception -await browser.on(PageEvent.LOAD_EVENT_FIRED, handle_page_load) # Error! -``` - -### Tab-Level Events - -Tab-level events are specific to an individual tab: - -```python -# Get a specific tab -tab = await browser.start() - -# Register a tab-level event -await tab.enable_page_events() -await tab.on(PageEvent.LOAD_EVENT_FIRED, handle_page_load) - -# Each tab has its own event context -tab2 = await browser.new_tab() -await tab2.enable_page_events() -await tab2.on(PageEvent.LOAD_EVENT_FIRED, handle_different_page_load) -``` - -!!! info "Domain-Specific Scope" - Not all event domains are available at both levels. 
For example: - - - **Fetch Events**: Available at both browser and tab levels - - **Page Events**: Available only at the tab level - - **Target Events**: Available only at the browser level - -## Performance Considerations - -### Event System Overhead - -The event system adds overhead to browser automation, especially for high-frequency events: - -| Event Domain | Typical Event Volume | Performance Impact | -|--------------|---------------------|-------------------| -| Page | Low | Minimal | -| Network | High | Moderate to High | -| DOM | Very High | High | -| Fetch | Moderate | Moderate (higher if intercepting) | - -To minimize performance impact: - -1. **Enable Only What You Need**: Only enable event domains you're actively using -2. **Scope Appropriately**: Use browser-level events only for truly browser-wide concerns -3. **Disable When Done**: Always disable event domains when you're finished with them -4. **Filter Early**: In callbacks, filter out irrelevant events as early as possible -5. **Use Temporary Callbacks**: For one-time events, use the `temporary=True` flag - -### Efficient Callback Patterns - -Write efficient callbacks to minimize overhead: - -```python -# LESS EFFICIENT: Processes every request -async def log_all_requests(event): - print(f"Request: {event['params']['request']['url']}") - -# MORE EFFICIENT: Early filtering -async def log_api_requests(event): - url = event['params']['request']['url'] - if '/api/' not in url: - return # Early exit for non-API requests - print(f"API Request: {url}") -``` - -## Conclusion - -Pydoll's event system provides a powerful mechanism for creating dynamic, responsive browser automation. By understanding the event flow, domain organization, and callback patterns, you can build sophisticated automation that reacts intelligently to browser state changes. 
- -The event system is particularly valuable for: -- Building reactive scrapers that capture data as soon as it's available -- Creating parallel automation tasks that maximize efficiency -- Coordinating complex interactions that depend on browser state changes -- Implementing robust error handling and retry mechanisms - -With techniques like using `partial` to bind tab instances to callbacks and combining events with `asyncio.gather` for concurrent operations, you can create highly efficient and scalable automation solutions. diff --git a/docs/deep-dive/find-elements-mixin.md b/docs/deep-dive/find-elements-mixin.md deleted file mode 100644 index ab996be..0000000 --- a/docs/deep-dive/find-elements-mixin.md +++ /dev/null @@ -1,681 +0,0 @@ -# FindElements Mixin - -The FindElementsMixin is a fundamental component in Pydoll's architecture that implements element location strategies using various selector types. This mixin provides the core capabilities for finding and interacting with elements in the DOM, serving as a bridge between high-level automation code and the browser's rendering engine. - -```mermaid -graph TB - User["User Code"] --> Tab["Tab Class"] - Tab --> Mixin["FindElementsMixin"] - Mixin --> Methods["Core Methods"] - Mixin --> Wait["Wait Mechanisms"] - - Methods --> Find["find()"] - Methods --> Query["query()"] - Methods --> Internal["Internal Methods"] - - Mixin --> DOM["Browser DOM"] - DOM --> Elements["WebElements"] -``` - -## Understanding Mixins in Python - -In object-oriented programming, a mixin is a class that provides methods to other classes without being considered a base class. Unlike traditional inheritance where a subclass inherits from a parent class representing an "is-a" relationship, mixins implement a "has-a" capability relationship. 
- -```python -# Example of a mixin in Python -class LoggerMixin: - def log(self, message): - print(f"LOG: {message}") - - def log_error(self, error): - print(f"ERROR: {error}") - -class DataProcessor(LoggerMixin): - def process_data(self, data): - self.log("Processing data...") - # Process the data - self.log("Data processing complete") -``` - -Mixins offer several advantages in complex software architecture: - -1. **Code Reuse**: The same functionality can be used by multiple unrelated classes -2. **Separation of Concerns**: Each mixin handles a specific aspect of functionality -3. **Composition Over Inheritance**: Avoids deep inheritance hierarchies -4. **Modularity**: Features can be added or removed independently - -!!! info "Mixin vs. Multiple Inheritance" - While Python supports multiple inheritance, mixins are a specific design pattern within that capability. A mixin is not meant to be instantiated on its own and typically doesn't maintain state. It provides methods that can be used by other classes without establishing an "is-a" relationship. - -## The Document Object Model (DOM) - -Before diving into element selection strategies, it's important to understand the DOM, which represents the structure of an HTML document as a tree of objects. - -```mermaid -graph TD - A[Document] --> B[html] - B --> C[head] - B --> D[body] - C --> E[title] - D --> F[div id='content'] - F --> G[h1] - F --> H[p] - F --> I[ul] - I --> J[li] - I --> K[li] - I --> L[li] -``` - -The DOM is: - -1. **Hierarchical**: Elements nest within other elements, forming parent-child relationships -2. **Manipulable**: JavaScript can modify the structure, content, and styling -3. **Queryable**: Elements can be located using various selection strategies -4. 
**Event-driven**: Elements can respond to user interactions and other events - -### Chrome DevTools Protocol and DOM Access - -Pydoll interacts with the DOM through the Chrome DevTools Protocol (CDP), which provides methods for querying and manipulating the document: - -| CDP Domain | Purpose | Example Commands | -|------------|---------|------------------| -| DOM | Access to document structure | `querySelector`, `getDocument` | -| Runtime | JavaScript execution in page context | `evaluate`, `callFunctionOn` | -| Page | Page-level operations | `navigate`, `captureScreenshot` | - -The CDP allows both direct DOM manipulation through the DOM domain and JavaScript-based interaction through the Runtime domain. FindElementsMixin leverages both approaches for robust element selection. - -## Core API Methods - -Pydoll introduces two primary methods for element finding that provide a more intuitive and flexible approach: - -### The find() Method - -The `find()` method provides an intuitive way to locate elements using common HTML attributes: - -```python -# Find by ID -element = await tab.find(id="username") - -# Find by class name -element = await tab.find(class_name="submit-button") - -# Find by tag name -element = await tab.find(tag_name="button") - -# Find by text content -element = await tab.find(text="Click here") - -# Find by name attribute -element = await tab.find(name="email") - -# Combine multiple attributes -element = await tab.find(tag_name="input", name="password", type="password") - -# Find all matching elements -elements = await tab.find(class_name="item", find_all=True) - -# Find with timeout -element = await tab.find(id="dynamic-content", timeout=10) -``` - -#### Method Signature - -```python -async def find( - self, - id: Optional[str] = None, - class_name: Optional[str] = None, - name: Optional[str] = None, - tag_name: Optional[str] = None, - text: Optional[str] = None, - timeout: int = 0, - find_all: bool = False, - raise_exc: bool = True, - 
**attributes, -) -> Union[WebElement, list[WebElement], None]: -``` - -#### Parameters - -| Parameter | Type | Description | -|-----------|------|-------------| -| `id` | `Optional[str]` | Element ID attribute value | -| `class_name` | `Optional[str]` | CSS class name to match | -| `name` | `Optional[str]` | Element name attribute value | -| `tag_name` | `Optional[str]` | HTML tag name (e.g., "div", "input") | -| `text` | `Optional[str]` | Text content to match within element | -| `timeout` | `int` | Maximum seconds to wait for elements to appear | -| `find_all` | `bool` | If True, returns all matches; if False, first match only | -| `raise_exc` | `bool` | Whether to raise exception if no elements found | -| `**attributes` | `dict` | Additional HTML attributes to match | - -### The query() Method - -The `query()` method provides direct access using CSS selectors or XPath expressions: - -```python -# CSS selectors -element = await tab.query("div.content > p.intro") -element = await tab.query("#login-form input[type='password']") - -# XPath expressions -element = await tab.query("//div[@id='content']/p[contains(text(), 'Welcome')]") -element = await tab.query("//button[text()='Submit']") - -# ID shorthand (automatically detected) -element = await tab.query("#username") - -# Class shorthand (automatically detected) -element = await tab.query(".submit-button") - -# Find all matching elements -elements = await tab.query("div.item", find_all=True) - -# Query with timeout -element = await tab.query("#dynamic-content", timeout=10) -``` - -#### Method Signature - -```python -async def query( - self, - expression: str, - timeout: int = 0, - find_all: bool = False, - raise_exc: bool = True -) -> Union[WebElement, list[WebElement], None]: -``` - -#### Parameters - -| Parameter | Type | Description | -|-----------|------|-------------| -| `expression` | `str` | Selector expression (CSS, XPath, ID with #, class with .) 
| -| `timeout` | `int` | Maximum seconds to wait for elements to appear | -| `find_all` | `bool` | If True, returns all matches; if False, first match only | -| `raise_exc` | `bool` | Whether to raise exception if no elements found | - -## Practical Usage Examples - -### Basic Element Finding - -```python -import asyncio -from pydoll.browser.chromium import Chrome - -async def basic_element_finding(): - browser = Chrome() - tab = await browser.start() - - try: - await tab.go_to("https://example.com/login") - - # Find login form elements - username_field = await tab.find(id="username") - password_field = await tab.find(name="password") - submit_button = await tab.find(tag_name="button", type="submit") - - # Interact with elements - await username_field.type_text("user@example.com") - await password_field.type_text("password123") - await submit_button.click() - - finally: - await browser.stop() - -asyncio.run(basic_element_finding()) -``` - -### Advanced Selector Combinations - -```python -async def advanced_selectors(): - browser = Chrome() - tab = await browser.start() - - try: - await tab.go_to("https://example.com/products") - - # Find specific product by combining attributes - product = await tab.find( - tag_name="div", - class_name="product", - data_category="electronics", - data_price_range="500-1000" - ) - - # Find all products in a category - electronics = await tab.find( - class_name="product", - data_category="electronics", - find_all=True - ) - - # Find element by text content - add_to_cart = await tab.find(text="Add to Cart") - - print(f"Found {len(electronics)} electronics products") - - finally: - await browser.stop() -``` - -### Using CSS Selectors and XPath - -```python -async def css_and_xpath_examples(): - browser = Chrome() - tab = await browser.start() - - try: - await tab.go_to("https://example.com/table") - - # CSS selectors - header_cells = await tab.query("table thead th", find_all=True) - first_row = await tab.query("table tbody 
tr:first-child") - - # XPath for complex selections - # Find table cell containing specific text - price_cell = await tab.query("//td[contains(text(), '$')]") - - # Find button in the same row as specific text - edit_button = await tab.query( - "//tr[td[contains(text(), 'John Doe')]]//button[text()='Edit']" - ) - - # Find all rows with price > $100 (using XPath functions) - expensive_items = await tab.query( - "//tr[number(translate(td[3], '$,', '')) > 100]", - find_all=True - ) - - print(f"Found {len(expensive_items)} expensive items") - - finally: - await browser.stop() -``` - -## Waiting Mechanisms - -The FindElementsMixin implements sophisticated waiting mechanisms for handling dynamic content: - -### Timeout-Based Waiting - -```python -async def waiting_examples(): - browser = Chrome() - tab = await browser.start() - - try: - await tab.go_to("https://example.com/dynamic") - - # Wait up to 10 seconds for element to appear - dynamic_content = await tab.find(id="dynamic-content", timeout=10) - - # Wait for multiple elements - items = await tab.find(class_name="item", timeout=5, find_all=True) - - # Handle cases where element might not appear - optional_element = await tab.find( - id="optional-banner", - timeout=3, - raise_exc=False - ) - - if optional_element: - await optional_element.click() - else: - print("Optional banner not found, continuing...") - - finally: - await browser.stop() -``` - -### Error Handling Strategies - -```python -from pydoll.exceptions import ElementNotFound, WaitElementTimeout - -async def robust_element_finding(): - browser = Chrome() - tab = await browser.start() - - try: - await tab.go_to("https://example.com") - - # Strategy 1: Try multiple selectors - submit_button = None - selectors = [ - {"id": "submit"}, - {"class_name": "submit-btn"}, - {"tag_name": "button", "type": "submit"}, - {"text": "Submit"} - ] - - for selector in selectors: - try: - submit_button = await tab.find(**selector, timeout=2) - break - except (ElementNotFound, WaitElementTimeout): - # With a timeout set, a miss surfaces as WaitElementTimeout - continue - - if not submit_button: - raise Exception("Submit button 
not found with any selector") - - # Strategy 2: Graceful degradation - try: - premium_feature = await tab.find(class_name="premium-only", timeout=1) - await premium_feature.click() - except (ElementNotFound, WaitElementTimeout): - # Fall back to basic feature - basic_feature = await tab.find(class_name="basic-feature") - await basic_feature.click() - - finally: - await browser.stop() -``` - -## Selector Strategy Selection - -The FindElementsMixin automatically chooses the most appropriate selector strategy based on the provided parameters: - -### Single Attribute Selection - -When only one attribute is provided, the mixin uses the most efficient selector: - -```python -# These use optimized single-attribute selectors -await tab.find(id="username") # Uses By.ID -await tab.find(class_name="button") # Uses By.CLASS_NAME -await tab.find(tag_name="input") # Uses By.TAG_NAME -await tab.find(name="email") # Uses By.NAME -``` - -### Multiple Attribute Selection - -When multiple attributes are provided, the mixin builds an XPath expression: - -```python -# This builds XPath: //input[@type='password' and @name='password'] -await tab.find(tag_name="input", type="password", name="password") - -# This builds XPath: //div[@class='product' and @data-id='123'] -await tab.find(tag_name="div", class_name="product", data_id="123") -``` - -### Expression Type Detection - -The `query()` method automatically detects the expression type: - -```python -# Detected as XPath (starts with //) -await tab.query("//div[@id='content']") - -# Detected as ID (starts with #) -await tab.query("#username") - -# Detected as class (starts with . 
but not ./) -await tab.query(".submit-button") - -# Detected as CSS selector (default) -await tab.query("div.content > p") -``` - -## Internal Architecture - -The FindElementsMixin implements element location through a sophisticated internal architecture: - -```mermaid -classDiagram - class FindElementsMixin { - +find(**kwargs) WebElement|List[WebElement] - +query(expression) WebElement|List[WebElement] - +find_or_wait_element(by, value, timeout) WebElement|List[WebElement] - -_find_element(by, value) WebElement - -_find_elements(by, value) List[WebElement] - -_get_by_and_value(**kwargs) Tuple[By, str] - -_build_xpath(**kwargs) str - -_get_expression_type(expression) By - } - - class Tab { - -_connection_handler - +go_to(url) - +execute_script(script) - } - - class WebElement { - -_object_id - -_connection_handler - +click() - +type(text) - +text - } - - Tab --|> FindElementsMixin : inherits - FindElementsMixin ..> WebElement : creates -``` - -### Core Internal Methods - -#### find_or_wait_element() - -The core method that handles both immediate finding and waiting: - -```python -async def find_or_wait_element( - self, - by: By, - value: str, - timeout: int = 0, - find_all: bool = False, - raise_exc: bool = True, -) -> Union[WebElement, list[WebElement], None]: - """ - Core element finding method with optional waiting capability. - - Searches for elements with flexible waiting. If timeout specified, - repeatedly attempts to find elements with 0.5s delays until success or timeout. - """ -``` - -This method: -1. Determines the appropriate find method (`_find_element` or `_find_elements`) -2. Implements polling logic with 0.5-second intervals -3. Handles timeout and exception raising logic -4. 
Returns appropriate results based on `find_all` parameter - -#### _get_by_and_value() - -Converts high-level parameters into CDP-compatible selector strategies: - -```python -def _get_by_and_value( - self, - by_map: dict[str, By], - id: Optional[str] = None, - class_name: Optional[str] = None, - name: Optional[str] = None, - tag_name: Optional[str] = None, - text: Optional[str] = None, - **attributes, -) -> tuple[By, str]: -``` - -This method: -1. Identifies which attributes were provided -2. For single attributes, returns the appropriate `By` enum and value -3. For multiple attributes, builds an XPath expression using `_build_xpath()` - -#### _build_xpath() - -Constructs complex XPath expressions from multiple criteria: - -```python -@staticmethod -def _build_xpath( - id: Optional[str] = None, - class_name: Optional[str] = None, - name: Optional[str] = None, - tag_name: Optional[str] = None, - text: Optional[str] = None, - **attributes, -) -> str: -``` - -This method: -1. Builds the base XPath (`//tag` or `//*`) -2. Adds conditions for each provided attribute -3. Handles special cases like class names and text content -4. 
Combines conditions with `and` operators - -### CDP Command Generation - -The mixin generates appropriate CDP commands based on selector type: - -#### For CSS Selectors - -```python -def _get_find_element_command(self, by: By, value: str, object_id: str = ''): - # Converts to CSS selector format - if by == By.CLASS_NAME: - selector = f'.{escaped_value}' - elif by == By.ID: - selector = f'#{escaped_value}' - - # Uses DOM.querySelector or Runtime.evaluate -``` - -#### For XPath Expressions - -```python -def _get_find_element_by_xpath_command(self, xpath: str, object_id: str): - # Uses Runtime.evaluate with document.evaluate() - script = Scripts.FIND_XPATH_ELEMENT.replace('{escaped_value}', escaped_value) - command = RuntimeCommands.evaluate(expression=script) -``` - -## Performance Considerations - -### Selector Efficiency - -Different selector types have varying performance characteristics: - -| Selector Type | Performance | Use Case | -|---------------|-------------|----------| -| ID | Fastest | Unique elements with ID attributes | -| CSS Class | Fast | Elements with specific styling | -| Tag Name | Fast | When you need all elements of a type | -| CSS Selector | Good | Complex but common patterns | -| XPath | Slower | Complex relationships and text matching | - -### Optimization Strategies - -```python -# Good: Use ID when available -element = await tab.find(id="unique-element") - -# Good: Use simple CSS selectors -element = await tab.query("#form .submit-button") - -# Avoid: Complex XPath when CSS would work -# element = await tab.query("//div[@id='form']//button[@class='submit-button']") - -# Good: Combine attributes efficiently -element = await tab.find(tag_name="input", type="email", required=True) - -# Good: Use find_all=False when you only need the first match -first_item = await tab.find(class_name="item", find_all=False) -``` - -### Waiting Best Practices - -```python -# Good: Use appropriate timeouts -quick_element = await tab.find(id="static-content", 
timeout=2) -slow_element = await tab.find(id="ajax-content", timeout=10) - -# Good: Handle optional elements gracefully -optional = await tab.find(class_name="optional", timeout=1, raise_exc=False) - -# Good: Use specific selectors to reduce false positives -specific_button = await tab.find( - tag_name="button", - class_name="submit", - type="submit", - timeout=5 -) -``` - -## Error Handling - -The FindElementsMixin provides comprehensive error handling: - -### Exception Types - -```python -from pydoll.exceptions import ElementNotFound, WaitElementTimeout - -try: - element = await tab.find(id="missing-element") -except ElementNotFound: - print("Element not found immediately") - -try: - element = await tab.find(id="slow-element", timeout=10) -except WaitElementTimeout: - print("Element did not appear within timeout") -``` - -### Graceful Handling - -```python -# Option 1: Use raise_exc=False -element = await tab.find(id="optional-element", raise_exc=False) -if element: - await element.click() - -# Option 2: Try-except with fallback -try: - primary_button = await tab.find(id="primary-action") - await primary_button.click() -except ElementNotFound: - # Fallback to alternative selector - fallback_button = await tab.find(class_name="action-button") - await fallback_button.click() -``` - -## Integration with WebElement - -Found elements are returned as WebElement instances that provide rich interaction capabilities: - -```python -# Find and interact with form elements -username = await tab.find(name="username") -await username.type_text("user@example.com") - -password = await tab.find(type="password") -await password.type_text("secretpassword") - -submit = await tab.find(tag_name="button", type="submit") -await submit.click() - -# Get element properties -text_content = await username.text -is_visible = await username.is_visible() -attribute_value = await username.get_attribute("placeholder") -``` - -## Conclusion - -The FindElementsMixin serves as the foundation for 
element interaction in Pydoll, providing a powerful and intuitive API for locating DOM elements. The combination of the `find()` and `query()` methods offers flexibility for both simple and complex element selection scenarios. - -Key advantages of the FindElementsMixin design: - -1. **Intuitive API**: The `find()` method uses natural HTML attribute names -2. **Flexible Selection**: Support for CSS selectors, XPath, and attribute combinations -3. **Robust Waiting**: Built-in timeout and polling mechanisms -4. **Performance Optimization**: Automatic selection of the most efficient selector strategy -5. **Error Handling**: Comprehensive exception handling with graceful degradation options - -By understanding the capabilities and patterns of the FindElementsMixin, you can create robust and maintainable browser automation that handles the complexities of modern web applications. \ No newline at end of file diff --git a/docs/deep-dive/index.md b/docs/deep-dive/index.md deleted file mode 100644 index 2c5ee1c..0000000 --- a/docs/deep-dive/index.md +++ /dev/null @@ -1,28 +0,0 @@ -# Deep Dive - -Welcome to the in-depth technical documentation section of Pydoll. This area is dedicated to developers who want to understand the internal workings of the library, its architectural design, and the technical principles behind its operation. - -## What You'll Find Here - -Unlike the introduction and features sections that focus on "how to use" Pydoll, the Deep Dive section explores "how it works" and the "why" behind the design and implementation decisions. 
- -In this section, you'll find detailed documentation about: - -- **Chrome DevTools Protocol (CDP)** - How Pydoll communicates with browsers without relying on webdrivers -- **Internal Architecture** - The layered structure that makes Pydoll efficient and extensible -- **Domain Implementations** - Technical details of each functional domain (Browser, Page, WebElement) -- **Event System** - How the reactive event system works internally -- **Performance Optimizations** - Details about how we achieve high asynchronous performance - -## Who This Section Is For - -This documentation is especially useful for: - -- Developers looking to contribute code to Pydoll -- Engineers creating advanced integrations or extensions -- Technical users who need to understand the execution model for debugging -- Anyone interested in the technical aspects of browser automation - -Each topic in this section is self-contained, so you can navigate directly to the areas of greatest interest using the navigation menu. - -Explore the different domains and technical features using the sidebar links to dive deep into Pydoll's implementation details. \ No newline at end of file diff --git a/docs/deep-dive/network-capabilities.md b/docs/deep-dive/network-capabilities.md deleted file mode 100644 index 948f293..0000000 --- a/docs/deep-dive/network-capabilities.md +++ /dev/null @@ -1,757 +0,0 @@ -# Network Capabilities - -Pydoll provides powerful capabilities for monitoring, intercepting, and manipulating network traffic during browser automation. These features give you fine-grained control over how your browser communicates with the web, enabling advanced use cases like request modification, response analysis, and network optimization. - -## Network Architecture Overview - -Pydoll's network capabilities are built on top of the Chrome DevTools Protocol (CDP), which provides a direct interface to the browser's internal networking stack. 
This architecture eliminates the limitations of traditional proxy-based approaches and enables real-time monitoring and modification of requests and responses. - -```mermaid -flowchart TB - subgraph Browser["Chrome/Edge Browser"] - Net["Network Stack"] --> CDP["Chrome DevTools Protocol"] - end - - subgraph Pydoll["Pydoll Library"] - CDP --> NetMon["Network Monitoring"] - CDP --> Interception["Request Interception"] - CDP --> Headers["Headers Manipulation"] - CDP --> Body["Body Modification"] - CDP --> Emulation["Network Condition Emulation"] - end - - subgraph UserCode["User Automation Code"] - NetMon --> Analysis["Traffic Analysis"] - Interception --> Auth["Authentication Handling"] - Headers --> CustomHeaders["Custom Headers Injection"] - Body --> DataModification["Request/Response Data Modification"] - Emulation --> Testing["Network Condition Testing"] - end - - class Browser,Pydoll,UserCode rounded - - - class Browser blue - class Pydoll green - class UserCode orange -``` - -The network capabilities in Pydoll can be organized into two main categories: - -1. **Network Monitoring**: Passive observation of network activity -2. **Request Interception**: Active modification of network requests and responses - -## Network Monitoring - -Network monitoring allows you to observe and analyze the network activity of your browser session without modifying it. This is useful for understanding how a website loads resources, detecting API endpoints, or troubleshooting performance issues. 
- -### Enabling Network Monitoring - -To start monitoring network activity, you need to enable network events: - -```python -import asyncio -from pydoll.browser.chromium import Chrome -from pydoll.protocol.network.events import NetworkEvent -from functools import partial - -async def main(): - async with Chrome() as browser: - tab = await browser.start() - - # Enable network monitoring - await tab.enable_network_events() - - # Navigate to a page - await tab.go_to('https://example.com') - - print("Network monitoring enabled and page loaded") - -asyncio.run(main()) -``` - -When you enable network events, Pydoll automatically captures information about all network requests, including: - -- URLs -- HTTP methods -- Request headers -- Status codes -- Response sizes -- Content types -- Timing information - -### Network Event Callbacks - -You can register callbacks to be notified about specific network events in real-time: - -```python -from pydoll.protocol.network.events import NetworkEvent -from functools import partial - -# Define a callback to handle request events -async def on_request(tab, event): - url = event['params']['request']['url'] - method = event['params']['request']['method'] - - print(f"{method} request to: {url}") - - # You can access request headers - headers = event['params']['request'].get('headers', {}) - if 'content-type' in headers: - print(f"Content-Type: {headers['content-type']}") - -# Define a callback to handle response events -async def on_response(tab, event): - url = event['params']['response']['url'] - status = event['params']['response']['status'] - - print(f"Response from {url}: Status {status}") - - # Extract response timing information - timing = event['params']['response'].get('timing') - if timing: - total_time = timing['receiveHeadersEnd'] - timing['requestTime'] - print(f"Request completed in {total_time:.2f}s") - -async def main(): - async with Chrome() as browser: - tab = await browser.start() - - # Register the callbacks - await 
tab.enable_network_events() - await tab.on(NetworkEvent.REQUEST_WILL_BE_SENT, partial(on_request, tab)) - await tab.on(NetworkEvent.RESPONSE_RECEIVED, partial(on_response, tab)) - - # Navigate to trigger network activity - await tab.go_to('https://example.com') - - # Wait to see network activity - await asyncio.sleep(5) - -asyncio.run(main()) -``` - -### Key Network Events - -Pydoll provides access to a wide range of network-related events: - -| Event Constant | Description | Useful Information Available | -|----------------|-------------|------------------------------| -| `NetworkEvent.REQUEST_WILL_BE_SENT` | Fired when a request is about to be sent | URL, method, headers, POST data | -| `NetworkEvent.RESPONSE_RECEIVED` | Fired when HTTP response is available | Status code, headers, MIME type, timing | -| `NetworkEvent.LOADING_FAILED` | Fired when a request fails | Error information, canceled status | -| `NetworkEvent.LOADING_FINISHED` | Fired when a request completes | Encoding, compressed data size | -| `NetworkEvent.RESOURCE_CHANGED_PRIORITY` | Fired when resource loading priority changes | New priority level | -| `NetworkEvent.WEBSOCKET_CREATED` | Fired when a WebSocket is created | URL, initiator | -| `NetworkEvent.WEBSOCKET_FRAME_SENT` | Fired when a WebSocket frame is sent | Payload data | -| `NetworkEvent.WEBSOCKET_FRAME_RECEIVED` | Fired when a WebSocket frame is received | Response data | - -### Advanced Network Monitoring Example - -Here's a more comprehensive example that tracks various network metrics: - -```python -import asyncio -import time -from pydoll.browser.chromium import Chrome -from pydoll.protocol.network.events import NetworkEvent -from functools import partial - -async def main(): - # Statistics counters - stats = { - 'total_requests': 0, - 'completed_requests': 0, - 'failed_requests': 0, - 'bytes_received': 0, - 'request_types': {}, - 'status_codes': {}, - 'domains': {}, - 'start_time': time.time() - } - - async def update_dashboard(): - 
while True: - # Calculate elapsed time - elapsed = time.time() - stats['start_time'] - - # Clear console and print stats - print("\033c", end="") # Clear console - print(f"Network Activity Dashboard - Running for {elapsed:.1f}s") - print(f"Total Requests: {stats['total_requests']}") - print(f"Completed: {stats['completed_requests']} | Failed: {stats['failed_requests']}") - print(f"Data Received: {stats['bytes_received'] / 1024:.1f} KB") - - print("\nRequest Types:") - for rtype, count in sorted(stats['request_types'].items(), key=lambda x: x[1], reverse=True): - print(f" {rtype}: {count}") - - print("\nStatus Codes:") - for code, count in sorted(stats['status_codes'].items()): - print(f" {code}: {count}") - - print("\nTop Domains:") - top_domains = sorted(stats['domains'].items(), key=lambda x: x[1], reverse=True)[:5] - for domain, count in top_domains: - print(f" {domain}: {count}") - - await asyncio.sleep(1) - - # Start the dashboard updater task - dashboard_task = asyncio.create_task(update_dashboard()) - - async with Chrome() as browser: - tab = await browser.start() - - # Track request starts - async def on_request_sent(tab, event): - stats['total_requests'] += 1 - - # Track request type - resource_type = event['params'].get('type', 'Other') - stats['request_types'][resource_type] = stats['request_types'].get(resource_type, 0) + 1 - - # Track domain - url = event['params']['request']['url'] - try: - from urllib.parse import urlparse - domain = urlparse(url).netloc - stats['domains'][domain] = stats['domains'].get(domain, 0) + 1 - except: - pass - - # Track responses - async def on_response(tab, event): - status = event['params']['response']['status'] - stats['status_codes'][status] = stats['status_codes'].get(status, 0) + 1 - - # Track request completions - async def on_loading_finished(tab, event): - stats['completed_requests'] += 1 - if 'encodedDataLength' in event['params']: - stats['bytes_received'] += event['params']['encodedDataLength'] - - # Track 
failures - async def on_loading_failed(tab, event): - stats['failed_requests'] += 1 - - # Register callbacks - await tab.enable_network_events() - await tab.on(NetworkEvent.REQUEST_WILL_BE_SENT, partial(on_request_sent, tab)) - await tab.on(NetworkEvent.RESPONSE_RECEIVED, partial(on_response, tab)) - await tab.on(NetworkEvent.LOADING_FINISHED, partial(on_loading_finished, tab)) - await tab.on(NetworkEvent.LOADING_FAILED, partial(on_loading_failed, tab)) - - # Navigate to a page with lots of requests - await tab.go_to('https://news.ycombinator.com') - - # Keep the session open for 60 seconds so the dashboard can collect traffic - await asyncio.sleep(60) - - # Clean up - dashboard_task.cancel() - -asyncio.run(main()) -``` - -## Request Interception and Modification - -Request interception is where Pydoll's network capabilities truly shine. Unlike traditional browser automation tools that can only observe network traffic, Pydoll allows you to intercept and modify network requests before they are sent. - -### The Fetch Domain - -The Fetch domain in the Chrome DevTools Protocol provides advanced functionality for intercepting and manipulating network requests. Pydoll exposes this functionality through a clean API that makes it easy to implement complex network manipulation scenarios. 
- -```mermaid -sequenceDiagram - participant App as Application Code - participant Pydoll as Pydoll Library - participant Browser as Browser - participant Server as Web Server - - App->>Pydoll: Enable fetch events - Pydoll->>Browser: FetchCommands.enable() - Browser-->>Pydoll: Enabled - - App->>Pydoll: Register callback for REQUEST_PAUSED - - App->>Pydoll: Navigate to URL - Pydoll->>Browser: Navigate command - Browser->>Browser: Initiates request - Browser->>Pydoll: Fetch.requestPaused event - Pydoll->>App: Execute callback - - App->>Pydoll: Modify and continue request - Pydoll->>Browser: browser.continue_request() with modifications - Browser->>Server: Modified request - - Server-->>Browser: Response - Browser-->>Pydoll: Complete - Pydoll-->>App: Continue execution -``` - -### Enabling Request Interception - -To intercept requests, you need to enable the Fetch domain: - -```python -import asyncio -from pydoll.browser.chromium import Chrome -from pydoll.protocol.fetch.events import FetchEvent -from functools import partial - -async def main(): - async with Chrome() as browser: - tab = await browser.start() - - # Define a request interceptor - async def intercept_request(tab, event): - request_id = event['params']['requestId'] - request = event['params']['request'] - url = request['url'] - - print(f"Intercepted request to: {url}") - - # You must continue the request to proceed - await browser.continue_request(request_id) - - # Enable fetch events and register the interceptor - await tab.enable_fetch_events() - await tab.on(FetchEvent.REQUEST_PAUSED, partial(intercept_request, tab)) - - # Navigate to a page - await tab.go_to('https://example.com') - -asyncio.run(main()) -``` - -!!! warning "Always Continue Intercepted Requests" - When intercepting requests, you must always call `browser.continue_request()`, `browser.fail_request()`, or `browser.fulfill_request()` to resolve the intercepted request. 
If you don't, the browser will hang, waiting for a resolution of the intercepted request. - -### Interception Scope and Resource Types - -You can limit the scope of request interception to specific resource types: - -```python -from pydoll.constants import ResourceType - -# Intercept all requests (could be resource-intensive) -await tab.enable_fetch_events() - -# Intercept only document (HTML) requests -await tab.enable_fetch_events(resource_type=ResourceType.DOCUMENT) - -# Intercept only XHR/fetch API requests -await tab.enable_fetch_events(resource_type=ResourceType.XHR) - -# Intercept only image requests -await tab.enable_fetch_events(resource_type=ResourceType.IMAGE) -``` - -Resource types available for interception: - -| Resource Type | Description | Common Examples | -|---------------|-------------|----------------| -| `ResourceType.DOCUMENT` | Main HTML documents | HTML pages, iframes | -| `ResourceType.STYLESHEET` | CSS files | .css files | -| `ResourceType.IMAGE` | Image resources | .jpg, .png, .gif, .webp | -| `ResourceType.MEDIA` | Media files | .mp4, .webm, audio files | -| `ResourceType.FONT` | Font files | .woff, .woff2, .ttf | -| `ResourceType.SCRIPT` | JavaScript files | .js files | -| `ResourceType.TEXTTRACK` | Text track files | .vtt, .srt (captions, subtitles) | -| `ResourceType.XHR` | XMLHttpRequest calls | API calls, AJAX requests | -| `ResourceType.FETCH` | Fetch API requests | Modern API calls | -| `ResourceType.EVENTSOURCE` | Server-sent events | Stream connections | -| `ResourceType.WEBSOCKET` | WebSocket connections | Real-time communications | -| `ResourceType.MANIFEST` | Web app manifests | .webmanifest files | -| `ResourceType.OTHER` | Other resource types | Miscellaneous resources | - -### Request Modification Capabilities - -When intercepting requests, you can modify various aspects of the request before it's sent to the server: - -#### 1. 
Modifying URL and Method - -```python -async def redirect_request(tab, event): - request_id = event['params']['requestId'] - request = event['params']['request'] - url = request['url'] - - # Redirect requests for one domain to another - if 'old-domain.com' in url: - new_url = url.replace('old-domain.com', 'new-domain.com') - print(f"Redirecting {url} to {new_url}") - - await browser.continue_request( - request_id=request_id, - url=new_url - ) - # Change GET to POST for specific endpoints - elif '/api/data' in url and request['method'] == 'GET': - print(f"Converting GET to POST for {url}") - - await browser.continue_request( - request_id=request_id, - method='POST' - ) - else: - # Continue normally - await browser.continue_request(request_id) -``` - -#### 2. Adding or Modifying Headers - -```python -async def inject_headers(tab, event): - request_id = event['params']['requestId'] - request = event['params']['request'] - url = request['url'] - - # Get existing headers - headers = request.get('headers', {}) - - # Add or modify headers - custom_headers = [ - {'name': 'X-Custom-Header', 'value': 'CustomValue'}, - {'name': 'Authorization', 'value': 'Bearer your-token-here'}, - {'name': 'User-Agent', 'value': 'Custom User Agent String'}, - ] - - # Add existing headers to the list - for name, value in headers.items(): - custom_headers.append({'name': name, 'value': value}) - - await browser.continue_request( - request_id=request_id, - headers=custom_headers - ) -``` - -#### 3. 
Modifying Request Body - -```python -import json -import time - -async def modify_post_data(tab, event): - request_id = event['params']['requestId'] - request = event['params']['request'] - url = request['url'] - method = request['method'] - - # Only process POST requests to specific endpoints - if method == 'POST' and '/api/submit' in url: - # Get the original post data, if any - original_post_data = request.get('postData', '{}') - - try: - # Parse the original data - data = json.loads(original_post_data) - - # Modify the data - data['additionalField'] = 'injected-value' - data['timestamp'] = int(time.time()) - - # Convert back to string - modified_post_data = json.dumps(data) - - print(f"Modified POST data for {url}") - - await browser.continue_request( - request_id=request_id, - post_data=modified_post_data - ) - except json.JSONDecodeError: - # If not JSON, continue normally - await browser.continue_request(request_id) - else: - # Continue normally for non-POST requests - await browser.continue_request(request_id) -``` - -### Failing and Fulfilling Requests - -Besides continuing requests with modifications, you can also fail requests or fulfill them with custom responses: - -#### Failing Requests - -```python -from pydoll.constants import NetworkErrorReason - -async def block_requests(tab, event): - request_id = event['params']['requestId'] - request = event['params']['request'] - url = request['url'] - - # Block requests to tracking domains - blocked_domains = ['google-analytics.com', 'facebook.com/tr'] - - if any(domain in url for domain in blocked_domains): - print(f"Blocking request to: {url}") - await browser.fail_request(request_id, NetworkErrorReason.BLOCKED_BY_CLIENT) - else: - await browser.continue_request(request_id) -``` - -#### Fulfilling Requests with Custom Responses - -```python -async def mock_api_response(tab, event): - request_id = event['params']['requestId'] - request = event['params']['request'] - url = request['url'] - - # Mock API 
responses - if '/api/user' in url: - mock_response = { - 'id': 123, - 'name': 'Mock User', - 'email': 'mock@example.com' - } - - response_headers = [ - {'name': 'Content-Type', 'value': 'application/json'}, - {'name': 'Access-Control-Allow-Origin', 'value': '*'} - ] - - print(f"Mocking response for: {url}") - - await browser.fulfill_request( - request_id=request_id, - response_code=200, - response_headers=response_headers, - response_body=json.dumps(mock_response) - ) - else: - await browser.continue_request(request_id) -``` - -### Authentication Handling - -The Fetch domain can also intercept authentication challenges, allowing you to automatically handle HTTP authentication: - -```python -async def main(): - async with Chrome() as browser: - tab = await browser.start() - - # Define authentication handler - async def handle_auth(tab, event): - request_id = event['params']['requestId'] - auth_challenge = event['params']['authChallenge'] - - print(f"Authentication required: {auth_challenge['origin']}") - - # Provide credentials - await browser.continue_request_with_auth( - request_id=request_id, - auth_challenge_response='ProvideCredentials', - username="username", - password="password" - ) - - # Enable fetch events with auth handling - await tab.enable_fetch_events(handle_auth=True) - await tab.on(FetchEvent.AUTH_REQUIRED, partial(handle_auth, tab)) - - # Navigate to a page requiring authentication - await tab.go_to('https://protected-site.com') -``` - -## Advanced Network Patterns - -### Comprehensive Request Interception Example - -Here's a complete example that demonstrates various interception techniques: - -```python -import asyncio -import json -from pydoll.browser.chromium import Chrome -from pydoll.protocol.fetch.events import FetchEvent -from pydoll.constants import NetworkErrorReason, ResourceType -from functools import partial - -async def main(): - async with Chrome() as browser: - tab = await browser.start() - - async def comprehensive_interceptor(tab, 
event): - request_id = event['params']['requestId'] - request = event['params']['request'] - url = request['url'] - method = request['method'] - - print(f"Intercepting {method} request to: {url}") - - # Block tracking scripts - if any(tracker in url for tracker in ['google-analytics', 'facebook.com/tr']): - print(f"Blocking tracker: {url}") - await browser.fail_request(request_id, NetworkErrorReason.BLOCKED_BY_CLIENT) - return - - # Mock API responses - if '/api/config' in url: - mock_config = { - 'feature_flags': {'new_ui': True, 'beta_features': True}, - 'api_version': '2.0' - } - - await browser.fulfill_request( - request_id=request_id, - response_code=200, - response_headers=[ - {'name': 'Content-Type', 'value': 'application/json'}, - {'name': 'Cache-Control', 'value': 'no-cache'} - ], - response_body=json.dumps(mock_config) - ) - return - - # Inject custom headers for API requests - if '/api/' in url: - headers = [ - {'name': 'X-Custom-Client', 'value': 'Pydoll-Automation'}, - {'name': 'X-Request-ID', 'value': f'req-{request_id}'} - ] - - # Preserve existing headers - for name, value in request.get('headers', {}).items(): - headers.append({'name': name, 'value': value}) - - await browser.continue_request( - request_id=request_id, - headers=headers - ) - return - - # Continue all other requests normally - await browser.continue_request(request_id) - - # Enable fetch events for XHR and Fetch requests only - await tab.enable_fetch_events(resource_type=ResourceType.XHR) - await tab.on(FetchEvent.REQUEST_PAUSED, partial(comprehensive_interceptor, tab)) - - # Navigate and interact with the page - await tab.go_to('https://example.com') - await asyncio.sleep(5) # Wait for network activity - -asyncio.run(main()) -``` - -## Performance Considerations - -While Pydoll's network capabilities are powerful, there are some performance considerations to keep in mind: - -1. **Selective Interception**: Intercepting all requests can significantly slow down page loading. 
Be selective about which resource types you intercept. - -2. **Memory Management**: Network event callbacks can consume memory if they store large amounts of data. Be mindful of memory usage in long-running automations. - -3. **Callback Efficiency**: Keep your event callbacks efficient, especially for high-frequency events like network requests. Inefficient callbacks can slow down the entire automation process. - -4. **Cleanup**: Always disable network and fetch events when you're done using them to prevent memory leaks. - -```python -# Enable events only when needed -await tab.enable_network_events() -await tab.enable_fetch_events(resource_type=ResourceType.XHR) # Only intercept XHR requests - -# Do your automation work... - -# Clean up when done -await tab.disable_network_events() -await tab.disable_fetch_events() -``` - -## Best Practices - -### 1. Use Resource Type Filtering Effectively - -```python -# Bad: Intercept all requests (performance impact) -await tab.enable_fetch_events() - -# Good: Only intercept the specific resource types you need -await tab.enable_fetch_events(resource_type=ResourceType.XHR) # For API calls -await tab.enable_fetch_events(resource_type=ResourceType.DOCUMENT) # For main documents -``` - -### 2. Always Resolve Intercepted Requests - -```python -# Always resolve every intercepted request -async def intercept_handler(tab, event): - request_id = event['params']['requestId'] - - try: - # Make any modifications needed - custom_headers = [{'name': 'X-Custom', 'value': 'Value'}] - - # Continue the request - await browser.continue_request( - request_id=request_id, - headers=custom_headers - ) - except Exception as e: - print(f"Error in request handler: {e}") - # Always try to continue the request even if there was an error - try: - await browser.continue_request(request_id) - except: - pass -``` - -### 3. 
Implement Proper Error Handling - -```python -async def safe_network_handler(tab, event): - request_id = event['params']['requestId'] - - try: - # Your interception logic here - await process_request(event) - await browser.continue_request(request_id) - except Exception as e: - print(f"Error in request handler: {e}") - # Try to continue the request even if there was an error - try: - await browser.continue_request(request_id) - except: - # If we can't continue, try to fail it gracefully - try: - await browser.fail_request(request_id, NetworkErrorReason.FAILED) - except: - pass -``` - -### 4. Use Partial for Clean Callback Management - -```python -from functools import partial - -# Define your handler with tab object as first parameter -async def handle_request(tab, config, event): - # Now you have access to both tab and custom config - request_id = event['params']['requestId'] - - if config['block_trackers'] and is_tracker(event['params']['request']['url']): - await browser.fail_request(request_id, NetworkErrorReason.BLOCKED_BY_CLIENT) - else: - await browser.continue_request(request_id) - -# Register with partial to pre-bind parameters -config = {"block_trackers": True} -await tab.on( - FetchEvent.REQUEST_PAUSED, - partial(handle_request, tab, config) -) -``` - -## Conclusion - -Pydoll's network capabilities provide unprecedented control over browser network traffic, enabling advanced use cases that go beyond traditional browser automation. Whether you're monitoring API calls, injecting custom headers, or modifying request data, these features can greatly enhance your automation workflows. - -By leveraging the power of the Chrome DevTools Protocol, Pydoll makes it easy to implement sophisticated network monitoring and interception patterns while maintaining high performance and reliability. - -Remember to use these capabilities responsibly and consider the performance implications of extensive network monitoring and interception in your automation scripts. 
diff --git a/docs/deep-dive/tab-domain.md b/docs/deep-dive/tab-domain.md deleted file mode 100644 index 1b304b3..0000000 --- a/docs/deep-dive/tab-domain.md +++ /dev/null @@ -1,830 +0,0 @@ -# Tab Domain - -The Tab domain forms the core of Pydoll's architecture, providing a comprehensive interface for controlling browser tabs and their content. This domain bridges your high-level automation code with the browser's capabilities, enabling everything from basic navigation to complex interaction patterns. - -```mermaid -graph TB - User["User Code"] --> Tab["Tab Domain"] - - subgraph "Core Capabilities" - Tab --> Nav["Navigation"] - Tab --> Elements["Element Operations"] - Tab --> JS["JavaScript Execution"] - Tab --> Events["Event System"] - Tab --> State["Session Management"] - end - - Nav & Elements & JS --> Website["Website"] - Events <--> Website -``` - -## Technical Architecture - -The Tab domain in Pydoll acts as an integration layer between your automation code and multiple Chrome DevTools Protocol (CDP) domains. It's implemented as a concrete class that integrates multiple functional capabilities through composition and inheritance. 
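As a rough illustration of the "composition and inheritance" structure described above, the sketch below uses toy stand-ins. `FakeConnection`, `FinderMixin`, and `MiniTab` are hypothetical names invented for this example and are not Pydoll classes; the command dictionaries only mimic the general shape of CDP messages.

```python
import asyncio


class FakeConnection:
    """Hypothetical stand-in for a CDP connection: records commands instead of sending them."""

    def __init__(self):
        self.sent = []

    async def execute_command(self, command: dict) -> dict:
        self.sent.append(command)
        return {"id": len(self.sent), "result": {}}


class FinderMixin:
    """Inherited behaviour: builds a query command and delegates to the connection."""

    async def query(self, expression: str) -> dict:
        command = {"method": "Runtime.evaluate", "params": {"expression": expression}}
        return await self._connection.execute_command(command)


class MiniTab(FinderMixin):
    """Toy tab: inherits query() from the mixin and owns a connection by composition."""

    def __init__(self, connection: FakeConnection):
        self._connection = connection

    async def go_to(self, url: str) -> dict:
        return await self._connection.execute_command(
            {"method": "Page.navigate", "params": {"url": url}}
        )


async def demo() -> list:
    tab = MiniTab(FakeConnection())
    await tab.go_to("https://example.com")
    await tab.query("document.title")
    # Return the method names in the order they were issued
    return [command["method"] for command in tab._connection.sent]


methods = asyncio.run(demo())
print(methods)
```

The point of the split is that the mixin never opens a connection itself; it only relies on whatever object the host class stores in `_connection`, which is one plausible reading of how element finding and CDP communication stay decoupled.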
- -```mermaid -classDiagram - class Tab { - -_browser: Browser - -_connection_handler: ConnectionHandler - -_target_id: str - -_browser_context_id: Optional[str] - -_page_events_enabled: bool - -_network_events_enabled: bool - -_fetch_events_enabled: bool - -_dom_events_enabled: bool - -_runtime_events_enabled: bool - -_intercept_file_chooser_dialog_enabled: bool - -_cloudflare_captcha_callback_id: Optional[int] - +go_to(url: str, timeout: int) - +refresh() - +execute_script(script: str, element: WebElement) - +find(**kwargs) WebElement|List[WebElement] - +query(expression: str) WebElement|List[WebElement] - +take_screenshot(path: str) - +print_to_pdf(path: str) - +enable_page_events() - +enable_network_events() - +on(event_name: str, callback: callable) - +close() - } - - class FindElementsMixin { - +find(**kwargs) WebElement|List[WebElement] - +query(expression: str) WebElement|List[WebElement] - +find_or_wait_element(by: By, value: str, timeout: int) WebElement|List[WebElement] - } - - class ConnectionHandler { - +execute_command(command: dict) - +register_callback(event_name: str, callback: callable) - } - - class WebElement { - -_connection_handler: ConnectionHandler - -_object_id: str - +click() - +type(text: str) - +get_attribute(name: str) - +text - +is_visible() - } - - Tab --|> FindElementsMixin : inherits - Tab *-- ConnectionHandler : uses - Tab ..> WebElement : creates - WebElement *-- ConnectionHandler : uses -``` - -The design leverages several key patterns: - -1. **Inheritance** - The Tab class inherits from FindElementsMixin to gain element location capabilities -2. **Composition** - It uses a ConnectionHandler to manage CDP communication -3. **Factory Method** - It creates WebElement instances when finding elements in the tab -4. **Command** - It translates high-level methods into CDP commands -5. 
**Observer** - It implements an event system for reacting to browser events - -### CDP Integration - -The Tab domain integrates with multiple CDP domains to provide its functionality: - -| CDP Domain | Purpose | -|------------|---------| -| **Page** | Core page lifecycle and navigation | -| **Runtime** | JavaScript execution in page context | -| **DOM** | Document structure and element access | -| **Network** | Network operations and cookie management | -| **Fetch** | Request interception and modification | -| **Storage** | Cookie and storage management | - -This integration creates a powerful abstraction that simplifies browser automation while providing access to the full capabilities of the underlying protocol. - -```mermaid -sequenceDiagram - participant Client as User Code - participant Tab as Tab Domain - participant CDP as Chrome DevTools Protocol - participant Browser as Browser - - Client->>Tab: await tab.go_to("https://example.com") - Tab->>CDP: Page.navigate - CDP->>Browser: Execute navigation - - Browser-->>CDP: Page.loadEventFired - CDP-->>Tab: Event notification - Tab-->>Client: Navigation completed - - Client->>Tab: await tab.find(id="login") - Tab->>CDP: Runtime.evaluate / DOM.querySelector - CDP->>Browser: Execute DOM query - Browser-->>CDP: Return element - CDP-->>Tab: Element response - Tab->>Tab: Create WebElement - Tab-->>Client: Return WebElement -``` - -## Initialization and State Management - -The Tab class is initialized with parameters from the browser instance: - -```python -def __init__( - self, - browser: 'Browser', - connection_port: int, - target_id: str, - browser_context_id: Optional[str] = None, -): - """ - Initialize tab controller for existing browser tab. - - Args: - browser: Browser instance that created this tab. - connection_port: CDP WebSocket port. - target_id: CDP target identifier for this tab. - browser_context_id: Optional browser context ID. 
- """ - self._browser = browser - self._connection_port = connection_port - self._target_id = target_id - self._connection_handler = ConnectionHandler(connection_port, target_id) - self._page_events_enabled = False - self._network_events_enabled = False - self._fetch_events_enabled = False - self._dom_events_enabled = False - self._runtime_events_enabled = False - self._intercept_file_chooser_dialog_enabled = False - self._cloudflare_captcha_callback_id = None - self._browser_context_id = browser_context_id -``` - -The Tab class maintains several state flags to track which event domains are currently enabled. This state management is crucial for: - -1. Preventing duplicate event registrations -2. Accurately reflecting the current capabilities of the tab -3. Enabling proper cleanup when the tab is closed - -## Core Patterns and Usage - -The Tab domain follows a consistent pattern for interaction in Pydoll v2.0+: - -```python -import asyncio -from pydoll.browser.chromium import Chrome - -async def pydoll_example(): - # Create a browser instance and get initial tab - browser = Chrome() - tab = await browser.start() # Returns Tab directly - - try: - # Work with the tab... - await tab.go_to("https://example.com") - - # Find and interact with elements - button = await tab.find(id="submit") - await button.click() - - finally: - # Clean up when done - await browser.stop() - -# Run your example with asyncio -asyncio.run(pydoll_example()) -``` - -Most examples in this documentation assume a browser and tab have already been created and will be properly cleaned up. 
- -## Navigation System - -The Tab domain provides a fluid navigation experience through a combination of methods that abstract the complexities of browser navigation: - -```python -# Navigate to a page with custom timeout -await tab.go_to("https://example.com", timeout=60) - -# Get the current URL -current_url = await tab.current_url -print(f"Current URL: {current_url}") - -# Get the page source -source = await tab.page_source -print(f"Page source length: {len(source)}") - -# Refresh the page -await tab.refresh() -``` - -!!! tip "Advanced Navigation" - For specialized navigation scenarios, you can combine navigation with event listeners: - - ```python - # Listen for network requests during navigation - await tab.enable_network_events() - await tab.on('Network.responseReceived', handle_response) - - # Navigate to the page - await tab.go_to('https://example.com') - ``` - -Under the hood, the navigation system performs several operations: - -1. Sends the navigation command through the connection handler -2. Monitors page load status through periodic JavaScript evaluation -3. Manages timeouts to prevent infinite waits -4. Handles refresh optimization if navigating to the current URL - - -## JavaScript Execution - -The JavaScript execution system in the Tab domain provides two distinct execution modes: - -1. **Global Execution**: Evaluates JavaScript in the global page context -2. 
**Element Context Execution**: Executes JavaScript with an element as the context - -```python -# Execute JavaScript in page context -dimensions = await tab.execute_script(""" - return { - width: window.innerWidth, - height: window.innerHeight, - devicePixelRatio: window.devicePixelRatio - } -""") -print(f"Window dimensions: {dimensions}") - -# Find an element and manipulate it with JavaScript -heading = await tab.find(tag_name="h1") - -# Execute JavaScript with the element as context -await tab.execute_script(""" - // 'argument' refers to the element - argument.style.color = 'red'; - argument.style.fontSize = '32px'; - argument.textContent = 'Modified by JavaScript'; -""", heading) -``` - -!!! warning "Script Execution Security" - When executing scripts, be aware of security implications: - - - Scripts run with the full permissions of the page - - Input validation is crucial if script content includes user data - - Consider using element methods instead of scripts for standard operations - -The implementation transforms the provided JavaScript code and parameters to match the CDP requirements: - -1. For global execution: - - The script is sent directly to Runtime.evaluate -2. For element context execution: - - The script is wrapped in a function - - 'argument' references are replaced with 'this' - - The function is called with the element's objectId as context - -## Session State Management - -The Tab domain implements sophisticated session state management that works with browser contexts: - -```python -# Set cookies for this tab -cookies_to_set = [ - { - "name": "session_id", - "value": "test_session_123", - "domain": "example.com", - "path": "/", - "secure": True, - "httpOnly": True - } -] -await tab.set_cookies(cookies_to_set) - -# Get all cookies accessible from this tab -all_cookies = await tab.get_cookies() -print(f"Number of cookies: {len(all_cookies)}") - -# Delete all cookies from this tab's context -await tab.delete_all_cookies() -``` - -!!! 
info "Tab-Specific Cookie Management" - A powerful feature of Pydoll is the ability to control cookies at the individual Tab level within browser contexts: - - ```python - # Create different contexts for isolation - context1 = await browser.create_browser_context() - context2 = await browser.create_browser_context() - - # Tabs in different contexts have isolated cookies - tab1 = await browser.new_tab("https://example.com", browser_context_id=context1) - tab2 = await browser.new_tab("https://example.com", browser_context_id=context2) - - # Set different cookies for each tab - await tab1.set_cookies([{"name": "user", "value": "user_a", "domain": "example.com"}]) - await tab2.set_cookies([{"name": "user", "value": "user_b", "domain": "example.com"}]) - ``` - - This capability enables: - - Testing user interactions between different account types - - Comparing different user permission levels side-by-side - - Maintaining multiple authenticated sessions simultaneously - -## Content Capture - -The Tab domain provides flexible methods for capturing visual content: - -```python -# Take a screenshot and save it to a file -await tab.take_screenshot("homepage.png") - -# Get a screenshot as base64 (useful for embedding in reports) -screenshot_base64 = await tab.take_screenshot(as_base64=True) - -# Take a high-quality screenshot -await tab.take_screenshot("high_quality.jpg", quality=95) - -# Export page as PDF -await tab.print_to_pdf("homepage.pdf") - -# Export PDF with custom settings -await tab.print_to_pdf( - "custom.pdf", - landscape=True, - print_background=True, - scale=0.8 -) -``` - -!!! info "Supported Screenshot Formats" - Pydoll supports saving screenshots in several formats: - - PNG (.png): Lossless compression, best for UI testing - - JPEG (.jpg/.jpeg): Lossy compression, smaller file size - - If you attempt to use an unsupported format, Pydoll will raise an `InvalidFileExtension` exception. 
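If you would rather catch an unsupported extension before calling `take_screenshot()`, a small pre-check along these lines may help. This is only a sketch: `has_supported_screenshot_extension` is a hypothetical helper, not part of Pydoll, and the extension set is copied from the note above rather than from the library itself. Treating the extension case-insensitively is also an assumption.

```python
from pathlib import Path

# Extensions listed in the note above; treat this set as an assumption,
# not as Pydoll's authoritative list.
SUPPORTED_SCREENSHOT_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def has_supported_screenshot_extension(path: str) -> bool:
    """Hypothetical pre-check: True if the path ends in one of the listed extensions."""
    return Path(path).suffix.lower() in SUPPORTED_SCREENSHOT_EXTENSIONS


for candidate in ["homepage.png", "report.JPG", "capture.bmp", "no_extension"]:
    print(candidate, has_supported_screenshot_extension(candidate))
```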
- -These visual capture capabilities are invaluable for: -- Visual regression testing -- Creating documentation -- Debugging automation scripts -- Archiving page content - -## Event System Overview - -The Tab domain provides a comprehensive event system for monitoring and reacting to browser events: - -```python -# Enable different event domains -await tab.enable_page_events() -await tab.enable_network_events() -await tab.enable_fetch_events() -await tab.enable_dom_events() -await tab.enable_runtime_events() - -# Register event handlers -async def handle_load_event(event): - print("Page loaded!") - -async def handle_network_response(event): - url = event['params']['response']['url'] - print(f"Response received from: {url}") - -await tab.on('Page.loadEventFired', handle_load_event) -await tab.on('Network.responseReceived', handle_network_response) -``` - -### Event Properties - -The Tab class provides convenient properties to check event states: - -```python -# Check which events are enabled -print(f"Page events enabled: {tab.page_events_enabled}") -print(f"Network events enabled: {tab.network_events_enabled}") -print(f"Fetch events enabled: {tab.fetch_events_enabled}") -print(f"DOM events enabled: {tab.dom_events_enabled}") -print(f"Runtime events enabled: {tab.runtime_events_enabled}") -``` - -!!! 
info "Event Categories" - Pydoll supports several event categories, each requiring explicit enabling: - - - **Page Events**: Navigation, loading, errors, dialog handling - - **Network Events**: Requests, responses, WebSockets - - **DOM Events**: Document updates, attribute changes - - **Fetch Events**: Request interception and modification - - **Runtime Events**: JavaScript execution and console messages - -## Advanced Capabilities - -### Cloudflare Captcha Handling - -The Tab domain provides intelligent Cloudflare captcha handling through two distinct approaches: - -```python -# Context manager approach (blocks until captcha is solved) -async with tab.expect_and_bypass_cloudflare_captcha(): - await tab.go_to("https://site-with-cloudflare.com") - # Continue only after captcha is solved - -# Background processing approach -await tab.enable_auto_solve_cloudflare_captcha() -await tab.go_to("https://another-protected-site.com") -# Code continues immediately, captcha solved in background - -# When finished with auto-solving -await tab.disable_auto_solve_cloudflare_captcha() -``` - -### Dialog Management - -Pydoll implements dialog handling through event monitoring and state tracking: - -```python -# Set up a dialog handler -async def handle_dialog(event): - if await tab.has_dialog(): - message = await tab.get_dialog_message() - print(f"Dialog detected: {message}") - await tab.handle_dialog(accept=True) - -# Enable page events to detect dialogs -await tab.enable_page_events() -await tab.on('Page.javascriptDialogOpening', handle_dialog) - -# Trigger an alert dialog -await tab.execute_script("alert('This is a test alert')") -``` - -## Network Analysis Methods - -The Tab domain provides specialized methods for analyzing network traffic and extracting response data. These methods require network events to be enabled first. 
- -### Network Logs Retrieval - -The `get_network_logs()` method provides access to all captured network requests: - -```python -# Enable network monitoring -await tab.enable_network_events() - -# Navigate to trigger network requests -await tab.go_to('https://example.com/api-heavy-page') - -# Get all network logs -all_logs = await tab.get_network_logs() -print(f"Captured {len(all_logs)} network requests") - -# Filter logs by URL content -api_logs = await tab.get_network_logs(filter='api') -static_logs = await tab.get_network_logs(filter='.js') -domain_logs = await tab.get_network_logs(filter='example.com') - -print(f"API requests: {len(api_logs)}") -print(f"JavaScript files: {len(static_logs)}") -print(f"Domain requests: {len(domain_logs)}") -``` - -### Response Body Extraction - -The `get_network_response_body()` method allows extraction of actual response content: - -```python -from functools import partial -from pydoll.protocol.network.events import NetworkEvent - -# Storage for captured responses -captured_responses = {} - -async def capture_api_responses(tab, event): - """Capture response bodies from API calls""" - request_id = event['params']['requestId'] - response = event['params']['response'] - url = response['url'] - - # Only capture API responses - if '/api/' in url and response['status'] == 200: - try: - # Extract the response body - body = await tab.get_network_response_body(request_id) - captured_responses[url] = body - print(f"Captured response from: {url}") - except Exception as e: - print(f"Failed to capture response: {e}") - -# Enable network monitoring and register callback -await tab.enable_network_events() -await tab.on(NetworkEvent.RESPONSE_RECEIVED, partial(capture_api_responses, tab)) - -# Navigate to trigger API calls -await tab.go_to('https://example.com/dashboard') -await asyncio.sleep(3) # Wait for API calls - -print(f"Captured {len(captured_responses)} API responses") -``` - -### Practical Network Analysis Example - -Here's a 
comprehensive example combining both methods for thorough network analysis: - -```python -import asyncio -import json -from functools import partial -from pydoll.browser.chromium import Chrome -from pydoll.protocol.network.events import NetworkEvent - -async def comprehensive_network_analysis(): - async with Chrome() as browser: - tab = await browser.start() - - # Storage for analysis results - analysis_results = { - 'api_responses': {}, - 'failed_requests': [], - 'request_summary': {} - } - - async def analyze_responses(tab, event): - """Analyze network responses""" - request_id = event['params']['requestId'] - response = event['params']['response'] - url = response['url'] - status = response['status'] - - # Track failed requests - if status >= 400: - analysis_results['failed_requests'].append({ - 'url': url, - 'status': status, - 'request_id': request_id - }) - return - - # Capture successful API responses - if '/api/' in url and status == 200: - try: - body = await tab.get_network_response_body(request_id) - - # Try to parse JSON responses - try: - data = json.loads(body) - analysis_results['api_responses'][url] = { - 'data': data, - 'size': len(body), - 'type': 'json' - } - except json.JSONDecodeError: - analysis_results['api_responses'][url] = { - 'data': body, - 'size': len(body), - 'type': 'text' - } - - except Exception as e: - print(f"Failed to capture response from {url}: {e}") - - # Enable monitoring and register callback - await tab.enable_network_events() - await tab.on(NetworkEvent.RESPONSE_RECEIVED, partial(analyze_responses, tab)) - - # Navigate and perform actions - await tab.go_to('https://example.com/complex-app') - await asyncio.sleep(5) # Wait for network activity - - # Get comprehensive logs - all_logs = await tab.get_network_logs() - api_logs = await tab.get_network_logs(filter='api') - - # Generate summary - analysis_results['request_summary'] = { - 'total_requests': len(all_logs), - 'api_requests': len(api_logs), - 'failed_requests': 
len(analysis_results['failed_requests']), - 'captured_responses': len(analysis_results['api_responses']) - } - - # Display results - print("🔍 Network Analysis Results:") - print(f" Total requests: {analysis_results['request_summary']['total_requests']}") - print(f" API requests: {analysis_results['request_summary']['api_requests']}") - print(f" Failed requests: {analysis_results['request_summary']['failed_requests']}") - print(f" Captured responses: {analysis_results['request_summary']['captured_responses']}") - - # Show failed requests - if analysis_results['failed_requests']: - print("\n❌ Failed Requests:") - for failed in analysis_results['failed_requests']: - print(f" {failed['status']} - {failed['url']}") - - # Show captured API data - if analysis_results['api_responses']: - print("\n✅ Captured API Responses:") - for url, info in analysis_results['api_responses'].items(): - print(f" {url} ({info['type']}, {info['size']} bytes)") - - return analysis_results - -# Run the analysis -asyncio.run(comprehensive_network_analysis()) -``` - -### Use Cases for Network Analysis - -These network analysis methods enable powerful automation scenarios: - -**API Testing and Validation:** -```python -# Validate API responses during automated testing -api_logs = await tab.get_network_logs(filter='/api/users') -for log in api_logs: - request_id = log['params']['requestId'] - response_body = await tab.get_network_response_body(request_id) - data = json.loads(response_body) - - # Validate response structure - assert 'users' in data - assert len(data['users']) > 0 -``` - -**Performance Monitoring:** -```python -# Monitor request timing and sizes -all_logs = await tab.get_network_logs() -large_responses = [] - -for log in all_logs: - if 'response' in log['params']: - response = log['params']['response'] - if response.get('encodedDataLength', 0) > 1000000: # > 1MB - large_responses.append({ - 'url': response['url'], - 'size': response['encodedDataLength'] - }) - -print(f"Found 
{len(large_responses)} large responses") -``` - -**Data Extraction:** -```python -# Extract dynamic content loaded via AJAX -await tab.go_to('https://spa-application.com') -await asyncio.sleep(3) # Wait for AJAX calls - -data_logs = await tab.get_network_logs(filter='/data/') -extracted_data = [] - -for log in data_logs: - request_id = log['params']['requestId'] - try: - body = await tab.get_network_response_body(request_id) - data = json.loads(body) - extracted_data.extend(data.get('items', [])) - except Exception: - # Skip responses that cannot be fetched or parsed as JSON - continue - -print(f"Extracted {len(extracted_data)} data items") -``` - -### File Upload Handling - -The Tab domain provides a context manager for handling file uploads: - -```python -# Path to a file to upload -file_path = "document.pdf" - -# Use the context manager to handle file chooser dialog -async with tab.expect_file_chooser(files=file_path): - # Find and click the upload button - upload_button = await tab.find(id="upload-button") - await upload_button.click() -``` - -### IFrame Interaction - -Work with iframes through the Tab domain: - -```python -# Find an iframe element -iframe_element = await tab.find(tag_name="iframe") - -# Get a Tab instance for the iframe -iframe_tab = await tab.get_frame(iframe_element) - -# Interact with content inside the iframe -iframe_button = await iframe_tab.find(id="iframe-button") -await iframe_button.click() -``` - -## Tab Lifecycle Management - -### Closing Tabs - -```python -# Close a specific tab -await tab.close() - -# Note: Tab instance becomes invalid after closing -``` - -### Multiple Tab Management - -```python -# Create multiple tabs -tab1 = await browser.start() # Initial tab -tab2 = await browser.new_tab("https://example.com") -tab3 = await browser.new_tab("https://github.com") - -# Work with different tabs -await tab1.go_to("https://google.com") - -# find() is a coroutine, so await it before calling methods on the element -search_box = await tab2.find(id="search") -await search_box.type_text("Pydoll") -header_search = await tab3.find(class_name="header-search-input") -await header_search.type_text("automation") - -# Close specific tabs when done -await 
tab2.close() -await tab3.close() -``` - -## Performance Optimization - -### Event Optimization - -Enable only the specific event domains necessary for your current task: - -```python -# GOOD: Enable only what you need -await tab.enable_network_events() # Only enable network events - -# BAD: Enabling unnecessary events creates overhead -await tab.enable_page_events() -await tab.enable_network_events() -await tab.enable_dom_events() -await tab.enable_fetch_events() -await tab.enable_runtime_events() -``` - -### Resource Management - -```python -# Use context managers for automatic cleanup -async with Chrome() as browser: - tab = await browser.start() - - # Enable events only when needed - await tab.enable_page_events() - - try: - # Your automation code - await tab.go_to("https://example.com") - finally: - # Events are automatically cleaned up when browser closes - pass -``` - -## Domain Relationships - -Understanding Pydoll's domain architecture helps clarify how the Tab Domain fits into the library's broader ecosystem: - -```mermaid -graph LR - Browser["Browser Domain
(Browser management)"] - Tab["Tab Domain
(Tab interaction)"] - Element["WebElement Domain
(Element interaction)"] - - Browser -->|"creates and manages"| Tab - Tab -->|"locates and creates"| Element -``` - -The **Browser Domain** sits at the top of the hierarchy, responsible for browser lifecycle, connection management, and global configuration. It creates and manages tab instances through methods like `start()` and `new_tab()`. - -The **Tab Domain** acts as the crucial intermediary, operating within the context of a specific browser tab. It exposes methods for navigation, content interaction, JavaScript execution, and event handling. A fundamental aspect is its ability to locate elements within the tab and create WebElement instances. - -The **WebElement Domain** represents specific DOM elements. Each WebElement belongs to a tab and provides specialized methods for interactions such as clicking, typing, or retrieving properties. - -This layered architecture provides several benefits: - -- **Separation of Concerns**: Each domain has a clear, well-defined purpose -- **Reusability**: Components can be used independently when needed -- **Ease of Use**: The API follows a natural flow from browser → tab → element -- **Flexibility**: Multiple tabs can operate within a single browser with independent states - -## Conclusion - -The Tab domain is the central workspace for most Pydoll automation tasks. Its sophisticated architecture integrates multiple CDP domains into a unified API that simplifies complex automation scenarios while maintaining the full power of the Chrome DevTools Protocol. - -The domain's design leverages several architectural patterns: -- Inheritance and composition for code organization -- Command pattern for CDP communication -- Observer pattern for event handling -- Factory pattern for element creation -- Context managers for resource management - -Key advantages of the Tab domain in Pydoll v2.0+: - -1. **Intuitive Element Finding**: Modern `find()` and `query()` methods -2. 
**Browser Context Integration**: Seamless work with isolated browser contexts -3. **Comprehensive Event System**: Full CDP event support with easy enabling/disabling -4. **Advanced Automation**: Built-in captcha handling, dialog management, and file uploads -5. **Performance Optimization**: Selective event enabling and proper resource management - -By understanding the Tab domain's architecture, capabilities, and patterns, you can create sophisticated browser automation scripts that effectively handle navigation, interaction, events, and state management in modern web applications. \ No newline at end of file diff --git a/docs/deep-dive/webelement-domain.md b/docs/deep-dive/webelement-domain.md deleted file mode 100644 index 90db9fa..0000000 --- a/docs/deep-dive/webelement-domain.md +++ /dev/null @@ -1,446 +0,0 @@ -# WebElement Domain - -The WebElement domain is a cornerstone of Pydoll's architecture, providing a rich representation of DOM elements that allows for intuitive and powerful interactions with web page components. This domain bridges the gap between high-level automation code and the underlying DOM elements rendered by the browser. - -```mermaid -graph TB - Client["User Code"] --> Tab["Tab Domain"] - Tab --> FindElement["FindElementsMixin"] - FindElement --> WebElement["WebElement Domain"] - WebElement --> DOM["Browser DOM"] - - WebElement --> Properties["Properties & Attributes"] - WebElement --> Interactions["User Interactions"] - WebElement --> State["Element State"] - WebElement --> TextOperations["Text Operations"] - - class WebElement stroke:#4CAF50,stroke-width:3px -``` - -## Understanding WebElement - -At its core, a WebElement represents a snapshot of a DOM element within a tab. Unlike traditional DOM references in JavaScript, a WebElement in Pydoll is: - -1. **Asynchronous** - All interactions follow Python's async/await pattern -2. **Persistent** - Maintains a reference to the element across page changes -3. 
**Self-contained** - Encapsulates all operations possible on a DOM element -4. **Intelligent** - Implements specialized handling for different element types - -Each WebElement instance maintains several crucial pieces of information: - -```python -class WebElement(FindElementsMixin): - def __init__( - self, - object_id: str, - connection_handler: ConnectionHandler, - method: Optional[str] = None, - selector: Optional[str] = None, - attributes_list: list[str] = [], - ): - self._object_id = object_id - self._search_method = method - self._selector = selector - self._connection_handler = connection_handler - self._attributes: dict[str, str] = {} - self._def_attributes(attributes_list) -``` - -The core components include: -- The `object_id` provides a remote JavaScript reference to the element -- The `connection_handler` enables communication with the browser -- The `_search_method` and `_selector` track how the element was found -- The `_attributes` dictionary stores element attributes - -By inheriting from `FindElementsMixin`, each WebElement can also function as a starting point for finding child elements. 
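The constructor above hands a flat `attributes_list` to `_def_attributes`. A minimal standalone sketch of that kind of parsing follows, assuming the list alternates attribute names and values (the interleaved shape CDP's `DOM.getAttributes` returns). `attributes_to_dict` is a hypothetical helper for illustration, not part of Pydoll, and the `None` default is a suggestion rather than what the class does.

```python
from typing import Optional


def attributes_to_dict(attributes_list: Optional[list[str]] = None) -> dict[str, str]:
    """Convert a flat [name, value, name, value, ...] list into a dict."""
    # Using None instead of [] as the default avoids sharing one list across calls.
    flat = attributes_list or []
    attributes: dict[str, str] = {}
    # Step by two: even indexes hold names, odd indexes hold values.
    # Stopping at len(flat) - 1 skips a dangling name that has no value.
    for i in range(0, len(flat) - 1, 2):
        name = flat[i]
        # 'class' is a Python keyword, so it is exposed as 'class_name'.
        key = 'class_name' if name == 'class' else name
        attributes[key] = flat[i + 1]
    return attributes
```

Keeping the parsing in a pure function like this makes the keyword renaming easy to check without a browser connection.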
- -## Technical Architecture - -The WebElement domain combines several key design patterns to provide a robust and flexible API: - -```mermaid -classDiagram - class WebElement { - -_object_id: str - -_search_method: Optional[str] - -_selector: Optional[str] - -_connection_handler: ConnectionHandler - -_attributes: dict[str, str] - +click() - +click_using_js() - +type_text(text: str) - +insert_text(text: str) - +get_attribute(name: str) - +set_input_files(files: list[str]) - +scroll_into_view() - +take_screenshot(path: str) - +text - +inner_html - +bounds - +value - +id - +class_name - +tag_name - +is_enabled - } - - class FindElementsMixin { - +find(**kwargs) WebElement|List[WebElement] - +query(expression: str) WebElement|List[WebElement] - +find_or_wait_element(by: By, value: str, timeout: int) WebElement|List[WebElement] - } - - class ConnectionHandler { - +execute_command(command: dict) - } - - WebElement --|> FindElementsMixin : inherits - WebElement *-- ConnectionHandler : uses -``` - -The architectural design follows several key principles: - -1. **Command Pattern** - Element interactions are translated into CDP commands -2. **Property System** - Combines synchronous attribute access with asynchronous DOM property retrieval -3. **Mixin Inheritance** - Inherits element finding capabilities through the FindElementsMixin -4. **Bridge Pattern** - Abstracts the CDP protocol details from the user-facing API - -### Attribute Management - -A unique aspect of WebElement's design is how it handles HTML attributes: - -```python -def _def_attributes(self, attributes_list: list): - """ - Defines element attributes from a flat list of key-value pairs. - """ - for i in range(0, len(attributes_list), 2): - key = attributes_list[i] - key = key if key != 'class' else 'class_name' - value = attributes_list[i + 1] - self._attributes[key] = value -``` - -This approach: -1. Processes attributes during element creation -2. Provides fast, synchronous access to common attributes -3. 
Handles Python reserved keywords (like `class` → `class_name`) -4. Forms the basis for the element's string representation - -!!! info "Attribute vs. Property Access" - WebElement provides two complementary ways to access element data: - - - **Attribute Dictionary**: Fast, synchronous access to HTML attributes available at element creation - - **Asynchronous Properties**: Dynamic access to current DOM state through CDP commands - - ```python - # Synchronous attribute access (from initial HTML) - element_id = element.id - element_class = element.class_name - - # Asynchronous property access (current state from DOM) - element_text = await element.text - element_bounds = await element.bounds - ``` - -## Core Interaction Patterns - -The WebElement domain provides several categories of interactions: - -### Element Properties - -WebElement offers both synchronous and asynchronous property access: - -```python -# Synchronous properties (from attributes present at element creation) -element_id = element.id -element_class = element.class_name -is_element_enabled = element.is_enabled -element_value = element.value - -# Asynchronous properties (retrieved from live DOM) -element_text = await element.text -element_html = await element.inner_html -element_bounds = await element.bounds -``` - -The implementation balances performance and freshness by determining which properties should be synchronous (static HTML attributes) and which should be asynchronous (dynamic DOM state): - -```python -@property -async def text(self) -> str: - """ - Retrieves the text of the element. - """ - outer_html = await self.inner_html - soup = BeautifulSoup(outer_html, 'html.parser') - return soup.get_text(strip=True) - -@property -def id(self) -> str: - """ - Retrieves the id of the element. 
- """ - return self._attributes.get('id') -``` - -### Mouse Interactions - -WebElement provides multiple ways to interact with elements through mouse events: - -```python -# Standard click at element center -await element.click() - -# Click with offset from center -await element.click(x_offset=10, y_offset=5) - -# Click with longer hold time (like for long press) -await element.click(hold_time=1.0) - -# JavaScript-based click (useful for elements that are difficult to click) -await element.click_using_js() -``` - -The implementation intelligently handles different element types and visibility states: - -```python -async def click( - self, - x_offset: int = 0, - y_offset: int = 0, - hold_time: float = 0.1, -): - """ - Clicks on the element using mouse events. - """ - if self._is_option_tag(): - return await self.click_option_tag() - - if not await self._is_element_visible(): - raise exceptions.ElementNotVisible( - 'Element is not visible on the page.' - ) - - await self.scroll_into_view() - - # Get element position and calculate click point - # ... (position calculation code) - - # Send mouse press and release events - press_command = InputCommands.mouse_press(*position_to_click) - release_command = InputCommands.mouse_release(*position_to_click) - await self._connection_handler.execute_command(press_command) - await asyncio.sleep(hold_time) - await self._connection_handler.execute_command(release_command) -``` - -!!! 
tip "Special Element Handling" - The WebElement implementation includes specialized handling for different element types: - - ```python - # Option elements in dropdowns need special click handling - if self._is_option_tag(): - return await self.click_option_tag() - - # File inputs need special file selection handling - await input_element.set_input_files("path/to/file.pdf") - ``` - -### Keyboard Interactions - -WebElement provides multiple ways to input text into form elements: - -```python -# Quick text insertion (faster but less realistic) -await element.insert_text("Hello, world!") - -# Realistic typing with configurable speed -await element.type_text("Hello, world!", interval=0.1) - -# Individual key events -await element.key_down(Key.CONTROL) -await element.key_down(Key.A) -await element.key_up(Key.A) -await element.key_up(Key.CONTROL) - -# Press and release key combination -await element.press_keyboard_key(Key.ENTER, interval=0.1) -``` - -!!! info "File Upload Handling" - For file input elements, WebElement provides a specialized method: - - ```python - # Upload a single file - await file_input.set_input_files(["path/to/file.pdf"]) - - # Upload multiple files - await file_input.set_input_files(["file1.jpg", "file2.jpg"]) - ``` - -## Visual Capabilities - -### Element Screenshots - -WebElement can capture screenshots of specific elements: - -```python -# Take a screenshot of just this element -await element.take_screenshot("element.png") - -# Take a high-quality screenshot -await element.take_screenshot("element.jpg", quality=95) -``` - -This implementation involves: -1. Getting the element's bounds using JavaScript -2. Creating a clip region for the screenshot -3. Taking a screenshot of just that region -4. Saving the image to the specified path - -```python -async def take_screenshot(self, path: str, quality: int = 100): - """ - Capture screenshot of this element only. - - Automatically scrolls element into view before capturing. 
- """ - bounds = await self.get_bounds_using_js() - clip = Viewport( - x=bounds['x'], - y=bounds['y'], - width=bounds['width'], - height=bounds['height'], - scale=1, - ) - screenshot = await self._connection_handler.execute_command( - PageCommands.capture_screenshot( - format=ScreenshotFormat.JPEG, clip=clip, quality=quality - ) - ) - async with aiofiles.open(path, 'wb') as file: - image_bytes = decode_base64_to_bytes(screenshot['result']['data']) - await file.write(image_bytes) -``` - -!!! tip "Multiple Bounds Methods" - WebElement provides two ways to get element bounds: - - ```python - # Using the DOM domain (primary method) - bounds = await element.bounds - - # Fallback using JavaScript (more reliable in some cases) - bounds = await element.get_bounds_using_js() - ``` - -## JavaScript Integration - -WebElement provides seamless integration with JavaScript for operations that require direct DOM interaction: - -```python -# Execute JavaScript in the context of this element -await element._execute_script("this.style.border = '2px solid red';") - -# Get result from JavaScript execution -visibility = await element._is_element_visible() -``` - -The implementation uses the CDP Runtime domain to execute JavaScript with the element as the context: - -```python -async def _execute_script( - self, script: str, return_by_value: bool = False -): - """ - Executes a JavaScript script in the context of this element. 
- """ - return await self._execute_command( - RuntimeCommands.call_function_on( - self._object_id, script, return_by_value - ) - ) -``` - -## Element State Verification - -WebElement provides methods to check the element's visibility and interactability: - -```python -# Check if element is visible -is_visible = await element._is_element_visible() - -# Check if element is the topmost at its position -is_on_top = await element._is_element_on_top() -``` - -These verifications are crucial for reliable automation, ensuring that elements can be interacted with before attempting operations. - -## Position and Scrolling - -The WebElement domain includes methods for positioning and scrolling: - -```python -# Scroll element into view -await element.scroll_into_view() - -# Get element bounds -bounds = await element.bounds -``` - -These capabilities ensure that elements are visible in the viewport before interaction, mimicking how a real user would interact with a page. - -## Performance and Reliability Considerations - -The WebElement domain balances performance and reliability through several key strategies: - -### Smart Fallbacks - -Many methods implement multiple approaches to ensure operations succeed even in challenging scenarios: - -```python -async def click(self, ...): - # Try using CDP mouse events first - # If that fails, fallback to JavaScript click - # If that fails, provide a clear error message -``` - -### Appropriate Context Selection - -The implementation chooses the most appropriate context for each operation: - -| Operation | Approach | Rationale | -|-----------|----------|-----------| -| Get Text | Parse HTML with BeautifulSoup | More accurate text extraction | -| Click | Mouse events via CDP | Most realistic user simulation | -| Select Option | Specialized JavaScript | Required for dropdown elements | -| Check Visibility | JavaScript | Most reliable across browser variations | - -### Command Batching - -Where possible, operations are combined to reduce 
round-trips to the browser: - -```python -# Get element bounds in a single operation -bounds = await element.get_bounds_using_js() - -# Calculate position in local code without additional browser calls -position_to_click = ( - bounds['x'] + bounds['width'] / 2, - bounds['y'] + bounds['height'] / 2, -) -``` - -## Conclusion - -The WebElement domain provides a comprehensive and intuitive interface for interacting with elements in a web page. By encapsulating the complexities of DOM interaction, event handling, and state management, it allows automation code to focus on high-level tasks rather than low-level details. - -The domain demonstrates several key design principles: - -1. **Abstraction** - Hides the complexity of CDP commands behind a clean API -2. **Specialization** - Provides unique handling for different element types -3. **Hybrid Access** - Balances synchronous and asynchronous operations for optimal performance -4. **Resilience** - Implements fallback strategies for common operations - -When used in conjunction with the Tab domain and Browser domain, WebElement creates a powerful toolset for web automation that handles the complexities of modern web applications while providing a straightforward and reliable API for developers. 
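The click-point arithmetic shown under Command Batching, combined with the `x_offset` and `y_offset` parameters of `click()`, can be sketched as a pure function. This is an illustration under stated assumptions, not Pydoll's actual code: `click_point` is a hypothetical name, and the bounds dictionary is assumed to carry `x`, `y`, `width`, and `height` as in the snippets above.

```python
def click_point(
    bounds: dict[str, float],
    x_offset: float = 0,
    y_offset: float = 0,
) -> tuple[float, float]:
    """Return the point to click: the element's center shifted by the offsets."""
    # The center is the top-left corner plus half of each dimension.
    center_x = bounds['x'] + bounds['width'] / 2
    center_y = bounds['y'] + bounds['height'] / 2
    # Offsets are measured from the center, matching click(x_offset=..., y_offset=...).
    return (center_x + x_offset, center_y + y_offset)
```

Because the calculation runs locally, a single bounds lookup is the only browser round-trip it needs.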
\ No newline at end of file diff --git a/docs/api/browser/chrome.md b/docs/zh/api/browser/chrome.md similarity index 100% rename from docs/api/browser/chrome.md rename to docs/zh/api/browser/chrome.md diff --git a/docs/api/browser/edge.md b/docs/zh/api/browser/edge.md similarity index 100% rename from docs/api/browser/edge.md rename to docs/zh/api/browser/edge.md diff --git a/docs/api/browser/managers.md b/docs/zh/api/browser/managers.md similarity index 63% rename from docs/api/browser/managers.md rename to docs/zh/api/browser/managers.md index b2af203..4ee1e5c 100644 --- a/docs/api/browser/managers.md +++ b/docs/zh/api/browser/managers.md @@ -1,8 +1,8 @@ -# Browser Managers +# 浏览器管理器 -The managers module provides specialized classes for managing different aspects of browser lifecycle and configuration. +管理器模块提供专门的类来管理浏览器生命周期和配置。 -## Overview +## 总览 Browser managers handle specific responsibilities in browser automation: @@ -15,10 +15,10 @@ Browser managers handle specific responsibilities in browser automation: - "!^_" - "!^__" -## Manager Classes +## 管理器类 -### Browser Process Manager -Manages the browser process lifecycle, including starting, stopping, and monitoring browser processes. +### 浏览器进程管理器 +管理浏览器进程的生命周期,包括启动、停止和监控浏览器进程。 ::: pydoll.browser.managers.browser_process_manager options: @@ -26,8 +26,8 @@ Manages the browser process lifecycle, including starting, stopping, and monitor show_source: false heading_level: 3 -### Browser Options Manager -Handles browser configuration options and command-line arguments. +### 浏览器选项管理器 +处理浏览器配置选项和命令行参数。 ::: pydoll.browser.managers.browser_options_manager options: @@ -35,8 +35,8 @@ Handles browser configuration options and command-line arguments. show_source: false heading_level: 3 -### Proxy Manager -Manages proxy configuration and authentication for browser instances. 
+### 代理管理器 +管理浏览器实例的代理配置和身份验证。 ::: pydoll.browser.managers.proxy_manager options: @@ -44,8 +44,8 @@ Manages proxy configuration and authentication for browser instances. show_source: false heading_level: 3 -### Temporary Directory Manager -Handles creation and cleanup of temporary directories used by browser instances. +### 临时目录管理器 +处理浏览器实例使用的临时目录的创建和清理。 ::: pydoll.browser.managers.temp_dir_manager options: @@ -53,9 +53,8 @@ Handles creation and cleanup of temporary directories used by browser instances. show_source: false heading_level: 3 -## Usage - -Managers are typically used internally by browser classes like `Chrome` and `Edge`. They provide modular functionality that can be composed together: +## 用法 +管理器通常由 Chrome 和 Edge 等浏览器类内部使用。它们提供可组合的模块化功能: ```python from pydoll.browser.managers.proxy_manager import ProxyManager diff --git a/docs/api/browser/options.md b/docs/zh/api/browser/options.md similarity index 100% rename from docs/api/browser/options.md rename to docs/zh/api/browser/options.md diff --git a/docs/api/browser/tab.md b/docs/zh/api/browser/tab.md similarity index 100% rename from docs/api/browser/tab.md rename to docs/zh/api/browser/tab.md diff --git a/docs/zh/api/commands/browser.md b/docs/zh/api/commands/browser.md new file mode 100644 index 0000000..213d842 --- /dev/null +++ b/docs/zh/api/commands/browser.md @@ -0,0 +1,41 @@ +# 浏览器命令 + +浏览器命令提供对浏览器实例及其配置的底层控制。 + +## 概述 + +浏览器命令模块处理浏览器级别的操作,例如版本信息、目标管理和浏览器范围的设置。 + +::: pydoll.commands.browser_commands + options: + show_root_heading: true + show_source: false + heading_level: 2 + filters: + - "!^_" + - "!^__" + +## 用法 + +浏览器命令通常由浏览器类在内部使用,用于管理浏览器实例: + +```python +from pydoll.commands.browser_commands import get_version +from pydoll.connection.connection_handler import ConnectionHandler + +# Get browser version information +connection = ConnectionHandler() +version_info = await get_version(connection) +``` + +## 可用命令 + +浏览器命令模块提供以下功能: + +- 获取浏览器版本和用户代理信息 +- 管理浏览器目标(标签页、窗口) +- 控制浏览器范围的设置和权限 +- 
处理浏览器生命周期事件 + +!!! note "Internal Usage" + These commands are primarily used internally by the `Chrome` and `Edge` browser classes. Direct usage is recommended only for advanced scenarios. \ No newline at end of file diff --git a/docs/zh/api/commands/dom.md b/docs/zh/api/commands/dom.md new file mode 100644 index 0000000..1f444cc --- /dev/null +++ b/docs/zh/api/commands/dom.md @@ -0,0 +1,62 @@ +# DOM命令 + +DOM 命令模块提供了与网页文档对象模型交互的全面功能。 + +## 概述 + +DOM 命令模块是 Pydoll 中最重要的模块之一,它提供了查找、交互和操作网页上的 HTML 元素所需的所有功能。 + +::: pydoll.commands.dom_commands + options: + show_root_heading: true + show_source: false + heading_level: 2 + filters: + - "!^_" + - "!^__" + +## Usage + +DOM commands are used extensively by the `WebElement` class and element finding methods: + +## 用法 + +`WebElement` 类和元素查找方法广泛使用DOM 命令: + +```python +from pydoll.commands.dom_commands import query_selector, get_attributes +from pydoll.connection.connection_handler import ConnectionHandler + +# Find element and get its attributes +connection = ConnectionHandler() +node_id = await query_selector(connection, selector="#username") +attributes = await get_attributes(connection, node_id=node_id) +``` + +## 主要功能 + +DOM 命令模块提供以下功能: + +### 元素定位 +- `query_selector()` - 通过CSS选择器进行元素定位 +- `query_selector_all()` - 通过CSS选择器进行元素定位(查找多个元素) +- `get_document()` - 获取document的根节点 + +### 元素交互 +- `click_element()` - 点击元素 +- `focus_element()` - 焦点置于元素 +- `set_attribute_value()` - 设置元素属性 +- `get_attributes()` - 获取元素属性 + +### 元素信息 +- `get_box_model()` - 获取元素位置和尺寸 +- `describe_node()` - 获取元素详细信息 +- `get_outer_html()` - 获取元素的HTML内容 + +### DOM 操作 +- `remove_node()` - 从DOM节点中删除元素 +- `set_node_value()` - 设置元素值 +- `request_child_nodes()` - 获取子元素 + +!!! tip "High-Level APIs" + While these commands provide powerful low-level access, most users should use the higher-level `WebElement` class methods like `click()`, `type_text()`, and `get_attribute()` which use these commands internally. 
\ No newline at end of file diff --git a/docs/api/commands/fetch.md b/docs/zh/api/commands/fetch.md similarity index 56% rename from docs/api/commands/fetch.md rename to docs/zh/api/commands/fetch.md index 93f1d1d..e287d4e 100644 --- a/docs/api/commands/fetch.md +++ b/docs/zh/api/commands/fetch.md @@ -1,10 +1,10 @@ -# Fetch Commands +# Fetch 命令 -Fetch commands provide advanced network request handling and interception capabilities using the Fetch API domain. +Fetch 命令使用 Fetch API 域提供高级网络请求处理和拦截功能。 -## Overview +## 概述 -The fetch commands module enables sophisticated network request management, including request modification, response interception, and authentication handling. +Fetch 命令模块支持复杂的网络请求管理,包括请求修改、响应拦截和身份验证处理。 ::: pydoll.commands.fetch_commands options: @@ -15,9 +15,9 @@ The fetch commands module enables sophisticated network request management, incl - "!^_" - "!^__" -## Usage +## 用法 -Fetch commands are used for advanced network interception and request handling: +Fetch 命令用于高级网络拦截和请求处理: ```python from pydoll.commands.fetch_commands import enable, request_paused, continue_request @@ -36,36 +36,37 @@ async def handle_paused_request(request_id, request): await continue_request(connection, request_id=request_id) ``` -## Key Functionality +## 关键功能 -The fetch commands module provides functions for: +fetch 命令模块提供以下功能: -### Request Interception -- `enable()` - Enable fetch domain with patterns -- `disable()` - Disable fetch domain -- `continue_request()` - Continue intercepted requests -- `fail_request()` - Fail requests with specific errors +### 请求拦截 +- `enable()` - 激活fetch模式 +- `disable()` - 关闭fetch模式 +- `continue_request()` - 继续请求(放行) +- `fail_request()` - 返回特定错误请求 -### Request Modification -- Modify request headers -- Change request URLs -- Alter request methods (GET, POST, etc.) 
-- Modify request bodies +### 修改请求 +- 修改请求headers +- 更改请求 URL +- 更改请求方法(GET、POST 等) +- 修改请求body -### Response Handling -- `fulfill_request()` - Provide custom responses -- `get_response_body()` - Get response content -- Response header modification -- Response status code control +### 响应处理 +- `fulfill_request()` - 提供自定义响应 +- `get_response_body()` - 获取响应内容 +- 修改响应头 +- 响应状态码控制 -### Authentication -- `continue_with_auth()` - Handle authentication challenges -- Basic authentication support -- Custom authentication flows +### 身份验证 +- `continue_with_auth()` - 处理身份验证挑战 +- 基本身份验证支持 +- 自定义身份验证流程 -## Advanced Features +## 高级功能 + +### 基于模式的拦截 -### Pattern-Based Interception ```python # Intercept specific URL patterns patterns = [ @@ -77,7 +78,7 @@ patterns = [ await enable(connection, patterns=patterns) ``` -### Request Modification +### 请求修改 ```python # Modify intercepted requests async def modify_request(request_id, request): @@ -93,7 +94,7 @@ async def modify_request(request_id, request): ) ``` -### Response Mocking +### 响应模拟 ```python # Mock API responses await fulfill_request( @@ -108,7 +109,7 @@ await fulfill_request( ) ``` -### Authentication Handling +### 身份验证处理 ```python # Handle authentication challenges await continue_with_auth( @@ -122,16 +123,16 @@ await continue_with_auth( ) ``` -## Request Stages +## 请求阶段 -Fetch commands can intercept requests at different stages: +Fetch 命令可以在不同阶段拦截请求: -| Stage | Description | Use Cases | +| 阶段 | 描述 | 用例 | |-------|-------------|-----------| -| Request | Before request is sent | Modify headers, URL, method | -| Response | After response received | Mock responses, modify content | +| 请求 | 请求发送前 | 修改标头、URL 和方法 | +| 响应 | 收到响应后 | 模拟响应,修改内容 | -## Error Handling +## 错误处理 ```python # Fail requests with specific errors @@ -142,12 +143,12 @@ await fail_request( ) ``` -## Integration with Network Commands +## 与网络命令集成 -Fetch commands work alongside network commands but provide more granular control: +Fetch 命令与网络命令协同工作,但提供更精细的控制: -- 
**Network Commands**: Broader network monitoring and control -- **Fetch Commands**: Specific request/response interception and modification +- **网络命令**:更广泛的网络监控和控制 +- **Fetch 命令**:特定的请求/响应拦截和修改 !!! tip "Performance Considerations" Fetch interception can impact page loading performance. Use specific URL patterns and disable when not needed to minimize overhead. \ No newline at end of file diff --git a/docs/zh/api/commands/input.md b/docs/zh/api/commands/input.md new file mode 100644 index 0000000..8306031 --- /dev/null +++ b/docs/zh/api/commands/input.md @@ -0,0 +1,77 @@ +# 输入命令 + +输入命令处理鼠标和键盘交互,提供真人仿真的输入模拟。 + +## 概述 + +输入命令模块提供模拟用户输入的功能,包括鼠标移动、点击、键盘输入和按键操作。 + +::: pydoll.commands.input_commands + options: + show_root_heading: true + show_source: false + heading_level: 2 + filters: + - "!^_" + - "!^__" + +## 用法 + +输入命令由元素交互方法使用,可直接用于高级输入场景: + +```python +from pydoll.commands.input_commands import dispatch_mouse_event, dispatch_key_event +from pydoll.connection.connection_handler import ConnectionHandler + +# Simulate mouse click +connection = ConnectionHandler() +await dispatch_mouse_event( + connection, + type="mousePressed", + x=100, + y=200, + button="left" +) + +# Simulate keyboard typing +await dispatch_key_event( + connection, + type="keyDown", + key="Enter" +) +``` + +## 主要功能 + +输入命令模块提供以下函数: + +### 鼠标事件 +- `dispatch_mouse_event()` - 鼠标点击、移动和滚轮事件 +- 鼠标按键状态(左键、右键、中键) +- 基于坐标的定位 +- 拖放操作 + + +### 键盘事件 +- `dispatch_key_event()` - 键盘按下和释放事件 +- `insert_text()` - 直接插入文本 +- 特殊键处理(Enter、Tab、箭头键等) +- 修饰键(Ctrl、Alt、Shift) + + +### 触摸事件 +- 触摸屏模拟 +- 多点触控手势 +- 触摸坐标和压力控制 + +## 仿真行为 + +输入命令支持仿真行为模式: + +- 平滑的鼠标移动曲线 +- 真实的打字速度和模式 +- 操作之间随机的微延迟 +- 压力感应触摸事件 + +!!! tip "Element Methods" + For most use cases, use the higher-level element methods like `element.click()` and `element.type_text()` which provide a more convenient API and handle common scenarios automatically. 
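A click at the protocol level is a press followed by a release, both sent through CDP's `Input.dispatchMouseEvent`. The sketch below builds that pair as plain dictionaries so the shape is visible. `click_events` is a hypothetical helper, and the parameter names (`type`, `x`, `y`, `button`, `clickCount`) follow the Chrome DevTools Protocol, not necessarily Pydoll's wrapper signatures.

```python
from typing import Any


def click_events(x: float, y: float, button: str = 'left') -> list[dict[str, Any]]:
    """Return the press and release events that together make one click."""
    events = []
    for event_type in ('mousePressed', 'mouseReleased'):
        events.append({
            'method': 'Input.dispatchMouseEvent',
            'params': {
                'type': event_type,
                'x': x,
                'y': y,
                'button': button,
                # clickCount of 1 marks a single click, not a double click.
                'clickCount': 1,
            },
        })
    return events
```

Sending the two events separately is what lets a caller wait between them to simulate a longer hold.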
\ No newline at end of file diff --git a/docs/zh/api/commands/network.md b/docs/zh/api/commands/network.md new file mode 100644 index 0000000..5cdbede --- /dev/null +++ b/docs/zh/api/commands/network.md @@ -0,0 +1,103 @@ +# 网络命令 + +网络命令提供对网络请求、响应和浏览器网络行为的全面控制。 + +## 概述 + +网络命令模块支持请求拦截、响应修改、Cookie 管理和网络监控功能。 + +::: pydoll.commands.network_commands + options: + show_root_heading: true + show_source: false + heading_level: 2 + filters: + - "!^_" + - "!^__" + +## 用法 + +网络命令用于请求拦截和网络监控等高级场景: + +```python +from pydoll.commands.network_commands import enable, set_request_interception +from pydoll.connection.connection_handler import ConnectionHandler + +# Enable network monitoring +connection = ConnectionHandler() +await enable(connection) + +# Enable request interception +await set_request_interception(connection, patterns=[{"urlPattern": "*"}]) +``` + +## 主要功能 + +网络命令模块提供以下功能: + + +### 请求管理 +- `enable()` / `disable()` - 启用/禁用网络监控 +- `set_request_interception()` - 拦截并修改请求 +- `continue_intercepted_request()` - 继续或修改拦截的请求 +- `get_request_post_data()` - 获取请求体数据 + + +### 响应处理 +- `get_response_body()` - 获取响应内容 +- `fulfill_request()` - 提供自定义响应 +- `fail_request()` - 模拟网络异常 + +### Cookie 管理 +- `get_cookies()` - 获取浏览器 Cookie +- `set_cookies()` - 设置浏览器 Cookie +- `delete_cookies()` - 删除指定 Cookie +- `clear_browser_cookies()` - 清除所有 Cookie + +### 缓存控制 +- `clear_browser_cache()` - 清除浏览器缓存 +- `set_cache_disabled()` - 禁用浏览器缓存 +- `get_response_body_for_interception()` - 获取缓存的响应 + +### 安全和标头 +- `set_user_agent_override()` - 覆盖用户代理 +- `set_extra_http_headers()` - 添加自定义标头 +- `emulate_network_conditions()` - 模拟网络连接状况 + +## 高级用例 + +### 请求拦截 + +```python +# 拦截修改请求 +await set_request_interception(connection, patterns=[ + {"urlPattern": "*/api/*", "requestStage": "Request"} +]) + +# 拦截请求处理 +async def handle_request(request): + if "api/login" in request.url: + # 修改请求头 + headers = request.headers.copy() + headers["Authorization"] = "Bearer token" + await continue_intercepted_request( + connection, 
+ request_id=request.request_id, + headers=headers + ) +``` + +### 响应模拟 +```python +# 模拟 API 响应 +await fulfill_request( + connection, + request_id=request_id, + response_code=200, + response_headers={"Content-Type": "application/json"}, + body='{"status": "success"}' +) +``` + +!!! warning "Performance Impact" + Network interception can impact page loading performance. Use selectively and disable when not needed. \ No newline at end of file diff --git a/docs/zh/api/commands/page.md b/docs/zh/api/commands/page.md new file mode 100644 index 0000000..5ac5de7 --- /dev/null +++ b/docs/zh/api/commands/page.md @@ -0,0 +1,99 @@ +# 页面命令 + +页面命令处理页面导航、生命周期事件和页面操作。 + +## 概述 + +页面命令模块提供页面间导航、管理页面生命周期、处理 JavaScript 执行以及控制页面行为的功能。 + +::: pydoll.commands.page_commands + options: + show_root_heading: true + show_source: false + heading_level: 2 + filters: + - "!^_" + - "!^__" + +## 用法 + +“Tab”类广泛使用页面命令进行导航和页面管理: + +```python +from pydoll.commands.page_commands import navigate, reload, enable +from pydoll.connection.connection_handler import ConnectionHandler + +# Navigate to a URL +connection = ConnectionHandler() +await enable(connection) # Enable page events +await navigate(connection, url="https://example.com") + +# Reload the page +await reload(connection) +``` + +## 关键功能 + +页面命令模块提供以下函数: + +### 导航 +- `navigate()` - 访问URL +- `reload()` - 重新加载当前页面 +- `go_back()` - 后退一步 +- `go_forward()` - 前进一步 +- `stop_loading()` - 停止页面加载 + +### 页面生命周期 +- `enable()` / `disable()` - 启用/禁用页面事件 +- `get_frame_tree()` - 获取页面框架结构 +- `get_navigation_history()` - 获取导航历史记录 + +### 内容管理 +- `get_resource_content()` - 获取页面资源内容 +- `search_in_resource()` - 在页面资源内搜索 +- `set_document_content()` - 设置页面 HTML 内容 + +### 截图和 PDF +- `capture_screenshot()` - 页面截图 +- `print_to_pdf()` - 将页面保存为PDF +- `capture_snapshot()` - 页面快照 + +### JavaScript 执行 +- `add_script_to_evaluate_on_new_document()` - 添加启动脚本(在网页加载前注入js) +- `remove_script_to_evaluate_on_new_document()` - 移除启动脚本 + +### 页面设置 +- `set_lifecycle_events_enabled()` - 
控制生命周期事件 +- `set_ad_blocking_enabled()` - 启用/禁用广告拦截 +- `set_bypass_csp()` - 绕过内容安全策略 + +## 高级功能 +### 框架管理 + +```python +# Get all frames in the page +frame_tree = await get_frame_tree(connection) +for frame in frame_tree.child_frames: + print(f"Frame: {frame.frame.url}") +``` + +### 资源拦截 +```python +# Get resource content +content = await get_resource_content( + connection, + frame_id=frame_id, + url="https://example.com/script.js" +) +``` + +### 页面事件 +页面命令可与各种页面事件配合使用: +- `Page.loadEventFired` - 页面加载完成 +- `Page.domContentEventFired` - DOM 内容已加载 +- `Page.frameNavigated` - 框架访问结束 +- `Page.frameStartedLoading` - 框架加载开始 + + +!!! 小提示“Tab 类集成” +大多数页面操作都可以通过 `Tab` 类方法实现,例如 `tab.go_to()`、`tab.reload()` 和 `tab.screenshot()`,这些方法提供了更便捷的 API。 \ No newline at end of file diff --git a/docs/zh/api/commands/runtime.md b/docs/zh/api/commands/runtime.md new file mode 100644 index 0000000..3dcfbd2 --- /dev/null +++ b/docs/zh/api/commands/runtime.md @@ -0,0 +1,111 @@ +# 运行时命令 + +运行时命令提供 JavaScript 执行功能和运行时环境管理。 + +## 概述 + +运行时命令模块支持在浏览器上下文中执行 JavaScript 代码、检查对象以及控制运行时环境。 + +::: pydoll.commands.runtime_commands + options: + show_root_heading: true + show_source: false + heading_level: 2 + filters: + - "!^_" + - "!^__" + +## 用法 + +运行时命令用于 JavaScript 执行和运行时管理: + +```python +from pydoll.commands.runtime_commands import evaluate, enable +from pydoll.connection.connection_handler import ConnectionHandler + +# Enable runtime events +connection = ConnectionHandler() +await enable(connection) + +# Execute JavaScript +result = await evaluate( + connection, + expression="document.title", + return_by_value=True +) +print(result.value) # Page title +``` + +## 主要功能 + +运行时命令模块提供以下功能: + +### JavaScript 执行 +- `evaluate()` - 执行 JavaScript 表达式 +- `call_function_on()` - 调用对象上的函数 +- `compile_script()` - 编译 JavaScript 以供复用 +- `run_script()` - 运行已编译的脚本 + +### 对象管理 +- `get_properties()` - 获取对象属性 +- `release_object()` - 释放对象引用 +- `release_object_group()` - 释放对象组 + +### 运行时控制 +- `enable()` / `disable()` - 
启用/禁用运行时事件 +- `discard_console_entries()` - 清除控制台记录 +- `set_custom_object_formatter_enabled()` - 启用自定义格式化程序 + +### 异常处理 +- `set_async_call_stack_depth()` - 设置调用堆栈深度 +- 异常捕获和报告 +- 错误对象检查 + +## 高级用法 + +### 复杂的 JavaScript 执行 + +```python +# 执行带有错误处理的复杂 JavaScript(用 IIFE 包裹,否则顶层 return 会触发语法错误) +script = """ +(() => { try { + const elements = document.querySelectorAll('.item'); + return Array.from(elements).map(el => ({ + text: el.textContent, + href: el.href + })); +} catch (error) { + return { error: error.message }; +} })() +""" + +result = await evaluate( + connection, + expression=script, + return_by_value=True, + await_promise=True +) +``` + +### 对象检查 +```python +# Get detailed object properties +properties = await get_properties( + connection, + object_id=object_id, + own_properties=True, + accessor_properties_only=False +) + +for prop in properties: + print(f"{prop.name}: {prop.value}") +``` + +### 控制台集成 +运行时命令与浏览器控制台集成: +- 控制台消息和错误 +- 控制台 API 方法调用 +- 自定义控制台格式化程序 + +!!! note "Performance Considerations" + JavaScript execution through runtime commands can be slower than native browser execution. Use judiciously for complex operations. 
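CDP's `Runtime.evaluate` parses its `expression` as a script rather than as a function body, so a snippet that uses a top-level `return` has to be wrapped in a function before it is sent. The helper below is a minimal sketch of that wrapping step; `wrap_in_iife` is a hypothetical name and is not part of Pydoll, and the snippet itself is only an illustration.

```python
import textwrap


def wrap_in_iife(script: str) -> str:
    """Wrap a JavaScript snippet in an immediately invoked arrow function.

    Hypothetical helper (not a Pydoll API): inside the function body,
    `return` becomes legal, which it is not at the top level of a script.
    """
    # Normalise indentation, then indent the body by two spaces for readability.
    body = textwrap.indent(textwrap.dedent(script).strip(), "  ")
    return f"(() => {{\n{body}\n}})()"


# Illustrative snippet that relies on `return`.
script = """
const items = document.querySelectorAll('.item');
return Array.from(items).map(el => el.textContent);
"""

expression = wrap_in_iife(script)
print(expression)
```

The resulting string can then be passed as the `expression` argument. Keeping the wrapping in one place avoids hand-editing every script that needs an early `return`.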
\ No newline at end of file diff --git a/docs/zh/api/commands/storage.md b/docs/zh/api/commands/storage.md new file mode 100644 index 0000000..6b78e1f --- /dev/null +++ b/docs/zh/api/commands/storage.md @@ -0,0 +1,132 @@ +# 存储命令 + +存储命令提供全面的浏览器存储管理,包括 Cookie、localStorage、sessionStorage 和 IndexedDB。 + +## 概述 + +存储命令模块支持管理所有浏览器存储机制,提供数据持久化和检索功能。 + +::: pydoll.commands.storage_commands + options: + show_root_heading: true + show_source: false + heading_level: 2 + filters: + - "!^_" + - "!^__" + +## 用法 + +存储命令用于跨不同机制管理浏览器存储: + +```python +from pydoll.commands.storage_commands import get_cookies, set_cookies, clear_data_for_origin +from pydoll.connection.connection_handler import ConnectionHandler + +# Get cookies for a domain +connection = ConnectionHandler() +cookies = await get_cookies(connection, urls=["https://example.com"]) + +# Set a new cookie +await set_cookies(connection, cookies=[{ + "name": "session_id", + "value": "abc123", + "domain": "example.com", + "path": "/", + "httpOnly": True, + "secure": True +}]) + +# Clear all storage for an origin +await clear_data_for_origin( + connection, + origin="https://example.com", + storage_types="all" +) +``` + +## 关键功能 + +存储命令模块提供以下函数: + +### Cookie 管理 +- `get_cookies()` - 通过 URL 或域名获取 Cookie +- `set_cookies()` - 设置新 Cookie +- `delete_cookies()` - 删除特定 Cookie +- `clear_cookies()` - 清除所有 Cookie + + +### 本地存储 +- `get_dom_storage_items()` - 获取localStorage +- `set_dom_storage_item()` - 设置localStorage +- `remove_dom_storage_item()` - 移除localStorage +- `clear_dom_storage()` - 清除localStorage + +### 会话存储 +- 会话存储操作(类似于本地存储) +- 特定会话的数据管理 +- 选项卡隔离存储 + +### IndexedDB +- `get_database_names()` - 获取 IndexedDB 数据库 +- `request_database()` - 访问数据库结构 +- `request_data()` - 查询数据库数据 +- `clear_object_store()` - 清除对象存储 + +### 缓存存储 +- `request_cache_names()` - 获取缓存名称 +- `request_cached_response()` - 获取缓存响应 +- `delete_cache()` - 删除缓存条目 + +### 应用程序缓存(已弃用) +- 支持旧版应用程序缓存 +- 基于清单的缓存 + +## 高级功能 + +### 批量操作 +```python +# Clear all storage types for 
multiple origins +origins = ["https://example.com", "https://api.example.com"] +for origin in origins: + await clear_data_for_origin( + connection, + origin=origin, + storage_types="cookies,local_storage,session_storage,indexeddb" + ) +``` + +### 存储配额 +```python +# Get storage quota information +quota_info = await get_usage_and_quota(connection, origin="https://example.com") +print(f"Used: {quota_info.usage} bytes") +print(f"Quota: {quota_info.quota} bytes") +``` + +### Cross-Origin 存储 +```python +# Manage storage across different origins +await set_cookies(connection, cookies=[{ + "name": "cross_site_token", + "value": "token123", + "domain": ".example.com", # Applies to all subdomains + "sameSite": "None", + "secure": True +}]) +``` + +## 存储类型 + +该模块支持多种存储机制: + +| 存储类型 | 持久性 | 范围 | 容量 | +|-----------|----------|----------|----------| +| Cookies | 持久性 | 域/路径 | 每个 cookie 约 4KB | +| localStorage | 持久性 | 来源 | 约 5-10MB | +| sessionStorage | 会话 | Tab | 约 5-10MB | +| IndexedDB | 持久性 | 来源 | 大容量 (GB+) | +| Cache API | 持久性 | 来源 | 大容量 | + +!!! warning "Privacy Considerations" + Storage operations can affect user privacy. Always handle storage data responsibly and in compliance with privacy regulations. \ No newline at end of file diff --git a/docs/api/commands/target.md b/docs/zh/api/commands/target.md similarity index 57% rename from docs/api/commands/target.md rename to docs/zh/api/commands/target.md index 28b90bf..6d0b243 100644 --- a/docs/api/commands/target.md +++ b/docs/zh/api/commands/target.md @@ -1,10 +1,10 @@ -# Target Commands +# Target命令 -Target commands manage browser targets including tabs, windows, and other browsing contexts. +Target命令管理浏览器目标,包括标签页、窗口和其他浏览上下文。 -## Overview +## 概述 -The target commands module provides functionality for creating, managing, and controlling browser targets such as tabs, popup windows, and service workers. 
+Target命令模块提供创建、管理和控制浏览器目标(例如标签页、弹出窗口和服务工作线程)的功能。 ::: pydoll.commands.target_commands options: @@ -15,9 +15,9 @@ The target commands module provides functionality for creating, managing, and co - "!^_" - "!^__" -## Usage +## 用法 -Target commands are used internally by browser classes to manage tabs and windows: +Target命令由浏览器类内部使用,用于管理标签页和窗口: ```python from pydoll.commands.target_commands import get_targets, create_target, close_target @@ -34,36 +34,37 @@ new_target = await create_target(connection, url="https://example.com") await close_target(connection, target_id=new_target.target_id) ``` -## Key Functionality +## 主要功能 -The target commands module provides functions for: +Target命令模块提供以下功能: -### Target Management -- `get_targets()` - List all browser targets -- `create_target()` - Create new tabs or windows -- `close_target()` - Close specific targets -- `activate_target()` - Bring target to foreground -### Target Information -- `get_target_info()` - Get detailed target information -- Target types: page, background_page, service_worker, browser -- Target states: attached, detached, crashed +### Target管理 +- `get_targets()` - 列出所有浏览器Target +- `create_target()` - 创建新的标签页或窗口 +- `close_target()` - 关闭特定Target +- `activate_target()` - 将Target置于前台 -### Session Management -- `attach_to_target()` - Attach to target for control -- `detach_from_target()` - Detach from target -- `send_message_to_target()` - Send commands to targets +### Target 信息 +- `get_target_info()` - 获取详细的Target信息 +- Target类型:页面、background_page、service_worker、浏览器 +- Target状态:已连接、已分离、崩溃 -### Browser Context -- `create_browser_context()` - Create isolated browser context -- `dispose_browser_context()` - Remove browser context -- `get_browser_contexts()` - List browser contexts +### Session 管理 +- `attach_to_target()` - 附加到Target进行控制 +- `detach_from_target()` - 分离Target +- `send_message_to_target()` - 向Target发送命令 -## Target Types +### 浏览器上下文 +- `create_browser_context()` - 创建独立的浏览器上下文 +- 
`dispose_browser_context()` - 移除浏览器上下文 +- `get_browser_contexts()` - 列出浏览器上下文 -Different types of targets can be managed: +## 目标类型 -### Page Targets +可以管理不同类型的目标: + +### 页面 Targets ```python # Create a new tab page_target = await create_target( @@ -75,7 +76,7 @@ page_target = await create_target( ) ``` -### Popup Windows +### 弹窗 ```python # Create a popup window popup_target = await create_target( @@ -87,7 +88,7 @@ popup_target = await create_target( ) ``` -### Incognito Contexts +### 无痕上下文 ```python # Create incognito browser context incognito_context = await create_browser_context(connection) @@ -100,16 +101,17 @@ incognito_tab = await create_target( ) ``` -## Advanced Features +## 高级特性 + +### 目标事件 +Target命令可与各种Target事件配合使用: +- `Target.targetCreated` - 新Target创建 +- `Target.targetDestroyed` - Target关闭 +- `Target.targetInfoChanged` - Target信息更新 +- `Target.targetCrashed` - Target崩溃 -### Target Events -Target commands work with various target events: -- `Target.targetCreated` - New target created -- `Target.targetDestroyed` - Target closed -- `Target.targetInfoChanged` - Target information updated -- `Target.targetCrashed` - Target crashed +### 多Target协调 -### Multi-Target Coordination ```python # Manage multiple tabs targets = await get_targets(connection) @@ -121,7 +123,7 @@ for target in page_targets: # ... 
do work in this tab ``` -### Target Isolation +### Target 隔离 ```python # Create isolated browser context for testing test_context = await create_browser_context(connection) diff --git a/docs/api/connection/connection.md b/docs/zh/api/connection/connection.md similarity index 87% rename from docs/api/connection/connection.md rename to docs/zh/api/connection/connection.md index 677f602..b748a03 100644 --- a/docs/api/connection/connection.md +++ b/docs/zh/api/connection/connection.md @@ -1,4 +1,4 @@ -# Connection Handler +# 连接处理器 ::: pydoll.connection.connection_handler.ConnectionHandler options: diff --git a/docs/api/connection/managers.md b/docs/zh/api/connection/managers.md similarity index 84% rename from docs/api/connection/managers.md rename to docs/zh/api/connection/managers.md index e6129d7..eabed67 100644 --- a/docs/api/connection/managers.md +++ b/docs/zh/api/connection/managers.md @@ -1,6 +1,6 @@ -# Connection Managers +# 连接管理器 -## CommandsManager +## 命令管理器 ::: pydoll.connection.managers.commands_manager.CommandsManager options: @@ -8,7 +8,7 @@ show_source: false heading_level: 3 -## EventsManager +## 事件管理器 ::: pydoll.connection.managers.events_manager.EventsManager options: diff --git a/docs/api/core/constants.md b/docs/zh/api/core/constants.md similarity index 61% rename from docs/api/core/constants.md rename to docs/zh/api/core/constants.md index f055063..62d72e1 100644 --- a/docs/api/core/constants.md +++ b/docs/zh/api/core/constants.md @@ -1,6 +1,6 @@ -# Constants +# 常量 -This section documents all constants, enums, and configuration values used throughout Pydoll. 
+本节记录了 Pydoll 中使用的所有常量、枚举和配置值。 ::: pydoll.constants options: diff --git a/docs/api/core/exceptions.md b/docs/zh/api/core/exceptions.md similarity index 63% rename from docs/api/core/exceptions.md rename to docs/zh/api/core/exceptions.md index 2a68988..74ec9f6 100644 --- a/docs/api/core/exceptions.md +++ b/docs/zh/api/core/exceptions.md @@ -1,6 +1,6 @@ -# Exceptions +# 异常 -This section documents all custom exceptions that can be raised by Pydoll operations. +本节记录了 Pydoll 操作可能引发的所有自定义异常。 ::: pydoll.exceptions options: diff --git a/docs/api/core/utils.md b/docs/zh/api/core/utils.md similarity index 63% rename from docs/api/core/utils.md rename to docs/zh/api/core/utils.md index 5c6db66..8e9a8ac 100644 --- a/docs/api/core/utils.md +++ b/docs/zh/api/core/utils.md @@ -1,6 +1,6 @@ -# Utilities +# 实用功能 -This section documents utility functions and helper classes used throughout Pydoll. +本节记录了 Pydoll 中使用的实用程序函数和辅助类。 ::: pydoll.utils options: diff --git a/docs/zh/api/elements/mixins.md b/docs/zh/api/elements/mixins.md new file mode 100644 index 0000000..28376b0 --- /dev/null +++ b/docs/zh/api/elements/mixins.md @@ -0,0 +1,40 @@ +# 元素mixins + +mixins 模块提供可复用的功能,可以将其混合到元素类中以扩展其功能。 + +## 元素定位mixins + +`FindElementsMixin` 为包含它的类提供元素查找功能。 + +::: pydoll.elements.mixins.find_elements_mixin + options: + show_root_heading: true + show_source: false + heading_level: 2 + filters: + - "!^_" + - "!^__" + +## 用法 + +Mixin 通常由库内部使用,用于组合功能。`Tab` 和 `WebElement` 等类使用 `FindElementsMixin` 来提供元素定位方法: + +```python +# 这些方法来自 FindElementsMixin +element = await tab.find(id="username") +elements = await tab.find(class_name="item", find_all=True) +element = await tab.query("#submit-button") +``` + + +## 可用方法 + +`FindElementsMixin` 提供了多种元素定位的方法: + +- `find()` - 使用关键字参数的现代元素查找方法 +- `query()` - CSS 选择器和 XPath 查询 +- `find_element()` - 旧版元素定位方法 +- `find_elements()` - 查找多个元素的旧版方法 + +!!! 
tip "现代 vs 传统" +    `find()` 方法是最新的、推荐的查找元素的方法。`find_element()` 和 `find_elements()` 方法保留下来,以实现向后兼容。 \ No newline at end of file diff --git a/docs/api/elements/web_element.md b/docs/zh/api/elements/web_element.md similarity index 92% rename from docs/api/elements/web_element.md rename to docs/zh/api/elements/web_element.md index d27c33b..23d2dd0 100644 --- a/docs/api/elements/web_element.md +++ b/docs/zh/api/elements/web_element.md @@ -1,4 +1,4 @@ -# WebElement +# 网页元素 ::: pydoll.elements.web_element.WebElement options: diff --git a/docs/zh/api/index.md b/docs/zh/api/index.md new file mode 100644 index 0000000..d243200 --- /dev/null +++ b/docs/zh/api/index.md @@ -0,0 +1,138 @@ +# API 参考 + +欢迎查阅 Pydoll API 参考!本节提供 Pydoll 库中所有类、方法和函数的详尽文档。 + +## 概述 + +Pydoll 由几个关键模块组成,每个模块在浏览器自动化中都有特定的用途: + +### 浏览器模块 +浏览器模块可以管理浏览器实例和生命周期。 + +- **[Chrome](browser/chrome.md)** - Chrome 浏览器自动化 +- **[Edge](browser/edge.md)** - Microsoft Edge 浏览器自动化 +- **[Options](browser/options.md)** - 浏览器配置选项 +- **[Tab](browser/tab.md)** - 页面标签和交互 +- **[Managers](browser/managers.md)** - 浏览器生命周期管理器 + +### 元素模块 +元素模块提供与网页元素交互的功能。 + +- **[WebElement](elements/web_element.md)** - 网页元素交互 +- **[Mixins](elements/mixins.md)** - 可复用的元素交互功能 + +### 连接模块 +连接模块通过 Chrome DevTools 协议处理与浏览器的通信。 + +- **[Connection Handler](connection/connection.md)** - WebSocket连接管理器 +- **[Managers](connection/managers.md)** - 连接生命周期管理器 + +### 命令模块 +命令模块提供低级 Chrome DevTools 协议命令实现。 + +- **[Commands Overview](commands/index.md)** - 按域划分的 CDP 命令实现 + +### 协议模块 +协议模块实现了 Chrome DevTools 协议命令和事件。 + +- **[Commands](protocol/commands.md)** - CDP 命令封装 +- **[Events](protocol/events.md)** - CDP 事件处理 + +### 核心模块 +核心模块包含基础程序、常量和异常。 + +- **[Constants](core/constants.md)** - 库常量和枚举 +- **[Exceptions](core/exceptions.md)** - 自定义异常类 +- **[Utils](core/utils.md)** - 实用功能 + +## 快捷导航 + +### 常用类 + +| 类 | 功能 | 模块 | +|-------------------|--------------|-------------------------------| +| `Chrome` | Chrome浏览器自动化 | `pydoll.browser.chromium` | +| 
`Edge` | Edge浏览器自动化 | `pydoll.browser.chromium` | +| `Tab` | 标签页交互和控制 | `pydoll.browser.tab` | +| `WebElement` | 元素交互 | `pydoll.elements.web_element` | +| `ChromiumOptions` | 浏览器配置 | `pydoll.browser.options` | + +### 关键枚举和常量 + +| 名称 | 功能 | 模块 | +|------------------|---------|--------| +| `By` | 元素选择器策略 | `pydoll.constants` | +| `Key` | 键盘按键常量 | `pydoll.constants` | +| `PermissionType` | 浏览器权限类型 | `pydoll.constants` | + +### 常见异常类型 + +| 异常 | 原因 | 模块 | +|----------------------|-----------|---------------------| +| `ElementNotFound` | 元素在DOM未找到 | `pydoll.exceptions` | +| `WaitElementTimeout` | 元素等待超时 | `pydoll.exceptions` | +| `BrowserNotStarted` | 浏览器未开启 | `pydoll.exceptions` | + +## 使用模式 + +### 基本浏览器自动化 + +```python +from pydoll.browser.chromium import Chrome + +async with Chrome() as browser: + tab = await browser.start() + await tab.go_to("https://example.com") + element = await tab.find(id="my-element") + await element.click() +``` + +### 元素定位 + +```python +# Using the modern find() method +element = await tab.find(id="username") +element = await tab.find(tag_name="button", class_name="submit") + +# Using CSS selectors or XPath +element = await tab.query("#username") +element = await tab.query("//button[@class='submit']") +``` + +### 事件处理 + +```python +await tab.enable_page_events() +await tab.on('Page.loadEventFired', handle_page_load) +``` + +## 类型提示 + +Pydoll 具有完整的类型支持,并提供全面的类型提示,以提供更好的 IDE 支持和代码安全性。所有公共 API 均包含正确的类型注释。 + +```python +from typing import Optional, List +from pydoll.elements.web_element import WebElement + +# Methods return properly typed objects +element: Optional[WebElement] = await tab.find(id="test", raise_exc=False) +elements: List[WebElement] = await tab.find(class_name="item", find_all=True) +``` + +## Async/Await 支持 + +所有 Pydoll 操作都是异步的,必须与 `async`/`await` 一起使用: + +```python +import asyncio + +async def main(): + # All Pydoll operations are async + async with Chrome() as browser: + tab = await browser.start() + await 
tab.go_to("https://example.com") + +asyncio.run(main()) +``` + +浏览以下部分以了解每个模块的完整 API 文档。 \ No newline at end of file diff --git a/docs/api/protocol/commands.md b/docs/zh/api/protocol/commands.md similarity index 92% rename from docs/api/protocol/commands.md rename to docs/zh/api/protocol/commands.md index ea1d864..5208539 100644 --- a/docs/api/protocol/commands.md +++ b/docs/zh/api/protocol/commands.md @@ -1,6 +1,6 @@ -# Protocol Commands +# 协议命令 -This section documents the Chrome DevTools Protocol command implementations used by Pydoll. +本节记录了 Pydoll 使用的 Chrome DevTools 协议命令的实现。 ## Page Commands diff --git a/docs/api/protocol/events.md b/docs/zh/api/protocol/events.md similarity index 91% rename from docs/api/protocol/events.md rename to docs/zh/api/protocol/events.md index 79095a8..b61f78c 100644 --- a/docs/api/protocol/events.md +++ b/docs/zh/api/protocol/events.md @@ -1,6 +1,6 @@ -# Protocol Events +# 协议事件 -This section documents the Chrome DevTools Protocol event constants and handlers used by Pydoll. +本节记录了 Pydoll 使用的 Chrome DevTools 协议事件常量和处理程序。 ## Page Events diff --git a/docs/deep-dive/browser-domain.md b/docs/zh/deep-dive/browser-domain.md similarity index 98% rename from docs/deep-dive/browser-domain.md rename to docs/zh/deep-dive/browser-domain.md index 59a03fb..929750e 100644 --- a/docs/deep-dive/browser-domain.md +++ b/docs/zh/deep-dive/browser-domain.md @@ -1,6 +1,6 @@ -# Browser Domain +# 浏览器域 -The Browser domain is the backbone of Pydoll's zero-webdriver architecture. This component provides a direct interface to browser instances through the Chrome DevTools Protocol (CDP), eliminating the need for traditional webdrivers while delivering superior performance and reliability. 
+浏览器域是 Pydoll 无 WebDriver 架构的核心。该组件通过 Chrome DevTools 协议 (CDP) 为浏览器实例提供直接接口,无需传统的 Web 驱动程序,同时提供极佳的性能和可靠性。 ```mermaid graph LR @@ -17,7 +17,7 @@ graph LR end ``` -## Technical Architecture +## 技术架构 At its core, the Browser domain is implemented as an abstract base class (`Browser`) that establishes the fundamental contract for all browser implementations. Specific browser classes like `Chrome` and `Edge` extend this base class to provide browser-specific behavior while sharing the common architecture. diff --git a/docs/features.md b/docs/zh/features.md similarity index 75% rename from docs/features.md rename to docs/zh/features.md index fdd0bf8..c4f600e 100644 --- a/docs/features.md +++ b/docs/zh/features.md @@ -1,69 +1,72 @@ -# Key Features +# 核心特性 -Pydoll brings groundbreaking capabilities to browser automation, making it significantly more powerful than traditional tools while being easier to use. +Pydoll为浏览器自动化带来了突破性的功能,比传统浏览器自动化工具更加强大,也更易于使用。 -## Core Capabilities +## 核心功能 -### Zero WebDrivers +### 无WebDriver依赖 -Unlike traditional browser automation tools like Selenium, Pydoll eliminates the need for WebDrivers entirely. By connecting directly to browsers through the Chrome DevTools Protocol, Pydoll: +与传统浏览器自动化框架(例如Selenium)不同的是,Pydoll完全消除了对WebDriver的依赖。通过 Chrome DevTools 协议直接连接到浏览器,Pydoll 可以: -- Eliminates version compatibility issues between browser and driver -- Reduces setup complexity and maintenance overhead -- Provides more reliable connections without driver-related issues -- Allows for automation of all Chromium-based browsers with a unified API +- 消除浏览器和驱动程序之间的版本兼容性问题 +- 降低设置复杂性和维护开销 +- 提供更可靠的连接,避免驱动程序相关问题 +- 允许使用统一的 API 实现所有基于 Chromium 的浏览器的自动化 -No more "chromedriver version doesn't match Chrome version" errors or mysterious webdriver crashes. 
+不再出现“chromedriver 版本与 Chrome 版本不匹配”的错误或神秘的 webdriver 崩溃。 -### Async-First Architecture +### 异步优先架构 -Built from the ground up with Python's asyncio, Pydoll provides: +Pydoll 基于 Python 的 asyncio 全新构建,提供以下功能: -- **True Concurrency**: Run multiple operations in parallel without blocking -- **Efficient Resource Usage**: Manage many browser instances with minimal overhead -- **Modern Python Patterns**: Context managers, async iterators, and other asyncio-friendly interfaces -- **Performance Optimizations**: Reduced latency and increased throughput for automation tasks +- **真正的并发**:并行运行多个操作,无阻塞 +- **高效的资源利用**:以最小的开销管理多个浏览器实例 +- **现代 Python 模式**:上下文管理器、异步迭代器和其他异步友好接口 +- **性能优化**:降低自动化任务的延迟并提高吞吐量 -### Human-Like Interactions +### 模拟真人交互 -Avoid detection by mimicking real user behavior: +通过模仿真实用户行为来绕过反爬行为检测: -- **Natural Typing**: Type text with randomized timing between keystrokes -- **Realistic clicking**: Click with realistic timing and movement, including offset +- **自然输入**:以随机的按键时间输入文本 +- **仿真点击**:以模拟真人的点击时间和移动方式进行点击,包括坐标偏移 -### Event-Driven Capabilities +### 事件驱动功能 -Respond to browser events in real-time: +实时响应浏览器事件: - **Network Monitoring**: Track requests, responses, and failed loads -- **DOM Observation**: React to changes in the page structure -- **Page Lifecycle Events**: Capture navigation, loading, and rendering events -- **Custom Event Handlers**: Register callbacks for specific events of interest +- **DOM 结构观测**: 响应页面结构的变化 +- **页面生命周期事件**: 捕获导航、加载和渲染事件 +- **自定义事件处理程序**: 为感兴趣的特定事件注册回调 -### Multi-Browser Support -Pydoll works seamlessly with: +### 多浏览器支持 -- **Google Chrome**: Primary support with all features available -- **Microsoft Edge**: Full support for Edge-specific features -- **Chromium**: Support for other Chromium-based browsers +Pydoll支持操作任何Chromium核心的浏览器: -### Screenshot and PDF Export +- **Google Chrome**:主要支持所有可用功能 +- **Microsoft Edge**:全面支持 Edge 特定功能 +- **Chromium**:支持其他基于 Chromium 的浏览器 -Capture visual content from web pages: -- **Full Page 
Screenshots**: Capture entire page content, even beyond the viewport -- **Element Screenshots**: Target specific elements for capture -- **High-Quality PDF Export**: Generate PDF documents from web pages -- **Custom Formatting**: Coming soon! +### 导出网页截图和PDF -## Intuitive Element Finding +从网页截图: -Pydoll v2.0+ introduces a revolutionary approach to finding elements that's both more intuitive and more powerful than traditional selector-based methods. +- **全页截图**:截取整个页面内容包括超出视口范围的 +- **元素截图**:截取特定元素 +- **高质量 PDF 导出**:从网页生成 PDF 文档 +- **自定义格式**:即将推出! -### Modern find() Method +## 直观的元素查找 + +Pydoll v2.0+ 引入了一种革命性的元素查找方法,比传统的基于选择器的方法更直观、更强大。 + +### 现代化的 find() 方法 + +全新的 `find()` 方法允许您使用自然属性搜索元素: -The new `find()` method allows you to search for elements using natural attributes: ```python import asyncio @@ -109,9 +112,9 @@ async def element_finding_examples(): asyncio.run(element_finding_examples()) ``` -### CSS Selectors and XPath with query() +### 使用 query() 实现 CSS 选择器和 XPath -For developers who prefer traditional selectors, the `query()` method provides direct CSS selector and XPath support: +对于喜欢传统选择器的开发者,`query()` 方法提供了直接的 CSS 选择器和 XPath 支持: ```python import asyncio @@ -137,21 +140,21 @@ async def query_examples(): asyncio.run(query_examples()) ``` -## Native Cloudflare Captcha Bypass +## 原生 Cloudflare 验证码绕过 -!!! warning "Important Information About Captcha Bypass" - The effectiveness of Cloudflare Turnstile bypass depends on several factors: - - - **IP Reputation**: Cloudflare assigns a "trust score" to each IP address. Clean residential IPs typically receive higher scores. - - **Previous History**: IPs with a history of suspicious activity may be permanently flagged. - - Pydoll can achieve scores comparable to a regular browser session, but cannot overcome IP-based blocks or extremely restrictive configurations. For best results, use residential IPs with good reputation. 
- - Remember that captcha bypass techniques operate in a gray area and should be used responsibly. +!!! warning "关于验证码绕过的重要信息" +    Cloudflare Turnstile 绕过的有效性取决于以下几个因素: + +    - **IP可信度**:Cloudflare 为每个 IP 地址分配一个“可信度”。干净的住宅 IP 通常会获得更高的分数。 +    - **过往历史记录**:有可疑活动历史记录的 IP 可能会被永久标记。 -One of Pydoll's most powerful features is its ability to automatically bypass Cloudflare Turnstile captchas that block most automation tools: +    Pydoll 可以获得与常规浏览器会话相当的分数,但无法解决IP质量导致的风控。为了获得最佳效果,请使用质量良好的住宅 IP。 -### Context Manager Approach (Synchronous) +    请记住,验证码绕过技术处于灰色地带,应谨慎使用。 + +Pydoll 最强大的功能之一是它能够自动绕过阻止大多数自动化工具的 Cloudflare Turnstile 验证码: + +### 上下文管理器方法(同步) ```python import asyncio @@ -176,7 +179,7 @@ async def bypass_cloudflare_example(): asyncio.run(bypass_cloudflare_example()) ``` -### Background Processing Approach +### 后台自动处理验证码 ```python import asyncio @@ -207,19 +210,19 @@ async def background_bypass_example(): asyncio.run(background_bypass_example()) ``` -Access websites that actively block automation tools without using third-party captcha solving services. This native captcha handling makes Pydoll suitable for automating previously inaccessible websites. +无需使用第三方验证码服务,即可访问屏蔽自动化工具的网站。 -## Multi-Tab Management +## 多标签页管理 -Pydoll provides sophisticated tab management capabilities with a singleton pattern that ensures efficient resource usage and prevents duplicate Tab instances for the same browser tab. +Pydoll 采用单例模式提供完善的标签页管理功能,确保资源高效利用,并防止同一浏览器标签页出现重复的标签页实例。 -### Tab Singleton Pattern +### 标签页单例模式 -Pydoll implements a singleton pattern for Tab instances based on the browser's target ID. 
This means: +Pydoll 根据浏览器的目标 ID 为标签页实例实现单例模式。这意味着: -- **One Tab instance per browser tab**: Multiple references to the same browser tab return the same Tab object -- **Automatic resource management**: No duplicate connections or handlers for the same tab -- **Consistent state**: All references to a tab share the same state and event handlers +- **标签页对象唯一性**:对同一浏览器标签页的多次引用将返回相同的标签页对象 +- **自动化资源治理**:同一标签页不会出现重复的连接或处理程序 +- **全局状态一致性**:对同一标签页的所有引用共享相同的状态和事件处理程序 ```python import asyncio @@ -241,9 +244,10 @@ async def singleton_demonstration(): asyncio.run(singleton_demonstration()) ``` -### Creating New Tabs Programmatically -Use `new_tab()` to create tabs programmatically with full control: +### 程序化创建新标签页 + +使用 `new_tab()` 程序化创建拥有完全控制权的新标签页: ```python import asyncio @@ -275,9 +279,9 @@ async def programmatic_tab_creation(): asyncio.run(programmatic_tab_creation()) ``` -### Handling User-Opened Tabs +### 处理用户打开的标签页 -When users click links that open new tabs (target="_blank"), use `get_opened_tabs()` to detect and manage them: +当用户点击打开新标签页 (target="_blank") 的链接时,使用 `get_opened_tabs()` 来检测和管理它们: ```python import asyncio @@ -321,24 +325,23 @@ async def handle_user_opened_tabs(): asyncio.run(handle_user_opened_tabs()) ``` -### Key Benefits of Pydoll's Tab Management - -1. **Singleton Pattern**: Prevents resource duplication and ensures consistent state -2. **Automatic Detection**: `get_opened_tabs()` finds all tabs, including user-opened ones -3. **Concurrent Processing**: Handle multiple tabs simultaneously with asyncio -4. **Resource Management**: Proper cleanup prevents memory leaks -5. 
**Event Isolation**: Each tab maintains its own event handlers and state +### Pydoll 标签页管理的主要优势 -This sophisticated tab management makes Pydoll ideal for: -- **Multi-page workflows** that require coordination between tabs -- **Parallel data extraction** from multiple sources -- **Testing applications** that use popup windows or new tabs -- **Monitoring user behavior** across multiple browser tabs +1. **单例模式**:防止资源重复并确保状态一致 +2. **自动检测**:`get_opened_tabs()` 查找所有标签页,包括用户打开的标签页 +3. **并发处理**:使用 asyncio 同时处理多个标签页 +4. **资源管理**:适当的清理可防止内存泄漏 +5. **事件隔离**:每个标签页都维护自己的事件处理程序和状态 +这种复杂的标签页管理功能使 Pydoll 非常适合: +- 需要标签页之间协调的**多页面工作流** +- 从多个来源**并行提取数据** +- 使用弹出窗口或新标签页的**测试应用程序** +- 跨多个浏览器标签页**监控用户行为** -## Concurrent Scraping +## 并发抓取 -Pydoll's async architecture allows you to scrape multiple pages or websites simultaneously for maximum efficiency: +Pydoll 的异步架构允许您同时抓取多个页面或网站,以实现最高效率: ```python import asyncio @@ -403,11 +406,13 @@ async def main(): all_data = asyncio.run(main()) ``` -This approach provides dramatic performance improvements over sequential scraping, especially for I/O-bound tasks like web scraping. Instead of waiting for each page to load one after another, Pydoll processes them all simultaneously, reducing total execution time significantly. For example, scraping 10 pages that each take 2 seconds to load would take just over 2 seconds total instead of 20+ seconds with sequential processing. -## Advanced Keyboard Control +与单线程控制标签页抓取相比,这种方法显著提升了性能,尤其适用于像网页抓取这样 I/O 密集型任务。Pydoll 无需等待每个页面逐个加载,而是同时处理所有页面,从而显著缩短了总执行时间。例如,如果抓取 10 个页面,每个页面加载时间为 2 秒,那么总共只需 2 秒多一点,而单线程处理则需要 20 多秒。 + +## 高级键盘控制 + +Pydoll 提供仿真的键盘交互,并可精确控制输入行为: -Pydoll provides human-like keyboard interaction with precise control over typing behavior: ```python import asyncio @@ -442,11 +447,12 @@ async def realistic_typing_example(): asyncio.run(realistic_typing_example()) ``` -This realistic typing helps avoid detection by websites that look for automation patterns. 
The natural timing and ability to use special key combinations makes Pydoll's interactions virtually indistinguishable from human users. +这种仿真的输入方式有助于避免被那些有用户行为检测的网站检测到。模拟真人操作和使用特殊组合键的能力使得 Pydoll 的交互几乎与人类用户难以区分。 + +## 强大的事件系统 -## Powerful Event System +Pydoll 的事件系统允许您实时响应浏览器事件: -Pydoll's event system allows you to react to browser events in real-time: ```python import asyncio @@ -479,15 +485,15 @@ async def event_monitoring_example(): asyncio.run(event_monitoring_example()) ``` -The event system makes Pydoll uniquely powerful for monitoring API requests and responses, creating reactive automations, debugging complex web applications, and building comprehensive web monitoring tools. +事件系统使 Pydoll 在监控 API 请求和响应、创建响应式自动化、调试复杂的 Web 应用程序以及构建全面的 Web 监控工具方面拥有独特的强大功能。 -## Network Analysis and Response Extraction +## 网络分析和响应提取 -Pydoll provides powerful methods for analyzing network traffic and extracting response data from web applications. These capabilities are essential for API monitoring, data extraction, and debugging network-related issues. 
+Pydoll 提供了强大的方法来分析网络流量并从 Web 应用程序中提取响应数据。这些功能适用于API 监控、数据提取和调试网络相关问题。 -### Network Logs Analysis +### 网络日志分析 -The `get_network_logs()` method allows you to retrieve and analyze all network requests made by a page: +`get_network_logs()` 方法允许您查找和分析页面发出的所有网络请求: ```python import asyncio @@ -528,9 +534,9 @@ async def network_analysis_example(): asyncio.run(network_analysis_example()) ``` -### Response Body Extraction +### 响应体提取 -The `get_network_response_body()` method enables you to extract the actual response content from network requests: +`get_network_response_body()` 方法可让您从网络请求中提取实际的响应内容: ```python import asyncio @@ -597,9 +603,9 @@ async def response_extraction_example(): asyncio.run(response_extraction_example()) ``` -### Advanced Network Monitoring +### 高级网络监控 -Combine both methods for comprehensive network analysis: +结合两种方法进行全面的网络分析: ```python import asyncio @@ -654,17 +660,17 @@ async def comprehensive_network_monitoring(): asyncio.run(comprehensive_network_monitoring()) ``` -These network analysis capabilities make Pydoll ideal for: +网络分析功能非常适合以下情况: -- **API Testing**: Monitor and validate API responses -- **Performance Analysis**: Track request timing and sizes -- **Data Extraction**: Extract dynamic content loaded via AJAX -- **Debugging**: Identify failed requests and network issues -- **Security Testing**: Analyze request/response patterns +- **API 测试**:监控和验证 API 响应 +- **性能分析**:跟踪请求时间和大小 +- **数据提取**:提取通过 AJAX 加载的动态内容 +- **调试**:识别失败的请求和网络问题 +- **安全测试**:分析请求/响应模式 -## File Upload Support +## 上传文件支持 -Seamlessly handle file uploads in your automation: +在您的自动化脚本中无缝上传文件: ```python import asyncio @@ -695,11 +701,11 @@ async def file_upload_example(): asyncio.run(file_upload_example()) ``` -File uploads are notoriously difficult to automate in other frameworks, often requiring workarounds. Pydoll makes it straightforward with both direct file input and file chooser dialog support. 
+在其他框架中,文件上传的自动化难度非常高,通常需要一些变通方法。Pydoll 通过直接文件输入和文件选择器对话框支持,让这项工作变得简单易行。 -## Multi-Browser Example +## 多浏览器示例 -Pydoll works with different browsers through a consistent API: +Pydoll通过一致的API兼容不同的浏览器: ```python import asyncio @@ -723,18 +729,18 @@ async def multi_browser_example(): asyncio.run(multi_browser_example()) ``` -Cross-browser compatibility without changing your code. Test your automations across different browsers to ensure they work everywhere. +无需更改代码即可实现跨浏览器兼容性。跨不同浏览器测试您的自动化功能,确保其在所有浏览器中都能正常运行。 -## Proxy Integration +## 集成代理 -Unlike many automation tools that struggle with proxy implementation, Pydoll offers native proxy support with full authentication capabilities. This makes it ideal for: +与许多自动化工具在代理实现方面遇到困难不同,Pydoll 提供原生代理支持和完整的身份验证功能。这使其成为以下应用的理想之选: -- **Web scraping** projects that need to rotate IPs -- **Geo-targeted testing** of applications across different regions -- **Privacy-focused automation** that requires anonymizing traffic -- **Testing web applications** through corporate proxies +- 需要轮换 IP 地址的 **Web 数据抓取** 项目 +- 跨不同区域进行应用程序的 **地理定位测试** +- 需要匿名流量的 **注重隐私的自动化** +- 通过企业代理进行 **Web 应用程序测试** -Configuring proxies in Pydoll is straightforward: +在 Pydoll 中配置代理非常简单: ```python import asyncio @@ -770,9 +776,10 @@ async def proxy_example(): asyncio.run(proxy_example()) ``` -## Working with iFrames +## 使用 iFrames -Pydoll provides seamless iframe interaction through the `get_frame()` method: + +Pydoll 通过 `get_frame()` 方法提供无缝的 iframe 交互: ```python import asyncio @@ -804,11 +811,11 @@ async def iframe_interaction(): asyncio.run(iframe_interaction()) ``` -## Request Interception +## 请求拦截 -Intercept and modify network requests before they're sent: +在网络请求发送之前拦截并修改它们: -### Basic Request Modification +### 基础请求修改示例 ```python import asyncio @@ -854,9 +861,9 @@ async def request_interception_example(): asyncio.run(request_interception_example()) ``` -### Blocking Unwanted Requests +### 拦截指定请求 -Use `fail_request` to block specific requests like ads, 
trackers, or unwanted resources: +使用 `fail_request` 可以拦截指定请求例如广告、隐私追踪器、不想要的网页资源: ```python import asyncio @@ -907,9 +914,9 @@ async def block_requests_example(): asyncio.run(block_requests_example()) ``` -### Mocking API Responses +### 拦截修改API响应 -Use `fulfill_request` to return custom responses without making actual network requests: +使用 `fulfill_request` 可以做到不请求后端直接模拟请求返回: ```python import asyncio @@ -989,9 +996,9 @@ async def mock_api_responses_example(): asyncio.run(mock_api_responses_example()) ``` -### Advanced Request Manipulation +### 高级请求操作 -Combine all interception methods for comprehensive request control: +整合所有拦截方法,实现全面的请求控制: ```python import asyncio @@ -1060,13 +1067,13 @@ async def advanced_request_control(): asyncio.run(advanced_request_control()) ``` -This powerful capability allows you to: +这项强大的功能允许您: -- **Add authentication headers dynamically** for API requests -- **Block unwanted resources** like ads, trackers, and heavy images for faster loading -- **Mock API responses** for testing without backend dependencies -- **Simulate network errors** to test error handling -- **Modify request payloads** before they're sent -- **Analyze and debug network traffic** in real-time +- 为API请求**动态添加身份验证标头** +- **屏蔽不需要的资源**,例如广告、跟踪器和大图片,以加快加载速度 +- **Mock API响应**,以便在没有后端依赖的情况下进行测试 +- **模拟网络错误**,以测试错误处理 +- **修改请求内容** +- **实时分析和调试网络流量** -Each of these features showcases what makes Pydoll a next-generation browser automation tool, combining the power of direct browser control with an intuitive, async-native API. \ No newline at end of file +这些功能充分展现了Pydoll作为新一代浏览器自动化工具的优势,它直接控制浏览器的强大功能与直观的异步原生API完美结合。 \ No newline at end of file diff --git a/docs/zh/index.md b/docs/zh/index.md new file mode 100644 index 0000000..f3accf5 --- /dev/null +++ b/docs/zh/index.md @@ -0,0 +1,288 @@ +

+ Pydoll Logo

+

+ +

+ + + + Tests + Ruff CI + Release + MyPy CI +

+ + +# 欢迎使用Pydoll + +嗨,开发者大大!欢迎来到 Pydoll 的世界~这是为 Python 量身打造的新一代浏览器自动化神器! + +## 什么是Pydoll? + +Pydoll采用全新的浏览器自动化技术——完全无需 WebDriver!与其他依赖外部驱动的解决方案不同,Pydoll 通过浏览器原生 DevTools 协议直接通信,提供零依赖的自动化体验,并自带原生异步高性能支持。 + +无论是数据采集、Web应用测试,还是自动化重复任务,Pydoll 都能通过其直观的 API 和强大功能,让这些工作变得异常简单。 + +## 安装 + +创建并激活一个 [虚拟环境](https://docs.python.org/3/tutorial/venv.html),然后安装Pydoll: + +
+```bash +$ pip install pydoll-python + +---> 100% +``` +
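安装完成后,可以用下面的小脚本确认当前环境里是否已经装上了这个包。这只是一个基于标准库 `importlib.metadata` 的示意写法,并非 Pydoll 自带的功能;函数名 `installed_version` 是为演示而假设的,发行包名 `pydoll-python` 取自上面的 pip 命令。

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is a hypothetical helper for illustration only;
    it is not part of Pydoll.
    """
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "pydoll-python" is the distribution name used in the pip command above.
    found = installed_version("pydoll-python")
    if found is None:
        print("pydoll-python is not installed in this environment")
    else:
        print(f"pydoll-python {found} is installed")
```

如果脚本提示未安装,常见原因之一可能是 pip 与运行脚本的解释器不在同一个虚拟环境中。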
+ +你也可以直接从GitHub安装最新的开发版本: + +```bash +$ pip install git+https://github.com/autoscrape-labs/pydoll.git +``` + +## 为何选择Pydoll? + +- **智能验证码绕过**: 内置Cloudflare Turnstile与reCAPTCHA v3验证码的自动破解能力,无需依赖外部服务、API密钥或复杂配置。即使遭遇防护系统,您的自动化流程仍可畅行无阻。 +- **模拟真人交互**: 通过先进算法模拟真实人类行为特征:从随机操作间隔,到鼠标移动轨迹、页面滚动模式乃至输入速度,皆可骗过最严苛的反爬虫系统。 +- **极简哲学**: 无需浪费太多时间在配置驱动或解决兼容问题上。Pydoll开箱即用。 +- **原生异步性能**: 基于`asyncio`库深度设计, Pydoll不仅支持异步操作,更为高并发而生,可同时进行多个受防护站点的数据采集。 +- **强大的网络监控**: 轻松实现请求拦截、流量篡改与响应分析,完整掌控网络通信链路,轻松突破层层防护体系。 +- **事件驱动架构**: 实时响应页面事件、网络请求与用户交互,构建能动态适应防护系统的智能自动化流。 +- **直观的元素定位**: 使用符合人类直觉的定位方法 `find()` 和 `query()` ,面对动态加载的防护内容,定位依然精准。 +- **强类型安全**: 完备的类型系统为复杂自动化场景提供更优的IDE支持,并能更好地预防运行时报错。 + + +准备好开始了吗?以下内容将带您从安装配置、基础使用到高级功能,全面掌握 Pydoll 的最佳实践。 + +让我们以最优雅的方式,开启您的网页自动化之旅!🚀 + +## 快速上手示例 + +让我们从一个实际案例开始。以下脚本将打开 Pydoll 的 GitHub 仓库并点击 star 按钮: + +```python +import asyncio +from pydoll.browser.chromium import Chrome + +async def main(): + async with Chrome() as browser: + tab = await browser.start() + await tab.go_to('https://github.com/autoscrape-labs/pydoll') + + star_button = await tab.find( + tag_name='button', + timeout=5, + raise_exc=False + ) + if not star_button: + print("Ops! The button was not found.") + return + + await star_button.click() + await asyncio.sleep(3) + +asyncio.run(main()) +``` + +此示例演示了如何导航到网站、等待元素出现并与之交互。您可以使用这样的模式来自动执行许多不同的 Web 任务。 + +??? note "或者不使用上下文管理器..." + 如果你不想使用上下文管理器模式,可以手动管理浏览器实例: + + ```python + import asyncio + from pydoll.browser.chromium import Chrome + + async def main(): + browser = Chrome() + tab = await browser.start() + await tab.go_to('https://github.com/autoscrape-labs/pydoll') + + star_button = await tab.find( + tag_name='button', + timeout=5, + raise_exc=False + ) + if not star_button: + print("Ops! 
The button was not found.") + return + + await star_button.click() + await asyncio.sleep(3) + await browser.stop() + + asyncio.run(main()) + ``` + + Note that when not using the context manager, you'll need to explicitly call `browser.stop()` to release resources. + +## 进阶示例: 自定义浏览器配置 + +对于更高级的使用场景,Pydoll 允许您使用 `ChromiumOptions` 类自定义浏览器配置。此功能在您需要执行以下操作时非常有用: + +- 在无头模式下运行(无可见浏览器窗口) +- 指定自定义浏览器可执行文件路径 +- 配置代理、用户代理或其他浏览器设置 +- 设置窗口尺寸或启动参数 + +以下示例展示了如何使用 Chrome 的自定义选项: + +```python hl_lines="8-12 30-32 34-38" +import asyncio +import os +from pydoll.browser.chromium import Chrome +from pydoll.browser.options import ChromiumOptions + +async def main(): + options = ChromiumOptions() + options.binary_location = '/usr/bin/google-chrome-stable' + options.add_argument('--headless=new') + options.add_argument('--start-maximized') + options.add_argument('--disable-notifications') + + async with Chrome(options=options) as browser: + tab = await browser.start() + await tab.go_to('https://github.com/autoscrape-labs/pydoll') + + star_button = await tab.find( + tag_name='button', + timeout=5, + raise_exc=False + ) + if not star_button: + print("Ops! The button was not found.") + return + + await star_button.click() + await asyncio.sleep(3) + + screenshot_path = os.path.join(os.getcwd(), 'pydoll_repo.png') + await tab.take_screenshot(path=screenshot_path) + print(f"Screenshot saved to: {screenshot_path}") + + base64_screenshot = await tab.take_screenshot(as_base64=True) + + repo_description_element = await tab.find( + class_name='f4.my-3' + ) + repo_description = await repo_description_element.text + print(f"Repository description: {repo_description}") + +if __name__ == "__main__": + asyncio.run(main()) +``` + + +此扩展示例演示了: + +1. 创建和配置浏览器选项 +2. 设置自定义Chrome可执行程序路径 +3. 启用无头模式,在没有可见窗口的情况下运行 +4. 设置其他浏览器命令行参数 +5. 截取屏幕截图(在无头模式下尤其有用) + +??? 
info "关于Chrome配置选项" + `options.add_argument()` 方法允许您传递任何 Chromium 命令行参数来自定义浏览器行为。有数百个选项可用于控制从网络到渲染行为的所有内容。 + + 常用Chrome配置选项 + + ```python + # 性能与行为选项 + options.add_argument('--headless=new') # 以无头模式运行Chrome + options.add_argument('--disable-gpu') # 禁用GPU加速 + options.add_argument('--no-sandbox') # 禁用沙盒模式(需谨慎使用) + options.add_argument('--disable-dev-shm-usage') # 解决资源限制问题 + + # 界面显示选项 + options.add_argument('--start-maximized') # 以最大化窗口启动 + options.add_argument('--window-size=1920,1080') # 设置特定窗口尺寸 + options.add_argument('--hide-scrollbars') # 隐藏滚动条 + + # 网络选项 + options.add_argument('--proxy-server=socks5://127.0.0.1:9050') # 使用代理服务器 + options.add_argument('--disable-extensions') # 禁用扩展程序 + options.add_argument('--disable-notifications') # 禁用通知 + + # 隐私与安全 + options.add_argument('--incognito') # 以隐身模式运行 + options.add_argument('--disable-infobars') # 禁用信息栏 + ``` + + 完整参考指南 + + 如需获取所有可用的Chrome命令行参数完整列表,请参考以下资源: + + - [Chromium Command Line Switches](https://peter.sh/experiments/chromium-command-line-switches/) - Complete reference list + - [Chrome Flags](chrome://flags) - Enter this in your Chrome browser address bar to see experimental features + - [Chromium Source Code Flags](https://source.chromium.org/chromium/chromium/src/+/main:chrome/common/chrome_switches.cc) - Direct source code reference + + 请注意某些选项在不同Chrome版本中的表现可能有所不同,建议在升级Chrome时测试您的配置。 + +通过这些配置,您可以在各种环境中运行 Pydoll,包括 CI/CD 流水线、无显示器的服务器或 Docker 容器。 + +继续阅读文档,探索 Pydoll 在处理验证码、处理多个标签页、与元素交互等方面的强大功能。 + +## 极简依赖 + +Pydoll 的优势之一是其轻量级的占用空间。与其他需要大量依赖项的浏览器自动化工具不同,Pydoll 在保留强大功能的同时力求精简。 + +### 核心依赖 + +Pydoll仅依赖少量的核心库: + +``` +python = "^3.10" +websockets = "^13.1" +aiohttp = "^3.9.5" +aiofiles = "^23.2.1" +bs4 = "^0.0.2" +``` + +这种极简依赖策略带来五大核心优势: + +- **⚡ 闪电安装** - 无需解析复杂的依赖树 +- **🧩 零冲突** - 与其他包发生版本冲突的概率极低 +- **📦 轻量化** - 更低的磁盘空间占用 +- **🔒 更好的安全** - 更小的攻击面,更低的供应链漏洞风险 +- **🔄 方便升级** - 维护更简单,破坏性变更更少 + +更少的依赖项带来了: 更高的运行可靠性以及更强的性能表现。 + +## 许可证 + +Pydoll 遵循 MIT 许可证(完整文本见 LICENSE 文件),主要授权条款包括: + +1. 
权利授予 + - 永久、全球范围、免版税的使用权 + - 允许修改创作衍生作品 + - 可再授权给第三方 + +2. 唯一责任限制 + - 所有修改件必须保留原版权声明 + - 不提供任何明示或默示担保 + +??? info "View Full MIT License Text" + ``` + MIT License + + Copyright (c) 2023 Pydoll Contributors + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + ```