adv-web-scraper-api

Navigation Step Types

This document describes the step types available in the navigation engine.

Basic Navigation

goto

Navigates to a URL.

{
  type: 'goto',
  value: 'https://example.com',
  waitFor: 'networkidle' // optional
}

wait

Waits for a condition.

{
  type: 'wait',
  value: 5000, // ms
  humanLike: true, // Optional: Randomizes wait time slightly around the specified value (only for numeric values)
  // OR
  value: '#element', // selector
  // OR
  waitFor: 'navigation' // or 'networkidle'
}
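
The humanLike option for numeric waits can be pictured as a small jitter around the nominal value. The following is a minimal sketch, not the engine's actual code: the helper name `humanLikeDelay`, the 20% default spread, and the injectable `rng` parameter are all assumptions made for illustration.

```typescript
// Hypothetical helper: spread a numeric wait around its nominal value.
// `spread` is a fraction of the base value; `rng` is injectable so the
// result can be made deterministic in tests.
function humanLikeDelay(
  baseMs: number,
  spread: number = 0.2,
  rng: () => number = Math.random,
): number {
  // Map rng() from [0, 1) to [-1, 1), then scale to +/- spread * baseMs.
  const offset = (rng() * 2 - 1) * spread * baseMs;
  // Never return a negative delay.
  return Math.max(0, Math.round(baseMs + offset));
}
```

With the defaults, a step with value: 5000 would wait somewhere between roughly 4000 and 6000 ms.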

Parameters:

Mouse Interactions

click

Clicks an element using various methods (single, double, keyboard) and options. This step uses Playwright’s robust elementHandle.click() or elementHandle.dblclick() methods, which handle scrolling the element into view and performing actionability checks.

{
  type: 'click',
  selector: '#button',
  clickMethod: 'single', // 'single' (default), 'double', 'keyboard'
  button: 'left',      // 'left' (default), 'right', 'middle'
  modifiers: ['Shift'], // Optional: ['Alt', 'Control', 'Meta', 'Shift']
  position: { x: 10, y: 10 }, // Optional: Click offset within element
  force: false,        // Optional: Bypass actionability checks (default: false)
  waitFor: '#next-page', // Optional: Wait condition after click
  timeout: 30000       // Optional: Timeout for the click action itself
}
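
How clickMethod could map onto Playwright's click and dblclick calls is sketched below. This is an illustration under stated assumptions rather than the engine's handler: `ClickTarget` is a locally declared subset of the Playwright element API so the snippet compiles on its own, `performClick` is a hypothetical name, and treating the 'keyboard' method as "focus, then press Enter" is a guess.

```typescript
type MouseButton = 'left' | 'right' | 'middle';
type Modifier = 'Alt' | 'Control' | 'Meta' | 'Shift';

interface ClickOptions {
  button?: MouseButton;
  modifiers?: Modifier[];
  position?: { x: number; y: number };
  force?: boolean;
  timeout?: number;
}

// Local stand-in for the subset of Playwright's element API used here.
interface ClickTarget {
  click(options?: ClickOptions): Promise<void>;
  dblclick(options?: ClickOptions): Promise<void>;
  focus(): Promise<void>;
  press(key: string): Promise<void>;
}

interface ClickStep extends ClickOptions {
  type: 'click';
  selector: string;
  clickMethod?: 'single' | 'double' | 'keyboard';
}

function performClick(target: ClickTarget, step: ClickStep): Promise<void> {
  // Apply the documented defaults before delegating.
  const options: ClickOptions = {
    button: step.button ?? 'left',
    modifiers: step.modifiers,
    position: step.position,
    force: step.force ?? false,
    timeout: step.timeout,
  };
  switch (step.clickMethod ?? 'single') {
    case 'double':
      return target.dblclick(options);
    case 'keyboard':
      // Assumption: a "keyboard click" focuses the element and presses Enter.
      return target.focus().then(() => target.press('Enter'));
    default:
      return target.click(options);
  }
}
```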

Parameters:

Examples:

// Right-click an element
{
  type: 'click',
  selector: '#context-menu-trigger',
  button: 'right'
}

// Double-click an item
{
  type: 'click',
  selector: '.list-item',
  clickMethod: 'double'
}

// Shift-click a link
{
  type: 'click',
  selector: 'a.special-link',
  modifiers: ['Shift']
}

// Force click a potentially obscured button
{
  type: 'click',
  selector: '#tricky-button',
  force: true
}

Note on click vs. mousemove with action: 'click':

mousemove

Moves the mouse cursor to a target element or specific coordinates, with options for intermediate points and subsequent actions like clicking, dragging, or scrolling the wheel. Uses a structured approach for defining targets and parameters.

Parameters:

Targeting:

Movement Path (Optional):

Action-Specific Parameters:

Enhancements (Optional):

Example:

{
  type: 'mousemove',
  action: 'drag',
  mouseTarget: { selector: '#draggable-item', offsetY: -10 }, // Start drag 10px above center
  endPoint: { x: 300, y: 400 }, // Where to drop it
  startPoint: { x: 10, y: 10 }, // Start movement from specific coords
  duration: 1500, // Duration for the drag movement
  humanLike: true,
  randomizeOffset: 3, // Randomize start/end points within 3px radius if selectors used
  delayAfterAction: { min: 100, max: 300 } // Pause 100-300ms after dropping
}

{
  type: 'mousemove',
  action: 'click',
  mouseTarget: { selector: '#button', randomizeOffset: true }, // Click near button center
  delayBeforeAction: 50 // Wait 50ms before clicking
}

{
  type: 'mousemove',
  action: 'wheel',
  mouseTarget: { selector: '#scrollable-div' }, // Move mouse over div first
  delta: { y: 200 }, // Scroll down by 200px
  duration: 300 // Duration for the scroll action
}

hover

Hovers over an element.

{
  type: 'hover',
  selector: '.tooltip-trigger',
  duration: 1500, // total hover time (move + pause) in ms
  humanLike: true, // Optional: Randomizes the pause duration after moving the mouse
  waitFor: '.tooltip' // optional
}

Parameters:

Selecting within Shadow DOM

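Playwright’s CSS and text locators pierce open Shadow DOM boundaries by default, so no special step type is needed. Use an ordinary CSS descendant selector in the step's selector (in Playwright code, chaining .locator() calls works the same way). XPath selectors do not pierce shadow roots, and closed shadow roots are not supported.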

Example:

Assume the following structure:

<host-element>
  #shadow-root (open)
    <div id="inner-element">Click Me</div>
    <nested-component>
      #shadow-root (open)
        <button>Nested Button</button>
    </nested-component>
</host-element>

You can target elements inside the shadow roots like this:

// Target #inner-element directly using standard CSS
{
  type: 'click',
  selector: 'host-element #inner-element'
}

// Target the nested button
{
  type: 'click',
  selector: 'host-element nested-component button'
}

// Alternative using Playwright's text selector (also pierces shadow DOM)
{
  type: 'click',
  selector: 'text="Nested Button"'
}

Key Points:

Input Operations

input

Enters text into an input field or the currently focused element.

{
  type: 'input',
  selector: '#search', // Optional if useFocusedElement is true
  value: 'query',
  clearInput: true, // Optional: Clears the input before typing (requires selector)
  humanInput: true, // Optional: Enables human-like typing (default: false)
  useFocusedElement: false // Optional: If true, types into the currently focused element, ignoring selector (default: false)
}

Parameters:

Example (Typing into Focused Element):

[
  // Assume a previous step focused an input field
  {
    type: 'input',
    useFocusedElement: true,
    value: 'Text for the focused input',
    humanInput: true // Human-like typing into focused element
  }
]

select

Selects an option from a dropdown.

{
  type: 'select',
  selector: '#country',
  value: 'US'
}

login

Handles common login flows involving username, password, and submit button.

{
  type: 'login',
  usernameSelector: '#username', // Selector for username/email input
  passwordSelector: '#password', // Selector for password input
  submitSelector: 'button[type="submit"]', // Selector for submit button
  usernameValue: '', // Username (use context/secrets)
  passwordValue: '', // Password (use context/secrets)
  waitForNavigation: true, // Optional: Wait for navigation after submit (default: true). Can be selector or timeout (ms).
  // strategy: 'standard', // Optional: For future complex flows (e.g., SSO)
  description: 'Log into the application', // Optional description
  humanLike: false // Optional: If true, adds small, randomized "think time" pauses after filling username and password fields. Defaults to `false`.
}

Parameters:

Functionality:

  1. Locates and fills the username field.
  2. Locates and fills the password field.
  3. Locates and clicks the submit button.
  4. Handles waiting based on the waitForNavigation parameter.

This step simplifies common login sequences compared to using separate input and click steps.
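
To make the comparison concrete, here is a rough sketch of the separate input and click steps that a login step stands in for. The function name `expandLogin` and the reduced step shapes are hypothetical, and passing 'navigation' as the click step's waitFor value is an assumption borrowed from the wait step.

```typescript
interface LoginStep {
  type: 'login';
  usernameSelector: string;
  passwordSelector: string;
  submitSelector: string;
  usernameValue: string;
  passwordValue: string;
  waitForNavigation?: boolean | string | number;
}

interface InputStepLite { type: 'input'; selector: string; value: string; clearInput: boolean }
interface ClickStepLite { type: 'click'; selector: string; waitFor?: string | number }
type ExpandedStep = InputStepLite | ClickStepLite;

// Expand one login step into the equivalent input/input/click sequence.
function expandLogin(step: LoginStep): ExpandedStep[] {
  const wait = step.waitForNavigation ?? true; // documented default: true
  const submit: ClickStepLite = { type: 'click', selector: step.submitSelector };
  if (wait === true) {
    submit.waitFor = 'navigation'; // assumption: click accepts this value
  } else if (wait !== false) {
    submit.waitFor = wait; // selector string or timeout in ms
  }
  return [
    { type: 'input', selector: step.usernameSelector, value: step.usernameValue, clearInput: true },
    { type: 'input', selector: step.passwordSelector, value: step.passwordValue, clearInput: true },
    submit,
  ];
}
```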

uploadFile

Uploads a file to an <input type="file"> element.

{
  type: 'uploadFile',
  selector: 'input#file-upload', // Selector for the file input element
  filePath: './data/my-document.pdf', // Path to the file to upload
  description: 'Upload the user document' // Optional description
}

Parameters:

Functionality:

  1. Locates the file input element specified by the selector.
  2. Uses Playwright’s setInputFiles method to set the value of the input to the specified filePath.

Security Note: Be cautious when allowing arbitrary file paths. The implementation includes a basic check (fs.existsSync) but consider adding more robust validation based on allowed base directories if the filePath can be influenced by external input.
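
One way to restrict uploads to an allowed base directory is sketched below. It is a simplified illustration, not part of the current implementation: the function names are hypothetical, it assumes an absolute POSIX-style base directory, and it compares path segments by hand so that it has no dependencies. Production code would more likely rely on Node's path.resolve and path.relative and resolve symlinks before checking.

```typescript
// Split a POSIX-style path into segments, resolving "." and "..".
function normalizeSegments(path: string): string[] {
  const out: string[] = [];
  for (const segment of path.split('/')) {
    if (segment === '' || segment === '.') continue;
    if (segment === '..') {
      out.pop(); // step up one level (no-op at the root)
      continue;
    }
    out.push(segment);
  }
  return out;
}

// True only if filePath resolves to a location strictly inside baseDir.
// Relative paths are resolved against baseDir; absolute paths are taken as-is.
function isInsideBase(baseDir: string, filePath: string): boolean {
  const base = normalizeSegments(baseDir);
  const target = filePath.startsWith('/')
    ? normalizeSegments(filePath)
    : normalizeSegments(`${baseDir}/${filePath}`);
  // Compare whole segments so "/srv/uploads-evil" does not match "/srv/uploads".
  return target.length > base.length && base.every((segment, i) => target[i] === segment);
}
```

A handler could run a check like this before calling setInputFiles and reject the step when it returns false.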

press

Simulates keyboard key presses, including single keys, modifiers, and sequences using Playwright’s keyboard.press, keyboard.down, and keyboard.up methods.

{
  type: 'press',
  key: 'Enter', // Key to press (e.g., 'A', 'Enter', 'ArrowDown', '$', see Playwright key names)
  modifiers: ['Shift', 'Meta'], // Optional: ['Alt', 'Control', 'Meta', 'Shift']
  action: 'press', // Optional: 'press' (default, combines down+up), 'down', 'up'
  delay: 100, // Optional: Delay (ms) between down/up for 'press' action
  selector: '#myInput', // Optional: CSS selector to focus before pressing
  waitFor: '#result', // Optional: Wait condition after pressing
  description: 'Press Shift+Command+Enter in the input field', // Optional
  timeout: 30000, // Optional: Maximum time in milliseconds for associated waits (like `waitForSelector` if `selector` is used, or the `waitFor` condition). Defaults to 30000ms.
  optional: false // Optional: If true, failure during the step (e.g., element not found for focus) will not halt the flow (default: false).
}

Parameters:

Functionality:

  1. If selector is provided, waits for the element and focuses it.
  2. Performs the specified keyboard action (press, down, or up) using the key.
  3. For the 'press' action, it correctly combines the key with any specified modifiers (e.g., Shift+A).
  4. Handles the optional delay for the 'press' action.
  5. Handles the optional waitFor condition after the action.
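
The core of steps 2 to 4 can be sketched as follows. This is an assumption-laden illustration, not the handler itself: `KeyboardLike` is a local stand-in for the press, down, and up methods of Playwright's keyboard object, `performPress` is a hypothetical name, and joining modifiers and key with "+" relies on Playwright accepting shortcut strings such as "Control+Shift+T".

```typescript
type KeyModifier = 'Alt' | 'Control' | 'Meta' | 'Shift';

// Local stand-in for the keyboard methods used by the press step.
interface KeyboardLike {
  press(key: string, options?: { delay?: number }): Promise<void>;
  down(key: string): Promise<void>;
  up(key: string): Promise<void>;
}

interface PressStep {
  type: 'press';
  key: string;
  modifiers?: KeyModifier[];
  action?: 'press' | 'down' | 'up';
  delay?: number;
}

function performPress(keyboard: KeyboardLike, step: PressStep): Promise<void> {
  switch (step.action ?? 'press') {
    case 'down':
      return keyboard.down(step.key); // key stays held until a matching 'up'
    case 'up':
      return keyboard.up(step.key);
    default: {
      // Build a chord such as "Shift+Meta+Enter" from modifiers and key.
      const chord = [...(step.modifiers ?? []), step.key].join('+');
      return keyboard.press(chord, { delay: step.delay });
    }
  }
}
```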

Examples:

// Press Enter key
{
  type: 'press',
  key: 'Enter'
}

// Type 'A' (Shift + a)
{
  type: 'press',
  key: 'A', // Playwright handles Shift+a as 'A'
  // OR explicitly:
  // key: 'a',
  // modifiers: ['Shift']
}

// Press Ctrl+C (Windows/Linux)
{
  type: 'press',
  key: 'C',
  modifiers: ['Control'] // Playwright does not remap Control to Meta on macOS
  // For Cmd+C on macOS, use:
  // key: 'C',
  // modifiers: ['Meta']
}

// Sequence: Hold Shift, press 'a', release Shift
[
  { type: 'press', key: 'Shift', action: 'down' },
  { type: 'press', key: 'a' }, // 'a' is pressed while Shift is down
  { type: 'press', key: 'Shift', action: 'up' }
]

// Focus input and press Tab
{
  type: 'press',
  selector: '#username',
  key: 'Tab'
}

Example: Shift + Option + Command + S (Chord Press)

The configuration step would look like this:

{
  "type": "press",
  "key": "S",
  "modifiers": ["Shift", "Alt", "Meta"],
  "description": "Press Shift + Option + Command + S"
}

Example: Shift + Command + s + d (Sequence Press)

[
  {
    "type": "press",
    "key": "Shift",
    "action": "down",
    "description": "Hold Shift down"
  },
  {
    "type": "press",
    "key": "Meta", // Command key on macOS
    "action": "down",
    "description": "Hold Command down"
  },
  {
    "type": "press",
    "key": "s",
    // Modifiers are implicitly active due to 'down' actions above
    "description": "Press 's' while holding Shift+Command"
  },
  {
    "type": "press",
    "key": "d",
    // Modifiers are still active
    "description": "Press 'd' while holding Shift+Command"
  },
  {
    "type": "press",
    "key": "Meta", // Release Command
    "action": "up",
    "description": "Release Command"
  },
  {
    "type": "press",
    "key": "Shift", // Release Shift
    "action": "up",
    "description": "Release Shift"
  }
]

Note: page.keyboard.press() accepts simple combinations directly in the key property (e.g., key: 'Shift+A'), but complex chords and sequences are handled more reliably with the modifiers array and the down/up/press actions shown above. The planned handler focuses on the explicit modifiers and action approach for clarity and robustness.

Data Extraction

extract

Extracts data from elements.

{
  type: 'extract',
  selector: '.product', // Can be omitted if running within forEachElement context
  name: 'products', // stored in context
  multiple: true, // Optional: Extract from multiple elements matching selector
  fields: {
    title: { selector: '.title', type: 'css' },
    price: { selector: '.price', type: 'css' },
    // Example using "self": Get the element's own text
    productText: { selector: 'self', type: 'css' },
    // Example using "self": Get an attribute from the element itself
    productId: { selector: 'self', type: 'css', attribute: 'data-product-id' }
  },
  continueOnError: false, // Optional: Continue if an error occurs (default: false)
  defaultValue: null, // Optional: Default value if extraction fails and continueOnError is true
  usePageScope: false // Optional: Force using page scope even if inside forEachElement (default: false)
}

Example:

{
  "type": "extract",
  "name": "newsData",
  "selector": "div.card-header:text('News') + div",
  "description": "Extract news items from the right sidebar News box",
  "fields": {
    "newsItems": {
      "selector": "a",
      "type": "css",
      "multiple": true,
      "fields": {
        "title": {
          "selector": "self",
          "type": "css"
        },
        "url": {
          "selector": "self",
          "type": "css",
          "attribute": "href"
        }
      }
    }
  }
}

Parameters:

Flow Control

condition

Conditional step execution.

{
  type: 'condition',
  condition: '#next-page', // selector or function
  thenSteps: [...],       // steps if true
  elseSteps: [...]        // steps if false
}

gotoStep

Jumps execution to a specific step index (1-based). Useful for creating loops, especially in combination with condition.

{
  type: 'gotoStep',
  step: 5 // Jump back to step 5
}
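
The way gotoStep and condition combine into a loop can be sketched with a tiny interpreter. This is a simplified model, not the engine's code: conditions are plain functions here (the real step also accepts selectors), the 'action' step type and the `maxJumps` guard are hypothetical additions, and a gotoStep nested inside a condition is assumed to jump within the top-level step list.

```typescript
type FlowStep =
  | { type: 'action'; run: () => void }
  | { type: 'gotoStep'; step: number } // 1-based index into the top-level list
  | { type: 'condition'; condition: () => boolean; thenSteps: FlowStep[]; elseSteps?: FlowStep[] };

// Execute one step; return a 1-based jump target if a gotoStep was hit.
function executeStep(step: FlowStep): number | undefined {
  switch (step.type) {
    case 'action':
      step.run();
      return undefined;
    case 'gotoStep':
      return step.step;
    case 'condition': {
      const branch = step.condition() ? step.thenSteps : (step.elseSteps ?? []);
      for (const nested of branch) {
        const jump = executeStep(nested);
        if (jump !== undefined) return jump;
      }
      return undefined;
    }
  }
}

function runFlow(steps: FlowStep[], maxJumps: number = 100): void {
  let index = 0;
  let jumps = 0;
  while (index < steps.length) {
    const jump = executeStep(steps[index]);
    if (jump === undefined) {
      index += 1;
      continue;
    }
    if (jump < 1 || jump > steps.length) throw new RangeError(`gotoStep target ${jump} is out of range`);
    jumps += 1;
    if (jumps > maxJumps) throw new Error('gotoStep jump limit exceeded');
    index = jump - 1; // convert the 1-based step number to an array index
  }
}
```

A jump limit of some kind is worth having in any real loop built this way, since a condition that never turns false would otherwise run forever.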

paginate

Handles pagination.

{
  type: 'paginate',
  selector: '.next-page',
  maxPages: 5,
  extractSteps: [...] // steps to repeat per page
}

Validation

assert

Performs an assertion check on an element’s state. If the assertion fails and the step is not marked as optional, the navigation flow will halt with an error.

{
  type: 'assert',
  selector: '#status-message', // Selector for the element to check
  assertionType: 'containsText', // Type of check to perform
  expectedValue: 'Success', // Value to check against (for text/attribute checks)
  attributeName: 'class', // Attribute name (for attribute checks)
  timeout: 10000, // Optional: Max time (ms) to wait for condition (default: 5000)
  optional: false, // Optional: If true, failure won't stop the flow (default: false)
  description: 'Verify success message appears' // Optional description
}

Assertion Types (assertionType):

  1. exists: Checks if at least one element matching the selector exists in the DOM.
  2. isVisible: Checks if the element is visible (i.e., not display: none, visibility: hidden, etc.). Waits for the element to become visible within the timeout.
  3. isHidden: Checks if the element is hidden. Waits for the element to become hidden within the timeout. An element that doesn’t exist is considered hidden.
  4. containsText: Checks if the element’s text content includes the expectedValue (string or RegExp). Waits for the condition within the timeout.
  5. hasAttribute: Checks if the element possesses the attribute specified by attributeName. Waits for the condition within the timeout.
  6. attributeEquals: Checks if the element’s attribute (attributeName) matches the expectedValue (string or RegExp). Waits for the condition within the timeout.
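
The six checks can be summarized as a single function over a snapshot of the element. This is a minimal sketch under assumptions: `ElementSnapshot` and `checkAssertion` are hypothetical names, the snapshot stands in for a live Playwright element, and the function evaluates one snapshot only, whereas the real step keeps re-checking until the timeout expires.

```typescript
type AssertionType =
  | 'exists' | 'isVisible' | 'isHidden'
  | 'containsText' | 'hasAttribute' | 'attributeEquals';

// Hypothetical point-in-time view of an element; null means "not in the DOM".
interface ElementSnapshot {
  visible: boolean;
  text: string;
  attributes: Map<string, string>;
}

function matchesExpected(actual: string, expected: string | RegExp, exact: boolean): boolean {
  if (expected instanceof RegExp) return expected.test(actual);
  return exact ? actual === expected : actual.includes(expected);
}

function checkAssertion(
  type: AssertionType,
  el: ElementSnapshot | null,
  expectedValue?: string | RegExp,
  attributeName?: string,
): boolean {
  // A missing element counts as hidden.
  if (type === 'isHidden') return el === null || !el.visible;
  if (el === null) return false;
  switch (type) {
    case 'exists':
      return true;
    case 'isVisible':
      return el.visible;
    case 'containsText':
      return expectedValue !== undefined && matchesExpected(el.text, expectedValue, false);
    case 'hasAttribute':
      return attributeName !== undefined && el.attributes.has(attributeName);
    case 'attributeEquals': {
      const actual = attributeName === undefined ? undefined : el.attributes.get(attributeName);
      return actual !== undefined && expectedValue !== undefined
        && matchesExpected(actual, expectedValue, true);
    }
  }
}
```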

Use Cases:

Difference from wait and condition:

Advanced

switchToFrame

Switches the execution context to an iframe and executes a series of steps within it. Optionally switches back to the parent frame afterwards.

{
  type: 'switchToFrame',
  selector: '#ad-iframe', // Option 1: Selector for the iframe element
  // frameId: 'user-form-frame', // Option 2: ID of the iframe
  // frameName: 'formTarget', // Option 3: Name of the iframe
  steps: [ // Steps to execute within the frame's context
    {
      type: 'input',
      selector: '#email-in-frame',
      value: 'test@example.com'
    },
    {
      type: 'click',
      selector: 'button.submit-in-frame'
    }
  ],
  switchToDefault: true, // Optional: Switch back to parent frame (default: true)
  description: 'Interact with elements inside the ad iframe' // Optional
}

Parameters:

Functionality:

  1. Locates the specified iframe using the provided selector, frameId, or frameName.
  2. Switches the execution context to the found frame.
  3. Executes the nested steps array sequentially within the frame’s context.
  4. If switchToDefault is true, implicitly restores the execution context to the parent frame upon completion or error.

Important Note: The current implementation passes the Playwright Frame object to the execute method of the handlers for the nested steps. This relies on the handlers being able to correctly use the Frame object (e.g., using frame.locator()) instead of the Page object. This might require adjustments to other handlers if they strictly expect a Page object.

handleDialog

Sets up a handler for the next browser dialog (alert, confirm, prompt) that appears on the page. This step should be placed before the action that triggers the dialog.

{
  type: 'handleDialog',
  action: 'accept', // 'accept' or 'dismiss'
  promptText: 'Optional text for prompt dialogs', // Only used if action is 'accept' and dialog is a prompt
  description: 'Accept the confirmation dialog' // Optional
}

// Example Usage:
[
  {
    type: 'handleDialog', // Set up listener first
    action: 'accept'
  },
  {
    type: 'click', // This click triggers the dialog
    selector: '#delete-button'
  }
]

Parameters:

Functionality:

  1. Registers a one-time listener for the dialog event on the current page.
  2. When a dialog appears, the listener automatically performs the specified action (accepting or dismissing).
  3. The handleDialog step itself completes immediately after setting up the listener; it does not wait for a dialog to appear.
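
The one-time listener described above might be registered roughly like this. The sketch declares `DialogLike` and `DialogSource` locally as stand-ins for Playwright's dialog and page objects (which provide once('dialog', ...), accept, dismiss, and type), and `armDialogHandler` is a hypothetical name.

```typescript
// Local stand-ins for the parts of Playwright's Dialog and Page used here.
interface DialogLike {
  type(): string; // e.g. 'alert', 'confirm', 'prompt'
  accept(promptText?: string): Promise<void>;
  dismiss(): Promise<void>;
}

interface DialogSource {
  once(event: 'dialog', listener: (dialog: DialogLike) => void): unknown;
}

interface HandleDialogStep {
  type: 'handleDialog';
  action: 'accept' | 'dismiss';
  promptText?: string;
}

// Registers the listener and returns immediately; it does not wait for a dialog.
function armDialogHandler(page: DialogSource, step: HandleDialogStep): void {
  page.once('dialog', (dialog) => {
    if (step.action === 'accept') {
      // promptText only makes sense for prompt dialogs.
      const text = dialog.type() === 'prompt' ? step.promptText : undefined;
      void dialog.accept(text);
    } else {
      void dialog.dismiss();
    }
  });
}
```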

Use Cases:

Important: Place this step immediately before the action step (like click) that is expected to trigger the dialog. The listener only applies to the next dialog that appears after this step is executed.

manageCookies

Allows for adding, deleting, clearing, or retrieving browser cookies for the current context.

// Example: Add a cookie
{
  type: 'manageCookies',
  action: 'add',
  cookies: [
    {
      name: 'session_id',
      value: '', // Use value from context
      domain: '.example.com',
      path: '/',
      secure: true,
      httpOnly: true,
      sameSite: 'Lax'
    }
  ],
  description: 'Add session cookie'
}

// Example: Get specific cookies and store in context
{
  type: 'manageCookies',
  action: 'get',
  domain: '.example.com', // Filter by domain
  contextKey: 'exampleCookies', // Store result in context.exampleCookies
  description: 'Get example.com cookies'
}

// Example: Clear all cookies
{
  type: 'manageCookies',
  action: 'clear',
  description: 'Clear all cookies'
}

// Example: Delete a specific cookie
{
  type: 'manageCookies',
  action: 'delete', // Note: 'clear' with filters achieves the same
  name: 'tracking_id',
  domain: '.example.com', // Optional filter
  description: 'Delete tracking cookie'
}
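
The name and domain filters used by the get and delete actions can be thought of as a simple predicate over the context's cookies. The sketch below is illustrative only: `CookieLike`, `CookieFilter`, and both function names are hypothetical, and exact string matching on domain is an assumption (the real step may match subdomains differently).

```typescript
interface CookieLike {
  name: string;
  value: string;
  domain: string;
  path: string;
}

interface CookieFilter {
  name?: string;
  domain?: string;
  path?: string;
}

// A cookie matches when every filter field that is set equals the cookie's field.
function matchesCookieFilter(cookie: CookieLike, filter: CookieFilter): boolean {
  return (filter.name === undefined || cookie.name === filter.name)
    && (filter.domain === undefined || cookie.domain === filter.domain)
    && (filter.path === undefined || cookie.path === filter.path);
}

// Split cookies into those a 'get' would return (or a 'delete' would remove)
// and those left untouched.
function partitionCookies(
  cookies: CookieLike[],
  filter: CookieFilter,
): { matched: CookieLike[]; rest: CookieLike[] } {
  const matched: CookieLike[] = [];
  const rest: CookieLike[] = [];
  for (const cookie of cookies) {
    (matchesCookieFilter(cookie, filter) ? matched : rest).push(cookie);
  }
  return { matched, rest };
}
```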

Parameters:

Functionality:

manageStorage

Allows for interacting with the browser’s localStorage or sessionStorage.

// Example: Set an item in localStorage
{
  type: 'manageStorage',
  storageType: 'local', // 'local' or 'session'
  action: 'setItem',
  key: 'userPreferences',
  value: { theme: 'dark', notifications: false }, // Value will be JSON.stringified
  description: 'Save user preferences to localStorage'
}

// Example: Get an item from sessionStorage and store in context
{
  type: 'manageStorage',
  storageType: 'session',
  action: 'getItem',
  key: 'sessionToken',
  contextKey: 'retrievedSessionToken', // Store result here
  description: 'Get session token from sessionStorage'
}

// Example: Remove an item from localStorage
{
  type: 'manageStorage',
  storageType: 'local',
  action: 'removeItem',
  key: 'tempData',
  description: 'Remove temporary data from localStorage'
}

// Example: Clear all sessionStorage
{
  type: 'manageStorage',
  storageType: 'session',
  action: 'clear',
  description: 'Clear all sessionStorage'
}
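
Because Web Storage holds only strings, setItem serializes the value and getItem has to reverse that. A minimal sketch of this round trip follows; `StorageLike` mirrors the standard Storage methods, while `writeItem` and `readItem` are hypothetical helper names, and falling back to the raw string when parsing fails is an assumption.

```typescript
// Same method shapes as the browser's localStorage / sessionStorage.
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
  clear(): void;
}

// Serialize any JSON-compatible value before storing it.
function writeItem(storage: StorageLike, key: string, value: unknown): void {
  storage.setItem(key, JSON.stringify(value));
}

// Parse the stored string back; return it unparsed if it is not valid JSON.
function readItem(storage: StorageLike, key: string): unknown {
  const raw = storage.getItem(key);
  if (raw === null) return null; // key not present
  try {
    return JSON.parse(raw);
  } catch {
    return raw;
  }
}
```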

Parameters:

Functionality:

scroll

Scrolls the page by a fixed distance or scrolls a specific element into view, with options for margin, scroll behavior, and timeout.

{
  type: 'scroll',
  // Either use directional scrolling:
  direction: 'down', // 'up'/'left'/'right'
  distance: 500,     // pixels

  // OR scroll to an element:
  selector: '#element',
  scrollIntoView: true, // use native scrollIntoView (default: false)
  scrollMargin: 50,     // additional margin in pixels
  behavior: 'smooth',   // 'smooth' or 'auto'
  timeout: 5000,       // maximum wait time in ms

  // Common options:
  waitFor: '#next-section', // optional selector to wait for
  humanLike: true       // Optional: If true, randomizes internal waits and, for directional scrolls, randomizes distance slightly and uses smoother, variable-speed animation.
}

Scroll Features:

  1. Directional Scrolling: Scroll by fixed amount in any direction
  2. Element Scrolling: Scroll to bring element into view with configurable margin
  3. Scroll Behavior: Use 'smooth' for animated scrolling or 'auto' for an instant jump
  4. Timeout Handling: Set maximum wait time for scroll operations
  5. Native vs Custom: Choose between browser-native or human-like scrolling
  6. Wait Conditions: Optionally wait for elements after scrolling
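
The variable-speed animation used for human-like directional scrolls could be approximated by splitting the distance into uneven increments. The sketch below is one possible approach, not the engine's implementation: `planScrollSteps` is a hypothetical name and the quadratic ease-in-out curve is an arbitrary choice. It does not cover the slight randomization of the total distance.

```typescript
// Split a scroll distance into per-step deltas that start slow, speed up in
// the middle, and slow down again. The deltas always sum to the rounded
// total distance.
function planScrollSteps(distance: number, stepCount: number): number[] {
  const easeInOut = (t: number): number =>
    t < 0.5 ? 2 * t * t : 1 - 2 * (1 - t) * (1 - t);

  const deltas: number[] = [];
  let previous = 0;
  for (let i = 1; i <= stepCount; i++) {
    // Absolute position after step i, rounded to whole pixels.
    const position = Math.round(distance * easeInOut(i / stepCount));
    deltas.push(position - previous);
    previous = position;
  }
  return deltas;
}
```

Each delta could then be applied as one small scroll with a short pause in between.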

Implementation Details:

Examples:

// Directional scrolling with smooth behavior
{
  type: 'scroll',
  direction: 'down',
  distance: 1000,
  behavior: 'smooth',
  timeout: 3000
}

// Scroll to element with margin and wait
{
  type: 'scroll',
  selector: '#footer',
  scrollMargin: 100,
  waitFor: '#content-loaded'
}

// Native browser scrollIntoView with timeout
{
  type: 'scroll',
  selector: '#header',
  scrollIntoView: true,
  timeout: 5000
}

// Horizontal scrolling
{
  type: 'scroll',
  direction: 'right',
  distance: 300
}

forEachElement

Loops through elements matching a selector and executes steps for each one.

{
  type: 'forEachElement',
  selector: 'tr.items', // CSS selector for elements to loop through
  description: 'Optional description',
  maxIterations: 50, // Optional limit on number of elements to process
  elementSteps: [ // Steps to execute for each matched element
    {
      type: 'click',
      selector: '.details-btn', // Relative to current element
      description: 'Click details button'
    },
    {
      type: 'wait',
      value: '.details-panel',
      timeout: 5000
    },
    {
      type: 'extract',
      name: 'panelData',
      selector: '.details-panel',
      fields: {
        // Extraction fields
      }
    },
    {
      type: 'mergeContext', // Special step to merge data back
      source: 'panelData',
      target: 'results[]', //  is available in elementSteps
      mergeStrategy: {
        // Define how to merge fields
      }
    }
  ]
}

Key Features:

Example with Data Merging:

{
  type: 'forEachElement',
  selector: 'tr.products',
  elementSteps: [
    {
      type: 'click',
      selector: 'button.more-info'
    },
    {
      type: 'wait',
      value: '.product-details',
      timeout: 5000
    },
    {
      type: 'extract',
      name: 'productDetails',
      selector: '.product-details',
      fields: {
        name: { selector: '.name', type: 'css' },
        price: { selector: '.price', type: 'css' }
      }
    },
    {
      type: 'mergeContext',
      source: 'productDetails',
      target: 'products[]',
      mergeStrategy: {
        name: 'overwrite',
        price: 'overwrite'
      }
    }
  ]
}

Real-world Example: For a complete implementation showing how to use forEachElement and mergeContext to scrape Google Trends data, see:
Google Trends Navigation Example

This example demonstrates:

executeScript

Executes custom JavaScript.

{
  type: 'executeScript',
  script: 'window.scrollTo(0, document.body.scrollHeight)'
}

Mouse Movement Details

The mouse movement system provides:

  1. Human-like movement patterns using Bezier curves
  2. Adjustable speed via duration parameter
  3. Both element-based and coordinate-based targeting
  4. Smooth transitions between points
  5. Random delays to simulate human behavior
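
A curved path of the kind listed above can be generated with a cubic Bezier curve. The following is a minimal sketch, not the engine's actual path generator: the function names, the placement of control points at one third and two thirds of the line, and the 20% sideways bend limit are all assumptions.

```typescript
interface Point { x: number; y: number }

// Evaluate a cubic Bezier curve at parameter t in [0, 1].
function cubicBezier(p0: Point, p1: Point, p2: Point, p3: Point, t: number): Point {
  const u = 1 - t;
  const blend = (a: number, b: number, c: number, d: number): number =>
    u * u * u * a + 3 * u * u * t * b + 3 * u * t * t * c + t * t * t * d;
  return { x: blend(p0.x, p1.x, p2.x, p3.x), y: blend(p0.y, p1.y, p2.y, p3.y) };
}

// Build steps + 1 points from start to end along a randomly bent curve.
function mousePath(
  start: Point,
  end: Point,
  steps: number,
  rng: () => number = Math.random,
): Point[] {
  const dx = end.x - start.x;
  const dy = end.y - start.y;
  // Sideways bend as a fraction of the line length, in [-0.2, 0.2).
  const bend = (): number => (rng() * 2 - 1) * 0.2;
  const b1 = bend();
  const b2 = bend();
  // (-dy, dx) is perpendicular to the straight line from start to end.
  const c1: Point = { x: start.x + dx / 3 - dy * b1, y: start.y + dy / 3 + dx * b1 };
  const c2: Point = { x: start.x + (2 * dx) / 3 - dy * b2, y: start.y + (2 * dy) / 3 + dx * b2 };

  const path: Point[] = [];
  for (let i = 0; i <= steps; i++) {
    path.push(cubicBezier(start, c1, c2, end, i / steps));
  }
  return path;
}
```

Each point could then be passed to Playwright's page.mouse.move(x, y), with the pause between moves derived from the step's duration.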

Example complex mouse flow:

[
  {
    type: 'mousemove',
    endPoint: { x: 100, y: 100 },
    duration: 800,
  },
  {
    type: 'mousemove',
    mouseTarget: { selector: '#menu'},
    duration: 1200,
  },
  {
    type: 'hover',
    selector: '#submenu', // hover targets an element via selector
    duration: 2000,
  },
];