Crawl Configuration
Customize crawl depth, speed, rendering, and extraction rules.
Crawltable's default settings are designed to work well out of the box, but for larger sites or specific use cases you'll want to fine-tune your configuration.
Crawl depth
Control how deep Crawltable follows links from your start URL.
- Unlimited — follow every discoverable link (default)
- Custom depth — set a maximum number of hops from the start URL
- URL pattern — restrict the crawl to URLs matching a regex pattern
For large sites, combining a depth limit with URL patterns is the most efficient approach.
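The depth-plus-pattern combination above can be sketched as a simple filter. This is a hypothetical helper, not Crawltable's internal logic; the function name and parameters are assumptions for illustration.

```python
import re

def should_crawl(url: str, depth: int, max_depth: int, pattern: str) -> bool:
    """Crawl a URL only if it is within the depth limit AND matches the regex.

    Hypothetical sketch of combining a custom depth with a URL pattern."""
    return depth <= max_depth and re.search(pattern, url) is not None

# Restrict the crawl to the /blog/ section, at most 3 hops from the start URL
print(should_crawl("https://example.com/blog/post-1", 2, 3, r"/blog/"))  # True
print(should_crawl("https://example.com/shop/item", 1, 3, r"/blog/"))    # False
```

Either condition alone can explode on a large site; together they keep the frontier small.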
Crawl speed
Adjust the number of concurrent requests to balance speed and server load.
| Setting | Concurrent requests | Best for |
|---|---|---|
| Gentle | 2 | Production sites with limited resources |
| Standard | 5 | Most sites (default) |
| Fast | 10 | Development/staging environments |
| Custom | 1–20 | Fine-tuned control |
Be respectful of the sites you crawl. If you're crawling a production site you don't own, use Gentle mode or a lower custom value.
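"Concurrent requests" means a cap on how many fetches are in flight at once. A minimal sketch of that idea, assuming a stand-in `fetch` coroutine (Crawltable's real fetcher is not shown here), using the default Standard setting of 5:

```python
import asyncio

CONCURRENT_REQUESTS = 5  # the "Standard" setting from the table above

async def fetch(url: str) -> str:
    # Stand-in for a real HTTP request (assumption for illustration)
    await asyncio.sleep(0.01)
    return f"fetched {url}"

async def crawl(urls):
    sem = asyncio.Semaphore(CONCURRENT_REQUESTS)  # cap in-flight requests

    async def bounded(url):
        async with sem:  # wait if 5 fetches are already running
            return await fetch(url)

    return await asyncio.gather(*(bounded(u) for u in urls))

results = asyncio.run(crawl([f"https://example.com/p{i}" for i in range(12)]))
print(len(results))  # 12
```

Lowering the semaphore value to 2 gives Gentle-style behavior: the same pages are fetched, just fewer at a time.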
JavaScript rendering
By default, Crawltable renders every page with a headless browser to capture dynamically loaded content. You can adjust this:
- Full rendering — every page is rendered with JavaScript (default)
- Selective — only render pages matching a URL pattern
- Disabled — static HTML only, faster but may miss dynamic content
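The three rendering modes reduce to a per-URL decision. A hedged sketch, with mode names and the function signature assumed for illustration:

```python
import re

def needs_render(mode: str, url: str, pattern: str = "") -> bool:
    """Decide whether a page should go through the headless browser.

    Hypothetical decision logic mirroring the three modes above."""
    if mode == "full":
        return True  # every page is rendered (default)
    if mode == "selective":
        return re.search(pattern, url) is not None  # render only matching URLs
    return False  # "disabled": static HTML only

print(needs_render("selective", "https://example.com/app/dash", r"/app/"))  # True
print(needs_render("disabled", "https://example.com/app/dash"))             # False
```

Selective mode is useful when only part of a site (say, a JavaScript-heavy app section) needs rendering and the rest can be crawled as static HTML.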
Extraction rules
Beyond the standard data Crawltable extracts (titles, meta tags, headings, and links), you can define custom extraction rules:
- CSS selectors — extract text or attributes from specific elements
- Regex patterns — match and capture content from the raw HTML
- Schema.org types — extract specific structured data types
Custom extractions appear as additional columns in your crawl data table.
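To make the regex-rule idea concrete, here is a hedged sketch of matching a capture group against raw HTML; the sample HTML and rule are invented, and Crawltable's internal extraction engine may work differently:

```python
import re

# Sample raw HTML (assumption for illustration)
html = '<html><head><meta name="author" content="Ada"></head><body></body></html>'

# A regex extraction rule: the capture group becomes the column value
rule = r'<meta name="author" content="([^"]*)"'

match = re.search(rule, html)
author = match.group(1) if match else None
print(author)  # Ada
```

A CSS-selector rule would target the same kind of value structurally (e.g. by element and attribute) rather than by raw-HTML pattern, which is usually more robust to markup changes.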
Saving configurations
Click Save as preset to store your configuration for reuse. Presets are saved locally and appear in the crawl setup dropdown for quick access.
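A preset is simply a serializable snapshot of the settings above. The field names and structure below are hypothetical (Crawltable's real preset schema is not documented here); the sketch only shows that a full configuration round-trips cleanly through JSON:

```python
import json

# Hypothetical preset structure covering the settings described in this page
preset = {
    "name": "blog-crawl",
    "depth": {"mode": "custom", "max_depth": 3},
    "speed": {"mode": "standard", "concurrent_requests": 5},
    "rendering": {"mode": "selective", "pattern": "/blog/"},
    "extraction_rules": [
        {"type": "css", "selector": "h1", "column": "main_heading"},
    ],
}

serialized = json.dumps(preset, indent=2)  # what "saved locally" might store
restored = json.loads(serialized)
print(restored["speed"]["concurrent_requests"])  # 5
```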