Scanning
Auditoro scans your website to discover pages and run quality checks. This page explains how scanning works and the options available.
How Scans Work
1. Page Discovery
Auditoro discovers pages using two methods:
Sitemap-based discovery (preferred):
- Reads your sitemap.xml file
- Follows sitemap index files
- Discovers all listed URLs
Crawl-based discovery (fallback):
- Starts from your homepage
- Follows internal links
- Discovers pages recursively
If a sitemap is available, Auditoro uses it for faster, more complete discovery. If not, it crawls from your homepage.
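The sitemap-first, crawl-as-fallback flow described above can be sketched roughly as follows. This is a minimal illustration, not Auditoro's actual implementation: `fetch_text` and `fetch_links` are hypothetical callables standing in for the HTTP layer, and the `/sitemap.xml` location is an assumption.

```python
from collections import deque
from urllib.parse import urljoin, urlparse
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_from_sitemap(sitemap_url, fetch_text, seen=None):
    """Return page URLs from a sitemap, following sitemap index files."""
    seen = set() if seen is None else seen
    if sitemap_url in seen:  # guard against index files that reference each other
        return []
    seen.add(sitemap_url)
    text = fetch_text(sitemap_url)  # hypothetical: returns XML text or None
    if text is None:
        return []
    root = ET.fromstring(text)
    locs = [el.text.strip() for el in root.iter(f"{SITEMAP_NS}loc") if el.text]
    if root.tag == f"{SITEMAP_NS}sitemapindex":
        pages = []
        for child_sitemap in locs:
            pages.extend(urls_from_sitemap(child_sitemap, fetch_text, seen))
        return pages
    return locs


def crawl(homepage, fetch_links):
    """Breadth-first crawl from the homepage, following same-host links only."""
    host = urlparse(homepage).netloc
    found, queue, seen = [homepage], deque([homepage]), {homepage}
    while queue:
        page = queue.popleft()
        for href in fetch_links(page):  # hypothetical: returns hrefs found on the page
            url = urljoin(page, href).split("#")[0]  # resolve and drop fragments
            if urlparse(url).netloc == host and url not in seen:
                seen.add(url)
                found.append(url)
                queue.append(url)
    return found


def discover(homepage, fetch_text, fetch_links):
    """Prefer the sitemap; fall back to crawling when it yields nothing."""
    pages = urls_from_sitemap(urljoin(homepage, "/sitemap.xml"), fetch_text)
    return pages if pages else crawl(homepage, fetch_links)
```

Passing the fetchers in as arguments keeps the sketch independent of any particular HTTP client.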
2. Page Fetching
For each discovered page, Auditoro:
- Requests the page HTML
- Respects robots.txt directives
- Follows redirects (up to a limit)
- Captures response headers
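As an illustration of following redirects up to a limit while keeping the response headers, here is a small sketch. The `send` callable and the default of 5 redirects are assumptions made for the example; the actual limit Auditoro applies is not stated here.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


def fetch_page(url, send, max_redirects=5):
    """Follow redirects up to max_redirects and return the final response.

    `send(url)` is a hypothetical transport returning (status, headers, body),
    where headers is a dict keyed by canonical header names.
    """
    chain = [url]
    for _ in range(max_redirects + 1):
        status, headers, body = send(chain[-1])
        location = headers.get("Location")
        if status in REDIRECT_CODES and location:
            # Location may be relative, so resolve it against the current URL.
            chain.append(urljoin(chain[-1], location))
            continue
        return {
            "url": chain[-1],
            "status": status,
            "headers": headers,
            "body": body,
            "redirects": chain[:-1],  # URLs visited before the final one
        }
    raise RuntimeError(f"more than {max_redirects} redirects starting at {url}")
```

Capping the chain prevents redirect loops from stalling a scan.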
3. Check Execution
Each page is analyzed with 20+ quality checks across:
- SEO (titles, meta, headings)
- Performance and delivery (compression, cache headers)
- Security (HTTPS, headers)
- Accessibility (alt text, lang)
- Content and runtime quality (broken links, spelling, JS errors)
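To give a feel for what a page-level check looks like, the sketch below implements three simple ones (missing title, missing lang attribute, images without alt text) using Python's standard `html.parser`. The class name, function name, and issue strings are invented for illustration; Auditoro's own checks may work differently.

```python
from html.parser import HTMLParser


class BasicChecks(HTMLParser):
    """Collects the page facts needed by the sample checks below."""

    def __init__(self):
        super().__init__()
        self.has_lang = False
        self.in_title = False
        self.title = ""
        self.images_missing_alt = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "html" and attrs.get("lang"):
            self.has_lang = True
        elif tag == "title":
            self.in_title = True
        elif tag == "img" and "alt" not in attrs:
            # An empty alt="" is valid for decorative images, so only a
            # completely absent attribute is counted.
            self.images_missing_alt += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def run_checks(html):
    """Return a list of issue strings for one page of HTML."""
    parser = BasicChecks()
    parser.feed(html)
    parser.close()
    issues = []
    if not parser.title.strip():
        issues.append("seo: missing <title>")
    if not parser.has_lang:
        issues.append("accessibility: <html> has no lang attribute")
    if parser.images_missing_alt:
        issues.append(
            f"accessibility: {parser.images_missing_alt} image(s) without alt text"
        )
    return issues
```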
4. Score Calculation
After all checks complete, Auditoro:
- Aggregates all detected issues
- Calculates the health score
- Updates trend data
- Sends notifications (if configured)
Scan Types
On-Demand Scans
Start a scan manually at any time from the site dashboard. Use on-demand scans to:
- Audit a new site
- Check fixes you just deployed
- Get updated results before a meeting
- Investigate a reported problem
Scheduled Scans
Configure automatic recurring scans to monitor your site continuously.
Frequency options:
- Weekly - Best for active sites (Growth and Scale plans)
- Monthly - Available on all plans
See Scheduled Scans for setup instructions.
Scan Limits
Each plan includes a monthly scan allowance:
| Plan | Scans/Month |
|---|---|
| Starter | 30 |
| Growth | 100 |
| Scale | 500 |
What counts as a scan:
- Each on-demand scan counts as 1
- Each scheduled scan counts as 1
Your scan allowance resets each month on your billing cycle date.
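To plan usage against the allowance, the arithmetic is simply allowance minus scans used. The helper below is a hypothetical sketch (the function name is invented) that uses the plan figures from the table above.

```python
# Monthly scan allowance per plan, taken from the table above.
SCANS_PER_MONTH = {"Starter": 30, "Growth": 100, "Scale": 500}


def scans_remaining(plan, on_demand_used, scheduled_used):
    """Return how many scans are left in the current billing cycle."""
    allowance = SCANS_PER_MONTH[plan]
    used = on_demand_used + scheduled_used  # each scan of either type counts as 1
    return max(allowance - used, 0)


# Example: a Growth plan with 12 on-demand scans and 4 weekly scheduled scans.
print(scans_remaining("Growth", 12, 4))  # 84
```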
Page Limits
There's no strict page limit per scan, but very large sites may be crawled incrementally. Auditoro prioritizes pages in your sitemap.
For sites with thousands of pages, make sure your sitemap lists your priority pages. In these scans:
- Critical pages are always included
- Less important pages may be sampled
Scan Duration
Scan duration depends on:
- Site size - More pages take longer
- Server speed - Slow servers extend scan time
- Check complexity - Browser-based checks, such as external link checks and JavaScript error detection, take longer
Typical scan times:
- 10-50 pages: 1-3 minutes
- 51-200 pages: 3-10 minutes
- 201-1000 pages: 10-30 minutes
You can close the browser during a scan; it continues in the background. You'll receive a notification when the scan completes.
Robots.txt Respect
Auditoro respects your robots.txt file:
- Pages disallowed for all bots are skipped
- Crawl-delay directives are honored
- User-agent specific rules are followed
The Auditoro crawler identifies as:
User-agent: Auditoro
To give Auditoro full access while keeping other bots out of /private/:
User-agent: *
Disallow: /private/
User-agent: Auditoro
Allow: /
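If you want to confirm how rules like these are interpreted before deploying them, Python's standard `urllib.robotparser` can evaluate them locally. This sketch only approximates how a compliant crawler reads the file; the `allowed` helper and the example URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: Auditoro
Allow: /
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(allowed("Auditoro", "https://example.com/private/report"))      # True
print(allowed("SomeOtherBot", "https://example.com/private/report"))  # False
print(allowed("SomeOtherBot", "https://example.com/blog/"))           # True
```

The group that names Auditoro takes precedence over the wildcard group for that crawler, which is why it can still reach /private/.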
Scan Failures
Scans may fail if:
- Site unreachable - Server down or blocking requests
- No pages found - Sitemap empty or crawl blocked
- Authentication required - Login-protected pages
- Rate limiting - Site blocking rapid requests
If a scan fails, check that:
- Your site is accessible in a browser
- Robots.txt isn't blocking Auditoro
- No firewall rules are blocking the crawler
- Your sitemap is valid and accessible
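When troubleshooting, it can help to map what you observe to one of the failure causes listed above. The mapping below is an assumption based on common HTTP status code meanings, not a description of how Auditoro classifies failures, and `classify_failure` is a hypothetical helper.

```python
def classify_failure(status=None, pages_found=0, error=None):
    """Map an observed response to a likely failure cause, or None if healthy.

    status: HTTP status code of the homepage or sitemap request, if any
    pages_found: number of pages discovered
    error: connection-level error (timeout, DNS failure), if any
    """
    if error is not None or (status is not None and status >= 500):
        return "Site unreachable"
    if status == 403:
        # Often a firewall or bot filter refusing the request.
        return "Site unreachable"
    if status == 401:
        return "Authentication required"
    if status == 429:
        return "Rate limiting"
    if pages_found == 0:
        return "No pages found"
    return None


print(classify_failure(status=429))                  # Rate limiting
print(classify_failure(status=200, pages_found=0))   # No pages found
```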
Real-Time Progress
During a scan, the dashboard shows:
- Pages discovered
- Pages scanned
- Issues found so far
- Current check being run
You can watch progress in real time or return later for results.