Probeo Bot
Probeo Bot is the crawler used by Probeo to observe how a website behaves as a system. It is designed to be predictable, limited in scope, and respectful of site infrastructure.
This page explains what Probeo Bot does, what it does not do, and how site owners can control its behavior.
What Probeo Bot does
On each run, Probeo Bot observes your site and updates a shared view of it as things change. During a run, it does the following.
- Requests HTML documents only by default
- Observes pages as they are delivered to real browsers
- Builds an inventory of pages, templates, and shared systems
- Makes very limited asset requests only when required to understand page structure
- Operates in a read-only manner
- Does not modify site content or configuration
Probeo Bot exists to observe behavior, not to interact with the site.
What Probeo Bot does not do
Probeo Bot is limited and cautious by design to avoid side effects on your site. It does not do the following.
- Submit forms
- Execute transactions
- Log in to user accounts
- Trigger application workflows
- Write data
- Modify content
- Execute destructive actions
- Perform load testing or stress testing
Probeo Bot does not attempt to bypass authentication or access restricted areas.
Request behavior
- Requests are rate-limited and controlled
- Unnecessary repeat requests are avoided
- Crawl behavior is designed to minimize impact on site performance
- Large sites are processed in stages, not all at once
Site stability is prioritized over crawl speed.
Assets and scripts
By default, Probeo Bot does the following.
- Requests HTML documents
- Does not load all page assets
- Does not execute JavaScript beyond what is required for basic rendering
In some cases, limited asset requests may be made to understand page structure. This can include assets such as fonts or third-party scripts that are required for layout or rendering. These asset requests are minimal and are used only to understand how the page is constructed.
Tracking scripts and analytics requests are ignored. Probeo Bot does not collect user data or execute tracking behavior.
Identification and verification
Probeo Bot identifies itself using the following.
- A dedicated User-Agent string
- Optional request signatures (v1.1 and later)
When signature verification is enabled, requests can be validated to confirm they originate from Probeo Bot.
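As a rough illustration of the User-Agent side of identification, the sketch below checks whether an incoming request claims to be Probeo Bot. The token `ProbeoBot` and the function name are assumptions made for this example, not documented values; confirm the real string against the requests you observe. A User-Agent match only shows what a request claims to be. Confirming origin is the job of the request signatures described above, whose format is not covered on this page.

```python
# Hypothetical token; confirm the real value against the User-Agent in your logs.
ASSUMED_BOT_TOKEN = "ProbeoBot"


def claims_to_be_probeo_bot(headers, token=ASSUMED_BOT_TOKEN):
    """Return True if the request's User-Agent header contains the token."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    user_agent = normalized.get("user-agent", "")
    return token.lower() in user_agent.lower()
```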
robots.txt and crawl control
Probeo Bot respects standard crawl controls.
- robots.txt allow and disallow rules
- crawl-delay directives
- Explicit path restrictions
Crawl behavior can be adjusted using standard robots configuration.
If additional restrictions are required, behavior can be configured per site.
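To preview how a set of robots.txt rules would apply before publishing them, a site owner could test them locally. The sketch below uses Python's standard `urllib.robotparser`. The user-agent token `ProbeoBot`, the example paths, and the delay value are assumptions for illustration only, and Python's parser may differ in small ways from how Probeo Bot itself interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical user-agent token; check the actual string in your logs.
BOT_TOKEN = "ProbeoBot"

# Example rules: slow the bot down and keep it out of two paths.
ROBOTS_TXT = """\
User-agent: ProbeoBot
Crawl-delay: 10
Disallow: /checkout/
Disallow: /internal/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A path outside the disallowed prefixes is allowed.
allowed = parser.can_fetch(BOT_TOKEN, "https://example.com/products/widget")
# A path under /checkout/ is blocked for this user agent.
blocked = parser.can_fetch(BOT_TOKEN, "https://example.com/checkout/cart")
# The crawl delay, in seconds, declared for this user agent.
delay = parser.crawl_delay(BOT_TOKEN)

print(allowed, blocked, delay)
```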
Crawl scope
Probeo Bot crawls only the domains and paths associated with a site.
It does not do the following.
- Discover unrelated domains
- Follow links outside the defined scope
- Crawl third-party services
Scope is defined before crawling begins.
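The scope rule above can be pictured as a simple check applied to each discovered link. This is a minimal sketch of the idea, not Probeo Bot's actual implementation; the function name, parameters, and example hosts are hypothetical.

```python
from urllib.parse import urlsplit


def in_scope(url, allowed_hosts, allowed_path_prefixes=("/",)):
    """Return True if url falls inside the predefined hosts and paths."""
    parts = urlsplit(url)
    # Only web URLs are candidates; mailto:, tel:, etc. are never in scope.
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower()
    if host not in allowed_hosts:
        return False
    # An empty path is equivalent to the site root.
    path = parts.path or "/"
    return any(path.startswith(prefix) for prefix in allowed_path_prefixes)


# Scope is fixed up front, before any link is followed.
hosts = {"example.com"}
prefixes = ("/blog/",)

print(in_scope("https://example.com/blog/post", hosts, prefixes))
print(in_scope("https://cdn.thirdparty.net/lib.js", hosts, prefixes))
```

Links that fail the check are simply not followed, which is how unrelated domains and third-party services stay out of the crawl.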
Security considerations
Probeo Bot is designed with security in mind.
- No credential storage
- No session reuse
- No form submission
- No write operations
Its behavior is limited by design to reduce risk.
Troubleshooting and contact
If Probeo Bot appears to behave in unexpected ways, or if crawl behavior needs to be adjusted, contact the team.
Please include the following.
- The affected domain
- Timestamps of observed requests
- Relevant request headers
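Gathering these details usually means pulling matching entries from server access logs. The sketch below assumes logs in the common "combined" format and a hypothetical User-Agent token `ProbeoBot`; both are assumptions, so adjust the pattern and token to match your server and the requests you actually see.

```python
import re

# Hypothetical token; confirm against the User-Agent in your own logs.
BOT_TOKEN = "ProbeoBot"

# Matches the widely used "combined" access log format.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def collect_bot_requests(lines, token=BOT_TOKEN):
    """Return timestamp, request line, status, and agent for matching entries."""
    hits = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and token.lower() in match.group("agent").lower():
            hits.append({
                "time": match.group("time"),
                "request": match.group("request"),
                "status": int(match.group("status")),
                "agent": match.group("agent"),
            })
    return hits


sample = [
    '203.0.113.7 - - [10/Mar/2025:13:55:36 +0000] "GET /pricing HTTP/1.1" 200 5123 "-" "ProbeoBot/1.1"',
    '198.51.100.4 - - [10/Mar/2025:13:55:40 +0000] "GET / HTTP/1.1" 200 812 "-" "Mozilla/5.0"',
]
hits = collect_bot_requests(sample)
print(hits)
```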
Summary
Probeo Bot is a read-only observer.
It is designed to understand site behavior without interfering with site operation.
No action is required on this page.