Setting Up Intelligent Bot Detection¶
This tutorial guides you through configuring Peakhour's intelligent bot detection system to protect your website from malicious automation while allowing legitimate crawlers. By the end, you'll have a comprehensive bot protection strategy that distinguishes between good bots and threats.
Duration: 30 minutes
Prerequisites: Active Peakhour domain, basic understanding of web traffic patterns
Learning Goals: Configure JavaScript challenges, verify legitimate bots, set up challenge-based rate limiting, monitor bot activity
What You'll Build: A complete bot detection system that challenges suspicious automation, verifies legitimate crawlers, and provides detailed analytics on bot activity across your site.
Understanding Bot Detection Methods¶
Peakhour uses a multi-layered approach to identify and manage automated traffic:
Detection Layers¶
- Network Fingerprinting: Identifies clients by analyzing their unique TLS, HTTP/2, and TCP signatures. This is highly effective at detecting automated tools that have consistent network-level behavior.
- JavaScript Challenges: Browser validation to detect headless automation.
- User Agent Verification: Legitimate crawler identification with reverse DNS confirmation.
- Challenge Cookies: Fingerprint-based tracking using IP, session, and network signatures.
- Rate Limiting Integration: Challenge escalation for suspicious traffic patterns.
- IP Reputation: Integration with threat intelligence blocklists for known bad bots.
Bot Classifications¶
- Simple Bots: Automation that fails basic verification checks
- Advanced Bots: Sophisticated automation that bypasses initial detection but fails JavaScript challenges
- Legitimate Crawlers: Verified search engines, social media, and service bots
- Malicious Bots: Scrapers, attackers, and unauthorized automation
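As a rough illustration of how these classes relate to the detection layers, the sketch below maps the outcome of three checks to a classification. The field names, the `BotClass` enum, and the ordering of the checks are assumptions made for this example, not Peakhour's internal logic.

```python
from dataclasses import dataclass
from enum import Enum


class BotClass(Enum):
    LEGITIMATE_CRAWLER = "legitimate crawler"
    SIMPLE_BOT = "simple bot"
    ADVANCED_BOT = "advanced bot"
    LIKELY_HUMAN = "likely human"


@dataclass
class CheckResults:
    # Hypothetical field names, chosen for this sketch only.
    verified_crawler: bool     # user agent confirmed as a known crawler
    passed_basic_checks: bool  # passed the basic verification checks
    passed_js_challenge: bool  # completed the JavaScript challenge


def classify(results: CheckResults) -> BotClass:
    """Map check outcomes to the classes described above."""
    if results.verified_crawler:
        return BotClass.LEGITIMATE_CRAWLER
    if not results.passed_basic_checks:
        # Fails basic verification: a simple bot.
        return BotClass.SIMPLE_BOT
    if not results.passed_js_challenge:
        # Gets past initial detection but fails the JavaScript challenge.
        return BotClass.ADVANCED_BOT
    return BotClass.LIKELY_HUMAN
```

Whether a simple or advanced bot is also malicious depends on what it does (scraping, attacks), which this sketch does not try to model.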
Enable Basic Bot Protection¶
Access Bot Settings¶
- Navigate to your Peakhour dashboard
- Select your domain from the domain list
- Go to Security > Bot Protection
- Review current bot protection status
Configure JavaScript Injection¶
Enable lightweight JavaScript challenges:
- Enable Bot Detection: Check "Inject lightweight JavaScript into the browser to assist with Bot detection"
- Review Impact: This adds minimal JavaScript (< 2KB) to pages for browser validation
- Test Compatibility: Verify with your site's existing JavaScript frameworks
Expected Behavior: Legitimate browsers execute JavaScript and receive access tokens, while headless automation tools fail the challenge.
Initial Configuration Review¶
Your basic setup should show:
Configure Legitimate Bot Access¶
Select Verified Bots¶
Choose which legitimate bots to allow through verification:
- Navigate to Bot Verification List
- Select trusted crawlers:
- Google: Search indexing and services
- Bing: Microsoft search engine
- Yandex: Russian search engine
- Facebook: Social media crawling
- Apple: Siri and search features
- Pinterest: Content discovery
- Stripe: Payment processing verification
- LetsEncrypt: Certificate validation
Best Practice: Start with essential services (Google, Bing) and add others based on your site's needs.
Enable Reverse DNS Verification¶
Protect against bot impersonation:
- Enable rDNS Verification: Check "Verify bots against published IP ranges using reverse DNS"
- How it Works: Bots claiming to be from Google must originate from Google's published IP ranges
- Block Impersonators: Fake crawlers using search engine user agents are blocked
Security Benefit: Prevents malicious bots from bypassing protection by spoofing legitimate user agents.
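A common way to implement this kind of check is forward-confirmed reverse DNS: resolve the client IP to a hostname, check that the hostname belongs to the crawler's operator, then resolve the hostname back and confirm it returns the same IP. The sketch below shows that idea only; whether Peakhour performs exactly these steps is an assumption, and the function names and suffix list are illustrative.

```python
import socket
from typing import Callable, Iterable, Set


def reverse_lookup(ip: str) -> str:
    """Return the PTR hostname for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> Set[str]:
    """Return the set of addresses a hostname resolves to."""
    return {info[4][0] for info in socket.getaddrinfo(hostname, None)}


def is_verified_crawler(
    ip: str,
    allowed_suffixes: Iterable[str],
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], Set[str]] = forward_lookup,
) -> bool:
    """Forward-confirmed reverse DNS check for a claimed crawler IP."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
    except OSError:
        return False  # no PTR record: cannot be verified
    # Suffixes include the leading dot, so "evilgooglebot.com" does not match.
    if not hostname.endswith(tuple(allowed_suffixes)):
        return False
    try:
        # The hostname must resolve back to the original IP.
        return ip in forward(hostname)
    except OSError:
        return False
```

For a client claiming to be Googlebot, a call might look like `is_verified_crawler(client_ip, (".googlebot.com", ".google.com"))`. The lookup functions are injectable so the logic can be exercised without network access.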
Test Legitimate Bot Access¶
Verify your configuration doesn't block wanted crawlers:
- Check Search Console for crawl errors after 24 hours
- Monitor Social Media tools for access issues
- Review Service Integrations (payments, monitoring) for failures
Configure Advanced Challenge System¶
Set Challenge Cookie Keys¶
Configure fingerprinting elements for challenge tracking:
- Navigate to Firewall > Challenge Configuration
- Select challenge key components:
- IP Address: Track challenges per IP
- Session ID: Per-session challenge tracking
- TLS Fingerprint: Device/client fingerprinting
Recommended Setup:
☑ IP Address (essential for distributed attacks)
☑ TLS Fingerprint (device identification)
☐ Session ID (for session-specific tracking)
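To see why the choice of key components matters, the sketch below derives a challenge key from the selected components with an HMAC. It is an assumption-laden illustration: the function name, the secret, and the use of HMAC-SHA256 are not taken from Peakhour, which does not document its cookie format here.

```python
import hashlib
import hmac


def challenge_key(secret: bytes, ip: str, tls_fingerprint: str,
                  session_id: str = "") -> str:
    """Derive a challenge key bound to the selected components.

    Hypothetical scheme for illustration only. An empty session_id
    corresponds to leaving "Session ID" unchecked.
    """
    parts = [ip, tls_fingerprint, session_id]
    # Length-prefix each part so ("ab", "c") and ("a", "bc") cannot collide.
    message = b"".join(
        len(p.encode()).to_bytes(4, "big") + p.encode() for p in parts
    )
    return hmac.new(secret, message, hashlib.sha256).hexdigest()
```

With IP address and TLS fingerprint selected, a passed challenge stays valid only while both stay the same: a client that changes IP, or replays the cookie from a different tool with a different TLS signature, produces a different key and is challenged again.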
Challenge Duration and Behavior¶
Configure challenge persistence:
- Challenge Validity: Set how long successful challenges remain valid
- Challenge Difficulty: Balance security vs. user experience
- Retry Logic: Configure failed challenge handling
Example Configuration:
Challenge Duration: 24 hours
Retry Attempts: 3 failures before temporary block
Grace Period: 5 minutes between challenge attempts
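The validity and retry settings can be pictured as a small per-client state machine. The sketch below models only the 24-hour validity and the three-failure rule from the example configuration; the block duration is a hypothetical value, and the grace period is left out because its exact semantics are not specified here.

```python
from dataclasses import dataclass
from typing import Optional

CHALLENGE_VALIDITY_S = 24 * 60 * 60  # "Challenge Duration: 24 hours"
MAX_FAILURES = 3                     # "Retry Attempts: 3 failures"
BLOCK_DURATION_S = 15 * 60           # hypothetical; not given in the example


@dataclass
class ClientState:
    failures: int = 0
    passed_at: Optional[float] = None
    blocked_until: float = 0.0

    def record_result(self, passed: bool, now: float) -> None:
        """Record the outcome of one challenge attempt at time `now`."""
        if passed:
            self.passed_at = now
            self.failures = 0
            return
        self.failures += 1
        if self.failures >= MAX_FAILURES:
            # Too many failures: start a temporary block and reset the count.
            self.blocked_until = now + BLOCK_DURATION_S
            self.failures = 0

    def decision(self, now: float) -> str:
        """Return BLOCK, ALLOW, or CHALLENGE for a request at time `now`."""
        if now < self.blocked_until:
            return "BLOCK"
        if self.passed_at is not None and now - self.passed_at < CHALLENGE_VALIDITY_S:
            return "ALLOW"
        return "CHALLENGE"
```

A longer validity period means fewer repeat challenges for real users, at the cost of a longer window in which a solved challenge can be reused.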
Integrate with Rate Limiting¶
Create Challenge-Based Rate Limits¶
Set up rate limiting that escalates to challenges rather than immediate blocks:
- Navigate to Rules > Rate Limiting
- Create a new rate limit rule:
Configure Path-Specific Protection¶
Protect high-value endpoints:
- Login Pages
- API Endpoints
- Search/Browse Pages
Test Rate Limit Integration¶
Verify challenge escalation works correctly:
- Trigger Rate Limit: Make requests above threshold from test IP
- Verify Challenge: Confirm JavaScript challenge appears instead of block
- Complete Challenge: Ensure successful completion grants access
- Monitor Analytics: Check challenge events in dashboard
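The escalation being tested here can be sketched as a sliding-window counter that returns a challenge decision, rather than a block, once a client exceeds the threshold. The class name, the threshold, and the window length below are illustrative assumptions, not Peakhour defaults.

```python
from collections import defaultdict, deque


class ChallengeRateLimiter:
    """Sliding-window limiter that escalates to a challenge, not a block."""

    def __init__(self, limit: int, window_s: float):
        self.limit = limit
        self.window_s = window_s
        self._hits = defaultdict(deque)  # client key -> request timestamps

    def check(self, client_key: str, now: float) -> str:
        hits = self._hits[client_key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        hits.append(now)
        return "CHALLENGE" if len(hits) > self.limit else "ALLOW"


limiter = ChallengeRateLimiter(limit=3, window_s=60)
decisions = [limiter.check("203.0.113.42", t) for t in (0, 1, 2, 3)]
# The fourth request inside the window exceeds the limit of three.
```

A client that completes the challenge would then be allowed through for the configured validity period, which is why a challenge is gentler on legitimate users than an outright block.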
Configure Firewall Integration¶
Create Firewall Challenge Rules¶
Use firewall rules to trigger challenges for suspicious patterns:
- Navigate to Rules > Firewall
- Create challenge rules for suspicious behavior:
Example Rules:
- Suspicious User Agents
- Missing Headers
- Suspicious Network Fingerprints:
Name: Challenge Suspicious TLS Fingerprints
Expression: fingerprint.tls in $suspicious_tls_fingerprints
Action: CHALLENGE
Note: You will need to create a text list named suspicious_tls_fingerprints containing fingerprints of known bots or malicious tools. For more information, see the Network Fingerprinting guide.
- High-Risk Countries
Geographic Challenge Rules¶
Implement location-based challenges:
Name: Challenge International Traffic
Expression: ip.geoip.country ne "US" and
not ip.geoip.country in {"CA", "GB", "AU"}
Action: CHALLENGE
Priority: 10
This challenges all traffic from outside the United States, Canada, the United Kingdom, and Australia. Adjust the country list to match where your legitimate users are located.
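The rule's expression can be checked against sample country codes before deploying it. The sketch below re-implements the same boolean logic in Python, assuming two-letter ISO country codes; the function names are hypothetical. It also shows that the expression is equivalent to a single set-membership test.

```python
def rule_matches(country: str) -> bool:
    """Mirror of: ip.geoip.country ne "US" and
    not ip.geoip.country in {"CA", "GB", "AU"}"""
    return country != "US" and not country in {"CA", "GB", "AU"}


def rule_matches_simplified(country: str) -> bool:
    """Equivalent form: challenge anything outside one allowed set."""
    return country not in {"US", "CA", "GB", "AU"}


samples = ["US", "CA", "GB", "AU", "DE", "NZ"]
challenged = [c for c in samples if rule_matches(c)]
```

Note that New Zealand (`NZ`) is challenged by this rule even though it is an English-speaking country, so the list is worth reviewing against your actual audience.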
Monitor and Analyze Bot Activity¶
Review Bot Analytics Dashboard¶
Access comprehensive bot activity metrics:
- Navigate to Analytics > Bot Detection
- Review key metrics:
- Challenge Success Rate: Percentage of challenges completed successfully
- Bot Classification: Simple vs. Advanced bot detection breakdown
- Geographic Distribution: Bot activity by country/region
- Temporal Patterns: Bot activity over time
Investigate Bot Events¶
Analyze specific bot encounters:
- Go to Security > Bot Events
- Review recent events with details:
- IP Address and Location
- User Agent String
- Detection Method (Simple/Advanced)
- Challenge Outcome
- Request Pattern
Analyze Traffic Patterns¶
Look for patterns indicating bot activity:
Indicators of Bot Traffic:
- High request rates from single IPs
- Unusual user agent strings
- Missing or unusual request headers
- Geographic clustering of requests
- Failed JavaScript challenges
Example Investigation:
Event: Advanced Bot Detection
IP: 203.0.113.42
Location: United States (Automated Detection)
User Agent: Mozilla/5.0... (suspicious)
Pattern: 500 requests in 60 seconds
Outcome: JavaScript challenge failed
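The first indicator, high request rates from single IPs, is easy to check offline against exported logs. The sketch below assumes events are available as `(ip, timestamp)` pairs and uses a hypothetical threshold of 100 requests per 60 seconds; neither the data format nor the threshold comes from Peakhour.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def high_rate_ips(events: Iterable[Tuple[str, float]], now: float,
                  window_s: float = 60.0, threshold: int = 100) -> List[str]:
    """Return IPs with more than `threshold` requests in the last window."""
    counts = Counter(
        ip for ip, ts in events if now - window_s < ts <= now
    )
    return sorted(ip for ip, n in counts.items() if n > threshold)


# Synthetic data shaped like the investigation above: one IP sending
# 500 requests in 60 seconds, plus a single request from another IP.
events = [("203.0.113.42", i * 0.12) for i in range(500)]
events.append(("198.51.100.7", 10.0))
flagged = high_rate_ips(events, now=60.0)
```

Here only `203.0.113.42` is flagged. Rate alone is not proof of automation, so it is best combined with the other indicators listed above.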
Optimize Bot Protection¶
Review Challenge Success Rates¶
Analyze challenge effectiveness:
- High Success Rate (>80%): Challenges may be too easy
- Low Success Rate (<40%): Challenges may be too difficult or affecting legitimate users
- Optimal Range (60-80%): Good balance of security and usability
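These thresholds can be applied mechanically to the counts from the analytics dashboard. The sketch below is a direct encoding of the three ranges; the function name is hypothetical, and rates between 40% and 60% are reported as unclassified because the guidance above does not cover them.

```python
def interpret_success_rate(passed: int, issued: int) -> str:
    """Classify a challenge success rate using the ranges listed above."""
    if issued <= 0:
        raise ValueError("issued must be positive")
    rate = 100.0 * passed / issued
    if rate > 80:
        return "high: challenges may be too easy"
    if rate < 40:
        return "low: challenges may be too difficult or affecting legitimate users"
    if rate >= 60:
        return "optimal: good balance of security and usability"
    return "unclassified: between 40% and 60%"
```

For example, 700 passed challenges out of 1,000 issued gives a 70% success rate, which falls in the optimal range.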
Adjust Based on Analytics¶
Fine-tune settings based on observed patterns:
Common Optimizations:
- Reduce False Positives
- Increase Bot Detection
- Geographic Adjustments
Integrate with IP Reputation¶
Leverage threat intelligence for enhanced protection:
- Review Blocked Sources: Analyze automatically blocked IP reputation sources
- Custom Blocklists: Add specific IP ranges based on observed attack patterns
- Allow Lists: Create exceptions for legitimate high-traffic sources
Example Custom Rule:
Name: Allow Verified Partners
Expression: ip.src in {203.0.113.0/24, 198.51.100.0/24}
Action: ALLOW
Priority: 1
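Before adding an allow rule like this, it helps to confirm which addresses it actually covers. The sketch below performs the same CIDR membership test with Python's standard `ipaddress` module; the function and constant names are hypothetical.

```python
import ipaddress

# The two ranges from the example rule. Both are reserved documentation
# ranges (RFC 5737), so replace them with your partners' real networks.
PARTNER_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("203.0.113.0/24", "198.51.100.0/24")
]


def is_verified_partner(ip: str) -> bool:
    """Return True if the address falls inside any allowed network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in PARTNER_NETWORKS)
```

Keep allow lists as narrow as possible. As written, `is_verified_partner("203.0.113.42")` returns `True`, so the IP from the example investigation earlier in this tutorial would bypass challenges under this rule.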
Establish Monitoring and Maintenance¶
Set Up Alerts¶
Configure notifications for unusual bot activity:
- Challenge Spike Alert: Unusual increase in challenge attempts
- Geographic Anomaly: Bot activity from new geographic regions
- Success Rate Drop: Significant change in challenge completion rates
- Volume Threshold: Bot traffic exceeding normal baselines
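If you export challenge or bot-traffic counts and want to prototype alert thresholds yourself, one simple approach is to compare the current value with a baseline of recent values. The sketch below flags a value that exceeds the baseline mean by more than three standard deviations; the function name and the three-sigma rule are assumptions for illustration, not how Peakhour's alerts are computed.

```python
from statistics import mean, pstdev
from typing import Sequence


def is_spike(history: Sequence[float], current: float,
             sigma: float = 3.0) -> bool:
    """Flag `current` if it is far above the recent baseline."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    baseline = mean(history)
    spread = pstdev(history)
    return current > baseline + sigma * spread


# Hourly challenge counts for a quiet period, then two candidate values.
hourly_challenges = [100, 110, 90, 105, 95]
normal_hour = is_spike(hourly_challenges, 105)
attack_hour = is_spike(hourly_challenges, 300)
```

The same comparison works for the volume threshold and, with the sign reversed, for detecting a drop in challenge success rate.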
Regular Review Schedule¶
Establish ongoing maintenance:
Weekly Reviews:
- Bot event summary and trends
- Challenge success rate analysis
- False positive identification
Monthly Reviews:
- Bot verification list updates
- Geographic rule adjustments
- Rate limiting threshold optimization
Quarterly Reviews:
- Overall bot protection strategy
- Integration with new services
- Challenge system effectiveness assessment
Emergency Procedures¶
Prepare for bot attack scenarios:
High-Volume Attack Response:
- Temporarily lower rate limiting thresholds
- Enable stricter challenge requirements
- Implement emergency IP blocking
- Monitor legitimate user impact
Challenge System Issues:
- Temporarily disable JavaScript injection if it is affecting users
- Switch to basic user agent verification
- Implement manual allow-lists for critical traffic
Troubleshooting Common Issues¶
Challenge Loop Problems¶
Problem: Users get stuck in repeated challenges
Solution:
- Check challenge cookie configuration
- Verify JavaScript compatibility with your site
- Review session management settings
Legitimate Crawlers Blocked¶
Problem: Search engines can't crawl your site
Solution:
- Verify reverse DNS is enabled
- Add missing crawlers to verification list
- Check firewall rules aren't overriding bot settings
High False Positive Rate¶
Problem: Real users frequently challenged
Solution:
- Reduce challenge sensitivity
- Add user agent exceptions for legitimate tools
- Review geographic rules for broad restrictions
Next Steps¶
With your bot detection system operational:
- Expand Coverage: Apply similar protection to additional domains
- API Protection: Implement service-specific bot rules for APIs
- Advanced Analytics: Set up custom dashboards for bot monitoring
- Integration: Connect bot events with SIEM systems for comprehensive security
Key Concepts Learned¶
- Multi-Layer Detection: JavaScript challenges, user agent verification, and fingerprinting work together
- Challenge-Based Protection: Escalation from logging to challenges to blocking based on threat level
- Legitimate Bot Management: Proper verification ensures search engines and services maintain access
- Analytics-Driven Optimization: Regular monitoring and adjustment improves effectiveness while reducing false positives
You've successfully implemented a sophisticated bot detection system that protects against automated threats while maintaining accessibility for legitimate crawlers and users. This foundation can be extended with additional rules and integrations as your security requirements evolve.