Setting Up Intelligent Bot Detection¶

This tutorial guides you through configuring Peakhour's intelligent bot detection system to protect your website from malicious automation while allowing legitimate crawlers. By the end, you'll have a comprehensive bot protection strategy that distinguishes between good bots and threats.

Duration: 30 minutes
Prerequisites: Active Peakhour domain, basic understanding of web traffic patterns
Learning Goals: Configure JavaScript challenges, verify legitimate bots, set up challenge-based rate limiting, monitor bot activity

What You'll Build: A complete bot detection system that challenges suspicious automation, verifies legitimate crawlers, and provides detailed analytics on bot activity across your site.

Understanding Bot Detection Methods¶

Peakhour uses a multi-layered approach to identify and manage automated traffic:

Detection Layers¶

Network Fingerprinting: Identifies clients by analyzing their unique TLS, HTTP/2, and TCP signatures. This is highly effective at detecting automated tools that have consistent network-level behavior.
JavaScript Challenges: Browser validation to detect headless automation.
User Agent Verification: Legitimate crawler identification with reverse DNS confirmation.
Challenge Cookies: Fingerprint-based tracking using IP, session, and network signatures.
Rate Limiting Integration: Challenge escalation for suspicious traffic patterns.
IP Reputation: Integration with threat intelligence blocklists for known bad bots.

Bot Classifications¶

Simple Bots: Automation that fails basic verification checks
Advanced Bots: Sophisticated automation that bypasses initial detection but fails JavaScript challenges
Legitimate Crawlers: Verified search engines, social media, and service bots
Malicious Bots: Scrapers, attackers, and unauthorized automation

Enable Basic Bot Protection¶

Access Bot Settings¶

Navigate to your Peakhour dashboard
Select your domain from the domain list
Go to Security > Bot Protection
Review current bot protection status

Configure JavaScript Injection¶

Enable lightweight JavaScript challenges:

Enable Bot Detection: Check "Inject lightweight JavaScript into the browser to assist with Bot detection"
Review Impact: This adds a small script (under 2 KB) to each page for browser validation
Test Compatibility: Verify with your site's existing JavaScript frameworks

Expected Behavior: Legitimate browsers execute JavaScript and receive access tokens, while headless automation tools fail the challenge.
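
To make the idea of an access token concrete, here is a minimal sketch of how a server could sign and later verify a token once a browser has passed a challenge. It is not Peakhour's implementation: the token layout, `SECRET_KEY`, `issue_token`, and `verify_token` are all hypothetical names chosen for illustration.

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical signing key; a real deployment would load this from secure storage.
SECRET_KEY = b"example-secret"


def issue_token(client_id: str, now: Optional[float] = None, ttl_seconds: int = 86400) -> str:
    """Return 'client_id:expiry:signature' for a client that passed the challenge."""
    issued = time.time() if now is None else now
    expires = int(issued + ttl_seconds)
    payload = f"{client_id}:{expires}"
    sig = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}:{sig}"


def verify_token(token: str, client_id: str, now: Optional[float] = None) -> bool:
    """Accept the token only if it is unmodified, unexpired, and bound to client_id."""
    current = time.time() if now is None else now
    try:
        # rsplit keeps any colons inside client_id (for example an IPv6 address) intact.
        token_client, expires_str, sig = token.rsplit(":", 2)
        expires = int(expires_str)
    except ValueError:
        return False
    expected = hmac.new(
        SECRET_KEY, f"{token_client}:{expires_str}".encode(), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False
    return token_client == client_id and current < expires
```

A client that never runs the JavaScript never receives such a token, which is why headless tools that skip script execution keep hitting the challenge.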

Initial Configuration Review¶

Your basic setup should show:

✅ JavaScript Injection: Enabled
✅ Challenge System: Active
✅ Analytics Tracking: Configured

Configure Legitimate Bot Access¶

Select Verified Bots¶

Choose which legitimate bots to allow through verification:

Navigate to Bot Verification List
Select trusted crawlers:
Google: Search indexing and services
Bing: Microsoft search engine
Yandex: Russian search engine
Facebook: Social media crawling
Apple: Siri and search features
Pinterest: Content discovery
Stripe: Payment processing verification
LetsEncrypt: Certificate validation

Best Practice: Start with essential services (Google, Bing) and add others based on your site's needs.

Enable Reverse DNS Verification¶

Protect against bot impersonation:

Enable rDNS Verification: Check "Verify bots against published IP ranges using reverse DNS"
How It Works: A bot claiming to be Googlebot, for example, must originate from an address in Google's published ranges, confirmed by a reverse DNS lookup
Block Impersonators: Fake crawlers using search engine user agents are blocked

Security Benefit: Prevents malicious bots from bypassing protection by spoofing legitimate user agents.
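
One common way to perform this kind of check is forward-confirmed reverse DNS: resolve the IP to a hostname, check the hostname's domain, then resolve the hostname back and confirm it returns the same IP. The sketch below illustrates that general technique with Python's standard `socket` module. Whether Peakhour performs exactly these steps is an assumption, and `is_verified_crawler` is a hypothetical helper name.

```python
import socket
from typing import Callable, List, Sequence

# Hostname suffixes Google documents for verifying Googlebot via reverse DNS.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse(ip: str) -> str:
    """PTR lookup: IP address to hostname."""
    return socket.gethostbyaddr(ip)[0]


def _forward(host: str) -> List[str]:
    """A-record lookup: hostname to IPv4 addresses (gethostbyname_ex is IPv4 only)."""
    return socket.gethostbyname_ex(host)[2]


def is_verified_crawler(
    ip: str,
    suffixes: Sequence[str] = GOOGLE_SUFFIXES,
    reverse: Callable[[str], str] = _reverse,
    forward: Callable[[str], List[str]] = _forward,
) -> bool:
    """Return True only if the PTR name has a trusted suffix and resolves back to ip."""
    try:
        host = reverse(ip).rstrip(".").lower()
        if not host.endswith(tuple(suffixes)):
            return False
        return ip in forward(host)
    except OSError:
        # socket.herror and socket.gaierror are OSError subclasses: lookup failed.
        return False
```

The forward lookup matters because anyone who controls reverse DNS for their own address block can publish a PTR record ending in `googlebot.com`; only the real domain owner can make that hostname resolve back to the same IP.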

Test Legitimate Bot Access¶

Verify your configuration doesn't block wanted crawlers:

Check Search Console for crawl errors after 24 hours
Monitor Social Media tools for access issues
Review Service Integrations (payments, monitoring) for failures

Configure Advanced Challenge System¶

Configure fingerprinting elements for challenge tracking:

Navigate to Firewall > Challenge Configuration
Select challenge key components:
IP Address: Track challenges per IP
Session ID: Per-session challenge tracking
TLS Fingerprint: Device/client fingerprinting

Recommended Setup:

☑ IP Address (essential for distributed attacks)
☑ TLS Fingerprint (device identification)
☐ Session ID (for session-specific tracking)
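
The effect of choosing key components can be illustrated with a small sketch: the selected values are combined into one key, and a passed challenge is remembered against that key. The hashing scheme and the `challenge_key` function are assumptions made for illustration, not Peakhour's actual cookie format.

```python
import hashlib
from typing import Optional


def challenge_key(
    ip: str,
    tls_fingerprint: Optional[str] = None,
    session_id: Optional[str] = None,
    use_ip: bool = True,
    use_tls: bool = True,
    use_session: bool = False,
) -> str:
    """Combine the enabled components into one stable key (hypothetical scheme)."""
    parts = []
    if use_ip:
        parts.append(f"ip={ip}")
    if use_tls:
        parts.append(f"tls={tls_fingerprint or ''}")
    if use_session:
        parts.append(f"sid={session_id or ''}")
    return hashlib.sha256("|".join(parts).encode()).hexdigest()
```

With the recommended setup (IP and TLS fingerprint only), two sessions from the same client share one key, so a challenge passed in one session carries over to the other. Enabling Session ID makes the key differ per session, which forces a fresh challenge for each one.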

Challenge Duration and Behavior¶

Configure challenge persistence:

Challenge Validity: Set how long successful challenges remain valid
Challenge Difficulty: Balance security vs. user experience
Retry Logic: Configure failed challenge handling

Example Configuration:

Challenge Duration: 24 hours
Retry Attempts: 3 failures before temporary block
Grace Period: 5 minutes between challenge attempts
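
The three example settings interact, and a short sketch can make the resulting behavior easier to reason about. The state model below is an assumption: `ClientState`, `next_action`, and `record_attempt` are hypothetical names, and the grace period is interpreted as a minimum wait between consecutive challenge attempts.

```python
from dataclasses import dataclass
from typing import Optional

CHALLENGE_VALIDITY = 24 * 60 * 60  # 24 hours, in seconds
MAX_FAILURES = 3                   # failures before a temporary block
GRACE_PERIOD = 5 * 60              # 5 minutes between attempts


@dataclass
class ClientState:
    passed_at: Optional[float] = None
    failures: int = 0
    last_attempt: Optional[float] = None


def next_action(state: ClientState, now: float) -> str:
    """Decide what to do with the next request from this client."""
    if state.passed_at is not None and now - state.passed_at < CHALLENGE_VALIDITY:
        return "ALLOW"      # a recent pass is still valid
    if state.failures >= MAX_FAILURES:
        return "BLOCK"      # too many failed attempts
    if state.last_attempt is not None and now - state.last_attempt < GRACE_PERIOD:
        return "WAIT"       # still inside the grace period
    return "CHALLENGE"


def record_attempt(state: ClientState, now: float, success: bool) -> None:
    """Update the client's state after a challenge attempt."""
    state.last_attempt = now
    if success:
        state.passed_at = now
        state.failures = 0
    else:
        state.failures += 1
```

This sketch never lifts the block; a real system would also need a rule for when a temporary block expires, which the example configuration does not specify.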

Integrate with Rate Limiting¶

Create Challenge-Based Rate Limits¶

Set up rate limiting that escalates to challenges rather than immediate blocks:

Navigate to Rules > Rate Limiting

Create a new rate limit rule:

Name: "Suspicious Activity Challenge"
Pattern: All requests to sensitive paths
Threshold: 100 requests per minute
Action: CHALLENGE (not BLOCK)
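
As a rough mental model of a challenge-based limit, the sketch below counts each client's requests in a sliding window and returns CHALLENGE instead of BLOCK once the threshold is exceeded. The `ChallengeRateLimiter` class is hypothetical, and how Peakhour actually counts requests is not described here.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class ChallengeRateLimiter:
    """Sliding-window counter that escalates to a challenge, never a block."""

    def __init__(self, threshold: int = 100, window_seconds: float = 60.0) -> None:
        self.threshold = threshold
        self.window = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def check(self, client: str, now: float) -> str:
        hits = self._hits[client]
        # Discard timestamps that have left the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        hits.append(now)
        return "CHALLENGE" if len(hits) > self.threshold else "ALLOW"
```

Because the action is a challenge, a real user who briefly exceeds the threshold can solve it and continue, while automation that cannot solve it stays locked out.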

Configure Path-Specific Protection¶

Protect high-value endpoints:

Login Pages:

Path: /login, /signin
Rate: 20 requests per 5 minutes
Action: Challenge after threshold

API Endpoints:

Path: /api/*
Rate: 500 requests per hour
Action: Challenge suspicious patterns

Search/Browse Pages:

Path: /search, /category/*
Rate: 200 requests per 10 minutes
Action: Challenge after threshold
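
The three path rules above can be expressed as a simple lookup table. The sketch below uses shell-style wildcards from Python's `fnmatch` module to pick the limit for a request path. Treating `*` this way is an assumption, so confirm how Peakhour interprets wildcards in path patterns before relying on it.

```python
from fnmatch import fnmatchcase
from typing import Optional, Tuple

# (path patterns, max requests, window in seconds), taken from the examples above.
RULES = [
    (("/login", "/signin"), 20, 5 * 60),
    (("/api/*",), 500, 60 * 60),
    (("/search", "/category/*"), 200, 10 * 60),
]


def limit_for(path: str) -> Optional[Tuple[int, int]]:
    """Return (max_requests, window_seconds) for the first matching rule, else None."""
    for patterns, max_requests, window in RULES:
        if any(fnmatchcase(path, pattern) for pattern in patterns):
            return max_requests, window
    return None
```

Note that under these matching rules `/api/*` covers `/api/v1/users` but not the bare path `/api`. Add an explicit pattern if that path also needs protection.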

Test Rate Limit Integration¶

Verify challenge escalation works correctly:

Trigger Rate Limit: Make requests above threshold from test IP
Verify Challenge: Confirm JavaScript challenge appears instead of block
Complete Challenge: Ensure successful completion grants access
Monitor Analytics: Check challenge events in dashboard

Configure Firewall Integration¶

Create Firewall Challenge Rules¶

Use firewall rules to trigger challenges for suspicious patterns:

Navigate to Rules > Firewall
Create challenge rules for suspicious behavior:

Example Rules:

Suspicious User Agents:

Name: Challenge Suspicious User Agents
Expression: http.user_agent matches ".*(curl|wget|python|scanner).*"
Action: CHALLENGE

Missing Headers:

Name: Challenge Headless Browsers
Expression: not http.accept exists or http.accept eq ""
Action: CHALLENGE
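
Before deploying the two rules above, it can help to check which requests they would catch. The sketch below mirrors their logic in Python, using the same regular expression. It assumes header names are already normalized to the casing shown, and `should_challenge` is a hypothetical helper, not part of Peakhour.

```python
import re
from typing import Mapping

# Same pattern as the "Challenge Suspicious User Agents" rule.
SUSPICIOUS_UA = re.compile(r".*(curl|wget|python|scanner).*")


def should_challenge(headers: Mapping[str, str]) -> bool:
    """Return True if either example rule would trigger for these headers."""
    user_agent = headers.get("User-Agent", "")
    accept = headers.get("Accept", "")
    if SUSPICIOUS_UA.search(user_agent):
        return True   # suspicious user agent
    if accept == "":
        return True   # Accept header missing or empty
    return False
```

In Python this pattern is case-sensitive, so `curl/8.4.0` and `python-requests/2.31.0` match but `Wget/1.21.4` does not. This tutorial does not state whether the `matches` operator is case-sensitive, so test the rule against real user agent strings.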

Suspicious Network Fingerprints:
```
Name: Challenge Suspicious TLS Fingerprints
Expression: fingerprint.tls in $suspicious_tls_fingerprints
Action: CHALLENGE
```
Note: You will need to create a text list named suspicious_tls_fingerprints containing fingerprints of known bots or malicious tools. For more information, see the Network Fingerprinting guide.

High-Risk Countries:

Name: Challenge High-Risk Locations
Expression: ip.geoip.country in {"CN", "RU", "BR"}
Action: CHALLENGE

Geographic Challenge Rules¶

Implement location-based challenges:

Name: Challenge International Traffic
Expression: ip.geoip.country ne "US" and 
           not ip.geoip.country in {"CA", "GB", "AU"}
Action: CHALLENGE
Priority: 10

This challenges all traffic from outside the United States, Canada, the United Kingdom, and Australia.
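
The boolean logic of the expression is easy to misread, so here is the same condition written as a small Python function. `should_challenge_country` is a hypothetical name used only to show which country codes the rule would challenge.

```python
# Country codes exempted by the example rule, in addition to "US".
EXEMPT_COUNTRIES = {"CA", "GB", "AU"}


def should_challenge_country(country_code: str) -> bool:
    """Mirror of: country ne "US" and not country in {"CA", "GB", "AU"}."""
    return country_code != "US" and country_code not in EXEMPT_COUNTRIES
```

The two conditions are equivalent to a single membership test against the four exempt countries, which may be simpler to maintain as the list grows.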

Monitor and Analyze Bot Activity¶

Review Bot Analytics Dashboard¶

Access comprehensive bot activity metrics:

Navigate to Analytics > Bot Detection
Review key metrics:
Challenge Success Rate: Percentage of challenges completed successfully
Bot Classification: Simple vs. Advanced bot detection breakdown
Geographic Distribution: Bot activity by country/region
Temporal Patterns: Bot activity over time

Investigate Bot Events¶

Analyze specific bot encounters:

Go to Security > Bot Events
Review recent events with details:
IP Address and Location
User Agent String
Detection Method (Simple/Advanced)
Challenge Outcome
Request Pattern

Analyze Traffic Patterns¶

Look for patterns indicating bot activity:

Indicators of Bot Traffic:

High request rates from single IPs
Unusual user agent strings
Missing or unusual request headers
Geographic clustering of requests
Failed JavaScript challenges

Example Investigation:

Event: Advanced Bot Detection
IP: 203.0.113.42
Location: United States (Automated Detection)
User Agent: Mozilla/5.0... (suspicious)
Pattern: 500 requests in 60 seconds
Outcome: JavaScript challenge failed
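
If you export request logs for offline analysis, the first indicator (high request rates from single IPs) can be checked with a few lines of code. The sketch below assumes a hypothetical log format of `(timestamp_seconds, ip)` pairs and flags any IP that reaches a given request count within a sliding window.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def flag_high_rate(
    events: Iterable[Tuple[float, str]],
    max_requests: int = 500,
    window: float = 60.0,
) -> Set[str]:
    """Return IPs with at least max_requests requests inside any window-second span."""
    by_ip: Dict[str, List[float]] = defaultdict(list)
    for timestamp, ip in events:
        by_ip[ip].append(timestamp)

    flagged: Set[str] = set()
    for ip, stamps in by_ip.items():
        stamps.sort()
        start = 0
        for end, ts in enumerate(stamps):
            # Advance the left edge until the span is shorter than the window.
            while ts - stamps[start] >= window:
                start += 1
            if end - start + 1 >= max_requests:
                flagged.add(ip)
                break
    return flagged
```

The defaults match the example event above (500 requests in 60 seconds); lower them to surface slower, more patient scrapers.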

Optimize Bot Protection¶

Review Challenge Success Rates¶

Analyze challenge effectiveness:

High Success Rate (>80%): Challenges may be too easy
Low Success Rate (<40%): Challenges may be too difficult or affecting legitimate users
Optimal Range (60-80%): Good balance of security and usability
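
These bands can be applied automatically to exported challenge counts. The sketch below is a direct translation of the three thresholds into a hypothetical `assess_success_rate` helper. The guidance above does not say how to read rates between 40% and 60%, so the function reports that range as unclassified rather than guessing.

```python
def assess_success_rate(passed: int, total: int) -> str:
    """Map a challenge success rate onto the bands described above."""
    if total == 0:
        return "no data"
    rate = 100 * passed / total
    if rate > 80:
        return "possibly too easy"
    if rate < 40:
        return "possibly too difficult"
    if rate >= 60:
        return "optimal"
    # 40% to 60% is not covered by the guidance; flag it for manual review.
    return "unclassified"
```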

Adjust Based on Analytics¶

Fine-tune settings based on observed patterns:

Common Optimizations:

Reduce False Positives:

Issue: Legitimate users failing challenges
Solution: Reduce challenge difficulty or add user agent exceptions

Increase Bot Detection:

Issue: Sophisticated bots bypassing detection
Solution: Enable additional challenge keys (TLS fingerprint)

Geographic Adjustments:

Issue: Legitimate international traffic challenged
Solution: Refine geographic rules or add allow-list

Integrate with IP Reputation¶

Leverage threat intelligence for enhanced protection:

Review Blocked Sources: Analyze the traffic sources that IP reputation lists blocked automatically
Custom Blocklists: Add specific IP ranges based on observed attack patterns
Allow Lists: Create exceptions for legitimate high-traffic sources

Example Custom Rule:

Name: Allow Verified Partners
Expression: ip.src in {203.0.113.0/24, 198.51.100.0/24}
Action: ALLOW
Priority: 1
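
To check whether a particular address falls inside the allow-listed ranges before saving the rule, Python's standard `ipaddress` module can evaluate the same CIDR blocks. `is_partner` is a hypothetical helper; the ranges are the documentation examples from the rule above.

```python
import ipaddress

# The two CIDR ranges from the "Allow Verified Partners" example rule.
PARTNER_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_partner(ip: str) -> bool:
    """Return True if ip belongs to any allow-listed network."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in PARTNER_NETWORKS)
```

Keep allow-listed ranges as narrow as possible, because a rule with Priority 1 and action ALLOW lets matching traffic skip the challenges configured elsewhere.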

Establish Monitoring and Maintenance¶

Set Up Alerts¶

Configure notifications for unusual bot activity:

Challenge Spike Alert: Unusual increase in challenge attempts
Geographic Anomaly: Bot activity from new geographic regions
Success Rate Drop: Significant change in challenge completion rates
Volume Threshold: Bot traffic exceeding normal baselines
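
If you build your own alerting on exported metrics, a spike or volume alert needs some definition of "unusual". One simple option, shown below as an assumption rather than a description of Peakhour's alerting, is to compare the current count with the mean of recent history plus a multiple of its standard deviation. `is_spike` and its `sigmas` parameter are hypothetical.

```python
import statistics
from typing import Sequence


def is_spike(history: Sequence[float], current: float, sigmas: float = 3.0) -> bool:
    """Return True if current exceeds the historical mean by more than sigmas deviations."""
    if len(history) < 2:
        return False  # not enough data to estimate a baseline
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)
    return current > mean + sigmas * stdev
```

For example, `history` could hold hourly challenge counts for the past week, with `current` being the count for the latest hour.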

Regular Review Schedule¶

Establish ongoing maintenance:

Weekly Reviews:

Bot event summary and trends
Challenge success rate analysis
False positive identification

Monthly Reviews:

Bot verification list updates
Geographic rule adjustments
Rate limiting threshold optimization

Quarterly Reviews:

Overall bot protection strategy
Integration with new services
Challenge system effectiveness assessment

Emergency Procedures¶

Prepare for bot attack scenarios:

High-Volume Attack Response:

Temporarily lower rate limiting thresholds
Enable stricter challenge requirements
Implement emergency IP blocking
Monitor legitimate user impact

Challenge System Issues:

Temporarily disable JavaScript injection if it is affecting users
Switch to basic user agent verification
Implement manual allow-lists for critical traffic

Troubleshooting Common Issues¶

Challenge Loop Problems¶

Problem: Users get stuck in repeated challenges
Solution:

Check challenge cookie configuration
Verify JavaScript compatibility with your site
Review session management settings

Legitimate Crawlers Blocked¶

Problem: Search engines can't crawl your site
Solution:

Verify reverse DNS is enabled
Add missing crawlers to verification list
Check firewall rules aren't overriding bot settings

High False Positive Rate¶

Problem: Real users frequently challenged
Solution:

Reduce challenge sensitivity
Add user agent exceptions for legitimate tools
Review geographic rules for broad restrictions

Next Steps¶

With your bot detection system operational:

Expand Coverage: Apply similar protection to additional domains
API Protection: Implement service-specific bot rules for APIs
Advanced Analytics: Set up custom dashboards for bot monitoring
Integration: Connect bot events with SIEM systems for comprehensive security

Key Concepts Learned¶

Multi-Layer Detection: JavaScript challenges, user agent verification, and fingerprinting work together
Challenge-Based Protection: Escalation from logging to challenges to blocking based on threat level
Legitimate Bot Management: Proper verification ensures search engines and services maintain access
Analytics-Driven Optimization: Regular monitoring and adjustment improves effectiveness while reducing false positives

You've successfully implemented a sophisticated bot detection system that protects against automated threats while maintaining accessibility for legitimate crawlers and users. This foundation can be extended with additional rules and integrations as your security requirements evolve.