How to Set Up Log-Based Alerting¶
This guide shows you how to configure Peakhour's comprehensive alerting system to monitor security events, performance issues, and operational anomalies using log-based triggers and multiple notification channels.
Before you begin: Understand advanced log queries and security investigation techniques to effectively configure alert thresholds.
Understanding Peakhour's Alerting System¶
Peakhour's "Instant Alerts" system monitors log events in real-time and triggers notifications when specific conditions are met. The system supports multiple alert types, notification channels, and includes intelligent cooldown mechanisms to prevent alert fatigue.
Alert Categories¶
Security Alerts¶
- WAF blocks (web application firewall events)
- IP reputation blocks (malicious traffic detection)
- Geographic blocking events
- Rate limiting violations
Performance Alerts¶
- Origin server timeouts (90+ second delays)
- Origin server errors (5xx responses)
- Connection refused errors
- High error rate thresholds
Operational Alerts¶
- Service availability issues
- Configuration problems
- Certificate expiration warnings
- Cache performance degradation
Notification Channels¶
- Email: HTML and text notifications via Postmark
- SMS: Mobile notifications via Twilio (Australian numbers supported)
- Webhooks: HTTP integrations for custom systems
- Log Forwarding: SIEM integrations (Azure Sentinel, GCP, etc.)
Configure Basic Instant Alerts¶
Access Alert Configuration¶
- Navigate to Monitoring > Instant Alerts
- Review current alert configuration status
- Check notification channel setup
Set Up Notification Channels¶
Configure Email Notifications¶
- Add Alert Emails: Enter email addresses for alert recipients
- Test Email Delivery: Send test alerts to verify delivery
- Email Templates: Review default alert email format
Configure SMS Notifications¶
- Add Mobile Numbers: Enter Australian mobile numbers (+61 format)
- SMS Rate Limiting: Understand built-in cooldown periods
- Test SMS Delivery: Verify mobile notification delivery
Example Configuration¶
```
Alert Emails:
  - security@company.com
  - ops-team@company.com
  - admin@company.com

Alert Mobile Numbers:
  - +61412345678   # Primary on-call
  - +61487654321   # Secondary contact
```
Configure Alert Rules¶
Enable and configure specific alert types:
WAF Alerts:

```
Alert Type: WAF Block
Description: Web Application Firewall detected attacks
Notification: Email + SMS
Cooldown: 30 minutes
```

IP Block Alerts:

```
Alert Type: IP Block
Description: Malicious IP reputation blocks
Notification: Email only
Cooldown: 1 hour
```

Origin Error Alerts:

```
Alert Type: Origin 5xx Errors
Description: Backend server errors
Notification: Email + SMS
Cooldown: 15 minutes
```
Configure Alert Thresholds and Cooldowns¶
Set Appropriate Cooldown Periods¶
Critical Alerts (immediate action required):

```
WAF Attacks: 30 minutes cooldown
Origin Down: 30 minutes cooldown
Origin Timeouts: 30 minutes cooldown
```

Monitoring Alerts (informational): use longer cooldowns, for example 1 hour for IP reputation blocks, to keep notification volume manageable.

Operational Alerts (planned maintenance): extend cooldowns or suppress notifications during scheduled maintenance windows.
Understanding Cooldown Logic¶
Cooldown periods prevent alert spam during extended incidents:
- First Alert: Immediate notification sent
- Subsequent Events: Suppressed during cooldown period
- Cooldown Reset: After period expires, next event triggers alert
- Per-Rule Cooldown: Each alert type has independent cooldown
Example Scenario:

```
14:00 - WAF attack detected                  → Alert sent
14:15 - More WAF attacks                     → Suppressed (30 min cooldown)
14:30 - More WAF attacks                     → Suppressed (cooldown continues)
14:31 - Different alert type (origin error)  → Alert sent (separate cooldown)
15:00 - WAF attack                           → Alert sent (cooldown expired)
```
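The suppression behaviour in the timeline can be modelled with a small amount of per-rule state. A minimal sketch, assuming the rule names and cooldown values shown earlier (illustrative, not Peakhour's implementation):

```python
from datetime import datetime, timedelta

class CooldownTracker:
    """Tracks the last notification time per alert rule and suppresses
    repeat notifications until that rule's cooldown has elapsed."""

    def __init__(self, cooldowns):
        self.cooldowns = cooldowns          # rule name -> timedelta
        self.last_sent = {}                 # rule name -> datetime

    def should_notify(self, rule, now=None):
        now = now or datetime.utcnow()
        last = self.last_sent.get(rule)
        if last is not None and now - last < self.cooldowns[rule]:
            return False                    # still in cooldown: suppress
        self.last_sent[rule] = now          # record and allow the alert
        return True

# Each rule keeps an independent cooldown, matching the timeline above.
tracker = CooldownTracker({
    "waf": timedelta(minutes=30),
    "origin_5xx": timedelta(minutes=15),
})
```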
Advanced Cooldown Configuration¶
Global Domain Settings:

```
{
  "cooldown_time": "PT1800S",   // 30 minutes global default
  "alert_emails": ["ops@company.com"],
  "alert_mobiles": ["+61412345678"]
}
```

Per-Rule Overrides:

```
{
  "rules": {
    "waf": {
      "notify": true,
      "cooldown_time": "PT900S"    // 15 minutes for WAF (override)
    },
    "origin_down": {
      "notify": true,
      "cooldown_time": "PT3600S"   // 1 hour for origin issues
    }
  }
}
```
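The cooldown_time values above are ISO 8601 durations. A minimal sketch of how a client could resolve the effective cooldown for a rule, assuming the global settings and per-rule overrides are merged into one object (the parse_duration helper is illustrative, not part of Peakhour's API):

```python
import re
from datetime import timedelta

def parse_duration(value):
    """Parse a simple ISO 8601 duration of the form PT<seconds>S."""
    match = re.fullmatch(r"PT(\d+)S", value)
    if not match:
        raise ValueError(f"unsupported duration: {value}")
    return timedelta(seconds=int(match.group(1)))

def effective_cooldown(settings, rule):
    """Per-rule override wins; otherwise fall back to the domain default."""
    rule_cfg = settings.get("rules", {}).get(rule, {})
    raw = rule_cfg.get("cooldown_time", settings.get("cooldown_time", "PT1800S"))
    return parse_duration(raw)

settings = {
    "cooldown_time": "PT1800S",
    "rules": {"waf": {"notify": True, "cooldown_time": "PT900S"}},
}
print(effective_cooldown(settings, "waf"))          # 0:15:00
print(effective_cooldown(settings, "origin_down"))  # 0:30:00 (falls back to default)
```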
Create Advanced Log-Based Alert Rules¶
Security Event Monitoring¶
While Peakhour's built-in alerts cover major security events, you can enhance monitoring using log forwarding to external systems:
High-Volume Attack Detection (a threshold sketch follows this block):

```
Log Forward to SIEM:
  Event Type: WAF Blocks
  Threshold: >50 events in 5 minutes
  Action: Trigger advanced incident response
```
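A minimal sketch of the kind of sliding-window check a downstream consumer of forwarded logs could apply (the 50-events-in-5-minutes threshold comes from the rule above; everything else is illustrative):

```python
from collections import deque
from datetime import datetime, timedelta

class SlidingWindowThreshold:
    """Flags an incident when more than `limit` events arrive within `window`."""

    def __init__(self, limit=50, window=timedelta(minutes=5)):
        self.limit = limit
        self.window = window
        self.events = deque()

    def record(self, timestamp):
        self.events.append(timestamp)
        cutoff = timestamp - self.window
        while self.events and self.events[0] < cutoff:
            self.events.popleft()              # drop events outside the window
        return len(self.events) > self.limit   # True -> trigger incident response

detector = SlidingWindowThreshold()
if detector.record(datetime.utcnow()):
    print("High-volume WAF attack detected")
```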
Geographic Anomaly Detection:

```
Log Analysis Rule:
  Pattern: New country in top traffic sources
  Threshold: >100 requests from new geographic region
  Alert: Email notification with geographic analysis
```
Performance Monitoring Alerts¶
Response Time Degradation:

```
Custom Monitoring:
  Metric: Average response time
  Threshold: >2 seconds for >5 minutes
  Action: Performance alert with trend analysis
```

Cache Hit Rate Drop:

```
Performance Alert:
  Metric: Cache hit rate
  Threshold: <70% for >10 minutes
  Action: Cache performance investigation alert
```
Operational Health Monitoring¶
Error Rate Spikes:

```
Error Rate Monitor:
  Metric: 4xx/5xx error percentage
  Threshold: >10% for >5 minutes
  Action: Service health alert
```

Traffic Volume Anomalies (a baseline-deviation sketch follows this block):

```
Traffic Monitoring:
  Metric: Request volume deviation
  Threshold: >300% of baseline or <10% of baseline
  Action: Traffic anomaly alert
```
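A minimal sketch of the baseline comparison behind the traffic anomaly rule (how the baseline is computed is left open; the 300%/10% thresholds come from the rule above):

```python
def traffic_anomaly(current_rpm, baseline_rpm, high=3.0, low=0.10):
    """Flag request volume that deviates sharply from the rolling baseline.

    high: trigger above 300% of baseline; low: trigger below 10% of baseline.
    """
    if baseline_rpm <= 0:
        return None                       # no baseline yet, nothing to compare
    ratio = current_rpm / baseline_rpm
    if ratio > high:
        return f"traffic spike: {ratio:.1f}x baseline"
    if ratio < low:
        return f"traffic drop: {ratio:.1%} of baseline"
    return None

print(traffic_anomaly(current_rpm=4200, baseline_rpm=1200))  # traffic spike: 3.5x baseline
```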
Integrate with External Systems¶
Webhook Integration for Custom Notifications¶
Slack Integration (via webhook):

```
# Configure webhook endpoint
POST https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK

# Webhook payload transformation
{
  "text": "Peakhour Alert: {{alert_type}} on {{domain}}",
  "channel": "#security-alerts",
  "username": "PeakHour Monitor",
  "icon_emoji": ":warning:"
}
```
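A minimal sketch of forwarding an alert payload to that Slack webhook from your own receiver (the webhook URL and the shape of the incoming alert are placeholders):

```python
import requests

SLACK_WEBHOOK = "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"  # placeholder

def forward_to_slack(alert):
    """Transform an alert into a Slack message and post it."""
    message = {
        "text": f"Peakhour Alert: {alert['alert_type']} on {alert['domain']}",
        "channel": "#security-alerts",
        "username": "PeakHour Monitor",
        "icon_emoji": ":warning:",
    }
    response = requests.post(SLACK_WEBHOOK, json=message, timeout=10)
    response.raise_for_status()

forward_to_slack({"alert_type": "waf", "domain": "example.com"})
```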
PagerDuty Integration:

```
# PagerDuty Events API v2
POST https://events.pagerduty.com/v2/enqueue

{
  "routing_key": "YOUR_INTEGRATION_KEY",
  "event_action": "trigger",
  "payload": {
    "summary": "{{alert_type}} alert on {{domain}}",
    "source": "peakhour.io",
    "severity": "error"
  }
}
```
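The same trigger can be sent programmatically from a webhook receiver; a minimal sketch using the Events API v2 (the routing key and alert fields are placeholders):

```python
import requests

def trigger_pagerduty(alert, routing_key="YOUR_INTEGRATION_KEY"):
    """Send a trigger event to the PagerDuty Events API v2."""
    event = {
        "routing_key": routing_key,
        "event_action": "trigger",
        "payload": {
            "summary": f"{alert['alert_type']} alert on {alert['domain']}",
            "source": "peakhour.io",
            "severity": "error",
        },
    }
    response = requests.post("https://events.pagerduty.com/v2/enqueue",
                             json=event, timeout=10)
    response.raise_for_status()
    return response.json()  # contains the dedup_key for later resolve events

trigger_pagerduty({"alert_type": "origin_down", "domain": "example.com"})
```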
SIEM Integration via Log Forwarding¶
Azure Sentinel Integration:
- Configure Log Forwarding: Set up HTTP endpoint for Sentinel
- Create Analytics Rules: Define detection logic in Sentinel
- Set Up Playbooks: Automated response workflows
Example Sentinel Rule:

```
PeakHourLogs_CL
| where block_by_s == "waf"
| where waf_matched_rule_severity_s == "CRITICAL"
| summarize count() by client_s, bin(TimeGenerated, 5m)
| where count_ > 10
```
Splunk Integration:

```bash
# HTTP Event Collector
curl -X POST "https://splunk.company.com:8088/services/collector" \
  -H "Authorization: Splunk YOUR_TOKEN" \
  -d '{
        "event": {
          "alert_type": "{{alert_type}}",
          "domain": "{{domain}}",
          "timestamp": "{{timestamp}}"
        }
      }'
```
Custom Alert Processing¶
Webhook Receiver Example (Python):

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

def send_to_security_team(data):
    """Placeholder: escalate to the security team (e.g. PagerDuty, Slack)."""

def send_to_ops_team(data):
    """Placeholder: notify the infrastructure/operations team."""

@app.route('/peakhour-alert', methods=['POST'])
def handle_peakhour_alert():
    data = request.json
    alert_type = data.get('alert_type')

    if alert_type == 'waf' and 'critical' in data.get('severity', '').lower():
        # Escalate critical security alerts
        send_to_security_team(data)
    elif alert_type in ('origin_down', 'origin_timeout'):
        # Alert infrastructure team
        send_to_ops_team(data)

    return jsonify({'status': 'received'})
```
Monitor Alert Effectiveness¶
Alert Analytics Dashboard¶
Track alert performance and effectiveness:
Alert Volume Metrics:
- Alerts triggered per day/week/month
- Alert type distribution
- Cooldown effectiveness (suppressed vs sent)
- Response time to alerts
Alert Quality Metrics:
- False positive rate
- Time to resolution
- Alert escalation patterns
- Notification channel effectiveness
Alert Tuning Based on Analytics¶
Reduce Alert Fatigue:
High-Volume Alerts Analysis:
- IP Block alerts: 150/day (consider longer cooldown)
- WAF alerts: 25/day (appropriate level)
- Origin alerts: 2/day (critical - keep current settings)
Tuning Actions:
- Increase IP block cooldown: 30min → 2 hours
- Add geographic filtering to reduce noise
- Create severity-based escalation
Improve Alert Coverage:
Gap Analysis:
- Missing alerts for certificate expiration
- No alerting for DNS resolution issues
- Limited visibility into cache performance
Enhancement Plan:
- Add certificate monitoring alerts (see the sketch after this list)
- Implement DNS health checks
- Create cache performance thresholds
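One gap called out above, certificate expiration, can be covered with a simple external check. A minimal sketch using Python's standard library (the hostname and the 21-day warning threshold are illustrative):

```python
import socket
import ssl
import time

def days_until_cert_expiry(hostname, port=443):
    """Return the number of days until the host's TLS certificate expires."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            cert = tls.getpeercert()
    expires = ssl.cert_time_to_seconds(cert["notAfter"])
    return int((expires - time.time()) // 86400)

remaining = days_until_cert_expiry("example.com")
if remaining < 21:                      # warn three weeks before expiry
    print(f"Certificate for example.com expires in {remaining} days")
```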
Advanced Alerting Strategies¶
Multi-Tier Alert Escalation¶
Tier 1 - Information (Log only):

```
Events: Regular WAF blocks, standard IP blocks
Action: Log to SIEM, no immediate notification
Threshold: Standard activity levels
```

Tier 2 - Warning (Email notification):

```
Events: Elevated attack activity, minor performance issues
Action: Email to operations team
Threshold: 2-3x normal activity
```

Tier 3 - Critical (Email + SMS):

```
Events: Major attacks, service outages, significant performance degradation
Action: Email + SMS to on-call team
Threshold: 5x+ normal activity or service impact
```
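A minimal sketch of routing alerts by tier (the tier assignments and channel callables are illustrative, not a built-in Peakhour feature):

```python
TIERS = {
    "ip_block": 1,        # information: log only
    "waf": 2,             # warning: email
    "origin_down": 3,     # critical: email + SMS
}

def route_alert(alert, send_email, send_sms, log_to_siem):
    """Dispatch an alert to the channels appropriate for its tier."""
    tier = TIERS.get(alert["alert_type"], 2)
    log_to_siem(alert)                       # every tier is logged
    if tier >= 2:
        send_email(alert)
    if tier >= 3:
        send_sms(alert)

route_alert({"alert_type": "origin_down", "domain": "example.com"},
            send_email=print, send_sms=print, log_to_siem=print)
```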
Context-Aware Alerting¶
Business Hours vs. After Hours (a channel-selection sketch follows these lists):
Business Hours (9 AM - 6 PM):
- Higher alert thresholds (more activity expected)
- Email notifications preferred
- Faster response time expectations
After Hours:
- Lower alert thresholds (less legitimate traffic)
- SMS notifications for critical issues
- Extended response time acceptable for non-critical
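A minimal sketch of the time-of-day logic described above (the 9 AM - 6 PM window and channel choices follow the guidance in these lists; treat the severity labels as illustrative):

```python
from datetime import datetime

def notification_channels(severity, now=None):
    """Choose channels based on severity and whether it is business hours."""
    now = now or datetime.now()
    business_hours = now.weekday() < 5 and 9 <= now.hour < 18
    if severity == "critical":
        return ["email", "sms"]              # always page for critical issues
    if business_hours:
        return ["email"]                     # routine issues handled by the day team
    return ["email"] if severity == "warning" else []  # defer informational alerts

print(notification_channels("critical"))
```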
Geographic Context:
Expected Traffic Regions:
- US/CA/AU: Higher thresholds, standard alerting
- EU: Moderate thresholds during EU business hours
- High-Risk Regions: Lower thresholds, immediate alerting
Intelligent Alert Correlation¶
Attack Campaign Detection (a correlation sketch follows these lists):
Correlation Logic:
- Multiple IPs from same ASN attacking
- Similar attack patterns across time
- Geographic clustering of threats
Enhanced Alert:
- Single "coordinated attack" alert vs. many individual IP alerts
- Include campaign analysis and attribution
- Provide recommended response actions
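A minimal sketch of the correlation idea, collapsing many per-IP events into a single campaign-level alert keyed by ASN (the field names and the five-IP threshold are illustrative):

```python
from collections import defaultdict

def correlate_by_asn(events, min_ips=5):
    """Collapse many per-IP block events into one alert per attacking network."""
    by_asn = defaultdict(set)
    for event in events:
        by_asn[event["asn"]].add(event["client_ip"])
    return [
        {"alert": "coordinated attack", "asn": asn, "unique_ips": len(ips)}
        for asn, ips in by_asn.items()
        if len(ips) >= min_ips                 # many distinct IPs from one ASN
    ]

events = [{"asn": 64496, "client_ip": f"203.0.113.{i}"} for i in range(1, 8)]
print(correlate_by_asn(events))
```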
Alert Response Procedures¶
Create Standard Operating Procedures¶
Security Alert Response:
WAF Alert Procedure:
1. Review alert details and affected domains
2. Check security dashboard for attack patterns
3. Verify WAF rules are blocking effectively
4. Escalate to security team if bypass detected
5. Document incident and response actions
Performance Alert Response:
Origin Error Procedure:
1. Check origin server health and connectivity
2. Review recent configuration changes
3. Test direct origin connectivity
4. Contact hosting provider if needed
5. Update monitoring thresholds if appropriate
Alert Response Templates¶
Security Incident Response:

```
Subject: Security Alert Response - {{alert_type}} on {{domain}}

Initial Assessment:
- Alert Time: {{timestamp}}
- Attack Vector: {{attack_details}}
- Source Analysis: {{source_ips_countries}}
- Current Status: {{blocking_effectiveness}}

Immediate Actions Taken:
- [ ] Verified WAF blocking effectiveness
- [ ] Reviewed attack patterns and scale
- [ ] Checked for successful bypasses
- [ ] Escalated to security team (if needed)

Next Steps:
- Continue monitoring for escalation
- Update threat intelligence with IOCs
- Review and enhance detection rules
```
Integration with Incident Management¶
Incident Tracking Integration¶
ServiceNow Integration:
Alert → ServiceNow Incident:
- Automatic incident creation for critical alerts
- Alert details mapped to incident fields
- Priority assignment based on alert severity
- Assignment to appropriate support groups
JIRA Integration (an issue-creation sketch follows this list):
Alert → JIRA Issue:
- Create issues for operational alerts
- Track resolution and root cause analysis
- Link related alerts and incidents
- Generate metrics on alert resolution time
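A minimal sketch of turning an alert into a JIRA issue via the JIRA REST API (the base URL, credentials, project key, and issue type are placeholders):

```python
import requests

JIRA_URL = "https://jira.company.com"        # placeholder
AUTH = ("alert-bot", "API_TOKEN")            # placeholder credentials

def create_jira_issue(alert):
    """Open an operational issue for an alert so resolution can be tracked."""
    issue = {
        "fields": {
            "project": {"key": "OPS"},
            "summary": f"Peakhour alert: {alert['alert_type']} on {alert['domain']}",
            "description": f"Raised automatically at {alert.get('timestamp')}",
            "issuetype": {"name": "Incident"},
        }
    }
    response = requests.post(f"{JIRA_URL}/rest/api/2/issue",
                             json=issue, auth=AUTH, timeout=10)
    response.raise_for_status()
    return response.json()["key"]            # e.g. OPS-123

create_jira_issue({"alert_type": "origin_5xx", "domain": "example.com"})
```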
Automated Response Actions¶
Immediate Response Automation:
Critical Security Alert Automation:
1. Create firewall rule to block attack sources
2. Increase rate limiting temporarily
3. Notify security team via multiple channels
4. Create incident tracking ticket
5. Begin evidence collection for forensics
Performance Issue Automation:
Origin Error Response:
1. Automatically failover to backup origins (if configured)
2. Increase cache TTL temporarily
3. Enable maintenance page if needed
4. Alert operations and hosting teams
5. Begin diagnostic data collection
Troubleshooting Common Alerting Issues¶
Alert Delivery Problems¶
Problem: Alerts are not being received.
Solutions:
- Verify email addresses and mobile numbers are correct
- Check spam/junk folders for email alerts
- Test notification channels with manual test alerts
- Review cooldown settings - alerts may be suppressed
- Confirm alert rules are enabled
Alert Fatigue¶
Problem: Too many alerts are firing, and important ones get ignored.
Solutions:
- Increase cooldown periods for noisy alerts
- Implement severity-based escalation
- Use log forwarding for informational events
- Create alert summaries instead of individual notifications
- Filter out expected patterns (maintenance, known good traffic)
Missing Important Alerts¶
Problem: Critical events are not generating alerts.
Solutions:
- Review alert rule configuration and thresholds
- Check if events fall outside configured alert types
- Verify log forwarding for custom monitoring
- Test alert conditions with known scenarios
- Consider additional monitoring tools for gaps
Best Practices for Log-Based Alerting¶
Alert Configuration¶
- Start with conservative thresholds and adjust based on experience
- Use appropriate cooldown periods to prevent alert spam
- Test all notification channels regularly
- Document alert procedures and escalation paths
- Review and tune alerts regularly based on effectiveness metrics
Notification Management¶
- Use appropriate channels for different severity levels
- Implement follow-the-sun on-call coverage
- Provide context and recommended actions in alerts
- Create alert summaries for management reporting
- Maintain contact information and escalation procedures
Integration Strategy¶
- Leverage existing monitoring and incident management tools
- Use webhooks for custom integrations and workflows
- Implement automated response for common scenarios
- Maintain audit trails of alert actions and responses
- Test integration points and escalation procedures regularly
Your log-based alerting system is now configured to provide comprehensive monitoring coverage with intelligent notification management and integration capabilities for effective incident response.