API Schema Discovery and Endpoint Analysis
Peakhour's API Discovery system automatically detects, catalogs, and analyzes API endpoints from live traffic to provide comprehensive visibility into your API infrastructure and enable advanced security policies.
How API Discovery Works
Automatic Detection Process
Traffic Analysis:
- Request Classification: Distinguish API traffic from web traffic based on content types, headers, and URL patterns
- Endpoint Extraction: Identify unique API endpoints from request paths and methods
- Parameter Discovery: Analyze request parameters (query, path, body) and their data types
- Response Analysis: Catalog response codes, content types, and structure patterns
- Schema Generation: Build OpenAPI-compatible schemas from observed traffic patterns
Discovery Components
Endpoint Cataloging:
- Path Templates: Convert /api/users/123 to /api/users/{id} patterns
- HTTP Methods: Track GET, POST, PUT, DELETE, PATCH operations per endpoint
- Parameter Types: Identify query parameters, path variables, request body fields
- Response Patterns: Catalog success/error response structures and status codes
Schema Learning:
- Data Type Inference: Determine parameter types (string, integer, boolean, array)
- Validation Rules: Extract constraints like min/max values, string patterns, required fields
- Enum Detection: Identify fixed value sets for parameters
- Nested Structure: Map complex object hierarchies in request/response bodies
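As an illustration of enum detection, the sketch below treats a parameter as an enum when it has been seen often enough and only ever takes a few distinct values. The function name and both thresholds are hypothetical and are not Peakhour's actual settings.

```python
def detect_enum(observed, max_values=5, min_observations=20):
    """Return the sorted set of distinct values if they look like an enum, else None.

    max_values and min_observations are illustrative thresholds (assumptions).
    """
    distinct = sorted(set(observed))
    if len(observed) >= min_observations and len(distinct) <= max_values:
        return distinct
    return None


# A sort-direction parameter that only ever takes two values.
sort_values = ["asc", "desc"] * 10
print(detect_enum(sort_values))
```

Requiring a minimum number of observations keeps a rarely used free-text parameter from being mistaken for an enum after only a handful of requests.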
Endpoint Discovery Features
Intelligent Traffic Classification
API Traffic Identification:
Content-Type: application/json → API Request
Accept: application/json → API Request
User-Agent: mobile-app/1.2 → API Request
Content-Type: text/html → Web Request
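A minimal sketch of header-based classification, consistent with the four examples above. The function name, the media type list, and the rule that a non-browser User-Agent signals an API client are assumptions made for illustration, not the product's documented logic.

```python
API_MEDIA_TYPES = ("application/json", "application/xml")  # assumed list


def classify_request(headers):
    """Return "api" or "web" for a mapping of request headers."""
    # Header names are case-insensitive, so normalise before lookup.
    h = {name.lower(): value.lower() for name, value in headers.items()}
    content_type = h.get("content-type", "")
    accept = h.get("accept", "")
    user_agent = h.get("user-agent", "")

    # HTML in either header is treated as ordinary web traffic.
    if "text/html" in content_type or "text/html" in accept:
        return "web"
    if any(m in content_type or m in accept for m in API_MEDIA_TYPES):
        return "api"
    # Assumption: browsers send a User-Agent starting with "Mozilla/".
    if user_agent and not user_agent.startswith("mozilla/"):
        return "api"
    return "web"


print(classify_request({"User-Agent": "mobile-app/1.2"}))
```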
Endpoint Pattern Recognition:
/api/v1/users/123 → /api/v1/users/{id}
/api/v1/orders/456/items → /api/v1/orders/{id}/items
/api/v1/products?limit=10 → /api/v1/products + query params
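The pattern recognition above can be approximated by replacing identifier-like path segments with a placeholder and splitting off the query string. This is a sketch only: treating numeric and UUID segments as identifiers is an assumption, and `to_template` is a hypothetical name.

```python
import re
from urllib.parse import parse_qsl, urlsplit

# Segments that are all digits or a UUID are assumed to be identifiers.
_ID_SEGMENT = re.compile(
    r"^(\d+|[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})$"
)


def to_template(url):
    """Return (path template, query parameter names) for a request URL."""
    parts = urlsplit(url)
    segments = [
        "{id}" if _ID_SEGMENT.match(segment) else segment
        for segment in parts.path.split("/")
    ]
    query_names = [name for name, _ in parse_qsl(parts.query, keep_blank_values=True)]
    return "/".join(segments), query_names


print(to_template("/api/v1/orders/456/items"))
print(to_template("/api/v1/products?limit=10"))
```

Version segments such as v1 are left untouched because they are not purely numeric.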
Version Detection:
Path Versioning: /api/v1/, /api/v2/
Header Versioning: API-Version: 2.1
Parameter Versioning: ?version=3.0
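The three versioning styles can be checked in sequence. The sketch below assumes a precedence of path, then header, then query parameter; that ordering and the function name are illustrative choices, not documented behavior.

```python
import re
from urllib.parse import parse_qs, urlsplit


def detect_version(url, headers):
    """Return (source, version) or None if no version marker is found."""
    parts = urlsplit(url)

    # Path versioning: a /v<number>/ segment such as /api/v1/.
    match = re.search(r"/v(\d+(?:\.\d+)*)(?:/|$)", parts.path)
    if match:
        return ("path", match.group(1))

    # Header versioning: API-Version, compared case-insensitively.
    for name, value in headers.items():
        if name.lower() == "api-version":
            return ("header", value)

    # Parameter versioning: ?version=...
    query = parse_qs(parts.query)
    if "version" in query:
        return ("parameter", query["version"][0])
    return None


print(detect_version("/api/users?version=3.0", {}))
```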
Parameter Analysis
Request Parameter Discovery:
- Query Parameters: ?limit=10&offset=20&sort=name
- Path Parameters: /users/{userId}/orders/{orderId}
- Header Parameters: X-API-Version, Authorization, custom headers
- Body Parameters: JSON/XML request body field analysis
Parameter Metadata:
{
  "name": "limit",
  "location": "query",
  "type": "integer",
  "required": false,
  "minimum": 1,
  "maximum": 100,
  "default": 20,
  "description": "Number of results to return"
}
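To show how metadata of this shape could be derived, the sketch below infers a type, an observed range, and a required flag from raw query-string values held in memory. The helper names are hypothetical, and two assumptions are made: each request supplies the parameter at most once, and the reported minimum and maximum are simply the observed extremes rather than the API's true limits.

```python
def infer_type(raw):
    """Classify one raw string value as boolean, integer, number, or string."""
    if raw.lower() in ("true", "false"):
        return "boolean"
    for caster, name in ((int, "integer"), (float, "number")):
        try:
            caster(raw)
            return name
        except ValueError:
            continue
    return "string"


def summarize_parameter(name, location, observed, total_requests):
    """Build a metadata record from the values seen for one parameter."""
    if not observed:
        raise ValueError("need at least one observed value")
    types = {infer_type(value) for value in observed}
    if len(types) == 1:
        inferred = next(iter(types))
    elif types <= {"integer", "number"}:
        inferred = "number"  # mixed ints and floats widen to number
    else:
        inferred = "string"  # anything else falls back to string
    meta = {
        "name": name,
        "location": location,
        "type": inferred,
        # Required only if every sampled request carried the parameter.
        "required": len(observed) == total_requests,
    }
    if inferred == "integer":
        numbers = [int(value) for value in observed]
        meta["minimum"], meta["maximum"] = min(numbers), max(numbers)
    return meta


print(summarize_parameter("limit", "query", ["10", "1", "100"], total_requests=5))
```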
Response Structure Analysis
Status Code Patterns:
GET /api/users/{id}:
  200: User found and returned
  404: User not found
  401: Authentication required
  429: Rate limit exceeded
POST /api/users:
  201: User created successfully
  400: Invalid user data
  409: User already exists
Response Schema Detection:
{
  "endpoint": "/api/v1/users/{id}",
  "method": "GET",
  "responses": {
    "200": {
      "content-type": "application/json",
      "schema": {
        "type": "object",
        "properties": {
          "id": {"type": "integer"},
          "email": {"type": "string", "format": "email"},
          "created_at": {"type": "string", "format": "date-time"}
        }
      }
    }
  }
}
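A schema like the one above could be derived from a single decoded response body by walking the JSON structure recursively. The following is a sketch under simplifying assumptions: only one sample is inspected, arrays are assumed homogeneous, and the email and date-time patterns are deliberately loose. `infer_schema` is a hypothetical name.

```python
import json
import re

_EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
_DATE_TIME = re.compile(
    r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})$"
)


def infer_schema(value):
    """Infer a schema fragment from one decoded JSON value."""
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        return {"type": "number"}
    if isinstance(value, str):
        schema = {"type": "string"}
        if _EMAIL.match(value):
            schema["format"] = "email"
        elif _DATE_TIME.match(value):
            schema["format"] = "date-time"
        return schema
    if isinstance(value, list):
        schema = {"type": "array"}
        if value:  # only the first item is inspected
            schema["items"] = infer_schema(value[0])
        return schema
    if isinstance(value, dict):
        return {
            "type": "object",
            "properties": {key: infer_schema(item) for key, item in value.items()},
        }
    return {"nullable": True}  # JSON null; OpenAPI 3.0 has no null type


body = json.loads(
    '{"id": 7, "email": "a@example.com", "created_at": "2024-01-15T14:30:15Z"}'
)
print(json.dumps(infer_schema(body), indent=2))
```

A production learner would merge schemas across many samples before trusting any field, which is where the stability and confidence thresholds in the configuration section come in.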
Security Benefits
Attack Surface Analysis
Endpoint Risk Assessment:
- Authentication Requirements: Which endpoints require authentication
- Sensitive Data Exposure: Endpoints returning PII or confidential data
- Input Validation: Endpoints accepting user input without proper validation
- Rate Limiting Gaps: High-risk endpoints lacking proper rate limiting
Vulnerability Detection:
High Risk Endpoints:
- /api/admin/* (administrative functions)
- /api/users/{id}/password (password changes)
- /api/payments/* (financial transactions)
- /api/debug/* (debugging endpoints in production)
Common Issues Found:
- Missing authentication on sensitive endpoints
- Overly permissive CORS policies
- Endpoints returning internal system information
- Lack of input validation on user-provided data
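One simple way to flag the high-risk endpoints listed above is glob matching against the discovered path templates. This sketch uses only the four patterns from the list; the function name is hypothetical, and real risk scoring would presumably weigh more signals than the path alone.

```python
from fnmatch import fnmatchcase

# Patterns and reasons taken from the high-risk endpoint list above.
HIGH_RISK_PATTERNS = {
    "/api/admin/*": "administrative functions",
    "/api/users/*/password": "password changes",
    "/api/payments/*": "financial transactions",
    "/api/debug/*": "debugging endpoints in production",
}


def risk_reasons(path):
    """Return the reasons a discovered path is considered high risk."""
    # fnmatchcase is used so matching is case-sensitive on every platform.
    return [
        reason
        for pattern, reason in HIGH_RISK_PATTERNS.items()
        if fnmatchcase(path, pattern)
    ]


for discovered in ("/api/admin/users", "/api/users/{id}/password", "/api/v1/products"):
    print(discovered, risk_reasons(discovered))
```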
Automated Security Policy Generation
Rule Generation Based on Discovery:
// Auto-generated rate limiting rule
if (http.request.uri.path matches "/api/v1/search.*") {
  rate_limit(zone: "api_search", rate: "60r/m", key: ["api_key"])
}

// Auto-generated authentication requirement
if (starts_with(http.request.uri.path, "/api/v1/users/") and
    http.request.method ne "GET") {
  require_authentication()
}

// Auto-generated input validation
if (http.request.uri.path eq "/api/v1/users" and
    http.request.method eq "POST") {
  validate_json_schema(user_creation_schema)
}
OpenAPI Schema Generation
Automated Documentation
Schema Export Formats:
- OpenAPI 3.0: Industry-standard API documentation format
- Swagger UI: Interactive API documentation interface
- Postman Collections: Ready-to-import API testing collections
- Insomnia Workspaces: API development environment setup
Generated OpenAPI Example:
openapi: 3.0.3
info:
  title: Discovered API
  version: 1.0.0
  description: Automatically generated from traffic analysis
paths:
  /api/v1/users:
    get:
      summary: List users
      parameters:
        - name: limit
          in: query
          schema:
            type: integer
            minimum: 1
            maximum: 100
            default: 20
        - name: offset
          in: query
          schema:
            type: integer
            minimum: 0
            default: 0
      responses:
        '200':
          description: User list retrieved successfully
          content:
            application/json:
              schema:
                type: object
                properties:
                  users:
                    type: array
                    items:
                      $ref: '#/components/schemas/User'
components:
  schemas:
    User:
      type: object
      properties:
        id:
          type: integer
          format: int64
        email:
          type: string
          format: email
        created_at:
          type: string
          format: date-time
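For readers who want to see how a catalog of discovered endpoints might be assembled into a document of this form, here is a minimal sketch. The input record layout (path, method, parameters, responses) and the function name are assumptions for illustration; only the OpenAPI 3.0 field names are standard.

```python
import json


def build_openapi(title, endpoints):
    """Assemble a minimal OpenAPI 3.0 document from endpoint records."""
    paths = {}
    for endpoint in endpoints:
        operation = {
            "parameters": [
                {
                    "name": param["name"],
                    "in": param["location"],
                    # OpenAPI requires path parameters to be marked required.
                    "required": param.get("required", False)
                    or param["location"] == "path",
                    "schema": {"type": param["type"]},
                }
                for param in endpoint.get("parameters", [])
            ],
            "responses": {
                str(code): {"description": description}
                for code, description in endpoint["responses"].items()
            },
        }
        paths.setdefault(endpoint["path"], {})[endpoint["method"].lower()] = operation
    return {
        "openapi": "3.0.3",
        "info": {"title": title, "version": "1.0.0"},
        "paths": paths,
    }


catalog = [
    {
        "path": "/api/v1/users",
        "method": "GET",
        "parameters": [{"name": "limit", "location": "query", "type": "integer"}],
        "responses": {200: "User list retrieved successfully"},
    }
]
print(json.dumps(build_openapi("Discovered API", catalog), indent=2))
```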
Schema Validation Integration
Real-time Validation:
- Request Validation: Ensure incoming requests match discovered schemas
- Response Validation: Verify API responses maintain consistent structure
- Breaking Change Detection: Alert when API responses deviate from established patterns
- Version Drift Monitoring: Track API evolution and compatibility
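Request validation against a discovered schema can be pictured with a small recursive checker. The sketch below supports only a subset of schema keywords (type, required, properties, items, minimum, maximum) and is not a substitute for a full JSON Schema validator; the function name and error message format are hypothetical.

```python
_PYTHON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}


def validate(value, schema, path="$"):
    """Return a list of human-readable violations (empty if valid)."""
    expected = schema.get("type")
    if expected:
        ok = isinstance(value, _PYTHON_TYPES[expected])
        # bool is a subclass of int, so reject it for numeric types explicitly.
        if expected in ("integer", "number") and isinstance(value, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}"]

    errors = []
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        if "minimum" in schema and value < schema["minimum"]:
            errors.append(f"{path}: below minimum {schema['minimum']}")
        if "maximum" in schema and value > schema["maximum"]:
            errors.append(f"{path}: above maximum {schema['maximum']}")
    if expected == "object":
        for name in schema.get("required", []):
            if name not in value:
                errors.append(f"{path}.{name}: missing required field")
        for name, sub_schema in schema.get("properties", {}).items():
            if name in value:
                errors += validate(value[name], sub_schema, f"{path}.{name}")
    if expected == "array" and "items" in schema:
        for index, item in enumerate(value):
            errors += validate(item, schema["items"], f"{path}[{index}]")
    return errors


user_schema = {
    "type": "object",
    "required": ["email"],
    "properties": {
        "email": {"type": "string"},
        "age": {"type": "integer", "minimum": 0},
    },
}
print(validate({"age": -1}, user_schema))
```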
Traffic Pattern Analysis
Usage Analytics
Endpoint Performance Metrics:
/api/v1/users (GET):
- Average Response Time: 156ms
- 95th Percentile: 324ms
- Request Volume: 1,247 req/hour
- Error Rate: 2.1%
- Cache Hit Rate: 78%
/api/v1/orders (POST):
- Average Response Time: 423ms
- 95th Percentile: 892ms
- Request Volume: 89 req/hour
- Error Rate: 5.3%
- Success Rate: 94.7%
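Metrics of this kind can be computed from per-request samples. The sketch below uses the nearest-rank method for the 95th percentile and counts any status of 400 or above as an error; both choices, and the function names, are assumptions rather than the product's documented definitions.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(samples):
    """Summarize (response_time_ms, status_code) samples for one endpoint."""
    times = [elapsed for elapsed, _ in samples]
    errors = sum(1 for _, status in samples if status >= 400)
    return {
        "avg_ms": sum(times) / len(times),
        "p95_ms": percentile(times, 95),
        "error_rate": errors / len(samples),
    }


samples = [(100, 200), (200, 200), (300, 500), (400, 200)]
print(summarize(samples))
```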
Consumer Behavior Analysis:
- API Key Usage: Which consumers use which endpoints
- Geographic Distribution: Where API requests originate
- Time-based Patterns: Peak usage hours and seasonal trends
- Device/Platform Analysis: Mobile vs web vs server-to-server usage
Security Event Correlation
Threat Detection Integration:
Endpoint: /api/v1/admin/users
Security Events:
- 15 brute force attempts (2024-01-15 14:30)
- 3 SQL injection attempts (2024-01-15 15:45)
- 1 privilege escalation attempt (2024-01-15 16:12)
Risk Score: HIGH
Recommendation: Require additional authentication for admin endpoints
Anomaly Detection:
- Usage Spikes: Unusual request volume increases
- New Endpoints: Previously unseen API endpoints appearing
- Parameter Anomalies: Requests with unexpected parameter values
- Response Anomalies: Unusual error rates or response patterns
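Two of these checks are easy to sketch: a usage spike as a z-score against recent hourly volumes, and new endpoints as a set difference against the known catalog. The z-score approach and the threshold of 3 are illustrative assumptions; the source does not state which statistical method is used.

```python
from statistics import mean, pstdev


def is_usage_spike(history, current, threshold=3.0):
    """True if current volume is more than `threshold` std devs above the mean."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # Perfectly flat history: any increase counts as a spike.
        return current > mu
    return (current - mu) / sigma > threshold


def new_endpoints(known, observed):
    """Return endpoints seen in traffic that are not yet in the catalog."""
    return sorted(set(observed) - set(known))


hourly_requests = [100, 110, 90, 105, 95]
print(is_usage_spike(hourly_requests, 400))
print(new_endpoints({"/api/v1/users"}, ["/api/v1/users", "/api/v1/export"]))
```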
Discovery Configuration
Learning Parameters
Discovery Sensitivity:
Endpoint Detection Threshold:
- Minimum Requests: 5 (before cataloging endpoint)
- Time Window: 24 hours (learning period)
- Parameter Confidence: 80% (before including in schema)
- Response Stability: 90% (consistent response structure)
Traffic Sampling:
- Sample Rate: 10% of traffic (configurable)
- Excluded Paths: Static assets, health checks, internal endpoints
- Included Content Types: JSON, XML, form-data
- User Agent Filtering: Exclude bots, include legitimate API clients
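The thresholds above can be read as simple gates. In this sketch an endpoint is cataloged once it has at least the minimum number of requests inside the time window, and a parameter is included once it appears in at least 80% of requests. Interpreting "Parameter Confidence" as a share of requests is an assumption, as are all the names used here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class DiscoveryConfig:
    # Defaults mirror the values listed above.
    min_requests: int = 5
    window: timedelta = timedelta(hours=24)
    parameter_confidence: float = 0.80


def should_catalog(timestamps, now, cfg=DiscoveryConfig()):
    """True if enough requests fall inside the learning window."""
    recent = [ts for ts in timestamps if now - ts <= cfg.window]
    return len(recent) >= cfg.min_requests


def confident_parameters(param_counts, total_requests, cfg=DiscoveryConfig()):
    """Return parameters seen in at least the configured share of requests."""
    return sorted(
        name
        for name, count in param_counts.items()
        if count / total_requests >= cfg.parameter_confidence
    )


now = datetime(2024, 1, 15, 12, 0)
seen = [now - timedelta(hours=h) for h in (1, 2, 3, 4, 30)]
print(should_catalog(seen, now))
print(confident_parameters({"limit": 9, "debug": 1}, total_requests=10))
```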
Privacy and Compliance
Data Handling:
- Parameter Value Redaction: Never store actual parameter values
- PII Detection: Automatically identify and mask sensitive data patterns
- Retention Policies: Schema data retention and cleanup schedules
- Access Controls: Who can view discovered API information
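Parameter value redaction can be pictured as replacing every leaf value with a type placeholder before anything is stored, so only the shape of the payload survives. The placeholder strings, the loose email pattern, and the function name below are hypothetical.

```python
import re

_EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")  # deliberately loose pattern


def redact(value):
    """Replace every leaf in a decoded JSON value with a type placeholder."""
    if isinstance(value, dict):
        return {key: redact(item) for key, item in value.items()}
    if isinstance(value, list):
        return [redact(item) for item in value]
    # bool is checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "<boolean>"
    if isinstance(value, int):
        return "<integer>"
    if isinstance(value, float):
        return "<number>"
    if isinstance(value, str):
        return "<email>" if _EMAIL.fullmatch(value) else "<string>"
    return "<null>"


print(redact({"email": "a@example.com", "age": 30, "tags": ["x"]}))
```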
Compliance Features:
- GDPR Compliance: No personal data stored in discovery process
- SOC2 Controls: Audit trails for schema access and modifications
- Data Residency: Schema storage location controls
- Encryption: All discovered schema data encrypted at rest
Integration Capabilities
Development Workflow Integration
CI/CD Pipeline Integration:
# Export current API schema for validation
# (-f makes curl exit non-zero on HTTP errors instead of saving the error body)
curl -fsS -H "Authorization: Bearer $API_KEY" \
  "https://api.peakhour.io/domains/example.com/api-discovery/schema" > current-schema.json

# Compare with expected schema in version control and
# fail the build if the two differ
if ! diff -u expected-schema.json current-schema.json; then
  echo "API schema differs from expected-schema.json; review for breaking changes"
  exit 1
fi
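A plain text diff flags every difference, including harmless additions. If the exported schema is an OpenAPI document, a pipeline step could instead look only for removals, which are the changes most likely to break clients. The sketch below assumes both files contain a top-level paths object and checks just removed paths and methods; the function name is hypothetical.

```python
def breaking_changes(expected, current):
    """List paths and methods present in `expected` but missing from `current`."""
    problems = []
    current_paths = current.get("paths", {})
    for path, operations in expected.get("paths", {}).items():
        if path not in current_paths:
            problems.append(f"removed path {path}")
            continue
        for method in operations:
            if method not in current_paths[path]:
                problems.append(f"removed {method.upper()} {path}")
    return problems


expected = {"paths": {"/a": {"get": {}, "post": {}}, "/b": {"get": {}}}}
current = {"paths": {"/a": {"get": {}}, "/c": {"get": {}}}}
for problem in breaking_changes(expected, current):
    print(problem)
```

Added paths such as /c are ignored here because new endpoints do not break existing consumers.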
Documentation Generation:
- Automatic Updates: Keep API documentation current with live traffic
- Version Tracking: Maintain historical schemas for all API versions
- Change Notifications: Alert teams when API schemas change
- Documentation Publishing: Auto-publish to developer portals
Security Tool Integration
SIEM Integration:
{
  "event_type": "api_endpoint_discovered",
  "timestamp": "2024-01-15T14:30:15Z",
  "endpoint": "/api/v1/admin/reset-password",
  "method": "POST",
  "risk_level": "HIGH",
  "authentication_required": false,
  "sensitive_data": ["password", "email"],
  "recommendation": "Add authentication requirement"
}
Vulnerability Scanning:
- Endpoint Inventory: Provide complete API surface for security scanning
- Risk Prioritization: Focus scans on high-risk discovered endpoints
- Configuration Validation: Ensure security controls match discovered API surface
- Compliance Checking: Verify API security posture against standards
This comprehensive API discovery system provides complete visibility into your API infrastructure, enabling better security policies, improved documentation, and enhanced operational insights.