v0.2.0 restructured to pipeline architecture, dirty
426
doc/filters.md
@@ -1,21 +1,17 @@
# Filter Guide

LogWisp's filtering system allows you to control which log entries are streamed to clients, reducing noise and focusing on what matters.
LogWisp filters control which log entries pass through pipelines using regular expressions.

## How Filters Work

Filters use regular expressions to match log entries. Each filter can either:
- **Include**: Only matching logs pass through (whitelist)
- **Include**: Only matching logs pass (whitelist)
- **Exclude**: Matching logs are dropped (blacklist)
- Multiple filters apply sequentially - all must pass

Multiple filters are applied sequentially - a log entry must pass ALL filters to be streamed.
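The include/exclude semantics described above can be sketched in Go. This is a minimal illustration, not LogWisp's actual implementation: the `Filter` type, `Passes`, and `compile` are hypothetical names, and every filter is assumed to use OR logic across its patterns.

```go
package main

import (
	"fmt"
	"regexp"
)

// Filter is a hypothetical stand-in for one [[pipelines.filters]] entry.
type Filter struct {
	Include  bool             // true = "include", false = "exclude"
	Patterns []*regexp.Regexp // OR logic assumed: any match counts
}

// matches reports whether any pattern matches the entry.
func (f Filter) matches(entry string) bool {
	for _, p := range f.Patterns {
		if p.MatchString(entry) {
			return true
		}
	}
	return false
}

// Passes applies the filters in order. An entry is dropped by the first
// include filter it fails to match or the first exclude filter it matches.
func Passes(chain []Filter, entry string) bool {
	for _, f := range chain {
		if f.matches(entry) != f.Include {
			return false
		}
	}
	return true
}

// compile turns pattern strings into compiled regular expressions.
func compile(patterns ...string) []*regexp.Regexp {
	out := make([]*regexp.Regexp, 0, len(patterns))
	for _, p := range patterns {
		out = append(out, regexp.MustCompile(p))
	}
	return out
}

func main() {
	chain := []Filter{
		{Include: true, Patterns: compile("ERROR", "WARN")},
		{Include: false, Patterns: compile("/health")},
	}
	for _, e := range []string{
		"ERROR: disk full",
		"INFO: started",
		"WARN: slow response on /health",
	} {
		fmt.Printf("%q pass=%v\n", e, Passes(chain, e))
	}
}
```

Only the first entry survives both filters: the second fails the include filter, and the third is dropped by the exclude filter even though it matched the include filter.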

## Filter Configuration

### Basic Structure
## Configuration

```toml
[[streams.filters]]
[[pipelines.filters]]
type = "include"  # or "exclude"
logic = "or"  # or "and"
patterns = [
@@ -26,115 +22,79 @@ patterns = [

### Filter Types

#### Include Filter (Whitelist)
Only logs matching the patterns are streamed:

#### Include Filter
```toml
[[streams.filters]]
[[pipelines.filters]]
type = "include"
logic = "or"
patterns = [
    "ERROR",
    "WARN",
    "CRITICAL"
]
# Result: Only ERROR, WARN, or CRITICAL logs are streamed
patterns = ["ERROR", "WARN", "CRITICAL"]
# Only ERROR, WARN, or CRITICAL logs pass
```

#### Exclude Filter (Blacklist)
Logs matching the patterns are dropped:

#### Exclude Filter
```toml
[[streams.filters]]
[[pipelines.filters]]
type = "exclude"
patterns = [
    "DEBUG",
    "TRACE",
    "/health"
]
# Result: DEBUG, TRACE, and health check logs are filtered out
patterns = ["DEBUG", "TRACE", "/health"]
# DEBUG, TRACE, and health checks are dropped
```

### Logic Operators

#### OR Logic (Default)
Log matches if ANY pattern matches:
- **OR**: Match ANY pattern (default)
- **AND**: Match ALL patterns

```toml
[[streams.filters]]
type = "include"
# OR Logic
logic = "or"
patterns = ["ERROR", "FAIL", "EXCEPTION"]
# Matches: "ERROR: disk full" OR "FAIL: connection timeout" OR "EXCEPTION: null pointer"
```
patterns = ["ERROR", "FAIL"]
# Matches: "ERROR: disk full" OR "FAIL: timeout"

#### AND Logic
Log matches only if ALL patterns match:

```toml
[[streams.filters]]
type = "include"
# AND Logic
logic = "and"
patterns = ["database", "timeout", "ERROR"]
# Matches: "ERROR: database connection timeout"
# Doesn't match: "ERROR: file not found" (missing "database" and "timeout")
# No match: "ERROR: file not found"
```
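The difference between the two logic operators can be checked with a short Go sketch. `MatchLogic` is a hypothetical helper written for this illustration, not part of LogWisp. It compiles patterns on every call for brevity, which a real filter would avoid.

```go
package main

import (
	"fmt"
	"regexp"
)

// MatchLogic reports whether entry satisfies the patterns under the given
// logic: "or" needs at least one match, "and" needs every pattern to match.
// Hypothetical helper; patterns are compiled per call only to keep it short.
func MatchLogic(logic string, patterns []string, entry string) bool {
	for _, p := range patterns {
		hit := regexp.MustCompile(p).MatchString(entry)
		if logic == "or" && hit {
			return true // one match is enough
		}
		if logic == "and" && !hit {
			return false // one miss is enough to fail
		}
	}
	// "and": no pattern missed. "or": no pattern hit.
	return logic == "and"
}

func main() {
	or := []string{"ERROR", "FAIL"}
	and := []string{"database", "timeout", "ERROR"}

	fmt.Println(MatchLogic("or", or, "FAIL: timeout"))
	fmt.Println(MatchLogic("and", and, "ERROR: database connection timeout"))
	fmt.Println(MatchLogic("and", and, "ERROR: file not found"))
}
```

Matching is case-sensitive unless a pattern starts with `(?i)`, so a pattern such as `EXCEPTION` does not match the text `NullPointerException`.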

## Pattern Syntax

LogWisp uses Go's regular expression syntax (RE2):

### Basic Patterns
Go regular expressions (RE2):

```toml
"ERROR"  # Substring match
"(?i)error"  # Case-insensitive
"\\berror\\b"  # Word boundaries
"^ERROR"  # Start of line
"ERROR$"  # End of line
"error|fail|warn"  # Alternatives
```
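The basic patterns above can be tried directly against Go's `regexp` package. The sketch below is illustrative only: `Matches` is a hypothetical helper and the sample entries are made up. The patterns are written as Go raw strings, so the TOML form `"\\berror\\b"` appears here as `\berror\b`.

```go
package main

import (
	"fmt"
	"regexp"
)

// Matches reports whether the RE2 pattern matches anywhere in entry.
func Matches(pattern, entry string) bool {
	return regexp.MustCompile(pattern).MatchString(entry)
}

func main() {
	cases := []struct{ pattern, entry string }{
		{`ERROR`, "TERROR alert"},             // substring match
		{`(?i)error`, "Error: disk full"},     // case-insensitive
		{`\berror\b`, "errors found"},         // word boundaries
		{`^ERROR`, "an ERROR occurred"},       // start of entry
		{`ERROR$`, "status: ERROR"},           // end of entry
		{`error|fail|warn`, "request failed"}, // alternatives
	}
	for _, c := range cases {
		fmt.Printf("%-18s %-22q %v\n", c.pattern, c.entry, Matches(c.pattern, c.entry))
	}
}
```

Without the `(?m)` flag, `^` and `$` anchor to the start and end of the whole string being matched, which coincides with "line" only if each log entry is a single line.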

## Common Patterns

### Log Levels
```toml
patterns = [
    "ERROR",  # Exact substring match
    "(?i)error",  # Case-insensitive
    "\\berror\\b",  # Word boundaries
    "^ERROR",  # Start of line
    "ERROR$",  # End of line
    "ERR(OR)?",  # Optional group
    "error|fail|exception"  # Alternatives
    "\\[(ERROR|WARN|INFO)\\]",  # [ERROR] format
    "(?i)\\b(error|warning)\\b",  # Word boundaries
    "level=(error|warn)",  # key=value format
]
```

### Common Pattern Examples

#### Log Levels
### Application Errors
```toml
# Standard log levels
patterns = [
    "\\[(ERROR|WARN|INFO|DEBUG)\\]",  # [ERROR] format
    "(?i)\\b(error|warning|info|debug)\\b",  # Word boundaries
    "level=(error|warn|info|debug)",  # key=value format
    "<(Error|Warning|Info|Debug)>"  # XML-style
]

# Severity patterns
patterns = [
    "(?i)(fatal|critical|severe)",
    "(?i)(error|fail|exception)",
    "(?i)(warn|warning|caution)",
    "panic:",  # Go panics
    "Traceback",  # Python errors
]
```

#### Application Errors
```toml
# Java/JVM
# Java
patterns = [
    "Exception",
    "\\.java:[0-9]+",  # Stack trace lines
    "at com\\.mycompany\\.",  # Company packages
    "NullPointerException|ClassNotFoundException"
    "at .+\\.java:[0-9]+",
    "NullPointerException"
]

# Python
patterns = [
    "Traceback \\(most recent call last\\)",
    "Traceback",
    "File \".+\\.py\", line [0-9]+",
    "(ValueError|TypeError|KeyError)"
    "ValueError|TypeError"
]

# Go
@@ -143,297 +103,73 @@ patterns = [
    "goroutine [0-9]+",
    "runtime error:"
]
```

# Node.js
### Performance Issues
```toml
patterns = [
    "Error:",
    "at .+ \\(.+\\.js:[0-9]+:[0-9]+\\)",
    "UnhandledPromiseRejection"
    "took [0-9]{4,}ms",  # >999ms operations
    "timeout|timed out",
    "slow query",
    "high cpu|cpu usage: [8-9][0-9]%"
]
```

#### Performance Issues
### HTTP Patterns
```toml
patterns = [
    "took [0-9]{4,}ms",  # Operations over 999ms
    "duration>[0-9]{3,}s",  # Long durations
    "timeout|timed out",  # Timeouts
    "slow query",  # Database
    "memory pressure",  # Memory issues
    "high cpu|cpu usage: [8-9][0-9]%"  # CPU issues
]
```

#### Security Patterns
```toml
patterns = [
    "(?i)(unauthorized|forbidden|denied)",
    "(?i)(auth|authentication) fail",
    "invalid (token|session|credentials)",
    "SQL injection|XSS|CSRF",
    "brute force|rate limit",
    "suspicious activity"
]
```

#### HTTP Patterns
```toml
# Error status codes
patterns = [
    "status[=:][4-5][0-9]{2}",  # status=404, status:500
    "HTTP/[0-9.]+ [4-5][0-9]{2}",  # HTTP/1.1 404
    "\"status\":\\s*[4-5][0-9]{2}"  # JSON "status": 500
]

# Specific endpoints
patterns = [
    "\"(GET|POST|PUT|DELETE) /api/",
    "/api/v[0-9]+/users",
    "path=\"/admin"
    "status[=:][4-5][0-9]{2}",  # 4xx/5xx codes
    "HTTP/[0-9.]+ [4-5][0-9]{2}",
    "\"/api/v[0-9]+/",  # API paths
]
```

## Filter Chains

Multiple filters create a processing chain. Each filter must pass for the log to be streamed.

### Example: Error Monitoring
### Error Monitoring
```toml
# Step 1: Include only errors and warnings
[[streams.filters]]
# Include errors
[[pipelines.filters]]
type = "include"
logic = "or"
patterns = [
    "(?i)\\b(error|fail|exception)\\b",
    "(?i)\\b(warn|warning)\\b",
    "(?i)\\b(critical|fatal|severe)\\b"
]
patterns = ["(?i)\\b(error|fail|critical)\\b"]

# Step 2: Exclude known non-issues
[[streams.filters]]
# Exclude known non-issues
[[pipelines.filters]]
type = "exclude"
patterns = [
    "Error: Expected behavior",
    "Warning: Deprecated API",
    "INFO.*error in message"  # INFO logs talking about errors
]

# Step 3: Exclude noisy sources
[[streams.filters]]
type = "exclude"
patterns = [
    "/health",
    "/metrics",
    "ELB-HealthChecker",
    "Googlebot"
]
patterns = ["Error: Expected", "/health"]
```

### Example: API Monitoring
### API Monitoring
```toml
# Include only API calls
[[streams.filters]]
# Include API calls
[[pipelines.filters]]
type = "include"
patterns = [
    "/api/",
    "/v[0-9]+/"
]
patterns = ["/api/", "/v[0-9]+/"]

# Exclude successful requests
[[streams.filters]]
# Exclude successful
[[pipelines.filters]]
type = "exclude"
patterns = [
    "\" 200 ",  # HTTP 200 OK
    "\" 201 ",  # HTTP 201 Created
    "\" 204 ",  # HTTP 204 No Content
    "\" 304 "  # HTTP 304 Not Modified
]

# Exclude OPTIONS requests (CORS)
[[streams.filters]]
type = "exclude"
patterns = [
    "OPTIONS "
]
patterns = ["\" 2[0-9]{2} "]
```
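To see what the single status pattern `" 2[0-9]{2} ` catches, it can be run against a few sample lines. The sketch below assumes an access-log layout in which the quoted request is followed by a space and the status code. The sample lines and the `IsSuccess` helper are invented for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// success mirrors the exclude pattern: a closing quote, a space,
// a 2xx status code, and a trailing space.
var success = regexp.MustCompile(`" 2[0-9]{2} `)

// IsSuccess reports whether the access-log line carries a 2xx status.
func IsSuccess(line string) bool {
	return success.MatchString(line)
}

func main() {
	// Hypothetical access-log lines.
	lines := []string{
		`10.0.0.1 "GET /api/v1/users HTTP/1.1" 200 512`,
		`10.0.0.1 "POST /api/v1/users HTTP/1.1" 201 64`,
		`10.0.0.1 "GET /api/v1/users HTTP/1.1" 304 0`,
		`10.0.0.1 "GET /api/v1/missing HTTP/1.1" 404 19`,
	}
	for _, l := range lines {
		fmt.Printf("excluded=%-5v %s\n", IsSuccess(l), l)
	}
}
```

The range form covers every 2xx code rather than only 200, 201, and 204, but it does not cover 304, so Not Modified responses are no longer excluded by this pattern alone.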

### Example: Security Audit
```toml
# Include security-relevant events
[[streams.filters]]
type = "include"
logic = "or"
patterns = [
    "(?i)auth",
    "(?i)login|logout",
    "(?i)sudo|root",
    "(?i)ssh|sftp|ftp",
    "(?i)firewall|iptables",
    "COMMAND=",  # sudo commands
    "USER=",  # user actions
    "SELINUX"
]
## Performance Tips

# Must also contain failure/success indicators
[[streams.filters]]
type = "include"
logic = "or"
patterns = [
    "(?i)(fail|denied|error)",
    "(?i)(success|accepted|granted)",
    "(?i)(invalid|unauthorized)"
]
```

## Performance Considerations

### Pattern Complexity

Simple patterns are fast (~1μs per check):
```toml
patterns = ["ERROR", "WARN", "FATAL"]
```

Complex patterns are slower (~10-100μs per check):
```toml
patterns = [
    "^\\[\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}\\]\\s+\\[(ERROR|WARN)\\]\\s+\\[([^\\]]+)\\]\\s+(.+)$"
]
```

### Optimization Tips

1. **Use anchors when possible**:
```toml
"^ERROR"  # Can be faster than "ERROR", but only matches at the start of the entry
```

2. **Avoid nested quantifiers**:
```toml
# BAD: Redundant nesting (backtracking engines can go exponential here; RE2 stays linear)
"((a+)+)+"

# GOOD: Same matches, simpler pattern
"a+"
```

3. **Use non-capturing groups**:
```toml
"(?:error|warn)"  # Instead of "(error|warn)"
```

4. **Order patterns by frequency**:
```toml
# Most common first
patterns = ["ERROR", "WARN", "INFO", "DEBUG"]
```

5. **Prefer character classes**:
```toml
"[0-9]"  # Instead of "\\d"
"[a-zA-Z]"  # Narrower than "\\w", which also matches digits and "_"
```
1. **Use anchors**: `^ERROR` can be faster than `ERROR`, but only matches at the start of the entry
2. **Avoid nested quantifiers**: prefer `a+` over `((a+)+)+`
3. **Non-capturing groups**: `(?:error|warn)` instead of `(error|warn)`
4. **Order by frequency**: Most common patterns first
5. **Simple patterns**: Plain substrings are faster than complex regexes

## Testing Filters

### Test Configuration
Create a test configuration with sample logs:

```toml
[[streams]]
name = "test"
[streams.monitor]
targets = [{ path = "./test-logs", pattern = "*.log" }]

[[streams.filters]]
type = "include"
patterns = ["YOUR_PATTERN_HERE"]

[streams.httpserver]
enabled = true
port = 8888
```

### Generate Test Logs
```bash
# Create test log entries
echo "[ERROR] Database connection failed" >> test-logs/app.log
echo "[INFO] User logged in" >> test-logs/app.log
echo "[WARN] High memory usage: 85%" >> test-logs/app.log
# Test configuration
echo "[ERROR] Test" >> test.log
echo "[INFO] Test" >> test.log

# Run LogWisp with debug logging
logwisp --config test.toml --log-level debug
# Run with debug
logwisp --log-level debug

# Check what passes through
curl -N http://localhost:8888/stream
```

### Debug Filter Behavior
Enable debug logging to see filter decisions:

```bash
logwisp --log-level debug --log-output stderr
```

Look for messages like:
```
Entry filtered out component=filter_chain filter_index=0 filter_type=include
Entry passed all filters component=filter_chain
```

## Common Pitfalls

### Case Sensitivity
By default, patterns are case-sensitive:
```toml
# Won't match "error" or "Error"
patterns = ["ERROR"]

# Use case-insensitive flag
patterns = ["(?i)error"]
```

### Partial Matches
Patterns match substrings by default:
```toml
# Matches "ERROR", "ERRORS", "TERROR"
patterns = ["ERROR"]

# Use word boundaries for exact words
patterns = ["\\bERROR\\b"]
```

### Special Characters
Remember to escape regex special characters:
```toml
# Won't work as expected: [ERROR] is a character class matching any single E, R, or O
patterns = ["[ERROR]"]

# Correct: escape brackets
patterns = ["\\[ERROR\\]"]
```

### Performance Impact
Too many complex patterns can impact performance:
```toml
# Consider splitting into multiple streams instead
[[streams.filters]]
patterns = [
    # 50+ complex patterns...
]
```

## Best Practices

1. **Start Simple**: Begin with basic patterns and refine as needed
2. **Test Thoroughly**: Use test logs to verify filter behavior
3. **Monitor Performance**: Check filter statistics in `/status`
4. **Document Patterns**: Comment complex patterns for maintenance
5. **Use Multiple Streams**: Instead of complex filters, consider separate streams
6. **Regular Review**: Periodically review and optimize filter rules

## See Also

- [Configuration Guide](configuration.md) - Complete configuration reference
- [Performance Tuning](performance.md) - Optimization guidelines
- [Examples](examples/) - Real-world filter configurations
# Check output
curl -N http://localhost:8080/stream
```