logwisp/doc/filters.md
2025-07-12 23:24:41 -04:00

Filter Guide

LogWisp filters control which log entries pass through pipelines using regular expressions.

How Filters Work

  • Include: Only matching logs pass (whitelist)
  • Exclude: Matching logs are dropped (blacklist)
  • Multiple filters apply sequentially; an entry must pass every filter to be forwarded
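
The rules above can be sketched in Go. This is a minimal illustration, not LogWisp's actual implementation: the `Filter` type and the `compile`, `Pass`, and `PassAll` names are hypothetical, and patterns within one filter are combined with OR (the documented default).

```go
package main

import (
	"fmt"
	"regexp"
)

// Filter is a hypothetical stand-in for one [[pipelines.filters]] entry.
type Filter struct {
	Type     string           // "include" or "exclude"
	Patterns []*regexp.Regexp // combined with OR in this sketch
}

// compile is a small helper that turns pattern strings into regexps.
func compile(patterns ...string) []*regexp.Regexp {
	out := make([]*regexp.Regexp, 0, len(patterns))
	for _, p := range patterns {
		out = append(out, regexp.MustCompile(p))
	}
	return out
}

// matchesAny reports whether at least one pattern matches the line.
func (f Filter) matchesAny(line string) bool {
	for _, re := range f.Patterns {
		if re.MatchString(line) {
			return true
		}
	}
	return false
}

// Pass reports whether the line survives this single filter:
// include keeps matching lines, exclude drops them.
func (f Filter) Pass(line string) bool {
	if f.Type == "exclude" {
		return !f.matchesAny(line)
	}
	return f.matchesAny(line)
}

// PassAll applies the filters in order; the line must pass every one.
func PassAll(filters []Filter, line string) bool {
	for _, f := range filters {
		if !f.Pass(line) {
			return false
		}
	}
	return true
}

func main() {
	chain := []Filter{
		{Type: "include", Patterns: compile("ERROR")},
		{Type: "exclude", Patterns: compile("/health")},
	}
	for _, line := range []string{
		"ERROR: disk full",
		"INFO: started",
		"ERROR: GET /health failed",
	} {
		fmt.Printf("%q pass=%v\n", line, PassAll(chain, line))
	}
}
```

Only the first sample line passes: the second fails the include filter, and the third is dropped by the exclude filter.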

Configuration

[[pipelines.filters]]
type = "include"    # or "exclude"
logic = "or"        # or "and"
patterns = [
    "pattern1",
    "pattern2"
]

Filter Types

Include Filter

[[pipelines.filters]]
type = "include"
logic = "or"
patterns = ["ERROR", "WARN", "CRITICAL"]
# Only ERROR, WARN, or CRITICAL logs pass

Exclude Filter

[[pipelines.filters]]
type = "exclude"
patterns = ["DEBUG", "TRACE", "/health"]
# DEBUG, TRACE, and health checks are dropped

Logic Operators

  • OR: Match ANY pattern (default)
  • AND: Match ALL patterns

# OR Logic
logic = "or"
patterns = ["ERROR", "FAIL"]
# Matches: "ERROR: disk full" OR "FAIL: timeout"

# AND Logic
logic = "and"
patterns = ["database", "timeout", "ERROR"]
# Matches: "ERROR: database connection timeout"
# Not: "ERROR: file not found"
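
The difference between the two operators can be sketched as follows. The `Match` function is a hypothetical helper written for this illustration, not part of LogWisp; it compiles patterns on every call for brevity, which a real pipeline would avoid.

```go
package main

import (
	"fmt"
	"regexp"
)

// Match reports whether line satisfies the patterns under the given logic.
// "and" requires every pattern to match; any other value is treated as "or".
func Match(logic string, patterns []string, line string) bool {
	for _, p := range patterns {
		hit := regexp.MustCompile(p).MatchString(line)
		if logic == "and" && !hit {
			return false // one miss is enough to fail AND
		}
		if logic != "and" && hit {
			return true // one hit is enough to satisfy OR
		}
	}
	// AND: no pattern missed. OR: no pattern hit.
	return logic == "and"
}

func main() {
	or := []string{"ERROR", "FAIL"}
	and := []string{"database", "timeout", "ERROR"}

	fmt.Println(Match("or", or, "FAIL: timeout"))                        // true
	fmt.Println(Match("and", and, "ERROR: database connection timeout")) // true
	fmt.Println(Match("and", and, "ERROR: file not found"))              // false
}
```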

Pattern Syntax

Go regular expressions (RE2):

"ERROR"              # Substring match
"(?i)error"          # Case-insensitive
"\\berror\\b"        # Word boundaries
"^ERROR"             # Start of line
"ERROR$"             # End of line
"error|fail|warn"    # Alternatives

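The doubled backslashes in "\\berror\\b" come from string escaping, not from the regex itself: in a double-quoted TOML string, \\ stands for one backslash and \b stands for a backspace character. Go's interpreted string literals follow the same two rules, so the effect can be illustrated in Go. The `matches` helper is a name introduced only for this sketch.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches compiles pattern and reports whether it matches line.
func matches(pattern, line string) bool {
	return regexp.MustCompile(pattern).MatchString(line)
}

func main() {
	escaped := "\\berror\\b" // interpreted string: each \\ becomes one backslash
	raw := `\berror\b`       // raw string: backslashes are kept as written
	fmt.Println(escaped == raw) // true

	fmt.Println(matches(escaped, "disk error detected")) // true
	fmt.Println(matches(escaped, "terrorist"))           // false

	// With single backslashes, "\b" is a backspace character, so the
	// pattern looks for backspace + "error" + backspace and finds nothing.
	fmt.Println(matches("\berror\b", "disk error detected")) // false
}
```

TOML single-quoted (literal) strings, such as '\berror\b', keep backslashes as written and avoid the doubling.
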
Common Patterns

Log Levels

patterns = [
    "\\[(ERROR|WARN|INFO)\\]",      # [ERROR] format
    "(?i)\\b(error|warning)\\b",    # Word boundaries
    "level=(error|warn)",           # key=value format
]

Application Errors

# Java
patterns = [
    "Exception",
    "at .+\\.java:[0-9]+",
    "NullPointerException"
]

# Python
patterns = [
    "Traceback",
    "File \".+\\.py\", line [0-9]+",
    "ValueError|TypeError"
]

# Go
patterns = [
    "panic:",
    "goroutine [0-9]+",
    "runtime error:"
]

Performance Issues

patterns = [
    "took [0-9]{4,}ms",                  # 4+ digits: 1000 ms or more
    "timeout|timed out",
    "slow query",
    "high cpu|cpu usage: [8-9][0-9]%"    # 80-99% only; 100% is not matched
]

HTTP Patterns

patterns = [
    "status[=:][4-5][0-9]{2}",    # 4xx/5xx codes
    "HTTP/[0-9.]+ [4-5][0-9]{2}",
    "\"/api/v[0-9]+/",            # API paths
]

Filter Chains

Error Monitoring

# Include errors
[[pipelines.filters]]
type = "include"
patterns = ["(?i)\\b(error|fail|critical)\\b"]

# Exclude known non-issues
[[pipelines.filters]]
type = "exclude"
patterns = ["Error: Expected", "/health"]

API Monitoring

# Include API calls
[[pipelines.filters]]
type = "include"
patterns = ["/api/", "/v[0-9]+/"]

# Exclude successful
[[pipelines.filters]]
type = "exclude"
patterns = ["\" 2[0-9]{2} "]

Performance Tips

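  1. Use anchors where the position is known: ^ERROR is faster than ERROR, but it matches only at the start of the line
  2. Avoid nested quantifiers such as ((a+)+)+
  3. Prefer non-capturing groups: (?:error|warn) instead of (error|warn)
  4. Order by frequency: put the most common patterns first
  5. Keep patterns simple: plain literals are faster than complex regexes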

Testing Filters

# Test configuration
echo "[ERROR] Test" >> test.log
echo "[INFO] Test" >> test.log

# Run with debug
logwisp --log-level debug

# Check output
curl -N http://localhost:8080/stream

Regex Pattern Guide

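LogWisp uses Go's standard regexp package, which implements RE2 syntax. It supports most common regex features but omits those that require backtracking, such as backreferences and lookahead/lookbehind.

The patterns in this guide are written as plain regex. In a double-quoted TOML string, double every backslash (for example "\\d" or "\\b"), as in the configuration examples above.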

For complex logic, chain multiple filters (e.g., an include followed by an exclude) rather than writing one complex regex.
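
Patterns that rely on unsupported syntax fail when they are compiled, so they can be caught before they reach a configuration file. The sketch below is an assumption about how one might pre-check patterns with Go's regexp package; `CheckPatterns` is a hypothetical helper, not a LogWisp command.

```go
package main

import (
	"fmt"
	"regexp"
)

// CheckPatterns compiles each pattern and returns the ones that Go's
// regexp package rejects, keyed by pattern, with the compile error.
func CheckPatterns(patterns []string) map[string]error {
	bad := make(map[string]error)
	for _, p := range patterns {
		if _, err := regexp.Compile(p); err != nil {
			bad[p] = err
		}
	}
	return bad
}

func main() {
	patterns := []string{
		`(?i)\b(error|warning)\b`, // supported
		`ERROR(?!: Expected)`,     // negative lookahead: rejected
		`(\w+) \1`,                // backreference: rejected
	}
	for p, err := range CheckPatterns(patterns) {
		fmt.Printf("%q rejected: %v\n", p, err)
	}
}
```

The lookahead case is the kind of logic that a filter chain replaces: an include filter for the error pattern followed by an exclude filter for "Error: Expected", as in the Error Monitoring example.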

Basic Matching

Pattern   Description                                      Example
literal   Matches the exact text.                          "ERROR" matches any log with "ERROR".
.         Matches any single character (except newline).   "user." matches "userA", "userB", etc.
a|b       Matches expression a OR expression b.            "error|fail" matches lines with "error" or "fail".

Anchors and Boundaries

Anchors tie your pattern to a specific position in the line.

Pattern   Description                          Example
^         Matches the beginning of the line.   "^ERROR" matches lines starting with "ERROR".
$         Matches the end of the line.         "crashed$" matches lines ending with "crashed".
\b        Matches a word boundary.             "\berror\b" matches "error" but not "terrorist".

Character Classes

Pattern   Description                                   Example
[abc]     Matches a, b, or c.                           "[aeiou]" matches any vowel.
[^abc]    Matches any character except a, b, or c.      "[^0-9]" matches any non-digit.
[a-z]     Matches any character in the range a to z.    "[a-zA-Z]" matches any letter.
\d        Matches any digit ([0-9]).                    \d{3} matches three digits, like "123".
\w        Matches any word character ([a-zA-Z0-9_]).    \w+ matches one or more word characters.
\s        Matches any whitespace character.             \s+ matches one or more spaces or tabs.

Quantifiers

Quantifiers specify how many times a character or group must appear.

Pattern   Description                Example
*         Zero or more times.        "a*" matches "", "a", "aa".
+         One or more times.         "a+" matches "a", "aa", but not "".
?         Zero or one time.          "colou?r" matches "color" and "colour".
{n}       Exactly n times.           \d{4} matches a 4-digit number.
{n,}      n or more times.           \d{2,} matches numbers with 2 or more digits.
{n,m}     Between n and m times.     \d{1,3} matches numbers with 1 to 3 digits.

Grouping

Pattern   Description                                       Example
(...)     Groups an expression and captures the match.      (ERROR…
(?:...)   Groups an expression without capturing. Faster.   (?:ERROR…

Flags and Modifiers

Flags are placed at the beginning of a pattern to change its behavior.

Pattern   Description
(?i)      Case-insensitive matching.
(?m)      Multi-line mode (^ and $ match start/end of lines).

Example: "(?i)error" matches "error", "ERROR", and "Error".
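
Both flags can be checked directly with Go's regexp package. In the sketch below, the `matches` helper and the sample text are invented for illustration; whether a single LogWisp entry can contain embedded newlines is an assumption, and (?m) only matters if it can.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches compiles pattern and reports whether it matches text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

func main() {
	// (?i): letter case is ignored.
	fmt.Println(matches(`(?i)error`, "Fatal ERROR")) // true

	// Hypothetical entry that spans two lines.
	entry := "request failed\nERROR: upstream timeout"

	// Without (?m), ^ matches only at the start of the whole text.
	fmt.Println(matches(`^ERROR`, entry)) // false

	// With (?m), ^ also matches right after each newline.
	fmt.Println(matches(`(?m)^ERROR`, entry)) // true
}
```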

Practical Examples for Logging

  • Match an IP Address:

    \b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b
    
  • Match HTTP 4xx or 5xx Status Codes:

    "status[= ](4|5)\d{2}"
    
  • Match a slow database query (100 ms or more):

    "Query took [1-9]\d{2,}ms"
    
  • Match key-value pairs:

    "user=(admin|guest)"
    
  • Match Java exceptions:

    "Exception:|at .+\.java:\d+"
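
A few of these patterns can be exercised against sample lines to confirm where they start and stop matching. The variable names and log lines below are made up for this sketch; the patterns are copied from the list above.

```go
package main

import (
	"fmt"
	"regexp"
)

// Patterns copied from the examples above, written as Go raw strings.
var (
	ipLike    = regexp.MustCompile(`\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b`)
	status    = regexp.MustCompile(`status[= ](4|5)\d{2}`)
	slowQuery = regexp.MustCompile(`Query took [1-9]\d{2,}ms`)
)

func main() {
	// FindString returns the matched text, which is handy for spot checks.
	fmt.Println(ipLike.FindString("client 10.0.0.7 connected")) // 10.0.0.7

	// The pattern checks shape only, not the 0-255 range of each octet.
	fmt.Println(ipLike.MatchString("999.999.999.999")) // true

	fmt.Println(status.MatchString("status=404")) // true
	fmt.Println(status.MatchString("status=200")) // false

	fmt.Println(slowQuery.MatchString("Query took 99ms"))  // false
	fmt.Println(slowQuery.MatchString("Query took 100ms")) // true
}
```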