Disk Management

← Logging Guide | ← Back to README | Heartbeat Monitoring →

Comprehensive guide to log file rotation, retention policies, and disk space management.

File Rotation

Automatic Rotation

Log files are automatically rotated when they reach the configured size limit:

logger.InitWithDefaults(
    "max_size_mb=100",  // Rotate at 100MB
)

Rotation Behavior

  1. Size Check: Before each write, the logger checks if the file would exceed max_size_mb
  2. New File Creation: Creates a new file with timestamp: appname_240115_103045_123456789.log
  3. Seamless Transition: No logs are lost during rotation
  4. Old File Closure: Previous file is properly closed and synced
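
The size check in step 1 can be sketched as a small predicate. This is a minimal illustration, not the logger's actual code: `needsRotation` is a hypothetical name, and the limit is passed in bytes because the text does not say whether `max_size_mb` counts 10^6 or 2^20 bytes per megabyte.

```go
package main

// needsRotation reports whether appending pending bytes to a file that
// already holds current bytes would exceed maxBytes.
// A non-positive maxBytes is treated as "no limit" (an assumption of this sketch).
func needsRotation(current, pending, maxBytes int64) bool {
	if maxBytes <= 0 {
		return false
	}
	return current+pending > maxBytes
}
```

Checking before the write rather than after means a record is never split across two files; the whole record goes into the new file.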

File Naming Convention

{name}_{YYMMDD}_{HHMMSS}_{nanoseconds}.{extension}

Example: myapp_240115_143022_987654321.log

Components:

  • name: Configured log name
  • YYMMDD: Date (year, month, day)
  • HHMMSS: Time (hour, minute, second)
  • nanoseconds: For uniqueness
  • extension: Configured extension
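
As an illustration of the naming convention, the following sketch builds a file name from a timestamp using Go's standard `time` layout strings. `rotatedFileName` and `exampleFileName` are hypothetical helpers, and zero-padding the nanosecond field to nine digits is an assumption based on the example above.

```go
package main

import (
	"fmt"
	"time"
)

// rotatedFileName builds {name}_{YYMMDD}_{HHMMSS}_{nanoseconds}.{extension}.
// "060102" and "150405" are Go reference-time layouts for YYMMDD and HHMMSS.
func rotatedFileName(name, ext string, t time.Time) string {
	return fmt.Sprintf("%s_%s_%09d.%s",
		name, t.Format("060102_150405"), t.Nanosecond(), ext)
}

// exampleFileName reproduces the example name shown in the text.
func exampleFileName() string {
	t := time.Date(2024, time.January, 15, 14, 30, 22, 987654321, time.UTC)
	return rotatedFileName("myapp", "log", t)
}
```

With fixed-width fields, names for the same log sort lexicographically in creation order, which makes "oldest first" listings straightforward.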

Disk Space Management

Space Limits

The logger enforces two types of space limits:

logger.InitWithDefaults(
    "max_total_size_mb=1000",   // Total log directory size
    "min_disk_free_mb=5000",    // Minimum free disk space
)
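
To see how a total-size limit could be evaluated, here is a minimal sketch that sums the sizes of log files in a directory. `logDirSizeBytes` is a hypothetical helper; the logger's own accounting may differ (for example, it may track sizes incrementally instead of scanning the directory).

```go
package main

import (
	"os"
	"path/filepath"
)

// logDirSizeBytes sums the sizes of regular files in dir whose names match
// pattern (for example "*.log"). Subdirectories are not descended into.
func logDirSizeBytes(dir, pattern string) (int64, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return 0, err
	}
	var total int64
	for _, e := range entries {
		if !e.Type().IsRegular() {
			continue
		}
		matched, err := filepath.Match(pattern, e.Name())
		if err != nil {
			return 0, err
		}
		if !matched {
			continue
		}
		info, err := e.Info()
		if err != nil {
			return 0, err
		}
		total += info.Size()
	}
	return total, nil
}
```

The result can be compared against `max_total_size_mb` converted to bytes; free space on the volume has to come from a platform-specific call and is left out of this sketch.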

Automatic Cleanup

When limits are exceeded, the logger:

  1. Identifies oldest log files
  2. Deletes them until space requirements are met
  3. Preserves the current active log file
  4. Logs cleanup actions for audit
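
The selection in steps 1 to 3 can be sketched as a pure function over file metadata. This is an illustration under stated assumptions, not the logger's implementation: `logFile`, `filesToDelete`, and `exampleCleanup` are hypothetical names, and "oldest" is taken to mean earliest modification time.

```go
package main

import (
	"sort"
	"time"
)

// logFile carries the metadata needed to choose cleanup candidates.
type logFile struct {
	Name    string
	Size    int64
	ModTime time.Time
}

// filesToDelete returns the names of the oldest files to remove so that the
// remaining total is at most maxTotal bytes. The active file is never chosen.
func filesToDelete(files []logFile, active string, maxTotal int64) []string {
	var total int64
	for _, f := range files {
		total += f.Size
	}
	// Sort a copy so the caller's slice keeps its order.
	sorted := append([]logFile(nil), files...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].ModTime.Before(sorted[j].ModTime)
	})
	var doomed []string
	for _, f := range sorted {
		if total <= maxTotal {
			break
		}
		if f.Name == active {
			continue
		}
		doomed = append(doomed, f.Name)
		total -= f.Size
	}
	return doomed
}

// exampleCleanup: three 40-byte files with a 50-byte limit; c.log is active.
func exampleCleanup() []string {
	base := time.Date(2024, time.January, 15, 10, 0, 0, 0, time.UTC)
	files := []logFile{
		{"c.log", 40, base.Add(2 * time.Hour)},
		{"a.log", 40, base},
		{"b.log", 40, base.Add(time.Hour)},
	}
	return filesToDelete(files, "c.log", 50)
}
```

In the example, the two older files are selected and the active file alone remains under the limit.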

Example Configuration

// Conservative: Strict limits
logger.InitWithDefaults(
    "max_size_mb=50",          // 50MB files
    "max_total_size_mb=500",   // 500MB total
    "min_disk_free_mb=1000",   // 1GB free required
)

// Generous: Large files, external archival
logger.InitWithDefaults(
    "max_size_mb=1000",        // 1GB files
    "max_total_size_mb=0",     // No total limit
    "min_disk_free_mb=100",    // 100MB free required
)

// Balanced: Production defaults
logger.InitWithDefaults(
    "max_size_mb=100",         // 100MB files
    "max_total_size_mb=5000",  // 5GB total
    "min_disk_free_mb=500",    // 500MB free required
)

Retention Policies

Time-Based Retention

Automatically delete logs older than a specified duration:

logger.InitWithDefaults(
    "retention_period_hrs=168",    // Keep 7 days
    "retention_check_mins=60",     // Check hourly
)
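
A retention check of this kind comes down to comparing a file's age with the configured period. The sketch below assumes age is measured from the file's modification time and that a non-positive period disables retention; neither is confirmed by the text. `isExpired` and `exampleExpiry` are hypothetical names.

```go
package main

import "time"

// isExpired reports whether a file last modified at modTime is older than
// retention when observed at now. A non-positive retention keeps files
// indefinitely (an assumption of this sketch).
func isExpired(modTime, now time.Time, retention time.Duration) bool {
	if retention <= 0 {
		return false
	}
	return now.Sub(modTime) > retention
}

// exampleExpiry applies retention_period_hrs=168 to an 8-day-old file and a
// 6-day-old file.
func exampleExpiry() (eightDayOld, sixDayOld bool) {
	retention := 168 * time.Hour
	now := time.Date(2024, time.January, 15, 12, 0, 0, 0, time.UTC)
	eightDayOld = isExpired(now.Add(-8*24*time.Hour), now, retention)
	sixDayOld = isExpired(now.Add(-6*24*time.Hour), now, retention)
	return eightDayOld, sixDayOld
}
```

Because checks run only every `retention_check_mins`, a file can outlive its retention period by up to one check interval.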

Retention Examples

// Daily logs, keep 30 days
logger.InitWithDefaults(
    "retention_period_hrs=720",    // 30 days
    "retention_check_mins=60",     // Check hourly
    "max_size_mb=1000",           // 1GB daily files
)

// High-frequency logs, keep 24 hours
logger.InitWithDefaults(
    "retention_period_hrs=24",     // 1 day
    "retention_check_mins=15",     // Check every 15 min
    "max_size_mb=100",            // 100MB files
)

// Compliance: Keep 90 days
logger.InitWithDefaults(
    "retention_period_hrs=2160",   // 90 days
    "retention_check_mins=360",    // Check every 6 hours
    "max_total_size_mb=100000",   // 100GB total
)

Retention Priority

When multiple policies conflict, cleanup priority is:

  1. Disk free space (highest priority)
  2. Total size limit
  3. Retention period (lowest priority)
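
The priority order above can be expressed as an ordered switch. This is a sketch with hypothetical types and names (`diskState`, `limits`, `cleanupReason`); treating a zero limit as "disabled" follows the `max_total_size_mb=0` example earlier and is assumed to apply to the free-space limit as well.

```go
package main

// diskState is a snapshot of the values the policies are evaluated against.
type diskState struct {
	FreeMB     float64 // free space on the log volume
	TotalLogMB float64 // combined size of all log files
	HasExpired bool    // at least one file is past the retention period
}

// limits mirrors min_disk_free_mb and max_total_size_mb; 0 disables a limit.
type limits struct {
	MinFreeMB  float64
	MaxTotalMB float64
}

// cleanupReason returns the highest-priority policy that currently calls for
// cleanup, or "" when none does.
func cleanupReason(s diskState, l limits) string {
	switch {
	case l.MinFreeMB > 0 && s.FreeMB < l.MinFreeMB:
		return "disk_free"
	case l.MaxTotalMB > 0 && s.TotalLogMB > l.MaxTotalMB:
		return "total_size"
	case s.HasExpired:
		return "retention"
	default:
		return ""
	}
}
```

Go evaluates `switch` cases top to bottom, so the case order encodes the priority directly.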

Adaptive Monitoring

Adaptive Disk Checks

The logger adjusts disk check frequency based on logging volume:

logger.InitWithDefaults(
    "enable_adaptive_interval=true",
    "disk_check_interval_ms=5000",    // Base: 5 seconds
    "min_check_interval_ms=100",      // Minimum: 100ms
    "max_check_interval_ms=60000",    // Maximum: 1 minute
)

How It Works

  1. Low Activity: Interval increases (up to max)
  2. High Activity: Interval decreases (down to min)
  3. Reactive Checks: Immediate check after 10MB written
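
The text does not specify how the interval is adjusted, so the following is only one plausible scheme: halve the interval under high activity, double it under low activity, and clamp to the configured bounds. `nextIntervalMs` and `reactiveCheckDue` are hypothetical names, and the 10MB threshold is taken as 10 * 1024 * 1024 bytes by assumption.

```go
package main

// nextIntervalMs returns the next disk check interval in milliseconds.
// busy selects the "high activity" branch; how activity is measured is left
// to the caller because the text does not define it.
func nextIntervalMs(currentMs, minMs, maxMs int, busy bool) int {
	next := currentMs * 2
	if busy {
		next = currentMs / 2
	}
	if next < minMs {
		next = minMs
	}
	if next > maxMs {
		next = maxMs
	}
	return next
}

// reactiveCheckDue reports whether enough bytes have been written since the
// last check to force an immediate one.
func reactiveCheckDue(bytesSinceCheck int64) bool {
	const threshold = 10 * 1024 * 1024 // assumed meaning of "10MB"
	return bytesSinceCheck >= threshold
}
```

Starting from the 5000ms base with bounds of 100ms and 60000ms, sustained high activity in this scheme reaches the minimum after six checks (2500, 1250, 625, 312, 156, 100).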

Monitoring Disk Usage

Check disk-related heartbeat messages:

logger.InitWithDefaults(
    "heartbeat_level=2",           // Enable disk stats
    "heartbeat_interval_s=300",    // Every 5 minutes
)

Output:

2024-01-15T10:30:00Z DISK type="disk" sequence=1 rotated_files=5 deleted_files=2 total_log_size_mb="487.32" log_file_count=8 current_file_size_mb="23.45" disk_status_ok=true disk_free_mb="5234.67"
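
A line in this format can be parsed by splitting on whitespace and then on the first `=`. The sketch below extracts four of the fields; `diskFields` and `parseDiskLine` are hypothetical names, and the approach assumes quoted values never contain spaces, which holds for the sample above but is not guaranteed by the text.

```go
package main

import (
	"strconv"
	"strings"
)

// diskFields holds the subset of heartbeat fields used for monitoring.
type diskFields struct {
	TotalSizeMB  float64
	FileCount    int
	DiskFreeMB   float64
	DiskStatusOK bool
}

// parseDiskLine extracts known key=value pairs from a disk heartbeat line.
// Tokens without "=" and unknown keys are ignored.
func parseDiskLine(line string) (diskFields, error) {
	var d diskFields
	for _, token := range strings.Fields(line) {
		key, val, ok := strings.Cut(token, "=")
		if !ok {
			continue
		}
		val = strings.Trim(val, `"`) // numeric values are quoted in the sample
		var err error
		switch key {
		case "total_log_size_mb":
			d.TotalSizeMB, err = strconv.ParseFloat(val, 64)
		case "log_file_count":
			d.FileCount, err = strconv.Atoi(val)
		case "disk_free_mb":
			d.DiskFreeMB, err = strconv.ParseFloat(val, 64)
		case "disk_status_ok":
			d.DiskStatusOK, err = strconv.ParseBool(val)
		}
		if err != nil {
			return d, err
		}
	}
	return d, nil
}
```

Returning an error for malformed values lets a monitor distinguish "disk unhealthy" from "heartbeat line could not be read".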

Recovery Behavior

Disk Full Handling

When disk space is exhausted:

  1. Detection: Write failure or space check triggers recovery
  2. Cleanup Attempt: Delete oldest logs to free space
  3. Status Update: Set disk_status_ok=false if cleanup fails
  4. Log Dropping: New logs dropped until space available
  5. Recovery: Automatic retry on next disk check
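
The drop-and-recover cycle in steps 3 to 5 can be modelled as a small state holder. This is a simplified sketch with hypothetical names (`diskGuard`, `onDiskCheck`, `accept`); the real logger presumably does this concurrently and alongside its cleanup logic.

```go
package main

// diskGuard tracks whether logging is currently allowed and how many
// records were dropped while it was not.
type diskGuard struct {
	statusOK bool
	dropped  int
}

// onDiskCheck records the outcome of a disk check. spaceOK should be true
// when enough space is available after any cleanup attempt.
func (g *diskGuard) onDiskCheck(spaceOK bool) {
	g.statusOK = spaceOK
}

// accept reports whether a record may be written. While the disk status is
// not OK, records are counted as dropped and rejected.
func (g *diskGuard) accept() bool {
	if g.statusOK {
		return true
	}
	g.dropped++
	return false
}
```

Because the status is re-evaluated on every disk check, logging resumes on its own once space becomes available; no restart is involved.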

Monitoring Recovery

# Check for disk issues in logs
grep "disk full" /var/log/myapp/*.log
grep "cleanup failed" /var/log/myapp/*.log

# Monitor disk status in heartbeats
grep "disk_status_ok=false" /var/log/myapp/*.log

Manual Intervention

If automatic cleanup fails:

# Check disk usage
df -h /var/log

# Find large log files
find /var/log/myapp -name "*.log" -size +100M

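# Manual cleanup: remove the 20 oldest files (ls -t lists newest first)
# Caution: with 20 or fewer files this also removes the active log file
ls -t /var/log/myapp/*.log | tail -n 20 | xargs rm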

# Verify space
df -h /var/log

Best Practices

1. Plan for Growth

Estimate log volume and set appropriate limits:

// Calculate required space:
// - Average log entry: 200 bytes
// - Entries per second: 100
// - Daily volume: 200 B * 100/s * 86400 s = 1,728,000,000 B ≈ 1.7GB

logger.InitWithDefaults(
    "max_size_mb=2000",          // 2GB files (~ 1 day)
    "max_total_size_mb=15000",   // 15GB (~ 1 week)
    "retention_period_hrs=168",   // 7 days
)
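
The estimate above can be turned into a small calculation. This sketch uses decimal megabytes (10^6 bytes) to match the arithmetic in the comment; whether the logger's `_mb` settings use the same unit is not stated. `dailyVolumeBytes` and `retentionBudgetMB` are hypothetical helpers, and the 25% headroom in the usage note is an arbitrary illustration.

```go
package main

// dailyVolumeBytes estimates how many bytes are written per day.
func dailyVolumeBytes(avgEntryBytes, entriesPerSecond int64) int64 {
	const secondsPerDay = 86400
	return avgEntryBytes * entriesPerSecond * secondsPerDay
}

// retentionBudgetMB returns the space, in decimal megabytes, needed to hold
// the given number of days of logs plus headroomPercent extra.
func retentionBudgetMB(dailyBytes, days, headroomPercent int64) int64 {
	const bytesPerMB = 1000 * 1000
	return dailyBytes * days * (100 + headroomPercent) / 100 / bytesPerMB
}
```

For 200-byte entries at 100 per second, `dailyVolumeBytes(200, 100)` is 1,728,000,000 bytes, and `retentionBudgetMB(1728000000, 7, 25)` gives 15120, close to the 15000 used for `max_total_size_mb` above.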

2. External Archival

For long-term storage, implement external archival:

// Configure for archival
logger.InitWithDefaults(
    "max_size_mb=1000",          // 1GB files for easy transfer
    "max_total_size_mb=10000",   // 10GB local buffer
    "retention_period_hrs=48",    // 2 days local
)

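// Archive completed files.
// isCurrentLogFile and archiveFile are helpers you supply.
func archiveCompletedLogs(archivePath string) error {
    files, err := filepath.Glob("/var/log/myapp/*.log")
    if err != nil {
        return err
    }
    for _, file := range files {
        if isCurrentLogFile(file) {
            continue // Never touch the file the logger is writing to
        }
        // Move to archive storage (S3, NFS, etc.)
        if err := archiveFile(file, archivePath); err != nil {
            return err
        }
        // Remove the local copy only after it was archived successfully
        if err := os.Remove(file); err != nil {
            return err
        }
    }
    return nil
}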

3. Monitor Disk Health

Set up alerts for disk issues:

// Parse heartbeat logs for monitoring
type DiskStats struct {
    TotalSizeMB    float64
    FileCount      int
    DiskFreeMB     float64
    DiskStatusOK   bool
}

func monitorDiskHealth(logLine string) {
    if strings.Contains(logLine, "type=\"disk\"") {
        stats := parseDiskHeartbeat(logLine)
        
        if !stats.DiskStatusOK {
            alert("Log disk unhealthy")
        }
        
        if stats.DiskFreeMB < 1000 {
            alert("Low disk space: %.0fMB free", stats.DiskFreeMB)
        }
        
        if stats.FileCount > 100 {
            alert("Too many log files: %d", stats.FileCount)
        }
    }
}

4. Separate Log Volumes

Use dedicated volumes for logs:

# Create dedicated log volume
mkdir -p /mnt/logs
mount /dev/sdb1 /mnt/logs

// Configure logger
logger.InitWithDefaults(
    "directory=/mnt/logs/myapp",
    "max_total_size_mb=50000",   // Use most of volume
    "min_disk_free_mb=1000",     // Leave 1GB free
)

5. Test Cleanup Behavior

Verify cleanup works before production:

// Test configuration
func TestDiskCleanup(t *testing.T) {
    logger := log.NewLogger()
    logger.InitWithDefaults(
        "directory=./test_logs",
        "max_size_mb=1",             // Small files
        "max_total_size_mb=5",       // Low limit
        "retention_period_hrs=0.01", // 36 seconds
        "retention_check_mins=0.5",  // 30 seconds
    )
    
    // Generate ~10MB of logs so the 5MB total limit is exceeded
    for i := 0; i < 10000; i++ {
        logger.Info(strings.Repeat("x", 1000))
    }
    
    time.Sleep(45 * time.Second)
    
    // Verify cleanup occurred
    files, err := filepath.Glob("./test_logs/*.log")
    if err != nil {
        t.Fatalf("Glob failed: %v", err)
    }
    if len(files) > 5 {
        t.Errorf("Cleanup failed: %d files remain", len(files))
    }
}
