Master the systemd-analyze blame command to dramatically improve your Linux system’s boot performance. This comprehensive guide will teach you how to identify bottlenecks, analyze service dependencies, and optimize startup times for faster system boots.
Understanding Systemd-Analyze Blame
Systemd-analyze blame is a powerful diagnostic tool that helps system administrators and developers identify which services are slowing down Linux boot times. The command lists all running units ordered by the time they took to initialize, making it essential for performance optimization and troubleshooting.
When you run systemd-analyze blame, you get a detailed breakdown of service startup times, allowing you to pinpoint exactly which processes are consuming the most time during system initialization. This information is crucial for optimizing boot performance, especially on servers and development machines where fast startup times are critical.
How Systemd-Analyze Blame Works
The blame command analyzes the systemd journal to determine how long each unit spent in the “activating” state before transitioning to “active”. It measures the time from when a service starts until it completes its initialization process. However, it’s important to understand that this tool has some limitations:
- It doesn’t display results for services with Type=simple, because systemd considers these services started immediately (you can confirm a unit’s type as shown after this list)
- The output might be misleading if one service waits for another to complete
- It only shows startup time, not execution queue time
- Device units that transition directly from “inactive” to “active” aren’t measured
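If a unit you expect to see is missing from the output, checking its service type is usually the quickest explanation. A minimal check, using the same placeholder convention as the rest of this guide:
# Print the unit's Type; Type=simple units are treated as started immediately, so blame has nothing to measure
systemctl show -p Type [service-name]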
Getting Started with Systemd-Analyze Commands
Before diving deep into blame analysis, let’s explore the basic systemd-analyze commands that provide a comprehensive view of your system’s boot performance.
Check Total Boot Time
The first step in boot optimization is understanding your current boot performance:
systemd-analyze time
Sample output:
Startup finished in 2.584s (kernel) + 19.176s (initrd) + 47.847s (userspace) = 1min 9.608s
multi-user.target reached after 47.820s in userspace
This breakdown shows:
- Kernel time: Time spent in kernel before userspace
- Initrd time: Time in initial RAM disk before normal userspace
- Userspace time: Time for normal system initialization
- Total boot time: Combined duration of all phases
Analyze Service Startup Times
Now let’s use the blame command to identify the slowest services:
systemd-analyze blame | head
Sample output:
32.875s pmlogger.service
20.905s systemd-networkd-wait-online.service
13.299s dev-vda1.device
8.456s mariadb.service
5.234s NetworkManager-wait-online.service
3.108s network.service
2.421s plymouth-quit-wait.service
1.890s snapd.service
1.234s ufw.service
987ms systemd-journald.service
Interpreting Systemd-Analyze Blame Results
Understanding the output is crucial for effective optimization. Let’s break down what each line represents and how to interpret the data.
Reading the Output Format
Each line in the blame output follows this format:
[time] [service-name]
- Time: Duration in seconds (s) or milliseconds (ms)
- Service name: The systemd unit that took that long to start
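Because the list can be long, it is often handy to filter it. Here is a small sketch that keeps only units reporting plain seconds above a chosen threshold (entries reported in minutes or milliseconds are ignored by this rough filter):
# Show units that took more than 5 seconds to initialize
systemd-analyze blame | awk '$1 ~ /^[0-9.]+s$/ && $1+0 > 5'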
Common Slow Services and Their Impact
Based on extensive analysis across various Linux distributions, here are the most common services that typically cause boot delays:
Network Services
NetworkManager-wait-online.service
systemd-networkd-wait-online.service
These services wait for network connectivity, which can add 10-30 seconds to boot time, especially on systems with slow or unreliable network connections.
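Before deciding what to do about them, it is worth checking which of these wait services are actually enabled on your system:
# Reports enabled, disabled, static, or an error if the unit does not exist
systemctl is-enabled NetworkManager-wait-online.service
systemctl is-enabled systemd-networkd-wait-online.service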
Database Services
mariadb.service
postgresql.service
mysql.service
Database services often require significant startup time due to initialization, recovery checks, and data loading.
Logging Services
pmlogger.service
rsyslog.service
System logging services can be slow, particularly when processing large log files or performing maintenance tasks.
Security Services
ufw.service
firewalld.service
apparmor.service
Security services add time due to rule loading and system hardening processes.
Practical Boot Optimization Strategies
Now that you can identify slow services, let’s implement effective optimization strategies.
Strategy 1: Disable Unnecessary Services
Before disabling any service, research its purpose and impact:
# Check service description
systemctl status [service-name]
# Check if service is required by other services
systemctl list-dependencies [service-name]
Example: Disabling NetworkManager-wait-online.service
# First, check if you really need it
systemctl status NetworkManager-wait-online.service
# If not needed, disable it
sudo systemctl disable NetworkManager-wait-online.service
sudo systemctl mask NetworkManager-wait-online.service
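After the next reboot, a quick check confirms the unit no longer contributes to the measured boot time:
# The grep should return nothing once the service is masked
systemd-analyze blame | grep -i wait-online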
Strategy 2: Optimize Service Configuration
For services you can’t disable, optimize their configuration:
Database Optimization
# For MySQL/MariaDB, optimize my.cnf
[mysqld]
innodb_buffer_pool_size = 256M
innodb_log_file_size = 64M
skip-name-resolve
Network Service Optimization
# Reduce NetworkManager timeout
sudo nano /etc/NetworkManager/NetworkManager.conf
[connection]
ipv4.dhcp-timeout=10
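The [connection] section above changes the DHCP timeout default for every profile. As an alternative, the same property can be set on a single profile with nmcli (the profile name "Wired connection 1" is only an example):
# Apply the shorter DHCP timeout to one connection profile only
nmcli connection modify "Wired connection 1" ipv4.dhcp-timeout 10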
Strategy 3: Use Parallel Boot
Systemd already starts units in parallel wherever their dependencies allow, so on multi-core systems the practical lever is removing unnecessary ordering dependencies and overly long timeouts rather than flipping a single switch:
# Check the manager's default start timeout (configured in /etc/systemd/system.conf)
systemctl show --property=DefaultTimeoutStartSec
# Reload unit files after changing dependencies or adding drop-ins
sudo systemctl daemon-reload
Advanced Systemd-Analyze Techniques
Critical Chain Analysis
Use the critical-chain command to understand service dependencies:
systemd-analyze critical-chain
Sample output:
multi-user.target @47.820s
└─pmie.service @35.968s +548ms
└─pmcd.service @33.715s +2.247s
└─network-online.target @33.712s
└─systemd-networkd-wait-online.service @12.804s +20.905s
This shows the dependency chain and helps identify bottlenecks in the startup sequence.
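You can also pass a unit name to focus the analysis on one part of the chain, which is useful when a single target dominates the boot time:
# Inspect only the chain leading up to the network-online target
systemd-analyze critical-chain network-online.target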
Visual Boot Analysis
Generate a visual representation of the boot process:
# Create SVG boot chart
systemd-analyze plot > boot-analysis.svg
# Create detailed boot report
systemd-analyze dump > boot-dump.txt
Performance Comparison
Compare boot performance before and after optimization:
# Before optimization
systemd-analyze blame > before-optimization.txt
# After optimization
systemd-analyze blame > after-optimization.txt
# Compare results
diff before-optimization.txt after-optimization.txt
Real-World Optimization Examples
Case Study 1: Development Machine Boot Time Reduction
Initial State: 2 minutes 15 seconds boot time
Problem Services:
- NetworkManager-wait-online.service: 18s
- docker.service: 25s
- mysql.service: 12s
Optimization Steps:
# Disable network wait for development
sudo systemctl disable NetworkManager-wait-online.service
# Defer Docker startup until first use via socket activation
# (docker.socket ships with the standard Docker packages)
sudo systemctl disable docker.service
sudo systemctl enable docker.socket
# Optimize MySQL for development
sudo nano /etc/mysql/my.cnf
[mysqld]
innodb_buffer_pool_size = 128M
innodb_flush_log_at_trx_commit = 2
Result: 45 seconds boot time (67% improvement)
Case Study 2: Server Boot Optimization
Initial State: 3 minutes 45 seconds boot time
Problem Services:
- systemd-networkd-wait-online.service: 30s
- rsyslog.service: 15s
- ufw.service: 8s
Optimization Steps:
# Limit how long the network wait service may block boot
sudo mkdir -p /etc/systemd/system/systemd-networkd-wait-online.service.d
sudo nano /etc/systemd/system/systemd-networkd-wait-online.service.d/timeout.conf
[Service]
TimeoutStartSec=10
# Optimize rsyslog for server environment
sudo nano /etc/rsyslog.conf
$ModLoad immark
$MarkMessagePeriod 0
# Pre-load UFW rules
sudo ufw enable
sudo systemctl enable ufw.service
Result: 1 minute 20 seconds boot time (64% improvement)
Troubleshooting Common Issues
Service Not Appearing in Blame Output
If a service doesn’t appear in systemd-analyze blame output:
- Check service type:
systemctl show [service-name] -p Type
- Verify service is enabled:
systemctl is-enabled [service-name]
- Check service status:
systemctl status [service-name]
Inconsistent Boot Times
If boot times vary significantly:
# Check for hardware issues
sudo dmesg | grep -i error
# Monitor disk I/O
sudo iostat -x 1 5
# Check memory usage
free -h
Services Taking Too Long
For services that consistently take too long:
# Check service logs
journalctl -u [service-name] -b
# Analyze service dependencies
systemctl list-dependencies [service-name]
# Check resource usage
systemd-cgtop
Best Practices for Boot Optimization
Do’s and Don’ts
DO:
- Research services before disabling them
- Test changes in non-production environments first
- Document all changes for rollback purposes
- Monitor system performance after optimization
- Use incremental optimization approach
DON’T:
- Disable critical system services blindly
- Ignore security implications
- Make multiple changes simultaneously
- Forget to backup configurations
- Ignore service dependencies
Monitoring and Maintenance
Establish a monitoring routine:
# Weekly boot performance check
systemd-analyze blame > /var/log/boot-performance-$(date +%Y%m%d).txt
# Monthly performance comparison
systemd-analyze time >> /var/log/boot-trend.log
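A simple way to actually schedule the weekly check is a cron entry (the schedule and paths below are assumptions; note that % must be escaped in crontab):
# Run every Monday at 07:00, e.g. added via `sudo crontab -e`
0 7 * * 1 /usr/bin/systemd-analyze blame > /var/log/boot-performance-$(date +\%Y\%m\%d).txt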
Create automated alerts for performance degradation:
#!/bin/bash
# boot-monitor.sh
# Extract the total startup time (the value after "="), converting "Xmin Ys" into seconds
TOTAL=$(systemd-analyze time | grep "Startup finished" | sed 's/.*= //')
BOOT_TIME=$(echo "$TOTAL" | awk '{ sec=0; for (i=1; i<=NF; i++) { if ($i ~ /min/) sec += $i*60; else sec += $i } print sec }')
THRESHOLD=120
if [ "$(echo "$BOOT_TIME > $THRESHOLD" | bc -l)" -eq 1 ]; then
echo "Boot time warning: ${BOOT_TIME}s exceeds threshold of ${THRESHOLD}s" | mail -s "Boot Performance Alert" admin@example.com
fi
Integration with DevOps Workflows
CI/CD Pipeline Integration
Integrate boot performance testing into your CI/CD pipeline:
# Example GitHub Actions workflow
name: Boot Performance Test
on: [push, pull_request]
jobs:
  boot-performance:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Test Boot Performance
        run: |
          sudo apt-get update
          sudo apt-get install -y systemd
          systemd-analyze blame > boot-times.txt
          python scripts/check-boot-performance.py boot-times.txt
Infrastructure as Code
Include boot optimization in your infrastructure configuration:
# Ansible playbook example
- name: Optimize boot performance
  systemd:
    name: NetworkManager-wait-online.service
    enabled: no
    state: stopped
  when: environment == "development"

- name: Configure systemd timeouts
  lineinfile:
    path: /etc/systemd/system.conf
    regexp: '^DefaultTimeoutStartSec='
    line: 'DefaultTimeoutStartSec=10s'
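Edits to /etc/systemd/system.conf only take effect once the systemd manager re-executes, so a follow-up step (for example a handler triggered by the task above) is worth adding; a minimal sketch:
# Re-execute the systemd manager so the new DefaultTimeoutStartSec is picked up
sudo systemctl daemon-reexec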
PM2 Process Management for Application Services
While systemd-analyze blame helps optimize system boot performance, managing application services effectively is equally important. PM2 is a popular process manager built for Node.js that can also supervise Python applications, and it works alongside systemd for optimal performance.
Essential PM2 Commands
PM2 Save and Resurrect
The pm2 save and pm2 resurrect commands are crucial for process persistence:
# Save current process list to startup script
pm2 save
# Resurrect saved processes on system restart
pm2 resurrect
# Generate the startup command that hooks PM2 into your init system
pm2 startup
These commands ensure your applications automatically restart after system reboots, maintaining service availability.
Complete PM2 Workflow
# Start your Python/FastAPI application
pm2 start main.py --name "fastapi-app"
# Monitor process status
pm2 status
# View logs
pm2 logs fastapi-app
# Save current process configuration
pm2 save
# Generate startup script for systemd
pm2 startup systemd
# Enable PM2 to start on boot
sudo env PATH=$PATH:/usr/bin /usr/lib/node_modules/pm2/bin/pm2 startup systemd -u $USER --hp $HOME
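On systemd-based distributions, pm2 startup installs a unit named after the user (typically pm2-<username>.service); assuming that naming, you can confirm it is in place with:
systemctl status pm2-$USER.service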
PM2 with Python/FastAPI Best Practices
Configuration File Approach
Create an ecosystem configuration file for better management:
// ecosystem.config.js
module.exports = {
  apps: [{
    name: 'fastapi-app',
    script: 'main.py',            // main.py starts uvicorn itself (see the FastAPI setup below)
    interpreter: 'python3',
    instances: 1,
    exec_mode: 'fork',            // PM2's cluster mode is Node.js-only; scale Python apps with uvicorn/gunicorn workers
    autorestart: true,
    watch: false,
    max_memory_restart: '1G',
    env: {
      ENVIRONMENT: 'production'   // example app-level environment variable
    },
    error_file: './logs/err.log',
    out_file: './logs/out.log',
    log_file: './logs/combined.log',
    time: true
  }]
};
FastAPI Production Setup
# Install required packages
pip install fastapi uvicorn gunicorn
npm install -g pm2
# Create production startup script
# main.py
from fastapi import FastAPI
import uvicorn

app = FastAPI()

@app.get("/")
async def root():
    return {"message": "Hello World"}

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)
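Once the app is running under PM2, a quick smoke test confirms it is serving requests (port 8000 as configured above):
# Expect {"message":"Hello World"}
curl http://127.0.0.1:8000/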
PM2 Process Management Commands
# Start with configuration file
pm2 start ecosystem.config.js
# Stop specific process
pm2 stop fastapi-app
# Restart process
pm2 restart fastapi-app
# Delete process
pm2 delete fastapi-app
# Reload with zero downtime
pm2 reload fastapi-app
# Monitor CPU and memory usage
pm2 monit
# List all processes with details
pm2 show fastapi-app
Integrating PM2 with Systemd
For optimal boot performance, integrate PM2 with systemd:
# Create systemd service for PM2
sudo nano /etc/systemd/system/pm2-user.service
[Unit]
Description=PM2 process manager
After=network.target
[Service]
Type=forking
User=your_username
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
Environment=PATH=/usr/bin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
Environment=PM2_HOME=/home/your_username/.pm2
PIDFile=/home/your_username/.pm2/pm2.pid
Restart=on-failure
ExecStart=/usr/lib/node_modules/pm2/bin/pm2 resurrect
ExecReload=/usr/lib/node_modules/pm2/bin/pm2 reload all
ExecStop=/usr/lib/node_modules/pm2/bin/pm2 kill
[Install]
WantedBy=multi-user.target
# Enable and start the service
sudo systemctl enable pm2-user.service
sudo systemctl start pm2-user.service
# Check status
sudo systemctl status pm2-user.service
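After the next reboot, you can also see how much the PM2 unit itself adds to boot time, tying this back to the blame analysis from earlier:
# Show the startup cost of the PM2 service created above
systemd-analyze blame | grep pm2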
Performance Monitoring with PM2
# Real-time monitoring
pm2 monit
# Check process metrics
pm2 show fastapi-app
# Refresh the process overview continuously using the standard watch utility
watch pm2 status
# Generate performance report
pm2 report
Common PM2 Troubleshooting
# Check PM2 logs
pm2 logs --lines 100
# Restart all processes
pm2 restart all
# Clear logs
pm2 flush
# Reset PM2 completely
pm2 kill
pm2 resurrect
Performance Measurement Tools
Complementary Tools
Enhance your analysis with these additional tools:
# Bootchart for detailed visualization
sudo apt-get install bootchart
bootchart
# Systemd-cgtop for resource monitoring
systemd-cgtop
# Process monitoring during boot
ps aux --sort=-%cpu | head
# PM2 process monitoring
pm2 monit
Benchmarking Script
Create a comprehensive benchmarking script:
#!/bin/bash
# boot-benchmark.sh
echo "=== Boot Performance Benchmark ==="
echo "Date: $(date)"
echo "Hostname: $(hostname)"
echo "Kernel: $(uname -r)"
echo ""
echo "=== Boot Time Analysis ==="
systemd-analyze time
echo ""
echo "=== Top 10 Slowest Services ==="
systemd-analyze blame | head -10
echo ""
echo "=== Critical Chain Analysis ==="
systemd-analyze critical-chain
echo ""
echo "=== System Resources ==="
free -h
df -h
lscpu | grep "Model name"
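Make the script executable and capture a dated report so runs can be compared over time (the file name is only a suggestion):
chmod +x boot-benchmark.sh
./boot-benchmark.sh > boot-benchmark-$(date +%Y%m%d).txt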
Security Considerations
While optimizing boot performance, maintain security:
Security vs Performance Balance
- Don’t disable security services for performance gains
- Keep firewall enabled but optimize rule loading
- Maintain logging for security auditing
- Preserve authentication services
Secure Optimization Practices
# Verify service security impact
systemctl list-dependencies [service-name] --reverse
# Check security policies
sestatus
apparmor_status
# Audit changes
auditctl -w /etc/systemd/ -p wa -k systemd_changes
Future Trends in Boot Optimization
Emerging Technologies
Stay updated with these boot optimization trends:
- Systemd-boot: Faster bootloader alternatives
- eBPF-based monitoring: Real-time boot performance tracking
- Container-native boot: Optimized for containerized environments
- AI-driven optimization: Machine learning for automatic tuning
Preparing for Future Changes
# Monitor systemd updates
apt-cache policy systemd
# Test new systemd features
systemd-analyze --version
# Stay informed about boot optimization techniques
# Follow systemd mailing lists and documentation
FAQ
What is systemd-analyze blame and how does it work?
Systemd-analyze blame is a diagnostic command that lists all running systemd units ordered by the time they took to initialize during system boot. It analyzes the systemd journal to measure how long each service spends in the “activating” state before transitioning to “active”. This information helps identify bottlenecks in the boot process and optimize system startup times. The command is particularly useful for system administrators and developers who need to improve boot performance on servers and development machines.
Why doesn’t systemd-analyze blame show all services?
Systemd-analyze blame doesn’t display results for services with Type=simple because systemd considers these services started immediately upon execution, making it impossible to measure their initialization time. Additionally, device units that transition directly from “inactive” to “active” state without passing through an “activating” state are also not measured. The command only shows services that go through the activation process, which is why some services might be missing from the output even though they’re running on your system.
How do I interpret the time units in systemd-analyze blame output?
The output shows time in seconds (s) for longer durations and milliseconds (ms) for shorter ones. For example, “32.875s pmlogger.service” means the pmlogger service took 32.875 seconds to initialize, while “987ms systemd-journald.service” indicates the journald service took 987 milliseconds. The services are listed in descending order of startup time, with the slowest services appearing first, making it easy to identify the main bottlenecks in your boot process.
Is it safe to disable services that appear in systemd-analyze blame?
Disabling services should be done with caution. Before disabling any service, research its purpose and check if other services depend on it using systemctl list-dependencies [service-name]. Some services, like network wait services, can often be safely disabled on development machines but might be essential on production servers. Always test changes in a non-production environment first and document all modifications for potential rollback. Critical system services, security services, and services required by your applications should never be disabled without thorough understanding of their impact.
How can I reduce the boot time shown by systemd-analyze blame?
Several strategies can reduce boot time: disable unnecessary services using systemctl disable, optimize service configurations (reduce timeouts, adjust resource allocation), enable parallel boot processing, optimize database configurations, and configure network services to avoid waiting for connectivity. For network-related delays, consider disabling services like NetworkManager-wait-online.service if you don’t need immediate network connectivity. Always make changes incrementally and measure the impact after each modification to track improvements.
What’s the difference between systemd-analyze blame and systemd-analyze critical-chain?
Systemd-analyze blame shows services ordered by their individual startup times, while systemd-analyze critical-chain displays the dependency tree and shows how services depend on each other during boot. The blame command helps identify which individual services are slowest, whereas critical-chain reveals the dependency bottlenecks that might be causing delays in the startup sequence. Use blame to identify slow services and critical-chain to understand how service dependencies affect overall boot performance.
How often should I run systemd-analyze blame?
Run systemd-analyze blame regularly as part of system maintenance, especially after installing new software, updating system packages, or making configuration changes. For production servers, weekly checks are recommended to monitor performance trends. For development machines, run it whenever you notice slower boot times or after significant system changes. Consider setting up automated monitoring that alerts you when boot times exceed predefined thresholds.
Can systemd-analyze blame help with troubleshooting boot failures?
Yes, systemd-analyze blame is valuable for troubleshooting boot issues. If your system is taking unusually long to boot, the blame output can identify which services are causing the delay. Combine it with journalctl -b to view boot logs and systemctl status [service-name] to check specific service statuses. For complete troubleshooting, use it alongside other systemd-analyze commands like systemd-analyze time and systemd-analyze critical-chain to get a comprehensive view of boot performance issues.
What is PM2 and how does it work with systemd-analyze blame?
PM2 is a process manager for Node.js and Python applications that works alongside systemd for optimal application performance. While systemd-analyze blame helps optimize system boot times, PM2 manages application processes with features like automatic restarts, clustering, and monitoring. The pm2 save command saves the current process list to a startup script, while pm2 resurrect restores those processes after system reboots. This combination ensures both fast system boot and reliable application management.
How do I use PM2 save and resurrect commands effectively?
Use pm2 save after starting your applications to save their configuration to the startup script. This creates a dump file in ~/.pm2/dump.pm2 that contains all running processes. Use pm2 resurrect to restore all saved processes, typically during system startup. For automation, combine these with systemd: create a systemd service that runs pm2 resurrect on boot, and use pm2 startup to generate the appropriate startup script for your system.
What are the best practices for using PM2 with Python/FastAPI applications?
For Python/FastAPI applications, use PM2 with an ecosystem configuration file that specifies the Python interpreter and runs uvicorn as the server. Set max_memory_restart to guard against memory leaks, configure proper logging paths, and keep PM2 in fork mode, since PM2’s cluster mode only applies to Node.js applications; scale Python apps with uvicorn or gunicorn workers instead. Always set watch: false in production to avoid unnecessary file monitoring. Integrate with systemd by creating a dedicated PM2 service that ensures your FastAPI applications start automatically after boot optimization with systemd-analyze blame.
Conclusion
Mastering systemd-analyze blame is essential for Linux system administrators and developers who need to optimize boot performance. By understanding how to interpret the output, identify bottlenecks, and implement effective optimization strategies, you can significantly reduce boot times and improve system responsiveness.
Remember that boot optimization is an iterative process. Start with the most impactful changes, measure results, and continue refining your approach. Regular monitoring and maintenance will ensure your system continues to boot efficiently over time.
The key is balancing performance gains with system stability and security. Always test changes thoroughly and maintain proper documentation of your optimization efforts. With the techniques and strategies outlined in this guide, you’re well-equipped to tackle even the most challenging boot performance issues.