Reading Docker Stats & top on a Production Server: CPU, RAM, and Why 'Used' Memory Looks High

When you run docker stats next to top, the numbers rarely match your mental model. Memory usage looks “too high”, CPU “sy” is larger than you expect, and free RAM is tiny — yet everything runs fine.

This post explains what you see, how it relates (CPU/RAM/cgroups), and why Linux shows little free memory but lots of “available”. I’ll also show you how to sanity‑check limits, avoid OOMs, and interpret I/O.

In the examples below I’ve replaced real domains with placeholders like example1, example2, etc.


1) What docker stats shows (and what it doesn’t)

A typical docker stats --no-stream excerpt:

CONTAINER ID   NAME                   CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O         PIDS
6c43d6d1fbd2   shop_prestashop        0.01%     233.3MiB / 768MiB     30.38%    11GB / 4.85GB     590MB / 1.26GB    4
18ae2ae3cf55   shop_db                0.01%     305MiB / 512MiB       59.58%    2.57GB / 11GB     76.3MB / 167MB    15
32efa5e4125a   example1_wordpress     0.07%     151.4MiB / 1GiB       14.79%    10.9GB / 12GB     51.8GB / 12.2GB   20
82f1e46ba22f   example1_db            0.02%     139.1MiB / 1GiB       13.58%    96.4MB / 203MB    26.3MB / 60.1MB   9
0a66cdb6dcc1   example2_forum         0.69%     37.53MiB / 1GiB       3.67%     22.9GB / 1.52GB   8.72MB / 0B       5
e0c1e88ea1e6   example2_db            0.46%     195.8MiB / 1GiB       19.12%    792MB / 22.7GB    33MB / 389MB      15
7294a4870cd9   example3_wordpress     0.14%     136.8MiB / 768MiB     17.81%    4.21GB / 1.5GB    83.5MB / 12.4MB   6
35489d5958a6   example3_db            0.01%     110.9MiB / 512MiB     21.65%    53.1MB / 53MB     13.7MB / 98.2MB   8

Key columns:

  • CPU % — percentage of a single host CPU (can exceed 100% if multi‑core and a container uses >1 core). Short snapshot, not an average.
  • MEM USAGE / LIMIT — container cgroup memory usage vs container memory limit (e.g., 768MiB). If no limit is set, it shows host memory as the denominator.
  • MEM % — usage ÷ limit.
  • NET I/O — bytes in/out since container start (cumulative).
  • BLOCK I/O — reads/writes to storage since container start (cumulative).
  • PIDS — processes/threads inside the container (counts can be higher with PHP‑FPM pools, DB threads, etc.).
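To make the MEM % relationship concrete, here is a minimal sketch that recomputes it from a MEM USAGE / LIMIT string. The function name mem_percent is made up for illustration, and the sketch assumes docker stats prints binary units (B, KiB, MiB, GiB) as in the excerpt above.

```shell
#!/bin/sh
# mem_percent: hypothetical helper; reads "USAGE / LIMIT" strings on stdin
# (as printed by docker stats) and prints usage as a percentage of the limit.
mem_percent() {
  awk '
    # Convert a value such as "233.3MiB" to bytes.
    function to_bytes(s,    n, u) {
      n = s + 0                      # numeric prefix
      u = s; gsub(/[0-9.]/, "", u)   # unit suffix
      if (u == "KiB") return n * 1024
      if (u == "MiB") return n * 1024 * 1024
      if (u == "GiB") return n * 1024 * 1024 * 1024
      return n                       # plain bytes ("B") or unknown unit
    }
    { printf "%.2f%%\n", 100 * to_bytes($1) / to_bytes($3) }
  '
}

# First row of the excerpt (shop_prestashop).
echo "233.3MiB / 768MiB" | mem_percent
```

For the shop_prestashop row this reproduces the 30.38% that docker stats reports. Small differences on other rows are expected because the displayed usage is already rounded.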

Important: MEM USAGE is what the container’s cgroup is charged for, and on Linux the Docker CLI subtracts file cache from that figure before showing it. Memory used by the kernel, by host processes, and by page cache charged elsewhere is not included. So docker stats and top will never be a simple 1:1 match.
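The cgroup boundary can be inspected by hand. As a sketch, assuming a cgroup v2 host: recent Docker CLI versions derive the displayed figure roughly as memory.current minus the inactive_file value from memory.stat. The function name and the sample numbers below are illustrative, not taken from a real container.

```shell
#!/bin/sh
# usage_without_cache: hypothetical helper.
#   $1    = value of memory.current (bytes)
#   stdin = contents of memory.stat (cgroup v2 format: "key value" per line)
# Prints memory.current minus inactive_file, in MiB.
usage_without_cache() {
  awk -v current="$1" '
    $1 == "inactive_file" { inactive = $2 }
    END { printf "%.1f MiB\n", (current - inactive) / 1048576 }
  '
}

# Sample numbers (made up): 300 MiB charged, 100 MiB of it inactive file cache.
printf 'anon 157286400\nfile 157286400\ninactive_file 104857600\n' \
  | usage_without_cache 314572800
```

On a real host the two inputs would come from the container's cgroup directory under /sys/fs/cgroup. The exact path depends on the cgroup driver in use, so it is deliberately left out here.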


2) What top shows — and why “free” RAM is low

A typical top header:

top - 09:46:41 up 8 days,  load average: 0.30, 0.36, 0.27
Tasks: 194 total,   1 running, 193 sleeping
%Cpu(s):  4.0 us, 12.0 sy, 0.0 ni, 80.0 id, 0.0 wa, 0.0 hi, 4.0 si, 0.0 st
MiB Mem :   7747.5 total,    922.0 free,   2186.6 used,   5264.2 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.   5560.9 avail Mem

How to read it:

  • free is small — Linux aggressively uses memory for page cache and slab (shown under buff/cache) to speed up disk I/O. That memory is reclaimable when apps need it.
  • available is the real headroom — here, 5560.9 MiB. This estimates how much memory can be given to apps without swapping (even with buffers/caches present).
  • CPU lines:
    • us = user‑space work (your app code).
    • sy = kernel‑space work (syscalls, networking, VFS, cgroups accounting).
    • si = software interrupts (e.g., network packet processing under load).
    • wa = I/O wait (blocked on storage — often a warning sign).
    • st = stolen time (virtualization — hypervisor took CPU).
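
The "available" figure comes from the MemAvailable field in /proc/meminfo. A small sketch that turns it into a percentage of total RAM (the helper name avail_percent is an assumption for illustration):

```shell
#!/bin/sh
# avail_percent: hypothetical helper; reads /proc/meminfo-style text on stdin
# and prints MemAvailable as a percentage of MemTotal.
avail_percent() {
  awk '
    $1 == "MemTotal:"     { total = $2 }
    $1 == "MemAvailable:" { avail = $2 }
    END { if (total > 0) printf "%.1f\n", 100 * avail / total }
  '
}

# On a Linux host, feed it the real file.
if [ -r /proc/meminfo ]; then
  avail_percent < /proc/meminfo
fi
```

With the header above (5560.9 MiB available out of 7747.5 MiB total) this works out to roughly 72%, even though free is under 1 GiB.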

Why it looks like RAM is “reserved”: Linux is deliberately filling memory with cache to avoid disk reads. It’s not a leak; it’s a feature. When a process needs memory, the kernel drops cache pages and hands RAM to the requester. That’s why available is the number to watch, not free.


3) Mapping containers to host usage

  • Cgroup limits vs host totals: A container with MEM USAGE / LIMIT = 600MiB / 1GiB contributes ~600 MiB to the host “used” number, plus any page cache the host keeps for its files outside the cgroup accounting path. With cgroup v2, cache can be more tightly attributed per cgroup if configured.
  • Multiple containers: Adding all container MEM USAGE figures will under‑count host memory because:
    • Host daemons (Docker, journald, sshd) also consume RAM.
    • Kernel memory, slab, and page cache are outside containers (unless attributed).
    • tmpfs volumes, overlayfs metadata, and filesystem cache inflate buff/cache.

Mental model:

Host memory in use (total − free) ≈ sum(container memory) + host services + kernel + page cache

So “host used” > “sum of docker stats memory” is normal.
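
To check this on your own host, you can add up the container figures and compare them with top. A minimal sketch (sum_mem_mib is a made-up helper name; it assumes the binary units docker stats prints):

```shell
#!/bin/sh
# sum_mem_mib: hypothetical helper; reads docker-stats-style "USAGE / LIMIT"
# lines on stdin and prints the total usage in MiB.
sum_mem_mib() {
  awk '
    function to_mib(s,    n, u) {
      n = s + 0
      u = s; gsub(/[0-9.]/, "", u)
      if (u == "B")   return n / 1048576
      if (u == "KiB") return n / 1024
      if (u == "GiB") return n * 1024
      return n        # MiB (or unknown unit, treated as MiB)
    }
    { total += to_mib($1) }
    END { printf "%.1f\n", total }
  '
}

# The eight MEM USAGE values from the docker stats excerpt in section 1.
printf '%s\n' \
  "233.3MiB / 768MiB" "305MiB / 512MiB" "151.4MiB / 1GiB" "139.1MiB / 1GiB" \
  "37.53MiB / 1GiB" "195.8MiB / 1GiB" "136.8MiB / 768MiB" "110.9MiB / 512MiB" \
  | sum_mem_mib
```

For the excerpt this gives about 1310 MiB, well below the 2186.6 MiB "used" in the top header. On a live host the input would be: docker stats --no-stream --format '{{.MemUsage}}' | sum_mem_mib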


4) CPU: why sy can be non‑trivial on container hosts

You may see sy (system) at ~10–15% even with low application CPU. Reasons:

  • Network stack handling (NAT/bridge/overlay), iptables/nftables rules, conntrack.
  • Filesystem work (overlayfs, page cache, dentry/inode management).
  • cgroup accounting and context switches across many containers/PHP‑FPM workers.
  • TLS termination, logging, container runtime housekeeping.

When to worry: sustained high sy (e.g., >30%) with low us might indicate chatty I/O, packet floods, or inefficient filesystem patterns. Check iostat, pidstat -w, perf top, bpftrace/bcc tools, or reduce network/iptables complexity.
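
If you want the us/sy/wa/si split without an interactive top session, it can be derived from two samples of the aggregate cpu line in /proc/stat. The sketch below is one way to do that; cpu_breakdown is a hypothetical helper, and the two sample lines are synthetic numbers chosen to match the top header in section 2.

```shell
#!/bin/sh
# cpu_breakdown: hypothetical helper; reads two aggregate "cpu" lines from
# /proc/stat on stdin (older sample first) and prints the share of us, sy,
# wa and si over that interval.
# Field order after the "cpu" label:
#   user nice system idle iowait irq softirq steal ...
cpu_breakdown() {
  awk '
    $1 == "cpu" {
      n++
      for (i = 2; i <= 9; i++) { prev[i] = cur[i]; cur[i] = $i }
    }
    END {
      if (n < 2) exit 1
      for (i = 2; i <= 9; i++) { d[i] = cur[i] - prev[i]; total += d[i] }
      if (total <= 0) exit 1
      printf "us=%.1f sy=%.1f wa=%.1f si=%.1f\n",
        100 * d[2] / total, 100 * d[4] / total,
        100 * d[6] / total, 100 * d[8] / total
    }
  '
}

# Synthetic samples: 1000 ticks elapsed, 120 of them in kernel mode.
printf '%s\n' \
  "cpu 100 0 50 800 10 0 40 0 0 0" \
  "cpu 140 0 170 1600 10 0 80 0 0 0" \
  | cpu_breakdown
```

On a live host the two samples could be taken a few seconds apart: { grep '^cpu ' /proc/stat; sleep 5; grep '^cpu ' /proc/stat; } | cpu_breakdown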


5) Load average vs CPU usage

  • Load average counts tasks that are runnable (running or waiting for a CPU) plus tasks in uninterruptible sleep (state D, usually waiting on I/O). On modern multicore hosts, a load of ~0.3 with 4+ cores is trivial even if one container spikes.
  • High load with low us/sy often means blocked I/O (wa climbs). Look at disk metrics and container BLOCK I/O.
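
A quick way to put load average in context is to divide it by the core count. This is a rule-of-thumb sketch, not a precise metric, and load_per_core is a made-up helper name:

```shell
#!/bin/sh
# load_per_core: hypothetical helper; divides a load average by the core count.
#   $1 = load average (e.g. the 1-minute value)
#   $2 = number of CPU cores
load_per_core() {
  awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# A load of 2.0 on a 4-core host means about half the cores are busy or blocked.
load_per_core 2.0 4
```

On a live host: load_per_core "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)". Values that stay near or above 1.00 suggest tasks are queueing for CPU or stuck on I/O.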

6) Interpreting I/O columns

From docker stats:

  • NET I/O (cumulative) helps spot chatty containers (reverse proxies, busy WP sites).
  • BLOCK I/O (cumulative) highlights disk hitters (DBs, caches flushing to disk, loggers).
  • Sudden large writes from a WordPress container often come from image uploads, cache plugins writing, or log growth.

Pair with host tools:

iostat -xz 2
pidstat -d 2
dstat -tcdnm 2

7) Practical thresholds & alerts (rule‑of‑thumb)

  • Host available RAM persistently below 10–15% of total → investigate limits, caches, or memory leaks.
  • Container MEM % > 80–90% for long periods → raise limit or tune app (avoid OOM).
  • wa (I/O wait) > 5–10% consistently → storage or query tuning needed.
  • PIDS rising unbounded → check PHP‑FPM/DB max children/threads, cron storms.
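
The container memory threshold is easy to script. A minimal sketch, assuming input lines of the form "NAME MEM%" (over_threshold is a hypothetical helper, and the threshold value is whatever you decide fits your workload):

```shell
#!/bin/sh
# over_threshold: hypothetical helper; reads "NAME MEM%" lines on stdin and
# prints the names whose memory percentage exceeds the threshold in $1.
over_threshold() {
  awk -v limit="$1" '{ pct = $2; sub(/%$/, "", pct); if (pct + 0 > limit + 0) print $1 }'
}

# Sample input taken from the docker stats excerpt in section 1.
printf '%s\n' "shop_prestashop 30.38%" "shop_db 59.58%" "example2_db 19.12%" \
  | over_threshold 50
```

With a threshold of 50 only shop_db is printed. On a live host the input would be: docker stats --no-stream --format '{{.Name}} {{.MemPerc}}' | over_threshold 80. A single snapshot says nothing about "long periods", so run it on a schedule before acting on it.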

8) Tuning & fixes you’ll actually use

Set sane container limits (Docker Compose):

services:
  example1_wordpress:
    mem_limit: 1g
    cpus: 0.50
  example1_db:
    mem_limit: 1g
    cpus: 0.50

Right‑size DB caches:

  • MariaDB/MySQL: tune innodb_buffer_pool_size to a fraction of the container limit (e.g., 50–70%), not of host RAM.
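
As a worked example of the 50–70% guideline, the sketch below computes the range for a given container limit. The helper name buffer_pool_range is an assumption, and the result is a starting point only: the buffer pool is not the database's only memory consumer, so leave room for connections and temporary buffers.

```shell
#!/bin/sh
# buffer_pool_range: hypothetical helper; $1 = container memory limit in MiB.
# Prints 50% and 70% of that limit, rounded down to whole MiB.
buffer_pool_range() {
  echo "$(( $1 * 50 / 100 ))M to $(( $1 * 70 / 100 ))M"
}

# 1024 MiB is the 1g limit of example1_db in the Compose snippet above.
buffer_pool_range 1024
```

For a 1 GiB limit this suggests an innodb_buffer_pool_size somewhere between 512M and 716M.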

PHP‑FPM:

  • Set pm = dynamic, tune pm.max_children based on memory per worker (measure with ps, smem).
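
A common rule of thumb for pm.max_children is to divide the memory you can give to PHP workers by the average size of one worker. The sketch below encodes that arithmetic; the helper name, the reserve, and the per-worker figure are all illustrative assumptions, so measure your own values first.

```shell
#!/bin/sh
# max_children: hypothetical helper for a rule-of-thumb estimate.
#   $1 = container memory limit (MiB)
#   $2 = memory reserved for everything except PHP workers (MiB)
#   $3 = average memory per PHP-FPM worker (MiB), measured with ps or smem
max_children() {
  echo $(( ($1 - $2) / $3 ))
}

# 768 MiB limit, 128 MiB reserve, 40 MiB per worker (example numbers).
max_children 768 128 40
```

With these example numbers the estimate is 16 workers. Rounding down is intentional: hitting the cgroup limit triggers the OOM killer, while one worker fewer only costs some concurrency.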

Logs:

  • Rotate aggressively (logrotate) and consider piping to journald/aggregator to cut filesystem churn.

Cache behavior:

  • Don’t “drop caches” routinely (echo 3 > /proc/sys/vm/drop_caches) in production — it hurts performance. If you must test, do it once and measure before/after.

Kernel hints (advanced):

  • vm.vfs_cache_pressure (higher → reclaim dentries/inodes more aggressively).
  • vm.swappiness (irrelevant if you have no swap, but consider adding a small zram swap to soften bursts).

9) Quick checklist when “memory looks full”

  1. Look at available in top or free -h — not free.
  2. Sum container MEM USAGE; expect it to be less than host used.
  3. Check OOM history:
    dmesg -T | grep -Ei 'killed process|out of memory|oom'
    
  4. Inspect the big consumers:
    ps aux --sort=-rss | head -20
    
  5. For per‑process real usage (shared vs private):
    smem -tk
    

10) FAQ

Q: Why does the host show 5–6 GiB “used” when my containers sum to ~1–2 GiB?
A: The rest is page cache, slab, and host services. It’s normal. Use available to gauge pressure.

Q: Can docker stats lie?
A: It reports cgroup‑accounted memory. Differences with host views are expected due to caching/accounting boundaries.

Q: Should I clear cache to “free” memory?
A: No, Linux will reclaim it on demand. Dropping cache reduces performance and is rarely necessary.


Closing thoughts

  • docker stats tells you how each container behaves within its cgroup limits.
  • top tells you how the host kernel manages all memory, including the page cache that makes Linux fast.
  • Low free with high available is healthy. Worry when available shrinks or when containers live near their limits and trigger OOMs.

Once you internalize that, those “mystery” numbers start to make perfect sense — and you’ll know exactly what to tune next.
