Ansible Performance Tuning

A playbook that runs fine against five servers can take an unacceptably long time against five hundred. Performance tuning in Ansible is not premature optimisation — it is a production requirement. This lesson covers the full set of performance mechanisms Ansible provides, from the simplest (increasing forks) to the most powerful (async execution and pipelining). These techniques separate Ansible beginners from practitioners who can confidently manage infrastructure at scale.

Understanding the Default Performance Profile

By default, Ansible:

Runs against 5 hosts simultaneously (forks = 5)
Copies each module to the managed node as a file before running it (pipelining is off)
Gathers facts on every playbook run
Runs tasks sequentially (one task finishes on all hosts before the next begins)

For small inventories this is fine. For 100+ hosts, these defaults create significant bottlenecks.

Increasing Forks

The forks setting controls how many hosts Ansible manages in parallel. Increase it in ansible.cfg:

[defaults]
forks = 20

Or at runtime:

ansible-playbook site.yml -f 50

The right fork count depends on your control node's resources and network capacity. Start by doubling the default (forks = 10) and increase until you see diminishing returns or resource exhaustion. A strong control node with a fast network can comfortably handle forks = 50 to 100.
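To get a feel for how forks affects wall-clock time, a rough model can help. The sketch below assumes hosts are processed in fixed waves of forks hosts and that every host needs the same time per task; Ansible's real worker pool is more fluid, so treat the result as an estimate only. The function name and the sample figures are illustrative and not part of Ansible.

```python
import math


def estimated_task_seconds(hosts: int, forks: int, seconds_per_host: float) -> float:
    """Rough wall-clock estimate for one task across all hosts.

    Assumption: hosts run in waves of `forks`, and each wave takes
    `seconds_per_host`. Real scheduling is less rigid than this.
    """
    if hosts < 0 or forks < 1:
        raise ValueError("hosts must be >= 0 and forks must be >= 1")
    waves = math.ceil(hosts / forks)
    return waves * seconds_per_host


if __name__ == "__main__":
    # 500 hosts and 2 seconds per host are made-up figures for illustration.
    for forks in (5, 20, 50, 100):
        print(forks, estimated_task_seconds(500, forks, 2.0))
```

Under this model the gain from each extra fork shrinks as the number of waves approaches one, which is one reason returns diminish at high fork counts.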

SSH Pipelining

By default, Ansible transfers the module script to the managed node via SFTP, executes it, then deletes it — three SSH operations per task. SSH pipelining eliminates the file transfer by sending the module code directly over the SSH connection:

[ssh_connection]
pipelining = true

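Pipelining can cut playbook run time considerably, with reductions of 60–80 percent sometimes reported on high-latency connections; measure on your own inventory before relying on any figure. The requirement: when tasks use sudo, requiretty must be disabled in sudoers on managed nodes (most modern systems have this disabled by default). Verify with: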

sudo grep -r requiretty /etc/sudoers /etc/sudoers.d/

ControlMaster and SSH Multiplexing

SSH multiplexing reuses existing SSH connections instead of opening a new one for each task:

[ssh_connection]
ssh_args = -C -o ControlMaster=auto -o ControlPersist=60s
control_path_dir = ~/.ansible/cp
control_path = %(directory)s/%%h-%%r

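ControlPersist=60s keeps the master connection alive for 60 seconds after the last task, allowing subsequent tasks to reuse it. These ssh_args values match Ansible's defaults for OpenSSH, so multiplexing is usually already active; restate them whenever you set ssh_args yourself, because a custom value replaces the defaults instead of extending them. Combined with pipelining, this dramatically reduces SSH overhead.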

Fact Caching

Gathering facts is slow — it runs the setup module on every host at the start of every play. Fact caching stores gathered facts between playbook runs:

[defaults]
# Only gather if facts aren't cached or are stale
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible_facts_cache
# Cache for 2 hours. Comments go on their own lines: in ansible.cfg,
# a # after a value is read as part of the value.
fact_caching_timeout = 7200

With gathering = smart, Ansible skips fact gathering on subsequent runs if cached facts are still fresh. For infrastructure that does not change frequently, this eliminates a significant time cost. Note that /tmp is often cleared on reboot, so point fact_caching_connection at a persistent directory if the cache should survive control node restarts, or use a redis backend when several control nodes need to share one cache.
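To see which hosts would have their facts gathered again on the next run, you can inspect the cache directory directly. The sketch below assumes the jsonfile backend writes one file per host and that freshness is judged by file modification time; both are assumptions about the plugin's behaviour, and the function name is hypothetical.

```python
import os
import time
from pathlib import Path


def stale_cache_entries(cache_dir: str, timeout_seconds: int) -> list[str]:
    """Return the names of cache files older than the timeout.

    Assumption: one file per host, with freshness judged by file mtime.
    """
    now = time.time()
    stale = []
    for entry in sorted(Path(cache_dir).iterdir()):
        if entry.is_file() and now - entry.stat().st_mtime > timeout_seconds:
            stale.append(entry.name)
    return stale


if __name__ == "__main__":
    cache_dir = "/tmp/ansible_facts_cache"  # path used in the config above
    if os.path.isdir(cache_dir):
        print(stale_cache_entries(cache_dir, 7200))
```

An empty result suggests that a run started now would reuse cached facts for every host that has a cache file.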

Disabling Facts When Not Needed

For plays that do not use facts, disable gathering entirely:

- name: Quick configuration play
  hosts: all
  gather_facts: false
  tasks:
    - name: Touch a file
      file:
        path: /tmp/marker
        state: touch

A task that does not use ansible_os_family, ansible_distribution, or any other fact variable has no reason to spend time gathering them. Disabling facts on short plays can save 5–30 seconds per run depending on host count.

Async Tasks and Polling

By default, Ansible waits for each task to complete before moving to the next. Long-running tasks — package upgrades, compilation, database migrations — block the playbook while they run. Async execution lets these tasks run in the background:

- name: Run long database migration (async)
  command: /opt/db/migrate up
  async: 3600    # Task may run for up to 1 hour
  poll: 0        # Don't wait for it — fire and forget
  register: migration_job

- name: Continue with other tasks while migration runs
  apt:
    name: nginx
    state: present

- name: Wait for migration to complete
  async_status:
    jid: "{{ migration_job.ansible_job_id }}"
  register: job_result
  until: job_result.finished
  retries: 120   # 120 checks x 30 s = 3600 s, matching the async limit
  delay: 30      # Check every 30 seconds

With poll: 0, the task is launched and Ansible immediately moves to the next task. The async_status module polls for completion later. This pattern is essential for operations that take longer than SSH timeout windows.
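When pairing async with an async_status loop, the polling window (retries multiplied by delay) should be at least as long as the async limit; otherwise the wait task can give up while the job is still running. The helper below is a hypothetical sketch of that arithmetic, assuming the until loop waits roughly retries x delay seconds in total.

```python
import math


def min_retries(async_limit_seconds: int, delay_seconds: int) -> int:
    """Smallest retries value whose polling window covers the async limit.

    Assumption: the until loop waits about retries * delay seconds overall.
    """
    if async_limit_seconds < 0 or delay_seconds < 1:
        raise ValueError("async limit must be >= 0 and delay must be >= 1")
    return math.ceil(async_limit_seconds / delay_seconds)


if __name__ == "__main__":
    # A one-hour async limit checked every 30 seconds.
    print(min_retries(3600, 30))
```

For a one-hour limit polled every 30 seconds this gives 120 retries; a smaller value leaves part of the allowed run time unmonitored.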

Serial Execution for Rolling Updates

The serial keyword controls how many hosts a play runs against simultaneously:

- name: Rolling web server update
  hosts: webservers
  serial: "25%"    # Update 25% of hosts at a time
  tasks:
    - name: Update Nginx
      apt:
        name: nginx
        state: latest
      notify: Restart Nginx

Serial values can be integers (run 2 at a time), percentages (run 10% at a time), or lists for staged rollouts:

serial: [1, 5, "20%"]   # 1 first, then 5, then remaining in 20% batches
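It can help to work out the resulting batch sizes before running a rollout. The sketch below models the batching as I understand it: percentages are taken against the total host count of the play (rounded down, minimum one host), and the last list entry repeats until no hosts remain. It is a simplified model with hypothetical function names, not Ansible's own code.

```python
def pct_to_count(value, total_hosts: int) -> int:
    """Convert a serial entry (int or 'N%') to a host count.

    Percentages are rounded down, with a minimum of one host.
    """
    if isinstance(value, str) and value.endswith("%"):
        return (int(value[:-1]) * total_hosts // 100) or 1
    return int(value)


def serial_batches(total_hosts: int, serial) -> list[int]:
    """Return the size of each batch for a given serial setting."""
    entries = serial if isinstance(serial, list) else [serial]
    sizes = []
    remaining = total_hosts
    index = 0
    while remaining > 0:
        size = pct_to_count(entries[index], total_hosts)
        if size <= 0:
            # Zero or negative means "all remaining hosts in one batch".
            sizes.append(remaining)
            break
        size = min(size, remaining)
        sizes.append(size)
        remaining -= size
        # Stay on the last entry once the list is exhausted.
        index = min(index + 1, len(entries) - 1)
    return sizes


if __name__ == "__main__":
    print(serial_batches(100, [1, 5, "20%"]))
```

With 100 hosts and the staged list above, the model yields batches of 1, 5, then four batches of 20 and a final batch of 14.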

Strategy Plugins: free vs linear

The default linear strategy runs each task on all hosts before moving to the next. The free strategy lets each host run through tasks as fast as possible, without waiting for other hosts:

- name: Independent parallel deployment
  hosts: webservers
  strategy: free
  tasks: ...

Use free when tasks are independent and you want maximum throughput. Use linear (default) when task ordering across hosts matters.
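The difference between the two strategies can be illustrated with a small timing model. The sketch assumes forks is at least the number of hosts, every host runs the same number of tasks, and the per-task durations are known; the host names and figures are invented for illustration.

```python
def linear_seconds(durations: dict[str, list[float]]) -> float:
    """Linear: each task waits for the slowest host before the next starts."""
    per_task = zip(*durations.values())
    return sum(max(task_times) for task_times in per_task)


def free_seconds(durations: dict[str, list[float]]) -> float:
    """Free: each host runs through its own tasks without waiting."""
    return max((sum(host_times) for host_times in durations.values()), default=0.0)


if __name__ == "__main__":
    # Two hosts, two tasks; each host is slow on a different task.
    durations = {"web1": [1.0, 5.0], "web2": [5.0, 1.0]}
    print(linear_seconds(durations), free_seconds(durations))
```

In this example linear takes 10 seconds because every task is held to the slowest host, while free finishes in 6 seconds. When all hosts take similar time on every task, the two strategies come out roughly equal.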

Mitogen: External Execution Engine

Mitogen is a third-party execution strategy plugin that replaces Ansible's per-task module transfer with a persistent Python process on each managed node, which avoids repeating setup work for every task. Reported speedups range from 2x to 7x, depending on the workload:

pip3 install mitogen
# In ansible.cfg:
[defaults]
strategy_plugins = /path/to/mitogen/ansible_mitogen/plugins/strategy
strategy = mitogen_linear

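Mitogen is not officially supported by Red Hat but is widely used in the community for performance-critical environments. Each Mitogen release supports a specific range of Ansible versions, so check compatibility before upgrading either one.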

Performance Baseline and Measurement

Measure before tuning. Use the profile_tasks callback plugin to see exactly where time is being spent:

[defaults]
# callbacks_enabled replaces the deprecated callback_whitelist (Ansible 2.11+)
callbacks_enabled = profile_tasks, timer

The output shows the time taken by each task, sorted by duration. Tune the slowest tasks first, for example by running them asynchronously or skipping them where they are not needed, rather than changing settings at random.

Try This: Benchmark Your Playbook

Enable the profile_tasks callback in your ansible.cfg and run your LAMP stack playbook against all three lab nodes. Record the total time. Then enable pipelining and SSH multiplexing, increase forks to 20, and enable smart fact caching. Run again and compare. Document the time savings for each optimisation. This benchmarking discipline is directly applicable to production tuning work and makes an impressive addition to your portfolio documentation.

Summary

Ansible performance optimisation layers several mechanisms: increasing forks for broader parallelism, SSH pipelining to eliminate file transfer overhead, connection multiplexing to reuse SSH sessions, fact caching to avoid redundant setup module runs, async execution for long-running tasks, and the free strategy for independent parallel workloads. The profile_tasks callback identifies bottlenecks before tuning begins. Forks, pipelining, multiplexing, and fact caching are set in ansible.cfg, while async, serial, gather_facts, and strategy are set per task or per play, so each optimisation can be applied independently in the right combination for any environment.
