The forks setting determines how many managed nodes (i.e. the target systems) a task will be run against simultaneously. By default, ansible.cfg is set up with 5 forks.
forks = 5
Let's say you have the following playbook, which uses the file module to create /tmp/foo.txt on a managed node (i.e. the target system).
---
- hosts: all
  tasks:
    - name: create /tmp/foo.txt
      file:
        path: /tmp/foo.txt
        state: touch
...
Let's also say you have 10 managed nodes defined in your default hosts file or your own inventory file, server1.example.com through server10.example.com. Since the default is 5 forks, the task to create /tmp/foo.txt will first be executed on server1.example.com through server5.example.com simultaneously, then server6.example.com through server10.example.com simultaneously.
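The grouping described above can be illustrated with a short Python sketch. This is a simplified model of the idea, not Ansible's actual scheduler. The function name batches and the generated host list are made up for illustration.

```python
def batches(hosts, forks=5):
    """Split hosts into consecutive groups of at most `forks` hosts.

    Simplified illustration only: it models "run on up to `forks`
    hosts at a time", not how Ansible schedules work internally.
    """
    if forks < 1:
        raise ValueError("forks must be at least 1")
    return [hosts[i:i + forks] for i in range(0, len(hosts), forks)]


# Hypothetical inventory: server1.example.com through server10.example.com
hosts = [f"server{n}.example.com" for n in range(1, 11)]

# With forks=5, the 10 hosts fall into two groups of 5.
for number, group in enumerate(batches(hosts, forks=5), start=1):
    print(f"group {number}: {group[0]} through {group[-1]} ({len(group)} hosts)")
```

Changing the forks argument in this sketch shows how the number of groups shrinks as forks grows, which is the trade-off against control node load discussed next.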
Forks is less of a concern for the managed nodes than for the control node (your Ansible server). Let's say you have 1000 managed nodes, and forks is set to 1000 (probably not a good idea). In this situation, each task in the playbook would be run against all 1000 managed nodes simultaneously. This could put too much demand on your Ansible server in the form of CPU utilization and memory usage, and could also put a burst of packets on your network, which could impact network performance while the tasks are being executed. For this reason, a reasonable approach is to first baseline the performance of your Ansible server and network, then monitor both while running a playbook with the default setting of 5 forks, and again while running the same playbook with some other forks setting (e.g. 100 forks). Comparing the results will show which forks setting is ideal for your environment.
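There are two ways to define the number of forks that will be used: the forks setting in ansible.cfg, as shown above, or the -f (--forks) option on the ansible or ansible-playbook command line.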