Ganeti CPU Pinning
==================

Objective
---------

This document defines Ganeti's support for CPU pinning (aka CPU
affinity).

CPU pinning enables mapping and unmapping an entire virtual machine, or
a specific virtual CPU (vCPU), to a physical CPU or a range of CPUs.

At this stage, pinning will be implemented for Xen and KVM.

Command Line
------------

Suggested command line parameters for controlling CPU pinning are as
follows::

  gnt-instance modify -H cpu_mask=<cpu-pinning-info> <instance>

``cpu-pinning-info`` can be either of the following:

* One vCPU mapping, which can be the word "all" or a combination
  of CPU numbers and ranges separated by comma. In this case, all
  vCPUs will be mapped to the indicated list.
* A list of vCPU mappings, separated by a colon ':'. In this case
  each vCPU is mapped to an entry in the list, and the size of the
  list must match the number of vCPUs defined for the instance. This
  is enforced when setting CPU pinning or when setting the number of
  vCPUs using ``-B vcpus=#``.

  The mapping list is matched to consecutive virtual CPUs, so the first entry
  would be the CPU pinning information for vCPU 0, the second entry
  for vCPU 1, etc.

The default setting for new instances is "all", which maps the entire
instance to all CPUs, thus effectively turning off CPU pinning.

Here are some usage examples::

  # Map vCPU 0 to physical CPU 1 and vCPU 1 to CPU 3 (assuming 2 vCPUs)
  gnt-instance modify -H cpu_mask=1:3 my-inst

  # Pin vCPU 0 to CPUs 1 or 2, and vCPU 1 to any CPU
  gnt-instance modify -H cpu_mask=1-2:all my-inst

  # Pin vCPU 0 to any CPU, vCPU 1 to CPUs 1, 3, 4 or 5, and vCPU 2 to
  # CPU 0
  gnt-instance modify -H cpu_mask=all:1\\,3-5:0 my-inst

  # Pin entire VM to CPU 0
  gnt-instance modify -H cpu_mask=0 my-inst

  # Turn off CPU pinning (default setting)
  gnt-instance modify -H cpu_mask=all my-inst

Assuming an instance has 3 vCPUs, the following commands will fail::

  # not enough mappings
  gnt-instance modify -H cpu_mask=0:1 my-inst

  # too many
  gnt-instance modify -H cpu_mask=2:1:1:all my-inst

Validation
----------

CPU pinning information is validated by making sure it matches the
number of vCPUs. This validation happens when changing either the
cpu_mask or vcpus parameters.
Changing either parameter in a way that conflicts with the other will
fail with an appropriate error message.
To make such a change, both parameters should be modified at the same
time. For example:
``gnt-instance modify -B vcpus=4 -H cpu_mask=1:1:2-3:4\\,6 my-inst``

Besides checking that the number of vCPUs matches the requested CPU
pinning, Ganeti will also verify that the node has enough physical CPUs
to support the required configuration. For example, trying to run a
configuration of vcpus=2,cpu_mask=0:4 on a node with 4 cores will fail,
because CPU numbers are 0-based and the highest valid CPU is 3.
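
The two checks described above can be sketched as follows. This is only
an illustration, not Ganeti's actual code: the function name
``validate_pinning`` and its arguments are hypothetical, and the input is
assumed to be already parsed into one list of CPU numbers per mask entry,
with -1 standing for "all".

```python
ANY_CPU = -1  # assumed marker for an "all" entry


def validate_pinning(cpu_lists, num_vcpus, num_physical_cpus):
    """Return a list of error strings; an empty list means the mask is valid.

    cpu_lists holds one list of physical CPU numbers per mask entry.
    """
    errors = []
    # A single entry applies to every vCPU; otherwise counts must match.
    if len(cpu_lists) not in (1, num_vcpus):
        errors.append("cpu_mask has %d entries but the instance has %d vCPUs"
                      % (len(cpu_lists), num_vcpus))
    # CPU numbers are 0-based, so the highest valid one is count - 1.
    for index, cpus in enumerate(cpu_lists):
        for cpu in cpus:
            if cpu == ANY_CPU:
                continue
            if cpu < 0 or cpu >= num_physical_cpus:
                errors.append("entry %d references CPU %d, but the node only"
                              " has CPUs 0-%d"
                              % (index, cpu, num_physical_cpus - 1))
    return errors


# vcpus=2, cpu_mask=0:4 on a node with 4 cores: CPU 4 does not exist.
problems = validate_pinning([[0], [4]], num_vcpus=2, num_physical_cpus=4)
```

Returning a list of messages rather than raising on the first problem
makes it easier to report every conflict in a single error message.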

This validation should repeat every time an instance is started or
migrated live. See more details under Migration below.

Cluster verification should also test the compatibility of other nodes in
the cluster to required configuration and alert if a minimum requirement
is not met.

Failover
--------

CPU pinning configuration can be transferred from node to node, unless
the number of physical CPUs is smaller than what the configuration calls
for. As long as that is not the case, all transfers and migrations are
expected to succeed.

In case the number of physical CPUs is smaller than the numbers
indicated by CPU pinning information, instance failover will fail.

In case of emergency, to force failover to ignore mismatching CPU
information, the following switch can be used:
``gnt-instance failover --fix-cpu-mismatch my-inst``.
This command will try to fail over the instance with the current CPU mask,
but if that fails, it will change the mask to be "all".
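
A minimal sketch of this fallback, under stated assumptions: the helper
name ``failover_mask`` is hypothetical, "fails" is approximated here by a
simple check of the target node's CPU count, and the mask is assumed to be
parsed into lists of CPU numbers with -1 standing for "all".

```python
ANY_CPU = -1  # assumed marker for an "all" entry


def failover_mask(cpu_lists, target_cpu_count, fix_cpu_mismatch=False):
    """Choose the mask to use on the target node of a failover.

    Keeps the current mask when the target has enough physical CPUs.
    Otherwise falls back to "all" if fix_cpu_mismatch is set, and raises
    ValueError if it is not.
    """
    highest = max((cpu for cpus in cpu_lists for cpu in cpus),
                  default=ANY_CPU)
    if highest < target_cpu_count:
        return cpu_lists
    if fix_cpu_mismatch:
        return [[ANY_CPU]]  # equivalent to cpu_mask=all
    raise ValueError("target node has %d CPUs but the mask references CPU %d"
                     % (target_cpu_count, highest))
```

For instance, a mask parsed as ``[[0], [5]]`` is kept on an 8-CPU target
node, while on a 4-CPU node it is either rejected or, with the switch
enabled, replaced by a single "all" entry.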

Migration
---------

In case of live migration, in addition to the failover considerations,
CPU pinning must be remapped after migration. This can be done in real
time for both Xen and KVM instances, and only depends on the number of
physical CPUs being sufficient to support the migrated instance.

Data
----

Pinning information will be kept as a list of integers per vCPU.
To mark a mapping of any CPU, we will use (-1).
A single entry, no matter what the number of vCPUs is, will always mean
that all vCPUs have the same mapping.
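
As an illustration of this representation, the sketch below parses a
``cpu_mask`` string into one list of integers per entry. The function
names are hypothetical, and the input is assumed to be the mask after any
command-line escaping of commas has been removed.

```python
ANY_CPU = -1  # marks a mapping to any CPU, as described above


def parse_cpu_range(text):
    """Parse one vCPU mapping such as "all", "3" or "1,3-5"."""
    if text == "all":
        return [ANY_CPU]
    cpus = []
    for part in text.split(","):
        if "-" in part:
            low, high = (int(n) for n in part.split("-", 1))
            if low > high:
                raise ValueError("invalid CPU range: %r" % part)
            cpus.extend(range(low, high + 1))
        else:
            cpus.append(int(part))  # raises ValueError on bad input
    return cpus


def parse_cpu_mask(mask):
    """Parse a full mask; a result with one entry applies to all vCPUs."""
    return [parse_cpu_range(entry) for entry in mask.split(":")]


pinning = parse_cpu_mask("all:1,3-5:0")
```

Here ``pinning`` holds three entries, one per vCPU, while a mask such as
``0`` yields a single entry that applies to every vCPU.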

Configuration file
------------------

The pinning information is kept in each instance's hypervisor
params section of the configuration file as the original string.

Xen
---

There are 2 ways to control pinning in Xen, either via the command line
or through the configuration file.

The commands to make direct pinning changes are the following::

  # To pin a vCPU to a specific CPU
  xm vcpu-pin <domain> <vcpu> <cpu>

  # To unpin a vCPU
  xm vcpu-pin <domain> <vcpu> all

  # To get the current pinning status
  xm vcpu-list <domain>

Since Ganeti currently controls Xen through the configuration file, it
is straightforward to use the same method for CPU pinning. Two
parameters control Xen's CPU configuration and pinning:

vcpus
  controls the number of vCPUs
cpus
  maps vCPUs to physical CPUs

When no pinning is required (pinning information is "all"), the
"cpus" entry is removed from the configuration file.

For all other cases, the configuration is "translated" to Xen, which
expects either ``cpus = "a"`` or ``cpus = [ "a", "b", "c", ...]``,
where each of a, b and c is a physical CPU number, a CPU range, or a
combination of both. If a list is used, the number of entries must match
the number of vCPUs, and the entries are mapped to vCPUs in order.

For example, CPU pinning information of ``1:2,4-7:0-1`` is translated
to this entry in Xen's configuration: ``cpus = [ "1", "2,4-7", "0-1" ]``.
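
The translation can be sketched in a few lines. The function name
``xen_cpus_entry`` is hypothetical. How an "all" entry inside a per-vCPU
list should be written for Xen is not specified in this document, so the
sketch rejects that case instead of guessing.

```python
def xen_cpus_entry(cpu_mask):
    """Return the "cpus = ..." line for Xen, or None when no pinning is set."""
    if cpu_mask == "all":
        return None  # the "cpus" entry is left out of the configuration
    entries = cpu_mask.split(":")
    if "all" in entries:
        # Not covered by the text above; deliberately left unhandled.
        raise ValueError("per-vCPU 'all' entries are outside this sketch")
    if len(entries) == 1:
        return 'cpus = "%s"' % entries[0]
    return "cpus = [ %s ]" % ", ".join('"%s"' % entry for entry in entries)


line = xen_cpus_entry("1:2,4-7:0-1")
```

Because Xen accepts the same number, range and comma syntax, each entry
can be passed through unchanged and only the list structure differs.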

KVM
---

Controlling pinning in KVM is a little more complicated as there is no
configuration to control pinning before instances are started.

The way to change or assign CPU pinning under KVM is to use ``taskset`` or
its underlying system call ``sched_setaffinity``. Setting the affinity for
the VM process will change CPU pinning for the entire VM, and setting it
for specific vCPU threads will control specific vCPUs.

The sequence of commands to control pinning is this: start the instance
with the ``-S`` switch, so it halts before starting execution, get the
process ID or identify thread IDs of each vCPU by sending ``info cpus``
to the monitor, map vCPUs as required by the cpu-pinning information,
and issue a ``cont`` command on the KVM monitor to allow the instance
to start execution.

For example, a sequence of commands to control CPU affinity under KVM
may be:

* Start KVM: ``/usr/bin/kvm … <kvm-command-line-options> … -S``
* Use socat to connect to the monitor
* Send ``info cpus`` to the monitor to get thread/vCPU information
* Call ``sched_setaffinity`` for each thread with the CPU mask
* Send ``cont`` to KVM's monitor
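
The middle steps of this sequence might look like the sketch below. It is
written under several assumptions: the shape of the ``info cpus`` output
(``CPU #<n>`` followed later on the line by ``thread_id=<tid>``) and the
sample text are illustrative, the function names are hypothetical, and
Python's standard ``os.sched_setaffinity`` (Linux only) stands in for
whichever affinity library is actually used.

```python
import os
import re

# Assumed line shape: "* CPU #0: ... thread_id=1234"
_CPU_INFO_RE = re.compile(r"CPU #(\d+).*?thread_id\s*=\s*(\d+)")

# Illustrative monitor output, not captured from a real instance.
SAMPLE_OUTPUT = (
    "* CPU #0: pc=0x00000000000fd3a4 (halted) thread_id=2001\n"
    "  CPU #1: pc=0x00000000000fd3a4 (halted) thread_id=2002\n"
)


def parse_info_cpus(output):
    """Map each vCPU number to the thread ID reported by the monitor."""
    threads = {}
    for line in output.splitlines():
        match = _CPU_INFO_RE.search(line)
        if match:
            threads[int(match.group(1))] = int(match.group(2))
    return threads


def pin_vcpus(threads, cpu_lists):
    """Pin vCPU threads; cpu_lists has one entry, or one entry per vCPU."""
    for vcpu, tid in sorted(threads.items()):
        cpus = cpu_lists[0] if len(cpu_lists) == 1 else cpu_lists[vcpu]
        if cpus == [-1]:
            continue  # "all": a freshly started thread is left unpinned
        os.sched_setaffinity(tid, cpus)  # takes an iterable of CPU numbers


vcpu_threads = parse_info_cpus(SAMPLE_OUTPUT)
```

After ``pin_vcpus`` has run for the real thread IDs, sending ``cont`` to
the monitor lets the instance start with its affinity already in place.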

A CPU mask is a hexadecimal bit mask where each bit represents one
physical CPU. See man page for :manpage:`sched_setaffinity(2)` for more
details.

For example, to run a specific thread-id on CPUs 1 or 3 the mask is
0x0000000A.
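
The arithmetic behind this mask is a bitwise OR of one bit per CPU, as
the following sketch shows. The helper name ``cpus_to_mask`` is chosen
only for illustration.

```python
def cpus_to_mask(cpus):
    """Build an affinity bit mask with bit N set for each CPU number N."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative: %r" % cpu)
        mask |= 1 << cpu
    return mask


# CPUs 1 and 3 set bits 1 and 3: 0b1010, i.e. 2 + 8 = 10.
mask_text = "0x%08X" % cpus_to_mask([1, 3])
```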

As of 2.12, the psutil Python package
(https://github.com/giampaolo/psutil) will be used to control process
and thread affinity. The affinity Python package
(http://pypi.python.org/pypi/affinity) was used before, but it did not
invoke the two underlying system calls appropriately: it used a cast
instead of the CPU_SET macro, causing failures for masks referencing
more than 63 CPUs.

Alternative Design Options
--------------------------

1. There's an option to ignore the limitations of the underlying
   hypervisor and, instead of requiring explicit pinning information
   for *all* vCPUs, assume a mapping of "all" for vCPUs not mentioned.
   This can lead to information being left out inadvertently, and since
   CPU pinning options are probably not going to be used frequently,
   there's no real advantage.

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: