From ae9b5e0f77ce3ada7a3b6741e95024e6d1d35679 Mon Sep 17 00:00:00 2001
From: Tsachy Shacham <tsachy@google.com>
Date: Wed, 18 May 2011 19:00:00 +0200
Subject: [PATCH] Design doc for CPU pinning

Signed-off-by: Tsachy Shacham <tsachy@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
---
 Makefile.am                |   1 +
 doc/design-cpu-pinning.rst | 225 +++++++++++++++++++++++++++++++++++++
 doc/design-draft.rst       |   1 +
 3 files changed, 227 insertions(+)
 create mode 100644 doc/design-cpu-pinning.rst

diff --git a/Makefile.am b/Makefile.am
index 62e259de1..2c0b25db3 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -278,6 +278,7 @@ docrst = \
 	doc/design-2.4.rst \
 	doc/design-draft.rst \
 	doc/design-oob.rst \
+	doc/design-cpu-pinning.rst \
 	doc/design-query2.rst \
 	doc/design-x509-ca.rst \
 	doc/design-http-server.rst \
diff --git a/doc/design-cpu-pinning.rst b/doc/design-cpu-pinning.rst
new file mode 100644
index 000000000..f1b3de140
--- /dev/null
+++ b/doc/design-cpu-pinning.rst
@@ -0,0 +1,225 @@
+Ganeti CPU Pinning
+==================
+
+Objective
+---------
+
+This document defines Ganeti's support for CPU pinning (aka CPU
+affinity).
+
+CPU pinning enables mapping and unmapping entire virtual machines or a
+specific virtual CPU (vCPU) to a physical CPU or a range of CPUs.
+
+At this stage, pinning will be implemented for Xen and KVM.
+
+Command Line
+------------
+
+Suggested command line parameters for controlling CPU pinning are as
+follows::
+
+  gnt-instance modify -H cpu_mask=<cpu-pinning-info> <instance>
+
+cpu-pinning-info can be any of the following:
+
+* One vCPU mapping, which can be the word "all" or a combination
+  of CPU numbers and ranges separated by commas. In this case, all
+  vCPUs will be mapped to the indicated list.
+* A list of vCPU mappings, separated by a colon ':'. In this case
+  each vCPU is mapped to an entry in the list, and the size of the
+  list must match the number of vCPUs defined for the instance.
+  This is enforced when setting CPU pinning or when setting the
+  number of vCPUs using ``-B vcpus=#``.
+
+  The mapping list is matched to consecutive virtual CPUs, so the
+  first entry is the CPU pinning information for vCPU 0, the second
+  entry is for vCPU 1, etc.
+
+The default setting for new instances is "all", which maps the entire
+instance to all CPUs, thus effectively turning off CPU pinning.
+
+Here are some usage examples::
+
+  # Map vCPU 0 to physical CPU 1 and vCPU 1 to CPU 3 (assuming 2 vCPUs)
+  gnt-instance modify -H cpu_mask=1:3 my-inst
+
+  # Pin vCPU 0 to CPUs 1 or 2, and vCPU 1 to any CPU
+  gnt-instance modify -H cpu_mask=1-2:all my-inst
+
+  # Pin vCPU 0 to any CPU, vCPU 1 to CPUs 1, 3 or 4, and vCPU 2 to
+  # CPU 0
+  gnt-instance modify -H cpu_mask=all:1\\,3-4:0 my-inst
+
+  # Pin entire VM to CPU 0
+  gnt-instance modify -H cpu_mask=0 my-inst
+
+  # Turn off CPU pinning (default setting)
+  gnt-instance modify -H cpu_mask=all my-inst
+
+Assuming an instance has 3 vCPUs, the following commands will fail::
+
+  # not enough mappings
+  gnt-instance modify -H cpu_mask=0:1 my-inst
+
+  # too many mappings
+  gnt-instance modify -H cpu_mask=2:1:1:0 my-inst
+
+Validation
+----------
+
+CPU pinning information is validated by making sure it matches the
+number of vCPUs. This validation happens when changing either the
+cpu_mask or vcpus parameters.
+Changing either parameter in a way that conflicts with the other will
+fail with an appropriate error message.
+To make such a change, both parameters should be modified at the same
+time. For example:
+``gnt-instance modify -B vcpus=4 -H cpu_mask=1:1:2-3:4\\,6 my-inst``
+
+Besides validating CPU configuration, i.e. that the number of vCPUs
+matches the requested CPU pinning, Ganeti will also verify that the
+node has enough physical CPUs to support the required configuration.
+For example, trying to run a configuration of vcpus=2,cpu_mask=0:4 on
+a node with 4 cores will fail (note: CPU numbers are 0-based).
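
The validation rules above can be illustrated with a short Python sketch. It is not Ganeti's implementation: the names ``parse_cpu_mask``, ``validate_cpu_mask`` and ``ALL_CPUS`` are hypothetical, and the sketch assumes that a single mapping applies to every vCPU and that "all" is stored as -1, as this document proposes. Shell escaping of commas is not handled here.

```python
from typing import List

ALL_CPUS = -1  # hypothetical constant: marker for "any CPU" ("all")


def parse_cpu_mask(cpu_mask: str) -> List[List[int]]:
    """Turn e.g. "1-2:all" into [[1, 2], [-1]] (one list per mapping)."""
    mappings = []
    for entry in cpu_mask.split(":"):
        if entry == "all":
            mappings.append([ALL_CPUS])
            continue
        cpus = []
        for part in entry.split(","):
            if "-" in part:
                # A range such as "3-5" expands to 3, 4, 5.
                low, high = (int(x) for x in part.split("-", 1))
                if low > high:
                    raise ValueError("invalid CPU range %r" % part)
                cpus.extend(range(low, high + 1))
            else:
                cpus.append(int(part))
        mappings.append(cpus)
    return mappings


def validate_cpu_mask(cpu_mask: str, vcpus: int,
                      physical_cpus: int) -> List[List[int]]:
    """Check a cpu_mask against the vCPU count and the node's CPU count."""
    mappings = parse_cpu_mask(cpu_mask)
    # Assumption: a single mapping applies to all vCPUs; a list of
    # mappings must contain exactly one entry per vCPU.
    if len(mappings) > 1 and len(mappings) != vcpus:
        raise ValueError("%d mappings given for %d vCPUs"
                         % (len(mappings), vcpus))
    for cpus in mappings:
        for cpu in cpus:
            # CPU numbers are 0-based, so valid values are 0..physical_cpus-1.
            if cpu != ALL_CPUS and not 0 <= cpu < physical_cpus:
                raise ValueError("CPU %d not available on this node" % cpu)
    return mappings
```

With these assumptions, ``validate_cpu_mask("0:4", vcpus=2, physical_cpus=4)`` raises ``ValueError`` because CPU 4 does not exist on a 4-core node, matching the example in the text.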
+
+This validation should repeat every time an instance is started or
+migrated live. See more details under Migration below.
+
+Cluster verification should also check that the other nodes in the
+cluster are compatible with the required configuration, and alert if a
+minimum requirement is not met.
+
+Failover
+--------
+
+CPU pinning configuration can be transferred from node to node, unless
+the number of physical CPUs is smaller than what the configuration calls
+for. Unless this is the case, all transfers and migrations are
+expected to succeed.
+
+In case the node has fewer physical CPUs than the CPU numbers
+indicated by the CPU pinning information, instance failover will fail.
+
+In case of emergency, to force failover to ignore mismatching CPU
+information, the following switch can be used:
+``gnt-instance failover --ignore-cpu-mismatch my-inst``.
+This command will try to fail over the instance with the current CPU
+mask, but if that fails, it will change the mask to "all".
+
+Migration
+---------
+
+In case of live migration, and in addition to failover considerations,
+CPU pinning must be remapped after migration. This can be done in real
+time for both Xen and KVM instances, and only depends on the
+number of physical CPUs being sufficient to support the migrated
+instance.
+
+Data
+----
+
+Pinning information will be kept as a list of integers per vCPU.
+To mark a mapping of any CPU, we will use (-1).
+A single entry, no matter what the number of vCPUs is, will always mean
+that all vCPUs have the same mapping.
+
+Configuration file
+------------------
+
+The pinning information is kept in the hypervisor params section of
+each instance in the configuration file as
+``cpu_mask: [ [ a ], [ b, c ], [ d ] ]``
+
+Xen
+---
+
+There are 2 ways to control pinning in Xen, either via the command line
+or through the configuration file.
+
+The commands to make direct pinning changes are the following::
+
+  # To pin a vCPU to a specific CPU
+  xm vcpu-pin <domain> <vcpu> <cpu>
+
+  # To unpin a vCPU
+  xm vcpu-pin <domain> <vcpu> all
+
+  # To get the current pinning status
+  xm vcpu-list <domain>
+
+Since Ganeti currently controls Xen through the configuration
+file, it is straightforward to use the same method for CPU pinning.
+There are 2 different parameters that control Xen's CPU pinning and
+configuration:
+
+vcpus
+  controls the number of vCPUs
+cpus
+  maps vCPUs to physical CPUs
+
+When no pinning is required (pinning information is "all"), the
+"cpus" entry is removed from the configuration file.
+
+For all other cases, the configuration is "translated" to Xen, which
+expects either ``cpus = "a"`` or ``cpus = [ "a", "b", "c", ...]``,
+where each of a, b and c is a physical CPU number, a CPU range, or a
+combination. The number of entries (if a list is used) must match
+the number of vCPUs, and the entries are mapped in order.
+
+For example, CPU pinning information of ``1:2,4-7:0-1`` is translated
+to this entry in Xen's configuration: ``cpus = [ "1", "2,4-7", "0-1" ]``
+
+KVM
+---
+
+Controlling pinning in KVM is a little more complicated as there is no
+configuration to control pinning before instances are started.
+
+The way to change or assign CPU pinning under KVM is to use ``taskset`` or
+its underlying system call ``sched_setaffinity``. Setting the affinity for
+the VM process will change CPU pinning for the entire VM, and setting it
+for specific vCPU threads will control specific vCPUs.
+
+The sequence of commands to control pinning is this: start the instance
+with the ``-S`` switch, so it halts before starting execution, get the
+process ID or identify thread IDs of each vCPU by sending ``info cpus``
+to the monitor, map vCPUs as required by the cpu-pinning information,
+and issue a ``cont`` command on the KVM monitor to allow the instance
+to start execution.
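
The mapping step of this sequence can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses ``os.sched_setaffinity`` from the Python standard library (Python 3.3+, Linux only) rather than the affinity package, the function name ``pin_vcpu_threads`` and the ``set_affinity`` parameter are hypothetical, and the vCPU thread IDs are assumed to have been obtained already from ``info cpus``. Starting KVM with ``-S``, talking to the monitor, and sending ``cont`` are not shown.

```python
import os
from typing import Callable, List, Optional, Sequence, Set

ALL_CPUS = -1  # hypothetical constant: marker for "any CPU" ("all")


def pin_vcpu_threads(
    thread_ids: Sequence[int],
    mappings: Sequence[List[int]],
    physical_cpus: int,
    set_affinity: Optional[Callable[[int, Set[int]], None]] = None,
) -> None:
    """Apply one CPU mapping to each vCPU thread, in vCPU order."""
    if set_affinity is None:
        # Thin wrapper over the sched_setaffinity(2) system call.
        set_affinity = os.sched_setaffinity
    if len(mappings) == 1:
        # Assumption: a single mapping applies to every vCPU.
        mappings = list(mappings) * len(thread_ids)
    if len(mappings) != len(thread_ids):
        raise ValueError("%d mappings given for %d vCPU threads"
                         % (len(mappings), len(thread_ids)))
    for tid, cpus in zip(thread_ids, mappings):
        if ALL_CPUS in cpus:
            # "all" means the thread may run on any physical CPU.
            target = set(range(physical_cpus))
        else:
            target = set(cpus)
        set_affinity(tid, target)
```

Passing the affinity setter as a parameter keeps the sketch testable without touching real threads; in actual use the default would be called with the thread IDs reported by the monitor.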
+
+For example, a sequence of commands to control CPU affinity under KVM
+may be:
+
+* Start KVM: ``/usr/bin/kvm … <kvm-command-line-options> … -S``
+* Use socat to connect to the monitor
+* Send ``info cpus`` to the monitor to get thread/vCPU information
+* Call ``sched_setaffinity`` for each thread with the CPU mask
+* Send ``cont`` to KVM's monitor
+
+A CPU mask is a hexadecimal bit mask where each bit represents one
+physical CPU. See the man page :manpage:`sched_setaffinity(2)` for more
+details.
+
+For example, to run a specific thread ID on CPUs 1 or 3, the mask is
+0x0000000A (bits 1 and 3 set).
+
+We will control process and thread affinity using the Python affinity
+package (http://pypi.python.org/pypi/affinity). This package is a Python
+wrapper around the two affinity system calls, and has no other
+requirements.
+
+Alternative Design Options
+--------------------------
+
+1. There's an option to ignore the limitations of the underlying
+   hypervisor and instead of requiring explicit pinning information
+   for *all* vCPUs, assume a mapping of "all" to vCPUs not mentioned.
+   This can lead to inadvertently missing information and, since
+   CPU pinning options are unlikely to be used frequently, offers no
+   real advantage.
+
+.. vim: set textwidth=72 :
+.. Local Variables:
+.. mode: rst
+.. fill-column: 72
+.. End:
diff --git a/doc/design-draft.rst b/doc/design-draft.rst
index 2f736ca5b..63759db3f 100644
--- a/doc/design-draft.rst
+++ b/doc/design-draft.rst
@@ -10,6 +10,7 @@ Design document drafts
    design-impexp2.rst
    design-lu-generated-jobs.rst
    design-multi-reloc.rst
+   design-cpu-pinning.rst
 
 .. vim: set textwidth=72 :
 .. Local Variables:
-- 
GitLab