  Sep 29, 2011
    • Add an allocation limit to hspace · b8a2c0ab
      Iustin Pop authored
      
      This is very useful for testing/benchmarking.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Agata Murawska <agatamurawska@google.com>
    • Small simplification in tryAlloc · 1bf6d813
      Iustin Pop authored
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Agata Murawska <agatamurawska@google.com>
    • Change how node pairs are generated/used · b0631f10
      Iustin Pop authored
      
      Currently, the node pairs used for allocation are a simple
      [(primary, secondary)] list of tuples, as this is how they were
      used before the previous patch. Since that patch, however, we use
      them grouped per primary node, and have to unpack this list right
      after generation.
      
      Therefore it makes sense to generate the list directly in the
      correct form, and to remove the split from tryAlloc. This should
      be no slower than the previous patch, and is possibly even faster.
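
      As a minimal sketch (assuming a bare node-index type and a trivial
      validity predicate, neither of which matches the real code
      exactly), the change in shape is:

        -- Hypothetical, simplified types for illustration only.
        type Ndx = Int

        -- Before: a flat list of (primary, secondary) tuples, which the
        -- caller had to re-group per primary node.
        flatPairs :: [Ndx] -> [(Ndx, Ndx)]
        flatPairs nodes = [(p, s) | p <- nodes, s <- nodes, p /= s]

        -- After: one entry per primary node, carrying all of its valid
        -- secondaries, so tryAlloc can consume the list directly.
        groupedPairs :: [Ndx] -> [(Ndx, [Ndx])]
        groupedPairs nodes = [(p, filter (/= p) nodes) | p <- nodes]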
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Agata Murawska <agatamurawska@google.com>
    • Parallelise instance allocation/capacity computation · f828f4aa
      Iustin Pop authored

      This patch finally enables parallelisation in instance placement.
      
      My original attempt at enabling this didn't work well, and it took
      a while (and liberal use of threadscope) to understand why. The
      attempt was simply to `parMap rwhnf` over allocateOnPair; however,
      this is bad: for a 100-node cluster it creates roughly 100*100
      sparks, which is far too many, and each individual spark does too
      little work. Furthermore, the combining of the allocateOnPair
      results was done single-threaded, losing even more parallelism. So
      we had O(n²) sparks to run in parallel, each of size O(1), and we
      combined a list of O(n²) length single-threadedly.
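
      Schematically, the failed attempt looked roughly like the sketch
      below (stand-in types and a placeholder allocateOnPair, not the
      real signatures; rseq is the current spelling of rwhnf):

        import Control.Parallel.Strategies (parMap, rseq)

        -- Stand-in types; the real allocateOnPair works on cluster state.
        type Pair  = (Int, Int)   -- (primary, secondary) node indices
        type Score = Double

        allocateOnPair :: Pair -> Score
        allocateOnPair (p, s) = fromIntegral (p + s)  -- placeholder work

        -- One spark per pair: O(n²) sparks of O(1) size, followed by a
        -- purely sequential combine over the O(n²)-element result list.
        naiveAlloc :: [Pair] -> Score
        naiveAlloc = minimum . parMap rseq allocateOnPair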
      
      The new algorithm does a two-stage process: we group the list of valid
      pairs per primary node, relying on the fact that usually the secondary
      nodes are somewhat balanced (it's definitely true for 'blank' cluster
      computations). We then run in parallel over all primary nodes, doing
      both the individual allocateOnPair calls *and* the concatAllocs
      summarisation. This leaves only the combining of the per-primary
      results to the main execution thread. The new numbers are: O(n)
      sparks, each of size O(n), and a single-threaded combine over a
      list of O(n) length.
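
      A hedged sketch of the two-stage version, reusing the stand-in
      types above and a placeholder concatAllocs that simply keeps the
      better score:

        import Control.Parallel.Strategies (parMap, rseq)

        type Pair  = (Int, Int)
        type Score = Double

        allocateOnPair :: Pair -> Score
        allocateOnPair (p, s) = fromIntegral (p + s)  -- placeholder work

        -- Placeholder for concatAllocs: keep the better of two results.
        concatAllocs :: Score -> Score -> Score
        concatAllocs = min

        -- One spark per primary node: each spark runs allocateOnPair over
        -- all of that node's secondaries *and* folds the results, so the
        -- main thread only combines the O(n) per-primary summaries.
        stagedAlloc :: [(Int, [Int])] -> Score
        stagedAlloc = foldl1 concatAllocs
                    . parMap rseq (\(p, secs) ->
                        foldl1 concatAllocs [allocateOnPair (p, s) | s <- secs])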
      
      This translates directly into a reasonable speedup (relative numbers
      for allocation of 3 instances on a 120-node cluster):
      
      - original code (non-threaded): 1.00 (baseline)
      - first attempt (2 threads):    0.81 (20% slowdown!)
      - new code (non-threaded):      1.00 (no slowdown)
      - new code (threaded/1 thread): 1.00
      - new code (2 threads):         1.65 (65% faster)
      
      We don't get a 2x speedup, because the GC time increases. Fortunately
      the code should scale well to more cores, so on many-core machines we
      should get a nice overall speedup. On a different machine with 4
      cores, we get a 3.29x speedup.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Agata Murawska <agatamurawska@google.com>
    • Abstract comparison of AllocElements · d7339c99
      Iustin Pop authored
      
      The comparison is moved out of concatAllocs, as it will be needed
      elsewhere in the future.
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Agata Murawska <agatamurawska@google.com>
    • Change type of Cluster.AllocSolution · 129734d3
      Iustin Pop authored
      
      Originally, this data type was used both by instance allocation (1
      result), and by instance relocation (many results, one per
      instance). As such, the field 'asSolutions' was a list, and the
      various code paths checked whether the length of the list matched
      the current mode. This is very ugly, as we can't guarantee this
      matching via the type system; hence the FIXME in the code.
      
      However, commit 6804faa0 removed the instance evacuation code, and thus
      we now always use just one allocation solution. Hence we can change
      the data type to a simple Maybe type, and get rid of many 'otherwise
      barf out' conditions.
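
      In sketch form (with a deliberately simplified, hypothetical
      AllocElement standing in for the real tuple of cluster state,
      instance and score):

        -- Hypothetical stand-in for the real AllocElement tuple.
        type AllocElement = (String, Double)

        -- Before: a list whose length had to match the current mode,
        -- checked at run time (the FIXME mentioned above).
        data AllocSolutionOld = AllocSolutionOld
          { asSolutions :: [AllocElement] }

        -- After: at most one solution, enforced by the type system.
        data AllocSolution = AllocSolution
          { asSolution :: Maybe AllocElement }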
      
      Signed-off-by: Iustin Pop <iustin@google.com>
      Reviewed-by: Agata Murawska <agatamurawska@google.com>