Step 4. Adding success criteria (SLA) for subtasks

SLA - Service-Level Agreement (Success Criteria)

Rally allows you to set success criteria (also called SLA - Service-Level Agreement) for every subtask. Rally will automatically check them for you.

To configure the SLA, add the “sla” section to the configuration of the corresponding subtask (the check name is a key associated with its target value). You can combine different success criteria:

{
    "NovaServers.boot_and_delete_server": [
        {
            "args": {
                ...
            },
            "runner": {
                ...
            },
            "context": {
                ...
            },
            "sla": {
                "max_seconds_per_iteration": 10,
                "failure_rate": {
                    "max": 25
                }
            }
        }
    ]
}

With this configuration, Rally marks the NovaServers.boot_and_delete_server scenario as unsuccessful if any iteration takes more than 10 seconds or if more than 25% of the iterations fail.
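To make the combination rule concrete, here is a minimal Python sketch of how the two criteria above could be evaluated over a list of iteration results. It is not Rally's implementation: the check_sla function and its durations and failed inputs are hypothetical names introduced for illustration, and only the two criteria from the example are handled.

```python
def check_sla(durations, failed, sla):
    """Return a dict mapping each criterion name to True (pass) or False (fail).

    durations: iteration durations in seconds (hypothetical input)
    failed: booleans, True where the iteration ended with an error
    sla: dict shaped like the "sla" section of a subtask
    """
    results = {}
    if "max_seconds_per_iteration" in sla:
        # Fails as soon as a single iteration is slower than the limit.
        results["max_seconds_per_iteration"] = (
            max(durations, default=0.0) <= sla["max_seconds_per_iteration"])
    if "failure_rate" in sla:
        # Failure rate is expressed in percent, like the "max" value.
        rate = 100.0 * sum(failed) / len(failed) if failed else 0.0
        limit = sla["failure_rate"].get("max", 100.0)
        results["failure_rate"] = rate <= limit
    return results


sla = {"max_seconds_per_iteration": 10, "failure_rate": {"max": 25}}

# One slow iteration (12 s) and one failure out of four (25%, not above the limit).
results = check_sla([3.2, 12.0, 4.1, 5.0], [False, False, True, False], sla)

# The subtask is successful only if every criterion passes.
subtask_passed = all(results.values())
print(results, subtask_passed)
```

In this sketch the failure rate criterion passes, because 25% does not exceed the limit, but the subtask as a whole is still unsuccessful because one iteration took longer than 10 seconds.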

Checking SLA

Let us show how Rally SLAs work using a simple example based on Dummy scenarios. These scenarios do not perform any OpenStack-related actions but are very useful for testing Rally's behavior. Let us put two scenarios in a new task, test-sla.json: one that does nothing and another that just throws an exception:

{
    "Dummy.dummy": [
        {
            "args": {},
            "runner": {
                "type": "constant",
                "times": 5,
                "concurrency": 2
            },
            "context": {
                "users": {
                    "tenants": 3,
                    "users_per_tenant": 2
                }
            },
            "sla": {
                "failure_rate": {"max": 0.0}
            }
        }
    ],
    "Dummy.dummy_exception": [
        {
            "args": {},
            "runner": {
                "type": "constant",
                "times": 5,
                "concurrency": 2
            },
            "context": {
                "users": {
                    "tenants": 3,
                    "users_per_tenant": 2
                }
            },
            "sla": {
                "failure_rate": {"max": 0.0}
            }
        }
    ]
}
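Both subtasks share the same runner, context and SLA settings, so the file can also be generated rather than written by hand. The following Python sketch is one way to do that under the assumption that you want exactly the task shown above; the common and task variable names are illustrative only.

```python
import copy
import json

# Settings shared by both subtasks, copied from the task file above.
common = {
    "args": {},
    "runner": {"type": "constant", "times": 5, "concurrency": 2},
    "context": {"users": {"tenants": 3, "users_per_tenant": 2}},
    "sla": {"failure_rate": {"max": 0.0}},
}

# Each scenario name maps to a list with one subtask configuration.
task = {
    name: [copy.deepcopy(common)]
    for name in ("Dummy.dummy", "Dummy.dummy_exception")
}

# Write the task to the file name used in this tutorial.
with open("test-sla.json", "w") as f:
    json.dump(task, f, indent=4)
```

Using a deep copy per scenario keeps the two configurations independent, so changing the SLA of one subtask later does not silently change the other.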

Note that both scenarios in this task have a maximum failure rate of 0% as their success criterion. We expect the first scenario to pass this criterion and the second to fail it. Let’s start the task:

rally task start test-sla.json

After the task completes, run rally task sla_check to check the results against the success criteria you defined in the task:

$ rally task sla_check
+-----------------------+-----+--------------+--------+-------------------------------------------------------------------------------------------------------+
| subtask               | pos | criterion    | status | detail                                                                                                |
+-----------------------+-----+--------------+--------+-------------------------------------------------------------------------------------------------------+
| Dummy.dummy           | 0   | failure_rate | PASS   | Maximum failure rate percent 0.0% failures, minimum failure rate percent 0% failures, actually 0.0%   |
| Dummy.dummy_exception | 0   | failure_rate | FAIL   | Maximum failure rate percent 0.0% failures, minimum failure rate percent 0% failures, actually 100.0% |
+-----------------------+-----+--------------+--------+-------------------------------------------------------------------------------------------------------+

Exactly as expected.
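If you want to act on these results in a script, for example to fail a CI job, one option is to parse the table. The Python sketch below assumes the column layout shown above (subtask, pos, criterion, status, detail) and that no cell contains a "|" character; parse_sla_table is a hypothetical helper, not part of Rally, and the detail column of the sample is shortened for readability.

```python
def parse_sla_table(text):
    """Parse an ASCII table like the one above into a list of row dicts."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the command line, blank lines and +---+ borders
        cells = [c.strip() for c in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first "|" line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


# Sample output with the detail column shortened for this sketch.
sample = """
+-----------------------+-----+--------------+--------+-----------------+
| subtask               | pos | criterion    | status | detail          |
+-----------------------+-----+--------------+--------+-----------------+
| Dummy.dummy           | 0   | failure_rate | PASS   | actually 0.0%   |
| Dummy.dummy_exception | 0   | failure_rate | FAIL   | actually 100.0% |
+-----------------------+-----+--------------+--------+-----------------+
"""

rows = parse_sla_table(sample)
failed = [r["subtask"] for r in rows if r["status"] != "PASS"]
print(failed)  # ['Dummy.dummy_exception']
```

A script could then exit with a non-zero status whenever the failed list is not empty.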

SLA in task report

SLA checks are nicely visualized in task reports. Generate one:

rally task report --out=report_sla.html --open

Subtasks that have passed their SLA have a green check mark on the overview page:

../../_images/Report-SLA-Overview.png

Somewhat more detailed information about SLA is displayed on the subtask pages:

../../_images/Report-SLA-Scenario.png

Success criteria are a very useful concept: they let you not only analyze the outcome of your tasks but also control their execution. In one of the next sections of this tutorial, we will show how to use SLA to abort load generation before something goes wrong with your OpenStack cloud.