Extension chaosazure

Version 0.3.1
Repository https://github.com/chaostoolkit-incubator/chaostoolkit-azure

This project is a collection of actions and probes, gathered as an extension to the Chaos Toolkit. It targets the Microsoft Azure platform.

Install

This package requires Python 3.5+

To be used from your experiment, this package must be installed in the Python environment where chaostoolkit already lives.

$ pip install -U chaostoolkit-azure

Usage

To use the probes and actions from this package, add the following to your experiment file:

{
    "type": "action",
    "name": "start-service-fabric-chaos",
    "provider": {
        "type": "python",
        "module": "chaosazure.fabric.actions",
        "func": "start_chaos",
        "secrets": ["azure"],
        "arguments": {
            "parameters": {
                "TimeToRunInSeconds": 45
            }
        }
    }
},
{
    "type": "action",
    "name": "stop-service-fabric-chaos",
    "provider": {
        "type": "python",
        "module": "chaosazure.fabric.actions",
        "func": "stop_chaos",
        "secrets": ["azure"]
    }
}

That’s it!

Please explore the code to see existing probes and actions.
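
If you generate experiment files from a script, the two activities above can be built with a small helper and serialized to JSON. This is only a sketch: python_activity is a hypothetical helper and not part of chaosazure, and the module path chaosazure.fabric.actions is the one listed under Exported Activities.

```python
import json
from typing import Any, Dict, List, Optional


def python_activity(kind: str, name: str, module: str, func: str,
                    secrets: Optional[List[str]] = None,
                    arguments: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build one activity ("action" or "probe") that uses the python provider."""
    provider = {"type": "python", "module": module, "func": func}
    # Optional keys are only emitted when they carry a value.
    if secrets:
        provider["secrets"] = secrets
    if arguments:
        provider["arguments"] = arguments
    return {"type": kind, "name": name, "provider": provider}


# Hypothetical helper usage; the names "start-chaos" and "stop-chaos" are arbitrary.
start = python_activity(
    "action", "start-chaos",
    module="chaosazure.fabric.actions", func="start_chaos",
    secrets=["azure"],
    arguments={"parameters": {"TimeToRunInSeconds": 45}},
)
stop = python_activity(
    "action", "stop-chaos",
    module="chaosazure.fabric.actions", func="stop_chaos",
    secrets=["azure"],
)

# The resulting list can be placed under the "method" key of an experiment.
print(json.dumps([start, stop], indent=4))
```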

Configuration

Credentials

This extension uses the requests and Azure SDK libraries under the hood. The requests library expects a PFX certificate, converted to the PEM format, that allows you to authenticate with the Service Fabric endpoint.

Generally speaking, there are two ways of doing this:

  • you have created a configuration file on the machine where you will run the experiment (a ~/.sfctl/config file)
  • you explicitly pass the configuration and secrets in the experiment definition as follows:

    Configuration section:

    {
        "endpoint": "https://XYZ.westus.cloudapp.azure.com:19080",
        "verify_tls": false,
        "use_ca": false
    }
    

    Secrets section:

    {
        "azure": {
            "security": "pem",
            "pem_path": "./cluster-client-cert.pem"
        }
    }
    

    The PEM can also be passed as an environment variable:

    {
        "azure": {
            "security": "pem",
            "pem_content": {
                "type": "env",
                "key": "AZURE_PEM"
            }
        }
    }
    

    The environment variable name can be anything.
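
To make the meaning of the "type": "env" declaration concrete, here is a minimal sketch of how such a value maps to an environment variable. resolve_secret is a hypothetical function written for illustration; it is not Chaos Toolkit's own loader, and the PEM text below is a placeholder.

```python
import os
from typing import Any, Mapping


def resolve_secret(value: Any, environ: Mapping[str, str] = os.environ) -> Any:
    """Illustrative only: swap {"type": "env", "key": NAME} for environ[NAME]."""
    if isinstance(value, dict) and value.get("type") == "env":
        # Raises KeyError when the variable is not set, which surfaces
        # a missing certificate early.
        return environ[value["key"]]
    # Plain values such as "pem" are returned unchanged.
    return value


secrets = {
    "azure": {
        "security": "pem",
        "pem_content": {"type": "env", "key": "AZURE_PEM"},
    }
}

# A stand-in for the real process environment, with placeholder content.
fake_environ = {"AZURE_PEM": "-----BEGIN CERTIFICATE-----\nplaceholder\n-----END CERTIFICATE-----"}

resolved = {key: resolve_secret(value, fake_environ)
            for key, value in secrets["azure"].items()}
print(resolved["security"])
print(resolved["pem_content"].splitlines()[0])
```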

The Azure SDK library, on the other hand, expects that you have set up a service principal and provide its credentials. With those credentials you can authenticate with the Azure infrastructure and run chaos against resources such as virtual machines.

There are two ways of doing this:

  • you can either pass the names of the environment variables to the experiment definition as follows (recommended):

    {
        "azure": {
            "client_id": "AZURE_CLIENT_ID",
            "client_secret": "AZURE_CLIENT_SECRET",
            "tenant_id": "AZURE_TENANT_ID"
        }
    }
    
  • or you inject the secrets explicitly into the experiment definition:

    {
        "azure": {
            "client_id": "your-super-secret-client-id",
            "client_secret": "your-even-more-super-secret-client-secret",
            "tenant_id": "your-tenant-id"
        }
    }
    

    Additionally you need to provide the Azure subscription id.

    {
        "azure": {
            "subscription_id": "your-azure-subscription-id"
        }
    }
    

Putting it all together

Here is a full example:

{
    "version": "1.0.0",
    "title": "...",
    "description": "...",
    "configuration": {
        "endpoint": "https://XYZ.westus.cloudapp.azure.com:19080",
        "verify_tls": false,
        "use_ca": false
    },
    "secrets": {
        "azure": {
            "security": "pem",
            "pem_path": "./cluster-client-cert.pem"
        }
    },
    "steady-state-hypothesis": {
        "title": "Service is healthy",
        "probes": [
            {
                "type": "probe",
                "name": "application-must-respond",
                "tolerance": 200,
                "provider": {
                    "type": "http",
                    "verify_tls": false,
                    "url": "https://some-url-in-cluster/"
                }
            }
        ]
    },
    "method": [
        {
            "type": "action",
            "name": "start-service-fabric-chaos",
            "provider": {
                "type": "python",
                "module": "chaosazure.fabric.actions",
                "func": "start_chaos",
                "secrets": ["azure"],
                "arguments": {
                    "parameters": {
                        "TimeToRunInSeconds": 45
                    }
                }
            },
            "pauses": {
                "after": 30
            }
        },
        {
            "type": "probe",
            "ref": "application-must-respond"
        },
        {
            "type": "action",
            "name": "stop-service-fabric-chaos",
            "provider": {
                "type": "python",
                "module": "chaosazure.fabric.actions",
                "func": "stop_chaos",
                "secrets": ["azure"]
            },
            "pauses": {
                "after": 5
            }
        },
        {
            "type": "probe",
            "name": "get-service-fabric-chaos-report",
            "provider": {
                "type": "python",
                "module": "chaosazure.fabric.probes",
                "func": "chaos_report",
                "secrets": ["azure"],
                "arguments": {
                    "start_time_utc": "1 minute ago",
                    "end_time_utc": "now"
                }
            }
        }
    ]
}
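
Before running an experiment like the one above, a quick structural check can catch a mistyped ref or an incomplete provider. The following is a minimal sketch, not a replacement for Chaos Toolkit's own validation. check_experiment is a hypothetical helper, and it assumes a ref must name an activity declared in the steady-state hypothesis or earlier in the method.

```python
from typing import Any, Dict, List


def check_experiment(experiment: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    hypothesis = experiment.get("steady-state-hypothesis", {})
    # Names that a later {"ref": ...} entry may point to (assumption, see above).
    known = {p["name"] for p in hypothesis.get("probes", []) if "name" in p}

    for index, activity in enumerate(experiment.get("method", [])):
        where = "method[{}]".format(index)
        if "ref" in activity:
            if activity["ref"] not in known:
                problems.append("{}: unknown ref {!r}".format(where, activity["ref"]))
            continue
        provider = activity.get("provider", {})
        if provider.get("type") == "python":
            # A python provider needs both a module and a function name.
            for key in ("module", "func"):
                if key not in provider:
                    problems.append("{}: python provider lacks {!r}".format(where, key))
        if "name" in activity:
            known.add(activity["name"])
    return problems


experiment = {
    "steady-state-hypothesis": {
        "probes": [{"type": "probe", "name": "application-must-respond"}]
    },
    "method": [
        {"type": "probe", "ref": "application-must-respond"},
        {"type": "probe", "ref": "no-such-probe"},
        {"type": "action", "name": "stop-chaos",
         "provider": {"type": "python", "module": "chaosazure.fabric.actions"}},
    ],
}
for problem in check_experiment(experiment):
    print(problem)
```

To check a real file, load it with json.load and pass the resulting dictionary to the function.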

Contribute

If you wish to contribute more functions to this package, you are more than welcome to do so. Please fork this project, make your changes following the usual PEP 8 code style, add tests, and submit a PR for review.

The Chaos Toolkit projects require that all contributors sign a Developer Certificate of Origin on each commit they would like to merge into the master branch of the repository. Please make sure you can abide by the rules of the DCO before submitting a PR.

Develop

If you wish to develop on this project, make sure to install the development dependencies. But first, create a virtual environment and then install those dependencies.

$ pip install -r requirements-dev.txt -r requirements.txt 

Then, point your environment to this directory:

$ python setup.py develop

Now you can edit the files and they will automatically be seen by your environment, even when running from the chaos command locally.

Test

To run the tests for the project execute the following:

$ pytest

Exported Activities

aks


delete_node

Type action
Module chaosazure.aks.actions
Name delete_node
Return None

Delete a node at random from a managed Azure Kubernetes Service.

Be aware: Deleting a node is an invasive action. You will not be able to recover the node once you have deleted it.

Parameters

filter : str Filter the managed AKS. If the filter is omitted, all AKS in the subscription will be selected as potential chaos candidates. Filtering example: 'where resourceGroup=="myresourcegroup" and name=="myresourcename"'

Signature:

def delete_node(filter: str = None,
                configuration: Dict[str, Dict[str, str]] = None,
                secrets: Dict[str, Dict[str, str]] = None):
    pass

Arguments:

Name Type Default Required
filter string null No

Usage:

{
  "provider": {
    "module": "chaosazure.aks.actions",
    "func": "delete_node",
    "type": "python"
  },
  "type": "action",
  "name": "delete-node"
}
name: delete-node
provider:
  func: delete_node
  module: chaosazure.aks.actions
  type: python
type: action
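
The usage above omits the optional filter argument. As a sketch of how it could be supplied, the helper below builds a filter string in the shape of the documented example and places it under the provider's arguments, which Chaos Toolkit passes to the function as keyword arguments. build_filter is a hypothetical helper, and the use of == for both comparisons is an assumption about the query syntax.

```python
import json
from typing import Optional


def build_filter(resource_group: str, name: Optional[str] = None) -> str:
    """Build a filter such as: where resourceGroup=="rg" and name=="res"."""
    clauses = ['resourceGroup=="{}"'.format(resource_group)]
    if name is not None:
        clauses.append('name=="{}"'.format(name))
    return "where " + " and ".join(clauses)


query = build_filter("myresourcegroup", "myresourcename")

# The same activity as in the usage above, narrowed to one named resource.
activity = {
    "type": "action",
    "name": "delete-node",
    "provider": {
        "type": "python",
        "module": "chaosazure.aks.actions",
        "func": "delete_node",
        "arguments": {"filter": query},
    },
}
print(json.dumps(activity, indent=2))
```

The same arguments block applies to the other activities in this reference that accept a filter.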

restart_node

Type action
Module chaosazure.aks.actions
Name restart_node
Return None

Restart a node at random from a managed Azure Kubernetes Service.

Parameters

filter : str Filter the managed AKS. If the filter is omitted, all AKS in the subscription will be selected as potential chaos candidates. Filtering example: 'where resourceGroup=="myresourcegroup" and name=="myresourcename"'

Signature:

def restart_node(filter: str = None,
                 configuration: Dict[str, Dict[str, str]] = None,
                 secrets: Dict[str, Dict[str, str]] = None):
    pass

Arguments:

Name Type Default Required
filter string null No

Usage:

{
  "provider": {
    "module": "chaosazure.aks.actions",
    "func": "restart_node",
    "type": "python"
  },
  "type": "action",
  "name": "restart-node"
}
name: restart-node
provider:
  func: restart_node
  module: chaosazure.aks.actions
  type: python
type: action

stop_node

Type action
Module chaosazure.aks.actions
Name stop_node
Return None

Stop a node at random from a managed Azure Kubernetes Service.

Parameters

filter : str Filter the managed AKS. If the filter is omitted, all AKS in the subscription will be selected as potential chaos candidates. Filtering example: 'where resourceGroup=="myresourcegroup" and name=="myresourcename"'

Signature:

def stop_node(filter: str = None,
              configuration: Dict[str, Dict[str, str]] = None,
              secrets: Dict[str, Dict[str, str]] = None):
    pass

Arguments:

Name Type Default Required
filter string null No

Usage:

{
  "provider": {
    "module": "chaosazure.aks.actions",
    "func": "stop_node",
    "type": "python"
  },
  "type": "action",
  "name": "stop-node"
}
name: stop-node
provider:
  func: stop_node
  module: chaosazure.aks.actions
  type: python
type: action

machine


count_machines

Type probe
Module chaosazure.machine.probes
Name count_machines
Return integer

Return count of Azure virtual machines.

Parameters

filter : str Filter the virtual machines. If the filter is omitted, all machines in the subscription will be selected for the probe. Filtering example: 'where resourceGroup=="myresourcegroup" and name=="myresourcename"'

Signature:

def count_machines(filter: str = None,
                   configuration: Dict[str, Dict[str, str]] = None,
                   secrets: Dict[str, Dict[str, str]] = None) -> int:
    pass

Arguments:

Name Type Default Required
filter string null No

Usage:

{
  "provider": {
    "module": "chaosazure.machine.probes",
    "func": "count_machines",
    "type": "python"
  },
  "type": "probe",
  "name": "count-machines"
}
name: count-machines
provider:
  func: count_machines
  module: chaosazure.machine.probes
  type: python
type: probe

delete_machine

Type action
Module chaosazure.machine.actions
Name delete_machine
Return None

Delete a virtual machine at random.

Be aware: Deleting a machine is an invasive action. You will not be able to recover the machine once you have deleted it.

Parameters

filter : str Filter the virtual machines. If the filter is omitted, all machines in the subscription will be selected as potential chaos candidates. Filtering example: 'where resourceGroup=="myresourcegroup" and name=="myresourcename"'

Signature:

def delete_machine(filter: str = None,
                   configuration: Dict[str, Dict[str, str]] = None,
                   secrets: Dict[str, Dict[str, str]] = None):
    pass

Arguments:

Name Type Default Required
filter string null No

Usage:

{
  "provider": {
    "module": "chaosazure.machine.actions",
    "func": "delete_machine",
    "type": "python"
  },
  "type": "action",
  "name": "delete-machine"
}
name: delete-machine
provider:
  func: delete_machine
  module: chaosazure.machine.actions
  type: python
type: action

describe_machines

Type probe
Module chaosazure.machine.probes
Name describe_machines
Return None

Describe Azure virtual machines.

Parameters

filter : str Filter the virtual machines. If the filter is omitted, all machines in the subscription will be selected for the probe. Filtering example: 'where resourceGroup=="myresourcegroup" and name=="myresourcename"'

Signature:

def describe_machines(filter: str = None,
                      configuration: Dict[str, Dict[str, str]] = None,
                      secrets: Dict[str, Dict[str, str]] = None):
    pass

Arguments:

Name Type Default Required
filter string null No

Usage:

{
  "provider": {
    "module": "chaosazure.machine.probes",
    "func": "describe_machines",
    "type": "python"
  },
  "type": "probe",
  "name": "describe-machines"
}
name: describe-machines
provider:
  func: describe_machines
  module: chaosazure.machine.probes
  type: python
type: probe

restart_machine

Type action
Module chaosazure.machine.actions
Name restart_machine
Return None

Restart a virtual machine at random.

Parameters

filter : str Filter the virtual machines. If the filter is omitted, all machines in the subscription will be selected as potential chaos candidates. Filtering example: 'where resourceGroup=="myresourcegroup" and name=="myresourcename"'

Signature:

def restart_machine(filter: str = None,
                    configuration: Dict[str, Dict[str, str]] = None,
                    secrets: Dict[str, Dict[str, str]] = None):
    pass

Arguments:

Name Type Default Required
filter string null No

Usage:

{
  "provider": {
    "module": "chaosazure.machine.actions",
    "func": "restart_machine",
    "type": "python"
  },
  "type": "action",
  "name": "restart-machine"
}
name: restart-machine
provider:
  func: restart_machine
  module: chaosazure.machine.actions
  type: python
type: action

stop_machine

Type action
Module chaosazure.machine.actions
Name stop_machine
Return None

Stop a virtual machine at random.

Parameters

filter : str Filter the virtual machines. If the filter is omitted, all machines in the subscription will be selected as potential chaos candidates. Filtering example: 'where resourceGroup=="myresourcegroup" and name=="myresourcename"'

Signature:

def stop_machine(filter: str = None,
                 configuration: Dict[str, Dict[str, str]] = None,
                 secrets: Dict[str, Dict[str, str]] = None):
    pass

Arguments:

Name Type Default Required
filter string null No

Usage:

{
  "provider": {
    "module": "chaosazure.machine.actions",
    "func": "stop_machine",
    "type": "python"
  },
  "type": "action",
  "name": "stop-machine"
}
name: stop-machine
provider:
  func: stop_machine
  module: chaosazure.machine.actions
  type: python
type: action

vmss


deallocate_vmss

Type action
Module chaosazure.vmss.actions
Name deallocate_vmss
Return None

Deallocate a virtual machine scale set instance at random.

Parameters

filter : str Filter the virtual machine scale set. If the filter is omitted, all virtual machine scale sets in the subscription will be selected as potential chaos candidates. Filtering example: 'where resourceGroup=="myresourcegroup" and name=="myresourcename"'

Signature:

def deallocate_vmss(filter: str = None,
                    configuration: Dict[str, Dict[str, str]] = None,
                    secrets: Dict[str, Dict[str, str]] = None):
    pass

Arguments:

Name Type Default Required
filter string null No

Usage:

{
  "provider": {
    "module": "chaosazure.vmss.actions",
    "func": "deallocate_vmss",
    "type": "python"
  },
  "type": "action",
  "name": "deallocate-vmss"
}
name: deallocate-vmss
provider:
  func: deallocate_vmss
  module: chaosazure.vmss.actions
  type: python
type: action

delete_vmss

Type action
Module chaosazure.vmss.actions
Name delete_vmss
Return None

Delete a virtual machine scale set instance at random.

Be aware: Deleting a VMSS instance is an invasive action. You will not be able to recover the VMSS instance once you have deleted it.

Parameters

filter : str Filter the virtual machine scale set. If the filter is omitted, all virtual machine scale sets in the subscription will be selected as potential chaos candidates. Filtering example: 'where resourceGroup=="myresourcegroup" and name=="myresourcename"'

Signature:

def delete_vmss(filter: str = None,
                configuration: Dict[str, Dict[str, str]] = None,
                secrets: Dict[str, Dict[str, str]] = None):
    pass

Arguments:

Name Type Default Required
filter string null No

Usage:

{
  "provider": {
    "module": "chaosazure.vmss.actions",
    "func": "delete_vmss",
    "type": "python"
  },
  "type": "action",
  "name": "delete-vmss"
}
name: delete-vmss
provider:
  func: delete_vmss
  module: chaosazure.vmss.actions
  type: python
type: action

restart_vmss

Type action
Module chaosazure.vmss.actions
Name restart_vmss
Return None

Restart a virtual machine scale set instance at random.

Parameters

filter : str Filter the virtual machine scale set. If the filter is omitted, all virtual machine scale sets in the subscription will be selected as potential chaos candidates. Filtering example: 'where resourceGroup=="myresourcegroup" and name=="myresourcename"'

Signature:

def restart_vmss(filter: str = None,
                 configuration: Dict[str, Dict[str, str]] = None,
                 secrets: Dict[str, Dict[str, str]] = None):
    pass

Arguments:

Name Type Default Required
filter string null No

Usage:

{
  "provider": {
    "module": "chaosazure.vmss.actions",
    "func": "restart_vmss",
    "type": "python"
  },
  "type": "action",
  "name": "restart-vmss"
}
name: restart-vmss
provider:
  func: restart_vmss
  module: chaosazure.vmss.actions
  type: python
type: action

stop_vmss

Type action
Module chaosazure.vmss.actions
Name stop_vmss
Return None

Stop a virtual machine scale set instance at random.

Parameters

filter : str Filter the virtual machine scale set. If the filter is omitted, all virtual machine scale sets in the subscription will be selected as potential chaos candidates. Filtering example: 'where resourceGroup=="myresourcegroup" and name=="myresourcename"'

Signature:

def stop_vmss(filter: str = None,
              configuration: Dict[str, Dict[str, str]] = None,
              secrets: Dict[str, Dict[str, str]] = None):
    pass

Arguments:

Name Type Default Required
filter string null No

Usage:

{
  "provider": {
    "module": "chaosazure.vmss.actions",
    "func": "stop_vmss",
    "type": "python"
  },
  "type": "action",
  "name": "stop-vmss"
}
name: stop-vmss
provider:
  func: stop_vmss
  module: chaosazure.vmss.actions
  type: python
type: action

fabric


chaos_report

Type probe
Module chaosazure.fabric.probes
Name chaos_report
Return mapping

Get the Chaos report using the following Service Fabric API:

https://docs.microsoft.com/en-us/rest/api/servicefabric/sfclient-v60-model-chaosparameters

Please see the help for chaosazure.fabric.auth for more information on authenticating with the service.

Signature:

def chaos_report(timeout: int = 60,
                 start_time_utc: str = None,
                 end_time_utc: str = None,
                 configuration: Dict[str, Dict[str, str]] = None,
                 secrets: Dict[str, Dict[str, str]] = None) -> Dict[str, Any]:
    pass

Arguments:

Name Type Default Required
timeout integer 60 No
start_time_utc string null No
end_time_utc string null No

Usage:

{
  "provider": {
    "module": "chaosazure.fabric.probes",
    "func": "chaos_report",
    "type": "python"
  },
  "type": "probe",
  "name": "chaos-report"
}
name: chaos-report
provider:
  func: chaos_report
  module: chaosazure.fabric.probes
  type: python
type: probe

start_chaos

Type action
Module chaosazure.fabric.actions
Name start_chaos
Return mapping

Start Chaos in your cluster using the given parameters. This is a mapping of keys as declared in the Service Fabric API:

https://docs.microsoft.com/en-us/rest/api/servicefabric/sfclient-v60-model-chaosparameters

Please see the help for chaosazure.fabric.auth for more information on authenticating with the service.

Signature:

def start_chaos(parameters: Dict[str, Any],
                timeout: int = 60,
                configuration: Dict[str, Dict[str, str]] = None,
                secrets: Dict[str, Dict[str, str]] = None) -> Dict[str, Any]:
    pass

Arguments:

Name Type Default Required
parameters mapping Yes
timeout integer 60 No

Usage:

{
  "provider": {
    "module": "chaosazure.fabric.actions",
    "arguments": {
      "parameters": {}
    },
    "func": "start_chaos",
    "type": "python"
  },
  "type": "action",
  "name": "start-chaos"
}
name: start-chaos
provider:
  arguments:
    parameters: {}
  func: start_chaos
  module: chaosazure.fabric.actions
  type: python
type: action
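
The usage above passes an empty parameters mapping. As a sketch of a populated one, the helper below assembles a few keys. TimeToRunInSeconds comes from the examples earlier in this document; MaxConcurrentFaults and WaitTimeBetweenFaultsInSeconds are taken from a reading of the ChaosParameters model linked above and should be checked against that page. chaos_parameters is a hypothetical helper, and the default values are illustrative only.

```python
from typing import Any, Dict


def chaos_parameters(time_to_run: int,
                     max_concurrent_faults: int = 1,
                     wait_between_faults: int = 20) -> Dict[str, Any]:
    """Assemble a parameters mapping for start_chaos (illustrative values)."""
    if time_to_run <= 0:
        raise ValueError("time_to_run must be a positive number of seconds")
    return {
        "TimeToRunInSeconds": time_to_run,
        # The two keys below are assumptions to verify against the API page.
        "MaxConcurrentFaults": max_concurrent_faults,
        "WaitTimeBetweenFaultsInSeconds": wait_between_faults,
    }


params = chaos_parameters(45)

# The mapping goes under provider.arguments.parameters in the activity.
activity = {
    "type": "action",
    "name": "start-chaos",
    "provider": {
        "type": "python",
        "module": "chaosazure.fabric.actions",
        "func": "start_chaos",
        "secrets": ["azure"],
        "arguments": {"parameters": params},
    },
}
print(activity["provider"]["arguments"])
```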

stop_chaos

Type action
Module chaosazure.fabric.actions
Name stop_chaos
Return mapping

Stop Chaos in your cluster.

Please see the help for chaosazure.fabric.auth for more information on authenticating with the service.

Signature:

def stop_chaos(timeout: int = 60,
               configuration: Dict[str, Dict[str, str]] = None,
               secrets: Dict[str, Dict[str, str]] = None) -> Dict[str, Any]:
    pass

Arguments:

Name Type Default Required
timeout integer 60 No

Usage:

{
  "provider": {
    "module": "chaosazure.fabric.actions",
    "func": "stop_chaos",
    "type": "python"
  },
  "type": "action",
  "name": "stop-chaos"
}
name: stop-chaos
provider:
  func: stop_chaos
  module: chaosazure.fabric.actions
  type: python
type: action