Super Saltstack

Event driven orchestration and configuration management

Glynn Forrest

me@glynnforrest.com

About me

  • Musician, teacher, percussionist, drummer
  • Software developer and administrator
  • Maintainer of salt-mode for emacs

Topics covered

  • SaltStack defined
  • Execution
  • State and highstate
  • Orchestration
  • Events, reactor, and beacons
  • Salt cloud
  • More stuff (Salt has a LOT of functionality)
  • Getting started and tips
  • Stuff that sucks
  • Demo

High level overview

Configuration Management

Puppet (2005)

Chef (2009)

SaltStack (2011)

Ansible (2012)

Salt is a Python-based remote execution engine used for configuration management, orchestration, and automation

SaltStats

  • Started by Thomas Hatch in 2011, first release in March
  • Quickly adopted by LinkedIn to manage 10,000 servers
  • SaltStack Inc. founded in 2012
  • Open source - 8th most popular GitHub project in 2012
  • Most exciting project at OSCON 2013 (with Docker) and 2014 InfoWorld Technology of the Year
  • SaltStack Enterprise adds a GUI, integrations, and other features
  • ~444 execution modules, ~278 state modules

Salt started as an execution engine

salt-minion on every host

salt-master sends commands to minions


salt '*' test.ping


salt '*' file.copy /etc /opt/backups/etc recurse=True

Built for speed and scalability

ZeroMQ transport layer

Handle thousands of concurrent minion connections

Other transports available

Targeting


salt '*' test.ping


salt 'db*' service.restart mysql


salt -L 'office_neighbour,other_office_neighbour' cmd.run 'eject'


salt -E 'web1-(prod|stag)' disk.wipe /dev/sda1

Grains

Get underlying facts from hosts

Operating system, kernel version, IP address, hardware information

Set custom grains using files or Python code

Target minions using grains with -G


salt '*' grains.items
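
Custom grains can also be set statically; a minimal sketch of /etc/salt/grains, with hypothetical values:


# /etc/salt/grains (static custom grains; can also go under 'grains:' in the minion config)
role: cache
datacenter: ams1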

Targeting with grains


salt -G 'os:CentOS' test.ping


salt -G 'cpuarch:x86_64' grains.item num_cpus


salt -G 'ec2_tags:environment:*production*' system.poweroff

Managing state

Execution modules as building blocks

Define what you want, not how to do it

salt web1 service.status memcached                # is it running?
salt web1 service.start memcached                 # imperative: start it
salt web1 state.single service.running memcached  # declarative: ensure it is running

Define state in code


# /srv/salt/memcached.sls
memcached:             # id declaration
  pkg.installed:       # state function declaration
    - name: memcached  # function arg
  service.running:     # state function declaration
    - name: memcached  # function arg
    - require:         # requisite declaration
      - pkg: memcached # requisite arg


salt web1 state.sls memcached

Add templating


{% if grains['os'] == 'Red Star Linux' %}
  {% set pkg = 'best-memcached' %}
  {% set service = 'best-memcached' %}
{% else %}
  {% set pkg = 'memcached' %}
  {% set service = 'memcached' %}
{% endif %}

memcached:
  pkg.installed:
    - name: {{pkg}}
  service.running:
    - name: {{service}}
    - require:
      - pkg: memcached

Looping


{% for user in ['admin','deploy','qa'] %}
account_{{user}}:
  user.present:
    - name: {{user}}
{% endfor %}


account_admin:
  user.present:
    - name: admin

account_deploy:
  user.present:
    - name: deploy

account_qa:
  user.present:
    - name: qa

Managing files


/etc/postfix/main.cf:
  file.managed:
    - source: salt://postfix/main.cf.j2 # /srv/salt/postfix/main.cf.j2
    - template: jinja
    - defaults:
        listen_ipv4: 127.0.0.0/8


myhostname = {{ grains['fqdn'] }}
mydestination = {{ grains['fqdn'] }}, localhost.{{ grains['domain'] }}, localhost
mynetworks = {{listen_ipv4}} [::ffff:127.0.0.0]/104 [::1]/128

Access to grains and passed-in variables


myhostname = mail.example.com
mydestination = mail.example.com, localhost.example.com, localhost
mynetworks = 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128

Pillar

Private, minion-specific data, only accessible to the minion

Trusted, comes from the master (grains come from the minion)

Fetch pillar data from a variety of sources


# /srv/pillar/mail-server/postfix.sls
postfix:
  listen_ipv4: 127.0.0.0/16
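
For minions to receive this data, a pillar top file maps pillar sls files to targets; a minimal sketch, assuming the mail servers match 'mail*':


# /srv/pillar/top.sls
base:
  'mail*':
    - mail-server.postfix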


/etc/postfix/main.cf:
  file.managed:
    - source: salt://postfix/main.cf.j2 # /srv/salt/postfix/main.cf.j2
    - template: jinja


myhostname = {{ grains['fqdn'] }}
mydestination = {{ grains['fqdn'] }}, localhost.{{ grains['domain'] }}, localhost
mynetworks = {{pillar['postfix']['listen_ipv4']}} [::ffff:127.0.0.0]/104 [::1]/128


myhostname = mail.example.com
mydestination = mail.example.com, localhost.example.com, localhost
mynetworks = 127.0.0.0/16 [::ffff:127.0.0.0]/104 [::1]/128

Multiple states

We can now manage the state of minions


salt -G 'role:cache' state.sls memcached
salt 'web*' state.sls nginx
salt 'lb*' state.sls haproxy
salt '*' state.sls common,ntp,salt_minion

Define all of this in a catalogue?

...enter highstate

Highstate


# /srv/salt/top.sls
base:        # environment
  '*':       # target
    - common # state.sls
    - ntp
    - salt_minion
  'lb*':
    - haproxy
  'web*':
    - nginx
  'role:cache':
    - match: grain
    - memcached


salt '*' state.highstate

Orchestration

Run a sequence of coordinated steps across many minions

Executed on the master with salt-run


# /srv/salt/orch/provision_cluster.sls

bootstrap_servers:
  salt.function:
  - name: cmd.run
  - tgt: 10.0.0.0/24
  - tgt_type: ipcidr        # target minions in a subnet
  - arg:
    - bootstrap

storage_setup:
  salt.state:
  - tgt: 'role:storage'
  - tgt_type: grain
  - sls: ceph
  - require:                # requisite
    - salt: webserver_setup # run after this id succeeds

webserver_setup:
  salt.state:
  - tgt: 'web*'
  - highstate: True


# state.orch is an alias for state.orchestrate
salt-run state.orch orch.provision_cluster

salt-run state.orchestrate orch.provision_cluster

Events, the salt firehose

Everything in SaltStack sends an event


salt-run state.event pretty=True


salt/job/20170207181552284907/ret/super-saltstack-web   {
  "_stamp": "2017-02-07T18:15:52.399668",
  "cmd": "_return",
  "fun": "test.ping",
  "fun_args": [],
  "id": "super-saltstack-web",
  "jid": "20170207181552284907",
  "retcode": 0,
  "return": true,
  "success": true
}

Send custom events


salt 'lb*' event.send my-org/something/custom foo=bar


my-org/something/custom {
  "_stamp": "2017-02-07T18:17:29.535405",
  "cmd": "_minion_event",
  "data": {
    ...
    "foo": "bar"
  },
  "id": "lb1",
  "pretag": null,
  "tag": "my-org/something/custom"
}

my-org/something/custom {
  ...
  "id": "lb2",
}

Reactor

Run .sls files on a certain event


# /etc/salt/master or /etc/salt/master.d/reactor.conf
reactor:
  - 'salt/minion/*/start':
    - /srv/reactor/new_minion.sls
  - 'my-org/new-webserver':
    - /srv/reactor/new_webserver.sls


# /srv/reactor/new_minion.sls
highstate_on_start:       # id declaration
  cmd.state.highstate:    # function to run
    - tgt: {{data['id']}} # function argument, using data from the event


# /srv/reactor/new_webserver.sls
{% if data['id'] != 'web-test1' %}
open_iptables:
  cmd.state.sls:
    - arg:
      - iptables 
    - tgt: {{data['id']}}

reload_loadbalancers:
  cmd.state.sls:
    - arg:
      - haproxy
    - tgt: 'lb*'
{% endif %}

Reactor caveats

No access to grains or pillars inside reactor sls files

Rendering of sls files blocks the reactor, so avoid heavy function calls

No use of requisites, 'onlyif', and 'unless'

"The goal of a Reactor file is to process a Salt event as quickly as possible and then to optionally start a new process in response."

Common to have a reactor delegate to an orchestrate state
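
A sketch of that pattern, kicking off the provision_cluster orchestration from earlier (the exact runner argument syntax differs between Salt versions):


# /srv/reactor/provision.sls (hypothetical path)
run_provisioning:
  runner.state.orchestrate:
    - mods: orch.provision_cluster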

Beacons

Monitor non-salt processes using the event system

File system changes

Service status

User logins

Network usage
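
Beacons are configured on the minion; a sketch of an inotify beacon that would produce events like the one below (the config format varies slightly between Salt versions):


# /etc/salt/minion.d/beacons.conf
beacons:
  inotify:
    /etc/passwd:
      mask:
        - modify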


salt/beacon/super-saltstack-master/inotify//etc/passwd       {
  "_stamp": "2017-02-13T13:18:35.938072",
  "change": "IN_MODIFY",
  "id": "super-saltstack-master",
  "path": "/etc/passwd"
}

Beacons and reactors

When /etc/mysql/mysql.conf is modified, run mysql.sls on that minion, kill the process that made the change, and send a slack message with the event payload

When a new SSL certificate is added to a bastion server, copy it to the salt master and run an orchestration state to distribute the new certificate to the load balancers and reload them

When sshd stops running, attempt to restart it
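
A sketch of the first scenario, wiring the beacon's event tag to a reactor sls (only the state run is shown; the process kill and Slack notification are left out):


# /etc/salt/master.d/reactor.conf
reactor:
  - 'salt/beacon/*/inotify//etc/mysql/mysql.conf':
    - /srv/reactor/mysql_changed.sls

# /srv/reactor/mysql_changed.sls (hypothetical path)
apply_mysql_state:
  cmd.state.sls:
    - tgt: {{ data['id'] }}
    - arg:
      - mysql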

Salt-cloud

Manage resources on cloud hosts and hypervisors with salt-cloud

Configured with providers and profiles

Create new minions with these profiles, pre-configured to connect to a salt-master

Providers

Public and private clouds: EC2, GCE, Azure, OpenStack, Linode, DigitalOcean, etc.

VMware, LXC, Proxmox, Virtualbox

Salt uses these providers' APIs to provision instances, run actions on them, and execute functions

Capabilities vary

Configure providers


# /etc/salt/cloud.providers or /etc/salt/cloud.providers.d/*.conf

ec2_workers:                 # provider name
  driver: ec2                # ec2, vmware, gce, kvm, etc
  id: 'HJGRYCILJLKJYG'       # EC2 access credentials
  key: 'kdjgfsgm;woormgl/aserigjksjdhasdfgn'
  private_key: /etc/salt/aws_key.pem
  keyname: aws_key
  securitygroup: default
  location: ap-southeast-1   # salt-cloud --list-locations ec2_workers
  availability_zone: ap-southeast-1b
  minion:
    master: salt.example.com # location of the master for new minions

Configure profiles


# /etc/salt/cloud.profiles or /etc/salt/cloud.profiles.d/*.conf

micro_worker:             # profile name
    provider: ec2_workers # name of a configured provider
    image: ami-d514f291
    size: t1.micro

medium_worker:
    provider: ec2_workers
    image: ami-d514f291
    size: m3.medium

Run instances


salt-cloud -p micro_worker worker1 worker2 worker3

# in parallel
salt-cloud -p micro_worker -P worker1 worker2 worker3


# destroy an instance
salt-cloud -d worker2

Actions and functions


salt-cloud -a reboot worker3

salt-cloud -a stop worker1 worker3 


# functions require the provider
salt-cloud -f list_nodes ec2_workers

salt-cloud -f show_image ec2_workers image=ami-d514f291

Capabilities vary

Maps

Instances defined in code


# /path/to/map.map
micro_worker: # profile name
    - worker1 # names of the minions
    - worker2
    - worker3
medium_worker:
    - mworker1
    - mworker2


salt-cloud -m /path/to/map.map

# in parallel
salt-cloud -m /path/to/map.map -P

Destroy machines in a map with --destroy


salt-cloud -m /path/to/map.map -d


The following virtual machines are set to be destroyed:
    worker1
    worker2
    worker3
    mworker1
    mworker2

Proceed? [N/y]

Enforce maps with --hard


salt-cloud -m /path/to/map.map -H

Very dangerous!


# /etc/salt/cloud

# hard maps are disabled by default and must be explicitly enabled
enable_hard_maps: True

Detailed maps


micro_worker:
    - worker1:
        minion:
            mine_functions:
                network.ip_addrs: [eth1]
            grains:
                role: worker
medium_worker:
    - mworker2:
        minion:
            log_level: garbage
            mine_functions:
                network.ip_addrs: [eth1]
            grains:
                role: mworker

Integrate salt-cloud with the reactor

Run highstate on a minion when it sends the salt/cloud/<minion_id>/created event

Create a new worker minion when cpu load on existing workers is above 85% for 5 minutes

Cloud functions available as execution modules, states, and runners

Caveat: only the presence of the minion is managed statefully, via the cloud.present state

(Lots) More stuff

Salt-api

Communicate with Salt over HTTP

  • rest_cherrypy - "a REST API for Salt"

  • rest_tornado - "a websockets add-on to saltnado"

  • rest_wsgi - "a minimalist REST API for Salt"

Hook into your CI/CD system, integrate with a chatbot, create custom dashboards
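
With rest_cherrypy running, it's plain HTTP; a rough sketch (hostname, credentials, and eauth backend are hypothetical):


# log in and get a token
curl -sk https://salt.example.com:8000/login \
  -H 'Accept: application/json' \
  -d username=api_user -d password=secret -d eauth=pam

# run test.ping on all minions using the token from /login
curl -sk https://salt.example.com:8000 \
  -H 'Accept: application/json' \
  -H 'X-Auth-Token: <token>' \
  -d client=local -d tgt='*' -d fun=test.ping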

Runner modules

Run functions on the master with salt-run

  • orchestration
  • salt-cloud
  • manage the reactor, fileserver, pillar
  • look up jobs
  • control LXC containers
  • launch Spacewalk
  • HTTP queries
  • many more
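
For example, using the job ID from the earlier event:


salt-run manage.status                         # which minions are up or down
salt-run jobs.list_jobs                        # recently run jobs
salt-run jobs.lookup_jid 20170207181552284907  # results of a specific job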

External pillar sources


cmd_json
cmd_yaml
cmd_yamlex
cobbler
confidant
consul_pillar
django_orm
ec2_pillar
etcd_pillar
file_tree
foreman
git_pillar
hg_pillar
hiera
http_yaml
libvirt
mongo
mysql
neutron
nodegroups
pepa
pillar_ldap
puppet
reclass_adapter
redismod
s3
sql_base
sqlcipher
sqlite3
stack
svn_pillar
varstack_pillar
vault
virtkey
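
External pillars are enabled in the master config; a sketch using git_pillar (the repository URL is hypothetical, and the exact git_pillar syntax depends on the Salt version):


# /etc/salt/master
ext_pillar:
  - git:
    - master https://github.com/my-org/pillar-data.git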

State renderers


cheetah
dson
genshi
gpg
hjson
jinja
json
json5
mako
msgpack
py  # completely custom code, just return dictionaries of state data 
pydsl
pyobjects
stateconf
wempy
yaml
yamlex

Default is yaml_jinja
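
The py renderer, for example, is a Python file with a run() function that returns ordinary state data; a minimal sketch mirroring the earlier memcached state:


#!py
# e.g. /srv/salt/memcached_py.sls: the memcached state expressed as Python
def run():
    return {
        'memcached': {
            'pkg.installed': [{'name': 'memcached'}],
            'service.running': [
                {'name': 'memcached'},
                {'require': [{'pkg': 'memcached'}]},
            ],
        }
    }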

Fileserver backends



azurefs
gitfs
hgfs
minionfs # minions push files to the master
roots    # defaults to /srv/salt and /srv/pillar 
s3fs
svnfs

Default is roots
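
Backends are enabled on the master; a sketch combining roots with gitfs (the backend is named 'git' in older releases and 'gitfs' in newer ones; the repository URL is hypothetical):


# /etc/salt/master
fileserver_backend:
  - roots
  - git

gitfs_remotes:
  - https://github.com/my-org/salt-states.git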

Salt mine

Publish the results of function calls between minions

e.g. Every web minion publishes the result of network.ip_addrs eth1, which is then used in the haproxy state for the load balancer minions
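
A sketch of that pattern: mine_functions on the web minions, and a mine.get call inside the haproxy template (file paths and the template snippet are illustrative):


# /etc/salt/minion.d/mine.conf on the web minions (or set via pillar)
mine_functions:
  network.ip_addrs:
    - eth1

# inside haproxy.conf.j2 on the load balancers
{% for server, addrs in salt['mine.get']('web*', 'network.ip_addrs').items() %}
    server {{ server }} {{ addrs[0] }}:80 check
{% endfor %}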

Returners

Publish job results to an external system

Databases, log aggregators, monitoring, chatrooms...
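
A returner can be chosen per command with --return; a sketch assuming a MySQL returner has already been configured on the minions:


salt '*' state.highstate --return mysql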

Scheduler

Use the internal job scheduler to periodically execute jobs

Supports cron syntax and a rich datetime definition
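
Schedules live in the master or minion config (or pillar); a sketch with one interval job and one cron job (the backup sls is hypothetical, and cron syntax needs the croniter library):


# /etc/salt/minion.d/schedule.conf
schedule:
  hourly_highstate:
    function: state.highstate
    hours: 1
  nightly_backup:
    function: state.sls
    args:
      - backup
    cron: '0 3 * * *'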

Transports

For when ZeroMQ isn't fast enough...

RAET - Reliable Asynchronous Event Transport

Also SSH and TCP

SSH is run 'masterless' in the style of Ansible
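
Switching the transport is a config setting on both master and minion; a sketch:


# /etc/salt/master and /etc/salt/minion
transport: tcp   # default is zeromq; raet needs extra dependencies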

SDB

Store and fetch arbitrary data from external systems, for values that don't belong in pillars or grains
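
SDB sources are defined as profiles in the master or minion config and referenced with sdb:// URIs; a sketch using the built-in env driver (the 'osenv' profile name is arbitrary):


# /etc/salt/minion
osenv:
  driver: env

# then, wherever a config or pillar value is accepted:
# api_key: sdb://osenv/MY_API_KEY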

Engines

Long-running external processes that leverage Salt

Can access Salt configs, execution modules, and runner modules

Executed in a separate process on a minion or master, and automatically restarted by Salt if it stops

e.g. Engine that reads messages from redis pubsub and sends reactor events

Proxy

Talk to devices that can't run a salt-minion

Scaling

salt-syndic runs an intermediate master that receives orders from a 'master of masters' and controls its own minions

Multi-master for high-availability

Always "hot" or failover

Quick start


curl -L https://bootstrap.saltstack.com > install_salt.sh
sh install_salt.sh

# install a master too
sh install_salt.sh -M

# install salt-cloud with no minion
sh install_salt.sh -L -N


salt-key --accept-all


salt \* test.ping

Master and minion configuration


# /etc/salt/master

file_roots:
    base:
        - /srv/salt

pillar_roots:
    base:
        - /srv/pillar



# /etc/salt/minion

master: salt
# master: 192.168.4.2


Organising code


/salt-codez/
├── pillar
│   ├── staging
│   │   ├── accounts.sls
│   │   └── servers.sls
│   ├── prod
│   │   ├── haproxy.sls
│   │   ├── iptables.sls
│   │   └── servers.sls
│   └── top.sls
├── reactor
│   └── new_minion.sls
├── salt
│   ├── _modules
│   │   └── custom_execution.py
│   ├── _states
│   │   └── custom_state.py
│   ├── apache
│   │   ├── apache2.conf.j2
│   │   ├── init.sls
│   │   ├── vhosts.sls
│   │   └── virtual_host.conf.j2
│   ├── haproxy
│   │   └── haproxy.conf.j2
│   └── top.sls
├── Vagrantfile
└── rsync_to_master.sh

Pushing salt code to the master

Edit files in /srv/{salt,pillar,reactor} directly (dev only!)

Test locally, rsync to master

Gitfs on master pulls states down from a stable branch

Gitfs with multiple masters in a dev/qa/staging/prod environment, with automatic promotion

Some useful salt tips

Run minion / master in the foreground


service salt-minion stop
salt-minion -l debug

service salt-master stop
salt-master -l info

Run functions on the minion


# on the master
salt cache1 state.sls memcached

# on the cache1 minion, showing logs in the foreground
salt-call state.sls memcached

salt-call uses the 'local' returner to echo output

Dry run of states


salt \* state.highstate test=True

salt \* state.sls risky_untested_state test=True pillar="{comment: 'please work'}"

Reduce output noise


salt \* state.highstate --state-output changes

Bad stuff

Lots of features; lots of bugs

High development speed and churn

Documentation (has improved a lot)

Terminology

Testing not mature

Weird return codes from the CLI, API client, and HTTP API

DEMO