Monday, July 18, 2016

Ansible and DigitalOcean: setting up

The playbook in my previous post gives an idea of what I do, but it won't get you working, and it misses out lots of configuration. Let's fill in some gaps by looking at the infrastructure round the playbook. But before that, some rationale so you can see some of the reasons for my decisions.

When I need to do precise work over and over again, I look around for a tool to help me do that work faster, more accurately and more repeatably*. 

Plenty of tools exist to help in setting up servers; here's Wikipedia's big list. Chef and Puppet are common choices. I've chosen Ansible over Chef or Puppet because it instructs servers over ssh, so it doesn't need client software installed on a server before it can communicate. I'm told that Ansible is easier to learn.

I've leant heavily on other guides – chiefly 

Crucial gaps were filled in by 

View them as definitive, and my account below as flaky.


Most of the following is done in the terminal of my MacBook under OS X 10.11.5 (El Capitan), as an admin user. I used Atom as my primary editor.

First, we'll need Ansible. I want to run it locally.

I used Homebrew to install Ansible, which gave me version 2.1.0.0:
brew install ansible

Irritatingly, there's a bug in the tool that enables Ansible's communication with DigitalOcean. While working on the playbooks, you may see the error NameError: name 'DoError' is not defined.

To get over the hump, I needed to downgrade dopy (DigitalOceanPYthon), the library that comes with Ansible, from 0.3.7 (the version I had) to 0.3.5:
sudo pip install 'dopy==0.3.5'


I set up a base directory for this project, deep in my home folder. I've not set any special permissions on the new directory. When I run an Ansible command, I run it from that directory, and I keep all my Ansible configuration, playbooks and templates there**.

In that base directory, I've put the following ansible.cfg file to instruct Ansible to use nearby files to read inventory and write logs.
[defaults]
inventory = ./hosts
log_path=./ansible.log

Please note: here, and below, I'm using ./ to indicate explicitly to the system (and to you, reader) that we're starting from whatever directory we're in – which is the base directory, most of the time.

The inventory file holds information about the servers you're asking Ansible to manage. I'm using Ansible locally, so my inventory file ./hosts looks like this: 
[local]
localhost ansible_connection=local

You might imagine that I'd need to put all my DigitalOcean servers into this inventory file. I can't, because I don't know their addresses until they've been made. We'll need to use Ansible's "in-memory inventory" instead.

So that's Ansible set up. Let's hope. 


Ansible has a concept of playbooks – short, readable files that basically link a list of hosts and a set of instructions about what to do with them. These playbooks are written in YAML – they're readable, but not as easy to build as to read. I've got a short posting coming on .yml files for Ansible. Ansible's example playbooks use roles – which is good for reuse, but harder to read.

My playbooks share three sets (currently) of configuration information. Each set is in its own file, so that I can make changes in one place only, and so that I can keep the information out of change control. They're all together in a ./vars directory.
./vars/sensitive.yml – the API key (which I don't want to share)
./vars/sshInfo.yml – the ssh information (which I want to share temporarily)
./vars/droplets.yml – the list of servers to build (which will include different configuration options).


Let's get the bits together to allow my setup to identify itself to DigitalOcean as a valid account owner, and to the servers as a valid controlling account.

The DigitalOcean API key will identify my account from any playbook that sets up (or destroys) servers. I got it from DigitalOcean - API Tokens. I don't want to share it with anyone, ever. If I do, I need to revoke it and get a new one. 
./vars/sensitive.yml looks like:
---
  sensitive:
    do_token: « 64 characters of hex signal that looks like noise but isn't. »
...

My ssh key information is used in any playbook that communicates with servers. If a server has the public key, and I have the private key, Ansible can log in over ssh without a password. 
./vars/sshInfo.yml looks like this:
---
  sshInfo:
    do_ssh_key_name: TestLabEuroSTAR2016
    local_private_ssh_key: ~/.ssh/TestLabEuroSTAR2016
...

This needs a bit of matching infrastructure, and a rationale. I want to have the option of sharing the public key with TestLab people, so it needs to be separate from my usual keys. That means I can't use my default keys: I need to make a key pair for this task, give it its own name, and specify it explicitly.

I made a custom-named key like this:
ssh-keygen -t rsa -f ~/.ssh/TestLabEuroSTAR2016
which builds two files in ~/.ssh/ for my key.
The public part is TestLabEuroSTAR2016.pub and the private part is TestLabEuroSTAR2016.
Note: this command will ask for a passphrase. Don't forget the passphrase; OS X (not Ansible) may ask you to enter it if you're re-using the key after a couple of days.

I uploaded the public part of the key to DigitalOcean (via: DigitalOcean - Settings ), so that DigitalOcean can put it on any new server. I gave the key a name on upload, and it's that name I'm using in do_ssh_key_name above. When Ansible sets up a new server, it will ask DigitalOcean to add this uploaded public key to the server as it's being made.

When my tool communicates with the new server to load and configure software, it will use the private key ~/.ssh/TestLabEuroSTAR2016 to get in via ssh. If I don't specify this in my playbook, ssh will quietly fall back to my default key (~/.ssh/id_rsa), and that won't get in.

While we're considering shared information, here's one more. I want to have a script that destroys the servers I set up (and only those servers), so I'll need to share information about those, too.

My ./vars/droplets.yml file looks like:
---
  droplets:
  - name: TestLab01
  - name: TestLab02
...
I can add more servers as I need them. I'm currently limited to 50 droplets. I expect that I'll add more details to each server (an indented list for each - name: line) as I differentiate my servers.
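As a rough sketch of where that could go (none of this is in my working files yet): each entry might carry its own size_id, a hypothetical extra key, and the make-droplets task could then ask for size_id={{ item.size_id | default('512mb') }} instead of a fixed size. The 1gb value is only an example.

```yaml
# ./vars/droplets.yml with a hypothetical per-droplet option (size_id) added
---
  droplets:
  - name: TestLab01
    size_id: 512mb   # same size the playbook currently hard-codes
  - name: TestLab02
    size_id: 1gb     # example of a server that differs from its neighbours
...
```

The default() filter means entries without a size_id would still get the 512mb size, so the existing two-line list would keep working unchanged.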

When I get my servers set up, I want each to be doing something that differentiates it from its neighbours. I'd also prefer not to be faffing around with IP addresses. I've set up a template to build a web page for each server. Each server's page will have its own name at the top of the page, and a list of named links to the others.
I've put my template at ./siteStuff/index.html, and it looks like:
 <!DOCTYPE html>  
 <html lang="en">  
 <head>  
  <meta charset="utf-8">  
  <meta http-equiv="X-UA-Compatible" content="IE=edge">  
  <meta name="viewport" content="width=device-width, initial-scale=1">  
  <title>Basic HTML Template</title>  
 </head>  
 <body>  
  <h1>James's TestLab stuff for {{WPL_server_info}}</h1>  
  <ul>  
  {% for item in otherServers %}  
   <li><a href="http://{{ item.droplet.ip_address }}">{{ item.droplet.name }}</a></li>  
  {% endfor %}  
  </ul>  
  <p>Ansible set up this index from a template</p>  
 </body>  
 </html>  

This is a Jinja2 template, and will make bare HTML. The set up index task in my playbook will generate an index.html file for each of the hosts we set up in the newServers group in the in-memory inventory. Look back at the playbook to see that I set up a bunch of host variables for those servers – the template substitutes the stuff in {{ double curly brackets }} with those host variables. It uses the server name (WPL_server_info, which came from the names in the list of desired droplets), then loops over otherServers to build a list of links to all the servers in the group. The playbook uploads the built page to the server.


When I run the makeDroplets.yml playbook with ansible-playbook makeDroplets.yml, I get plenty of information about what's happening. I won't paste it here. Occasionally, one of the post-server-creation steps fails – I may need to add a wait_for. However, Ansible's modules are meant to be idempotent, so if a step fails I can simply run the playbook again, and it should fill in the gaps without redoing what's already done.
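If I do add a wait_for, it might look something like this. It's a sketch I haven't run against this setup: it assumes the task sits in the first play (the one that runs on local), after the droplets are made, and that it reuses the droplet_details variable registered there. The delay and timeout values are guesses.

```yaml
  # Sketch only: belongs in the first play (hosts: local), after the add_host task.
  - name: wait for ssh to answer on each new droplet
    wait_for:
      host: "{{ item.droplet.ip_address }}"
      port: 22
      delay: 5        # seconds to wait before the first check (a guess)
      timeout: 300    # give up after five minutes (a guess)
      state: started
    with_items: "{{ droplet_details.results }}"
```

The idea is that the second play only starts once port 22 is open on every new server, which should cut down on those occasional failures.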

Be aware:
- Ansible and DigitalOcean take about a minute to set up each server.
- Each of these smallest-possible servers costs $0.007 an hour to run. Which is piddling, until you fire up 50 and forget to destroy them (that's $0.35 an hour, or $8.40 a day).

I can check my handiwork by browsing to one of the IP addresses. I hope to see a page with links to all my newly-minted servers – and when I click through, I hope to observe that the server name changes – and so does the IP address.




* And with my soul intact (but my yaks shaved).
** Here's an edited listing to give you an idea of the shape:

./ansible.cfg
./ansible.log
./hosts
./destroyDroplets.yml
./makeDroplets.yml
./siteStuff/index.html
./vars
./vars/sensitive.yml
./vars/sshInfo.yml
./vars/droplets.yml


Saturday, July 16, 2016

Using Ansible and DigitalOcean to provision TestLab servers

Here's an Ansible playbook that I use to spin up and provision DigitalOcean droplets.

There's a longer article to follow, if you're interested – but the salient points are:

- Spin up the droplets with Ansible's DigitalOcean module
- Put their details into Ansible's "in-memory inventory" with Ansible's add_host module
- Use those details when you provision the droplets with the apt module and more.

I used Homebrew to install Ansible 2.1 on my OS X 10.11 MacBook. I needed to revert to dopy 0.3.5 (there's a bug in the 0.3.7 version that comes with Ansible 2.1).


The playbook below
- uses a custom ssh key where necessary
- keeps the ssh keys and the API token out of the main file
- takes an external file of names for the hosts
- avoids irritating known-host checking by setting the following variable for each new server ansible_ssh_common_args='-o StrictHostKeyChecking=no'
- sets up Apache / PHP / git on each server, and uses a Jinja2 template to make a unique-ish page on each host
- takes about 90 seconds per server
- goes with a matching "destroyDroplets.yml" playbook

---
- name: provision servers

  hosts: local

  vars_files:
    - ./vars/droplets.yml
    - ./vars/sensitive.yml
    - ./vars/sshInfo.yml

  tasks:
  - name: Get DigitalOcean's ID of ssh key
    digital_ocean:  #note avoidance of = signs...
      command: ssh
      state: present
      name: "{{ sshInfo.do_ssh_key_name}}"
      api_token: "{{ sensitive.do_token }}"
    register: my_DO_ssh_key
    #
  - name: make droplets, if they don't exist already
    digital_ocean: >
      state=present
      command=droplet
      name={{item.name}}
      unique_name=yes
      size_id=512mb
      region_id=lon1
      image_id=ubuntu-14-04-x64
      ssh_key_ids={{ my_DO_ssh_key.ssh_key.id }}
      api_token={{ sensitive.do_token }}
      wait=yes
    with_items: "{{droplets}}"
    register: droplet_details
    #
  # The variables below set the user (needed), use the right key, and stop the wretched dialog with known_hosts
  - name: Add named droplet to group newServers
    add_host: >
      groupname=newServers
      hostname="{{ item.droplet.ip_address }}"
      ansible_user=root
      ansible_private_key_file="{{sshInfo.local_private_ssh_key}}"
      ansible_ssh_common_args='-o StrictHostKeyChecking=no'
      WPL_server_info="{{item.droplet.name}}"
      otherServers="{{droplet_details.results}}"
    with_items: '{{droplet_details.results}}'
#
- name: set up servers
  hosts: newServers
  tasks:
  - name: install packages
    apt:  >
      name={{item}}
      state=present
      update_cache=yes
    with_items:
      - apache2
      - libapache2-mod-php5
      - git
  - name: remove existing web stuff
    file: >
      path=/var/www/html/index.html
      state=absent
  - name: set up index
    template: src=./siteStuff/index.html dest=/var/www/html/index.html force=yes
  - name: start Apache
    service: name=apache2 state=started enabled=yes

...

If you want to use this, you'll need a DigitalOcean account (get yours here), a DigitalOcean API key, a public/private key pair for ssh (you'll upload the public one for DigitalOcean to use as you set up), a bunch of configuration files that can be inferred from the playbook, and a template for a web page. Wait about and I'll post them.
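In the meantime, here's a rough sketch of what a matching destroy playbook could look like. It's an illustration, not my finished destroyDroplets.yml: it assumes the same vars files as above, and it relies on the digital_ocean module finding each droplet by name when unique_name is set, so it should only touch the servers named in the list.

```yaml
---
# Sketch only: an illustration of a destroy playbook, not the finished file.
- name: destroy servers

  hosts: local

  vars_files:
    - ./vars/droplets.yml
    - ./vars/sensitive.yml

  tasks:
  - name: destroy the droplets named in the list, if they exist
    digital_ocean:
      state: absent
      command: droplet
      name: "{{ item.name }}"
      unique_name: yes   # look the droplet up by name rather than by id
      api_token: "{{ sensitive.do_token }}"
    with_items: "{{ droplets }}"
...
```

It needs no ssh information, because it only talks to DigitalOcean and never logs in to the servers themselves.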