Harness the power of PuppetDB with Ansible's dynamic inventory

March 14, 2015 David Moreau Simard


People out there are wondering what orchestration and configuration tool to use. Rightfully so because it gets confusing quickly.

Puppet? Ansible? Chef? Salt? The myriad of other tools?

In the age of Infrastructure as Code, have you made your choice yet?

My answer: Why limit yourself to just one option?

I started getting involved with Puppet a long time ago and, to this day, I don’t regret that choice. Sometimes it’s sort of a love/hate relationship but… It has a vibrant ecosystem, lots of great community-supported modules and it has been the most popular way to get an OpenStack cloud going for the past few years.

Strengths and weaknesses: Why not both?

While I’m not here to sell you on either Puppet or Ansible, I happen to have experience to share from my use of both.

The two of them have their strengths and weaknesses. Let’s talk about the ones that I’m trying to take advantage of.

Orchestration

Something Puppet isn’t able to address very well for the time being: Orchestration.

Puppet is strong at ensuring stuff is where it’s supposed to be and configured the way it should be. There’s an agent, it’s there, it runs every 30 minutes and you can be pretty damn sure what it manages will stay put, for better or worse.

This means that even though you’ve updated your manifests, you might not see them applied for another 30 minutes unless you run Puppet manually. Fun bit: running Puppet locally through Ansible is becoming an increasingly popular way to control when Puppet will run.

Otherwise, though, when you have a lot of servers and moving parts in your infrastructure… You’re never really sure when or where changes are going to take place.

If you have to upgrade your servers in a specific order, it’s going to be a pain with Puppet.

This is something that Ansible is great at. You tell Ansible to do several things on several servers in a specific order and it’ll do just that.

Facts

This probably bothers me more than it should: how slow Ansible is at gathering facts.

Ansible can retrieve “facts” for you to use whenever executing playbooks on remote systems. Facts are information about the system, for example:

Its operating system and version
Its kernel and kernel version
Its CPU and RAM configuration
Its hostname, domain, and network interfaces and their configuration

Unless you specify that you don’t want to gather facts at all, Ansible will retrieve them individually on each host before running your tasks.

This can take a long time, especially if you have a lot of hosts you’re running against. And the next time you run that task, it’s going to need to retrieve those again. By design, Ansible is kind of stuck with this way of doing things and in a certain respect, that’s fine.

Puppet gathers and uses facts too. In that sense, Puppet is no different - it needs to fetch facts on the system before doing its thing. It’s not particularly fast either.

What you can do with Puppet, however, is to hook it to PuppetDB:

PuppetDB collects data generated by Puppet. It enables advanced Puppet features like exported resources, and can be the foundation for other applications that use Puppet’s data.

This’ll make Puppet, amongst other things, report the system’s facts back to PuppetDB - where they will be stored. This quickly becomes powerful to gain insight into your infrastructure with tools such as Puppetboard.

Inventory

With Puppet, you ultimately decide where it will run, which amounts to a static inventory. You’re the one that has to install and configure Puppet. You either run it locally on the server, or you know which nodes are hooked to a Puppetmaster and, through a site.pp, which node will do what (eventually).

Ansible has that too - a static inventory. A flat file with your hosts in it. You then decide, based on that inventory, which servers you will be running your Ansible playbooks against.

Where Ansible shines, though, is with its dynamic inventories.

Let’s pretend you’re using a lot of cloud servers. Ansible could retrieve its inventory dynamically and automatically from sources such as Amazon EC2, Google Compute Engine, OpenStack and more.

Now, that’s great because you’re spinning instances up and down, scaling your application in the direction it needs to go to achieve optimal efficiency and Ansible will keep up with fresh inventory data.

You don’t want to maintain a static inventory by hand; that’s a pain. Dynamic inventories are great to help you with that.
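To make the idea a bit more concrete, here is a minimal sketch of what a dynamic inventory script looks like from Ansible’s point of view. Ansible calls the script with `--list` (or `--host <name>`) and expects JSON on standard output. The host names, group names and variables below are made up for illustration; a real script would ask a cloud API instead of returning hard-coded data.

```python
#!/usr/bin/env python
"""Minimal dynamic inventory sketch. All hosts, groups and vars are made up."""
import argparse
import json


def build_inventory():
    # A real script would query a cloud API here; this data is hard-coded.
    return {
        "webservers": {"hosts": ["web1.example.org", "web2.example.org"]},
        "databases": {"hosts": ["db1.example.org"]},
        # _meta.hostvars lets Ansible avoid calling --host once per host.
        "_meta": {
            "hostvars": {
                "web1.example.org": {"http_port": 80},
                "web2.example.org": {"http_port": 8080},
                "db1.example.org": {},
            }
        },
    }


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--list", action="store_true")
    parser.add_argument("--host")
    args = parser.parse_args(argv)

    inventory = build_inventory()
    if args.host:
        # Variables for a single host; empty dict if the host is unknown.
        result = inventory["_meta"]["hostvars"].get(args.host, {})
    else:
        result = inventory
    return json.dumps(result, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(main())
```

Saved as an executable file (say, a hypothetical `inventory.py`), something like this can be passed to Ansible with `-i inventory.py` in place of a static inventory file.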

Ansible dynamic inventory with PuppetDB

Wouldn’t it be great if you could hook Ansible to PuppetDB’s inventory of nodes and their data? I tried to search and see if it was possible but it didn’t look like there was anything upstream to do just that.

I took it upon myself to develop a means of getting Ansible to generate a dynamic inventory using PuppetDB as a source: ansible-inventory-puppetdb

I developed a dynamic inventory script that can be used by Ansible to generate a list of hosts and their facts from PuppetDB. This means you can use Puppet facts in your playbooks as they are now available as hostvars.
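As a rough illustration of the idea (this is a sketch, not the actual ansible-inventory-puppetdb code), here is how flat fact records could be folded into per-host variables. It assumes each record carries a `certname`, a fact `name` and a `value`, which is the shape PuppetDB’s facts endpoint returns; the sample data and the `puppetdb` group name are invented, and no HTTP request is made.

```python
import json
from collections import defaultdict


def facts_to_inventory(fact_records):
    """Turn flat PuppetDB-style fact records into an Ansible inventory dict."""
    hostvars = defaultdict(dict)
    for record in fact_records:
        # One record per (node, fact): collect them under the node's name.
        hostvars[record["certname"]][record["name"]] = record["value"]
    return {
        # "puppetdb" is an arbitrary group name chosen for this sketch.
        "puppetdb": {"hosts": sorted(hostvars)},
        "_meta": {"hostvars": dict(hostvars)},
    }


# Hand-written sample standing in for a PuppetDB response.
SAMPLE_FACTS = [
    {"certname": "web1.example.org", "name": "operatingsystem", "value": "Ubuntu"},
    {"certname": "web1.example.org", "name": "kernelversion", "value": "3.13.0"},
    {"certname": "db1.example.org", "name": "operatingsystem", "value": "CentOS"},
]

if __name__ == "__main__":
    print(json.dumps(facts_to_inventory(SAMPLE_FACTS), indent=2, sort_keys=True))
```

With an inventory shaped like this, a playbook could reference something like `hostvars[inventory_hostname]['operatingsystem']`. How the real script names or namespaces the facts may differ.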

The script also allows you to group your hosts by fact values. Grouping by fact values enables you to run Ansible playbooks on, perhaps, servers with just a certain kernel version or a specific operating system release.
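Grouping by fact value could look roughly like the sketch below. The `<fact>_<value>` naming scheme and the character clean-up are my own assumptions for the example, not necessarily what the script does, and the hosts are made up.

```python
import re
from collections import defaultdict


def group_by_fact(hostvars, fact_name):
    """Build Ansible groups named '<fact>_<value>' from per-host facts."""
    groups = defaultdict(list)
    for host, facts in sorted(hostvars.items()):
        if fact_name not in facts:
            continue  # hosts without the fact are simply left ungrouped
        raw = "{0}_{1}".format(fact_name, facts[fact_name])
        # Keep group names to letters, digits and underscores.
        group = re.sub(r"[^A-Za-z0-9_]", "_", raw)
        groups[group].append(host)
    return {name: {"hosts": hosts} for name, hosts in groups.items()}


# Made-up hosts and facts for illustration.
HOSTVARS = {
    "web1.example.org": {"operatingsystem": "Ubuntu", "kernelrelease": "3.13.0-46-generic"},
    "web2.example.org": {"operatingsystem": "Ubuntu"},
    "db1.example.org": {"operatingsystem": "CentOS"},
}

if __name__ == "__main__":
    print(group_by_fact(HOSTVARS, "operatingsystem"))
```

A group such as `operatingsystem_Ubuntu` can then be targeted like any other group, for example through the `hosts:` line of a playbook or with `--limit`.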

Another great use case is that the inventory will not limit itself to the default facts. Custom Puppet facts will be retrieved and be made available as well.

This enables you to push just about any custom fact through Puppet and then easily run Ansible tasks on only the servers that match what you’re looking for.

Generating this inventory is really quick, too, because that data is already waiting for you inside PuppetDB. You don’t need to trigger a Puppet or Ansible run to retrieve the inventory and the facts of each node; it’s already there.

There is also a caching layer available. The script will not fetch a new copy of the inventory unless the cache has expired - making subsequent runs nearly instantaneous.

Want to try it out? Please do! Feel free to provide feedback and contribute if you find opportunities for improvement.
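The caching idea can be sketched in a few lines. This is one simple way to do it, assuming a JSON file on disk whose modification time decides freshness; the function name, its parameters and the `fetch` callable are hypothetical and not taken from the script itself.

```python
import json
import os
import time


def load_inventory(cache_path, max_age_seconds, fetch):
    """Return the cached inventory if it is fresh, else call fetch() and store it."""
    if os.path.isfile(cache_path):
        age = time.time() - os.path.getmtime(cache_path)
        if age < max_age_seconds:
            with open(cache_path) as handle:
                return json.load(handle)

    # Cache missing or expired: fetch() is where PuppetDB would be queried.
    inventory = fetch()
    with open(cache_path, "w") as handle:
        json.dump(inventory, handle)
    return inventory
```

The trade-off is the usual one: a longer expiry means faster runs but a greater chance of acting on stale hosts or facts.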
