Demystifying Puppet

• Infrastructure as Code
• Puppet is an open source configuration management utility
• It is written in Ruby; released as free software under the GPL until version 2.7.0 and under the Apache 2.0 license since then
• Built to be cross-platform
• A Declarative Language
• A Model Driven Architecture
A simple client-server model
Procedural Language
Puppet Language
• Talks about Resource Management when agents connect
• Handles the “how” by knowing how different platforms and operating systems manage certain types of resources
• Each type has a number of providers
• A provider contains the “how” of managing packages using a package management tool.
• When agents connect, Puppet uses a tool called “Facter” to return information about that agent, including what OS it is running.
• Puppet chooses the appropriate package provider for that OS and reports success or failure
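The provider selection described above can be sketched in Python. This is only an illustration of the idea, not Puppet's actual implementation: the `osfamily` fact name is a real Facter fact, but the lookup tables and the functions `choose_provider` and `install_command` are hypothetical names introduced here.

```python
# Hypothetical mapping from the "osfamily" fact to a package provider name.
PROVIDERS_BY_OSFAMILY = {
    "RedHat": "yum",
    "Debian": "apt",
    "Suse": "zypper",
}

# Hypothetical command prefix each provider would run to install a package.
INSTALL_COMMANDS = {
    "yum": ["yum", "-y", "install"],
    "apt": ["apt-get", "-y", "install"],
    "zypper": ["zypper", "--non-interactive", "install"],
}


def choose_provider(facts):
    """Return the provider name for the agent's OS family."""
    osfamily = facts.get("osfamily")
    if osfamily not in PROVIDERS_BY_OSFAMILY:
        raise LookupError(f"no package provider known for osfamily={osfamily!r}")
    return PROVIDERS_BY_OSFAMILY[osfamily]


def install_command(facts, package):
    """Build the command line the chosen provider would use for `package`."""
    provider = choose_provider(facts)
    return INSTALL_COMMANDS[provider] + [package]


# The same declaration ("install httpd") resolves differently per platform.
redhat_agent = {"osfamily": "RedHat", "hostname": "web01"}
debian_agent = {"osfamily": "Debian", "hostname": "web02"}
redhat_cmd = install_command(redhat_agent, "httpd")
debian_cmd = install_command(debian_agent, "httpd")
```

The point of the sketch is that the declaration stays the same while the facts decide which provider carries it out.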
• A System Inventory tool
• Returns facts about each agent (hostname, IP address, OS and version information)
• These facts are gathered when the agent runs.
• Facts are sent to the Puppet master and automatically created as variables available to Puppet
• How to run facter?
• Facts are made available as variables that can be used in your puppet configuration
• Helps Puppet understand how to manage particular resources on an agent.
• Resources – The core of the Puppet language is declaring resources. Individual configuration items
• Files – Physical files to serve to your agents
• Templates – Template files that you can use to populate files
• Modules – Portable collection of resources. Reusable, sharable units of Puppet Code
• Classes – Modules can contain many Puppet classes. Groups of resource declarations and conditional statements
• Manifests - Puppet code is saved in files called manifests, which are in turn stored in structured directories called modules.
• Pre-built Puppet modules can be downloaded from the Puppet Forge, and most users will write at least some of their own modules.
• Searching/Installing Module
• Puppet Forge
Installing Win7zip through Puppet
Installing Chrome through Puppet
• All servers that are physical with 4 CPUs: deploy ESX.
• All servers that are virtual with 1 CPU and 4GB of memory: deploy CentOS, and hand the system off for management.
• All servers that are virtual with 32GB of memory: deploy Debian, and hand the system off for management.
- A software tool for rapid provisioning of operating systems and hypervisors on both physical and virtual servers
- Policy-based bare-metal provisioning lets you inventory and manage the lifecycle of your physical machines.
- Automatically discovers bare-metal hardware, dynamically configures operating systems and/or hypervisors, and hands nodes off to PE for workload configuration
- Two major Components:
~ The Razor Server
(Ruby, MongoDB, Node.js)
~ The Razor Microkernel
(~20MB Linux kernel, Facter, MCollective)
• Discovery (tags, matcher rules)
• Models (defining OS templates, ...)
• Policies (rules that apply models to nodes based on discovery)
• Broker (configuration management)
When a new node appears, Razor discovers its characteristics by booting it with the Razor microkernel and inventorying its facts.
The node is tagged based on its characteristics. Tags contain a match condition — a Boolean expression that has access to the node’s facts and determines
whether the tag should be applied to the node or not.
• Install PE in Your Virtual Environment
• Install and Configure dnsmasq DHCP/TFTP Service
• Temporarily Disable SELinux to Enable PXE Boot
• Edit the dnsmasq Configuration File to Enable PXE Boot
• Install the Razor Server
– Load iPXE Software
– Verify the Razor Server
• Install and Set Up the Razor Client
• Set Up Razor Provisioning
• Include Repos
• Include Brokers
• Include Tasks
• Create Policies
• Identify and Register Nodes
- A tag consists of a unique name and a rule
- The tag matches a node if evaluating the tag's rule against the node's facts results in true.
- The syntax for rule expressions is defined in lib/razor/matcher.rb
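To make the idea of a tag rule concrete, here is a minimal Python sketch of evaluating a prefix-notation Boolean expression against a node's facts. It is a simplified illustration and not the logic in lib/razor/matcher.rb: the operator set is a small assumed subset, and the `evaluate` function, the `small_vm_rule` tag rule, and the fact values are hypothetical.

```python
import operator

# Assumed subset of comparison operators for this sketch.
COMPARISONS = {
    "=": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def evaluate(expr, facts):
    """Recursively evaluate a rule such as ["=", ["fact", "x"], "y"]."""
    if not isinstance(expr, list):
        return expr  # a literal value
    op, *args = expr
    if op == "fact":
        return facts[args[0]]  # look up a fact by name
    if op == "num":
        return float(evaluate(args[0], facts))  # coerce a fact string to a number
    if op == "and":
        return all(evaluate(a, facts) for a in args)
    if op == "or":
        return any(evaluate(a, facts) for a in args)
    if op == "not":
        return not evaluate(args[0], facts)
    if op == "in":
        needle, *haystack = (evaluate(a, facts) for a in args)
        return needle in haystack
    if op in COMPARISONS:
        left, right = (evaluate(a, facts) for a in args)
        return COMPARISONS[op](left, right)
    raise ValueError(f"unknown operator: {op!r}")


# Hypothetical tag: virtual machines with no more than two processors.
small_vm_rule = [
    "and",
    ["=", ["fact", "is_virtual"], "true"],
    ["<=", ["num", ["fact", "processorcount"]], 2],
]

node_a = {"is_virtual": "true", "processorcount": "2"}
node_b = {"is_virtual": "true", "processorcount": "4"}
tagged_a = evaluate(small_vm_rule, node_a)
tagged_b = evaluate(small_vm_rule, node_b)
```

Facts arrive as strings, which is why the sketch converts `processorcount` with `num` before comparing it to a number.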
- Policies orchestrate repos, brokers, and tasks to tell Razor what bits to install, where to get the bits, how they should be configured, and how to communicate between a node and PE.
- Because policies contain a good deal of information, it’s handy to save them in a JSON file that you pass in when you create the policy
Example: a policy that should be applied to the first 20 nodes that boot and have no more than two processors
- Create a file called policy.json and copy the following template text into it:
- Edit the options in the policy.json template with information specific to your environment.
- Apply the policy by executing: razor create-policy --json policy.json.
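As a rough sketch of what such a file might hold, the Python below builds a policy document for the example above (first 20 nodes, small machines) and serializes it to JSON. The field names (`name`, `repo`, `task`, `broker`, `enabled`, `hostname`, `root_password`, `max_count`, `tags`) are assumed from the Razor create-policy documentation and may differ between Razor versions; every value is a placeholder, and `build_policy` is a hypothetical helper.

```python
import json


def build_policy(name, repo, task, broker, tags, max_count):
    """Assemble a policy dictionary; all field names are assumptions."""
    return {
        "name": name,
        "repo": repo,                 # which repo to install from
        "task": task,                 # which installer task to run
        "broker": broker,             # who manages the node afterwards
        "enabled": True,
        "hostname": "node${id}.example.com",  # placeholder hostname pattern
        "root_password": "CHANGE_ME",          # placeholder, never commit real secrets
        "max_count": max_count,       # stop binding after this many nodes
        "tags": tags,                 # only nodes carrying these tags qualify
    }


# Placeholder names: a "small" tag is assumed to match nodes with <= 2 processors.
policy = build_policy(
    name="small_nodes",
    repo="centos-repo",
    task="centos",
    broker="pe",
    tags=["small"],
    max_count=20,
)
policy_json = json.dumps(policy, indent=2)


def write_policy(path="policy.json"):
    """Write the JSON to disk so it can be passed to razor create-policy."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(policy_json + "\n")
```

Calling `write_policy()` would produce the file that the `razor create-policy --json policy.json` command above reads.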
Step-1 – A fresh Razor server with no nodes registered.
Step-2 – Create a new VM. It obtains an IP address over DHCP and loads the microkernel.
Step-3 – The new node contacts the Razor API server.
Step-4 – The Razor server shows a new node registered.
Step-5 – Razor tag: create a new tag so that the node is tagged based on its characteristics.
Step-6 – Check the characteristics of the newly tagged node. Count = 1 shows that the new node was tagged successfully.
Step-7 – Verify the Razor tag by name.
Step-8 – Create a repo for the new VM to be deployed.
Step-9 – Create a broker (Puppet Enterprise for configuration management).
Step-10 – Create a policy for the new node.
Step-11 – The new node starts installing according to the specified policy.
Step-12 – Verify through the Puppet master that the policy is attached to node2.
Step-13 – The new OS comes up, showing that it has been installed through Razor.
- Deploy version 1.2.3 of my application to all 3000 systems
- Deploy version 1.2.5rc2 of my application to all 340 development systems
- Restart the Apache service on all the systems in North America zones
- What systems are online right now?
- Run Puppet on all systems, ensuring that at most 10 runs are happening at once
- Upgrade the Hadoop version from 0.1 to 1.1 on all those 2500 nodes
- A framework to build server orchestration or parallel job execution systems
- Uses a publish-subscribe middleware philosophy ~ real-time discovery of network resources using metadata and not hostnames
~ “A messaging pattern where senders of messages, called publishers, do not program the messages to be sent directly to specific receivers, called subscribers. Instead, published messages are characterized into classes, without knowledge of what, if any, subscribers there may be. Similarly, subscribers express interest in one or more classes, and only receive messages that are of interest, without knowledge of what, if any, publishers there are.”
- Uses a broadcast paradigm for request distribution.
~ “All servers get all requests at the same time, requests have filters attached and only servers matching the filter will act on requests. There is no central asset database to go out of sync, the network is the only source of truth.”
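The broadcast-with-filters idea can be sketched in a few lines of Python. This is a toy in-process model, not MCollective code: the `Server` class, the `broadcast` function, and the fact names are hypothetical, and a real deployment would route messages through the middleware instead of a list.

```python
class Server:
    """A node that receives every request but acts only on matching ones."""

    def __init__(self, identity, facts):
        self.identity = identity
        self.facts = facts

    def matches(self, filters):
        # Every filter entry must equal the corresponding local fact.
        return all(self.facts.get(key) == value for key, value in filters.items())

    def handle(self, request):
        if not self.matches(request["filter"]):
            return None  # silently ignore requests that do not apply
        return {"sender": self.identity, "action": request["action"]}


def broadcast(servers, request):
    """Deliver the request to all servers and collect the replies."""
    replies = (server.handle(request) for server in servers)
    return [reply for reply in replies if reply is not None]


# Hypothetical fleet; no central inventory says which node is where.
fleet = [
    Server("web1", {"country": "us", "role": "web"}),
    Server("web2", {"country": "de", "role": "web"}),
    Server("db1", {"country": "us", "role": "db"}),
]

request = {"action": "restart_apache", "filter": {"country": "us", "role": "web"}}
replies = broadcast(fleet, request)
responders = [reply["sender"] for reply in replies]
```

Each server decides for itself whether the filter applies, which is why no asset database is needed to address the request.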
An MCollective client can send requests to any number of servers, using a security plugin to encode and sign the request
and a connector plugin to publish it.
It can also receive replies from servers, and format the response data for a user or some other system.
Example : mco command-line client
An MCollective server is a computer which can be controlled via MCollective.
Servers run the MCollective daemon (mcollectived) and have any number of agent plugins installed
• An Abstract Cloud of Magic
• Clients and servers don’t communicate directly. They publish messages to the middleware, and subscribe to
messages they are interested in.
• Middleware system ~ knows how to route messages.
• External to MCollective, and everything that interacts directly with it is pushed out into a connector
plugin (which needs some limited knowledge about the topology and configuration of the middleware).
Apache ActiveMQ
- an open-source message broker that runs on the JVM;
- Installed with a wrapper script and init script that allow it to be managed as a normal OS service.
- MCollective talks to ActiveMQ using the Stomp protocol
- This is the main middleware recommended for use with MCollective
- It is the most well-tested option, its security features are powerful and flexible enough to suit nearly all needs, and it can scale by clustering once a deployment gets too big (we recommend ~800 nodes per ActiveMQ server as a maximum). Its main drawback is that it can be frustrating to configure; to help mitigate that, we provide a detailed ActiveMQ config reference in our own docs
• An open-source message broker written in Erlang; MCollective talks to RabbitMQ using the Stomp protocol.
Although it works well with MCollective, it isn’t as thoroughly tested with it as ActiveMQ is, so if your site has no
preference, you should default to ActiveMQ.
• The RabbitMQ connector ships with MCollective’s core and is available by default.
- mco – MCollective client
~ A generic node and network inventory tool
~ Lets you define broadcast domains and configure an MCollective server to belong to one or many of these domains.
$ mco ping – communicates with all the Puppet agents
- Make requests to your servers.
- Capable of interacting with any standard Remote Procedure Call (RPC) agent.
How does it work?
i. Perform discovery against the network and discover 10 servers
ii. Send the request and then show a progress bar of the replies
iii. Show any results that were out of the ordinary
iv. Show some statistics
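The four steps above can be mirrored in a short Python sketch. It simulates the flow in-process and is not how the mco client is implemented: `run_request`, the handler functions, and the server names are hypothetical, and the only detail borrowed from MCollective is the convention that a reply status code of 0 means OK.

```python
def run_request(servers, action):
    """Simulate discovery, request, unusual results, and statistics."""
    # i. Discovery: find out which servers are reachable.
    discovered = sorted(servers)

    # ii. Send the request to each discovered server and gather replies.
    replies = {name: servers[name](action) for name in discovered}

    # iii. Keep only the replies that were out of the ordinary.
    unusual = {
        name: reply for name, reply in replies.items() if reply["statuscode"] != 0
    }

    # iv. Summarize the run.
    stats = {
        "discovered": len(discovered),
        "responses": len(replies),
        "failed": len(unusual),
    }
    return unusual, stats


def healthy(action):
    # Hypothetical agent that always succeeds.
    return {"statuscode": 0, "statusmsg": "OK"}


def failing(action):
    # Hypothetical agent that reports a non-zero status.
    return {"statuscode": 1, "statusmsg": f"{action} failed"}


servers = {"node1": healthy, "node2": failing, "node3": healthy}
unusual, stats = run_request(servers, "status")
```

Reporting only the unusual replies keeps the output readable when most of a large fleet answers normally.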
• You can request the status for a specific service
• Report of all your IBM hardware listing hostname, serial number, and product name
Are you using the default Web Server?
Limitation: WEBrick, the default web server used to enable Puppet’s web services connectivity, is essentially a reference implementation and becomes unreliable as the number of nodes increases.
Challenge-1 - Scaling the Transport
Possible Solution - increase the performance and potential number of possible master and agent connections.
Challenge-2 – Scaling SSL
Possible Solution - implement good management of the SSL certificates that secure the connection between the master and the agent
• Replacing the Ruby-based WEBrick HTTP server with the Apache web server on a single Puppet master system
• Extending the strategy to multiple Puppet master systems working behind a load balancer with Apache Web Server/Nginx.
Running Puppet Master with Apache and Passenger
Passenger is an Apache module that allows the embedding of Ruby applications, much like mod_php or mod_perl allow the embedding of PHP and Perl applications.
For networks of one to two thousand Puppet-managed nodes, a single Puppet master system running inside Apache with Passenger is often sufficient.
- Install Apache and Passenger
- Configure Apache to handle SSL authentication and verification of the Puppet agent
- Connect Apache to the Puppet Master
Step-1 Installing Apache Web Server
Step-2 Installing Passenger
Step-3 Configuring Apache and Passenger
Step-4 Configuring Apache Virtual Host
Front End HTTP Load Balancer
- Front-end Apache virtual host to authorize incoming Puppet agent requests and handle SSL encryption and decryption of the traffic
- It terminates the SSL connection, authenticates the Puppet agent request, and then presents this authentication information to the back-end Puppet master workers for authorization.
- All traffic between the front-end load balancer and the back-end Puppet master systems is unencrypted and in plain text
Puppet Master Worker Configuration
• When running the Puppet master behind a load balancer, there will be multiple Puppet master processes running on different
hosts behind the load balancer.
• The load balancer will listen on the Puppet port of 8140. Incoming requests will be dispatched to available back-end worker processes. (The example configuration runs the Puppet CA and workers all on the same host, using unique TCP ports bound to the loopback interface.)
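The dispatching step can be illustrated with a small Python sketch. It assumes a round-robin policy, which the slides do not specify, and the loopback ports 18140 and 18141 are hypothetical worker ports chosen for the example; `RoundRobinBalancer` is not part of Apache or Puppet.

```python
import itertools


class RoundRobinBalancer:
    """Hand out back-end workers in rotation (an assumed policy)."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one back-end worker is required")
        self._cycle = itertools.cycle(list(backends))

    def pick(self):
        """Return the (host, port) pair that should serve the next request."""
        return next(self._cycle)


# The front end listens on 8140; workers listen on loopback-only ports.
FRONT_END_PORT = 8140
WORKERS = [("127.0.0.1", 18140), ("127.0.0.1", 18141)]  # hypothetical ports

balancer = RoundRobinBalancer(WORKERS)
assignments = [balancer.pick() for _ in range(4)]
```

Binding the workers to the loopback interface means only the front end can reach them, which matters because traffic behind the load balancer is unencrypted.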
Front-End Load Balancer Configuration
Single Back-end Puppet Master workers
Testing a Single Backend Puppet worker
- Synchronize the Puppet CA directory across all of the worker systems
- Make one worker system the active Puppet CA service and a second worker system the hot standby Puppet CA service.
