Thurs 22 Aug 2013
Fairmont ballroom looks like the Apple 1984 commercial.
Identified challenges in cultural shifts needed at companies that need config management but can't figure out how to implement it.
PuppetConf 2012 largest complaint was "not enough time to see all the talks."
- 1.2m forge downloads.
- Erik Dalen of Spotify
- Kris Buytaert PuppetCamp
- Greg Baker
- Dustin Mitchell of Mozilla
- List of companies using Puppet, Go Daddy conspicuously missing from slide
Nod to VMWare "We're not a VMWare company, not an OpenStack company..."
- Cisco and Juniper support for Puppet (Thanks to Bruce Figaro) sp?
Moving from vertical silos to horizontal abstractions
Level up from technology to applications
Whole community needs to get better about DRY
2 major problems with adopting companies, want agility & culture change.
- X-Wing strategy
- Making the core better
- Geppetto IDE
- Increasing on perf, stability, better documented API
- Enterprise Puppet 3
- Building towards things like continuous delivery
- A new reporting tool with insight on how puppet is configuring your system
- Say you're a sysadmin converting from debian to Red Hat and you want to make sure mongodb is going to deploy on new Red Hat systems
- Puppet Enterprise console
- Classes Nodes Resources reporting elements
Tell Puppet Labs how it sucks and how to improve.
Questions?
- They are working on more granular ACLs but no ETA
- One of bigger goals is to simplify experience for clustering Puppet
- Can contribute to Puppet by participating
- Publish, run a local Puppet Group, contribute to the Forge
New head of engineering defining how long they can afford to maintain backwards compatibility.
$2000 Google cloud credit for on-site attendees - talk to Puppet volunteers. Follow @puppetconf with hashtag #puppetconf
Google [email protected]
Makes Google run from the inside.
Focus is on externalizing a lot of internalized feed.
SRE runs many services:
- make the service scale
- make the deployment consistent
- understand all layers of the system
- monitor everything
- how many requests
- how long does request take
- what's the effect on the db
- which one is expensive
- which is a x-system call
- when that network crawls
- when that switch is busy...
- put data inflection monitoring points at any point in code you can
- operations want to see 'this piece of data went into the db at this time and took this long...'
- plan for future
- Make sure things break and break them
- Take system apart and see what happens
- slow db down, use slow disks and see what happens
- load the disk with a cronjob with arbitrary proc
- break things, under controlled conditions
In order to engineer reliable systems, you need to understand all layers of the system. Unless you know how all the parts work together you won't be able to support the system.
- Scaling is fun
"We don't deploy 'a server'"
- servers break, power fails
- clients/DNS need to be reconfigured
Don't deploy "a cluster"
- network breaks
- a single cluster isn't good enough
- clients/dns need to be reconfigured
Deploy redundant clusters
- attempt to send clients to nearest serving cluster
- anycast allows for unified client configuration
Must have redundancy in every layer of the stack.
Anycast has the same IP addr injected into the route tables of any available route structure.
Everything has to be identical. If you get routed to the wrong cluster you get the wrong answer.
Nice thing about using anycast: it makes it a requirement of application deploy that any cluster can answer any question for any user. Consistency at every point.
Client DoS is not fun.
Poorly written code on a small number of clients is annoying.
Poorly written code on a huge number of clients can cause serious infrastructure pain
Write good code and stage releases:
- Work with service owners
- Stage rollouts, allow soak time
- Have a rollback plan and test it
- Have DoS limits for services, test them
Professionalism of change management: roll out as quickly as possible, but if you haven't given soak time on "canary" clusters to catch what can go wrong, you're not doing the right thing for application owners.
Stage rollouts, allow soak time.
Does it look good on an almost-prod cluster 3 days/week later on prod traffic?
do you have enough capacity?
how many backends do you need
what happens if half lose power?
what happens when half are out for maint?
how do you send clients to right cluster?
- client configuration
- dns rr
- dns views
- dns anycast
- consider dns views plus anycast
- normally send all to anycast but have a switch for emergencies
Make sure there's enough headroom but don't go crazy (overprovision)
- Monitor everything
- Test everything
- Learn from outages
- write postmortems; not about blame
- focus on the facts
- Lots of desktops
- What if they want to do a puppet run? every hour? every 5 mins?
- Randomize cronjobs
- How can you shed load on the server end?
- anycast is coarse-grain load balancing
- network break - physical/routing/config/load balancer issues
- anycast monitoring is hard
- include load calculations early in health checks
- consider DNS views w/ Anycast to manage traffic
- dropped traffic better than cascading failures
- If you can't automate it, don't do it.
- Anycast helps be consistent; traffic could go anywhere
- Every OS upgrade is a time to refactor and clean
- clients need to be able to handle RR DNS
- he's never had a need to use SRV records
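The "randomize cronjobs" point above can be sketched roughly as follows: each client derives a stable minute offset from its own hostname, so a large desktop fleet spreads its Puppet runs across the hour instead of arriving at once. This is only an illustration, not what the speaker showed; the hostnames are made up and the hashing scheme is an assumption.

```python
import hashlib


def splay_minute(hostname: str, interval_minutes: int = 60) -> int:
    """Map a hostname to a stable minute offset in [0, interval_minutes)."""
    digest = hashlib.sha256(hostname.encode("utf-8")).hexdigest()
    return int(digest, 16) % interval_minutes


def cron_line(hostname: str,
              command: str = "puppet agent --onetime --no-daemonize") -> str:
    """Build an hourly crontab line whose minute depends on the hostname."""
    return f"{splay_minute(hostname)} * * * * {command}"


if __name__ == "__main__":
    # Hypothetical hostnames, only for demonstration.
    for host in ("desktop-001", "desktop-002", "desktop-003"):
        print(host, "->", cron_line(host))
```

Hashing rather than calling a random generator keeps the schedule stable across regenerations of the crontab, which makes it easier to reason about when a given client checks in.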
They can turn live machines into canary machines to soak in newly deployed code. Then they turn an entire cluster into canaries, if that's good then a complete rollout.
Perfect weather on the balcony of the Fairmont today. Warm sun with a light, cool ocean breeze.
Lead config management SME for engineering team providing and architecting cloud services for BoA/Merrill Lynch @beenybeenz
- not touching into db replication, performance, passenger tuning, etc.
- focus more on the Puppet product itself, how to handle different components
- puppet components
- basic deployment
- master pair
- master deployment behind a proxy/load balancer
- Global
- Americas
- Nodes under management 7200+
- Linux, working on Windows CFM
- Separate External Node Classifier drives data
- Internally developed Web UI serves as request system
- Puppet provides configuration and life cycle management
Puppet Enterprise components: puppet master + puppet CA, puppet agent, MCollective -> ActiveMQ/MCollective server -> console GUI/report URL + MySQL/PG/PuppetDB
Single master can handle ~ 1k nodes
- How did Puppet Labs come up with this calculation?
- Number of cores x Memory / Avg manifest compilation time / (check-ins/day)
All puppet enterprise components installed on 2 servers
Agents still point to 1 destination
RR DNS distributes load across 2 servers
Allows for one entire server to go down but still have ability to manage and provision
CA: ensure your serial numbers have different start ranges; if not, you'll have certificate conflicts. This file should NOT be synced.
ensure inventory file is merged, not overwritten
Ensure both mcollective credential files are identical (md5sum)
SCP file from one server to other after it is generated by module (rsync?)
ActiveMQ network broker config needs to be added to activemq.xml
- Natural progression, build upon prev installation
- Proxy server round-robins config requests across Puppet workers, which only have puppet-server installed.
- Distributes primary load across commodity compute (VMs)
- Central inventory for node classification and reports.
- CA performed centrally
- Central orchestration
- CA - add to /etc/puppetlabs/puppet/auth.conf
update [main] section, [master] sections
Update path of new CRL file that will be downloaded to httpd vhost file
Add the proxy match string for CA requests within same file
SSLProxyEngine On
ProxyPassMatch ^/([^/]+certificate.*)$ ...
Sync over the MCollective files required to configure new agents: rsync -av {credentials, certs, keys}
On worker VMs, only need mcollective not activemq
each worker needs to accept dns name going to be used
listen puppet_00
    mode tcp
    server mypuppetworker1 check
    server mypuppetworker2 check
- Use Zack's HAProxy module
- Logs on puppet master access.log will always come from haproxy ip
- Can use "option forwardfor" but then you must terminate SSL and use http mode (not tcp mode)
- Filebucket should be centralized to orchestration servers.
- Use puppet to maintain everything in sync
agent -> haproxy -> puppet config workers (vms) -> orchestration (mcollective) -> agent
- GlusterFS for PuppetCA - how to replicate /var/lib/puppet/ssl across masters?
rsync works
- Proxied ssl request from master nodes to CA; not have agent config to ca_server?
can do what works
- Why go from FOSS to PE?
nice to have everything pre-packaged + Puppet Labs' support
Puppet, MCollective
When he started building OpenShift, his background was IT
Red Hat has been using Puppet since the beginning.
- Platform as a Service
If you're on the app dev side, they try to make it magically easy
One of the differences it's not only hosted.
Tries to make managing systems easier as well.
class openshift {
  package { 'openshift-origin-broker':
    ensure => installed,
  }
}
- Focus on linux containers instead of virtual machines.
- Needed something volatile that wasn't going to kill operations.
- Gears == "Containers"
- Cartridges == "Code running in a container"
- Kernel namespace segmentation
- Control group workload management
- SELinux protection
- Better at segmentation but not for security
- Mount namespaces
- UTS namespaces
- Network
- User namespaces
Workload management
- blkio
- cpu
- memory
- net_cls
- whitelist approach to security Mandatory Access Control
- effective in controlled use cases
- Critical to control malicious activity
- Not exclusive from VMs, good complement
Virtualization is still relatively heavyweight
Containers powerful for driving density up
OpenStack's challenge was to use AWS business model, but most cost-effective
MCollective is agent driven Puppet -> Setup new VM MCollective -> Setup new container
- Query: Find the best matching machine to install something upon
- Execute: Find the perfect machine and run the command
- Control: Limit the applicable commands.
Don't want a distributed root account to expose control.
Puppet -> Setup new VM
MCollective -> Setup new container
MCollective -> Interact w/ Container
- Easy agent development; faster to build an agent with MCollective.
- Good Docs, quick starting point
- Built on Messaging: Utilizes ActiveMQ
- Security: Message signing, auditing, etc.
Kernel Namespaces
- What point will it tip over and fall? Broadcast paradigm consuming more latency than needed
Professional Services Engineer| Puppet Labs @jsween_y
supercow on
- Windows Agent overview
- Puppet resource model overview
- Managing Linux vs. managing Windows
- Windows specific challenges and solutions
- Windows oddities that will bite you
- Server 2003 and 2003 R2
- Server 2008
- Windows 7/Server 2008 R2
- Windows Server 2012
Basic Windows .msi installation
msiexec /qn /l*v install.log /i puppet-3.2.4.msi
Puppet on Windows works out of a basic service.
C:\Program Files (x86)\Puppet Labs\Puppet\{sys,bin}
- Don't mess with those
Docs&Settings\all users\app data\puppetlabs
- puppet\var
- cached data
- plugins
- puppet\etc
- puppet.conf
- ssl data <- directory to delete for puppet cert regen
- puppet\var
Goes over Puppet architecture, RAL
Basic puppet code example
service { 'TerminalService':
  ensure => running,
  enable => true,
}
- Windows doesn't have cron; use the scheduled_task resource instead (attribute fragment):
    command => 'C:\jobs\run_job.bat',
    user    => 'SYSTEM',
    trigger => {
      schedule => 'MONTHLY',
      ...
    },
- Line endings
- Paths
Need a linux master with linux line endings/paths (no windows masters)
- Forward slashes are safer (Puppet will translate to \)... except when a Windows program will read them:
scheduled_task { 'something':
  command => 'C:\something\something',
}
- Think about where code is being evaluated (client or server)
- File resource in Puppet will copy everything in binary
- Source engine will copy it down perfectly
- If you have code generating newlines, it will use Linux style line endings
- Can use the unix2dos utility
- use separate files between linux and windows machines for templates
- Still specified with Unix-style modes
- Mode must be specified if owner/group are
file { 'C:\Blah\blah.txt':
  ensure => file,
  owner  => 'Administrator',
  mode   => '0644',
}
Pretend Windows is case sensitive with your Puppet code
No way to set SID
exec { 'iisreset':
  path        => 'C:/WINDOWS/System32',
  refreshonly => true,
}
- Execs run without a shell
- If you need shell built-ins
command => 'cmd.exe /c ...',
Workaround is sysnative directory
$psh_cmd = 'powershell.exe -ExecutionPolicy RemoteSigned'
exec { 'my_script.ps1':
  path     => 'C:/WINDOWS/System32/WindowsPowerShell/v1.0',
  command  => "${psh_cmd} ...",
  provider => powershell,
}
Modules best way to organize code and extend core Puppet
puppet module search keyword
puppet module install author-module
- puppetlabs/registry
registry_key { 'HKLM\System\TestKey':
  ensure => present,
}
- adenning/winntp
class { 'winntp':
  ntp_server => '',
}
- trlinkin/domain_membership: good to change Computer password
- SQL Server module
- Windows facts module (good for desktops)
The more we contribute, the better Windows on Puppet becomes.
package {'mysql': install_options => {}, }
- deprecates msi provider (don't use provider attribute)
- puppet 3
- backports avail for puppet 2.7
"apt-get for Windows"
package { 'sysinternals':
  ensure   => installed,
  provider => chocolatey,
}
Check OpenTable namespace on PuppetForge for modules that use native PowerShell
exec { 'reboot':
  path    => 'C:/WINDOWS/System32',
  command => 'shutdown.exe /r /t 300',
}
exec { 'pending_reboot':
  command     => 'cmd.exe /c "echo Reboot Pending"',
  refreshonly => true,
  noop        => true,
}
reboot { 'SQL_Server_Install':
  message => 'blah',
  prompt  => true,
}
What Windows really needs most is more Puppet provider contributors.
- Support for Complex ACLs?
- Separate Puppet type might be the way to do that, have autorequires on the type.
- Able to run both linux clients and windows from same tree?
- Yes.
- Windows admin has problems with Puppet relegating classification to classes & groups. Would like Puppet to read the AD groups that the Computer already belongs to: "If you're a member of that group, do this." Active Directory security groups as classification. hiera-ldap backend? One could be easily written: have Puppet talk to LDAP and get that information out of it.
- User resource: use UID for SID? What about AD in a shared environment? Puppet will not modify AD for resources.
- Windows services, for service resources, e.g. RabbitMQ as a Windows service: feature request currently open.
- Way to set the .NET runtime which PowerShell executes under? Have to set up an XML file inside the directory.
- Does puppet apply work on Windows? Yes.
- No way to install the Windows agent with the cloud provider? Not sure if there's something in projects. (There is.)
Technical difficulties with the mic
Technical Marketing Engineer | Cisco
Cisco Automation with Puppet and onePK
Evolution of Operational Maturity from minimized cost/best effort to more of a managed approach (basic SLAs)
fixes from days to hours
tiered domain experts
business operations -> virtual overlay networks
Business today requires self service, on demand, on premise, remote, hybrid cloud with SLA
good solution with Puppet
Let business operations drive the network
- Type 1 automating existing tasks
- Type 2 automating new tasks
- Type 3 automation as integral solution
Opening up the API, "What if the 'User' is a Software app?"
- Simplified Operations
- Enhanced Agility
- New Business Opportunities
APIs Cisco is opening up
- Try to run layers 2-4 through a consistent API cross platform
- Write the same program that behaves the same way
- APIs in C, Java, Python, will support Ruby
- "Cisco ONE Platform Kit" (onePK)
Architecture slide
API presentation layer ->
API infrastructure { catalyst nexus asr/isr }
Choose the hosting model that suits your app/platform
allow you to run your applications in multiple locations
run apps directly on blade or router
allows cisco to put applications (agents) on boxes
Puppet Master takes facts, generates catalogue, sends it to Puppet Agent; Puppet Agent applies catalogue and sends report. PuppetDB does not throw the facts away.
storedconfigs previously used w/ ActiveRecord into MySQL
Didn't work very well
Now you can query PuppetDB
In order to make PuppetDB super fast (1800 node catalogues on his laptop)
Design decisions:
- Asynchrony
- PuppetDB is supposed to store stuff, and let you query it.
- Command
- Query
- Responsibility
- Separation
Use a different model to update information than the model you use to read
CQRS Write Pipeline
async, parallel, MQ-parallel
{:command "replace catalog"
 :version 2
 :payload {...}}
/commands -> mq -> parse -> process -> catalog
                      \        /
                   Dead Letter Office
Dead Letter Office is a bit-for-bit copy of wire communications
Deepak can use data from this directory to replay
expect failure because it will happen
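The write pipeline above can be sketched in miniature as follows. This is not PuppetDB's code (which is Clojure); it is a rough single-threaded Python illustration of the idea that commands are queued, processed later, and kept verbatim in a "dead letter" list when processing fails so they can be replayed. The JSON message shape and the `certname` field are assumptions made for the example.

```python
import json
import queue


def process_command(raw: bytes, catalogs: dict) -> None:
    """Parse one wire message and apply it to the write-side store."""
    message = json.loads(raw)  # raises on malformed input
    if message["command"] != "replace catalog":
        raise ValueError(f"unknown command: {message['command']}")
    payload = message["payload"]
    catalogs[payload["certname"]] = payload  # 'certname' is a hypothetical key


def drain(mq: "queue.Queue", catalogs: dict, dead_letters: list) -> None:
    """Process queued commands; keep failed ones verbatim for later replay."""
    while True:
        try:
            raw = mq.get_nowait()
        except queue.Empty:
            return
        try:
            process_command(raw, catalogs)
        except Exception as exc:  # expect failure because it will happen
            dead_letters.append((raw, repr(exc)))
```

Storing the raw bytes rather than the parsed message is what makes replay possible even when the failure was in the parser itself.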
PuppetDB needed to be:
- Fast
- Free
- Portable
- Multi-core
- Popular
Tons of high quality libraries for web servers, concurrency frameworks, databases, fast parsing/lexing, clustering debugging, profiling, etc.
Can ship an uberjar, makes straightforward with few moving pieces.
Nobody cares what runtime is used, they just want it to work.
- Queries are expressed in their own domain-specific, AST-based query language:
["and", ["=", "type", "User"]]
PuppetDB walks the AST to compile efficient SQL
AST-based API lets users write their own languages
Erik Dalen of Spotify
AST-based API lets us more safely manipulate queries
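A toy version of "walk the AST, emit SQL" might look like the sketch below. It is not PuppetDB's compiler; the column whitelist and the supported operators are assumptions for illustration. It does show why an AST is safer to manipulate than a query string: values travel as bind parameters and field names are checked against a known set.

```python
from typing import List, Tuple

ALLOWED_FIELDS = {"type", "title", "certname"}  # hypothetical column whitelist


def compile_query(ast: list) -> Tuple[str, List[str]]:
    """Compile a nested-list query into a SQL fragment plus bind parameters."""
    op, *args = ast
    if op == "=":
        field, value = args
        if field not in ALLOWED_FIELDS:
            raise ValueError(f"unknown field: {field}")
        return f"{field} = ?", [value]
    if op in ("and", "or"):
        if not args:
            raise ValueError(f"'{op}' needs at least one clause")
        parts, params = [], []
        for sub in args:
            sql, sub_params = compile_query(sub)
            parts.append(f"({sql})")
            params.extend(sub_params)
        return f" {op.upper()} ".join(parts), params
    if op == "not":
        (sub,) = args
        sql, params = compile_query(sub)
        return f"NOT ({sql})", params
    raise ValueError(f"unknown operator: {op}")
```

Because the query is plain data, a client can wrap or rewrite it (for example, nest a user query inside an extra "and" clause) without any string parsing.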
daenny, Puppetboard: ORM for Ruby talks directly to PuppetDB via Ruby code
hiera backend so they can use hiera lookups to yank data out of PuppetDB
Bindings for Java, Go, Scala, CoffeeScript, Node.js, MCollective, Rundeck
OpenStack bodepd/puppet-openstack_puppetdb
- Relational Database backend
- embedded or PostgreSQL
- arrays, recursive queries, indexing inside complex structures
- Wrote it in Clojure
- Functional language
- No locking
- 0 bugs related to deadlocks, 0 bugs involving mutable state
- companion Ruby code has ~10x the defect rate.
7k lines of Clojure code.
resource often exists across multiple hosts
single instance resource storage
We'll often receive the same catalog for a host
In the field, we almost always see Resource and catalog duplication rates of 85%
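Single-instance storage can be illustrated with a small content-addressed store: each distinct resource is kept once under a hash of its content, and each host's catalog is just a list of hashes. This is a sketch under assumptions (SHA-1 over canonical JSON, in-memory dicts), not PuppetDB's actual schema.

```python
import hashlib
import json


class ResourceStore:
    """Store each distinct resource once, keyed by a hash of its content."""

    def __init__(self) -> None:
        self.resources = {}  # content hash -> resource dict
        self.catalogs = {}   # hostname -> list of content hashes
        self.seen = 0        # resources submitted, duplicates included

    @staticmethod
    def _digest(resource: dict) -> str:
        canonical = json.dumps(resource, sort_keys=True)
        return hashlib.sha1(canonical.encode("utf-8")).hexdigest()

    def store_catalog(self, hostname: str, resources: list) -> None:
        hashes = []
        for resource in resources:
            key = self._digest(resource)
            self.resources.setdefault(key, resource)  # keep the first copy only
            hashes.append(key)
            self.seen += 1
        self.catalogs[hostname] = hashes

    def duplication_rate(self) -> float:
        """Fraction of submitted resources that were already stored."""
        if self.seen == 0:
            return 0.0
        return 1.0 - len(self.resources) / self.seen
```

With two hosts submitting the same two resources, four resources are seen but only two are stored, so `duplication_rate()` reports 0.5.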
Should be able to get metrics on how volatile your environment is.
Users want easy ways to consume metrics and analyze performance.
- jasonhancock/nagios-puppetdb
- puppetdb-external-naginator
- munin_puppetdb
- puppetdb-muninplugins
- mfournier collectd
Available in PE 3
basis for reporting and analytics
20% faster storage
improvements to memoization, caching, eliminate double-serialization, superfluous indexes nuked
Much faster terminus
Death to keystores: can now use PEM certs directly, eliminating one of the largest sources of configuration problems.
Configurable HTTPS
Automated recovery from MQ corruption, compression of the Dead Letter Office, purging of inactive node data, DB connection recycling
Backup and restore now integrated into daemon, can restore while PuppetDB is running.
- no need to ask for only active nodes
- full fact queries instead of just a list of facts
- node metadata
- exploration-friendly
curl localhost:8080/v2/nodes
Report storage
- comes with a report processing plugin
- store report-level metadata
- can do queries on events that span reports
- basis for PE's event Inspector
stream results to clients on the fly as they come in from the database
Will be developing tools to replicate data from one puppetdb daemon to another, to help with HA and DR
Can also later optimize the process to lower latency, but preserve eventual consistency.
More flexible routing is coming, allowing soft failures and read/write splits:
- log errors and continue
- write to one puppetdb, read from another
Remembered my recharge cables today.
modules are atomic
role, library
use puppet librarian
cooperative modules: build loosely coupled modules that can work together if installed together but can work on their own
will try to push deps into puppet instead of rpm
rpm packaging works fine for deploying to puppet masters
modules are good for 4am provisioning
transparency and simplicity if everything is in the Puppet manifest
Composition trumps inheritance
Same Puppet code used from dev -> test -> staging -> prod everybody works off same modules
Cisco is hiring
Puppet at scale: study of PayPal's learnings, with Harendra Narayan
- PayPal part of eBay Inc
- 132 million active registered accounts
- 25 currencies in 193 markets
- net total payment vol $43B in Q2
- 7.6 million payments per day
- $5,277 in total payment volume every second
- build a new system for deploying app/sys software
- massive scale in production across multiple data centers
- 3000 devs QE in 10+ offices across time zones and geographic regions
Stores in MongoDB; created a REST web API so any tool in the data center can leverage their tool. They integrated with OpenStack.
cloud portal -> openstack -> flame icon ->
- enter label, application, size (20, 50, 1000 deps on what type of app)
- Click Go, key orchestration stack provisions VM, configures load balancer, adds to DNS, verification checks
- Stored in hiera database (Mongo)
- Register with Puppet
- Deploy API will download appropriate packages, install and configure. Fully automated. User enters info on step 1, all the infrastructure is spun up within a minute. "Project Velocity"
- traditional 1 app 1 module does not scale
- high velocity env with increasing speed of change
- new pkgs, sunset pkgs, dep changes, dev staff operating 24x5
- Lack of puppet expertise to complement 3000+ technology
Is it possible to create a system where Puppet coding is not required for deploying applications?
You give a list of applications you want to install
- get list of packages
- autodiscover dependencies
- generate necessary puppet resources looking at the dependency graph
- execute deployment
To optimize it, they cache this, so next time they install the application all the code is available in memory for quick lookup.
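The steps above (list packages, discover dependencies, generate resources from the dependency graph, cache the result) could be sketched like this. The package names and the in-memory dependency table are made up, and the sketch assumes the graph has no cycles; PayPal's real system was not shown in this detail.

```python
from functools import lru_cache

# Hypothetical package metadata: package -> direct dependencies.
DEPENDENCIES = {
    "aWeb": ("httpd", "openssl"),
    "httpd": ("openssl",),
    "openssl": (),
}


@lru_cache(maxsize=None)  # cached, so repeat lookups come straight from memory
def install_order(package: str) -> tuple:
    """Return the package and its transitive dependencies, dependencies first."""
    ordered = []
    for dep in DEPENDENCIES[package]:
        for name in install_order(dep):
            if name not in ordered:
                ordered.append(name)
    ordered.append(package)
    return tuple(ordered)


def puppet_resources(package: str) -> str:
    """Render one package resource per entry, chained in dependency order."""
    names = install_order(package)
    blocks = [f"package {{ '{name}': ensure => installed }}" for name in names]
    return "\n-> ".join(blocks)


if __name__ == "__main__":
    print(puppet_resources("aWeb"))
```

The generated text uses Puppet's `->` chaining arrow so each package is applied after the ones it depends on; nobody has to hand-write a module per application.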
Break out sessions with more detail later today
Role: 1 role per pool; defines a set of packages to install
Label: a set of versioned packages backed by a yum repository
Label gets placed in line for deployment.
Pools are just a list of VMs servicing functions.
- abWeb role pool receives aWeb and bWeb packages
- abSvc role pool receives aSvc and bSvc packages
Hosts receive versioned packages based on Label
Harendra Narayan
PayPal uses hierarchy
env { geo { dc { pool { host } } } }
They store classes on each of the nodes. Whenever they want to look for a Puppet class applied to a host, they can traverse the tree.
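One way to picture the tree traversal: classes are attached to nodes of the env/geo/dc/pool/host hierarchy, and the classes that apply to a host are collected by walking from the root down to it. The data and the "inherit from every ancestor" rule below are assumptions for illustration; the talk only said classes are stored on the nodes and found by traversing the tree.

```python
LEVELS = ("env", "geo", "dc", "pool", "host")

# Hypothetical data: classes attached at each node of the hierarchy,
# keyed by the path from the root down to that node.
CLASSES = {
    ("prod",): ["base"],
    ("prod", "us"): ["ntp_us"],
    ("prod", "us", "dc1"): [],
    ("prod", "us", "dc1", "abWeb"): ["aweb", "bweb"],
    ("prod", "us", "dc1", "abWeb", "host42"): ["debug_tools"],
}


def classes_for(path: tuple) -> list:
    """Collect classes from the root down to the given node, in order."""
    if len(path) > len(LEVELS):
        raise ValueError("path is deeper than the hierarchy")
    collected = []
    for depth in range(1, len(path) + 1):
        for cls in CLASSES.get(path[:depth], []):
            if cls not in collected:
                collected.append(cls)
    return collected
```

Calling `classes_for(("prod", "us", "dc1", "abWeb"))` answers "what does every host in this pool get", while passing the full five-element path answers the same question for one host.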
activemq clients -> load balancer -> activemq cluster
MCollective wanted long connection, load balancer would time out and terminate connections.
now they connect activemq clients directly to activemq cluster
MCollective has replaced all the SSH scripts. Now if they want to query boxes for what version of a package is installed on all machines, they can run a simple mco query and check the state of packages. Same w/ facts: "how many machines have this or that" is easily retrieved via MCollective. Ops loves it.
- SSH script replacement
Also use MCollective to kick off Puppet on demand.
- REST API enables MCollective to web and other tools
Large number of apps/services -> real time status updates of puppet runs
"Why is it taking so long? ETA?" Using MCollective to monitor deployment status
- They want to make sure system packages are in sync
- PayPal worked with Puppet Labs to create the Puppet Progress module. Every time Puppet processes a resource, it sends a message back to AMQ saying whether the resource failed or is in sync.
Sample update message
{
  "host": "blah",
  "time": "Zulu",
  "type": "puppet_run",
  "catalog_version": 12345,
  "puppet_run_status": "running",
  "package": {
    "status": "successful",
    "name": "axis"
  }
}
progress_mq on GitHub
Nick Weaver Cloud Automation Architect, Hybrid Cloud Service | VMWare @lynxbat
What is vCloud Hybrid Service? IaaS cloud owned and operated by VMWare based on VMWare software
Any application... No Changes
Launching a public infrastructure service: will give network connectivity and common management. If you have stuff in your infrastructure that runs in VMware, moving it to the cloud will run with the same SLA, same tools.
Nick's job == Automation
"Automation is Effort Evolution"
Why is automation important for vCHS?
Needed an automation to scale out cloud offerings at VMWare; what key principles were needed?
- Scalability: need to build something for scale, horizontal all the way across
- Extensibility: got to live with this solution; need to expose it so they can enable more than just ops to use that tool. Multiplies the benefit for everyone.
- Simplicity: cannot build on an automation platform that takes 10 hours of manual configuration
"I'll just do it myself."
- Resiliency: reliability is a highly tuned athlete; you bet on it based on behaviors observed. The difference with resiliency is that you expect things to fail. Plan for failure. Never assume things will stay in place.
Puppet had critical pieces w/ MCollective and VMWare support
"Puppet was the right choice for what VMWare wanted to do."
Zombie runs on Cassandra, JRuby, RabbitMQ
Built a distributed resource management system
Built distributed locking system.
Rez is the refrigerator, their engine is "the chef"
"Needed Puppet Runs at concurrency scale."
Zed lets an ops person say "do this concurrently then wait for..."
Gives management of different types of execution
Call OVFTool via Puppet -> check report/wait report
Can distribute work across multiple nodes/datacenters globally
Modularized in a way, can add Compute cluster to cloud.
talks to Rez, gets compute, talks to Razor, comes back, talks to Puppet take this build server from Razor and add it.
Created an operational language without needing to know Ruby/Java
Load up your Zed code into engine, action can be called immediately from API
REST API endpoint can be called dynamically. Horizontally scales.
13 in-house Puppet Modules so far vCloud Director vShield networking vSphere
Total of 47 modules for everything.
Puppet modules for installing Zombie in production, integration, and development (including Vagrant + Puppet use for laptops).
Project Zombie itself uses Puppet to push and do stuff, and Puppet outside Zombie pushes out/updates Zombie: "A Puppet sandwich"
Can stage environments with MCollective. update that env through job execution can apply env against target at a time
Getting to a fully built stack used to take 72 man-hours, 6 days delivery
Project Zombie launched and moved it to 1.5 man-hours and 2.5 hour delivery
Primarily testing, minor configs
entire job run from call API to complete = 1 hour
turned 4-5 days of human effort to 1hr of hands off
- 120 tasks (plugin calls)
- 3000 config points
- 1400 managed resources
- dynamically sized (pick the # of compute and storage)
- controls: vCloud Director, vCenter, ESXi, EMC VX, Razor, vShield Manager, vShield Edge, Linux, Windows OS
All in the right order, concurrent, distributed, in 1 hr.
Puppet will work just as well outside vCHS as well as inside data center
VMware believes it's something that should be available to all web stacks.
key thing vmware wanted to do is make it easy to bring existing apps and build it together.
LK: Why Puppet?
ML: because it solves an incredibly difficult problem of config management. Puppet helps SLA to market. What they wanted to do is move quickly. Very rapid expansion plans.
Jordan Sissel @jordansissel
Majority maintainer of LogStash Continuation of PuppetConf 2012 Logstash talk
"I get angry at computers."
Take that sour feeling that computers have ruined everything
'#hugops' amazon had an outage, github ddossed, instead of being angry at these things, express "you're having a crappy day, let's hug it out"
- open source and community
- what is a log?
- what's new in logstash?
- logstash ecosystem news
"open [in open source] means community"
One of the things you shouldn't do is tell new users "You're doing it wrong, RTFM"
Documentation isn't perfect.
Goes over example of apache logs, variances with syslog, ntpd log...etc.
time + data = log
- takes logs from anywhere (input)
- parse, process, or combine logs (filter)
- ship them somewhere else (output), in real time
Logstash should be:
- fast (if it feels slow, there's something wrong)
- easy to learn
- easy to deploy
- easy to operate
- easy to extend
If you find it isn't fast and easy to use, then it's a bug (and we can fix it).
logstash 1.2 release Sep 1
125 plugins so far
For any given task, if there is a need to transmit logs from 1 location to another there's a good chance logstash has a plugin.
Speed is way up
- 3.5x higher event rate
- dramatically faster startup time
- Conditionals
output {
  # notify nagios on http 5xx's
  # for events with a 'status' field
  if [status] == '500' {
    email { to => "[email protected]" }
  }
}
- new json schema smaller in size, easier to integrate with
LogStash web, first attempt at a web UI "It's ok, not good."
- LogStash web is dead and gone
- Kibana comes with LogStash 1.2
Tools that LogStash uses or built around LogStash
- elasticsearch can take very high data rates and is scalable; just throw another machine in. Strong community (leslie)
Smaller memory footprint, enables compression by default.
- Kibana
- What logstash web should have been
- Different types of charts, themes, maps
- Filter in logstash to do a lookup to find coordinates for data
Start with input data from logs, then visualize it in Kibana.
LogStash puppet module is umbrella'ed under LogStash project
- versioned
- tested
Puppet DSL lines up really well with logstash data model
%{SYSLOGBASE} <(?<level>\w+)> *%{GREEDYDATA:message}
{
  "timestamp": "Aug 16 13:01:51",
  "logsource": "pork",
  "program": "NetworkManager",
  "pid": "820",
  "level": "info",
  "message": "address"
}
Can start with a sample log and have grok figure out a pattern to watch.
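For readers who don't have grok handy, the pattern above can be approximated with a plain regular expression using named groups. This is a hand-written stand-in for %{SYSLOGBASE}, far less thorough than grok's real pattern, and the sample log line used to exercise it is reconstructed from the field values in the notes rather than copied from the slide.

```python
import re

# Simplified approximation of SYSLOGBASE + level + message (an assumption,
# not grok's actual definition).
LINE = re.compile(
    r"(?P<timestamp>\w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<logsource>\S+) "
    r"(?P<program>[\w./-]+)(?:\[(?P<pid>\d+)\])?: "
    r"<(?P<level>\w+)> *(?P<message>.*)"
)


def parse(line: str):
    """Return a dict of named fields, or None if the line does not match."""
    match = LINE.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Hypothetical sample line.
    print(parse("Aug 16 13:01:51 pork NetworkManager[820]: <info> address"))
```

Each named group becomes a field on the event, which is the same idea as grok's `%{PATTERN:field}` captures.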
Expiring old logs: can do time-based or disk-storage-based arguments for data retention
Support for Thrift? Nobody has submitted one.
Metrics filter labeled beta? the label chosen for plugin status was poorly chosen. In the next version of logstash, they're just numbers, with definitions for what the numbers mean. Benefits of numbers is. There are no tests for it yet, so he's not comfortable moving out of beta title.
Multiline support in grokdebug? If your question is 'can we make it do something' I will respond, 'if it's a computer we can make it do what we want.'
Kibana not intended for highly technical people primarily, so it has a web interface.
- How can we keep logstash cookbook up to date better? "wiki means to me, where documentation goes to die." That's why it's called a cookbook instead of a wiki. Would like to see more examples.
"Show me the top IRC talkers in the past 2 days"
Business folks don't want to see raw logs, they want to see visualization of trends.
We should make systems that are easier to use that don't depend on meatspace.
Learning a query language is a b.s. proposal. If you're writing a complex search query, that's a bug or a feature that should exist.
- Got to conference room too late, standing room only
- Still working on proof of concept
- Not available for public consumption yet
Hard to see slides or hear presentation from back of room. Went to CERN presentation on OpenStack.
CERN using Puppet Foreman "Foreman is our tool for building servers"
Foreman Proxy {
Physical Box
CERN uses an API wrapper internally that may be open sourced eventually
new() { set up compute instance }
Split up services
Puppet critical vs noncritical
12 backend nodes batch, 4 backend nodes interactive
autoscale via alarms (heat)
define situations (load threshold...)
spin up VMs as necessary
Do not serve large files over Puppet.
That's what repositories are for!
- Using their own monitoring system based on Ganglia and Nagios
- "PaurShell": the Northern Irish pronunciation of the Windows-based task orchestration tool
- Developer for OpenTable
- Member of JetBrains Dev Academy
- DevOps Extremist
- classic infr mgmt
- snowflakes
- infrastructure
- PowerShell as a way to manage Windows
- How Puppet kicks its ass
"The Run book" RHEL 5 Installation Guide 416 pages
People are generally rubbish at performing manual repetitive tasks.
Martin Fowler said that snowflake servers are things you have no control over: they've been in production a long time, and you don't know how they'll react.
Machines are much more reliable at performing repetitive tasks.
Very important we don't run our machines at 100%
If we can automate things, we can code them.
Completely destroyed and brought back from scratch
Chad Fowler wrote about
Comes from the idea of running servers in prod for a long time. If they make a deployment push, they bring it up from scratch. Concept comes from functional programming.
Puppet has Geppetto, RubyMine, IntelliJ
Testing with puppet-lint, rspec, huge community
Generally better: you can store your infrastructure in the same place you store your application. Can version it, tie it to app versions; code is generally better.
CFEngine, Puppet, Chef, Ansible, Salt, all built on Linux
Windows has PowerShell
PowerShell 1 was complete horsecrap.
PowerShell 2 was a lot better, RPC control.
PowerShell 3 came along.
PowerShell 4 destroyed it with "Desired State Configuration." Another layer of abstraction over something that's simple to write.
Presentation demo run on his macbook.
When PowerShell 3 came out, MS came out with a clean API.
If you try and add a feature that is already on a server, it doesn't do anything. Just says the server's there.
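That "already there, so do nothing" behavior is idempotence. A minimal sketch, assuming the server's state is just a set of feature names; `ensure_feature` is a hypothetical helper, not a PowerShell or Puppet call.

```python
def ensure_feature(installed, feature):
    """Make sure `feature` is in `installed`; return True only if a change was made."""
    if feature in installed:
        return False  # already present: report it and do nothing
    installed.add(feature)
    return True

server = set()
first = ensure_feature(server, "Web-Server")   # installs the feature
second = ensure_feature(server, "Web-Server")  # no-op on the second run
```

Running it any number of times leaves the server in the same state, which is what makes repeated runs safe.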
Not difficult problems to solve, but up until 2 years ago the Windows community hadn't done a lot to solve them.
Goes over Chocolatey for a bit
Live Demo: destroyed and rebuilt a qa server using PowerShell 3 in seconds
Been around since Puppet 2.7.6
Getting better, but still nascent.
Not all the types from linux side are available in Windows.
Limited in functionality.
Lots of exec statements, not the best way to write modules.
Need to convert execs to custom Types.
OpenTable will be open sourcing a lot of their Windows Puppet Modules to the Forge.
Going to run from a base, newly provisioned Windows Server 2008 with just Puppet installed, via a VPN connection to his box in London.
Somebody mentioned SCCM, crowd laughs.
Will Farrington @wfarr
Presentation to look up
Throwing out preconceptions on how things work now, and how things could work back at the beginning.
What in the hell is my principle?
Developing Software is hard
Painful software creates friction
Friction gets between product and user
"Software should get out of the way."
GitHub statement = Help ppl design, build, and ship software better, together.
Need more porcelain to enable ppl to ship
Shipping isn't just for software. Legal team ships a takedown, HR ships a policy, can't just treat it as a software design problem exclusively.
"We need a mission statement."
TAFT - Test All The Fucking Time
Need to be able to throw your code spike away and go back to TDD
"Whatever you do, make sure you are testing, because if you aren't, all you are doing is making it harder for yourself when you revisit the code and making it even harder..."
"Whatever you do, make sure you are automating, because if you aren't, all you are doing is making it harder for yourself when you revisit the problem and making it even harder for the next person..."
There are ppl in GitHub who don't work in Rails.
There are ppl who do work in Rails.
Aiming for a higher level of abstraction.
How do you describe a workflow for an environment?
We know a thing about development environments.
But we are clueless when it comes to empowering ppl outside developer space.
"non-technical" ppl are very technical ppl who are specialized in their own skills. Devs cannot do what lawyers/HR does as well as they do.
We need to understand different kinds of environments and then make them better with automation.
Saying they should use "our tools" is a cop-out.
It's about finding tools that enable a better workflow.
- long-lived piece of software
$ boxen github
... now Jill developer can work on GitHub.
What about everybody in finance?
class projects::financial_audit {
Arrive at a better level of abstraction for everybody.
How are we going to improve Boxen
- OS X 10.9 "Mavericks" Support
- No manual Xcode installation required
- 10.9 has shims installed for CLI dev tools
Hiera everywhere
All boxens are designed with Hiera
- Updating modules to get a new version of X sucks.
- Trying to run a service on a different port sucks.
- YAML changes are more approachable than Puppet
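The appeal of Hiera here is that a setting like a service port becomes data looked up through a hierarchy, most specific level first, instead of something edited in a module. A minimal sketch of that first-match lookup, with plain dicts standing in for YAML files; the key names, values, and `lookup` helper are hypothetical and not Boxen's actual data.

```python
def lookup(key, hierarchy, default=None):
    """Return the value from the first (most specific) level that defines key."""
    for level in hierarchy:
        if key in level:
            return level[key]
    return default

# Most specific first: a per-user override, then shared defaults.
user_level = {"redis::port": 16379}
common_level = {"redis::port": 6379, "mysql::port": 3306}
hierarchy = [user_level, common_level]

redis_port = lookup("redis::port", hierarchy)  # override wins
mysql_port = lookup("mysql::port", hierarchy)  # falls through to the default
```

Changing a port then means editing one data entry rather than touching the module.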
More core modules will get support over time
dnsmasq, MySQL, PG
Trying to figure out how to make some things work, e.g., running multiple instances of a Redis server.
PuppetMaster support: use a PuppetMaster in the context of their Boxen environment.
librarian-puppet is not the greatest piece of software. author is co-writing Henson.
librarian-puppet "works"ish (sort of)
Librarian makes a lot of assumptions that don't fit Puppet's use cases.
Not extensible. Not a very good library.
Henson is not a dead project.
Rodjek is also working on it.
Ruby, stdlib only, no dependencies. Local caching and block file work still needed.
Some other priorities needed their attn first.
- Boxen is not perfect
- made lives easier at GitHub
- might make your life easier
- Don't use it because there's a big name attached.
- Why did they go with Puppetfile instead of Modulefile? They want Boxen to use the Modulefile itself eventually, and to avoid different behavior between Puppet and Boxen with Modulefile.
The puppet module tool will be split out of Puppet core and become its own tool again, according to Puppet Labs.
Rumor: Modulefile may be going away, eventually replaced by a metadata.json file.