Thursday, August 23, 2012

say what, yum? #linux #devops

Linux generally doesn't pretend to be user friendly.  Geeks love Linux just for that.  This is an example of Linux moving from geeky to dorky.
Warning: 3.0.x versions of yum would erroneously match against filenames.
 You can use "*/cantfindit" and/or "*bin/cantfindit" to get that behaviour

That's the error message you get when using 'yum provides' to find a package.  The complete output is:
# yum provides thatpackage
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirror.metrocast.net
 * epel: mirror.metrocast.net
 * extras: mirror.cogentco.com
 * updates: mirror.metrocast.net
Warning: 3.0.x versions of yum would erroneously match against filenames.
 You can use "*/thatpackage" and/or "*bin/thatpackage" to get that behaviour
No Matches found

So what does this mean?  I mean... I never much used yum 3.0.x, so what does this have to do with me?

Here's what yum -h says about the 'provides' option:
provides       Find what package provides the given value
hmmm... not so much help.  How about 'man'?
provides or whatprovides
Is  used  to find out which package provides some feature or file. Just use a specific name or a file-glob-syntax wildcards to list the packages available or installed  that provide that feature or file.
Say what?  "Just use a specific name or a file-glob-syntax wildcards..."


Yum is trying to say, in a particularly obtuse way, "try using */thatpackage and/or *bin/thatpackage."  When you use the wildcard in this fashion (what they are evidently calling the file-glob-syntax wildcards), yum will search both the package name AND the filename.  By default it only searches the package name, which often is not particularly helpful when you're trying to find a package or resolve a dependency.  Searching by filename IS a pretty useful feature.  Perhaps the docs could have explained it a little better.  Just sayin.
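A quick example of the difference (ifconfig/net-tools is just my illustration here, not something from the warning text).  Ask for the bare name and you get nothing, because ifconfig is a file, not a package name:

# yum provides ifconfig
No Matches found

Add the glob and yum searches the file lists too, turning up net-tools (the package that ships /sbin/ifconfig on CentOS).  Quote the glob so the shell doesn't expand it:

# yum provides '*bin/ifconfig'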

Friday, April 6, 2012

Another blog post on HOWTO PORT FORWARD with IPTABLES #in #sysadmin


My application (puppetmaster) is behind a firewall, and there is no direct access from the internet to the server.  So it was necessary to configure a host in the DMZ to act as a proxy.  This was accomplished using IPTABLES to do a forward and reverse NAT. Iptables version: v1.3.5.


This first step is very important.  Without doing this, you will scratch your head wondering why your perfectly formed IPTABLES rules don't get the job done.  


Step 1: enable ip forwarding on the system (if it hasn't already been done):
echo 1 > /proc/sys/net/ipv4/ip_forward
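Note that echoing into /proc only lasts until the next reboot.  To make the setting persistent on CentOS, flip it in /etc/sysctl.conf as well -- a sketch of one way to do it, assuming the stock net.ipv4.ip_forward line is already present in that file:

# survive a reboot: set the flag in /etc/sysctl.conf and reload
sed -i 's/^net.ipv4.ip_forward.*/net.ipv4.ip_forward = 1/' /etc/sysctl.conf
sysctl -p
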
 Step 2: configure IPTABLES.  These rules live in the nat table, so note the -t nat below.  Run them, then on CentOS persist them with 'service iptables save' (which writes them into /etc/sysconfig/iptables):
iptables -t nat -A PREROUTING -p tcp -s *route_only_for_this_ip* -d *router_ip* --dport *router_port* -j DNAT --to *destination_ip*:*destination_port*
iptables -t nat -A POSTROUTING -o eth0 -d *destination_ip* -j SNAT --to-source *router_ip*
A few definitions for the above:


--dport 8140 -- the default puppetmaster listening port
*route_only_for_this_ip* -- if you want to limit incoming IPs to a single known IP (not required)
*router_port* -- which port incoming requests are accepted on.  This does not have to be the same as the destination_port.  No one should rely on security through obscurity, but using a non-default external port is still a reasonable practice.
*router_ip* -- the host doing the routing (the one running these rules!)
*destination_ip* -- the server that will actually be servicing the request
*destination_port* -- the port on the above server that will accept requests
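To make that concrete, here's a sketch with made-up RFC 5737 test addresses: the DMZ router is 203.0.113.10, the internal puppetmaster is 10.0.0.20 listening on 8140, and only 198.51.100.5 is allowed in:

iptables -t nat -A PREROUTING -p tcp -s 198.51.100.5 -d 203.0.113.10 --dport 8140 -j DNAT --to 10.0.0.20:8140
iptables -t nat -A POSTROUTING -o eth0 -d 10.0.0.20 -j SNAT --to-source 203.0.113.10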


Now here's an important note.  If you input those rules and run a simple:
iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination


What?  Your rules are NOT in your output!  To see them, you need to specify the nat table, like this: 
iptables -L -t nat
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination      
DNAT       tcp  --  anywhere             router_ip                tcp dpt:router_port to:destination_ip:destination_port
Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination      
SNAT       all  --  anywhere             destination_ip      to:router_ip
Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
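One more small tip (not shown in the output above): adding -n skips the DNS and service-name lookups, so you see raw IPs and ports instead of resolved names, and the command returns much faster:

iptables -t nat -L -n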

Resources:


http://www.linuxquestions.org/questions/linux-networking-3/iptables-port-forwarding-599401/
http://www.whatismyip.com -- love this tool.  From a command line, use wget or curl against: http://automation.whatismyip.com/n09230945.asp
http://www.linuxquestions.org/questions/linux-software-2/iptables-list-does-not-show-pre-or-postrouting-rules-785184/

Monday, March 12, 2012

The decline and fall of system administration #in #devops

wherein @pvenezia bemoans: "Virtualization makes it all too easy to spawn new instances rather than figuring out what went wrong. Is this the end of Unix best practices?"

http://www.infoworld.com/d/data-center/the-decline-and-fall-system-administration-375

Monday, February 13, 2012

Different takes on why monitoring sucks and what's to be done about it #in


Why monitoring sucks — for now

http://gigaom.com/2012/02/12/why-monitoring-sucks-for-now/


A new (old) model
I’d suggest that any well-designed monitoring tool can help automate the OODA loop for operations teams.
1. Deep integration
2. Contextual alerting and pattern recognition
3. Timeliness
4. High resolution
5. Dynamic configuration
What’s next for monitoring?


Why Alerts Suck and Monitoring Solutions need to become Smarter

http://www.appdynamics.com/blog/2012/01/23/why-alerts-suck-and-monitoring-solutions-need-to-become-smarter/

#1 Problem Identification – Do I have a problem?
#2 Problem Isolation – Where is my problem?
#3 Problem Resolution – How do I fix my problem?



My ideal monitoring system
http://forecastcloudy.net/2012/01/12/my-ideal-monitoring-system/



  • Hosted (CloudKick, ServerDensity, CloudWatch, RevelCloud and others) vs installed (Nagios, Munin, Ganglia, Cacti)
  • Hosted solutions' pricing plans use varied parameters such as price/server, price/metric, retention policy, # of metrics tracked, realtime-ness, etc.
  • Poll-based -- where the collecting server polls the other servers/services -- vs. push -- where a client on each server pushes locally collected data to the logging/monitoring server
  • Allowing custom metrics -- not all systems allow monitoring, plotting, and alerting on custom data (at least not in an easy manner)

Friday, February 3, 2012

puppet day #2 -- and I need a custom fact

Objective
One of the first things I wanted to accomplish with puppet was to track down rogue cron jobs under the accounts of people who are no longer here.  The broader objective is to delete old/unused accounts.

Problem
But there was some evidence that a few of these old accounts still had cron jobs running.  So we couldn't just delete the old accounts; we needed to proceed cautiously to ensure we didn't stomp on some cron job that was actually needed!

I was looking for puppet to tell me which systems had cron jobs under this old account.  Now, puppet is a declarative language, so something like:

if /var/spool/cron/userfoo exists, notify me, so I can take a look and see what I need to fix/replace
doesn't exist!  In puppet, you have to declare whether something should or should not exist and then puppet will take the corresponding action.  I just wanted puppet to tell me about something on my system.  I didn't want puppet to take an action!

Solution
It's up to the puppetlabs-provided facter to help out here.  Puppet ships with a bundle called facter that collects a lot of bits of information about systems, like their OS, RAM, kernel version, etc.  The code to gather these facts is written in ruby and is extensible.  I needed a custom fact that would indicate whether or not /var/spool/cron/userfoo or (on solaris) /var/spool/cron/crontabs/userfoo exists.  Writing that code is actually straightforward (my first ruby code ever! yay!).  Getting that code onto my agents had an obstacle to overcome.
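The fact itself ends up being just a few lines of ruby, something like this (the fact name and file path here are arbitrary illustrations):

# e.g. modules/crons/lib/facter/userfoo_cron.rb
Facter.add(:userfoo_cron) do
  setcode do
    # Linux keeps per-user crontabs in /var/spool/cron,
    # Solaris in /var/spool/cron/crontabs
    ['/var/spool/cron/userfoo',
     '/var/spool/cron/crontabs/userfoo'].any? { |f| File.exist?(f) }
  end
end

You can test it locally with 'facter userfoo_cron' (pointing FACTERLIB at the directory holding the .rb file) before worrying about distribution.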

Problem #2
Puppet does not deliver custom facts to agents by default.  Agents and the puppetmaster need this set in /etc/puppet/puppet.conf:
    pluginsync = true
This required using puppet to update the puppet.conf and restart puppet.  That's what I built.  Getting puppet to allow delivery of custom facts by default is a listed feature request: http://projects.puppetlabs.com/issues/5454
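In context, the setting sits in the [main] section of puppet.conf (putting it under [agent] also works for the agent side):

[main]
    pluginsync = true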


The only gotcha here is to make sure you include:
     hasrestart => true,
in your init.pp for the puppet service.  Otherwise puppet will send itself a stop but never the start, since once it has stopped it's no longer running to issue the start!
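Put together, the service resource looks something like this (a sketch; the subscribe line assumes a file resource managing puppet.conf, and the service name may differ by platform):

service { 'puppet':
  ensure     => running,
  enable     => true,
  hasrestart => true,  # use the init script's 'restart' instead of stop + start
  subscribe  => File['/etc/puppet/puppet.conf'],  # assumed file resource
}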

Resources
http://conshell.net/wiki/index.php/Puppet
grabbed this:
kill -USR1 `cat /var/run/puppet/puppetd.pid`; tail -f /var/log/syslog 
from the above link.  Which I shortened to: 
kill -USR1 `pgrep puppet`; tail -f /var/log/syslog 

Config details after the jump

Puppet installed -- let's do something!

Having got a critical mass (but not all) of my servers running puppet and talking to the puppetmaster, I was ready to start actually doing something with puppet.  So the first thing I wanted to do was update the motd on the servers.  I appreciate a standard look and feel when logging into a server, along with some useful info about the host I'm on.  Moreover, I wanted to communicate to system users that these files are now managed by puppet and shouldn't be edited by hand.

I found this: https://github.com/aussielunix/puppet-motd, which uses a puppet template to collect a number of facts along with a really big ASCII banner that I quite like.
                              _   
 _ __  _   _ _ __  _ __   ___| |_ 
| '_ \| | | | '_ \| '_ \ / _ \ __|
| |_) | |_| | |_) | |_) |  __/ |_ 
| .__/ \__,_| .__/| .__/ \___|\__|
|_|         |_|   |_|             
                                            _   _ 
 _ __ ___   __ _ _ __   __ _  __ _  ___  __| | | |
| '_ ` _ \ / _` | '_ \ / _` |/ _` |/ _ \/ _` | | |
| | | | | | (_| | | | | (_| | (_| |  __/ (_| | |_|
|_| |_| |_|\__,_|_| |_|\__,_|\__, |\___|\__,_| (_)
                             |___/                


Any files that have a 'Puppet' header need to be changed in puppet. 

Interesting tidbit
In my motd.erb template, I included:
Uptime:    <%= uptime %>
What happens with this is that the "uptime" fact (and the other facts included in the template) gets evaluated on every puppet run -- the template is rendered on the puppetmaster using the facts the agent reports -- and a flat file without the puppet markup is laid down on the file system.  This file gets compared and reevaluated on every run.  Here's the point: every day the uptime changes, so a new file is laid down in /etc/motd and the old file is backed up.  This is clearly pretty inefficient, and needs to be replaced with something that computes the uptime at login rather than on every puppet run.
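One way to do that (a sketch, not something I've deployed yet): drop the Uptime line from the template and print it at login time instead, from a profile.d script:

# /etc/profile.d/motd-uptime.sh (hypothetical path)
# computed at login, so /etc/motd itself never has to change
echo "Uptime:    $(uptime)"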

Resources
https://github.com/aussielunix/puppet-motd
My init.pp and motd.erb are in the comments.  I just discovered that I can't format in comments, so I'm adding the file specs after the jump...

Wednesday, February 1, 2012

managing users with puppet

useful resource:
http://itand.me/using-puppet-to-manage-users-passwords-and-ss

Friday, January 27, 2012

Bumps along the way of deploying puppet

In my new environment we have about 100 servers of various flavors... predominantly CentOS and Solaris, with several RedHat servers and a couple of Windows and Debian boxes. The configurations, versions and patch releases are all over the place. Some of these boxes are quite old (cough) Fedora 5 (cough) (cough) Solaris 9 (cough).

My first goal is simply to get puppet onto all of these servers. Of the ~100 servers I need to manage, about 30 of them are dev/qa/test boxes. I now have puppet installed on all of them. There were a few bumps along the way.

Impediments

1. The right repository--I'm sure that for the yum gurus out there this will seem trivial, but it was a problem for me. A repository I was initially using had an older version of puppet (which I did not realize immediately). It wasn't until one of the boxes I was installing puppet on already had a repository configured with a newer version of puppet that I realized something was off. And it wasn't until I tried connecting that box to the puppet-server that I knew I had a problem, because I got this somewhat unhelpful error: Error 400 on SERVER: No support for http method POST

Thanks to http://bitcube.co.uk/content/puppet-errors-explained for the explanation.

So, I updated the puppet-master and I fixed the repository I was using and now I'm getting the latest and greatest.

2. Yum dependencies--Occasionally I ran into dependency issues when running yum install. It wasn't terribly clear to me why I got these errors, but generally it happened when there was a longer list of dependencies. I could typically work around this by simply doing a yum install of one of the dependent packages first, and then trying the yum install puppet again.
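For example (using facter, puppet's main dependency, as the illustration -- the exact package that unwedges yum may vary):

# yum install facter
# yum install puppet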

3. Old OSes without the required packages--In some cases I could not work around the dependencies because the OS version was so old--Fedora 8, 7 and 5. These OSes were looking for libselinux-util, which wasn't made available until Fedora 10! Note to self: put these systems at the top of the list to retire.

4. puppetmaster directory details--Also worth mentioning: it took me some time to sort out which directories are needed on the puppetmaster and where they need to be located. I'm not sure if this is a poor-documentation problem or a user problem, but it took some trial and error to get it right.

I needed to have:

/etc/puppet/manifests/site.pp
/etc/puppet/modules

and as an example under /etc/puppet/modules I needed:

/etc/puppet/modules/sudo/manifests/init.pp
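To lay that skeleton down in one shot (the sudo module being just the example from above):

# mkdir -p /etc/puppet/manifests /etc/puppet/modules/sudo/manifests
# touch /etc/puppet/manifests/site.pp /etc/puppet/modules/sudo/manifests/init.pp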


Resources


  1. AWESOME, very helpful and engaged channel: puppet IRC (server: irc.freenode.net, room: #puppet)
  2. List of common puppet errors with pointers to fix: http://bitcube.co.uk/content/puppet-errors-explained
  3. Of course the puppet docs, particularly for installing puppet on solaris: http://projects.puppetlabs.com/projects/1/wiki/Puppet_Solaris
  4. RPM search: http://rpm.pbone.net/
(updated to clean-up layout, edit fonts, etc)

Implementing DevOps

I've started at a new company. They are a very large company with a medium-sized web presence, operating several on-line brands for a niche audience.

Generally they are a functioning company with a well established environment that is running well enough, but is ready for an overhaul. They have a mid-to-long term project to consolidate different content management systems into a unified content management system that allows for sharing of content between brands. This larger project provides an opening to perform a major face lift on the internal operations. WooHoo!

Currently, releases are largely un-automated and time consuming: they take place during off-hours and require quite a few people on-line to do the actual work and testing, or just to be on hand in the event something blows up. There seems to be plenty of room to implement a DevOps methodology for releases, particularly around automation and measurement.