MySQL: ERROR 1286 (42000): Unknown table engine ‘InnoDB’

I was asked to review an application recently that had some performance issues. The db engine was MySQL and the database was getting hammered. Changing the behavior of the queries the app used was not a short-term option, so I started looking at tuning the MySQL configuration.

The box in question was a “large” EC2 instance (two cores, 8 GB RAM), with the MySQL data dir mounted on an EBS volume. One of the parameters I tweaked was “innodb_buffer_pool_size”. Out of the box it was set to 8MB, which is tiny. After reading some docs and blog posts on the subject, I decided to go with the suggested 50-80 percent of RAM. After setting the value to 4GB, I restarted and things were humming along.
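
The 50-80 percent rule is easy to turn into a number. Here is a minimal sketch, assuming a Linux box where /proc/meminfo is available. The function names pool_size_mb and total_ram_kb are made up for illustration and are not part of MySQL:

```shell
#!/bin/bash
# Hypothetical helper: size in MB for a given total RAM (in kB) and percentage.
pool_size_mb() {
    local total_kb="$1" percent="$2"
    echo $(( total_kb * percent / 100 / 1024 ))
}

# Total RAM in kB as reported by the kernel (Linux only).
total_ram_kb() {
    awk '/^MemTotal:/ { print $2 }' /proc/meminfo
}

# Example: 50 percent of this machine's RAM, formatted as a my.cnf line.
# echo "innodb_buffer_pool_size = $(pool_size_mb "$(total_ram_kb)" 50)M"
```

With exactly 8 GB (8388608 kB) and 50 percent this gives 4096, which is the 4GB value above. Whatever else runs on the box still needs its share of the remaining RAM.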

I came back a while later and noticed this error in the app logs: Unknown table engine ‘InnoDB’ .

I logged into the mysql console and got the same error trying to run some simple SELECTs on data that I knew should be there. As it turns out, MySQL will go ahead and shut down the InnoDB engine if it has trouble allocating memory for the InnoDB buffer pool. I suspect the pool started off small, grew progressively, and eventually failed to find the RAM it needed, at which point MySQL turned off InnoDB. Nice! Of course restarting MySQL would “fix” the issue for a while until it happened again, so I ended up turning the buffer pool size down a bit to reduce the chance of a repeat.
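
Since nothing complains until queries start failing, it may be worth checking the engine list from a monitoring script. This is a rough sketch, assuming the tab-separated output of `mysql -N -B -e 'SHOW ENGINES'` with the engine name in the first column and the Support value in the second. innodb_available is a hypothetical helper name. The exact Support value for a disabled engine may differ between MySQL versions, so anything other than YES or DEFAULT is treated as a failure:

```shell
#!/bin/bash
# Hypothetical helper: reads SHOW ENGINES rows on stdin (tab separated) and
# succeeds only if InnoDB is listed with Support YES or DEFAULT.
innodb_available() {
    awk -F'\t' '
        $1 == "InnoDB" { support = $2 }
        END { exit !(support == "YES" || support == "DEFAULT") }
    '
}

# Example (assumes the mysql client can log in without prompting):
# mysql -N -B -e 'SHOW ENGINES' | innodb_available || echo "InnoDB is not available"
```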

A couple of related posts I found useful:

InnoDB engine disabled when specifying a buffer pool size too high

Choosing innodb_buffer_pool_size

Posted in /geek, /sysadmin, /unix | 1 Comment

Ganglia: Using the ruby gmetric gem

If you are new to Ganglia this post might not make much sense. If you are interested in Ganglia I suggest you read this paper first to get a better understanding of how it works.

Over the past year I have been moving away from Munin and towards Ganglia for my server monitoring needs (for reasons that should go in another post). One of the nice things about Munin is the number of plugins that are included out of the box. Ganglia is lacking in this department. The lack of time to rewrite plugins was one of the initial obstacles preventing me from ditching Munin altogether. Once I found the gmetric gem for ruby, that all changed.

Using the gmetric gem you can write a ganglia plugin in ruby and send the metrics directly from your ruby script without shelling out to the “gmetric” command line tool provided by Ganglia. The other nice thing about the gem is that it supports the “group” part of the ganglia message protocol, so you can specify a grouping for the metric you submit. Using the stock “gmetric” tool that ships with ganglia in Ubuntu, all data goes into an uncategorised group in ganglia.

Here is an example of a simple gmetric plugin using the gem. It counts some files in a directory and reports the count to the ganglia cluster:

Ganglia file count graph

Posted in /geek, /sysadmin, /unix, /work, monitoring, ruby | Leave a comment

ATT SMS Gateways, consistent source short code

At work we get nagios pages via SMS using the @txt.att.net email-SMS gateway. Everyone who receives these pages also happens to have an iPhone. Our setup was fairly annoying because every txt message had a unique sender, so in the event that you receive a ton of alerts it’s really hard to clean up afterwards.

Found the solution here. Use @mobile.mycingular.com.

Posted in /dev/random | 3 Comments

Maven tweaks continued…

In a previous post I mentioned using the settings.xml file to force maven to use a local repository. In my case this is Jfrog’s Artifactory.

This was accomplished using the “mirror” element in the settings.xml file (example). Most of our artifacts are downloaded from the “central” repository, but some of them have dependencies that are not part of “central”. This resulted in the build reaching out to other 3rd party sites to download artifacts. This is undesirable for us because we want to have control over all artifacts, and it can also slow down the build or even break it when those sites go down, which is how we discovered it in the first place.

I banged around on Google for a while looking for mailing list posts or examples of forcing mirroring of all repos, to no avail. So on my lunch break today I decided to stop by the book store and pick up the O’Reilly Maven book. A couple of minutes of poking around and I found an example that set up a mirror using a wildcard.
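
A wildcard mirror along those lines might look like the following. This is a sketch, not the book’s example: the mirror id, the Artifactory URL and the write_mirror_settings helper are placeholders. The `*` in “mirrorOf” is what tells maven to send requests for every repository to the mirror:

```shell
#!/bin/bash
# Hypothetical helper: write a minimal settings.xml that mirrors every
# repository ("*") to a single URL.
# Arguments: target file, mirror URL (both chosen by the caller).
write_mirror_settings() {
    local target="$1" url="$2"
    cat > "${target}" <<EOF
<settings>
  <mirrors>
    <mirror>
      <id>local-mirror</id>
      <mirrorOf>*</mirrorOf>
      <url>${url}</url>
    </mirror>
  </mirrors>
</settings>
EOF
}

# Example (placeholder URL, overwrites any existing file):
# write_mirror_settings ~/.m2/settings.xml http://artifactory.example.com/artifactory/repo
```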

Posted in /geek, /sysadmin, /unix, /work | 1 Comment

READYNAS NV+ Kernel Out of memory

Decided recently to pick up a Readynas NV+ to supplement our file storage at the office. I have the Readynas Duo at home and have been quite happy with it. It is a linux based NAS appliance and after installing a few add-ons you can have a remote root shell and apt packages to install things like munin and NRPE which happen to be what I use for monitoring.

As with most NAS appliances it supports SMB, AFP and FTP. It can also run an rsync daemon, or with shell access you can set up cron jobs and run rsync from the shell just like on a normal linux box. Other features which I don’t use include support for Apple’s Time Machine, Squeezebox, a web based bittorrent client and a handful of other stuff.

After I set the device up I figured I would burn it in for a couple of days to make sure everything was happy. I had upgraded the RAM from the default 256 MB to 1 GB because I had read some benchmarking reports that indicated a noticeable increase in I/O performance. Much to my surprise, after a day or two of little or no activity I got a message from Nagios that it could not talk to the NRPE instance on the device. It was still responding to pings but all services on the device were horked. I rebooted it and checked the logs to find a bunch of “kernel: out of memory” messages indicating that both RAM and swap were entirely consumed and the kernel was killing off processes in an attempt to recover:

nas1corp kernel: Out of memory: Killed process 7823 (cron).
nas1corp kernel: Out of memory: Killed process 7824 (sh).
nas1corp kernel: Out of memory: Killed process 7840 (sh).
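
When there are a lot of these lines, a quick tally of which processes got killed can hint at what was running when memory ran out. A small sketch, assuming log lines in exactly the format shown above. oom_victims is a made-up name:

```shell
#!/bin/bash
# Hypothetical helper: read syslog lines on stdin and print "count name" for
# each process name the OOM killer reported, most frequent first.
oom_victims() {
    sed -n 's/.*Out of memory: Killed process [0-9]* (\([^)]*\)).*/\1/p' \
        | sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}

# Example (the log path is an assumption and varies by system):
# oom_victims < /var/log/messages
```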

This seemed to be obviously a memory leak, however after some quick searching and ad-hoc research I found no obvious answers and decided to shelve it for a while.

Today I finally came across this thread where at least a few other people had the same problem. The person reporting the problem was also running the munin client on his Readynas.

So it appeared to be a kernel memory leak that in this case was triggered by bash, which is used in a bunch of the munin plugins. I just upgraded to the firmware release posted on the mentioned thread and have noticed a huge difference already. I think it’s great that netgear finally fixed this problem. I also think it’s great that the openness of their product allowed customers to troubleshoot it for them, however I was not all that impressed that it took them almost a year to get the issue resolved.

Posted in /sysadmin, /unix, /work | Leave a comment

TemPageR 3E environment monitor

Recently ran into some heat issues in a mini computer room that has a window in direct line of the sun and problematic HVAC. Picked up a network enabled environment monitor so I could have some data to show when trying to get the issues resolved. Originally I was thinking about getting the Websensor EM01B that has been featured on the nagios website for ages and ages. I also investigated getting an APC unit that could plug directly into our UPS. I ended up settling on the AVTECH TemPageR 3E, mostly because of price. It was cheaper than most of the options and it came with an external temp probe in addition to the integrated probe.

All in all I am happy with it. I wrote some quick and dirty nagios and munin plugins that monitor the unit via SNMP. The web interface renders data using JSON, so it would be possible to write some plugins that used that as well, but the ruby json libs blew up on it, and when I ran the JSON through a validator it turned out to be not 100% valid, so I stuck with SNMP.

I had to contact support and so far the support has been really good. I was noticing that my graphs had gaps in them. I started doing some debugging and found that sometimes SNMP queries of the external temp probe return nothing. I contacted support and they suggested that I check to make sure that the wiring for the external probe does not run near any HVAC gear or other sources of electrical interference. After some rewiring I have managed to escape most of the interference and reduced the number of times SNMP queries for the external probe fail.

munin graph

Here is the shell script munin plugin. I doubt it complies 100 percent with their plugin requirements. Use at your own risk:

#!/bin/bash
GRAPH_TITLE="TITLE"
GRAPH_VLABEL="Degrees Fahrenheit"
S1_LABEL="Integrated Sensor"
S2_LABEL="External Sensor"
COMMUNITY="snmp_com_here"
HOST="1.1.1.1"
SNMPWALK="/usr/bin/snmpwalk"

# Print the graph configuration when munin runs the plugin with the "config" argument
if [ "${1}" == "config" ];then
        echo graph_title ${GRAPH_TITLE}
        echo graph_vlabel ${GRAPH_VLABEL}
        echo sensor1.label ${S1_LABEL}
        echo sensor2.label ${S2_LABEL}
else

        # Get the values via SNMP. They are reported without the decimal point, so an ugly sed one-liner puts it back

        VALUE=$(${SNMPWALK} -v1 -c ${COMMUNITY} ${HOST} enterprises.20916.1.7.1.1.1.2.0 | awk '{print $4}'  | sed 's/..$/.&/;t;s/^.$/.0&/')
        echo sensor1.value ${VALUE}

        VALUE=$(${SNMPWALK} -v1 -c ${COMMUNITY} ${HOST} enterprises.20916.1.7.1.2.1.2.0 | awk '{print $4}'  | sed 's/..$/.&/;t;s/^.$/.0&/')
        echo sensor2.value ${VALUE}

fi
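
For what it is worth, here is what the sed one-liner does to a raw reading, pulled out into a function so it can be tried without an SNMP device. The name insert_decimal is made up for this sketch, the sample readings are invented, and the `t;s` form assumes GNU sed:

```shell
#!/bin/bash
# Hypothetical helper: turn a raw reading in hundredths into a decimal value.
insert_decimal() {
    sed 's/..$/.&/;t;s/^.$/.0&/'
}

echo 7250 | insert_decimal   # two or more digits: a point goes before the last two
echo 5 | insert_decimal      # a single digit is padded to hundredths
```

The first call prints 72.50 and the second prints .05. The `t` skips the second substitution whenever the first one matched.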

Here is the nagios plugin, same disclaimer:

#!/bin/bash
COMMUNITY="snmp_com_here"
HOST="1.1.1.1"
SNMPWALK="/usr/bin/snmpwalk"
WARNING=90
CRITICAL=95

# Get the values via SNMP. They are reported without the decimal point, so an ugly sed one-liner puts it back
VALUE1=$(${SNMPWALK} -v1 -c ${COMMUNITY} ${HOST} enterprises.20916.1.7.1.1.1.2.0 | awk '{print $4}'  | sed 's/..$/.&/;t;s/^.$/.0&/')
VALUE2=$(${SNMPWALK} -v1 -c ${COMMUNITY} ${HOST} enterprises.20916.1.7.1.2.1.2.0 | awk '{print $4}'  | sed 's/..$/.&/;t;s/^.$/.0&/')

# An empty or non-numeric reading (the probe sometimes returns nothing) is UNKNOWN
NUM='^[0-9]*\.[0-9]+$'
if ! [[ ${VALUE1} =~ ${NUM} && ${VALUE2} =~ ${NUM} ]]; then
        echo "TEMP UNKNOWN: no reading from one or both sensors"
        exit 3
fi

# Compare as numbers. [ a \> b ] compares strings, so "100.50" would sort below "90"
ge() { awk -v a="$1" -v b="$2" 'BEGIN { exit !(a >= b) }'; }

if ge "${VALUE1}" "${CRITICAL}" || ge "${VALUE2}" "${CRITICAL}"; then
        echo "TEMP CRITICAL: ${VALUE1} ${VALUE2}"
        exit 2
elif ge "${VALUE1}" "${WARNING}" || ge "${VALUE2}" "${WARNING}"; then
        echo "TEMP WARNING: ${VALUE1} ${VALUE2}"
        exit 1
else
        echo "TEMP OK: ${VALUE1} ${VALUE2}"
        exit 0
fi
Posted in /geek, /StupidShellTricks, /sysadmin, /unix, /work | Leave a comment

Overriding the default maven repository

If you are using maven for builds or dependency management you will notice that by default maven always attempts to pull dependencies from the public maven repository (repo1).

After reading documentation and examples I thought you could easily override this by specifying a local repository in your respective project object model file (pom.xml). Example.

Not the case in my experience. Every attempt to specify a local repository always resulted in the official maven repo being checked first. If the dependency was not available in the public repo it would then check the local repo.

After some reading I found the useful maven command “help:effective-pom”, which does what it says. It takes all the information being inherited from other poms and spits out the xml for what is your effective pom.xml file when maven is run. The maven repo is included in maven’s “super pom”, which is inherited by all poms. No matter what I tried, the repository entry from the super pom was always first in line before all other repos.
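
The effective pom is long, so it helps to pull out just the repository URLs in the order they appear. A rough sketch, assuming each “url” element sits on its own line inside a “repository” element, which is how the effective pom is normally printed. repo_urls is a hypothetical helper, and it will also pick up any “repository” under distributionManagement:

```shell
#!/bin/bash
# Hypothetical helper: read an effective POM on stdin and print the <url> of
# every <repository> element, in document order.
repo_urls() {
    awk '
        /<repository>/   { in_repo = 1 }
        /<\/repository>/ { in_repo = 0 }
        in_repo && match($0, /<url>[^<]*<\/url>/) {
            # Skip the 5 characters of <url> and drop the 6 of </url>.
            print substr($0, RSTART + 5, RLENGTH - 11)
        }
    '
}

# Example (assumes mvn is on the PATH and is run from the project directory):
# mvn help:effective-pom | repo_urls
```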

The only way I could override this was to use the “mirror” element in the settings.xml file, which can reside in either /etc/maven2/ for system wide settings or ~/.m2/ for user settings. This is not ideal for us because we want everyone who checks out this maven project to automatically use our local repo without having to touch config files on their machine. For the linux machines it’s not a big deal because we can control this file with puppet, but it’s a slight pain for everything else.

Posted in /geek, /sysadmin, /unix, /work | 1 Comment

This RRD was created on other architecture

I use Munin for some graphing of system and application stats. Like most open source graphing projects, it’s based on the widely used RRDTool.

I recently was moving my munin instance from a Xen instance on a 64bit Ubuntu install to a bare metal 32bit install. Moving munin consists of moving the munin.conf file and the /var/lib/munin directory. After moving it I noticed graphs weren’t updating as expected and came across this in the munin-update.log:

This RRD was created on other architecture

Not surprising that RRD files are platform specific. Came across this blog post that has a couple of shell scripts to handle this. Since the blog post is in German I will summarize what they do:

1. dump.sh:

This does a find in your /var/lib/munin directory looking for all “.rrd” files. It takes each file and uses the rrdtool “dump” option to dump the information into a corresponding xml file.

2. restore.sh

This takes each xml file and creates a new rrd file from it. Before creating the new rrd file it moves the original rrd file to a backup file with an extension of “.bak”.

When I did this I copied the munin directory to a 64bit host to run dump.sh, since the rrd files were originally created on a 64bit host. I then copied the directory back over to the 32bit machine to run the restore.
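
Before copying the directory back, it is cheap to confirm that every rrd file actually got an xml dump next to it. A small sketch, with missing_dumps as a made-up helper name:

```shell
#!/bin/bash
# Hypothetical helper: print every .rrd file under a directory that has no
# .xml dump beside it. No output means every file was dumped.
missing_dumps() {
    local dir="$1"
    find "${dir}" -name '*.rrd' -print0 | while IFS= read -r -d '' f; do
        [ -e "${f%.rrd}.xml" ] || printf '%s\n' "${f}"
    done
}

# Example:
# missing_dumps /var/lib/munin
```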

You can find the scripts on Frank Helmschrott’s blog posting. I will include a slightly modified version here:

dump.sh:

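#!/bin/bash
# Dump every .rrd file under MUNIN_DIR to an .xml file next to it.
# Run this on the architecture that created the rrd files.
MUNIN_DIR="/var/lib/munin"

# -print0 with read -d '' keeps file names with spaces in one piece
find "${MUNIN_DIR}" -name '*.rrd' -print0 | while IFS= read -r -d '' f; do
        f_xml="${f%.rrd}.xml"
        rrdtool dump "${f}" > "${f_xml}"
        chown root:0 "${f_xml}"
done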

restore.sh:

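#!/bin/bash
# Rebuild each .rrd file from its .xml dump. Run this on the new architecture.
MUNIN_DIR="/home/kenc/munin"

find "${MUNIN_DIR}" -name '*.xml' -print0 | while IFS= read -r -d '' f; do
        f_rrd="${f%.xml}.rrd"
        # Keep the original as .bak so rrdtool restore has a free file name to write to
        mv -f "${f_rrd}" "${f_rrd}.bak"
        chown root:0 "${f_rrd}.bak"
        rrdtool restore "${f}" "${f_rrd}"
        chown munin:munin "${f_rrd}"
done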
Posted in /geek, /sysadmin, /unix, /work | 2 Comments

Evolution MAPI Plugin on Ubuntu 9.04

Recently upgraded my laptop to the pre-release Ubuntu 9.04 (Jaunty Jackalope). One of the things I have been looking forward to is the Evolution MAPI plugin, which is a result of work done by the Openchange project to implement MAPI stacks on both the client and server side. Evolution has been able to use Outlook Web Access as a back end for some time now, however this breaks when you are using Exchange 2007. So the MAPI plugin is currently the only hope for anyone who wants full email/calendar/GAL access in an open source mail client.

In Jaunty the evolution-mapi package is now available. After installing it, a MAPI exchange account type is available. I attempted to set up my account numerous times, providing my AD login, domain, password, exchange server FQDN etc… If I provided proper credentials it would seg fault, and if I provided improper creds it would just error. It’s already logged as a bug. However, if I put in the IP address of the exchange server instead of the FQDN, everything appears to work. A few people have reported this.

As of now I am able to read my exchange email in Evolution, I have yet to get GAL, Calendar or sending working. But eh progress is good.

Posted in /dev/random, /geek | 17 Comments

Uptime

Or lack thereof. Blog is back online.

Posted in /dev/random | Leave a comment