Archive for February, 2016

DHCP Cheat Sheet

03 Feb

This is for my own personal information. Cliff notes to myself, collecting various pieces of information I’ve grabbed about DHCP.

Client sends a DHCPDISCOVER message, sent as a broadcast packet.

DHCPOFFER message is a response to a DHCPDISCOVER and is sent by one or numerous DHCP servers.

DHCPREQUEST message is then sent by the client to the DHCP server whose offer it accepts. The message indicates the client is requesting that particular IP address for lease.

DHCPACK message is sent by the server; with it, the DHCP server assigns the IP address lease to that client.

So the pattern/order is (restating a lot of what I just mentioned above):

  • 1 = DHCP Discover message (DHCPDiscover).
  • 2 = DHCP Offer message (DHCPOffer).
  • 3 = DHCP Request message (DHCPRequest).
  • 4 = DHCP Decline message (DHCPDecline).
  • 5 = DHCP Acknowledgment message (DHCPAck).
  • 6 = DHCP Negative/No Acknowledgment message (DHCPNak).
  • 7 = DHCP Release message (DHCPRelease).
  • 8 = DHCP Informational message (DHCPInform).

Because the DHCPDISCOVER message is a broadcast message, and broadcasts only cross other segments when they are explicitly routed, you might have to configure a DHCP Relay Agent on the router interface so that all DHCPDISCOVER messages can be forwarded to your DHCP server. Alternatively, you can configure the router to forward DHCP and BOOTP messages. In a routed network, you would need DHCP Relay Agents if you plan to implement only one DHCP server.

If your network has multiple segments, you have to do one of the following:

  • Place a DHCP server on each segment.
  • Place a DHCP Relay Agent on each segment.
  • Configure your routers to forward broadcast messages.

DHCP Relay Agent (aka ip helper-address)

All Cisco routers that run Cisco IOS software include DHCP server and relay agent software. A DHCP relay agent is any host or IP router that forwards DHCP packets between clients and servers. This module describes the concepts and tasks needed to configure the Cisco IOS DHCP relay agent.

  • The Cisco IOS DHCP server and relay agent are enabled by default. You can verify whether they have been disabled by checking your configuration file. If they have been disabled, the “no service dhcp” command will appear in the configuration file. Use the “service dhcp” command to reenable the functionality if necessary.
  • The Cisco IOS DHCP relay agent will be enabled on an interface only when the ip helper-address command is configured. This command enables the DHCP broadcast to be forwarded to the configured DHCP server.
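Putting the two bullets above together, a minimal relay-agent sketch might look like this (the interface name and addresses are made up for illustration; here the DHCP server lives on another segment at 10.1.1.10):

```
! Hypothetical example - clients on Gi0/1 get their broadcasts relayed to 10.1.1.10
service dhcp
interface GigabitEthernet0/1
 ip address 192.168.20.1 255.255.255.0
 ip helper-address 10.1.1.10
```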

…relay agents receive DHCP messages and then generate a new DHCP message to send out on another interface

The relay agent sets the gateway IP address (giaddr field of the DHCP packet) and, if configured, adds the relay agent information option (option 82) in the packet and forwards it to the DHCP server. The reply from the server is forwarded back to the client after removing option 82.

The Cisco IOS XE DHCP relay agent supports the use of unnumbered interfaces. An unnumbered interface can “borrow” the IP address of another interface already configured on the router, which conserves network and address space. For DHCP clients connected through the unnumbered interfaces, the DHCP relay agent automatically adds a static host route once the DHCP client obtains an address, specifying the unnumbered interface as the outbound interface. The route is automatically removed once the lease time expires or when the client releases the address.

  • The address argument can be a specific DHCP server address, or it can be the network address if other DHCP servers are on the destination network segment. Using the network address enables other servers to respond to DHCP requests (such as DHCP failover).

The DHCP Relay Agent (dhcrelay) allows for the relay of DHCP and BOOTP requests from a subnet with no DHCP server on it to one or more DHCP servers on other subnets.

When a DHCP client requests information, the DHCP Relay Agent forwards the request to the list of DHCP servers specified when the DHCP Relay Agent is started. When a DHCP server returns a reply, the reply is broadcast or unicast on the network that sent the original request.


DHCP Failover

The failover peer section that identifies the primary and secondary servers can be any string (such as “dhcp-failover”). The example identifies the two DHCP servers by address, but you can use DNS names as well.

Ports – In the past couple years, TCP ports 647 (primary) and 847 (peer) have emerged as the standard bindings for DHCP failover. It’s worth noting that as recently as 2005, the dhcpd.conf(5) man page used ports 519 and 520 in its failover example, but 647 and 847 look like good choices as of 2008 and beyond. 
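As a sketch, a minimal ISC dhcpd.conf failover declaration along those lines might look like this (the peer name, addresses and timer values are all examples, not a tested production config):

```
failover peer "dhcp-failover" {
  primary;                     # the partner server declares "secondary;" instead
  address 10.0.0.1;            # this server
  port 647;
  peer address 10.0.0.2;       # the partner server
  peer port 847;
  max-response-delay 60;
  max-unacked-updates 10;
  mclt 3600;                   # primary only
  split 128;                   # primary only
  load balance max seconds 3;
}
```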


Generic Informational:

Unicast – communication where a piece of information is sent from one point to another point. In this case there is just one sender, and one receiver. Unicast transmission, in which a packet is sent from a single source to a specified destination, is still the predominant form of transmission on LANs and within the Internet.

Broadcast – communication where a piece of information is sent from one point to all other points. In this case there is just one sender, but the information is sent to all connected receivers. Network layer protocols (such as IPv4) also support a form of broadcast that allows the same packet to be sent to every system in a logical network (in IPv4 this consists of the IP network ID and an all 1’s host number).

Multicast – communication where a piece of information is sent from one or more points to a set of other points. In this case there may be one or more senders, and the information is distributed to a set of receivers (there may be no receivers, or any other number of receivers).

Jenkins and DevOps – Installing/configuring Jenkins for use in the CI cycle

03 Feb

Jenkins is the oldest and most famous CI tool. It’s free and open source, and it has a huge community that has contributed thousands of plugins, so just about any external tool you can name, Jenkins can integrate with seamlessly. That’s why people use Jenkins in their environments and trust it with production as well.  (Fun fact: before Chef and Puppet, people used to write scripts and automate them using Jenkins)

Jenkins is actually a framework. It comes with almost no features out of the box; instead it presents its features in the form of plugins.  Jenkins is ‘generous’ enough to ship with a few common plugins pre-installed though (cvs, svn, notifications, etc.). Any feature you see in Jenkins at this very early stage is there because of those default plugins; remove one and that feature goes away.

Very basic installation:

  1. Download the jenkins.war file from
  2. You can then start jenkins by executing “java -jar jenkins.war”
  3. You could also put the downloaded jenkins.war file into webapps directory of your servlet container (aka Java Application Server; such as Tomcat, Jetty, JBoss, Glassfish) and then start tomcat. When you start tomcat it will deploy the jenkins war file.

The above is very simplified though, that essentially grabs the core jenkins framework and runs it without any plugins.

This is the better way to install Jenkins:

  1. Ubuntu:
  2. RedHat/CentOS:

If you install Jenkins via package (apt-get/aptitude or rpm/yum) then you’ll actually get a service (sysvinit) to start jenkins and have it run in the background  instead of having to run the jar file manually.  Jenkins lives in /var/lib/jenkins/ and will run as the ‘jenkins’ user.

Jenkins via package by default includes Jetty (a servlet container / java application server ) which listens on port 8080. So go to http://x.x.x.x:8080/

  1. Once Jenkins is running, the first step will be to configure JDK and build tools (Maven/Ant). Go to “Manage Jenkins” in the WebUI and then go to “Configure System”.  Towards the middle is an option to setup/configure JDK. By default it will pickup the one you already have installed on the server. You can use Jenkins to automatically add/install different versions of JDK.
  2. Configuring Maven/Ant if you’re using Java.  To configure Maven/Ant just provide the value for MAVEN_HOME or ANT_HOME path variables. If you haven’t installed Maven/Ant, choose the “Install automatically” checkbox.  You can select specific versions of Maven or Ant, in case your software is only certified for a specific version.
    1. Maven is good for Java. But not a good fit for other languages.  Look towards Ant, make, cmake, etc.
  3. Setting up Git.  Go to the plugin area of Jenkins. Go to “Manage Jenkins” and then to “Manage Plugins”. Go to the “Available” tab and find the Git plugin (use Filter to search for git).  You’ll find a few git plugins but the most common one is “GIT plugin”.  Click the checkbox “install without restart” and let it install. This will install the plugin in /var/lib/jenkins/plugins.  All plugins get an extension of .jpi
    1. Installing plugins does not install the base utility for that plugin to work. This means that just installing the “git plugin” plugin might not actually work if your server doesn’t have the “git” command.  If your server doesn’t have the git command, you need to get it now (‘apt-get install git-core’ or ‘yum install git’). The ‘git plugin’ simply acts as an interface (API interface) to the existing installation.
  4. Now we’re ready to build our projects using Jenkins. To get ourselves a Maven-based Java project, we are going to fork game-of-life, which is a demo application for Jenkins.


Integrating and executing jobs is not the only thing Jenkins can do for you. It can e-mail you, message you in your chat client, send you an SMS, or integrate with 3rd-party tools like PagerDuty. You can also have Jenkins run jobs in remote locations, firing commands remotely on different machines. At the same time, you can control what goes where, schedule jobs cron-style, etc.  It can also help in customizing job scripts according to environment parameters; you can even change parameters at runtime, so before starting a job you can pass it a custom parameter or parameters.


What is a job?  A job is a unit of automation that you want to hand to Jenkins. It could be a script, or you could ask Jenkins to do a few things for you in an automated fashion (and that is called a job).  In Jenkins click on “New Item” which will walk you through creating the new job.

  • Job Types
    • “Freestyle project” – If you want all the parameters to be available to you at once, you should go with the “Freestyle project”. It allows you to have access to each and every option out there. Freedom.
    • “Maven project” – would specialize it for the Maven build itself, you’d only see the Maven options in there.  The “Freestyle project” can also build Maven projects.
    • “External Job” – would be good for instance if you had 2 different Jenkins installations. You would be triggering a job on the second installation of Jenkins using the first installation of Jenkins itself.
    • “Multi-configuration project” – if you need to build a job in multiple configurations at once. Example: you’re developing security software which has to be tested on multiple operating systems, so the same code should be running on multiple boxes depending on how it is compiled. If it was coded in “C”, the compiled binaries aren’t portable (compilers target a specific machine architecture; you can’t move the binary to a system with a different architecture). You could ask Jenkins: “I want 4 different remote machines to be present, all 4 with different operating systems; handle it and execute ‘make’ (or ‘cmake’) on those individual machines so that the code gets compiled for the flavor and architecture of that build system.”  That’s how you’d build the code on 32-bit and 64-bit infrastructures.

Everything lives in /var/lib/jenkins/  so when you’re executing a job, look in /var/lib/jenkins/jobs/   and within there you’ll see a folder named after your job-name, such as “game-of-life” if that was the name of the project. Within /var/lib/jenkins/jobs/game-of-life/ there’s a folder called workspace/ that contains the scratch data/workspace that it needs to do the build.

Setting up Jenkins and securing it

Go to “Manage Jenkins” and then to “Configure Global Security”

  • Check “Enable security”
  • If you have LDAP credentials, choose LDAP
  • If you need something simple, choose “Jenkins’ own user database” and decide if you want users to be able to sign up or not. If you don’t allow them to sign up, you’ll need to create an account for each user and give them their login/password.  It actually stores the credentials in an XML file.
    • For the initial setup/securing I recommend you leave the ‘allow users to be able to sign up’ option enabled. Because you will need to create an account for yourself (or an admin account).
  • In the Authorization section choose what you want your users to be able to do. The “Matrix-based security” section allows you to give specific users specific access based on which part of Jenkins you want them to be able to access.
  • After you click Save, you will be kicked out and brought to the login screen. However, you do not yet have a login. Click on the Sign Up option and create an account for yourself.  As soon as you sign up, you will be logged in.
    • You can now choose to disable the ‘allow users to be able to sign up’ option if you want direct control (via the xml file) of every user account
    • If you did not enable the ‘allow users to be able to sign up’ option and you’ve just now been kicked out of Jenkins and can no longer log in… go to /var/lib/jenkins/ and edit the config.xml and you’ll find “<useSecurity>true</useSecurity>” towards the top. Change true to false, save the file and restart jenkins. Then you can get into the web interface without authentication.
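That last recovery step can be scripted. A small sketch (the config.xml path assumes the package install described earlier; you’d still restart Jenkins yourself afterwards, and re-secure once you’re back in):

```shell
# disable_jenkins_security: flip <useSecurity> to false in a Jenkins config.xml
# so you can get back into the web interface without authentication
disable_jenkins_security() {
  sed -i 's|<useSecurity>true</useSecurity>|<useSecurity>false</useSecurity>|' "$1"
}

# Typical use on a package install, followed by: sudo service jenkins restart
# disable_jenkins_security /var/lib/jenkins/config.xml
```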


DevOps and the Continuous Integration (CI) cycle

03 Feb

Jenkins is a Continuous Integration tool.

  • DevOps – bridge a gap between AGILE development and non-AGILE operations.
  • DevOps is really about setting up a good culture (not about fancy tools). If you are able to set up a good culture in your company, the rest follows: your communications should be horizontal and fast enough that you are not waiting around twiddling your thumbs; you should have a sense of ownership of what you’ve done, try to take initiative, etc.
  • 3 stages.
    • Continuous Integration (CI)
      • CI is a development practice that requires developers to integrate code into a shared repository several times a day.
      • Each check-in is then verified by an automated build, allowing teams to detect problems early and deliver the software early
    • Continuous Deployment
    • Continuous Delivery


We need a server to automate everything: a place where we can configure something, add some scripts, have our code checked out, and have someone take care of building and integrating stuff. Jenkins is one of the tools which can help us do just that. If you’re interested in Jenkins, see my other post about Jenkins installation and configuration.

CI helps us by adding checks at various levels while moving forward in the build-and-release cycle: standards checks, syntax checks, automated testing (feedback loops are created at every level/stage). At every stage there is a notification channel; if that stage or a particular test fails, the developers are notified. In Jenkins this is done by setting up notifications, which are included out of the box.

Benefits of CI

  • Continuous Integration brings the following benefits to software development (it doesn’t get rid of bugs, but it does make them dramatically easier to find and remove)
    • Catch issues fast and nip them in the bud
    • Everyone can see what’s happening
    • Automate the build
    • Keep the build fast
    • Stop waiting to find out if your code’s going to work
    • Continuous Integration leads to Continuous Deployment allowing you to deliver software more rapidly

How to do a disk benchmark using sysbench

03 Feb

I’ll clean this up later but I wanted to throw this down here before I forget. I’m constantly benchmarking various mounts (local Magnetic HDD, local SSD, SW RAID vs HW RAID, RAID5 vs RAID6, ZFS RAIDZ1 vs ZFS RAIDZ2, NFS Mounts, ISCSI mounts).

I used this as my starting guide:  But it was a bit dated, I used the commands as a guideline but I modified the numbers/values naturally.

  1. As a general rule of thumb I pick about 2gb more than the amount of memory the server has, just to be 100% sure that memory is not being used instead of the disk. So on an 8gb system you’d want to create 10gb of ‘test files’ for sysbench to work with. Run this command to prepare the system by creating the test files in the exact structure that sysbench requires
    1. sysbench --test=fileio --file-total-size=10G prepare
  2. This will run a 5 minute (300 second) test doing random reads and writes. It will then spit out the total read, total written, total transferred and, most importantly, how many MB or GB a second it was able to get. That’s the number I use to compare servers or mediums against each other.
    1. sysbench --test=fileio --file-total-size=10G --file-test-mode=rndrw \
       --init-rng=on --max-time=300 --max-requests=0 run
  3. This will run a 3 minute (180 second) test and do sequential reads only. This is usually the fastest a disk or link can perform, it’s very easy on the disk drive and so you may find the bottleneck is the controller or the network connection.
    1.  sysbench --test=fileio --file-total-size=10G --file-test-mode=seqrd \
       --init-rng=on --max-time=180 --max-requests=0 run
  4. When all is said and done, run this: (I’m not sure why I couldn’t just delete all the test* files but they say run cleanup, so whatever)
    1. sysbench --test=fileio --file-total-size=10G cleanup
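The “2gb over RAM” rule of thumb from step 1 is easy to compute rather than eyeball. A sketch (the function name is my own; it rounds MemTotal up to a whole GB and adds 2):

```shell
# size_for_kb: given MemTotal in kB, print the --file-total-size to use, in GB
# (RAM rounded up to a whole GB, plus 2 GB so the page cache can't mask the disk)
size_for_kb() {
  awk -v kb="$1" 'BEGIN { print int((kb + 1048575) / 1048576) + 2 }'
}

# Real usage would feed it /proc/meminfo:
#   size_for_kb "$(awk '/MemTotal/ {print $2}' /proc/meminfo)"
size_for_kb 8388608   # an 8 GB box -> 10
```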

These are the valid options for file-test-mode (these are the various file system tests you can perform)

--file-test-mode=seqwr (sequential write test)
--file-test-mode=seqrd (sequential read test)
--file-test-mode=rndwr (random write test)
--file-test-mode=rndrd (random read test)
--file-test-mode=rndrw (random read and write test)

You can play with the time to run (60 seconds, 180 seconds, 300 seconds, etc.) and the file test mode and record the various results.
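Since the figure in parentheses is the one I actually compare, here’s a tiny helper to pull it out of a run’s summary line (a sketch; the pattern matches the output format of this sysbench version, as pasted below):

```shell
# throughput: extract the "(NN.NNMb/sec)" figure from a sysbench fileio summary
# line on stdin, so runs can be compared side by side
throughput() {
  grep -o '([0-9.]*[KMG]b/sec)' | tr -d '()'
}

echo 'Read 4.6106Gb  Written 3.0737Gb  Total transferred 7.6843Gb (26.228Mb/sec)' \
  | throughput   # -> 26.228Mb/sec
```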

For my own personal future reference:
POOPEYE = Dell PowerEdge R610 with Dual Xeon Processors, 32gb of memory (and no, I did not name this server)
gitchef = KVM VM running on poopeye with 4 cores,  6gb of memory

This is POOPEYE benchmarking in /root/  which is his RAID5 internal hard drives:  (2 runs)
Read 4.6106Gb  Written 3.0737Gb  Total transferred 7.6843Gb (26.228Mb/sec)
Read 4.5959Gb  Written 3.064Gb  Total transferred 7.6599Gb (26.099Mb/sec)

This is POOPEYE benchmarking in /mnt/pve/nfs-naspool/disktesting which is NFS to the main ZFS pool on Nas4Free:  (2 runs)
Read 365.62Mb  Written 243.75Mb  Total transferred 609.38Mb (2.0311Mb/sec)
Read 334.69Mb  Written 223.12Mb  Total transferred 557.81Mb (1.8593Mb/sec)

This is gitchef running on POOPEYE as a KVM VM. Right now his QCOW2 file is on local storage: (2 runs)
Read 2.2507Gb  Written 1.5004Gb  Total transferred 3.7511Gb (12.804Mb/sec)
Read 1.3498Gb  Written 921.44Mb  Total transferred 2.2496Gb (7.6785Mb/sec)

I was troubleshooting ZFS + NFS performance issues and so I got these results after each change: (Random Read/Write test)

# zpool Dataset with SYNC=DISABLED (across nfs)
Read 4.6976Gb  Written 3.1317Gb  Total transferred 7.8293Gb (26.724Mb/sec)
Read 4.6106Gb  Written 3.0737Gb  Total transferred 7.6843Gb (26.229Mb/sec)

# Direct to the main pool (no dataset) with SYNC=STANDARD (across nfs)
Read 703.12Mb  Written 468.75Mb  Total transferred 1.1444Gb (3.9062Mb/sec)

# Direct to the main pool with SYNC=STANDARD but ZIL going to 20gb SSD
Read 796.86Mb  Written 531.23Mb  Total transferred 1.297Gb (4.4269Mb/sec)

# Direct to the main pool with SYNC=STANDARD but cache to 20gb SSD Partition (SSD wasn’t that fast, it was a cheap SSD)
Read 545Mb  Written 363.33Mb  Total transferred 908.33Mb (3.0277Mb/sec)
Read 1.6278Gb  Written 1.0852Gb  Total transferred 2.713Gb (9.2604Mb/sec)

# Dataset with SYNC=disabled but with fast compression
Read 7.4808Gb  Written 4.9872Gb  Total transferred 12.468Gb (42.557Mb/sec)
Read 7.7948Gb  Written 5.1965Gb  Total transferred 12.991Gb (44.344Mb/sec)

So on a personal note, ZFS with a dataset having SYNC=disabled and compression=lz4 on results in wicked fast speeds, faster than local RAID5, heh.


AWS EBS Magnetic benchmarks: (for personal reference)

Random Read/Write Magnetic: 2.4523Mb/sec
Random Read/Write: Cold Storage (SC1): 659.42Kb/sec
Random Read/Write: Throughput Optimized (ST1): 2.25Mb/sec

Sequential Read Magnetic: 62.852Mb/sec
Random Read Magnetic: 2.564Mb/sec
Sequential Read SC1:  40.403Mb/sec
Random Read SC1: 887.18Kb/sec
Sequential Read ST1: 62.059Mb/sec
Random Read ST1: 2.6634Mb/sec

Sequential Write Magnetic: 34.457Mb/sec
Random Write Magnetic: 4.4009Mb/sec
Sequential Write SC1: 45.716Mb/sec
Random Write SC1: 524.01Kb/sec
Sequential Write ST1: 60.554Mb/sec
Random Write ST1: 2.1006Mb/sec


Oh, note to self: I just found this link:  it looks interesting. I/you/we need to go over it later and integrate some of it into this page (while giving full credit to the author of course).


MySQL Optimization – System Administration side

03 Feb

How to tweak MySQL from the System Administrators side of things. This doesn’t deal with specific queries or DBA stuff, this is more for Linux admins on how to get a lean mean MySQL machine.

  • What is MySQL Optimization?
    • These optimizations deal with changing the default variables. You would put these new and tweaked variables (aka settings) into /etc/my.cnf, under the [mysqld] section
    • Determined by looking at MySQL logs, monitoring and using diagnostic tools
  • Why Optimize?
    • Seriously? Spend 1 or 2 days optimizing every once in a while and the results should be great; or at the very least you’ll be more confident that everything is as optimized as it can get (ever have that nagging feeling that the application just isn’t being queried properly…)
    • Optimizing improves server performance
      • Get more out of each server. Don’t just throw more servers at the problem or upgrade existing servers to mitigate the problem
    • Saves money on hardware upgrades
  • The defaults are very basic. This is because each environment and use case is unique. Think of these defaults as the MySQL “placeholder” page. You should change them to fit your needs
  • Get the current variables/settings of MySQL
    • From MySQL CLI: “show variables;”
    • From bash prompt: “mysqladmin variables” or “mysqladmin var”

Common Variables that really should be in every my.cnf file

M = megabytes ; K = kilobytes

  • Cache
    • thread_cache_size (default 0)
      • The number of threads the server should cache for re-use. It’s not even enabled by default but it should be.
      • A good starting value for a basic MySQL server is 4
    • table_open_cache (default 64)
      • This is the number of open tables for all threads
    • query_cache_limit (default 1M)
      • Limits how big a query result can be and still go into the query cache. So if something were to go into the query cache, it can’t be larger than 1M by default; if it’s larger than 1M it won’t be cached.
    • query_cache_size (default 0)
      • This is the amount of memory for caching queries
        • When cached it doesn’t have to keep getting pulled from the disk (speeds up results)
      • A good starting value is 8M
    • query_cache_type (default 1 = on)
      • 0 = off; 1 = on; 2 = on demand: the cache is used only for queries that explicitly request it (with SQL_CACHE); queries that don’t ask for it aren’t cached
      • Even though it’s on by default, the default query_cache_size is 0. So there’s no real query cache until you enable/set a size
  • Buffers
    • key_buffer_size (default 8M)
      • This is the buffer size for index blocks for MyISAM. It doesn’t include Innodb (so it won’t affect Innodb)
      • A good starting value, if you use MyISAM, is 28M (This is a very important variable to set if using MyISAM)
    • read_buffer_size (default 128k)
      • This is the per-thread buffer size for sequential table scans
    • join_buffer_size (default 128k)
      • The minimum size of the buffer that is used for plain index scans, range index scans, and joins that do not use indexes and thus perform full table scans. Normally, the best way to get fast joins is to add indexes. If you can’t use indexes for some reason, then increase the value of join_buffer_size to get faster full joins. One join buffer is allocated for each full join between two tables. For a complex join between several tables for which indexes are not used, multiple join buffers might be necessary
  • Others
    • connect_timeout (MySQL default is 10)
      • Number of seconds MySQL will wait for a packet before it sends a bad handshake.
      • The reason you may want to reduce it: you have a very active server with very solid connectivity between the application and the server, and you just don’t want a lot of hosts connecting to you that are taking too long. So in order to keep those connections moving, you reduce it.
      • The reason you may want to increase it is maybe you have a slow server, or the network connectivity isn’t that great or prone to issues (like having your traffic deprioritized) and because its latent or the connectivity is slow you don’t want a lot of people getting bad pages on your machine because your connect_timeout is set too low.
    • max_connect_errors (default 10)
      • This is the maximum number of interrupted connections from a host. The default being 10, in order for a host to reach that they’d have to send a packet and then interrupt it before it’s completed, 10 times in a row, before they are permanently blocked.
        • Once they are permanently blocked they won’t be able to connect to the server again (until you run “flush hosts;” from the MySQL command line)
        • If you have a bad network connection or periodically bad and people continually have problems connecting, you may want to increase this or they’ll become permanently banned.
        • If you were getting a MySQL based attack, and they’re purposely interrupting the MySQL connections in order to try and take the server down, you may want to lower this number (since they’ll be using a lot of various IP’s, connecting frequently).
    • max_allowed_packet (default 1M)
      • This is the maximum size of a single packet the server will accept.  The packet message buffer is initialized to net_buffer_length bytes, but can grow up to max_allowed_packet bytes when needed. This value is small by default, to catch large (possibly incorrect) packets.
      • You must increase this value if you are using large BLOB columns or long strings. It should be as big as the largest BLOB you want to use. The protocol limit for max_allowed_packet is 1GB. The value should be a multiple of 1024; nonmultiples are rounded down to the nearest multiple.
      • Developers will often ask you to increase this value, especially if they’re storing a lot of data in the database
      • A good starting value may be 10M
    • tmp_table_size (system-specific default)
      • The maximum size of internal in-memory temporary tables. This variable does not apply to user-created MEMORY tables.
      • The actual limit is determined as the minimum of tmp_table_size and max_heap_table_size. If an in-memory temporary table exceeds the limit, MySQL automatically converts it to an on-disk MyISAM table.
      • Increase the value of tmp_table_size (and max_heap_table_size if necessary) if you do many advanced GROUP BY queries and you have lots of memory.
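Pulling the starting values suggested above together into a my.cnf sketch (only the variables with recommended starting points; everything left at its default is omitted, and these are starting points to tune against your own workload, not gospel):

```
[mysqld]
# Cache
thread_cache_size  = 4
query_cache_limit  = 1M
query_cache_size   = 8M
query_cache_type   = 1
# Buffers (key_buffer_size only matters for MyISAM)
key_buffer_size    = 28M
# Others
max_allowed_packet = 10M
```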


Unneeded Variables

These are unneeded because they’re already the default, or they don’t exist on your system.

  • thread_concurrency
    • It doesn’t exist on Linux (or FreeBSD). Only on Solaris.
    • A lot of guides mention this variable. If you see it, remove it. MySQL 5.5 and 5.6 won’t even start up if you have this variable defined in your config.
  • skip-locking
    • You can have it in your file if you want but there’s no reason to. Default skipped since MySQL 4.0
    • It’s actually called skip-external-locking
    • You’re not hurting anything by having it, but the goal of optimizing is only to have the variables/changes you need. If you don’t need it, it shouldn’t be in there.



Logging

  • Why log?
    • Only way to track down intensive databases or database users. It’s the only way you’re going to know historically over the last day/week/month which databases have been running processes and are showing up in the log.
    • Without it, have to rely on monitoring
  • Types of logging
    • Error
      • It logs MySQL errors
      • Default is enabled
    • General
      • All databases process activity
      • Default is disabled
      • This is very DiskIO intensive and very tedious to parse through. I wouldn’t do this unless you’re trying to track down a problem or it’s dev/qa
    • Slow Query
      • Slow MySQL queries (10+ seconds to execute by default, this can be changed)
      • Default is disabled
      • If you’re doing query optimization, this is very important to enable (mainly in Dev/QA environments; I don’t recommend doing this in production as it can slow things down, but if you must, do it for a brief period of time and then turn it back off)
      • Databases that frequently perform slow queries are going to cause terrible latency and terrible slowdown for MySQL on the server, and increase load
    • In MySQL 5.5 and 5.6 to enable General or Slow Query logging
      • In /etc/my.cnf add “general_log” to the [mysqld] section
        • Remember, this is Disk IO intensive and these logs take a lot of space, so be careful when using this.
      • In /etc/my.cnf add “slow_query_log” to the [mysqld] section
        • This isn’t too Disk IO intensive so it’s not as bad as general logs. If you feel it’s still generating too much activity, you could change the number of seconds a query takes (10 by default) before it gets added to the slow_query log. Then as you fix those queries and the log starts to slow down, you can decrease the number a little more and then address those slow queries, etc. Essentially start at the top, fix it and then scale down; rinse and repeat.
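In my.cnf form, that looks roughly like this (5.5/5.6 variable names; the log file paths here are just examples):

```
[mysqld]
# Slow query log - queries taking over long_query_time seconds get recorded
slow_query_log      = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time     = 10

# General log - very Disk IO intensive, dev/QA only
# general_log      = 1
# general_log_file = /var/log/mysql/general.log
```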


Monitoring Tools

  • top – Almost every linux server has this
    • this is a basic overall real-time system monitor, not MySQL specific but is present on any decent Linux server.
  • atop – Most linux distros have this in their package/repository list for easy installation
    • I prefer this over top because it shows you a lot more details about the system, and it can even run as a background daemon grabbing snapshots of your system every few minutes so you can monitor processes for Disk/Memory/etc. over time. atop rocks.
  • mtop –  (may have to install from source)
    • mtop (MySQL top) monitors a MySQL server showing the queries which are taking the most amount of time to complete. Features include ‘zooming’ in on a process to show the complete query, ‘explaining’ the query optimizer information for a query and ‘killing’ queries. In addition, server performance statistics, configuration information, and tuning tips are provided.
    • Kind of an oldie but goodie. I’ve only used it on MySQL 5.1 and 5.5, not sure about 5.6
  • innotop –  (may have to install from source)
    • innotop is a ‘top’ clone for MySQL with many features and flexibility. It’s completely customizable; it even has a plugin interface.
    • innotop monitors MySQL servers. Each of its modes shows you a different aspect of what’s happening in the server. For example, there’s a mode for monitoring replication, one for queries, and one for transactions. innotop refreshes its data periodically, so you see an updating view.
    • monitors many servers at once and can aggregate across them
  • MySQL Monitoring Tool
    • mysqladmin processlist / mysqladmin proc  (or most likely ‘mysqladmin -u root -p processlist’)
    • This is MySQL’s own process-listing tool. It shows you details about what’s going on within the mysqld process: Id/User/Host/db/Command/Time/State/Info
  • check_mysql_health – Nagios plugin for monitoring various MySQL health aspects
    • MySQL_connection-time – ./check_mysql_health --socket /var/run/mysqld/mysqld.sock --user root --password rootpassword --database somedatabase --mode connection-time
    • MySQL_Uptime – ./check_mysql_health --socket /var/run/mysqld/mysqld.sock --user root --password rootpassword --database somedatabase --mode uptime
    • MySQL_threads-connected – ./check_mysql_health --socket /var/run/mysqld/mysqld.sock --user root --password rootpassword --database somedatabase --warning 30 --critical 40 --mode threads-connected
    • MySQL_threadcache-hitrate – ./check_mysql_health --socket /var/run/mysqld/mysqld.sock --user root --password rootpassword --database somedatabase --warning 100 --critical 110 --mode threadcache-hitrate
    • You can do the same as above for entries like: MySQL_table-lock-contention, MySQL_index-usage, MySQL_tmp-disk-tables, MySQL_slow-queries, MySQL_long-running-procs, MySQL_sql, MySQL_open-files, MySQL_encode and MySQL_querycache-hitrate
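
The mysqladmin processlist output mentioned above is the same data you can get from inside the mysql client; FULL stops the Info column from being truncated:

```sql
SHOW FULL PROCESSLIST;

-- Kill a runaway query using the Id column from the output
-- (12345 is an illustrative thread Id)
KILL 12345;
```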


  • You’d find them in /usr/share/doc/mysql-server-5.5/examples/ under names like “my-huge.cnf.gz” “my-large.cnf.gz” “my-small.cnf” “my-innodb-heavy-4G.cnf.gz” and “my-medium.cnf.gz”  ;  the ones with the gz extension need to be decompressed first using gunzip
  • Don’t use them. They’re not very good templates. They were created a long time ago and have not been updated recently.
    • They even include the unneeded thread_concurrency variable I mentioned above, the one that will stop MySQL 5.5 and 5.6 from starting up
    • They change a lot of variables without you really understanding what each does
    • The performance improvement the templates provide over the defaults is very questionable
  • Just change one or two variables at a time, knowing what each does and seeing exactly how it affects database performance. It’s the best way to know your database/server inside and out, and to know what each and every setting does and how much it helped.
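
A sketch of that one-variable-at-a-time workflow. max_connections and innodb_buffer_pool_size here are just arbitrary example variables:

```sql
-- Check the current value before touching anything
SHOW GLOBAL VARIABLES LIKE 'max_connections';

-- Dynamic variables can be changed live for testing...
SET GLOBAL max_connections = 200;

-- ...but note that some variables (e.g. innodb_buffer_pool_size in
-- MySQL 5.5/5.6) are not dynamic and must be set in /etc/my.cnf
-- under [mysqld], followed by a mysqld restart.

-- Then watch how the server behaves before changing the next variable
SHOW GLOBAL STATUS LIKE 'Threads_connected';
```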



  • General Optimization
      • Checks MySQL activity since the last restart of MySQL (so it won’t help if you’ve recently restarted mysql; wait at least 24 hours)
      • Provides recommendations by comparing the activity with the current /etc/my.cnf settings
      • The output provides exact values for the recommended settings to add to /etc/my.cnf
        • You can generally do what it recommends, but you should look up each variable in the MySQL documentation and get a general idea of what it does, what its normal default is and what you’re changing it to. Just don’t blindly trust the script unless you’re in a big hurry and you really don’t care too much about the server or the data.
    • mysqlidxchk
      • mysqlidxchk (MySQL Index Checker) checks MySQL databases/tables for unused indexes. It reads in the slow or general log files
      • mysqlidxchk reports which indexes in the database schema are not used by the queries in the log files
      • If feasible, you should then go in and remove those unused indexes. Dropping indexes that aren’t needed speeds up the database, since every index has to be maintained on writes.
      • There are some things that do need to be indexed (that haven’t been; that you should index), but if it’s not needed, don’t index it.
  • Log parsing
    • mysqlsla
      • Uses the general or slow query logs to track the database users percentage-wise
      • Let’s say you enabled slow or general logging 2 days ago. If you run this against the logs, it will show the top 10 users and what percentage of the queries in the log each one is responsible for. Say the top user accounts for 30% of the slow query activity; there’s something seriously wrong with that database. You could move that user or database to another MySQL server and speed up every other database/user on that server.
  • Status reporting
    • mysqlreport
      • It interprets MySQL’s ‘SHOW STATUS’ values into a detailed report on how MySQL is running
      • It’s a really complicated tool; read over its guide in detail
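
mysqlreport is interpreting the same raw counters you can inspect by hand; a few examples:

```sql
-- Raw server counters behind the report
SHOW GLOBAL STATUS LIKE 'Questions';        -- total statements executed
SHOW GLOBAL STATUS LIKE 'Slow_queries';     -- queries over long_query_time
SHOW GLOBAL STATUS LIKE 'Select_full_join'; -- joins done without usable indexes
```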



  • mysqlslap – quick/common tool for generating quick load
    • mysqlslap --user=root --auto-generate-sql --concurrency=100 --number-of-queries=1000 --number-char-cols=10 --number-int-cols=10 --iterations=100 >> /tmp/mysqlslap.log
    • You can put the above in a cronjob or, better yet, a script that runs it in a loop. Keep increasing some of those numbers in the command until you get a nice consistent high load on the dev/test server. You could let it run for ~24 hours if you were using a script like that to work out better overall default settings in the my.cnf file for that particular server.
  • sysbench – Excellent utility for running an actual benchmark and providing you with the number of transactions per second value
    • To measure MySQL performance, we first create a test table in the sysbenchtest database with 1,000,000 rows of data
      • sysbench --test=oltp --oltp-table-size=1000000 --mysql-db=sysbenchtest --mysql-user=root --mysql-password=yourrootsqlpassword prepare
    • Then we run the actual benchmark
      • Read-Write – 16 threads
        • LOCAL: sysbench --db-driver=mysql --num-threads=16 --max-requests=100000 --test=oltp --oltp-table-size=1000000 --mysql-db=sysbenchtest --mysql-user=root --mysql-password=yourrootsqlpassword run
        • REMOTE: sysbench --db-driver=mysql --num-threads=16 --max-requests=100000 --test=oltp --oltp-table-size=1000000 --mysql-db=sysbenchtest --mysql-host=<mysql server hostname> --mysql-port=<mysqlport> --mysql-user=<mysqluserid for database> --mysql-password=<mysqluserid password> run
      • Read-Only – 8 threads (good for testing read replicas/slaves)
        • sysbench --test=oltp --oltp-table-size=1000000 --mysql-db=sysbenchtest --mysql-user=root --mysql-password=yourrootsqlpassword --max-time=60 --oltp-read-only=on --max-requests=0 --num-threads=8 run
    • The important number is the transactions per second value:  ie “transactions:  100001 (269.36 per sec.)”
    • Latency has a very big impact on transactions per second. The difference between a separate-but-local database server and one that was 20ms or 60ms away was fairly substantial, even when the remote server was beefier. I don’t recommend being more than 20-25ms away from your database server. If you have to be, you need some kind of local read-replica/slave copy for reads, or you need your application to do some heavy caching and minimize its use of the MySQL server. You could also use connection pools to help: have the application establish and keep alive a fixed number of active connections to the MySQL server and push all of its queries across that pool. This avoids constant TCP setup/teardown (SYN, SYN-ACK, ACK to open, then FIN/ACK exchanges to close, only to come around a split second later and do it all over again).

MySQL query optimization – DBA side

03 Feb

Note to self: Brain Dump. Needs to be organized better.

This relates to how the application issues queries to the MySQL server. This is more DBA/Application side, but here are some good general rules to follow. FYI, I took these from my own personal emails and notes that I’ve collected over time. Some of it is from an old abandoned Wiki I used to run. I’m sure my notes/emails/wiki/etc. were generated from some content on the net, but I can’t always remember where. I apologize if I’ve offended anyone by not citing a source; if you know the source of any of the sections below, feel free to comment and I’ll be happy to edit this post and cite it.

  • Use EXPLAIN keyword to see the execution plan for the query. 2 key elements to watch for in the execution plan are
    • Check the index usage (key column in the EXPLAIN output)
    • Check rows scanned for query
    • See the EXPLAIN section below this section, because EXPLAIN is a powerful and complex utility for analyzing a query.
  • If you’re retrieving a unique/single row, add “LIMIT 1” to the end anyway
    • You can also use LIMIT to improve aggregation functions like MIN or MAX
  • If your query uses ‘not equal to’ operator try to convert that to ‘equal to’ operator (Try to convert <> operator to = operator)
    • ‘equal to’ operator (=)  increases the chances of your index to be used for your query
  • Avoid using “SELECT *” unless you are retrieving all the columns of your table
    • “SELECT *” pulls every column, which keeps MySQL from answering the query from an index alone (a covering index) and usually means reading the full rows
    • Additionally, columns which your application doesn’t need waste your network bandwidth (it may seem like a little bit, but multiply this by thousands of queries per second and it adds up)
  • Split big DELETE, UPDATE or INSERT queries into multiple smaller queries. It improves the performance as well as give you better control on data
  • Use appropriate data types for columns. If you are only going to store integers, don’t create your column with data type varchar
    • Smaller columns are faster for performance so don’t use overly large columns
  • The MySQL query cache is case and space sensitive! That means if you are going to fetch the same result set multiple times, make sure to use the same case and spacing between words. This dramatically improves your query performance because MySQL can return the result set straight from the query cache instead of re-executing the query.
  • If you create index on the column in which you use the WHERE clause it’ll improve performance.
    • However, this is just general/generic advice. Don’t go nuts creating too many indexes; that’s also dangerous. Create indexes on the columns you’ll be using in WHERE clauses the most, not on every single one. Use common sense.
  • Create Index columns on the columns in which you have used in JOIN
    • This is a general guideline. Again, don’t create an index on every column you used in your join. Use your best judgement, and test your queries before you create too many indexes.
  • Use ‘UNION ALL’ instead of ‘UNION’ if duplicate data is permissible in your result set. ‘UNION ALL’ gives better performance than ‘UNION’ because it does not have to do a distinct operation on your query
  • If you are going to use just INNER JOIN the table order used in your JOIN clause does not matter
  • If the columns used in the ‘ORDER BY’ clause are indexed, it helps the performance of the query (sorting is one of the more expensive operations in MySQL)
  • If your application implements pagination, it is a good idea that you use the LIMIT clause to implement paging in your database itself. Not only does this improve performance but it also reduces database traffic between your database and client application (again, this adds up over thousands of queries).
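
A few of the tips above in one place; the table and column names are made up for illustration:

```sql
-- LIMIT 1 for a unique-row lookup; MySQL stops scanning after the first hit
SELECT id, email FROM users WHERE username = 'deon' LIMIT 1;

-- UNION ALL skips the implicit DISTINCT pass that UNION performs
SELECT city FROM customers_2015
UNION ALL
SELECT city FROM customers_2016;

-- LIMIT-based pagination done in the database, not the application
SELECT id, title FROM articles ORDER BY created_at DESC LIMIT 20 OFFSET 40;
```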

There can be multiple ways to write a query that produces the same result set. It is possible one method works well for one query while another method works better for another.

Always test your query with near-real data and schema on your development server before deploying to your production server. This is very important because what speeds up queries depends largely on the type/size/etc. of the data being queried. You could find one of the above helps tremendously on your test data, then roll it into production and find it doesn’t help much at all (I doubt it’d make things worse; but maybe…).  It’ll look bad if you say “We did this optimization in our test lab and it improved things by 25%” and you roll it out in production and only get a 2% increase. You could have used that deployment window to do something else that would have resulted in better performance.

If your production database is always changing / always updated, you could try creating a snapshot of the database and restoring it into a QA/Test environment, or you could try restoring the latest backup of the production database (you DO have periodic backups, don’t you…)



The number one utility for analyzing a query. EXPLAIN shows how MySQL thinks it’s going to execute a query. This is not necessarily the true execution plan, though in many cases it is. There are cases where MySQL thinks it will execute one way, but once it gets started it finds the values it’s examining are different than it thought, so it may re-evaluate and re-plan as it goes. Sometimes you can even see in advance that MySQL knows it’s going to have to re-plan the query for every value it retrieves from the table. So take the results/information EXPLAIN presents to you with a grain of salt; it’s ‘generally accurate’ but not always.

If the MySQL query is doing a JOIN from one table into another and MySQL doesn’t really know which index is going to be best, if it thinks that different indexes might be best depending on the values it retrieves from the first table, it might actually switch around and re-evaluate (the index to use) every time it looks for rows in the 2nd table. EXPLAIN may or may not be able to accurately display this (because it hasn’t run the actual query, so it won’t know the actual values)

EXPLAIN only works for SELECT in MySQL 5.5 and previous
EXPLAIN will additionally work with UPDATE, DELETE, INSERT and REPLACE  in MySQL 5.6

  • Generating EXPLAIN
    • MySQL actually executes the query (with a special bit that tells it not to execute, explain only)
      • But at each JOIN, instead of executing, it fills the EXPLAIN result set
    • What is a JOIN?
      • Everything is a JOIN, because MySQL always uses nested-loops
      • Even a single-table SELECT or a UNION or a subquery is a JOIN
    • It actually executes the query, but because of that special bit it doesn’t actually do anything. However, this is not necessarily true for subqueries.
      • Subqueries that require a materialized temporary table as a step in the input to the execution plan will actually cause MySQL to execute the subquery, create the temporary table, fill it with the results and then re-execute the outer query against it.
      • If you EXPLAIN a query that has a subquery and you notice EXPLAIN is taking a while to finish, that’s because it’s executing that inner query and filling that temporary table.
    • EXPLAIN output/columns
      • id: which SELECT the row belongs to
        • If only SELECT with no subquery or UNION, then 1
        • Otherwise, generally numbered sequentially
        • Simple/complex types
          • Simple: there is only one SELECT in the whole query
          • 3 subtypes of complex:
            • subquery: numbered according to position in the SQL text
            • derived: executed as a temp table
            • union: fill a temp table, then read out with a NULL id in a row that says UNION result
      • select_type
        • Simple vs complex; which type of complex
        • Special UNION rule
          • First contained SELECT matches outer context
        • Dependencies, uncacheability
          • Refers to the item_cache, not query cache
      • type
        • How MySQL will access rows (how it will find rows in the table)
        • Worse to better:
          • ALL – Table scan where it scans through the entire table looking for rows
          • index – Scan an entire index, kind of like a full table scan except it’s only a full index scan
          • range – Scan a range of an index
          • ref – value as a reference to look into and find rows in the index that match that reference value
          • eq_ref – Variation of reference, there’s going to be at most 1, so that’s a unique index / primary key
          • const – Constant
          • system – During the optimization phase MySQL has completely determined that it can replace the access to this table with a constant, and it includes that constant in the rest of the query as though the table wasn’t involved at all. Example: selecting the maximum value from an indexed column; MySQL can seek to the end of the index during the planning phase, find the maximum value, and then refer to that value later in the query instead of going back to the index.
          • NULL – There’s no table at all involved. (ie “select 1”)
      • ref
        • When something is looked up in an index, when MySQL is seeking for rows that match a value, where does it get that value from.
      • rows
        • Approximately how many rows MySQL thinks it’s going to examine. This is based on the storage engine statistics. There are 2 function calls MySQL can make to the storage engine during query planning to ask ‘what is the distribution of rows in your index, and how many rows do you estimate are going to match this value’
        • That value gets used in query execution and optimization and evaluating different orders that might be used (like for a JOIN)
      • filtered (only shows in MySQL 5.1)
        • The percentage of rows that satisfy WHERE
        • Usually 0 or 100; complex behavior
        • Kind of useless; it doesn’t show you all that much, which is why it was removed in MySQL 5.5 and newer
      • extra
        • Using index
        • Using where
        • Using temporary
        • Using filesort
        • Sort-merge types
        • Lots more values (check the documentation)
    • EXPLAIN Index-Related columns
      • possible_keys
        • Which indexes were considered? Which were candidates.
          • Sometimes MySQL will actually use an index that wasn’t listed as a candidate because the decision of which things are candidates is decided in one point of the source code and somewhere else there is an optimization that could choose a different index.
            • So you may sometimes see an index listed in key (which one was used) but not listed in the possible_keys (which were considered)
      • key
        • Which indexes did the optimizer choose?
      • key_len
        • How many bytes of the index will be used?
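
A quick illustration of reading those columns; the users table and idx_email index here are hypothetical:

```sql
EXPLAIN SELECT id, name
FROM users
WHERE email = 'someone@example.com';

-- Things to look for in the output:
--   type          ref is fine; ALL means a full table scan
--   possible_keys indexes that were considered (e.g. idx_email)
--   key           the index the optimizer actually chose
--   key_len       how many bytes of that index will be used
--   rows          the optimizer's estimate of rows to examine
```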


  • How does MySQL execute queries?  In a very broad sense…
    • Parses SQL => (data structure) Parse Tree => (data structure) Execution Plan
    • Executor (aka executioner) works from the data structure, the execution plan
      • It goes through the execution plan structure, navigating from part to part in the execution plan data structure and executing the storage engine API calls on each node in that data structure
      • Optimizations become changes to the tree structure of the execution plan, constant folding, etc. (looks like what a compiler does)
      • The execution plan is just another type of data structure, not byte-code (which could then be handled higher up by the CPU)
      • The executor makes storage engine calls


What is EXPLAIN? It shows the estimated query execution plan. Only for SELECT in MySQL 5.5 and previous. It might not be the true execution plan, but it could be; as it gets started it may learn that the values it’s examining are different than it thought, and it may re-evaluate / re-plan the query. Sometimes you can even see in advance that MySQL knows it’ll have to re-plan the query for every value it gets.

If it’s doing a join from one table to another and it doesn’t really know which index is going to be best (it thinks that different indexes might be best depending on the values it gets from the first table), it may switch around and re-evaluate the index to use every time it looks for rows in the 2nd table. It’s only an estimate, and it’s only for SELECT.

In MySQL you can’t really see how the query executed after the fact.

EXPLAIN is an estimate of how MySQL thinks it is going to optimize and execute the query.

In MySQL 5.6 you can actually say ‘explain update’, ‘explain delete’, ‘explain insert’ and ‘explain replace’

Execution plans and how we end up generating the EXPLAIN output. MySQL’s execution process, broadly speaking, is that it parses SQL, converts that into a data-structure called the “Parse Tree” and then transforms the parse tree into the Execution Plan. The executor (called the executioner) works from that data structure (the execution plan) it iterates through the structure navigating from part to part in the data structure and executing the specified storage engine API calls on each node in that data structure.

MySQL starts with a data structure that comes out of the parse process and transforms it into another data structure. Optimizations become changes to the tree structure, changes to the execution plan, etc.

A MySQL JOIN only combines information from 2 tables at a time. So if you’re joining 3 tables, it joins the first 2 tables first, then adds in the 3rd table. If you joined 4 tables, it’d join table 1 and 2, then (1+2) + 3, then (1+2+3) + 4.
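
The nested-loop idea above, sketched as pseudocode:

```
for each row r1 in table1 matching the WHERE conditions:
    for each row r2 in table2 matching the join condition with r1:
        for each row r3 in table3 matching the join condition with r1/r2:
            emit combined row (r1, r2, r3)
```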

During an EXPLAIN, MySQL actually executes the query. But at each JOIN, instead of executing, it fills in the EXPLAIN result set. Most of the time it’s a NOOP (a bit is set that says don’t do anything), but this is not necessarily true for subqueries: subqueries that require a materialized temporary table as a step in the input to the execution plan will actually cause MySQL to execute that subquery, fill a temporary table with the results and re-execute the outer query against it. So if you EXPLAIN a query that has a subquery, sometimes you’ll see EXPLAIN take a long time to finish. This is because it’s executing the inner query to fill the table before it explains the outer query.

2 classes of queries: SIMPLE and COMPLEX. Simple is what you used to be able to do in MySQL v3 (before version 4 introduced subqueries and unions). Subqueries and unions are considered COMPLEX.

For the “id” column: primary queries / top-level queries / outer queries have an id of 1 (in the output of EXPLAIN). id = which SELECT the row belongs to. If it’s only a SELECT with no subquery or UNION, then 1. Otherwise, generally numbered sequentially. 3 subtypes of complex:
– subquery: numbered according to position in the SQL text
– derived: executed as a temp table
– union: fill a temp table, then read out with a NULL id in a row that says UNION RESULT

TINYINT: -128 to 127
SMALLINT: -32,768 to 32,767
MEDIUMINT: -8,388,608 to 8,388,607
INT: -2^31 to 2^31 - 1 (-2,147,483,648 to 2,147,483,647)
BIGINT: -2^63 to 2^63 - 1
FLOAT: Decimal values, roughly -3.4E38 to 3.4E38
DOUBLE: Decimal values, roughly -1.7E308 to 1.7E308

CHAR: a character string with a fixed length
VARCHAR: A character string with a length that’s variable
BLOB: Can contain up to 2^16 - 1 (65,535) bytes of data
ENUM: A character string that has a limited number of total values, which you must define.
SET: A list of legal possible character strings. Unlike ENUM, a SET column can hold multiple of the defined values at once, versus the single legal value with ENUM
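
A throwaway table using a few of these types (the table and column names are made up):

```sql
CREATE TABLE widgets (
    id     INT UNSIGNED NOT NULL AUTO_INCREMENT, -- 0 to 2^32 - 1 unsigned
    qty    SMALLINT NOT NULL DEFAULT 0,          -- -32,768 to 32,767
    sku    CHAR(8) NOT NULL,                     -- fixed-length string
    name   VARCHAR(100) NOT NULL,                -- variable-length string
    status ENUM('new','used','broken') NOT NULL, -- exactly one of these
    tags   SET('red','blue','heavy'),            -- any combination of these
    notes  BLOB,                                 -- up to 65,535 bytes
    PRIMARY KEY (id)
);
```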


Deon's Playground

Placing whatever interests me and more