Three client error messages that MySQL DBAs should know by heart to save time

Working in an operations environment means that you get a lot of questions.  There are the inevitable troubleshooting tasks that go along with being a DBA.  This is designed to be a quick reference post, much like my more in-depth post in 2016 about timeouts.  These common error messages can cause confusion and lead to unnecessary diagnostic work.  To aid my own process of elimination and that of others, here are three error messages every DBA should know by heart.

SQLSTATE[HY000] [2002] Connection timed out

Plain and simple: this error means the client cannot connect to the server at all.  Common causes include the following (a quick connectivity check is sketched after the list).

  • The calling program is trying to connect to the wrong database server (one that it cannot reach).
  • The database server is completely down (you’d get another error if the server and client actually made a connection).
  • Firewall rules, either on the local servers or on a network device, are preventing the connection.
  • Routes are misconfigured.
  • The calling server has so many outbound connections that the MySQL client cannot create an outbound socket (think lsof/netstat showing lots of connections stuck in TIME_WAIT, ulimits, tcp_tw_reuse/tcp_tw_recycle, etc.).
  • Anything else that might be preventing an actual TCP connection outbound to the server port (typically 3306).
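
For quick triage, here is a minimal sketch of the kind of checks implied above; the hostname and port are placeholders, and bash plus the usual iproute2 tools are assumed:

    # Can we even open a TCP connection to the database server?
    timeout 5 bash -c 'cat < /dev/null > /dev/tcp/db.example.com/3306' \
      && echo "TCP connect OK" || echo "TCP connect failed"

    # Is the calling host drowning in short-lived outbound connections?
    ss -tan state time-wait | wc -l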

SQLSTATE[HY000] [2006] MySQL server has gone away

This error is typically related to server timeouts.

  • an idle connection is hitting wait_timeout.  wait_timeout applies only to idle sessions; sessions with running queries are not affected by it.
  • a running SELECT query with a streaming cursor has not moved the cursor in net_write_timeout seconds.
  • a running bulk INSERT or UPDATE has not written anything to the server within net_read_timeout seconds.
  • in rare cases, interactive_timeout is being hit by an ad-hoc CLI client or other interactive program that sets the CLIENT_INTERACTIVE option to mysql_real_connect().  Most of the time interactive_timeout = wait_timeout anyway.

Keep in mind that if a DBA has deliberately set timeouts on the server, adjustments should be made to SESSION timeout variables within the calling program.  In other words, don’t change server-wide timeout settings just for one outlier or for some exceptional situation.
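
For example, here is a minimal sketch of checking the server’s settings and then overriding them for a single session; the values are illustrative assumptions, not recommendations:

    # What is the server actually configured with?
    mysql -e "SHOW GLOBAL VARIABLES LIKE '%timeout%';"

    # Inside the calling program's own connection (shown here via the CLI for
    # illustration), raise the limits for that session only:
    mysql -e "
      SET SESSION wait_timeout      = 28800;
      SET SESSION net_read_timeout  = 600;
      SET SESSION net_write_timeout = 600;
      SELECT @@session.wait_timeout, @@session.net_read_timeout, @@session.net_write_timeout;"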

Also: this error can be “normal” in situations where the application opens a connection pool to the database and many connections in the pool sit idle long enough to hit wait_timeout.  In environments like this, SQLSTATE[HY000] [2006] MySQL server has gone away can be a non-issue.  Work with developers to understand where in the code the error appears and collaborate on the correct solution, which may simply be ensuring the application’s MySQL connector or library is configured to reconnect.

SQLSTATE[HY000] [2013] Lost connection to MySQL server during query

This error is exactly what it says: the connection was cut while a query was running (a short illustration follows the list).

  • A MySQL query was interrupted mid-execution, meaning the connection was
    • killed by a backup utility on the server,
    • killed by pt-kill on the server,
    • killed manually by a DBA on the server,
    • or lost in some other way, such as a network issue between client and server.
  • This error normally does not point to timeouts or settings on the server side.
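
As a hypothetical illustration (the thread id below is a placeholder), a DBA killing a connection from a second session is enough to produce this error on the client that was running the query:

    mysql -e "SHOW FULL PROCESSLIST;"   # note the Id of the offending thread
    mysql -e "KILL 12345;"              # the client running that query now sees error 2013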


Two-version upgrade fun with MySQL 5.5 to 5.7

In a perfect world, one would upgrade databases one version at a time and not let them get too old.  But our databases are where the “crown jewels” live.  They must stay up 24×7.  When performance is acceptable, it’s acceptable, and sometimes old versions stay around too long.  We don’t live in a perfect world, and that idea applies to so many things: there’s almost never a perfect data model, and there is always some kind of resource constraint, be it storage, memory, CPU, IOPS, or just plain dollars.

I will bring this concept of not living in a perfect world into a discussion about upgrades.

Ideally there would be…

  • …time to do two upgrades: one to 5.6, the other to 5.7.  This is the way sane, normal people upgrade.
  • …a lot of extra hardware.  It sure would be nice to maybe combine a maintenance like this with a hardware refresh so that we could just have an entirely new MySQL 5.7 database shard, slap a MySQL 5.6 box in the middle as a relay, and once things are all caught up just do a VIP cutover to new machines.

The extra hardware idea is one I have actually worked with in the past.  A client of a company I worked for had a standalone 5.5 machine with no replicas.  They were able to allocate two more servers: one to be a 5.6 relay and one new 5.7 machine to serve as the final destination.  Added bonus: at the end, the 5.6 server would be upgraded and left in place as a redundant machine.

In the absence of extra time and extra money or hardware, sometimes a little creativity is required, which brings me to the most interesting 5.5-to-5.7 upgrade process I have been a part of.  It’s not “that creative”, but it does involve being willing to accept single points of failure for short periods of time.

The beginning setup is a MySQL 5.5 master-master pair in the primary site (machines 1 and 2 in the first diagram below).  However, when I say “master-master” pair, machine 1 is getting all the traffic.  Machine 3 is a secondary-site master and machine 4 is the secondary-site slave.  A side note: I am intentionally keeping this high level, so there are more steps involved than described here.

We upgrade machine 4 to be a 5.6 server and allow its replication to catch up.  We then use a binary backup and restore method (sketched below) to clone machine 4 onto machines 2 and 3, which become MySQL 5.7 servers.  After this is done the replication chain looks like the second diagram.
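
The post doesn’t prescribe a tool, but Percona XtraBackup is one common way to take this kind of binary backup; the following is only a rough sketch, with the paths and the exact upgrade steps as assumptions:

    # On machine 4 (the 5.6 source):
    xtrabackup --backup --target-dir=/backups/full
    xtrabackup --prepare --target-dir=/backups/full
    # Copy /backups/full to the target machine (scp, rsync, xbstream, ...), then
    # on the target, with mysqld stopped and an empty datadir:
    xtrabackup --copy-back --target-dir=/backups/full
    chown -R mysql:mysql /var/lib/mysql
    # Start the MySQL 5.7 binaries and run mysql_upgrade to finish the upgrade.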

Then we cut production traffic over to machine 2, and use the same binary backup/restore method to restore a copy of machine 3 onto machines 1 and 4.  In the end we look just like we started, except now everything is upgraded to MySQL 5.7.
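
Each of those restores ends with re-pointing replication by hand; here is a rough sketch using classic binlog coordinates, where the hostname, credentials, and the file/position (which XtraBackup records in xtrabackup_binlog_info) are all placeholders:

    mysql -e "
      CHANGE MASTER TO
        MASTER_HOST     = 'machine3.example.com',
        MASTER_USER     = 'repl',
        MASTER_PASSWORD = 'replpass',
        MASTER_LOG_FILE = 'mysql-bin.000123',
        MASTER_LOG_POS  = 4567;
      START SLAVE;"
    mysql -e "SHOW SLAVE STATUS\G"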

What are some other interesting upgrade paths?  This is a fun topic that I enjoy hearing about in the community.  Cheers.

Southern fried: quick script to fix table auto increments

OK, y’all.  If you have been a DBA long enough, you have run into situations where you’ve had to import DDL from another database instance and forgotten to strip the AUTO_INCREMENT value from a table definition.  Sometimes this means your auto_increment counter is way up in the millions while you have 10 rows in the table.  People worried about ID depletion on an integer column will do a “facepalm”.

Making matters worse, if you have done this on more than one table, it can be a real pain to detect and fix auto_increment counters that are much higher than you want them to be across a large number of tables.  Sometimes the pain is so great that you might be tempted to simply repeat the import.

This quick and dirty “southern fried” script will help you report on and rectify that.  It detects your auto_increment columns and checks each one by querying ORDER BY <auto_increment_column> DESC LIMIT 1 for the current maximum value.  I coded it in bash to be the “least objectionable” choice, so maybe some readers will actually use it.

So far the script is “lightly tested”, so use it at your own risk.
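
For readers who just want the flavor of it, here is a minimal, hypothetical sketch of the approach described above (it is not the author’s script); the database name is passed as an argument, credentials are assumed to come from ~/.my.cnf, and it only prints the ALTER statements for review rather than running them:

    #!/bin/bash
    # Report tables whose AUTO_INCREMENT counter is far above the real max value.
    DB="$1"   # schema to inspect

    # Find every column in the schema that uses auto_increment.
    mysql -N -B -e "
      SELECT TABLE_NAME, COLUMN_NAME
      FROM information_schema.COLUMNS
      WHERE TABLE_SCHEMA = '$DB'
        AND EXTRA LIKE '%auto_increment%';" |
    while read -r TBL COL; do
      # Highest value actually present in the column (0 if the table is empty).
      MAXVAL=$(mysql -N -B -e \
        "SELECT $COL FROM $DB.$TBL ORDER BY $COL DESC LIMIT 1;")
      MAXVAL=${MAXVAL:-0}
      # Current AUTO_INCREMENT counter on the table.
      NEXTVAL=$(mysql -N -B -e \
        "SELECT AUTO_INCREMENT FROM information_schema.TABLES
         WHERE TABLE_SCHEMA = '$DB' AND TABLE_NAME = '$TBL';")
      if [ "$NEXTVAL" -gt $((MAXVAL + 1)) ]; then
        echo "-- $DB.$TBL: max($COL) = $MAXVAL, auto_increment = $NEXTVAL"
        echo "ALTER TABLE $DB.$TBL AUTO_INCREMENT = $((MAXVAL + 1));"
      fi
    done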

Enjoy!


Open Source DBA Ethos

I’m in Santa Clara at the Percona Live Conference and Expo.  I have been here less than 24 hours, but I have already had conversations with many fellow DBAs, some whom I have met and worked with before, and others whom I am meeting for the first time.  It has been three years since my last visit, but being here shows you that as an open source DBA, you’re part of a “big family” of like-minded, top-notch professionals.

I am proud to be an open source DBA because of the values we so plainly share, values you see in almost everyone when you come to an event like this.

Who are we as Open Source DBAs?

  • We are system experts.  Open source DBAs know their operating system.  They are not only focused on the database.
  • We are developers.  We contribute to source code within our enterprises, and in many cases, in the community.  We create meaningful tools for our companies, and help solve problems inherent in “bleeding edge” technologies, making such technologies approachable in the enterprise.  We strive to bring continuous integration and seamless migration into places that are very difficult.
  • We move fast.  Open source DBAs, even in large enterprises, are always “moving the needle” towards higher performance, making changes as fast as our tools, our applications, and our organizations allow.
  • We strive for answers.  Open source DBAs are never satisfied with the existing answers to questions in our technologies, even old ones.  “One source of truth” is almost never the case.
  • We are collaborators.  We work together across team lines within our companies, and across companies in the community, to broaden our understanding.  We work across technologies to find the “best solution for the task”.
  • We are facilitators of change, not barriers to change.

The point of this article is that there is no such thing as “walls” around an open source DBA. I’m proud to be counted among the growing army of creative talent in the field, and proud to be counted among those who will always enjoy becoming a better, more educated, and more helpful resource.