Tomato, Tomato – Assumptions about Regionalization

This month’s TSQL Tuesday prompt is on the topic of assumptions. I’d like to take a brief moment and talk about one of the more insidious and pervasive sources of bugs in TSQL application code I’ve seen over the years.

Let’s say I present you with a date: 08/12/2014

Quick! What month am I talking about?

If you answered August 12, 2014, you might be correct. If you answered December 8, 2014, you might also be correct. If you sneakily answered “It depends!” then, well, you’ve picked up on where I am going with this.

I very frequently see coders assuming that their local timezone, region, collation, and language are somehow universal, leading to coding practices which limit the geographical reach of their software. Here are some of the most common things which I see overlooked:

  • Regional date format – Keep in mind that mm/dd/yyyy and dd/mm/yyyy cannot be easily distinguished until the 13th of any given month. Usage varies around the world and even within countries. Use yyyymmdd instead!
  • Regional number formatting – Though this a larger problem in app code than in TSQL code, you still see the issue sometimes. A decimal may be represented as 3.14 or 3,14 depending on local settings. There are more variations than you might expect.
  • Time zones – You would hope it could go without saying that not all users will necessarily be located in the same time zone, but applications have issues with this all the time. Time zones are also not static and are affected by Daylight Savings Time windows which can shift or be cancelled outright. Storing data in UTC time and applying offsets in the presentation layer can usually solve this problem. And don’t forget – not all time zones are on even hour boundaries!
  • Collation settings – Honestly, the thing I see the most often is the assumption of a case insensitive collation. This problem bit me so many times in my early days working on SQL Diagnostic Manager that I switched my entire development environment to case sensitive many years ago. I also put Korean, Russian, Japanese, and Binary collations into my test environments. It’s better to write your code to handle these issues early on than to play whack-a-mole with collation conflict errors later on.

You may feel safe with these assumptions in quick little scripts now and then, but remember, the moment you put a script up on your blog, it has gone out to an international audience who may have different settings from you. You may think to yourself that all of your company’s servers are located two floors away and are all of the same predictable region, collation, and language settings – but then you attend a quarterly kick of meeting and find out you’ve merged with a company on another continent and, by the way, you’re now responsible for their servers too!

Save yourself some heartache and add a nice sparkle of professionalism to your code by writing it to be region and language agnostic. You’ll be a better coder for it.

Edit: Special thanks to Rob Farley (b | t) for pointing out an inaccuracy in the above! There is a difference between yyyymmdd and yyyy-mm-dd and it’s important to use the former.

July 8, 2014 Posted by: Leave a comment - Permalink

Leave a Reply

%d bloggers like this: