RaymondBerg.com

Archive for the ‘databases’ Category

Wednesday, February 17th, 2010

If you found this site via some sort of Google search, you’re probably looking for answers. I won’t get to that now, but I will warn you that I don’t have them. If you’re a friend or colleague of mine you will have let out at bit of a laugh and settled into your chair with every intent of hating what I’m about to write. Well, I’m glad.

In industrial technology applications we often see projects and ideas labelled with ideas that spark value to other techies.  Terms like ‘robust”, ‘quality’, ‘efficient’, and, to a lesser degree, ‘boffo’. Far less often do we see technology simply described in terms of ease of use.  In fact, I would say it’s downright rare to hear of any steam-powered, hard-left engineers bringing up usability or learning curve when drafting a system designs or applications. Just like football players are interested in beer chugging and cheerleaders, I guess engineers into database normalization and load-balancers.

So what happens when you take that out of the equation? What happens when you take away the pomp and circumstance, when you lose the beer and the cheerleaders? Well, then you have football  or,  in this case, engineering. Take away the sex appeal of massive, scalable availability and mind-boggling complexity; what are you left with? It’s not frequent, sure, but it happens eventually. You’ve got problem solving, pure and simple. Simple calculations, exercises, tasks, and duties.  Sometimes football players have practice, they run drills, lift weights, and sell pizza coupons. Well it’s no different for engineers.

Enter Microsoft Access.

I’m serious, although this is quite a leap (move with me here, I’m trying to keep this short). Access has never, in its entire existence as a RDBMS, won any championship football games, but that doesn’t mean it isn’t a great tool. It’s the football practice dummy and the barbell and the coupon book, but all of this for engineers. Just like them, we’ve got to do the boss’ laundry and earn our keep. We architect the big solutions, sure, but what about the little stuff? What about the parts inventory for the warehouse? What about the customer contact that the boss does once a week and notes in his journal? There’s also that email list that you share information on, but people keep asking the same questions every year or so and nobody keeps any of the information recorded anywhere? What about office supply orders that Debbie does once a week? It’s all little stuff.

Three times, in three different positions, I’ve used Access (or other simple data management tools) to bridge a gap or improve a process that was being done poorly or not-at-all.  Each of these times it’s been a task that I volunteered for, and each time I’ve received more recognition than all of my ‘big picture’ work combined. I didn’t choose Access because it’s fast or robust or sexy (as it is clearly none of these things), but it is definitely quick and easy and portable, not to mention the availability across most corporate IT spaces. It’s not designed to track Walmart’s inventory, but it does get the job done.  After all, who cares about an 18% performance increase on the security log queries when I have this nifty iPhone app that lets me keep track of what I eat every day? Okay, maybe that’s silly, but it’s all little stuff. And the reason it makes a difference is because it effects people.

I’m a proud Access developer. It’s not my day job, and I’m glad for that, but it’s an amazing tool. I pledge to volunteer my services to help improve the processes and daily work of people on whom I rely. I’m also going to use it as a ‘gateway database’; I’ll use it to get into American homes and get kids and adults to try harder stuff like MySQL, Linq, Rails, and Hibernate. But for Dad’s big list of home electronic serial numbers, Timmy’s baseball card collection, and small project CRM…well, I’m on board. Who knows? Maybe even the Microsoft or the EPA will use it.

*Remember: a good developer is an active developer. Please stop engineering for engineering’s sake; it’s not healthy. Put your skills to some good use and fix something or teach someone (or vice-versa).

Saturday, February 14th, 2009

Subtitle: Makes an A$$ Out of You and Me

I ran into an issue last night with MySQL Workbench that frustrated me. I’m working with a table of medium-low complexity (about 10 Tables and 15 foreign keys) that I’ve inherited in joining a research team.  At this point I’ve reworked the whole thing to be a little more robust, but when I did so I ran into some problems on importing the results into a table.

On some of the tables I was getting an “Can’t create table <x> (errno: 150)”, which turns out to be a foreign key(FK) creation failure. I then realized that all of the tables that were failing were referencing the same core table, but with no visible issues. The primary key on the table was a composite of two  surrogate keys [INT(10) in this case]: one from the parent table and a local one.

I immediately thought that the FK definitions might have been incompatible, and so I combed through some of them for reasons why this would happen. I looked through InnoDB reference documentation in search of limitations in the implementation. Maybe there was a limitation on the number of keys referencing a single table, or on chaining “cascade”, “set null”, “restrict” behaviors, or on the maximum length for foreign key field names. Nothing.

I was up until 2 in the morning in the office last night with this problem with no success and eyelids with exponential density growth. When I got in today I quickly realized that it was just a simple conflict between ‘unsigned’ and normal int values in the foreign keys and relating tables.

My beef: All of the table declarations were made in MySQL workbench. Relationships and everything were formed using the exact same tools for all of it, and yet some of my columns were unsigned and some were not. Additionally, there was seemingly no way to change this value in the interface. Well, it turns out there is.

How to enable zerofill, unsigned, and other flags in MySQL Workbench:

  • Go to (Menubar)->Tools->Options
  • Open the ‘Diagram’ Tab
  • Ensure “Show Table Column Flags” is checked

My biggest beef: A good programmer should never assume that a tool is doing what he wants it to do. If there’s no visible indication that numbers are being given unsigned classification, then that’s your job.  I wasted a few hours on this problem, but my jedi masters should be pleased that I’ve resolved this error in less than 24 hours. PEBKAC ftw.

Friday, August 1st, 2008

One of the biggest, although non-normalizing, issues that I’ve struggled with is that of the natural versus surrogate keys. Now, I didn’t know what these were a few years ago so on the off chance that you are me in that time period, I’ll briefly explain. A natural key is a unique/primary key that is made up of strictly information that is logically connected with the entity you are storing in the row. The tried and true example is that of a social security number or a detailed title, basically anything that is part of the entity that makes it unique. What makes things hard is choosing the right data to make that storage easier, obviously the same FIRST_NAME occurs in more than one PERSON so you’ll have to use another or more rows (ex. FIRST_NAME, LAST_NAME, BIRTHDATE, BIRTH_PLACE). But it’s a lot easier to use a surrogate key, most often an autoincrementing integer, to just represent each row, because you know that will be distinct.

When instructed in my undergrad database lectures I was told, flat out, to never use surrogate keys. Fortunately, the man I learned this from can now be properly labeled as a “hack” when it came to practical computational theory. He would literally grab his shirt collar and adjust it uncomfortable while making strange, wounded-animal-like noises every time a programmer stuck an extra method call in a constructor or used a surrogate key. His goal was to make us flawless when it came to design practice, but he really just made us terrified that if we did something outside of his strict guidelines he would leap on us with his red pen. Or that the noises he was making was part of the summoning act to bring some giant bird to come and tear our typing fingers off.

Well, we learned to do a pretty good impression of him, and we also learned that what he was saying wasn’t exactly flawless, or even reasonable. The reality of the matter is that it’s OKAY TO USE INTEGER PRIMARY KEYS. First of all, it’s not the end of the world to de-naturalize a piece of data. Plus, not only is it okay, but there are significant benefits in speed and tool interoperability.

The only argument in existence against using a surrogate key is that you are essentially defining rows as being defined by a simple number that has nothing to do with the actual data. Sure, this is a loss, and it’s going to make comparisons between multiple tables a whole lot easier. But what about complex tables that aren’t easy to represent in one or two fields. I remember that same “E” professor made us, for a practical software course, store foreign key relationships to a table using 3 varchar fields. Since we had 2 mappings to said table, this meant 6 varchar fields defining the relationships instead of 2 integer fields that would have been much faster for our thousands of rows and much more readable.

Beyond that, do you know how to fix the entity integrity issue? Really, you can just add a unique key constraint on the same fields you would have used for the natural primary key. By using this method you essentially recreate the exact same restrictions on a faster indexed table for which it is much easier to define relationships.

CAVEAT: But it’s not right to say that a database designer, especially a green one, gets to use surrogate keys off the start. Why not? Because it’s not the best case. To be honest, I think this whole world would (www) be a little bit better off if we could make the initial model work. I will continue to use natural keys whenever it is convenient for the model, but the difference is that I will make a conscious decision about which one to pick.

Call me a glassy-eyed, idealist youth, but I honestly believe that you can test a good programmer not by what he knows but by how he evaluates his options. It’s a fine balance between diving off the board before checking for pool water and hourly, broad-field pH tests, but it’s the one that will make you a solid programmer/architect/designer.

    follow me on Twitter