Welcome to NuclearPants.com   
...different bottle, same great taste!
Optimizing SQL for Very Large Data Sets

I have recetly been working on a project that involves manipulating data within a very large table in IBM's DB2 database.  One of the tables I am working with contains 10,000,000+ records (and growing).  While my initial code appeared to be sufficiently efficient when working with a relative empty table, it quicly became apparent that there were inefficiencies in my SELECT statements when I ran the code against the fully populated table.  After tweaking the code left and right, I finally  refined it to an acceptable state.  A process that would take upwards of 5 minutes to run now takes less than 2 seconds!!!  The following are a few important points to keep in mind when trying to write optimal code against large (and small) database tables:

 

1. When selecting data, never use concatenated columns

2. Make sure to take advantage database indexes, you may need to create them if they do not exist.  INDEXES make a HUGE difference!!!

3. Avoid the use of subselects

4. Avoid the use of keyword IN (...) if you can avoid it by using a series of ORs instead