Tuesday, July 29, 2008

Is a Temporary Table Really Necessary? - PART I:

Why would you use a temporary table?

There are several reasons that I use temporary tables in my work; to hold the results of a called stored procedure, to reduce the number of rows for joins, to aggregate data from different sources, or to replace cursors.As your query become more complex you will find yourself repeating blocks of code within a query or between different queries.

This reuse of code makes the case for a creating a stored procedure with that code and calling that stored procedure. This may make for a large amount of stored procedures in your database, but it does greatly reduce the maintenance when the functionality needs to be changed and you have to change code to meet that functionality, only one query to change now and not multiple queries that have to researched and changed. I use this technique quite often and it often forces me to use a temporary table to hold the results of those stored procedures since Transact-SQL does not allow the results of a stored procedure to be used as a table.

This is probably the number one reason in my code for the use of temporary tables. I quite often find myself having to join a 10 million plus row table to a 100 million plus row table to a 20 million plus row table and then ordering the results so only the most recent activity displays first.

Even with proper indexes and using WHERE clauses to filter and force the use of an index, the performance of the query is unacceptable (since the application I work on is used by call centers, acceptable performance for a query is measured in seconds) and often the sorting produces huge performance losses as well as huge tempdb activity. I have quite often found that using corresponding temporary tables for each of the permanent tables to hold data from filtered by the WHERE clauses before I join and sort the data will increase performance on the query to such a large degree that I can actually place it into production without worrying about its performance or the impact on the tempdb database.

Below is a very simple query to show how I do this.

Original Query to find details on a particular customer's phone call

SELECT table1.numCustID, table2.strPhoneNumber, table3.strPhoneNumberCalled
FROM dbo.table1 table1
INNER JOIN dbo.table2 table2 ON table1.numBillID = table2.numBillID
INNER JOIN dbo.table3 table3 ON table2.numBillDtlID = table3.numBillDtlID
WHERE table1.numCustID = '5555'
AND table2.strPhoneNumber = '5555555555'
AND table3.strPhoneNumberCalled = '1234561234'
ORDER BY table3.dtmCalled DESC

(This query does not match the schema or an existing query at Verizon. It has been created to show a particular problem with a hypothetical telecommunications database.)

No comments: