Decisions (so many tools)

During a database schema upgrade yesterday, I had to create a new primary key field for a table and then generate the new primary key values. At the moment this key value is a custom generated value by a Python class, but it will eventually use an RFC 4122 or ISO/IEC 9834-8 compliant algorithm (see Wikipedia for more information).

I had to decide how to generate this value and I ended up using a combination of Python, gawk, and T-SQL. I suppose I could have left out the gawk, but it would have taken almost the same amount of time to do it in Python and it took a just moment to write. I will present three different methods that I could have accomplished the same task, although there were reasons that I used the method that I did.

Method 1 - Python, gawk, and SQL

# Requires Python 2.5 or higher
import uuid
for i in range(200):
    print uuid.uuid4()

C:\>python genuuid.py > uuid.txt

C:\>gawk "/^'/ { i += 1; print \"INSERT @issueid VALUES (\" $1 \", \" i \")\"; }" uuid.txt > uuid.sql

/* Add the table variable to the SQL file */
DECLARE @issueid TABLE (
    issueid varchar(36),
    issuenum int
)

Method 2 - Python and SQL

# Requires Python 2.5 or higher
import uuid

print """DECLARE @issueid TABLE (
    issueid varchar(36),
    issuenum int
)"""

for i in range(200):
    print "INSERT @issueid VALUES ('%s', %d)" % (uuid.uuid4(), i+1)

C:\>python genuuid.py > uuid.sql

Method 3 - SQL

/* Add the table variable to the SQL script */
DECLARE @issueid TABLE (
    issueid varchar(36),
    issuenum int
)
DECLARE @i int

SET @i = 0
WHILE @i < 200
BEGIN
    INSERT @issueid VALUES(NEWID(), @i+1)
    SET @i = @i + 1
END

/* Process table here */

After looking back, Method 2 seems to be the most straight-forward and self-contained method. The reason I chose Option 1 was that it was the first method that I thought of. Now that I am using Python more, I will probably start favoring quick solutions like Method 2.