Brishen and I have been having ongoing disucussions revolving around query optimization theory. I thought I’d toss this out to the world (or the tiny proportion of the world that sees this site).
Currently, we’re discussing constraints. If a.childID = b.objectID is all that I need to create the right link between tables, would a.childID = b.objectID AND a.childtype = ‘B’ be more efficient? Tests against SQL Server for me show that this appears to be true if a.childtype is indexed, but if it is not indexed, it appears to be about the same speed (a.childID is indexed in both cases). Is there a general rule that can be applied here? Such as: if the constraints apply to indexed columns, it is better to include as many as possible, but if the columns are not indexed, as few as possible?
A further questions concerns using WHERE a IN (option1,option2,option3) vs. a = option1 OR a = option2 OR a=option3. Which is better when? Is there a limit to how large the list used with IN () should be? Personal experience has suggested not really, but the largest I’ve used is not that large.
Furthermore, while they should be equal, is FROM tableX JOIN TableY on x.objectID = y.objectID equal to FROM tableX,TableY WHERE x.objectID = y.objectID, performance-wise?
Every time I leave a largeish query alone for a while, and return to it a few months later, I find little tweaks that I can do to improve it. What I’d really like is some way of learning all the ways of optimizing queries in the first place, even if it is technology specific. Can anyone out there in the world recommend any books about query optimization theory, or a series of experiments comparing the various ways of accomplishing the same things through SQL? If so, I, and Brishen, would be most appreciative.