Selecting Random Row from a SQL SELECT Statement
Friday, March 23, 2007
Well I just found something cool as I was looking for a way to select some data from a SQL Server table and only return 1 random row. Pretty easy too:
SELECT TOP 1 columnName FROM table ORDER BY NEWID()
Works like a charm.
Labels: Random, SELECT Statement, SQL
March 23, 2007 at 11:28 PM
One thing you might want to watch out for in this case would be using this in a table with a lot of rows that aren't excluded. NEWID() generates a 128-bit value, which is already going to generate a slow comparison (that's up to four machine words on 32-bit servers!), not to mention the fact that every row returned will have NEWID() called for it.
There are a couple other ways that might be better. If you need a single random row, you can get max row ID first (SELECT MAX(identity-column) FROM table;), perform a random operation with that value as the cap (new Random().Next(maxID + 1);), and then retrieve the first row with the ID greater than or equal to that ID (SELECT TOP(1) * FROM table WHERE identity >= randomValue;). This seems to be a safer and somewhat more efficient way to keep load off of the database server at the possible risk of some loss of randomness, as rows with many IDs missing before them are more likely to be picked. For instance:
|-|2|-|-|5|-|-|-|-|-|11
In this case, row 11 is most likely to be picked because 6-11 can be the random choice and 11 will be chosen. 2 is the least likely, etc.
In any case, each method has its ups and downs based on the assumptions you can make about your data. If you know that your data isn't going to suffer from many DELETEs, my suggested method is pretty safe; on the other hand, if you're going to have a lot of fragmented IDs, something else such as the NEWID() strategy might be better. :) top
April 2, 2007 at 12:08 PM
So this is what I ended up doing so all my followers of this blog (5 I think) will know.
SELECT TOP 1 FROM tableName ORDER BY RAND((1000 * MyPK) * DATEPART(millisecond, GETDATE()))
Seemed to work and hopefully isn't taking up a bunch of resources on my SQL Server. top
April 12, 2007 at 10:42 AM
I just noticed that my previous comment is slightly wrong. Forgot to add the asterisk before the "FROM".
SELECT TOP 1 * FROM resource ORDER BY RAND((1000 * MyPK) * DATEPART(millisecond, GETDATE())) top