Chomper Stomping
jQuery/JavaScript/CSS 3/HTML 5, Java/PHP/Python/ActionScript, Git, Chrome/Firefox Extensions, Wordpress/Game/iPhone App Development and other random techie tidbits I've collected



programming concepts

May 20, 2008

How to ORDER BY before GROUP BY with MySQL

I’ve got a table that tracks inventory at multiple vendors for multiple products by timestamp. I want to only select the most recent inventory count per vendor by product.

One way to do this is with a bunch of while loops and hitting the DB over and over, or selecting a crap load of data and just dropping out everything but the most recent inventory count manually. Both of these are viable solutions with a lot of overhead, but what I really want is just a silver bullet SQL statement that does this for me. I think I have it.

What I initially envisioned was:

SELECT *
FROM INVENTORY
ORDER BY vendors_no,
timestamp DESC
GROUP BY vendors_no,
product_no

However, if you are familiar with MySQL, you will know that you cannot put ORDER BY before GROUP BY. You have to GROUP BY and then ORDER BY.

It appears that the way that GROUP BY works is that it takes the first row of each “group” and just drops the rest. So whichever row happens to be first in the DB when the SQL is executing will get grabbed and the rest will get dropped. If I had wanted to get the first inventory count for each vendor/product, this would have been ok, since those would have been the first row, but it’s going to be the last row each time. So what to do? If only I could flip the whole table on it’s head or something…

Actually, I kind of can! This is what I came up with:

SELECT *
FROM(
SELECT *
FROM INVENTORY
ORDER BY vendors_no,
timestamp DESC
) as inv
GROUP BY vendors_no,
product_no

If I understand GROUP BY correctly and ORDER BY correctly (which I very well may not) the sub-query will first basically create a temporary table with the results in reverse timestamp order. Then when GROUP BY is going through and finding the first vendor_no and dropping the rest it will be keeping the most recent one.

If anyone knows any reason why this is stupid or wrong, please let me know. It’s the best solution I’ve been able to find so far.

***UPDATED: In response to Brian’s suggestion***

Dang. I really wanted that to work, but it didn’t.

Here is the original SQL:

Here is the SQL you suggested:

Did I do something wrong?

Here is the total data being queried:

Thanks for the suggestion!



About the Author

Christopher McCulloh
E-Commerce developer at Finish Line Co-Author of HTML, XHTML and CSS All-in-one Desk Reference for Dummies Graduated from IU with a Bachelors of Media Arts and Science and a Certificate in Applied Computer Science. Tech Editor for Building Facebook Applications for Dummies and Building Websites All-in-one for Dummies 2nd Edition. Creator and maintainer of the Status-bar Calculator Firefox Extension Three years professional experience in Java E-Commerce Development and four years professional experience with PHP for a combined total of seven years professional JavaScript/HTML/CSS experience




 
 

 
logo

dynode Batch Get Item

Working a lot with node.js, dynode and dynamoDB recently. Still trying to wrap my head around it all. Had a horrible time getting dynode.batchGetItem to work. Here is the error I was getting: { name: 'AmazonError', type: 'Valid...
by Christopher McCulloh
0

 
 
mysqlerror

WP phpBB Bridge: Warning: mysql_set_charset() expects parameter 2 to be resource, boolean given

Warning: mysql_set_charset() expects parameter 2 to be resource, boolean given in wp-content/plugins/wp-phpbb-bridge/inc/widgets/wpbb_topics_widget.php on line 149 This is an error caused by the fact that the WP phpBB Bridge pl...
by Christopher McCulloh
0

 
 
 

Events Calendar Pro Nav Formatting Messed up on Empty Calendar

The Events Calendar Pro (from http://tri.be/) has a few problems. If you are trying to figure out why a calendar with no events in that month has completely screwed up header navigation, just put this line of code inside of tab...
by Christopher McCulloh
5

 

 
warning

OH SHNIKES, WE’VE BEEN HAXORED!!!

Yes. It finally happened. After… 6 years? on the web I finally got hacked. Two domains affected: http://cmcculloh.com http://hallelujahbutton.com (this also of course affected all sub-domains of cmcculloh.com, such as blo...
by Christopher McCulloh
2

 
 
blue-xl

WordPress Settings API – Adding Options to Existing Page

Adding new options to an existing page in the dashboard in wordpress can be maddening. I’ve literally spent 15+ hours dealing with this horrible API at this point. To the point where I wrote two different wrappers for it....
by Christopher McCulloh
0

 




6 Comments


  1. Have you tried the MAX function?

    Select MAX(timestamp) as latestDate, inventoryCount, vendor, product From Inventory Group By product, vendor

    I use the MAX/MIN functions a lot specifically for the reason you are referring to. I don’t know from your post exactly how your data is set up, and I usually like to test my SQL statements so you might have to tweak this a bit.

    I also use Group By quite a bit, and it’s for using aggregate functions like MAX/MIN. It’s not necessarily “dropping” the rows, it’s just not finding them since you did not use an aggregate in your From clause. If you use the SQL I posted w/o the Group By, you will get get a resultset with all the inventoryCounts, vendors, and products, but they will all have the same timestamp (the latest date in the entire table). That’s why you have to use the Group By because you’re telling it you want the latest date per product, per vendor.

    Your method will work too, but it is not as efficient – but that might not matter depending on how much data you have.


  2. Thanks for the suggestion!

    Updated post…


  3. jacob

    superb awesome mysql query


  4. NB

    I’m not very advanced with MySQL and I searched for hours trying to GROUP BY before ORDER BY, and it seemed impossible to do without a sub query. I didn’t want to do that because of server load, so instead I created a temporary array in my php result loop that dumps the row if the vendor ID has been used before:

    // CHECK ROW
    if(@$tempArray[$row[VendorID]] == 1){
    // SKIP
    }else{
    $tempArray[$row[VendorID]] = 1;
    // DISPLAY TABLE ROW HERE
    }


  5. muinx

    thanks for select :)

    @Brian: you can not use MAX if you order by date/time


  6. stats

    this is an obvious and straightforward solution, however if you run this on a large table (mine was only 350k rows) you get mysql error 28. I ran this on a pretty good dedicated server with some 100gb disk space available and still got the error .. need to find a work around



Leave a Reply

Your email address will not be published. Required fields are marked *

*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>