Recently at work we ran across a situation where we needed an efficient way to find the Group-wise maximum of a certain column in an efficient manner. After fighting with and realizing that there is apparently no way to make MySQL combine an ORDER BY statement with a GROUP BY statement, we were able to get the results with two queries, but due to the volume of data, this would eventually have too much overhead to be sustainable.
Once I determined that using a join on an uncorrelated query set would be the way to go at the problem, I set out to figure out the solution using the new Arel syntax.

Example

This example uses two models, ParentThing and ChildThing, with ChildThing being a basic key-value store. ChildThings can have duplicate rows with the same key, and we were looking to return only the newest results for each distinct key found in the ParentThing's ChildThings.

class ParentThing < ActiveRecord::Base
  has_many :child_things
end
 
class ChildThing < ActiveRecord::Base
  belongs_to :parent_thing
end

The solution was to add a single named scope to the ChildThing class:

class ChildThing < ActiveRecord::Base
  belongs_to :parent_thing
 
  scope :newest_results, 
    joins('JOIN 
      (SELECT id, key, val, MAX(created_at) as created_at FROM child_things GROUP BY key) as c2
      ON child_things.key = c2.key AND child_things.id = c2.id')
 
end

So, now if p is an instance of ParentThing:

p.child_things.newest_results 

will return a collection that contains the newest created ChildThing for each disparate key in the ParentThing's ChildThing full collection.

cross posted at ndfine.com

Nick spent the better part of a decade working in systems and networking support until discovering that developing software, particularly in Ruby, was a much more entertaining and satisfying career choice. He lives with his wife and pets in Birmingham and enjoys excessive amounts of coffee, contemporary literature, and music that's too loud.