Abstraction Breakdown: Table Aliasing in ActiveRecord
A few things I love about Rails:
- It strongly encourages you to put code in the right place MVC-wise. This feels good, but even more importantly, it enhances maintainability. (This is true both for other developers that might inherit your code and for three-weeks-from-now you.)
- The abstractions are powerful, but don’t have too many layers, so if some behavior surprises you, it’s easy to go look at what is actually going on. The one-two punch of open sourciness and simplicity, combined with the inherent readability and conciseness of the Ruby language, makes for productive debugging.
- Ruby is well-suited to creating DSLs and other forms of metacode that let us as developers solve a problem and then reuse the solution on different data sets with different parameters.
And an involved little problem where these break down:
Within our app, we have lots of places where we need searchable, sortable, configurable views of tabular data. While this is a common enough requirement, it’s one that folks tackle in all sorts of different ways, from pure Javascript implementations to scaffolding generators. Because we wanted the code to remain DRY and maintainable as much as possible, and yet be individually securable via controller/action pairs, we opted to build something somewhere in the middle.
A simplified example declaration, which you might find in WarehouseTypesController:
data_table :warehouse_type,
[:name,
:description,
:created_by,
{:key => :created_at, :label => "Created Date"},
{:key => :updated_by, :label => "Last Updated By" },
{:key => :updated_at, :label => "Last Updated Date"}],
:find_options => {:include => [:created_by, :updated_by]}
Each table has a base model (in this case, WarehouseType), where each row of the table represents one record in that model’s database table. The array that follows describes the columns to include and/or search in the table, and the find_options are identical to what you’d pass into an ActiveRecord find call. The data_table method itself generates some methods in the host controller that handle rendering, various AJAX actions and CSV export. The :created_by and :updated_by keys refer to associations defined in the model:
class WarehouseType < ActiveRecord::Base belongs_to :created_by, :class_name => 'User', :foreign_key => 'created_by' belongs_to :updated_by, :class_name => 'User', :foreign_key => 'updated_by'
A call to render_warehouse_type_data_table will produce something like this:

And the advanced search pictured above leads pretty directly to the following SQL:
SELECT * FROM `warehouse_types` WHERE (warehouse_types.name LIKE 'storage%') ORDER BY warehouse_types.name DESC LIMIT 0, 20
Straightforward, right? Well, it breaks down even in the above example when you try to query on the “updated_by” user, because suddenly we may or may not be joining to the users table twice, and any SQL we generate needs to know what table alias to use in the condition. Internally, ActiveRecord uses a table’s name as the alias the first time a table is joined into a query, and then uses an algorithm involving the relevant association names to generate aliases for subsequent joins. You don’t need to know this as long as you’re using the nested hash version of find(), but that doesn’t currently support any tests other than equality, range, and set inclusion. Consequently, for anything more complicated (pretty much all queries we run), we need to use the SQL fragment / array form of find() to prepare queries. A naive version of a find() call for warehouse types that were last updated by John might look like this:
WarehouseType.find(:all,:conditions => ['users.fullname LIKE ?','John%'],
:include => [:created_by,:updated_by]
which would generate something like this:
SELECT [redacted] FROM `warehouse_types` LEFT OUTER JOIN `users` ON `users`.id = `warehouse_types`.created_by LEFT OUTER JOIN `users` updated_bies_warehouse_types ON `updated_bies_warehouse_types`.id = `warehouse_types`.updated_by WHERE (users.fullname LIKE 'John%')
This won’t work, though- the WHERE clause is using the wrong alias, and will look at created_by users instead. The find() call would need to look like this:
WarehouseType.find(:all,:conditions => ['updated_bies_warehouse_types.fullname LIKE ?','John%'],
:include => [:created_by,:updated_by]
The thing is, though, we don’t really know what alias to use. If there’s a default_scope defined, for example, the user table may already be joined, or the query may already be scoped in some other way. Also, the outer joins may or may not actually get generated in the order specified, especially if they’re included as a Hash rather than an Array. (Like most folks with dependencies on external libraries, we can’t migrate to Ruby 1.9 yet.) Our current solution to this problem is indeed to assume particular scopes and a particular ordering to the joins, leading to a data_table column declaration more like this:
{:key => :updated_by, :label => "Last Updated By",
:sql_column => 'updated_bies_warehouse_types.fullname' },
I really don’t like this solution, though- it’s brittle, and it requires too much knowledge of both ActiveRecord’s implementation and any additional factors that may modify the model’s scope. This is not knowledge that something in the controller should have. It’s also tricky to test in a low-level way.
Before getting onto the possible solutions, I should point out that we’re certainly not the only ones to run into or describe this problem- there’s more succinct discussion in Rails tickets 2357 and 2087, including a proposal for and code for potential solution #1 below.
So, what do we do? At the moment, nothing- it happens seldom enough in our table definitions that we can get away with hard-coding the exceptions for now, but it limits functionality we can add and is just ugly. Essentially there are 3 somewhat reasonable approaches to actually fixing the problem:
- Introduce a way to query what the expected table alias for a particular situation is going to be
- Implement operations other than == in the nested-hash form of find(), as in the relational algebra code that may be merged into AR eventually.
- Replace alias references with locally-scoped tokens that represent the joins/includes, and substitute in the correct aliases when merging conditions.
These are all quite intrusive, however; the way that ActiveRecord’s merge_conditions calls work today, it doesn’t know anything structured about the JoinCondition objects, so even something simple would be a nasty hack. Given how central this code is, any change would need to be both well-tested and quite speedy. My personal preference is for #3, and when I have some spare time I may try coding this up sometime soon. My hunter in the Outlands may take precedence, though…











Company Blog
Developer Blog
News & Events