Friday, June 3, 2016

Queries in Rails, Part 2

In this second article, we’ll dive a little deeper into Active Record queries in Rails. In case you are still new to SQL, I’ll add examples that are simple enough that you can tag along and pick up the syntax a bit as we go. 

That being said, it would definitely help if you run through a quick SQL tutorial before you come back to continue to read. Otherwise, take your time to understand the SQL queries we used, and I hope that by the end of this series it won’t feel intimidating anymore. 

Most of it is really straightforward, but the syntax is a bit weird if you just started out with coding—especially in Ruby. Hang in there, it’s no rocket science!

Topics

  • Includes & Eager Loading
  • Joining Tables
  • Eager Loading
  • Scopes
  • Aggregations
  • Dynamic Finders
  • Specific Fields
  • Custom SQL

Includes & Eager Loading

These queries include more than one database table to work with and might be the most important to take away from this article. It boils down to this: instead of doing multiple queries for information that is spread over multiple tables, includes tries to keep these to a minimum. The key concept behind this is called “eager loading” and means that we are loading associated objects when we do a find.

If we did that by iterating over a collection of objects and then trying to access its associated records from another table, we would run into an issue that is called the “N + 1 query problem”. For example, for each agent.handler in a collection of agents, we would fire separate queries for both agents and their handlers. That is what we need to avoid since this does not scale at all. Instead, we do the following:

Rails

If we now iterate over such a collection of agents—discounting that we haven't limited the number of records returned for now—we'll end up with two queries instead of possibly a gazillion. 

SQL

This one agent in the list has two handlers, and when we now ask the agent object for its handlers, no additional database queries need to be fired. We can take this a step further, of course, and eager load multiple associated table records. If we needed to load not only handlers but also the agent’s associated missions for whatever reason, we could use includes like this.

Rails

Simple! Just be careful about using singular and plural versions for the includes. They depend on your model associations. A has_many association uses plural, while a belongs_to or a has_one needs the singular version, of course. If you need, you can also tuck on a where clause for specifying additional conditions, but the preferred way of specifying conditions for associated tables that are eager loaded is by using joins instead. 

One thing to keep in mind about eager loading is that the data that will be added on will be sent back in full to Active Record—which in turn builds Ruby objects including these attributes. This is in contrast to “simply” joining the data, where you will get a virtual result that you can use for calculations, for example, and will be less memory draining than includes.

Joining Tables

Joining tables is another tool that lets you avoid sending too many unnecessary queries down the pipeline. A common scenario is joining two tables with a single query that returns some sort of combined record. joins is just another finder method of Active Record that lets you—in SQL terms—JOIN tables. These queries can return records combined from multiple tables, and you get a virtual table that combines records from these tables. This is pretty rad when you compare that to firing all kinds of queries for each table instead. There are a few different kinds of data overlap you can get with this approach. 

The inner join is the default modus operandi for joins. This matches all the results that match a certain id and its representation as a foreign key from another object or table. In the example below, put simply: give me all missions where the mission’s id shows up as mission_id in an agent’s table. "agents"."mission_id" = "missions"."id". Inner joins exclude relationships that don’t exist.

Rails

SQL

So we are matching missions and their accompanying agents—in a single query! Sure, we could get the missions first, iterate over them one by one, and ask for their agents. But then we would go back to our dreadful “N + 1 query problem”. No, thank you! 

What’s also nice about this approach is that we won’t get any nil cases with inner joins; we only get records returned that match their ids to foreign keys in associated tables. If we need to find missions, for example, that lack any agents, we would need an outer join instead. Since this currently involves writing your own OUTER JOIN SQL, we will look into this in the last article. Back to standard joins, you can also join multiple associated tables, of course.

Rails

And you can add onto these some where clauses to specify your finders even more. Below, we are looking only for missions that are executed by James Bond and only the agents that belong to the mission 'Moonraker' in the second example.

SQL

Rails

SQL

With joins, you also have to pay attention to singular and plural use of your model associations. Because my Mission class has_many :agents, we can use the plural. On the other hand, for the Agent class belongs_to :mission, only the singular version works without blowing up. Important little detail: the where part is simpler. Since you are scanning for multiple rows in the table that fulfill a certain condition, the plural form always makes sense.

Scopes

Scopes are a handy way to extract common query needs into well-named methods of your own. That way they are a bit easier to pass around and also possibly easier to understand if others have to work with your code or if you need to revisit certain queries in the future. You can define them for single models but use them for their associations as well. 

The sky is the limit really—joins, includes, and where are all fair game! Since scopes also return ActiveRecord::Relations objects, you can chain them and call other scopes on top of them without hesitation. Extracting scopes like that and chaining them for more complex queries is very handy and makes longer ones all the more readable. Scopes are defined via the “stabby lambda” syntax:

Rails

SQL

As you can see from the example above, finding James Bond is much nicer when you can just chain scopes together. That way you can mix and match various queries and stay DRY at the same time. If you need scopes via associations, they are at your disposal as well:

You can also redefine the default_scope for when you are looking at something like Mission.all.

SQL

Aggregations

This section is not so much advanced in terms of the understanding involved, but you will need them more often than not in scenarios that can be considered a bit more advanced than your average finder—like .all, .first, .find_by_id or whatever. Filtering based on basic calculations, for example, is most likely something that newbies don’t get in touch with right away. What are we looking at exactly here?

  • sum
  • count
  • minimum
  • maximum
  • average

Easy peasy, right? The cool thing is that instead of looping through a returned collection of objects to do these calculations, we can let Active Record do all this work for us and return these results with the queries—in one query preferably. Nice, huh?

  • count

Rails

SQL

  • average

Rails

SQL

Since we now know how we can make use of joins, we can take this one step further and only ask for the average of gadgets the agents have on a particular mission, for example.

Rails

SQL

Grouping these average number of gadgets by missions' names becomes trivial at that point. See more about grouping below:

Rails

SQL

  • sum

Rails

SQL

  • maximum

Rails

SQL

  • minimum

Rails

SQL

Attention!

All of these aggregation methods are not letting you chain on other stuff—they are terminal. The order is important to do calculations. We don’t get an ActiveRecord::Relation object back from these operations, which makes the music stop at that point—we get a hash or numbers instead. The examples below won’t work:

Rails

Grouped

If you want the calculations broken down and sorted into logical groups, you should make use of a GROUP clause and not do this in Ruby. What I mean by that is you should avoid iterating over a group which produces potentially tons of queries.

Rails

SQL

This example finds all the agents that are grouped to a particular mission and returns a hash with the calculated average number of gadgets as its values—in a single query! Yup! The same goes for the other calculations as well, of course. In this case, it really makes more sense to let SQL do the work. The number of queries we fire for these aggregations is just too important.

Dynamic Finders

For every attribute on your models, say name, email_addressfavorite_gadget and so on, Active Record lets you use very readable finder methods that are dynamically created for you. Sounds cryptic, I know, but it doesn’t mean anything other than find_by_id or find_by_favorite_gadget. The find_by part is standard, and Active Record just tucks on the name of the attribute for you. You can even get to add an ! if you want that finder to raise an error if nothing can be found. The sick part is, you can even chain these dynamic finder methods together. Just like this:

Rails

SQL

Of course you can go nuts with this, but I think it loses its charm and usefulness if you go beyond two attributes:

Rails

SQL

In this example, it is nevertheless nice to see how it works under the hood. Every new _and_ adds an SQL AND operator to logically tie the attributes together. Overall, the main benefit of dynamic finders is readability—tucking on too many dynamic attributes loses that advantage quickly, though. I rarely use this, maybe mostly when I play around in the console, but it’s definitely good to know that Rails offers this neat little trickery.

Specific Fields

Active Record gives you the option to return objects that are a bit more focused about the attributes they carry. Usually, if not specified otherwise, the query will ask for all the fields in a row via * (SELECT  "agents".*), and then Active Record builds Ruby objects with the complete set of attributes. However, you can select only specific fields that should be returned by the query and limit the number of attributes your Ruby objects need to “carry around”.

Rails

SQL

Rails

SQL

As you can see, the objects returned will just have the selected attributes, plus their ids of course—that is a given with any object. It makes no difference if you use strings, as above, or symbols—the query will be the same.

Rails

A word of caution: If you try to access attributes on the object that you haven’t selected in your queries, you will receive a MissingAttributeError. Since the id will be automatically provided for you anyway, you can ask for the id without selecting it, though.

Custom SQL

Last but not least, you can write your own custom SQL via find_by_sql. If you are confident enough in your own SQL-Fu and need some custom calls to the database, this method might come in very handy at times. But this is another story. Just don’t forget to check for Active Record wrapper methods first and avoid reinventing the wheel where Rails tries to meet you more than halfway.

Rails

Unsurprisingly, this results in:

SQL

Since scopes and your own class methods can be used interchangeably for your custom finder needs, we can take this one step further for more complex SQL queries. 

Rails

We can write class methods that encapsulate the SQL inside a Here document. This lets us write multi-line strings in a very readable fashion and then store that SQL string inside a variable which we can reuse and pass into find_by_sql. That way we don’t plaster tons of query code inside the method call. If you have more than one place to use this query, it’s DRY as well.

Since this is supposed to be newbie-friendly and not an SQL tutorial per se, I kept the example very minimalistic for a reason. The technique for way more complex queries is quite the same, though. It’s easy to imagine having a custom SQL query in there that stretches beyond ten lines of code. 

Go as nuts as you need to—reasonably! It can be a life saver. A word about the syntax here. The SQL part is just an identifier here to mark the beginning and end of the string. I bet you won’t need this method all that much—let’s hope! It definitely has its place, and Rails land wouldn’t be the same without it—in the rare cases that you will absolutely want to fine-tune your own SQL with it.

Final Thoughts

I hope you got a bit more comfortable writing queries and reading the dreaded ol’ raw SQL. Most of the topics we covered in this article are essential for writing queries that deal with more complex business logic. Take your time to understand these and play around a bit with queries in the console. 

I’m pretty sure that when you leave tutorial land behind, sooner or later your Rails cred will rise significantly if you work on your first real-life projects and need to craft your own custom queries. If you are still a bit shy of the topic, I’d say simply have fun with it—it really is no rocket science!


by Ed Wassermann via Envato Tuts+ Code

No comments:

Post a Comment