How do ORMs, like Active Record, generate SQL code?

I’d like to know how ORMs in Ruby transform a proc/lambda like this:

{ (id > 1) & (created_at < Time.now) }

into something like " id > 1 and created_at < '2018-01-25' ".

2 answers

3

What is an ORM?

Simply put, an ORM (object-relational mapping) is a technique for mapping database operations to objects in a programming language. One of the features of an ORM, as you mentioned, is converting queries written in a DSL of the host language into an SQL dialect.

Active Record

That’s why in Active Record (the standard Rails ORM):

Car.find_by(name: 'Opala')

translates to

SELECT * FROM cars WHERE name = 'Opala' LIMIT 1

Behind the scenes...

And how does that work? Well, let’s look at the code of #find_by:

def find_by(arg, *args)
  where(arg, *args).take
rescue ::RangeError
  nil
end

It calls a method named #where. This is interesting, because it suggests that several SQL keywords have Ruby counterparts. Let’s go deeper, to the code of #where.

def where(opts = :chain, *rest)
  if :chain == opts
    WhereChain.new(spawn)
  elsif opts.blank?
    self
  else
    spawn.where!(opts, *rest)
  end
end

Don’t be scared by those ifs. They exist because #where can be used in various ways, for example:

User.where.not(name: "Jon")

However, the form of #where that we are using here takes a Hash, with the argument { name: 'Opala' }. So it falls into the else branch and calls the method #where!. Let’s go to the code!

def where!(opts, *rest) # :nodoc:
  opts = sanitize_forbidden_attributes(opts)
  references!(PredicateBuilder.references(opts)) if Hash === opts
  self.where_clause += where_clause_factory.build(opts, rest)
  self
end

Now it seems that we are getting to something a little more "low level", closer to where a string of SQL code is actually assembled, which after all is the final product of an ORM. Here is what the method #where! does:

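  • Checks the input for forbidden attributes (sanitize_forbidden_attributes rejects request parameters that were not permitted); quoting of values, which is what guards against syntax errors and SQL injection, happens later, when the SQL is generated
  • Adds a clause to the query, which is kept in memory in the relation's where_clause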
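
To make the #where / #where! split more concrete, here is a minimal sketch of a chainable relation that accumulates conditions in memory and only produces SQL when asked. TinyRelation and its methods are hypothetical names for illustration, not Active Record's real classes, and the quoting is deliberately naive.

```ruby
# Hypothetical, simplified stand-in for a relation; not Active Record's real class.
class TinyRelation
  attr_reader :table, :conditions

  def initialize(table, conditions = [])
    @table = table
    @conditions = conditions.freeze
  end

  # Like #where: returns a new relation and leaves the receiver untouched.
  def where(opts = {})
    return self if opts.empty?
    spawn.where!(opts)
  end

  # Like #where!: mutates the receiver by appending the new conditions.
  def where!(opts)
    @conditions = (@conditions + opts.to_a).freeze
    self
  end

  # SQL is only assembled here, from the conditions stored in memory.
  def to_sql
    sql = "SELECT * FROM #{table}"
    return sql if conditions.empty?
    clauses = conditions.map { |column, value| "#{column} = #{quote(value)}" }
    "#{sql} WHERE #{clauses.join(' AND ')}"
  end

  private

  # Copy of the current relation, so chaining never changes the original.
  def spawn
    self.class.new(table, conditions)
  end

  # Naive quoting for the sketch: strings get single quotes, doubled inside.
  def quote(value)
    value.is_a?(String) ? "'#{value.gsub("'", "''")}'" : value.to_s
  end
end

cars = TinyRelation.new("cars")
opalas = cars.where(name: "Opala").where(year: 1975)
puts opalas.to_sql
puts cars.to_sql
```

Each call to where returns a fresh object, which is why a relation can be reused as the base for several different queries.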

The interesting part of this method is the call where_clause_factory.build(opts, rest). Here is where the factory comes from:

def where_clause_factory
  @where_clause_factory ||= Relation::WhereClauseFactory.new(klass, predicate_builder)
end

(If you do not know Ruby's "or equals" operator, ||=, it assigns the right-hand side only when the variable is still nil or false, so the factory is created once and then reused.)
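
A small sketch of that memoization idiom in isolation may help. The Report class and build_factory method are made-up names, used only to count how many times the right-hand side actually runs.

```ruby
# Hypothetical class that only demonstrates the ||= memoization idiom.
class Report
  attr_reader :builds

  def initialize
    @builds = 0
  end

  # The right-hand side runs only while @factory is nil (or false).
  def factory
    @factory ||= build_factory
  end

  private

  def build_factory
    @builds += 1
    Object.new
  end
end

report = Report.new
first = report.factory
second = report.factory
puts first.equal?(second) # both calls return the very same object
puts report.builds        # the factory was built a single time
```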

The #where! method calls Relation::WhereClauseFactory#build. That WHERE clause factory analyzes the input, which in our case is { name: 'Opala' }, to build the query.

In Active Record, this input can come in many ways, such as:

  • Table field filters. For example: { name: 'Opala' }
  • Range filters. For example: { year: 1999...2000 }
  • Table relationship filters. For example: { pilot: { id: 1 } }
  • And others...

This factory produces an instance of WhereClause, which holds the conditions of the WHERE and to which conditions can be added or removed.
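
As a rough illustration of how those input shapes could be turned into conditions, here is a sketch of a tiny predicate builder. TinyPredicates is a hypothetical helper, not Active Record's PredicateBuilder; in particular, it uses the nested key directly as a table name, whereas the real library resolves it through the association.

```ruby
# Hypothetical helper for illustration; not Active Record's PredicateBuilder.
module TinyPredicates
  module_function

  # Turns a conditions hash into a flat list of SQL predicate strings.
  def build(table, conditions)
    conditions.flat_map do |key, value|
      case value
      when Hash
        # Nested hash: the key is treated as another table name (simplification).
        build(key, value)
      when Range
        range_predicates("#{table}.#{key}", value)
      else
        ["#{table}.#{key} = #{quote(value)}"]
      end
    end
  end

  # An exclusive range (a...b) uses <, an inclusive one (a..b) uses <=.
  def range_predicates(column, range)
    upper_operator = range.exclude_end? ? "<" : "<="
    ["#{column} >= #{quote(range.begin)}",
     "#{column} #{upper_operator} #{quote(range.end)}"]
  end

  # Naive quoting for the sketch: strings get single quotes, doubled inside.
  def quote(value)
    value.is_a?(String) ? "'#{value.gsub("'", "''")}'" : value.to_s
  end
end

predicates = TinyPredicates.build(:cars, name: "Opala", year: 1999...2000, pilot: { id: 1 })
puts predicates.join(" AND ")
```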

The bilingual one: building the query

So far, we’ve only spoken Ruby and Active Record.

After building the whole query tree, held in memory in classes such as WhereClause, Active Record calls an "SQL query builder" called Arel.

Arel’s role is to speak Ruby and SQL at the same time: it takes Ruby method calls and translates them into SQL (by its own definition, Arel is an SQL AST manager, where AST stands for Abstract Syntax Tree).

The concept of an AST is used by compilers and interpreters as an abstract representation of code, or of any other syntactic structure, such as a sentence in good old Portuguese.

A mathematical example (source):

5 * 3 + (4 + 2 % 2 * 8)

     +
    / \
   /   \
  *     +
 / \   / \
5   3 4   *
         / \
        %   8
       / \
      2   2
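
The same tree can be sketched in Ruby with two small node types, one for numbers and one for binary operations. Num and BinOp are hypothetical names chosen for this illustration; evaluating the tree walks it from the leaves up.

```ruby
# Hypothetical node types for illustration.
Num = Struct.new(:value) do
  def evaluate
    value
  end

  def to_s
    value.to_s
  end
end

BinOp = Struct.new(:operator, :left, :right) do
  # Evaluate both children first, then apply this node's operator.
  def evaluate
    left.evaluate.public_send(operator, right.evaluate)
  end

  # Fully parenthesized rendering, so the tree structure stays visible.
  def to_s
    "(#{left} #{operator} #{right})"
  end
end

# 5 * 3 + (4 + 2 % 2 * 8), built by hand in the same shape as the drawing.
tree = BinOp.new(:+,
  BinOp.new(:*, Num.new(5), Num.new(3)),
  BinOp.new(:+, Num.new(4),
    BinOp.new(:*, BinOp.new(:%, Num.new(2), Num.new(2)), Num.new(8))))

puts tree.to_s     # ((5 * 3) + (4 + ((2 % 2) * 8)))
puts tree.evaluate # 19
```

An SQL AST works the same way, except that the final walk over the tree emits SQL text instead of computing a number.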

Here is an example of Arel building queries:

users = Arel::Table.new(:users)

# Two equivalent ways of building the same condition:
select_manager = users.project(Arel.star).where(users[:id].eq(23).or(users[:id].eq(42)))
select_manager = users.project(Arel.star).where(users[:id].eq_any([23, 42]))
select_manager.to_sql
# => SELECT * FROM "users" WHERE ("users"."id" = 23 OR "users"."id" = 42)

Active Record takes its own classes (WhereChain, WhereClause, GroupClause) and converts them into Arel calls, "spitting out" complex queries built with simple, easy, Rails-like DSLs.

0

Actually, my question was a different one; I was not asking how the query is created. My question was basically this:

How is an expression like { (id > 1) & (created_at < Time.now) }, which is basically a lambda without parameters, transformed into a string such as: id > 1 and created_at < current_date

knowing that id and created_at are defined only within that block.

Answer

In this case, the block is run with instance_exec against an object of a helper class. Undefined methods (such as id and created_at) are handled dynamically, and so are their comparison operators, so that they end up returning a string.

Keep in mind that instance_exec takes the lambda and runs it with self set to that object, so only the methods defined for that object are available. That is, if you put a print inside that block, it will not behave as usual (typically there will be an error), because the function has not been defined there.
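
A minimal sketch of that technique follows. SqlExpression, QueryContext and where_sql are hypothetical names; this shows the general idea (instance_exec plus method_missing plus overloaded operators) and is not Sequel's actual code. Deriving the context from BasicObject is an assumption of the sketch, made so that almost no ordinary methods are in the way.

```ruby
# Hypothetical classes; a sketch of the technique, not Sequel's implementation.
class SqlExpression
  attr_reader :sql

  def initialize(sql)
    @sql = sql
  end

  # Comparison operators return a new expression instead of true/false.
  %i[> < >= <=].each do |operator|
    define_method(operator) do |other|
      SqlExpression.new("#{sql} #{operator} #{SqlExpression.literal(other)}")
    end
  end

  # & combines two expressions with AND.
  def &(other)
    SqlExpression.new("(#{sql}) AND (#{other.sql})")
  end

  # Naive conversion of Ruby values to SQL literals, for the sketch only.
  def self.literal(value)
    case value
    when SqlExpression then value.sql
    when Time then "'#{value.strftime('%Y-%m-%d')}'"
    when String then "'#{value.gsub("'", "''")}'"
    else value.to_s
    end
  end
end

# Inside the block, self is an instance of this class, so a bare word such
# as id or created_at is an undefined method and lands in method_missing.
class QueryContext < BasicObject
  def method_missing(name, *args)
    return super unless args.empty?
    ::SqlExpression.new(name.to_s)
  end
end

def where_sql(&block)
  QueryContext.new.instance_exec(&block).sql
end

puts where_sql { (id > 1) & (created_at < Time.now) }
```

The block is never evaluated as a boolean test: each operator call just builds a bigger piece of SQL text, and the last expression of the block is the result.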

Basically, what I can conclude from all this is that the Sequel ORM did it this way just to show that they know the language well, because they complicated what should be a simple process: writing a string.
