Last Update: 2016-10-13 11:56:48 -0700

Virtual Row Blocks

Dataset methods where, order, and select all take blocks that are referred to as virtual row blocks. Many other dataset methods pass the blocks they are given into one of those three methods, so there are actually many Sequel::Dataset methods that take virtual row blocks.

Why Virtual Rows

Virtual Rows were created to work around the issue that some parts of Sequel's standard DSL could not be used on ruby 1.9. For example, the following Sequel code historically worked on ruby 1.8, but not ruby 1.9:

dataset.where(:a > :b[:c])
# WHERE a > b(c)

This code does not work on ruby 1.9 for two reasons. First, Symbol#> (like other inequality methods) is already defined in ruby 1.9, so Sequel does not override it to return an SQL inequality expression. Second, Symbol#[] is already defined on ruby 1.9, so Sequel does not override it to return an SQL function expression.

It's possible to use Sequel's DSL to represent such expressions, but it is a little verbose:

dataset.where(Sequel[:a] > Sequel.function(:b, :c))
# WHERE a > b(c)

The virtual row DSL makes such code more concise:

dataset.where{a > b(c)}

Regular Procs vs Instance Evaled Procs

Virtual row blocks behave differently depending on whether the block accepts an argument. If the block accepts an argument, it is called with an instance of Sequel::SQL::VirtualRow. If it does not accept an argument, it is evaluated in the context of an instance of Sequel::SQL::VirtualRow.

ds = DB[:items]
# Regular proc
ds.where{|o| o.column > 1}
# WHERE column > 1

# Instance-evaled proc
ds.where{column > 1}
# WHERE column > 1

If you aren't familiar with the difference between regular blocks and instance evaled blocks, you should probably consult a general ruby reference, but briefly, with regular procs, methods called without an explicit receiver inside the proc call the method on the receiver in the surrounding scope, while instance evaled procs call the method on the receiver of the instance_eval call. However, in both cases, local variables available in the surrounding scope will be available inside the proc. If that doesn't make sense, maybe this example will help:

def self.a
b = 32

# Regular proc
ds.where{|o| o.c > a - b}
# WHERE c > 10

# Instance-evaled proc
ds.where{c > a - b}
# WHERE c > (a - 32)

There are two related differences here. First is the usage of o.c vs c, and second is the difference between the use of a. In the regular proc, you couldn't call c without an explicit receiver in the proc, unless the self of the surrounding scope responded to it. For a, note how ruby calls the method on the receiver of the surrounding scope in the regular proc, which returns an integer, and does the subtraction before Sequel gets access to it. In the instance evaled proc, calling a without a receiver calls the a method on the VirtualRow instance. For b, note that it operates the same in both cases, as it is a local variable.

Basically, the choice for whether to use a regular proc or an instance evaled proc is completely up to you. The same things can be accomplished with both. Instance evaled procs tend to produce shorter code, but by modifying the scope can be more difficult for a new user to understand. That being said, I usually use instance evaled procs unless I need to call methods on the receiver of the surrounding scope inside the proc.

Local Variables vs Method Calls

If you have a method that accepts 0 arguments and has the same name as a local variable, you can call it with () to differentiate the method call from the local variable access. This is mostly useful in instance_evaled procs:

b = 32
ds.where{b() > b}
# WHERE b > 32

VirtualRow Methods

VirtualRow is a class that returns SQL::Identifiers, SQL::QualifiedIdentifiers, or SQL::Functions depending on how it is called.

SQL::Identifiers - Regular columns

SQL::Identifiers can be thought of as regular column references in SQL, not qualified by any table. You get an SQL::Identifier if the method is called without a block or arguments, and doesn't have a double underscore in the method name:

ds.where{|o| o.column > 1}
ds.where{column > 1}
# WHERE column > 1

SQL::QualifiedIdentifiers - Qualified columns

SQL::QualifiedIdentifiers can be thought of as column references in SQL that are qualified to a specific table. You get an SQL::QualifiedIdentifier if the method is called without a block or arguments, and has a double underscore in the method name:

ds.where{|o| o.table__column > 1}
ds.where{table__column > 1}
# WHERE table.column > 1

Using the double underscore for SQL::QualifiedIdentifiers was done to make usage very similar to using symbols, which by default also translate the double underscore into a qualified column.

Note that when Sequel.split_symbols = false is used, then virtual rows do not split symbols either. If you need to create qualified identifers when disabling symbol splitting, it is easy to do by calling [] on the SQL::Identifiers returned by regular methods calls:

ds.where{|o| o.table[:column] > 1}
ds.where{table[:column] > 1}
# WHERE table.column > 1

SQL::Functions - SQL function calls

SQL::Functions can be thought of as function calls in SQL. You get a simple function call if you call a method with arguments and without a block:

ds.where{|o| o.function(1) > 1}
ds.where{function(1) > 1}
# WHERE function(1) > 1

To call a SQL function with multiple arguments, just use those arguments in your function call:

ds.where{|o| o.function(1, o.a) > 1}
ds.where{function(1, a) > 1}
# WHERE function(1, a) > 1

If the SQL function does not accept any arguments, create an identifier, then call the function method on it to produce a function:{|o| o.version.function}{version.function}
# SELECT version()

To use the SQL wildcard (*) as the sole argument in a function call, create a function without arguments, then call the * method on the function:{|o| o.count.function.*}{count.function.*}
# SELECT count(*)

To append the DISTINCT keyword before the method arguments, just call the distinct method on the returned Function:{|o| o.count(o.col1).distinct}{count(col1).distinct}
# SELECT count(DISTINCT col1){|o| o.count(o.col1, o.col2).distinct}{count(col1, col2).distinct}
# SELECT count(DISTINCT col1, col2)

SQL::Functions with windows - SQL window function calls

Not all databases support window functions, but they are very helpful for certain types of queries. To use them, you should just call the over method on the Function object returned, with the options for the window:{|o| o.rank.function.over}{rank.function.over}
# SELECT rank() OVER (){|o| o.count.function.*.over}{count.function.*.over}
# SELECT count(*) OVER (){|o| o.sum(o.col1).over(:partition=>o.col2, :order=>o.col3)}{sum(col1).over(:partition=>col2, :order=>col3)}
# SELECT sum(col1) OVER (PARTITION BY col2 ORDER BY col3)


VirtualRows use method_missing to handle almost all method calls. Since the objects given by method_missing are SQL::Identifiers, SQL::QualifiedIdentifiers or SQL::Functions, you can use all operators that they provide (see DatasetFiltering):{|o| o.price - 100}{o.price - 100}
# SELECT (price - 100)

ds.where{|o| (o.price < 200) & ( * 100 >= 23)}
ds.where{(price < 200) & (tax * 100 >= 0.23)}
# WHERE ((price < 200) AND ((tax * 100) >= 0.23))

However, VirtualRows have special handling of some operator methods to make certain things easier. The operators all use a prefix form.

Math Operators

The standard +, -, *, and / mathematical operators are defined:{|o| o.-(1, o.a).as(b)}{self.-(1, a).as(b)}
# SELECT (1 - a) AS b

Boolean Operators

The & and | methods are defined to use AND and OR:

ds.where{|o| o.&({:a=>:b}, :c)}
ds.where{self.&({:a=>:b}, :c)}
# WHERE ((a = b) AND c)

The ~ method is defined to do inversion:

ds.where{|o| o.~({:a=>1, :b=>2})}
ds.where{self.~({:a=>1, :b=>2})}
# WHERE ((a != 1) OR (b != 2))

Inequality Operators

The standard >, <, >=, and <= inequality operators are defined:

ds.where{|o| o.>(1, :c)}
ds.where{self.>(1, :c)}
# WHERE (1 > c)

Literal Strings

The backtick operator can be used inside an instance-evaled virtual row block to create a literal string:

ds.where{a > %x`some SQL`}
# WHERE (a > some SQL)

You can use this on a regular virtual row block too, but it doesn't look as nice:

ds.where{|o| o.>(:a, o.%x`('some SQL'))}

Returning multiple values

It's common when using select and order virtual row blocks to want to return multiple values. If you want to do that, you just need to return an array:{|o| [o.column1, o.sum(o.column2).as(o.sum)]}{[column1, sum(column2).as(sum)]}
# SELECT column1, sum(column2) AS sum

Note that if you forget the array brackets, you'll end up with a syntax error:

# Invalid ruby syntax{|o| o.column1, o.sum(o.column2).as(o.sum)}{column1, sum(column2).as(sum)}

Alternative Description of the VirtualRow method call rules

  • If a block is given:

    • The block is currently not called. This may change in a future version.

    • If there are no arguments, an SQL::Function with the name of method used, and no arguments.

    • If the first argument is :*, an SQL::Function is created with a single wildcard argument (*).

    • If the first argument is :distinct, an SQL::Function is created with the keyword DISTINCT prefacing all remaining arguments.

    • If the first argument is :over, the second argument if provided should be a hash of options to pass to SQL::Window. The options hash can also contain :*=>true to use a wildcard argument as the function argument, or :args=>… to specify an array of arguments to use as the function arguments.

  • If a block is not given:

    • If there are arguments, an SQL::Function is returned with the name of the method used and the arguments given.

    • If there are no arguments and the method contains a double underscore, split on the double underscore and return an SQL::QualifiedIdentifier with the table and column.

    • Otherwise, create an SQL::Identifier with the name of the method.