Last Update: 2022-10-21 09:44:05 -0700

Virtual Row Blocks

Dataset methods where, order, and select all take blocks that are referred to as virtual row blocks. Many other dataset methods pass the blocks they are given into one of those three methods, so there are actually many Sequel::Dataset methods that take virtual row blocks.

Why Virtual Rows

Virtual rows offer a less verbose way to express many queries. For example, by default if you want to express an inequality filter in Sequel, you can do:

dataset.where(Sequel[:a] > Sequel.function(:b, :c))
# WHERE (a > b(c))

With virtual rows, you can use the less verbose:

dataset.where{a > b(c)}
# WHERE (a > b(c))

Regular Procs vs Instance Evaled Procs

Virtual row blocks behave differently depending on whether the block accepts an argument. If the block accepts an argument, it is called with an instance of Sequel::SQL::VirtualRow. If it does not accept an argument, it is evaluated in the context of an instance of Sequel::SQL::VirtualRow.

ds = DB[:items]
# Regular block
ds.where{|o| o.column > 1}
# WHERE (column > 1)

# Instance-evaled block
ds.where{column > 1}
# WHERE (column > 1)

If you aren’t familiar with the difference between regular blocks and instance evaled blocks, inside regular blocks methods called without an explicit receiver call the method on the receiver in the surrounding scope, while instance evaled blocks call the method on the receiver of the instance_eval call (the Sequel::SQL::VirtualRow instance in this case).

in both cases, local variables available in the surrounding scope will be available inside the block. However, instance variables in the surrounding scope will not be available inside the block if using an instance evaled block, and methods called without an explicit receiver inside an instance evaled block will not call methods in the surrounding scope. For example:

def self.a
b = 32
@d = 100

# Regular block
ds.where{|o| o.c > a - b + @d}
# WHERE (c > 110)

# Instance-evaled block
ds.where{c > a - b + @d}
# WHERE (c > ((a - 32) + NULL))

There are three related differences here:

  • Regular blocks use o.c instead of just c

  • a results in 42 in the regular block, but creates an expression object in the instance evaled block

  • @d results in 100 in the regular block, but nil in the instance evaled block

In the regular block, you need to call c with an explicit receiver (the virtual row block argument), while in the instance evaled block c can be called directly, as the default receiver has changed inside the block.

For a, note how ruby calls the method on the receiver of the surrounding scope in the regular block, which returns an integer, and does the subtraction before Sequel gets access to it. In the instance evaled block, calling a without a receiver calls the a method on the VirtualRow instance. For @d, note that in a regular block, the value hasn’t changed, but in the instance evaled block, instance variable access returns nil. For b, note that it operates the same in both cases, as it is a local variable.

The choice for whether to use a regular block or an instance evaled block is up to you. The same things can be accomplished with both. Instance evaled block tend to produce shorter code, but by modifying the scope can be more difficult to understand.

If you are not sure which to use, use instance evaled blocks unless you need to call methods or access instance variables of the surrounding scope inside the block.

Local Variables vs Method Calls

If you have a method that accepts 0 arguments and has the same name as a local variable, you can call it with () to differentiate the method call from the local variable access. This is mostly useful in instance evaled blocks:

b = 32
ds.where{b() > b}
# WHERE b > 32

It’s also possible to use an explicit self receiver in instance evaled blocks:

b = 32
ds.where{self.b > b}
# WHERE b > 32

VirtualRow Methods

VirtualRow is a class that returns SQL::Identifiers or SQL::Functions depending on how it is called.

SQL::Identifiers - Regular columns

SQL::Identifiers can be thought of as regular column references in SQL, not qualified by any table. You get an SQL::Identifier if the method is called without arguments:

ds.where{|o| o.column > 1}
ds.where{column > 1}
# WHERE (column > 1)

SQL::QualifiedIdentifiers - Qualified columns

You can qualified identifiers by calling [] on an identifier:

ds.where{|o| o.table[:column] > 1}
ds.where{table[:column] > 1}
# WHERE table.column > 1

SQL::Functions - SQL function calls

SQL::Functions can be thought of as function calls in SQL. You get a simple function call if you call a method with arguments:

ds.where{|o| o.function(1) > 1}
ds.where{function(1) > 1}
# WHERE function(1) > 1

To call a SQL function with multiple arguments, just use those arguments in your function call:

ds.where{|o| o.function(1, o.a) > 1}
ds.where{function(1, a) > 1}
# WHERE function(1, a) > 1

If the SQL function does not accept any arguments, create an identifier, then call the function method on it to produce a function:{|o| o.version.function}{version.function}
# SELECT version()

To use the SQL wildcard (*) as the sole argument in a function call, create a function without arguments, then call the * method on the function:{|o| o.count.function.*}{count.function.*}
# SELECT count(*)

To append the DISTINCT keyword before the method arguments, just call the distinct method on the returned Function:{|o| o.count(o.col1).distinct}{count(col1).distinct}
# SELECT count(DISTINCT col1){|o| o.count(o.col1, o.col2).distinct}{count(col1, col2).distinct}
# SELECT count(DISTINCT col1, col2)

SQL::Functions with windows - SQL window function calls

To create a window function call, just call the over method on the Function object returned, with the options for the window:{|o| o.rank.function.over}{rank.function.over}
# SELECT rank() OVER (){|o| o.count.function.*.over}{count.function.*.over}
# SELECT count(*) OVER (){|o| o.sum(o.col1).over(partition: o.col2, order: o.col3)}{sum(col1).over(partition: col2, order: col3)}
# SELECT sum(col1) OVER (PARTITION BY col2 ORDER BY col3)


VirtualRows use method_missing to handle almost all method calls. Since the objects given by method_missing are SQL::Identifiers or SQL::Functions, you can use all operators that they provide (see DatasetFiltering):{|o| o.price - 100}{price - 100}
# SELECT (price - 100)

ds.where{|o| (o.price < 200) & ( * 100 >= 23)}
ds.where{(price < 200) & (tax * 100 >= 0.23)}
# WHERE ((price < 200) AND ((tax * 100) >= 0.23))

However, VirtualRows have special handling of some operator methods to make certain things easier. The operators all use a prefix form.

Math Operators

The standard +, -, *, and / mathematical operators are defined:{|o| o.-(1, o.a).as(b)}{self.-(1, a).as(b)}
# SELECT (1 - a) AS b

Boolean Operators

The & and | methods are defined to use AND and OR:

ds.where{|o| o.&({a: :b}, :c)}
ds.where{self.&({a: :b}, :c)}
# WHERE ((a = b) AND c)

The ~ method is defined to do inversion:

ds.where{|o| o.~({a: 1, b: 2})}
ds.where{self.~({a: 1, b: 2})}
# WHERE ((a != 1) OR (b != 2))

Inequality Operators

The standard >, <, >=, and <= inequality operators are defined:

ds.where{|o| o.>(1, :c)}
ds.where{self.>(1, :c)}
# WHERE (1 > c)

Returning multiple values

It’s common when using select and order virtual row blocks to want to return multiple values. If you want to do that, you just need to return an array:{|o| [o.column1, o.sum(o.column2).as(o.sum)]}{[column1, sum(column2).as(sum)]}
# SELECT column1, sum(column2) AS sum

Note that if you forget the array brackets, you’ll end up with a syntax error:

# Invalid ruby syntax{|o| o.column1, o.sum(o.column2).as(o.sum)}{column1, sum(column2).as(sum)}

Split symbols

Note that if you turn on symbol splitting for backwards compatibility, Sequel will split virtual row methods with double underscores and return them as qualified identifiers:

Sequel.split_symbols = true
ds.where{|o| o.table__column}
WHERE table.column

It’s not recommended that you rely on this, it’s better to convert the calls to the recommended form:

ds.where{|o| o.table[:column]}