AS400 Support and SQL Parsing

Earlier this week, I added an AS400 subadapter to the JDBC adapter, based on a request from bhauff on IRC. It’s pretty basic, and probably would fail most of the integration tests (which rely on insert returning an autogenerated primary key value), but it will get the job done for most cases.

The only intersting part of the adapter is how it handles limits and offsets. Similar to Microsoft SQL Server, AS400 does not have native support for offsets, but unlike MSSQL, it also lacks support for limits (MSSQL uses TOP for limits). Offset support isn’t critical in Sequel, since Sequel only uses it when you request it, but limit(1) is used all the time internally whenever Sequel knows it only wants a single row. For that reason, a large part of Sequel breaks if you don’t have working limit support, so fixing limit was the highest priority.

Having no experience with AS400, I wouldn’t have known how to implement this support, but bhauff showed how it was handled by the relevent ActiveRecord adapter, and with a little help from me he had a working port:

 def select_limit_sql(sql)
   if @opts[:limit]
     l = @opts[:limit]
     of = @opts[:offset] || 0
     if o = if @opts[:order]
       order_by = "ORDER BY " << o.gsub(/\b\w+\.\b/," ")
     end
     s = "SELECT B.* FROM (SELECT A.*, row_number() over" +
         "(#{order_by}) AS b$a FROM (SELECT"
     sql.gsub!(/SELECT/i, s)
     sql << ") A ) B WHERE B.b$a > #{of} AND B.b$a <= #{l + of}"
   end
 end

As I explained to him, that’s not the way Sequel handles things, as it relies on parsing SQL with regular expressions. Let’s examine the problems with this code. For one, it assumes the :order option is a string, which is probably a valid assumption in ActiveRecord, but not in Sequel. That’s not the major issue though. Because the code uses sql.gsub!, any query with a subselect, or even queries that have a SELECT as text inside a string value, will be broken.

The Sequel way, used originally in the MSSQL shared adapter for emulated offset support, is to basically prepare a new dataset without a limit or offset and use the standard dataset literalization to create the SQL:

def select_sql
  return super unless l = @opts[:limit]
  o = @opts[:offset] || 0
  order = @opts[:order]
  dsa1 = dataset_alias(1)
  dsa2 = dataset_alias(2)
  rn = row_number_column
  irn = Sequel::SQL::Identifier.new(rn).qualify(dsa2)
  subselect_sql(unlimited.
   from_self(:alias=>dsa1).
   select_more(Sequel::SQL::QualifiedIdentifier.new(dsa1, WILDCARD),
   Sequel::SQL::WindowFunction.new(SQL::Function.new(:ROW_NUMBER),
    Sequel::SQL::Window.new(:order=>order)).as(rn)).
  from_self(:alias=>dsa2).
  select(Sequel::SQL::QualifiedIdentifier.new(dsa2, WILDCARD)).
  where((irn > o) & (irn <= l + o)))
end

This produces roughly the same query as the SQL parsing hack, but it is safe to use, and doesn’t get confused by subselects or string values containing SELECT. Basically, the limit and offset are saved, and a new dataset is created without the limit (using unlimited), then it is made into a subselect using from_self (twice!), then the relevent fields are selected, and a ROW_NUMBER window function is used and filtering is based on that to get the correct number of records at the correct offset.

Adding AS400 support brought to mind a recent tweet by Charles Nutter. Maybe now that Sequel has added AS400 support, it can be considered mature?