pg_auto_parameterize.rb

lib/sequel/extensions/pg_auto_parameterize.rb
Last Update: 2024-11-04 16:30:04 -0800

This extension changes Sequel’s postgres adapter to automatically parameterize queries by default. Sequel’s default behavior has always been to literalize all arguments unless specifically using parameters (via :$arg placeholders and the Dataset#prepare/call methods). This extension makes Sequel use string, numeric, blob, date, and time types as parameters. Example:

# Default
DB[:test].where(:a=>1)
# SQL: SELECT * FROM test WHERE a = 1

DB.extension :pg_auto_parameterize
DB[:test].where(:a=>1)
# SQL: SELECT * FROM test WHERE a = $1 (args: [1])

Other pg_* extensions that ship with Sequel and add support for PostgreSQL-specific types support automatically parameterizing those types when used with this extension.

This extension is not generally faster than the default behavior. In some cases it is faster, such as when using large strings. However, the use of parameters avoids potential security issues, in case Sequel does not correctly literalize one of the arguments that this extension would automatically parameterize.

There are some known issues with automatic parameterization:

  1. In order to avoid most type errors, the extension attempts to guess the appropriate type and automatically casts most placeholders, except plain Ruby strings (which PostgreSQL treats as an unknown type).

    Unfortunately, if the type guess is incorrect, or a plain Ruby string is used and PostgreSQL cannot determine the data type for it, the query may result in a DatabaseError. To fix both issues, you can explicitly cast values using Sequel.cast(value, type), and Sequel will cast to that type.

  2. PostgreSQL supports a maximum of 65535 parameters per query. Attempts to use a query with more than this number of parameters will result in a Sequel::DatabaseError being raised. Sequel tries to mitigate this issue by turning column IN (int, ...) queries into column = ANY(CAST($ AS int8[])) using an array parameter, to reduce the number of parameters. It also limits inserting multiple rows at once to a maximum of 40 rows per query by default. While these mitigations handle the most common cases where a large number of parameters would be used, there are other cases.

  3. Automatic parameterization will consider the same objects as equivalent when building SQL. However, for performance, it does not perform equality checks. So code such as:

    DB[:t].select{foo('a').as(:f)}.group{foo('a')}
    # SELECT foo('a') AS "f" FROM "t" GROUP BY foo('a')
    

    Will get auto paramterized as:

    # SELECT foo($1) AS "f" FROM "t" GROUP BY foo($2)
    

    Which will result in a DatabaseError, since that is not valid SQL.

    If you use the same expression, it will use the same parameter:

    foo = Sequel.function(:foo, 'a')
    DB[:t].select(foo.as(:f)).group(foo)
    # SELECT foo($1) AS "f" FROM "t" GROUP BY foo($1)
    

    Note that Dataset#select_group and similar methods that take arguments used in multiple places in the SQL will generally handle this automatically, since they will use the same objects:

    DB[:t].select_group{foo('a').as(:f)}
    # SELECT foo($1) AS "f" FROM "t" GROUP BY foo($1)
    

You can work around any issues that come up by disabling automatic parameterization by calling the no_auto_parameterize method on the dataset (which returns a clone of the dataset). You can avoid parameterization for specific values in the query by wrapping them with Sequel.skip_pg_auto_param.

It is likely there are corner cases not mentioned above when using this extension. Users are encouraged to provide feedback when using this extension if they come across such corner cases.

This extension is only compatible when using the pg driver, not when using the sequel-postgres-pr, jeremyevans-postgres-pr, or postgres-pr drivers, as those do not support bound variables.

Related module: Sequel::Postgres::AutoParameterize