Last Update: 2024-04-04 14:06:41 -0700

The provenance dataset extension tracks the locations of all dataset clones that resulted in the current dataset, and includes the information as a comment in the dataset’s SQL. This makes it possible to see how a query was built, which can aid debugging. Example:

  DB[:table].
    select(:a).
    where{b > 10}.
    order(:c).
    limit(10)
  # SQL:
  # SELECT a FROM table WHERE (b > 10) ORDER BY c LIMIT 10 --
  #  -- Dataset Provenance
  #  -- Keys:[:from] Source:(eval at bin/sequel:257):2:in `<main>'
  #  -- Keys:[:select] Source:(eval at bin/sequel:257):3:in `<main>'
  #  -- Keys:[:where] Source:(eval at bin/sequel:257):4:in `<main>'
  #  -- Keys:[:order] Source:(eval at bin/sequel:257):5:in `<main>'
  #  -- Keys:[:limit] Source:(eval at bin/sequel:257):6:in `<main>'

In the above example, the source of each clone is obvious, so the comment adds little. In real applications, where a dataset can be built up across multiple files, seeing where each dataset clone was made can be helpful.

The Source listed will skip locations in the Ruby standard library as well as Sequel itself. Other locations can be skipped by providing a Database :provenance_caller_ignore Regexp option:

  DB.opts[:provenance_caller_ignore] = /\/gems\/library_name-/
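
To get a feel for which locations such a Regexp would skip, the following minimal sketch applies the same pattern to a few made-up backtrace lines and picks the first one that does not match. It is not Sequel's implementation: the helper name first_unignored_source, the constant SAMPLE_BACKTRACE, and every path in it are hypothetical and exist only for illustration.

```ruby
# Hypothetical backtrace entries, invented for this illustration.
SAMPLE_BACKTRACE = [
  "/usr/lib/ruby/gems/3.3.0/gems/library_name-1.2.0/lib/library_name/query.rb:14:in `build'",
  "/app/models/report.rb:42:in `recent'",
  "/app/controllers/reports_controller.rb:9:in `index'"
].freeze

# Return the first entry that the ignore pattern does not match.
# A simplified stand-in for skipping ignored locations; an assumption,
# not the code Sequel actually runs.
def first_unignored_source(backtrace, ignore)
  backtrace.find { |line| !ignore.match?(line) }
end

# Same pattern as in the option above.
ignore = /\/gems\/library_name-/

puts first_unignored_source(SAMPLE_BACKTRACE, ignore)
```

In this sketch the entry inside the library_name gem is skipped, so the reported source is the first application-level line. Anchoring the pattern on the gem directory name followed by a hyphen keeps it from matching unrelated paths that merely contain the library's name.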

Related module: Sequel::Dataset::Provenance