Tame complex queries with database views

Montreal.rb 2024-10-02

Accept-Language: en, fr, *;q=0.1

Please ask questions 🙋

(or share your knowledge)

What you'll (hopefully) learn

  1. What are DB views?
  2. When are they useful?
  3. How to use them with Rails

Why are you here talking about views?

I hear you ask

About me

Back to the subject at hand

To recap

Views

  1. Are a way to turn an SQL query into something that quacks like a table
  2. Behave like a subquery from a performance perspective
  3. Can have a standard ActiveRecord::Base in front of them (and be tested in much the same way)
  4. Can be created and managed with the Scenic gem

Materialized views

  1. Are written to disk
  2. Behave like a real table from a performance perspective
  3. Can be indexed
  4. Need periodic refreshing
  5. Can be refreshed concurrently provided there's a unique index

How are they useful?

Example 1

Combine two resources into one cohesive whole

Example 2

Make faceted search logic easier to maintain and reason about

Example 3

Provide useful abstractions across different domains

In general

  1. Views can be a great tool to DRY up your code base by creating an abstraction for complex or duplicated query logic
  2. They can postpone the need for manual denormalization, ETLs, or additional datastores by leveraging what you already have

Real-world example

Mastodon

What are the trade-offs?

Trade-offs

Views

  1. Are only as performant as the queries that define them, their underlying tables, and the planner's ability to sort things out
  2. Can make it very easy to paper over mountains of complexity

Trade-offs

Materialized views

  1. Easier to reason about for reads
  2. Refreshes can be quite taxing and may become a limiting factor
  3. Stale data

💁 pg_ivm seems promising, and there has been talk of a first-party solution.

Trade-offs

Performance?

  1. You are probably not Shopify
  2. Humans are expensive
  3. Measure measure measure

Annoyances

Annoyance 1

You cannot update a view that is referenced by another view without dropping and recreating the downstream views

Annoyance 2

There's nothing to tell you what version is currently running aside from the migration version

Annoyances 3

View definitions are frozen in time. You need to recreate them if a backing table changes.

Extras

Bonus use-case

Rename tables with zero downtime by leveraging updatable views

brandur.org/fragments/postgres-table-rename

Analyzing performance

github.com/dalibo/pev2