Tutorial Instructions

Materialized Views

Materialized views are saved queries that you store in your database as a table. Materialized views can be queried like any other table. Typically materialized views are used for situations where you want to save yourself, or the database, from intensive queries or for data that is frequently used.

Advantages to use materialized views:

For performance. Often a materialized view is a better response time than a join, or a summation statement, since the data has been precomputed. No matter how complex the query, how many tables involved, Postgres stores these results as a simple table.

For easier query writing. Materialized views can simplify complex queries when you can easily join the materialized view with other data. Materialized views can be very helpful for data that you are often joining to save you from writing the joins multiple times in multiple different queries.

The downsides of materialized views are:

They are static. The materialized view has to be refreshed for new data to be added. This can be done on a regular basis with a cron job. However, materialized views would not be idea for something that needs accurate up to the second data, like banking.

They take up disk space. Materialized views are stored on your database disk in the same way table data is, so you’re essentially storing some of your data twice.

Situations where materialized views are ideal:

You often need summarized data

You do not need real-time accurate data

Our ecommerce example:

Let’s say we have an ecommerce site with a Postgres database. We have a products table, an orders table, and a product_orders table.

SELECT * FROM products LIMIT 10; SELECT * from orders LIMIT 10; SELECT * from product_orders LIMIT 10;

We want to show on our site how often this product has been purchased lately. Our marketing department has decided this information is helpful for buyers and influences our shoppers. The problem is we don’t have an easy sum of product orders. Since we are just storing individual products and individual orders, we don’t want the database calculating this sum every time someone visits a product page. It does not really need to be that accurate, close enough is good enough here.

Materialized view creation sql:

Here’s an example of a materialized view that will get my skus and the shipped quantity by skus, showing the most frequently sold skus at the top since I’m ordering by qty in descending order.

CREATE MATERIALIZED VIEW recent_product_sales AS SELECT p.sku, SUM(po.qty) AS total_quantity FROM products p JOIN product_orders po ON p.sku = po.sku JOIN orders o ON po.order_id = o.order_id WHERE o.status = 'Shipped' GROUP BY p.sku ORDER BY 2 DESC;

If you have a materialized view, it is generally a good idea to create an index for it to save your database from full scans.

CREATE INDEX sku_qty ON recent_product_sales(total_quantity);

Using the Materialized View

We can query this view as we would any other table.

SELECT * FROM recent_product_sales;

Or we could we can quickly review the top 10 products sold without having to write a sub-query to sum or rank.

SELECT sku FROM recent_product_sales LIMIT 10;

We can use our materialized view in a query too. This is a query that should show the product sku, name, price, sale prices, and recent quantity sold.

SELECT p.sku, p.name, p.price, p.sale_price, COALESCE(rps.total_quantity, 0) AS recent_sales_quantity FROM products p LEFT JOIN recent_product_sales rps ON p.sku = rps.sku;

This is something you could, for example, show on an ecommerce site as a popular items list. And because you’re calling data from a materialized view, you aren’t doing any heavy database queries to have the quantities generated for you every time you have someone look at it.

If you have new data and need to update the view, you can refresh it with:

REFRESH MATERIALIZED VIEW recent_product_sales;

Loading terminal...

Loading terminal...