blog image

Hello everyone, whether you work as a data analyst, engineer, or data scientist, knowing Derived Tables and Common Table Expressions (CTEs) is a must. It’s important to know how to use them.

Derived tables and Common Table stand are used on advanced SQL queries, offering an extensive range of functionalities to handle complex data sets efficiently. However, understanding these concepts might seem daunting without practical examples. Let’s dive into their usage, definitions, and exemples.

What are Derived Tables (DTs)?

Imagine you’re a chef preparing a complex dish that requires a specific mix of ingredients processed in a certain way before they can be combined into the final masterpiece. In the realm of databases, derived tables serve a similar purpose. They act as temporary tables (or subqueries) that allow you to extract and manipulate a subset of data from your database as if working with a standalone table.

For instance, if you own a bookstore and wish to analyze the sales data of books priced over 20€, you could use a derived table to first isolate (extract) the relevant sales records before performing further analysis.

SQL Example Simplified:

SELECT * 
FROM (SELECT book_id, title FROM sales_records WHERE price > 20) AS ExpensiveBooks

Here, ExpensiveBooks is a derived table that contains only records of books sold for over 20€.

Common Table Expressions (CTEs)

Common Table Expressions (CTEs) are like temporary markers in your data exploration. They allow you to name a specific query block, which can then be referenced elsewhere in your SQL query. Think of it as bookmarking a section of a guidebook that you know you’ll need to refer back to.

CTEs are especially handy when dealing with recursive queries or when your query needs to be organized in a more readable manner. Let’s use again the bookstore exemple, so suppose you wanted to find all the books that are more expensive than the average price of books in your store. A CTE can help by first calculating the average price and then using it to filter the books.

SQL Example Simplified:

;WITH AveragePrice AS (SELECT AVG(price) AS AvgPrice FROM sales_records)
SELECT title FROM sales_records
WHERE price > (SELECT AvgPrice FROM AveragePrice)

AveragePrice is a CTE calculating the average book price, which is then used to find books priced higher than this average.

Practical Case Using Derived Tables and CTEs

To make derived tables and CTEs concepts more concrete, let’s consider a simple scenario. Imagine your bookstore has launched an online platform, and you want to analyze the performance of employees in the logistics department—specifically, those whose processing time for orders is below the department average.

With a Derived Table:

SELECT employee_id, average_processing_time
FROM
  (SELECT employee_id, AVG(processing_time) AS average_processing_time
   FROM logistics
   GROUP BY employee_id) AS ProcessingStats
WHERE average_processing_time < (SELECT AVG(processing_time) FROM logistics)

In this example, ProcessingStats serves as a derived table to calculate each employee’s average processing time.

With a CTE:

;WITH ProcessingStats AS (
  SELECT employee_id, AVG(processing_time) AS average_processing_time
  FROM logistics
  GROUP BY employee_id
)
SELECT employee_id, average_processing_time
FROM ProcessingStats
WHERE average_processing_time < (SELECT AVG(processing_time) FROM logistics)

Both approaches aim to filter employees with below-average processing times, but CTEs offer a clearer structure, making the query easier to read and maintain.

Derived tables and CTEs are powerful tools and it’s important to know how and when to use them, that can make your work significantly more efficient and comprehensible.

Leave a Reply

Your email address will not be published. Required fields are marked *