Databases

Advanced Querying with Aggregations

Lecture 4: Aggregations, Partitions & Window Functions

GROUP BY, HAVING, RANK, SUM OVER, etc.

Relational databases : Querying with aggregations


Databases can represent a very large volume of information, and we need to determine some aggregations (summaries) of data to handle them correctly in our "human" analysis. We may need to

Relational databases : Querying with aggregations (2)


SELECT count(*) FROM facilities

Relational databases : Querying with aggregations (3)


SELECT count(*) FROM facilities WHERE name LIKE 'Tennis%'

Relational databases : sorting and limiting data

Relational databases: example for ORDER BY

it can be combined with ASC and DESC to organize result in either ascending or descending order

SELECT * FROM members
ORDER BY joindate desc;

Relational databases: summarizing with group by

the following will group bookings by memberid and count each group sub total bookings

SELECT bookings.memid as memberid, count(bookings.bookid) as cnt FROM bookings
LEFT JOIN members ON members.memid = bookings.memid
GROUP BY bookings.memid
ORDER BY cnt desc

Relational databases : sub SELECTs

One can consider using the result of a SELECT as a regular table

SELECT memberid, cnt FROM (
	SELECT bookings.memid as memberid, count(bookings.bookid) as cnt FROM bookings
	LEFT JOIN members ON members.memid = bookings.memid
	GROUP BY bookings.memid
	ORDER BY cnt desc
) mem_bookings;

Relational databases : sub SELECTs with rank (MySQL 8.0+)

MySQL 8.0+ supports window functions like RANK()

SELECT
    memberid, cnt, pos
FROM (
    SELECT
        bookings.memid as memberid,
        COUNT(bookings.bookid) as cnt,
        RANK() OVER (ORDER BY COUNT(bookings.bookid) DESC) as pos
    FROM bookings
    LEFT JOIN members ON members.memid = bookings.memid
    GROUP BY bookings.memid
) mem_bookings
WHERE pos <= 4;

Note: RANK() requires MySQL 8.0 or higher. For older versions, use variables or LIMIT.

Alternative for MySQL < 8.0 (without RANK)

For older MySQL versions, use LIMIT instead:

SELECT
    bookings.memid as memberid,
    COUNT(bookings.bookid) as cnt
FROM bookings
LEFT JOIN members ON members.memid = bookings.memid
GROUP BY bookings.memid
ORDER BY cnt DESC
LIMIT 4;

Or use user-defined variables:

SET @rank = 0;
SELECT memberid, cnt, @rank := @rank + 1 AS pos
FROM (
    SELECT memid as memberid, COUNT(*) as cnt
    FROM bookings
    GROUP BY memid
    ORDER BY cnt DESC
) ranked
LIMIT 4;

Query processing order

the example with RANK () OVER () shows an interesting fact about processing order

Exercise: Aggregations Practice

Using the bookings database, write queries for:

  1. Calculate the total revenue per facility
  2. Find the average number of slots booked per booking
  3. Find the facility with the highest total revenue
  4. List members who have made more than 10 bookings
  5. Calculate monthly revenue for each facility
-- Example solution for #1:
SELECT f.name, 
       SUM(b.slots * 
           CASE WHEN b.memid = 0 
           THEN f.guestcost 
           ELSE f.membercost END) as revenue
FROM bookings b
JOIN facilities f ON b.facid = f.facid
GROUP BY f.facid, f.name
ORDER BY revenue DESC;

MySQL Window Functions (8.0+)

Window functions perform calculations across rows related to the current row

Function Description Example
ROW_NUMBER() Unique row number ROW_NUMBER() OVER (ORDER BY sales)
RANK() Rank with gaps RANK() OVER (ORDER BY score DESC)
DENSE_RANK() Rank without gaps DENSE_RANK() OVER (ORDER BY score)
SUM() OVER() Running total SUM(amount) OVER (ORDER BY date)
AVG() OVER() Moving average AVG(price) OVER (PARTITION BY category)

Exercise: Window Functions

Practice using window functions (MySQL 8.0+):

  1. Rank facilities by total bookings
  2. Calculate running total of bookings per member
  3. Find the difference between each member's bookings and the average
  4. Assign row numbers to bookings ordered by date
-- Example solution for #1:
SELECT 
    f.name,
    COUNT(*) as booking_count,
    RANK() OVER (ORDER BY COUNT(*) DESC) as rank
FROM bookings b
JOIN facilities f ON b.facid = f.facid
GROUP BY f.facid, f.name;

MySQL Performance Tips for Aggregations

-- Example: Check query execution plan
EXPLAIN SELECT f.name, COUNT(*) 
FROM bookings b
JOIN facilities f ON b.facid = f.facid
GROUP BY f.facid;

Advanced Exercise: Complex Query

Create a comprehensive report showing:

-- Hint: You'll need to combine:
-- - Multiple aggregations (SUM, COUNT, AVG)
-- - Window functions for percentage and ranking
-- - Subqueries for total revenue calculation
-- - DISTINCT for unique member counting

Slide Overview