Scalar Subqueries

A scalar subquery returns exactly one value per row. It is often used in the SELECT clause to fetch aggregated or derived values.

SELECT user_id, name,
       (SELECT COUNT(*) FROM Friends WHERE Friends.user1 = Users.user_id) AS total_friends
FROM Users;

Execution Plan:

 Nested Loop  (cost=0.00..25.50 rows=1000 width=36)
   ->  Seq Scan on Users  (cost=0.00..10.00 rows=1000 width=36)
   ->  Aggregate  (cost=0.00..15.50 rows=1 width=4)
         ->  Seq Scan on Friends  (cost=0.00..10.00 rows=1000 width=4)

This results in a nested loop, meaning the subquery is executed once per row, which can be slow for large datasets unless indexed properly.

Subquery in JOIN

A subquery can be materialized as a derived table and joined with the main query. This improves performance when filtering or aggregating before joining.

SELECT U.user_id, U.name, F.friend_count
FROM Users U
LEFT JOIN (
    SELECT user1, COUNT(*) AS friend_count
    FROM Friends
    GROUP BY user1
) F
ON U.user_id = F.user1;

Execution Plan:

 Hash Join  (cost=25.00..45.00 rows=1000 width=40)
   Hash Cond: (U.user_id = F.user1)
   ->  Seq Scan on Users U  (cost=0.00..10.00 rows=1000 width=36)
   ->  Hash  (cost=15.00..15.00 rows=1000 width=8)
         ->  HashAggregate  (cost=10.00..15.00 rows=1000 width=8)
               Group Key: Friends.user1
               ->  Seq Scan on Friends  (cost=0.00..10.00 rows=1000 width=4)

The query planner performs a hash join, making it more efficient than a correlated subquery.

Subquery in WHERE

Subqueries in WHERE filter rows based on computed conditions.

SELECT user_id, name
FROM Users
WHERE user_id IN (
    SELECT user1 FROM Friends GROUP BY user1 HAVING COUNT(*) > 1
);

Execution Plan:

 Hash Semi Join  (cost=20.00..40.00 rows=500 width=36)
   Hash Cond: (Users.user_id = Friends.user1)
   ->  Seq Scan on Users  (cost=0.00..10.00 rows=1000 width=36)
   ->  Hash  (cost=10.00..10.00 rows=500 width=4)
         ->  HashAggregate  (cost=5.00..10.00 rows=500 width=4)
               Group Key: Friends.user1
               ->  Seq Scan on Friends  (cost=0.00..5.00 rows=1000 width=4)

This avoids a correlated subquery by precomputing the valid user IDs before filtering.

Subquery in HAVING

Used to filter groups after aggregation.

SELECT user1, COUNT(*) AS friend_count
FROM Friends
GROUP BY user1
HAVING COUNT(*) > (
    SELECT AVG(friend_count)
    FROM (SELECT user1, COUNT(*) AS friend_count FROM Friends GROUP BY user1) AS SubQ
);

Execution Plan:

 HashAggregate  (cost=15.00..20.00 rows=500 width=8)
   Group Key: Friends.user1
   Filter: (count(*) > (SubPlan 1))
   ->  Seq Scan on Friends  (cost=0.00..10.00 rows=1000 width=4)
   SubPlan 1
     ->  Aggregate  (cost=5.00..5.01 rows=1 width=4)
           ->  HashAggregate  (cost=5.00..5.00 rows=500 width=4)
                 Group Key: Friends.user1
                 ->  Seq Scan on Friends  (cost=0.00..5.00 rows=1000 width=4)

Here, the inner subquery runs first, then its result is used in filtering.

Subquery in FROM

A subquery in FROM creates a temporary table.

SELECT sub.user1, sub.friend_count
FROM (
    SELECT user1, COUNT(*) AS friend_count
    FROM Friends
    GROUP BY user1
) sub
WHERE sub.friend_count > 1;

Execution Plan:

 Hash Join  (cost=20.00..40.00 rows=500 width=8)
   Hash Cond: (sub.user1 = Users.user_id)
   ->  HashAggregate  (cost=10.00..15.00 rows=1000 width=8)
         Group Key: Friends.user1
         ->  Seq Scan on Friends  (cost=0.00..10.00 rows=1000 width=4)
   ->  Hash  (cost=10.00..10.00 rows=500 width=4)
         ->  Seq Scan on Users  (cost=0.00..10.00 rows=1000 width=4)

The query planner may materialize the subquery result as a temporary structure before filtering.

Query TypeExecution Characteristics
Scalar SubqueryEvaluates for each row (costly if unindexed)
JOIN SubqueryPrecomputes values before joining (efficient)
WHERE SubqueryEvaluates once before filtering (efficient)
HAVING SubqueryRuns after aggregation (efficient with indexing)
FROM SubqueryMaterializes before filtering (efficient with indexing)