
Optimizing Heavy Queries in SQL Server: Running Total Case Study

Query performance optimization is one of the most critical challenges in database management. Even a well-designed system can hit performance bottlenecks as queries grow in complexity or run against large datasets. Calculating Running Totals is a good example: it is a frequent requirement in reporting and analytics workloads, yet it can become a significant source of inefficiency if not implemented correctly.

In this case study, I will share a real-world example from my recent work with SQL Server, where a heavy query built on a large view suffered from slow performance due to an inefficient Running Total calculation. By analyzing the problem and testing different optimization approaches, I was able to dramatically improve performance without introducing new indexes — simply by restructuring the query and applying a more efficient algorithm.

Prerequisites

Dataset preparation for Running Total performance tests (SQL Server)

To analyze and compare different methods for calculating a Running Total in
SQL Server, a reproducible test dataset was required. The
Sales.SalesOrderDetail table from the AdventureWorks2022 sample database
was used to generate simulated rows. This choice is purely for convenience: the tests do not
depend on any special properties of that table and can be reproduced with other data sources.

About 30,000 rows were inserted into a dedicated table named
dbo.Population. This table was used for all Running Total calculations and
performance comparisons in the case study.

Table structure

CREATE TABLE dbo.Population
(
    [Date] DATE NOT NULL PRIMARY KEY,
    Births INT,
    Deaths INT
);
GO

Data population (CTE)

A Common Table Expression (CTE) was used to produce simulated Date, Births
and Deaths values based on columns from
AdventureWorks2022.Sales.SalesOrderDetail. The first 30,000 rows were then inserted
into dbo.Population:

;WITH cte AS
(
    SELECT
        -- One simulated calendar day per SalesOrderDetailID, starting at 1900-01-01
        DATEADD(DAY, sod.SalesOrderDetailID, '1899-12-31') AS [Date],
        (sod.ProductID * sod.OrderQty) % 200 AS Births,
        (sod.ProductID * sod.OrderQty) % 210 AS Deaths
    FROM AdventureWorks2022.Sales.SalesOrderDetail AS sod
)
INSERT dbo.Population ([Date], Births, Deaths)
SELECT TOP (30000)
    c.[Date], c.Births, c.Deaths
FROM cte AS c
ORDER BY c.[Date]; -- makes TOP deterministic and keeps the simulated date sequence contiguous
GO
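
As a quick sanity check (not part of the original setup), the row count and simulated date range can be confirmed with a simple aggregate:

-- Hypothetical verification query: confirm volume and date range of dbo.Population
SELECT COUNT(*)    AS RowsLoaded,
       MIN([Date]) AS FirstDate,
       MAX([Date]) AS LastDate
FROM dbo.Population;
GO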

After population, the dataset provides sufficient volume to evaluate multiple Running Total
scenarios and measure resource usage (CPU, execution time, logical reads, writes, etc.).
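
The figures reported below were captured with SQL Server Profiler (see the Profiler overview at the end of the article). As a lighter-weight alternative, session statistics give comparable CPU, duration, and logical-read numbers; a minimal sketch:

SET STATISTICS TIME ON;  -- reports CPU time and elapsed time per statement
SET STATISTICS IO ON;    -- reports logical reads per table

-- ... run the Running Total query under test here ...

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
GO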

In the following sections, six methods for calculating a Running Total in SQL Server
are reviewed (other approaches exist as well). For each method the SQL code, execution plan,
strengths and weaknesses, and resource usage metrics are presented to support an objective
comparison.

Method 1: Using INNER JOIN

Self-Join approach for calculating Running Totals

In this method, the Population table is joined with itself (Self-Join). For each row in p1, all rows from p2 with a date less than or equal to the current row are aggregated. While this technique is conceptually simple and straightforward, it quickly becomes inefficient as the dataset grows.

SQL Code

SELECT  p1.[Date], 
        p1.Births, 
        SUM(p2.Births) AS RunningTotal_Births,
        p1.Deaths, 
        SUM(p2.Deaths) AS RunningTotal_Deaths
    FROM dbo.Population AS p1 
        INNER JOIN dbo.Population AS p2 
            ON p2.[Date] <= p1.[Date] 
    GROUP BY p1.[Date], p1.Births, p1.Deaths
    ORDER BY p1.[Date]; 
GO

Execution Plan

Execution plan for Method 1 (Self-Join)

Advantages

  • Simple and intuitive to implement
  • Reasonable choice for very small datasets

Disadvantages

  • Very high processing cost O(n²)
  • Heavy I/O Reads
  • Extremely inefficient on large datasets

Results

  • Rows processed: 450,015,000
  • CPU: 122,484
  • Duration: 80,442 ms
  • Reads: 1,189,435
  • Writes: 0

Method 2: Using Subquery

Row-by-row subquery approach for Running Totals

In this method, a subquery is executed for each row in the Population table. The subquery calculates the cumulative sum of all rows with a date less than the current one. While this technique avoids the explicit Join used in Method 1 and may look logically simpler, it still suffers from severe performance limitations when the dataset is large.

SQL Code

SELECT  [Date], 
        Births, 
        Births + COALESCE((SELECT SUM(Births) 
                    FROM dbo.Population AS s 
                    WHERE s.[Date] < o.[Date]), 0) AS RunningTotal_Births,
        Deaths, 
        Deaths + COALESCE((SELECT SUM(Deaths) 
                    FROM dbo.Population AS s 
                    WHERE s.[Date] < o.[Date]), 0) AS RunningTotal_Deaths
    FROM dbo.Population AS o 
    ORDER BY [Date]; 
GO

Execution Plan

Execution plan for Method 2 (Subquery)

Advantages

  • Simple and does not require a Join
  • Logically more readable than Method 1

Disadvantages

  • Still an O(n²) algorithm
  • Each row triggers a separate subquery
  • Very high CPU and Reads consumption

Results

  • Rows processed: 449,985,000
  • CPU: 272,297
  • Duration: 301,638 ms
  • Reads: 4,755,824
  • Writes: 0

Method 3: Using Update with Accumulating Variables

Sequential update with variables to compute Running Totals

In this method, the data is first inserted into a table variable, and an UPDATE statement with
accumulating variables (often called the "quirky update") then computes the Running Total sequentially.
This turns the calculation into a single linear O(n) pass, significantly reducing execution time and
CPU usage compared with the self-join and subquery approaches.

SQL Code

DECLARE @tmp TABLE 
( 
    [Date] DATE PRIMARY KEY, 
    Births INT, 
    RunningTotal_Births INT,
    Deaths INT, 
    RunningTotal_Deaths INT 
); 
  
DECLARE @RunningTotal_Births INT = 0; 
DECLARE @RunningTotal_Deaths INT = 0; 
  
INSERT @tmp([Date], Births, RunningTotal_Births, Deaths, RunningTotal_Deaths) 
    SELECT [Date], Births, RunningTotal_Births = 0, 
           Deaths, RunningTotal_Deaths = 0 
    FROM dbo.Population 
    ORDER BY [Date]; 
  
UPDATE @tmp SET 
    @RunningTotal_Births = RunningTotal_Births = @RunningTotal_Births + Births,
    @RunningTotal_Deaths = RunningTotal_Deaths = @RunningTotal_Deaths + Deaths 
FROM @tmp
OPTION(FORCE ORDER);
  
SELECT [Date], Births, RunningTotal_Births, Deaths, RunningTotal_Deaths 
FROM @tmp 
ORDER BY [Date];
GO

Execution Plan

Execution plan for Method 3 (Update with Accumulating Variables)

Advantages

  • Optimized CPU and execution time
  • Linear algorithm O(n)
  • Much better performance compared to Join and Subquery

Disadvantages

  • Requires a temporary table (or table variable) and longer code
  • Relies on undocumented update-ordering behavior, so results must be verified carefully (a hardened variant is sketched after the results below)
  • Less readable compared to the Window Function approach

Results

  • CPU: 265
  • Duration: 574 ms
  • Reads: 226,926
  • Writes: 177
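
A commonly cited way to harden the quirky update is to use a #temp table with a clustered index on the ordering column together with TABLOCKX and MAXDOP 1 hints. The sketch below (not part of the measured tests) follows that pattern; even with these safeguards, the update order is not documented or guaranteed, so the output should always be validated.

CREATE TABLE #tmp
(
    [Date] DATE NOT NULL,
    Births INT,
    RunningTotal_Births INT,
    Deaths INT,
    RunningTotal_Deaths INT
);
-- Clustered index in the running-total order
CREATE UNIQUE CLUSTERED INDEX IX_tmp_Date ON #tmp ([Date]);

INSERT #tmp ([Date], Births, RunningTotal_Births, Deaths, RunningTotal_Deaths)
SELECT [Date], Births, 0, Deaths, 0
FROM dbo.Population;

DECLARE @rtBirths INT = 0, @rtDeaths INT = 0;

UPDATE t SET
    @rtBirths = RunningTotal_Births = @rtBirths + Births,
    @rtDeaths = RunningTotal_Deaths = @rtDeaths + Deaths
FROM #tmp AS t WITH (TABLOCKX)   -- single-threaded, exclusively locked update
OPTION (MAXDOP 1);

SELECT [Date], Births, RunningTotal_Births, Deaths, RunningTotal_Deaths
FROM #tmp
ORDER BY [Date];

DROP TABLE #tmp;
GO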

Method 4: Using Recursive CTE

Recursive approach with Common Table Expressions (CTE)

This method leverages a recursive CTE to calculate running totals by adding each day's values to the previous day's total, one recursion level per row. It is a creative way to demonstrate the power of recursion in SQL, but it has performance limitations on large datasets, and OPTION (MAXRECURSION 0) is required because the default limit of 100 recursion levels is far too low for long date sequences.

SQL Code

;WITH x AS 
( 
    SELECT  [Date], 
            Births, Births AS RunningTotal_Births, 
            Deaths, Deaths AS RunningTotal_Deaths
        FROM dbo.Population 
        WHERE [Date] = '1900-01-01' 
    UNION ALL 
    SELECT  y.[Date], 
            y.Births, x.RunningTotal_Births + y.Births,
            y.Deaths, x.RunningTotal_Deaths + y.Deaths 
        FROM x 
            INNER JOIN dbo.Population AS y 
                ON y.[Date] = DATEADD(DAY, 1, x.[Date]) 
) 
SELECT  [Date], 
        Births, RunningTotal_Births, 
        Deaths, RunningTotal_Deaths 
    FROM x 
    ORDER BY [Date] 
    OPTION (MAXRECURSION 0); 
GO

Execution Plan

Execution plan for Method 4 (Recursive CTE)

Advantages

  • An interesting idea that demonstrates the power of CTEs
  • Useful for teaching and specific scenarios

Disadvantages

  • Slower than the Update and Window Function methods
  • High complexity with large datasets
  • Requires OPTION (MAXRECURSION 0) for more than 100 rows
  • Depends on a contiguous date sequence; a missing day stops the recursion

Results

  • CPU: 357
  • Duration: 660 ms
  • Reads: 330,125
  • Writes: 0

Method 5: Using Cursor

Row-by-row processing using cursors and cumulative variables

This method processes rows sequentially with a CURSOR, updating cumulative variables as each row is fetched.
It provides full control over row-by-row logic, but suffers from significant performance drawbacks on large datasets.

SQL Code


DECLARE @tmp TABLE 
( 
    [Date] DATE PRIMARY KEY, 
    Births INT, 
    RunningTotal_Births INT,
    Deaths INT, 
    RunningTotal_Deaths INT 
); 
  
DECLARE 
    @Date       DATE, 
    @Births     INT, 
    @RunningTotal_Births INT = 0,
    @Deaths     INT, 
    @RunningTotal_Deaths INT = 0; 
  
DECLARE c CURSOR LOCAL STATIC FORWARD_ONLY READ_ONLY FOR 
    SELECT [Date], Births, Deaths 
        FROM dbo.Population 
        ORDER BY [Date]; 

OPEN c; 
FETCH NEXT FROM c INTO @Date, @Births, @Deaths; 
WHILE @@FETCH_STATUS = 0 
BEGIN 
    SET @RunningTotal_Births = @RunningTotal_Births + @Births; 
    SET @RunningTotal_Deaths = @RunningTotal_Deaths + @Deaths; 
  
    INSERT @tmp([Date], Births, RunningTotal_Births, Deaths, RunningTotal_Deaths) 
        SELECT @Date, @Births, @RunningTotal_Births, @Deaths, @RunningTotal_Deaths; 
  
    FETCH NEXT FROM c INTO @Date, @Births, @Deaths; 
END 
CLOSE c; 
DEALLOCATE c; 
  
SELECT [Date], Births, RunningTotal_Births, Deaths, RunningTotal_Deaths
    FROM @tmp
    ORDER BY [Date]; 
GO

Advantages

  • Full control over each row
  • High flexibility for complex scenarios

Disadvantages

  • Slowest method with large datasets
  • High CPU usage and long execution time
  • Not suitable for production environments

Results

  • CPU: 891
  • Duration: 2,701 ms
  • Reads: 181,934
  • Writes: 109

Method 6: Using Window Function

Efficient running totals using OVER (ORDER BY ...)

This is the most efficient method in SQL Server for calculating running totals. The SUM() OVER (ORDER BY ...) construct is optimized internally by the SQL Server engine, providing both simplicity and the best performance among all methods.

SQL Code


SELECT  [Date], 
        Births, 
        SUM(Births) OVER (ORDER BY [Date]) AS RunningTotal_Births,
        Deaths, 
        SUM(Deaths) OVER (ORDER BY [Date]) AS RunningTotal_Deaths
    FROM dbo.Population 
    ORDER BY [Date]; 
GO

Execution Plan

Execution plan for Method 6 (Window Function)

Advantages

  • Best performance among all methods
  • Very simple and readable code
  • Optimized by the SQL Server engine

Disadvantages

  • Not available in older versions of SQL Server (before 2012)

Results

  • CPU: 234
  • Duration: 318 ms
  • Reads: 285,771
  • Writes: 32
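
A final note on Method 6: with ORDER BY and no explicit frame, SUM() OVER defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which SQL Server implements with an on-disk worktable. Because [Date] is unique here, an explicit ROWS frame returns the same totals and typically reduces reads; this variant was not part of the measurements above but is worth testing:

SELECT  [Date],
        Births,
        SUM(Births) OVER (ORDER BY [Date]
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal_Births,
        Deaths,
        SUM(Deaths) OVER (ORDER BY [Date]
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal_Deaths
    FROM dbo.Population
    ORDER BY [Date];
GO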

Final Comparison of Methods

Performance and suitability summary of all approaches

Method             | CPU     | Duration (ms) | Reads     | Writes | Code Simplicity   | Algorithm Complexity | Suitable For
-------------------|---------|---------------|-----------|--------|-------------------|----------------------|--------------------------------------
INNER JOIN         | 122,484 | 80,442        | 1,189,435 | 0      | Simple            | O(n²)                | Educational / Small data
Subquery           | 272,297 | 301,638       | 4,755,824 | 0      | Simpler than Join | O(n²)                | Educational / Small data
Update + Variables | 265     | 574           | 226,926   | 177    | Medium            | O(n)                 | Large data / High performance
Recursive CTE      | 357     | 660           | 330,125   | 0      | Medium            | O(n) (but expensive) | Experimental / Special cases
Cursor             | 891     | 2,701         | 181,934   | 109    | Complex           | O(n) with high cost  | Special scenarios, not for production
Window Function    | 234     | 318           | 285,771   | 32     | Very simple       | Optimized O(n)       | Best choice for Production

Profiler Overview

SQL Server Profiler results comparing all methods

Your Feedback Matters

I would love to hear your thoughts on these methods and your own experiences with running total calculations. Have you tried any other approaches that worked better in your projects?

Also, feel free to connect with me on social media: GitHub | LinkedIn

Your feedback helps make future content more practical and comprehensive 🌱
