Cartesian Explosion and Split Query in Entity Framework Core


In this tutorial, we will understand what cartesian explosion is and how to solve that problem. Let’s look at this query:

var people = await ctx.People
   .Include(p => p.Emails)
   .Include(p => p.Addresses)
   .ToListAsync();

It translates to:

 SELECT "p"."Id", "p"."FirstName", "p"."LastName", "e"."Id", "e"."PersonEmail", "e"."PersonId", "a"."Id", "a"."PersonAddress", "a"."PersonId"
      FROM "People" AS "p"
      LEFT JOIN "Emails" AS "e" ON "p"."Id" = "e"."PersonId"
      LEFT JOIN "Addresses" AS "a" ON "p"."Id" = "a"."PersonId"
      ORDER BY "p"."Id", "e"."Id"
  • Person joins with Email and Address. Both joins are at the same level.
  • A person can have multiple emails and adressess.

Let’s say a person with id = 1 have 10 emails and 10 addresses. The query returns 1x10x10 = 100 rows for 1 person. It is just for one person, how much they can be for 100 people. This problem is known as cartesian explosion

surprised cat [Image credit: pngegg.com]

📢 Cartesian explosion does not occurs if joins are in multiple leveles. Let’s say Person->Order->OrderItems. In this situation, Person joins with Order and Order joins with OrderItems. Ef core Query would be something like ctx.People.Include(p=>p.Orders).ThenInclude(o=>o.OrderItems).

What is the solution then? [Image generated by Gemini]

EF Core 5.0 introduced a feature called SplitQuery.

var people = await ctx.People
   .Include(p => p.Emails)
   .Include(p => p.Addresses)
   .AsSplitQuery()
   .ToListAsync();

Which will translate into:

info: Microsoft.EntityFrameworkCore.Database.Command[20101]
      Executed DbCommand (33ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      SELECT "p"."Id", "p"."FirstName", "p"."LastName"
      FROM "People" AS "p"
      ORDER BY "p"."Id"

info: Microsoft.EntityFrameworkCore.Database.Command[20101]
      Executed DbCommand (1ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      SELECT "e"."Id", "e"."PersonEmail", "e"."PersonId", "p"."Id"
      FROM "People" AS "p"
      INNER JOIN "Emails" AS "e" ON "p"."Id" = "e"."PersonId"
      ORDER BY "p"."Id"

info: Microsoft.EntityFrameworkCore.Database.Command[20101]
      Executed DbCommand (0ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      SELECT "a"."Id", "a"."PersonAddress", "a"."PersonId", "p"."Id"
      FROM "People" AS "p"
      INNER JOIN "Addresses" AS "a" ON "p"."Id" = "a"."PersonId"
      ORDER BY "p"."Id"

It will more helpful if person has 10 email and 10 addresses. Query with join return 1 x 10 x 10 = 100 rows while split query return 1 + 10 + 10 = 21 rows. Which is quite an improvement.


See also

Buy Me A Coffee