Polymorphism in EF Core

Introduction

It's almost inevitable that you'll come across a scenario involving polymorphism in your day-to-day work. I was inspired to write this article because today I needed to add a new type of menu item in JJInfinity. If you're not familiar with polymorphism, take a look at this article by Beatriz.

Modeling the Entity Hierarchy

Since talking about menu items is a bit difficult, let’s simplify. Taking advantage of the fact that the latest episode of Andor, the best Star Wars series in my opinion, was released just yesterday, we will work with droids. Imagine a system that manages different types of droids. We model this with an abstract base class Droid and specific subclasses:


public abstract class Droid
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public string Model { get; set; }
}

public class AstromechDroid : Droid
{
    public bool HasShipInterface { get; set; }
}

// This is the droid model of K-2SO from Andor :)
public class KXSeriesDroid : Droid
{
    public int AutonomyLevel { get; set; }
    public bool HasBlaster { get; set; }
}
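
The queries later in this article go through context.Droids, so the hierarchy needs a DbContext that exposes it. Here is a minimal sketch of what that could look like; the name DroidContext is my own assumption, and how the context is configured (provider, connection string) is left out:

```csharp
using Microsoft.EntityFrameworkCore;

// Hypothetical context name; the article only ever refers to `context.Droids`.
public class DroidContext : DbContext
{
    public DroidContext(DbContextOptions<DroidContext> options) : base(options) { }

    // A single DbSet for the base type is enough to query the whole hierarchy,
    // as long as the derived types are registered in OnModelCreating.
    public DbSet<Droid> Droids => Set<Droid>();

    // OnModelCreating (shown in the following sections) picks the mapping strategy.
}
```

Exposing only the base type keeps the context small. You could also add a DbSet per subclass if you often query one type directly.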

Using Table-per-Hierarchy (TPH)

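By default, EF Core uses TPH and adds a discriminator column (named Discriminator unless you configure it) to indicate the droid type in the database. The configuration below names that column DroidType and sets the value stored for each type: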

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Entity<Droid>()
        .HasDiscriminator<string>("DroidType")
        .HasValue<AstromechDroid>("Astromech")
        .HasValue<KXSeriesDroid>("KXSeries");
}

This mapping uses a single Droids table containing every column needed by the different types, with the DroidType column as discriminator. In my opinion, this model goes against a normalized database, because the columns that belong to one subclass stay null in every row that stores a different type.

If you need to squeeze every bit of extra performance from your application, it might be worth it. But in most cases, for example on a web server, it’s not worth ruining your entity mapping for a few ms of performance.
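
To make the null columns concrete, here is a small sketch that saves one droid of each type. The helper name and the sample values are made up for illustration, and the table in the comment is only an approximation of what the single TPH table would hold:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

public static class DroidSeeder
{
    // Hypothetical helper: works with any DbContext whose model includes the Droid hierarchy.
    public static async Task SeedAsync(DbContext context)
    {
        context.Add(new AstromechDroid
        {
            Id = Guid.NewGuid(),
            Name = "R2-D2",
            Model = "R2 series",
            HasShipInterface = true
        });

        context.Add(new KXSeriesDroid
        {
            Id = Guid.NewGuid(),
            Name = "K-2SO",
            Model = "KX series",
            AutonomyLevel = 9,
            HasBlaster = false
        });

        await context.SaveChangesAsync();

        // With the TPH mapping above, both rows land in the single Droids table, roughly:
        //
        // DroidType | Name  | HasShipInterface | AutonomyLevel | HasBlaster
        // Astromech | R2-D2 | true             | NULL          | NULL
        // KXSeries  | K-2SO | NULL             | 9             | false
    }
}
```

Every new subclass adds its own columns to this table, and those columns are null for all the other types.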

Using Table-per-Type (TPT)

For TPT, EF Core will create a table for each type in the hierarchy, including the abstract base class:

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Entity<Droid>().ToTable("Droids");
    modelBuilder.Entity<AstromechDroid>().ToTable("AstromechDroids");
    modelBuilder.Entity<KXSeriesDroid>().ToTable("KXSeriesDroids");
}

In this approach, the base table (Droids) contains common columns and derived tables contain only their specific properties. EF Core performs joins to assemble the complete entities. This is my favorite mapping type and what I use most of the time.

A less talked about disadvantage is that the SQL generated by EF Core is pretty ugly: even if you only need to retrieve one droid through the base type, it will JOIN every table in the hierarchy, roughly like in the example below:


SELECT [d].[Id], [d].[Name], [d].[Model], [a].[HasShipInterface], [k].[AutonomyLevel], [k].[HasBlaster]
FROM [Droids] AS [d]
LEFT JOIN [AstromechDroids] AS [a] ON [d].[Id] = [a].[Id]
LEFT JOIN [KXSeriesDroids] AS [k] ON [d].[Id] = [k].[Id]
WHERE [d].[Id] = @__p_0

With only 2 droid types, the query is quite simple. But imagine this system running on Coruscant, with 150 different droid types. In that scenario, you might want to rethink the mapping strategy. In my case, for example, in JJInfinity we have about 5 menu types, so these extra JOINs are not a big deal.
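
If you want to see what your own mapping produces before deciding, ToQueryString (available since EF Core 5.0) returns the SQL for a query without executing it. A minimal sketch, where the DroidSqlInspector helper is a hypothetical name and the context is assumed to have the Droid hierarchy in its model:

```csharp
using System;
using System.Linq;
using Microsoft.EntityFrameworkCore;

public static class DroidSqlInspector
{
    // Returns the SQL EF Core would generate for a single-droid lookup,
    // without sending anything to the database.
    public static string SingleDroidSql(DbContext context, Guid id) =>
        context.Set<Droid>()
            .Where(d => d.Id == id)
            .ToQueryString();
}
```

Logging that string once for each strategy is a cheap way to compare the number of JOINs you would be paying for.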

Using Table-per-Concrete Type (TPC)

Starting with EF Core 7.0, we can use TPC. Each concrete class will have its own full table:

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Entity<Droid>().UseTpcMappingStrategy();
}

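Every resulting table contains the base class properties as well as its own, so loading a specific droid type needs no joins. The cost is redundancy in the schema: the base columns are repeated in every table, and a query over the base type has to combine the tables with UNION ALL. This is the "middle ground" scenario between TPH and TPT: queries for a concrete type become somewhat simpler and faster, but the same columns are defined in several tables.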
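
One detail worth spelling out: the derived types still have to be part of the model, either through their own DbSet or by mentioning them in OnModelCreating. A sketch of a fuller TPC configuration, assuming a hypothetical TpcDroidContext and table names of my own choosing:

```csharp
using Microsoft.EntityFrameworkCore;

// Hypothetical context used only to illustrate the TPC configuration.
public class TpcDroidContext : DbContext
{
    public TpcDroidContext(DbContextOptions<TpcDroidContext> options) : base(options) { }

    public DbSet<Droid> Droids => Set<Droid>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // The abstract base type gets no table of its own under TPC.
        modelBuilder.Entity<Droid>().UseTpcMappingStrategy();

        // Registering each concrete type adds it to the model and names its table.
        modelBuilder.Entity<AstromechDroid>().ToTable("AstromechDroids");
        modelBuilder.Entity<KXSeriesDroid>().ToTable("KXSeriesDroids");
    }
}
```

Key values must be unique across all the tables of the hierarchy. The Guid keys used here make that easy, while database-generated integer keys would need extra care.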

Querying Polymorphic Entities

Polymorphic queries work naturally in EF Core. Regardless of the chosen mapping strategy, the code is the same. For example:

var allDroids = await context.Droids.ToListAsync();

EF Core will identify the correct types, based on the discriminator in TPH or on the table each row comes from in TPT and TPC, and instantiate objects of type AstromechDroid, KXSeriesDroid, or others as necessary.

Filtering by type is also straightforward:

var onlyAstromechs = await context.Droids.OfType<AstromechDroid>().ToListAsync();

Or retrieving droids with high autonomy:

var autonomousDroids = await context.Droids
    .OfType<KXSeriesDroid>()
    .Where(kx => kx.AutonomyLevel > 7)
    .ToListAsync();
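
Because the materialized objects are real instances of the subclasses, you can branch on them with ordinary C# pattern matching once they are loaded. A small sketch, where DroidReport is a hypothetical helper and the description strings are just examples:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DroidReport
{
    // Builds one description per droid, using the properties of its concrete type.
    public static IEnumerable<string> Describe(IEnumerable<Droid> droids) =>
        droids.Select(droid => droid switch
        {
            AstromechDroid a => $"{a.Name}: astromech, ship interface = {a.HasShipInterface}",
            KXSeriesDroid kx => $"{kx.Name}: KX series, autonomy level {kx.AutonomyLevel}",
            _ => $"{droid.Name}: unknown droid type"
        });
}
```

This runs in memory on an already loaded list such as allDroids, so it behaves the same under TPH, TPT, or TPC.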

Modeling Considerations

TPH is simple and efficient for quick reads but can generate null columns and complicate schema maintenance.
TPT keeps the schema more normalized, easier to understand, and avoids null columns but can impact performance due to joins.
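TPC combines benefits of both, eliminating joins when you query a concrete type, but it repeats the base columns in every table, which makes changes to the base class more costly, and queries over the base type have to union all the tables.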

Conclusion

Since the C# code for the queries is always the same, the final decision comes down mostly to your data modeling. Your choice should balance performance, maintenance, and database modeling. For most cases, I personally recommend TPT because it offers a good middle ground, with clean modeling and acceptable queries when the hierarchy has only a few entity types. For scenarios with high read performance demands, TPH or TPC might be more suitable.

And remember: May the Force be with you.