Union/Concat returning more than one entity with owned types generates incorrect joins
While performing set operation on two queries that both return two entities and the second entity has an owned type, joins for owned type's properties are made to the first table causing them to be empty.
var studentIds = new List<int> { 123 };
var planningsOnDate = c.Plannings.Where(p => p.EventsDate == new DateTime(2024, 04, 04));
var query = planningsOnDate.SelectMany(p =>
p.AutoAllocations.Where(a => studentIds.Contains(a.Student.Id)),
(p, a) => new { a.Student, studentPlanning = p }
)
.Union(
planningsOnDate.SelectMany(p =>
p.Allocations.Where(a => studentIds.Contains(a.Student.Id)),
(p, a) => new { a.Student, studentPlanning = p })
);
var results = await query.ToListAsync();
Console.WriteLine(results.First().studentPlanning.TimeRange); // null
In the above query Planning entity has an owned type TimeRange with StartTime and EndTime properties
generated query:
exec sp_executesql N'SELECT [t0].[Id], [t0].[Id0], [t0].[EventsDate], [e7].[Id], [e7].[EndTime], [e7].[StartTime]
FROM (
SELECT [e2].[Id], [e].[Id] AS [Id0], [e].[EventsDate]
FROM [Planning] AS [e]
INNER JOIN (
SELECT [e0].[Id], [e0].[StudentId], [e0].[PlanningId], [e1].[Id] AS [Id0]
FROM [AutoAllocation] AS [e0]
INNER JOIN [Student] AS [e1] ON [e0].[StudentId] = [e1].[Id]
WHERE [e1].[Id] IN (
SELECT [s].[value]
FROM OPENJSON(@__studentIds_0) WITH ([value] int ''$'') AS [s]
)
) AS [t] ON [e].[Id] = [t].[PlanningId]
INNER JOIN [Student] AS [e2] ON [t].[StudentId] = [e2].[Id]
WHERE [e].[EventsDate] = ''2024-04-04''
UNION
SELECT [e4].[Id], [e3].[Id] AS [Id0], [e3].[EventsDate]
FROM [Planning] AS [e3]
INNER JOIN (
SELECT [e5].[Id], [e5].[StudentId], [e5].[PlanningId], [e6].[Id] AS [Id0]
FROM [Allocation] AS [e5]
INNER JOIN [Student] AS [e6] ON [e5].[StudentId] = [e6].[Id]
WHERE [e6].[Id] IN (
SELECT [s0].[value]
FROM OPENJSON(@__studentIds_0) WITH ([value] int ''$'') AS [s0]
)
) AS [t1] ON [e3].[Id] = [t1].[PlanningId]
INNER JOIN [Student] AS [e4] ON [t1].[StudentId] = [e4].[Id]
WHERE [e3].[EventsDate] = ''2024-04-04''
) AS [t0]
LEFT JOIN [Planning] AS [e7] ON [t0].[Id] = [e7].[Id]',N'@__studentIds_0 nvarchar(4000)',@__studentIds_0=N'[123]'
in this query we can observe that last LEFT JOIN is made to [t0].[Id] which corresponds to Student.Id, but it should be made to Planning.Id
Additional notes:
- Marking owned type as required (
.Navigation(x => x.TimeRange).IsRequired()) causesSqlNullException - The same query worked in EF Core 6
Workaround:
It is possible to replace the order of entities in returned anonymous type (p, a) => new { a.Student, studentPlanning = p } -> (p, a) => new { studentPlanning = p, a.Student } and the join is made to the correct (first) entity, but if both classes had owned types only one of them would work.
Full code:
async Task Main()
{
var studentIds = new List<int> { 123};
var c = new Context();
var planningsOnDate = c.Plannings.Where(p => p.EventsDate == new DateTime(2024, 04, 04));
var query = planningsOnDate.SelectMany(p =>
p.AutoAllocations.Where(a => studentIds.Contains(a.Student.Id)),
(p, a) => new { a.Student, studentPlanning = p }
//(p, a) => new { studentPlanning = p, a.Student }
)
.Union(
planningsOnDate.SelectMany(p =>
p.Allocations.Where(a => studentIds.Contains(a.Student.Id)),
(p, a) => new { a.Student, studentPlanning = p })
//(p, a) => new { studentPlanning = p, a.Student })
);
var results = await query.ToListAsync();
Console.WriteLine(results.First().studentPlanning.TimeRange); // null
}
public class Context : DbContext
{
public DbSet<Planning> Plannings { get; set; }
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
base.OnModelCreating(modelBuilder);
modelBuilder.Entity<Planning>(config =>
{
config.ToTable("Planning");
config.HasMany(c => c.Allocations).WithOne().HasForeignKey("PlanningId").OnDelete(DeleteBehavior.Cascade).IsRequired();
config.HasMany(c => c.AutoAllocations).WithOne().HasForeignKey("PlanningId").OnDelete(DeleteBehavior.Cascade).IsRequired();
config.Property(c => c.EventsDate).HasColumnType("date");
config.OwnsOne(x => x.TimeRange, x =>
{
x.Property(c => c.StartTime).HasColumnName("StartTime").IsRequired();
x.Property(c => c.EndTime).HasColumnName("EndTime").IsRequired();
});//.Navigation(x => x.TimeRange).IsRequired();
});
modelBuilder.Entity<AutoAllocation>(config =>
{
config.ToTable("AutoAllocation");
config.HasOne(c => c.Student).WithMany().HasForeignKey("StudentId").IsRequired();
});
modelBuilder.Entity<Allocation>(config =>
{
config.ToTable("Allocation");
config.HasOne(c => c.Student).WithMany().HasForeignKey("StudentId").IsRequired();
});
modelBuilder.Entity<Student>(config =>
{
config.ToTable("Student");
});
}
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
optionsBuilder
.UseLazyLoadingProxies()
.UseSqlServer("...")
.EnableSensitiveDataLogging(true);
}
}
public class Planning
{
public long Id { get; set; }
public DateTime EventsDate { get; set; }
public virtual List<AutoAllocation> AutoAllocations { get; set; }
public virtual List<Allocation> Allocations { get; set; }
public virtual TimeRange TimeRange { get; set; }
}
public class AutoAllocation
{
public long Id { get; set; }
public virtual Student Student { get; set; }
}
public class Allocation
{
public long Id { get; set; }
public virtual Student Student { get; set; }
}
public class Student
{
public int Id { get; set; }
}
public class TimeRange
{
public string StartTime { get; set; }
public string EndTime { get; set; }
}
Include provider and version information
EF Core version: Database provider: Microsoft.EntityFrameworkCore.SqlServer Target framework: .NET 8 EF Core Version: 8.0.4
Confirmed, thanks for filing and for the minimal repro, and sorry it took so long to get around to looking at it. I simplified the repro (see below).
Given this query:
_ = c.Plannings
.SelectMany(p => p.AutoAllocations, (p, a) => new { a.Student, studentPlanning = p })
.Union(c.Plannings.SelectMany(p => p.Allocations, (p, a) => new { a.Student, studentPlanning = p }))
.ToListAsync();
... we generate the following SQL (with EF 9.0):
SELECT [u].[Id], [u].[Id0], [p1].[Id], [p1].[TimeRange_EndTime], [p1].[TimeRange_StartTime]
FROM (
SELECT [s].[Id], [p].[Id] AS [Id0]
FROM [Plannings] AS [p]
INNER JOIN [AutoAllocation] AS [a] ON [p].[Id] = [a].[PlanningId]
INNER JOIN [Student] AS [s] ON [a].[StudentId] = [s].[Id]
UNION
SELECT [s0].[Id], [p0].[Id] AS [Id0]
FROM [Plannings] AS [p0]
INNER JOIN [Allocation] AS [a0] ON [p0].[Id] = [a0].[PlanningId]
INNER JOIN [Student] AS [s0] ON [a0].[StudentId] = [s0].[Id]
) AS [u]
LEFT JOIN [Plannings] AS [p1] ON [u].[Id] = [p1].[Id]
The LEFT JOIN at the end is because Planning has an owned entity (TimeRange); the query does a self-join of Planning (coming from inside the subquery) with itself [p1], to get the owned columns from [p1]*. However, the join is done against the wrong column; the [u].[Id] projected from inside the subquery corresponds to the Student Id, not to the planning (which is [u].[Id0].
* Note that the entire final LEFT JOIN seems unnecessary; it is done in order to get the owned TimeRange owned entity on Planning, but its columns (StartTime/EndTime) should simply be select inside and projected out, rather than re-joining the Planning table with itself. This is likely something that will be implemented properly out of the box for complex types.
Simplified repro
await using var context = new MyContext();
await context.Database.EnsureDeletedAsync();
await context.Database.EnsureCreatedAsync();
var c = new MyContext();
_ = c.Plannings
.SelectMany(p => p.AutoAllocations, (p, a) => new { a.Student, studentPlanning = p })
.Union(c.Plannings.SelectMany(p => p.Allocations, (p, a) => new { a.Student, studentPlanning = p }))
.ToListAsync();
public class MyContext : DbContext
{
public DbSet<Planning> Plannings { get; set; }
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
=> optionsBuilder
.UseSqlServer("Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0;Encrypt=false")
.LogTo(Console.WriteLine, LogLevel.Information)
.EnableSensitiveDataLogging();
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
modelBuilder.Entity<Planning>(config =>
{
config.HasMany(c => c.Allocations).WithOne().IsRequired();
config.HasMany(c => c.AutoAllocations).WithOne().IsRequired();
config.OwnsOne(x => x.TimeRange);//.Navigation(x => x.TimeRange).IsRequired();
});
modelBuilder.Entity<AutoAllocation>(config => config.HasOne(c => c.Student).WithMany().IsRequired());
modelBuilder.Entity<Allocation>(config => config.HasOne(c => c.Student).WithMany().IsRequired());
}
}
public class Planning
{
public long Id { get; set; }
public virtual List<AutoAllocation> AutoAllocations { get; set; }
public virtual List<Allocation> Allocations { get; set; }
public virtual TimeRange TimeRange { get; set; }
}
public class AutoAllocation
{
public long Id { get; set; }
public virtual Student Student { get; set; }
}
public class Allocation
{
public long Id { get; set; }
public virtual Student Student { get; set; }
}
public class Student
{
public int Id { get; set; }
}
public class TimeRange
{
public string StartTime { get; set; }
public string EndTime { get; set; }
}
We basically don't properly support owned entity types with set operations; SelectExpression.GenerateOwnedReferenceEntityProjectionExpression() - which is responsible for adding the owned columns, and therefore also the join to tables containing them - has various handling for rowsets that are SelectExpressions, but not for set operations (which can contain SelectExpressions). As a result, we go into a codepath that assumes that the rowset simply projects out columns with the names of the properties, so we get a column named Id although the correct column for the Id property is actually Id0.
I'm noting that mapping TimeRange as a complex property instead of an owned entity produces the right SQL, without the unneeded LEFT JOIN with self:
SELECT [s].[Id], [p].[Id] AS [Id0], [p].[TimeRange_EndTime], [p].[TimeRange_StartTime]
FROM [Plannings] AS [p]
INNER JOIN [AutoAllocation] AS [a] ON [p].[Id] = [a].[PlanningId]
INNER JOIN [Student] AS [s] ON [a].[StudentId] = [s].[Id]
UNION
SELECT [s0].[Id], [p0].[Id] AS [Id0], [p0].[TimeRange_EndTime], [p0].[TimeRange_StartTime]
FROM [Plannings] AS [p0]
INNER JOIN [Allocation] AS [a0] ON [p0].[Id] = [a0].[PlanningId]
INNER JOIN [Student] AS [s0] ON [a0].[StudentId] = [s0].[Id]
I'm putting this as a bug on the backlog for now. The fix isn't trivial, and with the move from owned entity types to complex types, this will become less important.