efcore icon indicating copy to clipboard operation
efcore copied to clipboard

How do I debug why a performance issue occurs when it doesn't if I use the EF given query?

Open TehWardy opened this issue 1 year ago • 6 comments

EF threw the below exception information at me indicating a timeout on what seems like a fairly complex query. The reality is the query was actually quite simple and the complexity was a result of the interplay between various query filters defined on the context.

Here's that query ...

        string[] systems = systemId.Split(',');
        return Db.RemittanceAdvices
             .Where(ra => systems.Contains(ra.SourceSystemId));

My model is configured with these query filters for the tables that are seemingly hit in the query ...

modelBuilder.Entity<ActiveTransaction>().HasQueryFilter(i => i.Buckets.Any());
modelBuilder.Entity<BucketActiveTransaction>().HasQueryFilter(i => i.Bucket != null);
modelBuilder.Entity<BucketRole>().HasQueryFilter(i => i.Role != null);
modelBuilder.Entity<Bucket>().HasQueryFilter(i => i.Roles.Count != 0);
modelBuilder.Entity<B2BUser>().HasQueryFilter(u => UserIsPortalAdmin || u.Id == AuthInfo.SSOUserId);
modelBuilder.Entity<B2BRole>().HasQueryFilter(r => UserIsPortalAdmin || r.Users.Any());
modelBuilder.Entity<B2BUserRole>().HasQueryFilter(r => UserIsPortalAdmin || r.UserId == AuthInfo.SSOUserId);

When I took the query given and set the params to the expected values and ran it I got 0 results but the query ran instantly. So I have a couple of questions ....

  1. How do I figure out the actual param values that was given to this to cause the timeout?
  2. How do I go about debugging performance (just some recommendations) with complex EF models?

Here's the query in question (I suspect a missing index or 2 but i'm seeing nested sub queries which raised some eyebrows) ...

Failed executing DbCommand (30,004ms) [Parameters=[@__TypedProperty_1='?' (DbType = Int32), @__ef_filter__UserIsPortalAdmin_0='?' (DbType = Boolean), @__ef_filter__SSOUserId_1='?' (Size = 450), @__systems_0='?' (Size = 4000)], CommandType='Text', CommandTimeout='30']

SELECT TOP(@__TypedProperty_1) [a].[Id], [a].[AgreedDiscountPaymentDate], [a].[CaptureDate], [a].[Comment], [a].[CostOfFunding], [a].[CreationDate], [a].[CreditorExternalState], [a].[CreditorGeneralLedgerDate], [a].[CreditorName], [a].[CreditorReference], [a].[CreditorState], [a].[CreditorVATReference], [a].[CurrencyId], [a].[DebtorExternalState], [a].[DebtorGeneralLedgerDate], [a].[DebtorName], [a].[DebtorPaymentDate], [a].[DebtorReference], [a].[DebtorState], [a].[DebtorVATReference], [a].[DeliveryDate], [a].[DueDate], [a].[FundableValue], [a].[FunderExternalState], [a].[FunderReference], [a].[FundingState], [a].[LastUpdated], [a].[Number], [a].[OfferAcceptanceDate], [a].[OfferExpiryDate], [a].[OfferRate], [a].[OfferValue], [a].[ProposedDiscountPaymentDate], [a].[ReceivedDate], [a].[RelatedTransactionReference], [a].[SourceSystemId], [a].[TaxPoint], [a].[TransactionId], [a].[TransactionReference], [a].[TransactionType], [a].[TypeId], [a].[UnpaidValue], [a].[Value], [a].[ValueBeforeTax], [a].[ValueOfTax]
FROM [Transactions].[ActiveTransactions] AS [a]
WHERE EXISTS (
    SELECT 1
    FROM [Masterdata].[BucketActiveTransactions] AS [b]
    INNER JOIN (
        SELECT [b0].[Id], [b0].[CreatedBy], [b0].[CreatedOn], [b0].[Description], [b0].[Key], [b0].[LastUpdated], [b0].[LastUpdatedBy], [b0].[Name], [b0].[ParentId], [b0].[Path]
        FROM [Masterdata].[Buckets] AS [b0]
        WHERE (
            SELECT COUNT(*)
            FROM [Masterdata].[BucketRoles] AS [b1]
            INNER JOIN (
                SELECT [r].[Id], [r].[CreatedBy], [r].[CreatedOn], [r].[Description], [r].[LastUpdated], [r].[LastUpdatedBy], [r].[Name], [r].[Privs], [r].[UsersArePortalAdmins]
                FROM [Security].[Roles] AS [r]
                WHERE @__ef_filter__UserIsPortalAdmin_0 = CAST(1 AS bit) OR EXISTS (
                    SELECT 1
                    FROM [Security].[UserRoles] AS [u]
                    WHERE (@__ef_filter__UserIsPortalAdmin_0 = CAST(1 AS bit) OR [u].[UserId] = @__ef_filter__SSOUserId_1) AND [r].[Id] = [u].[RoleId])
            ) AS [t] ON [b1].[RoleId] = [t].[Id]
            WHERE [b0].[Id] = [b1].[BucketId]) <> 0 OR (
            SELECT COUNT(*)
            FROM [Masterdata].[BucketRoles] AS [b1]
            INNER JOIN (
                SELECT [r].[Id], [r].[CreatedBy], [r].[CreatedOn], [r].[Description], [r].[LastUpdated], [r].[LastUpdatedBy], [r].[Name], [r].[Privs], [r].[UsersArePortalAdmins]
                FROM [Security].[Roles] AS [r]
                WHERE @__ef_filter__UserIsPortalAdmin_0 = CAST(1 AS bit) OR EXISTS (
                    SELECT 1
                    FROM [Security].[UserRoles] AS [u]
                    WHERE (@__ef_filter__UserIsPortalAdmin_0 = CAST(1 AS bit) OR [u].[UserId] = @__ef_filter__SSOUserId_1) AND [r].[Id] = [u].[RoleId])
            ) AS [t] ON [b1].[RoleId] = [t].[Id]
            WHERE [b0].[Id] = [b1].[BucketId]) IS NULL
    ) AS [t0] ON [b].[BucketId] = [t0].[Id]
    WHERE [a].[Id] = [b].[ActiveTransactionId]) AND [a].[SourceSystemId] IN (
    SELECT [s].[value]
    FROM OPENJSON(@__systems_0) WITH ([value] nvarchar(50) '$') AS [s]
)
ORDER BY [a].[DebtorPaymentDate] DESC, [a].[Id]

Here's the exception and stack trace information I logged out post the exception ...

An exception occurred while iterating over the results of a query for context type 'B2B.Data.EF.B2BDbContext'.
Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
 ---> System.ComponentModel.Win32Exception (258): The wait operation timed out.
   at Microsoft.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at Microsoft.Data.SqlClient.SqlDataReader.TryConsumeMetaData()
   at Microsoft.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption, Boolean shouldCacheForAlwaysEncrypted)
   at Microsoft.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean isAsync, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest)
   at Microsoft.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry, String method)
   at Microsoft.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior)
   at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReader(RelationalCommandParameterObject parameterObject)
   at Microsoft.EntityFrameworkCore.Query.Internal.SplitQueryingEnumerable`1.Enumerator.InitializeReader(Enumerator enumerator)
   at Microsoft.EntityFrameworkCore.Query.Internal.SplitQueryingEnumerable`1.Enumerator.<>c.<MoveNext>b__22_0(DbContext _, Enumerator enumerator)
   at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.<>c__DisplayClass28_0`2.<Execute>b__0(DbContext context, TState state)
   at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteImplementation[TState,TResult](Func`3 operation, Func`3 verifySucceeded, TState state)
   at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.Execute[TState,TResult](TState state, Func`3 operation, Func`3 verifySucceeded)
   at Microsoft.EntityFrameworkCore.Query.Internal.SplitQueryingEnumerable`1.Enumerator.MoveNext()
ClientConnectionId:e459882d-7e2e-484c-9d11-a582db962729
Error Number:-2,State:0,Class:11

TehWardy avatar Apr 29 '24 16:04 TehWardy

How do I figure out the actual param values that was given to this to cause the timeout?

@TehWardy you can enable EF's SQL logging, so that you get the precise SQL query which timeouts. You can use EnableSensitiveDataLogging() to get parameter values logged as well, but be careful with the security implications of doing that. Once these things are enabled, you should have everything you need in the logs to reproduce the slowness/timeout outside of EF and .NET.

How do I go about debugging performance (just some recommendations) with complex EF models?

That's a very open-ended question... First, EF model complexity generally shouldn't impact query performance (though it can definitely impact startup performance). I'd generally look at enable OpenTelemetry tracing at your ADO.NET driver, at which point you should be able to plot and analyze start/stop data for your database commands, and understand what's going on. Beyond that I'd need more specific/concrete questions to help out.

roji avatar Apr 29 '24 17:04 roji

Thanks @roji if I recall I hit problems Enabling EF's SQL logging, I'll take another run at it though.

You are correct though ... It's a very open ended question, I am thinking my issue is that I have "stacked" the query filters in such a way that EF is building inner joins in sub queries when I suspect if I was to hand crank this I would likely just build a base query with joins.

Do you know if it's possible to somewhat advise / adjust how EF builds the SQL for queries? Like maybe giving it strategy hints or something?

One idea i did have was to write the security checking stuff into a sproc or function then call that from the EF filters.

I have possibly a model architecture problem in that I have like primary entitties with collections of child entities and the model contains all these in what I call buckets with attached roles that grant users privs.

I can't seem to figure out an efficient way to model tenant trees of data that behave like a file system and have EF play nice.

TehWardy avatar Apr 29 '24 17:04 TehWardy

@TehWardy we really need a very concrete LINQ query with a very concrete SQL that seems somehow insufficient - at that point we can go deeper into this. Users shouldn't need to be experts or know about "strategy hints" in order to get EF to generate good SQL from LINQ. So ideally, as always simply submit a minimal, runnable repro that shows your query and we can discuss from there.

Otherwise, of course you can use SQL, either via a stored function/procedure, FromSql() or some other means.

roji avatar Apr 29 '24 18:04 roji

Ok thanks @roji i'll see what I can put together for you.

TehWardy avatar Apr 29 '24 19:04 TehWardy

@roji I've pulled out the key pieces of my codebase and created a public repo here ... https://github.com/TehWardy/EFSample

On compile and run this will create a new DB on a local default SQL instance an then run the query in question on it logging out the SQL from my initial post here.

With no data in the DB of course it runs instantly but this at least gives you an idea of what I am working with.

... let me know if you want anything else.

Many thanks for your help.

TehWardy avatar Apr 29 '24 21:04 TehWardy

@TehWardy to make sure we don't waste any time, can you please post the SQL you're getting and the SQL you'd like to get?

roji avatar Apr 29 '24 22:04 roji

Honestly @roji I think this is a case of like you said above ...

Users shouldn't need to be experts or know about "strategy hints" in order to get EF to generate good SQL from LINQ

My problem right now is trying to understand why when EF runs the SQL it reports from that Sample it's timing out after 30 seconds, yet when I take that same SQL and run it in SQL management studio it runs in under 1 second.

So of course my initial thought was ... Maybe I've misunderstood something, or perhaps there's a way I could write the filters "better" to encourage EF to avoid these 0(n) traps?

I'm guessing there is some difference between me running it and EF running it but I don't understand and my issue is figuring out what that is so I defer to you as the expert here.

I have a feeling this is not ideal but I'm at a bit of a loss right now.

Some additional background on the model design Perhaps a follow on question might add more context to my current thought process (I don't know if this is correct though) ... The reason for the query complexity boils down to the Query Filters on each of the tables involved in the query.

Essentially the logic boils down to something like this for the entire DB ...

  • Buckets are a sort of security scope into which other entity types are put.
  • Bucket have a collection of BucketRoles which in turn have a Role linked to them.
  • Roles in turn are attached to UserRoles then a User.

So when I ask for DbContext.{T} the access check is something like ... T => Bucket{T}s => Bucket => BucketRoles => Role

^ this is what I think could be a flat set with regular joins in the SQL query instead of the sub query and inner join logic it's currently building.

In memory I have cached the User and their attached UserRoles and Roles. So I'm able to by means of the HTTP context and session information from the request take my UserId and inject that into the context as AuthInfo.

From that "request context" / "scope" I pull the cached info and then ask my question with the QueryFilters ...

         modelBuilder.Entity<B2BUser>().HasQueryFilter(u => u.Id == AuthInfo.SSOUserId);
        modelBuilder.Entity<B2BRole>().HasQueryFilter(r => r.Users.Any());
        modelBuilder.Entity<B2BUserRole>().HasQueryFilter(r => r.UserId == AuthInfo.SSOUserId);

Only then do I attempt the actual query ...

IEnumerable<T> data = DbContext.Set<T>();

... where I rely on EF to do the grunt work here.

You've seen the current SQL output I suspect though that EF could resolve cases like this differently with a flat set join (hence my point that I might do it differently for perf reasons).

My thinking is that if EF generated something like this instead the perf problem would be gone ...

SELECT ...
	FROM [Payments].[RemittanceAdvices] AS [a] ...
	JOIN [Masterdata].[BucketRemittanceAdvices] AS [b] ...
	JOIN [Masterdata].[Buckets] AS [b0] ...
	JOIN [Masterdata].[BucketRoles] AS [b1] ...
	JOIN [Security].[Roles] AS [r] ...
	JOIN [Security].[UserRoles] AS [u] ...
		WHERE [a].[Id] = [b].[RemittanceAdviceId]) AND [a].[SourceSystemId] IN (
		SELECT [s].[value]
		FROM OPENJSON(@__systems_0) WITH ([value] nvarchar(50) '$') AS [s]
		ORDER BY [a].[DebtorPaymentDate] DESC, [a].[Id]

I'm not a SQL genius so i'm not 100% on the join type that might be appropriate here, that example is merely for illustrative purposes.

... The plot thickens: by analyzing the query a little more I noticed this little gem in the middle of it ...

              WHERE (
                  SELECT COUNT(*)
                  FROM [Masterdata].[BucketRoles] AS [b1]
                  INNER JOIN (
                      SELECT [r0].[Id], [r0].[CreatedBy], [r0].[CreatedOn], [r0].[Description], [r0].[LastUpdated], [r0].[LastUpdatedBy], [r0].[Name], [r0].[Privs], [r0].[UsersArePortalAdmins]
                      FROM [Security].[Roles] AS [r0]
                      WHERE @__ef_filter__UserIsPortalAdmin_0 = CAST(1 AS bit) OR EXISTS (
                          SELECT 1
                          FROM [Security].[UserRoles] AS [u]
                          WHERE (@__ef_filter__UserIsPortalAdmin_0 = CAST(1 AS bit) OR [u].[UserId] = @__ef_filter__SSOUserId_1) AND [r0].[Id] = [u].[RoleId])
                  ) AS [t] ON [b1].[RoleId] = [t].[Id]
                  WHERE [b0].[Id] = [b1].[BucketId]) <> 0 OR (
                  SELECT COUNT(*)
                  FROM [Masterdata].[BucketRoles] AS [b1]
                  INNER JOIN (
                      SELECT [r0].[Id], [r0].[CreatedBy], [r0].[CreatedOn], [r0].[Description], [r0].[LastUpdated], [r0].[LastUpdatedBy], [r0].[Name], [r0].[Privs], [r0].[UsersArePortalAdmins]
                      FROM [Security].[Roles] AS [r0]
                      WHERE @__ef_filter__UserIsPortalAdmin_0 = CAST(1 AS bit) OR EXISTS (
                          SELECT 1
                          FROM [Security].[UserRoles] AS [u]
                          WHERE (@__ef_filter__UserIsPortalAdmin_0 = CAST(1 AS bit) OR [u].[UserId] = @__ef_filter__SSOUserId_1) AND [r0].[Id] = [u].[RoleId])
                  ) AS [t] ON [b1].[RoleId] = [t].[Id]
                  WHERE [b0].[Id] = [b1].[BucketId]) IS NULL

This suggested to me that if I was able to optimise my query filter logic in some way or grant EF some "hint" I would not need this inner selection and join logic on the same table set twice. This could also be why in some logical circumstances I'd hit weird timeout situations due to contentions (maybe, I might be reaching here though).

Have I missed something?

TehWardy avatar Apr 30 '24 09:04 TehWardy

Ok interesting ... VS recommends changing all Collection .Any() calls to .Count > 0 calls so my Bucket query filter looks like this ...

modelBuilder.Entity<Bucket>().HasQueryFilter(i => i.Roles.Count != 0);

A friend just pointed out that VS is not aware of what EF will do in translating this to SQL so it's making C# only best practice advice. When I convert this to ...

modelBuilder.Entity<Bucket>().HasQueryFilter(i => i.Roles.Any());

The problem is gone!

It seems i'm falling victim to VS not understanding EF / SQL. Of course it makes sense when you consider that SQL evaluations of EXISTS(...) vs SELECT COUNT() evaluations are drastically different.

Is this a weird edge case that EF can handle? I don't even know if this is to be honest ... but sheesh! My problem is at least solved.

I will leave this to you to decide if you want to close the ticket or do more investigation on it though @roji but I've learned a valuable lesson here about attention to detail and the differences between .Count() and .Any()

TehWardy avatar Apr 30 '24 10:04 TehWardy

Interesting, thanks for diving into it @TehWardy... Especially given the recommendation to replace Any() with .Count > 0, EF should probably identify this and normailze the latter to the former - I'll take a look.

roji avatar Apr 30 '24 10:04 roji

Closing in favor of #33644

roji avatar Apr 30 '24 13:04 roji

Duplicate of #33644

roji avatar Apr 30 '24 13:04 roji

FYI merged #33644 which causes EF to treat Count > 0 (and Count != 0) just like Any(), so this should no longer be a problem.

roji avatar May 02 '24 05:05 roji