EntityFramework6.Npgsql icon indicating copy to clipboard operation
EntityFramework6.Npgsql copied to clipboard

EntityFramework 6 PostgreSQL NpgSQL performance with big tables

Open alvisemion opened this issue 7 years ago • 2 comments

This is my environment:

  • Microsoft Framework .NET 4.5 (64bit)
  • EntityFramework 6 (file version: 6.1.3129.0)
  • EntityFramework6.Npgsql 3.1.1.0
  • Npgsql 3.1.0 (file vers 3.1.0.20516)

I've a PostgreSQL instance, version 9.5 on a Centos.

I've created an .NET application using Code first context.

In OnModelCreating i've set default schema

  protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
    	var schema = System.Configuration.ConfigurationManager.AppSettings["schema"];
    	modelBuilder.HasDefaultSchema(schema);
    	...
    }

I've a database with about 220 tables, only the first 5 tables have million rows:

I've a big table with ~4.7M rows. The table is PRATICA and ID_PRATICA is a bigint column marked as PRIMARY KEY.

1 COUNT VIA ENTITIY FRAMEWORK

I've profiled a simple count on primary key filter.

    var idPra = new Decimal(23572340);
    var num = svcPra.GetListCount(x => x.ID_PRATICA == idPra);

where svcPra is class that implement this simple method:

public virtual long GetListCount(
    	Expression<Func<TEntity, bool>> filter)
    {
    	IQueryable<TEntity> qry = Session.Set<TEntity>();
    
    	if (filter != null)
    	{
    		qry = qry.Where(filter);
    	}
    
    	return qry.LongCount();
    }

The sql generated (profiled) is

SELECT "GroupBy1"."A1" AS "C1"
FROM   (SELECT CAST (count(1) AS int8) AS "A1"
        FROM   "BOLLIEX"."PRATICA" AS "Extent1"
        WHERE  "Extent1"."ID_PRATICA" = p0 /* @p__linq__0 */) AS "GroupBy1"

where p0 is the parameter

RESULT: That count via Entity Framework takes over 3000 milliseconds I've ignored the first time query that is involved in well know "first query slow problem" issue

2 COUNT VIA NPGSQLCONNECTION

I've used the same connstring used in above Entity Frameork context.

var sql = @"
    SELECT ""GroupBy1"".""A1"" AS ""C1""
    FROM   (SELECT CAST (count(1) AS int8) AS ""A1""
            FROM   ""BOLLIEX"".""PRATICA"" AS ""Extent1""
            WHERE  ""Extent1"".""ID_PRATICA"" = 23572340 /* @p__linq__0 */) AS ""GroupBy1""
    ";
    
    using (var conn = new Npgsql.NpgsqlConnection(connString))
    {
    	conn.Open();
    	using (var cmd = new Npgsql.NpgsqlCommand(sql, conn))
    	{
    		var r = cmd.ExecuteReader();
    		while (r.Read())
    		{
    			var v = r[0];
    		}
    
    	}
    }

RESULT: That command takes ~20 milliseconds

3 COUNT VIA PGADMIN EDITOR

I've executed the above count query in pg admin query editor.

RESULT: The count takes ~15 milliseconds

4 CONDITIONS

I've tried to modify the schema, using public schema with no result. I've used VACUUM, ANALYZE, REINDEX command for table and for entire database with no result.

5 OTHER TESTS

  1. I've downloaded : EF sources EF.Npgsql sources Npgsql sources

    I come into deep debug and the problem seems not involve EF.Npgsql but how npgsql works.
    
  2. I've alterd PRATICA structure to a single column table with million records with no effects, slow query.

  3. I've reduced PRATICA table to 1M rows and the EF GetListCount takes around 2000 ms Than i've reduced PRATICA table to 300K rows and the EF GetListCount reduced to less than 1000ms

  4. I've tried to get a list of item instead counting. I've tried with and with out paging but fetching items take seconds!

  5. I've tried to change connection string parameter as pooling

  6. I've created a PostgreSql database instance on MS windows 2012 with Centos's same hardware configuration. No effect on performance, same slowness.

6 CONCLUSION

It seems that EF with PostgreSQL via npgsql affect performance with big (huge rows number) tables.

alvisemion avatar Feb 13 '18 11:02 alvisemion

A bit old maybe, but aint your problem here that you query a table with bigint but with a decimal parameter, which i guess makes postgres having to convert your id column to decimal while doing the where check?

Decimal vs Bigint?

McDoit avatar Jun 04 '19 14:06 McDoit

@McDoit sounds like an excellent suggestion...

roji avatar Jun 12 '19 17:06 roji