EF Core - Partitioning Filter

7

I am working with a database in which about 90% of the tables have a clustered index composed of two columns: FilialID and DataCriacao.

These tables are partitioned by FilialID; simply creating or removing a Filial (branch) causes these partitions to be re-created.

At the system level, FilialID is unique and can be obtained globally, so I need to add a default filter to all queries, so that every entity that has a FilialID property is filtered by it.

For example, if I run the following query:

var entity = db.EntitiesA
    .Include(x => x.EntitiesB)
    .Include(x => x.EntitiesC)
    .FirstOrDefault(x => x.EntityAID == id);

it should generate a query similar to the following:

SELECT *
FROM EntitiesA A
JOIN EntitiesB B ON A.EntityAID = B.EntityAID
JOIN EntitiesC C ON A.EntityAID = C.EntityAID
WHERE 
    A.EntityAID = @id AND
    A.FilialID = @filialId AND
    B.FilialID = @filialId AND
    C.FilialID = @filialId

MOTIVATION

Queries

Queries that use partition elimination could have comparable or improved performance with a larger number of partitions. Queries that do not use partition elimination could take longer to execute as the number of partitions increases.

For example, assume a table has 100 million rows and columns A, B, and C. In scenario 1, the table is divided into 1,000 partitions on column A. In scenario 2, the table is divided into 10,000 partitions on column A. A query on the table that has a WHERE clause filtering on column A will perform partition elimination and scan one partition. That same query may run faster in scenario 2 as there are fewer rows to scan in a partition. A query that has a WHERE clause filtering on column B will scan all partitions. The query may run faster in scenario 1 than in scenario 2 as there are fewer partitions to scan.

Queries that use operators such as TOP or MAX/MIN on columns other than the partitioning column may experience reduced performance with partitioning because all partitions must be evaluated.

The result of the query above will be the same whether I supply the FilialID or omit it, but when it is supplied, the partitions belonging to the other Filiais are skipped, which avoids unnecessary locks and improves query performance.

EDIT

I tried the Query Filter feature of EntityFramework-Plus, but it does not work with Includes.

public MyContext()
{
    this.Filter<EntityA>(set => set.Where(entity => entity.FilialID == Global.FilialID));
    this.Filter<EntityB>(set => set.Where(entity => entity.FilialID == Global.FilialID));
    this.Filter<EntityC>(set => set.Where(entity => entity.FilialID == Global.FilialID));
}
  • Note that this partitioning advantage only applies to the first table in the query. In your first example, only the parent table, EntitiesA, benefits from having FilialID in the filter. Since EntitiesB and EntitiesC are already tied to EntitiesA, probably by a foreign key, adding the FilialID field for those two is not only redundant, it does not help performance at all (incidentally, it can cause an unnecessary extra check). For the first table it is indeed worthwhile, and it would also make sense if you had to JOIN a table that has no FK to the others.

  • Now, back to the question: your intention is to automatically add a Where() to the query (maybe inside the DbContext) to filter on the FilialID field (provided that Global.FilialID has a value), without having to worry about it in the rest of the system. Is that right?

  • By the way, my first statement does not hold if the same EntityA ID can exist for different branches, i.e. if the table has a composite key.

3 answers

3

Gypsy's answer led me to EF Core 2.0, as well as its roadmap and preview.

In EF Core 2.0 it is possible to do the following:

public abstract class EntidadeBase
{
    public Guid FilialID { get; set; }
    public DateTime DataCriacao { get; set; }
}

public class EntidadeA : EntidadeBase
{
    public Guid EntidadeAID { get; set; }

    public ICollection<EntidadeB> EntidadesB { get; set; }
    public ICollection<EntidadeC> EntidadesC { get; set; }
}

public class EntidadeB : EntidadeBase
{
    public Guid EntidadeBID { get; set; }
    public Guid EntidadeAID { get; set; }
    public EntidadeA EntidadeA { get; set; }
}

public class EntidadeC : EntidadeBase
{
    public Guid EntidadeCID { get; set; }
    public Guid EntidadeAID { get; set; }
    public EntidadeA EntidadeA { get; set; }
}

public class MyContext : DbContext
{
    public static Guid FilialID { get; set; }

    public DbSet<EntidadeA> EntidadesA { get; set; }
    public DbSet<EntidadeB> EntidadesB { get; set; }
    public DbSet<EntidadeC> EntidadesC { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<EntidadeBase>().HasQueryFilter(p => p.FilialID == MyContext.FilialID);
    }
}
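
A hedged variation on the context above: the EF Core documentation shows HasQueryFilter configured per entity type and reading its value from a field on the context instance rather than from a static member. The sketch below assumes the classes from this answer, trimmed to two types; the names FilteredContext and _filialId are hypothetical. Registering the abstract EntidadeBase through Entity<EntidadeBase>() may also pull it into the model as a mapped type, so configuring each concrete type is the more conservative choice.

```csharp
using System;
using Microsoft.EntityFrameworkCore;

public abstract class EntidadeBase
{
    public Guid FilialID { get; set; }
    public DateTime DataCriacao { get; set; }
}

public class EntidadeA : EntidadeBase
{
    public Guid EntidadeAID { get; set; }
}

public class EntidadeB : EntidadeBase
{
    public Guid EntidadeBID { get; set; }
}

// Hypothetical context: the branch id is passed in per instance
// instead of being read from a static property.
public class FilteredContext : DbContext
{
    private readonly Guid _filialId;

    public FilteredContext(DbContextOptions<FilteredContext> options, Guid filialId)
        : base(options)
    {
        _filialId = filialId;
    }

    public DbSet<EntidadeA> EntidadesA { get; set; }
    public DbSet<EntidadeB> EntidadesB { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // One filter per concrete entity type; the abstract base is not mapped.
        modelBuilder.Entity<EntidadeA>().HasQueryFilter(e => e.FilialID == _filialId);
        modelBuilder.Entity<EntidadeB>().HasQueryFilter(e => e.FilialID == _filialId);
    }
}
```

With this in place, queries against EntidadesA and EntidadesB should carry the FilialID predicate automatically, and IgnoreQueryFilters() can switch it off for a single query when needed.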

2

I am not sure I understood your question correctly, but it seems you want to generate a generic query with a global filter, since the FilialID field is global in your application.

One approach that may meet your need is to expose an IQueryable for the desired table in the context, with the filter already applied. An example follows:

public class MyContext : DbContext
{
    public DbSet<EntityA> EntitiesASet { get; set; }

    public IQueryable<EntityA> EntitiesA
    {
        get
        {
            // Every consumer of this property gets the FilialID filter applied.
            return EntitiesASet
                       .Include(x => x.EntitiesB)
                       .Include(x => x.EntitiesC)
                       .Where(x => x.FilialID == Global.FilialID);
        }
    }
}

In this case, when you call the code:

var dados = db.EntitiesA.ToList();

The data will arrive already filtered.

Note: I have used approaches like this in Entity Framework 6.1.3 and have never tested them with EF Core, but I believe it will work.

  • FilialID is not the primary key, so x.EntityAID == Global.FilialID should not return anything unless there is a GUID collision (very unlikely).

  • FilialID should not affect the results of the queries, but to make good use of partitioning, all the clustered-index columns of the partitioned tables involved in the query must be included in the WHERE clause.

  • So regardless of whether you use Include or joins, or lazy or eager loading, the condition should be added.

  • @Tobiasmesquita, so what you need is to group by FilialID?

  • Don't you have a foreign key that links EntityA, EntityB and EntityC through the FilialID field?

2


As of the date of this answer, there are two alternatives:

  • Go back to Entity Framework 6;
  • Use a painful approach.

The painful approach is:

var entity = db.EntitiesA
    .Include(x => x.EntitiesB)
    .Include(x => x.EntitiesC)
    .Where(...)
    .FiltrarPorFilial();

.FiltrarPorFilial() is an extension method:

public static IEnumerable<T> FiltrarPorFilial<T>(this IQueryable<T> consulta)
{
    // ToList() runs the query first; the filtering below happens in memory.
    foreach (var registro in consulta.ToList())
    {
        // Here, yield return every record that meets the desired conditions.
    }
}
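
For comparison, a minimal sketch of a filter that stays on the server. Because the method above calls ToList() before filtering, the FilialID condition never reaches the generated SQL. Returning IQueryable<T> and composing a Where should keep the predicate in the query instead. The sketch assumes a shared base class like the EntidadeBase shown in another answer; the class name FilialQueryExtensions and the filialId parameter are hypothetical.

```csharp
using System;
using System.Linq;

public abstract class EntidadeBase
{
    public Guid FilialID { get; set; }
}

public static class FilialQueryExtensions
{
    // Composes the branch predicate onto the query without executing it,
    // so the LINQ provider can translate it into the WHERE clause.
    public static IQueryable<T> FiltrarPorFilial<T>(this IQueryable<T> consulta, Guid filialId)
        where T : EntidadeBase
    {
        return consulta.Where(registro => registro.FilialID == filialId);
    }
}
```

Note that this only constrains the root entity of the query; collections loaded through Include() would not be filtered by it.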
  • 1

    I saw the EF Core 2.0 Roadmap here: Global query filters (#5774) - Allows a vertical filter to be configured for an entity type. This filter then applies to all queries, including eager loading (i.e. Include()).

  • 1

    Just a reminder that a global filter is one thing, and an on-demand filter on Include() is another. The global filter does apply to Include(), but it does not have the same dynamic filter expression power as EntityFramework.DynamicFilters.
