Single-table hierarchical data in EF Core
StackOverflow user Marlolsh had a database table with a simple hierarchy – one of the columns was a parent ID, referencing the same table. When getting all entities from the database table as a JSON tree, some entities appeared twice, and they wanted some help to remove duplicates from the output.
The entity class looked similar to this:
public class Entity
{
[Key]
public int Id { get; set; }
public string Description { get; set; }
[ForeignKey("ParentId")]
[JsonIgnore]
public int? ParentId { get; set; }
[JsonIgnore]
public virtual Entity Parent { get; set; }
public virtual IEnumerable<Entity> Children { get; set; }
}
The code for selecting all entities from the database was something like this:
return dbContext.Entities
.Include(x => x.Children)
.ToList();
The resulting JSON output returned included duplicates of all non-top-level entities:
[
{
"id": 5,
"description": "Animal"
"children": [
{
"id": 13
"description": "Dog"
"children": null
},
{
"id": 17
"description": "Cat"
"children": null
}
]
},
{
"id": 13
"description": "Dog"
"children": null
},
{
"id": 17
"description": "Cat"
"children": null
}
]
In some cases, this would be the desired result, but in most cases I would think not. When you have a tree-like structure like this, you should probably not get child-elements included on the top level. Instead, what Marlolsh is looking for makes much more sense:
[
{
"id": 5,
"description": "Animal"
"children": [
{
"id": 13
"description": "Dog"
"children": null
},
{
"id": 17
"description": "Cat"
"children": null
}
]
}
]
Solution
The problem is that too many entities are included on the top level, so the solution is to add a filtering on the top level. By including a call to the Where
method, we can keep only those entities that are not the child of some other entity. We do that by looking for entities where Parent == null
:
return dbContext.Entities
.Where(x => x.Parent == null)
.Include(x => x.Children)
.ToList();
The Where
clause only affects the top-level entities, and the Include
method makes sure that all children are included in the result. Only the top-level entities are present at the root level of the resulting JSON output.