Description
Unlike previous versions of EF, calling `Add()` on an object in EF7 currently does not mark any of its related objects as added.
We know this is not the only behavior that customers expect from `Add()` (see #2434 and #2387 as examples). However, this is certainly a behavior we wanted to expose, because in past versions of EF we received a lot of feedback from customers who did not expect certain related objects to be marked as added.
On one hand, this kind of feedback has been very common among customers trying to work with disconnected graphs: since EF does not provide a high-level API for disconnected graphs, it is quite usual in those situations to use an algorithm that walks the graph and explicitly marks each individual object with a specific state, which makes it desirable to have an API that won't have any side effects on the state of other objects.
On the other hand, even customers who used `Add()` in a standard unit of work pattern (and for whom it is very useful that the method automatically adds related objects at once) were often surprised that previous versions of EF would mark the whole graph as added.
From the perspective of those customers, we seem to fundamentally lack the ability to differentiate between objects that should be added and those that shouldn't. However, there doesn't seem to be a single rationale that informs their expectations.
As a general approach, we have talked about having two different APIs for the two different behaviors, e.g.:
- `Add()` for single objects would be for customers writing lower-level logic who need finer control
- `Add()` for graphs would be for customers using the context in a unit of work pattern who want the API to smartly mark additional objects as added
We could simply call them `AddSingleObject()` and `AddGraph()`, but another option is to make `Add()` the method that tries to automatically infer what should be added, and to have a separate lower-level, fine-control API for setting the state of a single object to added, e.g. what we already have in `EntityEntry.State`.
Regardless of the shape of the API, we need to figure out what behavior to use for `Add()` for graphs.
Here are some options:
- Leverage containment/aggregates:

  Aggregate-oriented databases (e.g. document databases) offer a good frame of reference. Adding an aggregate root object inherently adds the whole aggregate but doesn't add anything external to the aggregate.

  From an O/RM perspective, the key to emulating this behavior would be to have knowledge of the shape of the aggregate. As mentioned in the comments below, GraphDiff is an example of an extension for EF6 that layers this type of knowledge on top of the EF model to enable smarter decisions on state transitions, e.g. if a navigation property is "painted" with a call to `AssociatedCollection()` or `AssociatedEntity()`, then adds won't spread through the relationship in the direction of the navigation property. On the other hand, if the navigation property is touched by a call to `OwnedCollection()` or `OwnedEntity()`, then adds will spread through that relationship in the direction of the navigation property.

  We currently don't have the ability to model aggregates in EF (it is something that we have talked about adding in the future, see "Take advantage of ownership to enable aggregate behaviors in model" #1985), but we do have at least a rudimentary understanding of child vs. non-child objects, which we use (or are currently planning to use) to drive cascading/orphan deletes for `Remove()`. In that sense, @tonysneed offered interesting feedback some time ago, saying that he expected the behavior of `Add()` for graphs to be more consistent with `Remove()`:

  > with the vCurrent behavior ... if I add a territory to an employee, then the territory will be marked as Added and EF will attempt to add it to the Territories table. If I want to just add the relationship to an existing territory, then I need to explicitly mark it as Unchanged, which makes sense but isn't all that intuitive. It's also inconsistent with the behavior for removing entities from a relationship, where removing a territory from an employee will not mark it as Deleted, without the need to explicitly mark it as Unchanged.

  We could indeed use this kind of rudimentary "aggregates by convention" based on FK relationships as the basis of the behavior of `Add()` for graphs. If we later improve EF to be more aware of aggregates, `Add()` for graphs would automatically pick up the new information.

- Leverage key values:

  Another approach @rowanmiller suggested exploring is to use key values to determine which objects need to be added.

  Note that if the rules are based on aggregates alone, the following code would insert a new Post but would assume that the associated Blog already exists:

  ```csharp
  var blog = new Blog { Title = "Blogging from the leaf" };
  context.Posts.Add(new Post { Contents = x, Blog = blog });
  ```

  In order to get the Blog object added as well, you would need to indicate it explicitly:

  ```csharp
  var blog = context.Blogs.Add(new Blog { Title = "Blogging from the leaf" });
  context.Posts.Add(new Post { Contents = x, Blog = blog });
  ```

  Note that if the key of Blog is generated, we would actually have enough information to know whether it is an existing entity just by looking at its key. Indeed, if we leverage key values we can infer which objects need to be Added irrespective of the shape of the aggregate and the direction of relationships: assuming generated keys, the first snippet would add both Blog and Post, regardless of the fact that Blog is the principal and Post is the object being added.

  Note that this would have equivalent semantics to the aggregate approach in a number of cases, e.g. for identifying relationships, any child object of an added principal should have temporary keys too and would therefore become added automatically.

  On its own, the approach won't work well for entities that don't have generated keys, as those will look like existing entities to EF. We need to understand whether that is OK or whether we should fall back to a different strategy.

- Hybrid:

  The idea is to figure out a way in which we can leverage both aggregates and keys, e.g. leverage keys but fall back to aggregates for objects that don't have generated keys.
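
To make the hybrid option more concrete, here is a minimal sketch of the decision rule on a toy object model. Nothing here is EF API: `ToyEntity`, `ToyState`, `HybridAdd`, the `Owned` flag, and the convention that a key of `0` means "not yet assigned" are all assumptions made for illustration.

```csharp
using System.Collections.Generic;

// Toy model only; these are hypothetical types, not EF types.
public enum ToyState { Unchanged, Added }

public sealed class ToyEntity
{
    public string Name { get; set; } = "";
    public int Key { get; set; }               // assumption: 0 means "no key assigned yet"
    public bool HasGeneratedKey { get; set; }  // assumption: known from the model
    // Related entities; Owned marks the target as part of this entity's aggregate.
    public List<(ToyEntity Target, bool Owned)> Related { get; } = new();
}

public static class HybridAdd
{
    // The root is always Added; states of related entities are inferred.
    public static Dictionary<ToyEntity, ToyState> AddGraph(ToyEntity root)
    {
        var states = new Dictionary<ToyEntity, ToyState>();
        Visit(root, ToyState.Added, states);
        return states;
    }

    private static void Visit(ToyEntity entity, ToyState state,
                              Dictionary<ToyEntity, ToyState> states)
    {
        // The first visit decides the state; this also guards against cycles.
        if (!states.TryAdd(entity, state)) return;

        foreach (var (target, owned) in entity.Related)
        {
            ToyState next;
            if (target.HasGeneratedKey)
            {
                // Key-based rule: an unset generated key means a new object.
                next = target.Key == 0 ? ToyState.Added : ToyState.Unchanged;
            }
            else
            {
                // Fallback: adds spread only to owned children of an added object.
                next = owned && state == ToyState.Added
                    ? ToyState.Added
                    : ToyState.Unchanged;
            }
            Visit(target, next, states);
        }
    }
}
```

Under these assumptions, adding a Post that references a new Blog with a generated key marks both as added, while a referenced entity with a non-generated key that is not owned is treated as existing.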
Modified objects on the edge
It seems that in all cases, any dependent entity that we won't add but that points to an added principal should be marked as modified so that its FK value can be updated.
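
As a rough illustration of this rule, the sketch below runs a fix-up pass over a set of already-inferred states. The types `EdgeEntity`, `EdgeState`, and `EdgeFixup` are hypothetical and exist only for this example; they are not part of EF.

```csharp
#nullable enable
using System.Collections.Generic;

// Toy model only; these are hypothetical types, not EF types.
public enum EdgeState { Unchanged, Added, Modified }

public sealed class EdgeEntity
{
    public string Name { get; set; } = "";
    public EdgeEntity? Principal { get; set; }  // the entity this one's FK points to, if any
}

public static class EdgeFixup
{
    // Marks as Modified every Unchanged dependent whose principal is Added,
    // so that its FK value would be written once the principal's key is known.
    public static void MarkEdgeDependents(IDictionary<EdgeEntity, EdgeState> states)
    {
        // Snapshot the keys so the dictionary can be updated while iterating.
        foreach (var entity in new List<EdgeEntity>(states.Keys))
        {
            var principal = entity.Principal;
            if (states[entity] == EdgeState.Unchanged
                && principal != null
                && states.TryGetValue(principal, out var principalState)
                && principalState == EdgeState.Added)
            {
                states[entity] = EdgeState.Modified;
            }
        }
    }
}
```

Dependents that are themselves added are left alone, since inserting them writes the FK value anyway.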