-
Notifications
You must be signed in to change notification settings - Fork 1k
[data design] Observations #14696
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
This comment was marked as outdated.
This comment was marked as outdated.
I think it might make sense to have
I could totally forsee an
JTI seems to be nice because it would allow our observations for niche or interesting use-cases add on additional structured data. Cribbing on JSONB for the payload to start makes good sense as we learn, but I think there is an argument to be made that in the future we may want more columnar data. |
Great comments! An interesting part here is that the concept of the Then use JTI to have distinct tables for ProjectObservations, ReleaseObservations, UserObservations, et al? I'll continue tinkering with this model to figure out how these actually relate to each other and put something here for to look at. |
After further tinkering with this data model even more, I'm coming to a simpler approach for the Observers side. A model & mixin that looks a little like: class Observer(Base):
__tablename__ = "observer"
id: Mapped[UUID] = mapped_column(primary_key=True)
...
class RefObserverMixin:
observer_id: Mapped[UUID | None] = mapped_column(
ForeignKey("observer.id"), comment="Observer ID. Null if not set"
)
@declared_attr.directive
def observer(cls) -> Mapped[Observer]:
return relationship(
Observer, cascade="all, delete-orphan", single_parent=True
) should allow us to add a(nother) mixin to One drawback is that an Observer has no clue where it came from. 🤔 |
sqlalchemy-utils has a Generic Relationship that looks like it might be closer to the thing we want: https://sqlalchemy-utils.readthedocs.io/en/latest/generic_relationship.html but there might be an issue with sqlalchemy 2.0.22. I'll try it out. |
Considering there's still an unresolved issue with sqlalchemy-utils vs sqlalchemy, I kept reading the docs and tinkering with different implementations until I found one that worked well enough for now. In this case, we get an ObserverAssociation table that stores an ID, a discriminator type, and the related ID of the thing we're discriminating with (User today, anything mixing in the relation tomorrow). This adds a column to the mixed-in model (User) to store the related observer ID so we can "step" through the Observer association tables to get Observations (which have yet to exist). I'll proceed with adding an Observer model and mixin. One outstanding question I have is what do we expect to happen if the original Observer parent is removed? Do we expect the associated Observer to also go away? And what of any of the Observations that Observer made? Should they go away too? Or do we accept that if we remove an Observer's parent, we get an orphaned Observer and preserve their Observations. I'm going with "Don't delete the Observer, since they own Observations." We might do something where we re-link to |
After working through a few approaches, I have been able to wrangle our code and sqlalchemy to produce this (abbreviated) data model so far (Thanks @ewdurbin and DBML for making the representation relatively easy!) ![]() In this example, I've mixed in an Observer to The next step here is to determine how to model the relationships so I can ask any mixed-in Observer for all of the *Observations they made. |
See #14834 for what I've come up with so far. There's something I'm missing on the creation patterns through proxies, so have done some in-place hackery that needs to be resolved, as well as being able to reference all *Observations from a given |
@miketheman did #14834 fully implement the concerns you had remaining with proxies? |
@ewdurbin I think it did - it's still not as "ActiveRecord-y" as I want it to be, but it is good enough for how we use it. Thanks for following up! |
Today we use
Event/HasEvents
to mix in to a model to produce distinct*_events
tables for the associated model. It's cool, it works. 🎉 The pattern is use for Events is calledtable_per_related
.We often will query for events like
someproject.events
to get all events related to theProject
,somerelease.events
for all events related to aRelease
, etc.Since the only ForeignKey is to the parent (mixed in) class, there's rarely ever a need to express "find all Events for all Releases of a given Project".
In thinking about Observations as a similar pattern, I'm considering adding a FK to the User that provided the Observation.
If we wanted to ask "Show me all kinds of Observations made by this User", we would need to collect the observations from any mixed-in models, a la (
user.project_observations + user.release_observations + ...
)If we wanted to get all observations from a related User in one-shot, a la
user.observations
(using relationship on a polymorphic identity) we have to model things a little differently.We have two options here:
Read more in https://docs.sqlalchemy.org/en/20/orm/inheritance.html
We currently use JTI with OIDC publishers resulting in 3 discrete tables and +1 for any new publisher added in the future. Looks a little like this:
JTI works well for OIDC since the distinct tables differ in their structure, whereas
Events
(and the futureObservations
) conform to a more uniform structure.Using STI to show a proposed table as an example like this:
Questions in my head are:
user.id
a good or bad idea?I think good, but maybe not?
user.observations
?I can imagine wanting to look at what Observations a User had provided over time, probably other use cases.
STI seems simpler to me as there's fewer tables to content with, and fewer JOINs in the resulting query.
(These questions are similar to ones I had about Roles, TeamRoles, and OrganizationRoles, so it's possible that the learnings here will materialize in changes over there as well)
The text was updated successfully, but these errors were encountered: