
Preparing for Federation: "Remote" Users / Federated ID #9045

@criztovyl

Hello,

I am working with the ForgeFed working group on federating software development platforms (already mentioned in #1612 and #184).

While the spec is still in the works, I think some basics that are independent of the actual content and requirements of the spec can already be implemented to prepare for federation.
Building on these preparations, it might be easier to build a Gitea-ForgeFed PoC to better understand the things the spec needs to cover.
Or you could even implement a Gitea-specific federation, although I guess that might be unfair to ForgeFed :P

To me, one of these basics is addressing the difference between the assumptions that centralized and decentralized systems make about users.
Federation is a decentralized system made up of centralized systems, so these differences will need to be addressed if you want to support federation.

One (to me central) difference in assumptions can be expressed through the question
of the interchangeability of the "user" concept of centralized systems and decentralized systems.

For example, in both centralized and decentralized systems a PR object will be attributed to something that can be conceptualized as a user.

In a centralized system that user can be uniquely identified and addressed just by the username.
While usernames are still a thing in federated systems, there they are neither unique nor sufficient to identify the user; you need additional information for that. In federated systems this information normally is the instance. Together they then form the federated ID.
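
As a minimal sketch of that idea, assuming the simple username+instance form: the `FederatedID` type, its fields, and the `user@instance` rendering are illustrative names chosen here, not part of Gitea or ForgeFed. It only shows that the same username on two instances yields two different IDs.

```go
package main

import (
	"fmt"
	"strings"
)

// FederatedID is a hypothetical pairing of a username with its home instance.
type FederatedID struct {
	Username string // not unique across the federation
	Instance string // host name of the user's home instance
}

// String renders the ID in a user@instance form (an assumed display format).
func (f FederatedID) String() string {
	return f.Username + "@" + f.Instance
}

// ParseFederatedID splits "user@instance" at the last "@".
func ParseFederatedID(s string) (FederatedID, error) {
	i := strings.LastIndex(s, "@")
	if i <= 0 || i == len(s)-1 {
		return FederatedID{}, fmt.Errorf("invalid federated ID %q", s)
	}
	return FederatedID{Username: s[:i], Instance: s[i+1:]}, nil
}

func main() {
	a := FederatedID{Username: "alice", Instance: "git.example.org"}
	b := FederatedID{Username: "alice", Instance: "forge.example.net"}
	// Same username, different instance: the IDs are not equal.
	fmt.Println(a, b, a == b)
}
```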

(Please excuse the cryptic nature of the following paragraph, but for completeness I want to mention it here anyway:) While user and instance would be enough at the UI level, they are not enough for ForgeFed at the technical level. ForgeFed builds on ActivityPub, which leverages Linked Data concepts, which in turn build on the Web, so an internet-level username+instance pair is not enough and you won't get around URIs at the Web level. I leave WebFinger aside here, because of that and because of its limitations already mentioned in the linked issues.

Anyway, this issue is concerned with changing the central identifier of users from their username to a federated ID (in whatever form).

These are the strategies I am aware of for adding a federated ID to a data model (they are neither complete nor genuinely mine):

  • add the federated ID to the user entity
  • use a dedicated entity for users coming from federation
  • split the user entity into "Authentication" and "Identity" entities, where local users have both and federated users just the latter.

I, personally, am in favour of the third option because to me that is the clearer way to address the changes that are required to the object model.

If you just add the federated ID to the user entity, you also have to break a central assumption about the user entity: that usernames are unique. I feel uncomfortable with the implications of this, mainly the mix-up of the local-authentication and general-identity domains.

You could also introduce federated users as new concept with their own entity.
This also means you will have to touch all the areas concerned with displaying or attributing users. I think you won't get around that if you want to do it well and right. But additionally you will have to duplicate things that you already have for users, like profiles, maybe display logic, etc. (Go not having generics makes this even harder to implement and maintain without redundancy.)

This brings me to the third approach, separating the user entity.
It builds on the assumption that the user entity actually is made up of two entities, the authentication (for login) entity and the identity entity.
The two entities are not independent in centralized systems, so they are combined into one. In federated systems on the other hand they are independent: Not every identity has login information associated with it. Due to that it does not make sense to combine them. This leads to the proposal of splitting up the user entity.
You then have one entity with the (login) username, password and the corresponding identity reference, and one entity with all the other identity stuff, like the display name, (non-unique) username, federated ID, avatar URL, ...
While you still need to touch a lot of code, in this case you can then use the identity entity for both local and federated identities, sharing the logic surrounding them (e.g. profiles) :)
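
To make the split a bit more concrete, here is a rough sketch under stated assumptions. All type, field, and function names (`Identity`, `Authentication`, `Store`, `ProfileLine`, ...) are hypothetical and only illustrate the idea; the in-memory maps stand in for database tables. Local users get both records, federated users only an `Identity`, and display logic works on `Identity` alone.

```go
package main

import "fmt"

// Identity holds what is displayed and attributed, for local and federated users alike.
type Identity struct {
	ID          int64
	FederatedID string // assumed to be a URI; the only globally unique key
	Username    string // no longer unique once remote identities exist
	DisplayName string
	AvatarURL   string
}

// Authentication holds login data and exists only for local users.
type Authentication struct {
	LoginName    string // unique on this instance
	PasswordHash string
	IdentityID   int64 // reference to the Identity this login belongs to
}

// Store is an in-memory stand-in for the two tables.
type Store struct {
	identities map[int64]Identity
	logins     map[string]Authentication
}

// NewStore returns an empty Store.
func NewStore() *Store {
	return &Store{
		identities: map[int64]Identity{},
		logins:     map[string]Authentication{},
	}
}

// AddIdentity stores an identity, local or federated.
func (s *Store) AddIdentity(id Identity) { s.identities[id.ID] = id }

// AddLogin attaches login data to an existing identity.
func (s *Store) AddLogin(a Authentication) { s.logins[a.LoginName] = a }

// CanLogin reports whether an identity has local login data attached.
func (s *Store) CanLogin(identityID int64) bool {
	for _, a := range s.logins {
		if a.IdentityID == identityID {
			return true
		}
	}
	return false
}

// ProfileLine is display logic shared by local and federated identities.
func ProfileLine(id Identity) string {
	return fmt.Sprintf("%s (%s)", id.DisplayName, id.FederatedID)
}

func main() {
	s := NewStore()
	local := Identity{ID: 1, FederatedID: "https://git.example.org/alice", Username: "alice", DisplayName: "Alice"}
	remote := Identity{ID: 2, FederatedID: "https://forge.example.net/alice", Username: "alice", DisplayName: "Other Alice"}
	s.AddIdentity(local)
	s.AddIdentity(remote)
	s.AddLogin(Authentication{LoginName: "alice", PasswordHash: "not-a-real-hash", IdentityID: local.ID})

	fmt.Println(ProfileLine(local), s.CanLogin(local.ID))
	fmt.Println(ProfileLine(remote), s.CanLogin(remote.ID))
}
```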

That is my stance on this so far. While I am in favour of my approach, feel free to advocate for the other approaches. In the end it's mainly a point of view, and I am open to input. :)

Thanks for reading.

Activity

lafriks

lafriks commented on Nov 19, 2019

@lafriks
Member

It really depends on the use case, i.e. on what a federated/external user will be able to do. If they cannot host repositories, then it is OK to have a federated user table and move display name, username, etc. to it. Though it would require a total rewrite of every part of the code where the local user table is used so that it uses the federated user table instead, which could be a hell of a lot of work :)

Label added on Dec 14, 2019: type/proposal (The new feature has not been accepted yet but needs to be discussed first.)
lunny

lunny commented on Dec 14, 2019

@lunny
Member

I totally agree with @lafriks. Before we begin to do that, we need to answer some questions:

  1. Who wants this feature? Personal gitea user / Companies with private gitea / Git hosting website via gitea or others?
  2. Why do they need this feature?
  3. How do they want to use this feature?
criztovyl

criztovyl commented on Dec 15, 2019

@criztovyl
Author

> If they cannot host repositories, then it is OK to have a federated user table and move display name, username, etc. to it.

Federated users can have repositories, but hosted on the instance of the user.
In other words, you don't only have federated users but also federated projects (incl. repos, issues, PRs, etc). But that's out of scope of this issue.

> 1. Who wants this feature? Personal gitea user / Companies with private gitea / Git hosting website via gitea or others?

People like me who don't like the idea of development of Free and Open-Source software being centralized on, well, centralized, non-free services. (GitHub)

> 2. Why do they need this feature?

While there exist Free alternatives like Gitea and GitLab, their instances are isolated from each other and therefore have usage disadvantages. (And are subject to network effects.)
Even if I am motivated to make an account on, let's say, Debian GitLab, it's limited to that instance.
In the worst case each project has its own instance and I have many accounts to check. (Yes, there are email notifications, but we want something web-based, otherwise Git+ML would be enough, too.)

> 3. How do they want to use this feature?

For federated collaboration, breaking GitHub's network effect.

I have 1 account on my home instance and can collaborate with any project that supports federation (i.e. ForgeFed).

criztovyl

criztovyl commented on Dec 15, 2019

@criztovyl
Author

gitea/models/user.go

Lines 94 to 97 in 7217b70

ID int64 `xorm:"pk autoincr"`
LowerName string `xorm:"UNIQUE NOT NULL"`
Name string `xorm:"UNIQUE NOT NULL"`
FullName string

Is it correct that

  • Name is the user name (criztovyl) and
  • FullName is the display name ("Christoph Schulz")

?

But what is the reason for LowerName? (strings.ToLower(u.Name))

axifive

axifive commented on Dec 15, 2019

@axifive
Member

In order not to use lower(Name) in SQL queries.

criztovyl

criztovyl commented on Dec 17, 2019

@criztovyl
Author

> In order not to use lower(Name) in SQL queries.

But why at all? Don't want to change it, just understand it. :)


I started toying a little bit around and stupidly moved some attributes from user.go (type User) to a new user_identity.go (type Identity) and added some code to AfterLoad that fills the old attributes with values from the identity.

Afterwards I ran into problems writing a migration that moves the attributes from User to Identity. I will look further into this by studying existing migrations and xorm docs, but would be really happy for some further hints. :)

I will share some commits next time.

axifive

axifive commented on Dec 17, 2019

@axifive
Member

> But why at all? Don't want to change it, just understand it. :)

Because the username is case-insensitive, and when a user sends requests like https://try.gitea.io/AxIFiVe or https://try.gitea.io/Axifive we need to quickly find the unique user, so we just call strings.ToLower(username) and send a simple SQL query.
Converting fields to lowercase for selection is a very expensive DB operation; it is much easier to add a second field.
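
The idea can be sketched roughly as follows. This is an in-memory illustration, not Gitea's actual lookup code: the `userIndex` map is a hypothetical stand-in for a unique index on the `LowerName` column, and `Add`/`Find` are names chosen here. The name is lowercased once in Go, so the lookup is a plain equality match on the precomputed value.

```go
package main

import (
	"fmt"
	"strings"
)

// User mirrors only the fields discussed above.
type User struct {
	ID        int64
	Name      string // as typed by the user, e.g. "AxIFiVe"
	LowerName string // strings.ToLower(Name), stored alongside Name
}

// userIndex stands in for a unique database index on lower_name.
type userIndex map[string]User

// Add stores a user and rejects names that differ only by case.
func (idx userIndex) Add(id int64, name string) error {
	lower := strings.ToLower(name)
	if _, taken := idx[lower]; taken {
		return fmt.Errorf("username %q is already taken", name)
	}
	idx[lower] = User{ID: id, Name: name, LowerName: lower}
	return nil
}

// Find lowercases the requested name once and does an exact-match lookup,
// instead of lowercasing every stored name at query time.
func (idx userIndex) Find(name string) (User, bool) {
	u, ok := idx[strings.ToLower(name)]
	return u, ok
}

func main() {
	idx := userIndex{}
	if err := idx.Add(1, "Axifive"); err != nil {
		fmt.Println(err)
		return
	}
	u, ok := idx.Find("AxIFiVe")
	fmt.Println(u.Name, ok)
}
```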

criztovyl

criztovyl commented on Jan 12, 2020

@criztovyl
Author

I promised code.

I just pushed the code I am struggling with, you can find it here: https://github.com/criztovyl/gitea/blob/master/models/migrations/v200.go (200 to make merging/rebasing easier)

Line 30, the sync of the Identity model, works as expected: the table is created.
But Line 35, the sync of the User model, does not seem to work, the table still has the old columns afterwards.

zeripath

zeripath commented on Jan 13, 2020

@zeripath
Contributor

Sync2 will only add columns.

If you want to delete columns you need to use:

func dropTableColumns(sess *xorm.Session, tableName string, columnNames ...string) (err error) {

Please note that migrations should not refer to things in models or elsewhere - those could be changed in future migrations. They have to be completely self contained - no references to other Gitea code.

So:

https://github.com/criztovyl/gitea/blob/2b4065f5ea0921090092b4a5ed61db3bdde2d725/models/migrations/v200.go#L30

Here you need to actually have a copy of identity. Similarly here:

https://github.com/criztovyl/gitea/blob/2b4065f5ea0921090092b4a5ed61db3bdde2d725/models/migrations/v200.go#L14

This was probably intended to be xorm:"-", in which case it's probably not needed in the migration.

Looking at your proposed identity table how come you can't use login_source for this?
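
For illustration, a self-contained migration along the lines described above might look roughly like this. It is a sketch under assumptions, not the actual migration: `syncer` is a hypothetical interface that only mirrors the `Sync2(beans ...interface{}) error` method so the example runs without a database, `recorder` is a fake, and the `Identity` fields and the `IdentityID` column are guesses at the proposed layout. The point is that the struct definitions are copied into the migration instead of being imported from models.

```go
package main

import "fmt"

// syncer is a hypothetical stand-in for the part of an xorm engine used here.
type syncer interface {
	Sync2(beans ...interface{}) error
}

// addIdentityTable sketches a migration that carries its own struct copies.
func addIdentityTable(x syncer) error {
	// Local copy of the table layout as of this migration. It deliberately
	// does not reference the Identity type from the models package, which
	// may change in later versions.
	type Identity struct {
		ID          int64  `xorm:"pk autoincr"`
		FederatedID string `xorm:"UNIQUE NOT NULL"`
		Name        string
		FullName    string
	}
	// Only the column this migration adds to the existing user table.
	type User struct {
		IdentityID int64 `xorm:"INDEX"`
	}
	// Sync2 only adds tables and columns; dropping the moved columns
	// would need a separate step.
	return x.Sync2(new(Identity), new(User))
}

// recorder is a fake syncer that records which bean types were synced.
type recorder struct{ synced []string }

func (r *recorder) Sync2(beans ...interface{}) error {
	for _, b := range beans {
		r.synced = append(r.synced, fmt.Sprintf("%T", b))
	}
	return nil
}

func main() {
	r := &recorder{}
	if err := addIdentityTable(r); err != nil {
		fmt.Println("migration failed:", err)
		return
	}
	fmt.Println("synced:", r.synced)
}
```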

criztovyl

criztovyl commented on Jan 13, 2020

@criztovyl
Author

Thanks for the hints :)

The issue with login_source, as far as I analysed it, is that it still requires usernames to be unique on their own.

For federation, usernames are not unique, only the Identity is (where the unique identifier for such identities is typically a URI/IRI); usernames are more like a further-limited display name.

criztovyl

criztovyl commented on Jan 13, 2020

@criztovyl
Author

> https://github.com/criztovyl/gitea/blob/2b4065f5ea0921090092b4a5ed61db3bdde2d725/models/migrations/v200.go#L14
>
> This was probably intended to be xorm:"-", in which case it's probably not needed in the migration.

I see, but how is the Identity referenced by the IdentityId filled? Does that happen automagically or do I need to add it somewhere?

And if I create the table initially, then I still need the Identity model, so I cannot leave it out, right?

criztovyl

criztovyl commented on Jan 13, 2020

@criztovyl
Author

And another quick question: is it possible to run a gitea instance that has all the test data from the fixtures in its db?

I am sometimes a visual type. For example, I would like to verify that the profile is still displayed correctly, e.g. has the right organization assignments.

jolheiser

jolheiser commented on Jan 13, 2020

@jolheiser
Member

You can take a look at https://github.com/go-gitea/gitea/blob/master/contrib/pr/checkout.go as it loads fixtures to help us check PRs.

You could make a modified version perhaps.

criztovyl

criztovyl commented on Jan 13, 2020

@criztovyl
Author

go run -tags "sqlite sqlite_unlock_notify" contrib/pr/checkout.go -run works like a charm :)


Labels: topic/federation, type/proposal (The new feature has not been accepted yet but needs to be discussed first.)


Participants: @ptman, @lunny, @techknowlogick, @lafriks, @yitsushi
