-
Notifications
You must be signed in to change notification settings - Fork 1k
Remember device for 30 days - Part 1 #13166
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
def _generate_device_id(self, user: User) -> str: | ||
""" | ||
Generates a unique device id for the given user. | ||
""" | ||
attempts_max = 1000 | ||
attempts = 0 | ||
while attempts < attempts_max: | ||
device_id = secrets.token_urlsafe(DEVICE_ID_BYTES) | ||
if not any(device.device_id == device_id for device in user.user_devices): | ||
return device_id | ||
attempts += 1 | ||
raise RuntimeError("Failed to generate device id") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why we need a device ID and a device secret? My first thought is that it should be sufficient to generate a random token, store it in a cookie, and store the hash of that token along with a reference to the user in the database.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could even just model it as
class UserDevice(db.Model):
__tablename__ = "user_devices"
user_id = Column(
UUID(as_uuid=True), ForeignKey("users.id"), nullable=False, index=True
)
device_id = Column(
UUID(as_uuid=True), unique=True, nullable=False, index=True
)
saved_date = Column(
DateTime(timezone=False), nullable=False, server_default=sql.func.now()
)
device_info = # ....maybe a JSONB column?
And instead of storing a secret at all, use the TokenService
to send a signed token containing {"username": User.username, "device_id": UserDevice.id}
.
We can then just see if the signed token validates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I thought it would be important to use a slow hash with a salt for the device secret, so that if an attacker gained access to the database, they would have a very hard time reversing the hash and finding the original secret. Using a salt means we have to hash the secret for each value we want to compare in the database. Since we're using a slow hash, I didn't want to have to hash the secret multiple times; if a user had several devices, that could add noticeable time. This is why I introduced the device ID.
Now, I'll admit, I might be going a bit overkill with the security here; a slow hash with a salt might not be necessary with a randomly-generated 128-bit string. It could be fine storing a plain SHA256 hash of the secret in the database. But since this is a highly security sensitive area, I figured I would rather err on the side of too much security.
In regards to sending a signed token instead, if we could store all the necessary information in the token and not need a new database table, I think that would be the better option. However, since we will probably want to give the user the ability to revoke devices, I think a database table is necessary. I think using a signed token or a hash of a secret are both equally good approaches. I'm inclined to leave it as a hash of a secret since that's what I already have, but if there are advantages to the signed token that I'm not thinking of, I am willing to switch to that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree keeping a record in the DB for the Device is necessary for revocation. Only reason I think I prefer the token approach is that it makes the data in the table a bit more transparent, I'm not committed to it. Definitely welcome other thoughts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I heard you on the brute-force side of things, but I'm still leaning towards using a TokenService rather than storing the hashed secret in the database.
Not to rely on tokens themselves, but just for the handling. When a device is saved we still retain the data in the DB for revocation purposes, we just send a signed token to the client that they can provide at login for warehouse to validate.
Any thoughts from the other @pypi/warehouse-admins?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can change this to use a TokenService instead. I'm thinking that to complete this issue (#5867), we won't need to add the database table yet. When we do the followup work to add the revocation, we can add the database table then. For now, we would just store the cookie in the user's browser and then verify it on the server side. Does that sound reasonable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ewdurbin What are your thoughts on my previous question?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll plan on using a signed token that is verified server side without a database table at the current time. I'm going to close this PR and create a new one since this is a significant rework.
My reasoning for not creating the database table is because revocation will not be a part of this current phase, so we shouldn't add code and data for functionality that won't actually be used. It's possible that we might not get around to implementation the revocation phase for a while. If you think I should include revocation in this current phase or I should include the database table in this current phase in preparation for future revocation, let me know, and I'm happy to adjust my plans.
e964376
to
3e038d0
Compare
def _generate_device_id(self, user: User) -> str: | ||
""" | ||
Generates a unique device id for the given user. | ||
""" | ||
attempts_max = 1000 | ||
attempts = 0 | ||
while attempts < attempts_max: | ||
device_id = secrets.token_urlsafe(DEVICE_ID_BYTES) | ||
if not any(device.device_id == device_id for device in user.user_devices): | ||
return device_id | ||
attempts += 1 | ||
raise RuntimeError("Failed to generate device id") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I heard you on the brute-force side of things, but I'm still leaning towards using a TokenService rather than storing the hashed secret in the database.
Not to rely on tokens themselves, but just for the handling. When a device is saved we still retain the data in the DB for revocation purposes, we just send a signed token to the client that they can provide at login for warehouse to validate.
Any thoughts from the other @pypi/warehouse-admins?
Adding new db table and user_service functions for saving devices
3e038d0
to
0eb13ae
Compare
Closing to rework this to use a signed token as ewdurbin suggested. |
I am breaking this feature up into 2 PRs to make reviewing easier. This PR adds the new db table and user_service functions for saving devices. Part 2 of this feature can be found in ristomcgehee#2 in case you want to see how this code is used, but you shouldn't need to review Part 2 to review this PR. This PR can be merged to main independently of Part 2.
Part of #5867