@NorfairKing's blog post describes a HashDoS attack on `aeson` which is enabled by `u-c`'s O(n) handling of hash collisions and use of the same default salt for all hashing operations.
While possible mitigation measures for `aeson` are discussed in haskell/aeson#864, `u-c` should also prepare a proper response and possibly implement security features for users who are affected via `aeson` or in other ways.
In particular, I hope to find answers for the following questions in this thread (and possibly in separate sub-threads):
1. What mitigation measures can affected `u-c` users enable in the short term?
2. Is security against collision attacks a design goal for `u-c`?
   2a. If yes, to what extent should we trade performance and API bloat for security features?
3. What mitigation measures should be implemented in `u-c`?
I'd also like to point out that I have very limited knowledge and experience with security issues, so I'd be very grateful if more experienced people could chime in and share their advice. :)
sjakobi commented on Sep 13, 2021
A few ideas – maybe some more experienced people can comment whether these are any good:
- Build `hashable` with `-frandom-init-seed`, to make it slightly harder to produce colliding keys. The `hashable` maintainer doesn't recommend this, but it should be somewhat useful anyway. See this discussion on r/haskell.
- Limit the size of `HashMap`s and `HashSet`s. When `n` is small, `O(n)` operations are less problematic.
- Use `Data.Map` and `Data.Set` from `containers` instead of this package. These use the `Ord` methods for performing lookups and insertions, and therefore aren't vulnerable to collision attacks.
- Use the `hashmap` package which relies on `Data.Map` for storing any collisions. Note that this package doesn't offer a `Strict` API that ensures that map values are evaluated to WHNF. Maybe @RyanGlScott, @augustss or other users can comment in which cases this package is a suitable replacement for `u-c`.
EDIT 2020-10-03:
Note that a random hash salt has very limited security benefits as long as a weak hash function like FNV is used. For FNV, it is possible to construct multi-collisions that collide no matter what salt they are hashed with: https://medium.com/@robertgrosse/generating-64-bit-hash-collisions-to-dos-python-5b21404a5306
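As a rough illustration of two of the ideas in this comment (capping the collection size and switching to `Data.Map` from `containers`), here is a minimal sketch. The names `maxKeys`, `insertBounded` and `fromUntrusted` are hypothetical and not part of any package, and the limit of 10000 is an arbitrary assumption.

```haskell
import Control.Monad (foldM)
import qualified Data.Map.Strict as Map

-- | Hypothetical cap on the number of keys accepted from untrusted input.
maxKeys :: Int
maxKeys = 10000

-- | Insert into an Ord-based map, refusing to grow past 'maxKeys'.
-- Lookups and inserts cost O(log n) comparisons regardless of how keys hash.
insertBounded :: Ord k => k -> v -> Map.Map k v -> Maybe (Map.Map k v)
insertBounded k v m
  | Map.member k m        = Just (Map.insert k v m)  -- overwrite, size unchanged
  | Map.size m >= maxKeys = Nothing                  -- reject: map is full
  | otherwise             = Just (Map.insert k v m)

-- | Build a map from untrusted key/value pairs, failing once the cap is hit.
fromUntrusted :: Ord k => [(k, v)] -> Maybe (Map.Map k v)
fromUntrusted = foldM (\m (k, v) -> insertBounded k v m) Map.empty
```

Whether rejecting oversized input is acceptable depends on the application; the sketch only shows where such a check could sit.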
dhess commented on Sep 13, 2021
Thanks for looking into this!
My only bit of feedback is this: because there are existing, deployed services that are vulnerable to this attack, and because shutting those services down or replacing them with something that doesn't use `aeson` is not feasible, a timely short-term fix is just as important as whatever long-term, proper fix the various parties come up with.
In other words, please let's not let perfect be the enemy of good-enough-for-now here. Even something that makes producing collisions slightly more difficult is helpful at this stage, IMO.
I'm encouraged that you've jumped right into mitigations in your first 2 posts, so it looks like this discussion is off to a great start!
NorfairKing commented on Sep 13, 2021
- `-frandom-init-seed` is good enough for the very short term (AFAICT)
- The collisionless containers approach solves the specific exploit that I've built, but doesn't guarantee that there's no other cheap way to produce an exploit.
treeowl commented on Sep 13, 2021
Is there a way to mitigate the performance degradation of a random seed? How bad does it measure out?
brandon-leapyear commented on Sep 13, 2021
✨ This is an old work account. Please reference @brandonchinn178 for all future communication ✨
I also opened this issue for another possible mitigation: haskell-unordered-containers/hashable#218
IIUC it's not that a random seed will reduce performance, it's that one would get different hash values every time one restarts the application
NorfairKing commented on Sep 13, 2021
More ideas:
- A `createHashmap :: IO (HashMap k v)` API where the hashmap stores its own (randomly generated) salt
brandon-leapyear commented on Sep 13, 2021
+1 to the `createHashMap` idea, in addition to `createHashMapWith :: Int -> HashMap k v` to get deterministic seeds (e.g. for getting deterministic results for tests)
ysangkok commented on Sep 13, 2021
Why should the function be in IO? We already have a class that captures needing random state, it is RandomGen. A PRNG would be good enough to solve this problem, real randomness is not needed.
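The API ideas from the last few comments (a map that stores its own salt, with one constructor in `IO` and one that takes the salt explicitly) might be sketched as follows. Everything here is hypothetical: `SaltedMap`, `emptyWithSalt`, `newSaltedMap`, `insertSalted`, `lookupSalted` and `toyHash` are not part of `u-c`, the toy hash is not collision resistant, and the monotonic clock is only a stand-in for a real random source or a PRNG threaded through the program.

```haskell
import Data.Bits (xor)
import Data.Char (ord)
import Data.List (foldl')
import qualified Data.IntMap.Strict as IntMap
import GHC.Clock (getMonotonicTimeNSec)

-- | Toy salted hash for String keys (FNV-style; NOT collision resistant).
toyHash :: Int -> String -> Int
toyHash = foldl' (\h c -> (h * 16777619) `xor` ord c)

-- | A map that remembers the salt it was created with.
data SaltedMap v = SaltedMap
  { smSalt    :: !Int
  , smBuckets :: !(IntMap.IntMap [(String, v)])
  }

-- | Pure constructor with an explicit salt, e.g. for deterministic tests.
emptyWithSalt :: Int -> SaltedMap v
emptyWithSalt salt = SaltedMap salt IntMap.empty

-- | Constructor in IO that picks a salt at creation time.
-- Assumption: the clock is a placeholder for a proper entropy source.
newSaltedMap :: IO (SaltedMap v)
newSaltedMap = emptyWithSalt . fromIntegral <$> getMonotonicTimeNSec

-- | Insert a key, replacing any previous value stored under it.
insertSalted :: String -> v -> SaltedMap v -> SaltedMap v
insertSalted k v (SaltedMap salt buckets) =
  SaltedMap salt (IntMap.alter (Just . upd . maybe [] id) (toyHash salt k) buckets)
  where
    upd kvs = (k, v) : filter ((/= k) . fst) kvs

-- | Look a key up using the salt stored in the map itself.
lookupSalted :: String -> SaltedMap v -> Maybe v
lookupSalted k (SaltedMap salt buckets) =
  IntMap.lookup (toyHash salt k) buckets >>= lookup k
```

Because the salt travels with the map, two maps created with different salts place the same key in different buckets, and a fixed salt gives reproducible behaviour.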
treeowl commented on Sep 13, 2021
A random seed will most likely be implemented something like this:
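What follows is a minimal sketch of a lazily initialised, process-wide seed of the kind described here, not the snippet from the original comment. `globalSeed` and `saltWithSeed` are hypothetical names, and the monotonic clock is an assumed stand-in for a real source of randomness.

```haskell
import Data.Bits (xor)
import GHC.Clock (getMonotonicTimeNSec)
import System.IO.Unsafe (unsafePerformIO)

-- | Hypothetical process-wide seed, computed once on first access.
-- NOINLINE keeps GHC from duplicating the unsafePerformIO call, so all
-- use sites share one heap object: a thunk that is replaced by its value
-- after the first evaluation.
globalSeed :: Int
globalSeed = unsafePerformIO (fromIntegral <$> getMonotonicTimeNSec)
{-# NOINLINE globalSeed #-}

-- | Example consumer: mixes the shared seed into a caller-supplied value.
saltWithSeed :: Int -> Int
saltWithSeed x = x `xor` globalSeed
```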
This means that every (initial) seed access has to check a tag and follow a pointer. The seed will almost always be in L1 cache when heavy `HashMap` use is happening, but we should check that it's not too bad to access it.
sjakobi commented on Sep 13, 2021
Here's some earlier discussion with @tibbe and @infinity0 on mitigating collision attacks: #265 (comment)
The gist is that in order to make it sufficiently hard for an attacker to produce hash collisions, you need both a strong hash function like SipHash and a random seed.
The problem with SipHash is that apparently it's so slow that you might as well switch to `Data.Map` – at least that's what was mentioned in our internal discussions. Nevertheless, a `hashable` patch that uses `SipHash` for the `Text` instance seems like a reasonable short-term mitigation measure when combined with `-frandom-init-seed`.
Regarding the proposed fix in #217, @tibbe's assessment was that it would still require a strong hash function and a random seed to be reasonably secure. With many weaker hash functions, it is possible to generate seed-independent collisions, see e.g. this blog post. By this assessment, #217 "adds" little security of its own.
sjakobi commented on Sep 13, 2021
Storing the salt within the `HashMap` was proposed in #45. Maybe we can use that issue to discuss the details of this idea.
Add HashMapT salt, which allows creation of salt with Nat.
jappeace commented on Sep 22, 2021
For lack of better ideas I implemented this: #321
sjakobi commented on Sep 27, 2021
I have rekindled #265 to discuss approaches for making the hash salt less predictable to attackers.
I'm reluctant to invest much time into that debate while I'm not aware of a hash function that would make the whole fuss worthwhile though (see #319 (comment)).
If anyone's interested, I think finding an appropriate hash function would be the best way to make progress on this issue.
I noticed that Rust currently uses SipHash 1-3:
@tibbe, is that the same SipHash variant that you tried? https://en.wikipedia.org/wiki/SipHash#Overview indicates that SipHash 2-4 might be more common, but also slower.
jappeace commented on Sep 27, 2021
I did some digging, SipHash was still available in hashable in 2012: https://github.com/haskell-unordered-containers/hashable/tree/fea8260b9e0c0596fc7ef0c608364b3960649f26/cbits
It's quite easy to change that code to be SipHash-1-3 (the numbers just indicate the number of rounds used for hashing and finalization).
I don't know how recently the SipHash implementation was tested for performance, but it looks relatively easy to add it back into hashable and run the benchmarks once more. Perhaps it would be possible to speed it up? Maybe we can try?
SipHash reference implementation (paper explaining it is linked there as well)
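To make the meaning of the two numbers concrete, here is an unoptimised sketch of SipHash written against the published algorithm description, with the number of compression rounds `c` and finalization rounds `d` as parameters. It is not the code from hashable's old `cbits` directory; `SipState`, `sipRound`, `leWord` and `sipHash` are names chosen for this illustration, and the list-of-bytes input is an assumption made for simplicity rather than speed.

```haskell
import Data.Bits (rotateL, shiftL, xor, (.|.))
import Data.Word (Word64, Word8)

-- | Internal state: the four 64-bit words v0..v3.
data SipState = SipState !Word64 !Word64 !Word64 !Word64

-- | One SipRound: add, rotate, xor on the four state words.
sipRound :: SipState -> SipState
sipRound (SipState v0 v1 v2 v3) =
  let a0  = v0 + v1
      a1  = rotateL v1 13 `xor` a0
      a0' = rotateL a0 32
      a2  = v2 + v3
      a3  = rotateL v3 16 `xor` a2
      b0  = a0' + a3
      b3  = rotateL a3 21 `xor` b0
      b2  = a2 + a1
      b1  = rotateL a1 17 `xor` b2
      b2' = rotateL b2 32
  in SipState b0 b1 b2' b3

-- | Pack up to 8 bytes into a word, little-endian.
leWord :: [Word8] -> Word64
leWord = foldr (\b acc -> (acc `shiftL` 8) .|. fromIntegral b) 0

-- | SipHash-c-d of a byte list under the 128-bit key (k0, k1).
sipHash :: Int -> Int -> Word64 -> Word64 -> [Word8] -> Word64
sipHash c d k0 k1 msg = finish (go initial msg)
  where
    initial = SipState (k0 `xor` 0x736f6d6570736575)
                       (k1 `xor` 0x646f72616e646f6d)
                       (k0 `xor` 0x6c7967656e657261)
                       (k1 `xor` 0x7465646279746573)
    rounds n st = iterate sipRound st !! n
    -- Absorb one 64-bit message word using c rounds.
    compress (SipState v0 v1 v2 v3) m =
      let SipState w0 w1 w2 w3 = rounds c (SipState v0 v1 v2 (v3 `xor` m))
      in SipState (w0 `xor` m) w1 w2 w3
    -- The final word carries the message length (mod 256) in its top byte.
    lenByte = fromIntegral (length msg) `shiftL` 56 :: Word64
    go st bytes = case splitAt 8 bytes of
      (block, rest)
        | length block == 8 -> go (compress st (leWord block)) rest
        | otherwise         -> compress st (leWord block .|. lenByte)
    -- Finalization uses d rounds.
    finish (SipState v0 v1 v2 v3) =
      let SipState w0 w1 w2 w3 = rounds d (SipState v0 v1 (v2 `xor` 0xff) v3)
      in w0 `xor` w1 `xor` w2 `xor` w3
```

With this shape, `sipHash 2 4` corresponds to SipHash-2-4 and `sipHash 1 3` to SipHash-1-3; the only difference is how often `sipRound` is applied. A list-based version like this says nothing about the performance of an optimised implementation.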
sjakobi commented on Sep 29, 2021
@jappeace Good find! Yeah, it would be interesting to set up benchmarks with that code.
The HighwayHash project also contains a supposedly faster `SipHash` implementation that we could give a spin.
jberryman commented on Sep 29, 2021
A little OT, but in case it comes up and since I haven't written about this anywhere: I started a project a few years ago named hashabler that aimed to do a few things:
1. make `hashable` modular by separating choice of hash function from the byte-stream-supplying code (i.e. the `Hashable` instance)
Unfortunately I realized I didn't really understand (2) until after I'd made a couple releases and towards the end of a big rewrite, etc. I managed to get the library about 2/3 of the way fixed up but lost steam and haven't had the energy to pick it up since.
But (2) is interesting and I think not very well-understood. The short version is that a composable hashing library like `hashable` essentially needs to do the same work as a serialization library: the byte-supplying function (`Hashable` instance) needs to represent a uniquely decodable code.
The point being you can get collisions from your choice of hash function, but also from the `Hashable` instance itself even in the presence of a perfect hash function (well, in theory. The obvious bad instances in `hashable` itself like (IIRC) `([a], [a])` were fixed a long time ago, but perhaps in an ad hoc way, and without documenting how users should avoid the same mistake).
FWIW I'd like to brush that work off some day. In the interim I've had the thought that I could use backpack to allow the choice of hash function to be configurable, without it needing to be part of the regular public API (so e.g. `containers` could use it transparently).
You're welcome to steal with attribution my siphash implementation here if helpful: https://github.com/jberryman/hashabler/blob/master/src/Data/Hashabler/SipHash.hs . It looks like in my version locally I've got a somewhat faster implementation that uses handwritten asm (only because ghc doesn't have bitwise rotate primops).
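The point about uniquely decodable encodings can be illustrated with a toy example for a pair of strings. This is only a sketch: `naivePair`, `prefixedPair`, `bytes` and `lenBytes` are hypothetical helpers, not code from `hashable` or hashabler, and the byte conversion assumes ASCII/Latin-1 input.

```haskell
import Data.Char (ord)
import Data.Word (Word8)

-- | Toy byte conversion: keeps only the low 8 bits of each Char.
bytes :: String -> [Word8]
bytes = map (fromIntegral . ord)

-- | Fixed-width (8-byte, little-endian) encoding of a length.
lenBytes :: Int -> [Word8]
lenBytes n = [fromIntegral (n `div` 256 ^ i) | i <- [0 .. 7 :: Int]]

-- | Naive encoding: concatenate both components. The boundary between
-- them is lost, so different pairs can feed identical bytes to the hash.
naivePair :: (String, String) -> [Word8]
naivePair (xs, ys) = bytes xs ++ bytes ys

-- | Length-prefixed encoding: each component is preceded by its length,
-- so the original pair can be recovered from the byte stream.
prefixedPair :: (String, String) -> [Word8]
prefixedPair (xs, ys) = withLen xs ++ withLen ys
  where
    withLen s = lenBytes (length s) ++ bytes s
```

Here `naivePair ("ab", "c")` and `naivePair ("a", "bc")` produce the same bytes, so they collide under any hash function, while the `prefixedPair` encodings of the two pairs differ.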
Add security advisory to package description (#320)
sjakobi commented on Oct 9, 2021
For reference: `aeson` users can now avoid this vulnerability by enabling `aeson`'s new `ordered-keymap` flag which makes `aeson` use `Data.Map.Strict` for storing JSON objects: https://hackage.haskell.org/package/aeson-2.0.1.0/changelog
`aeson` hackworthltd/primer#51
`containers` and `hashmap` should be disabled by default #333