Skip to content

Add forM #592

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Add forM #592

wants to merge 1 commit into from

Conversation

hasufell
Copy link
Member

I'm not entirely sure if forM s f = fmap fromList . (flip T.forM f) . toList $ s would be faster.

@treeowl
Copy link
Contributor

treeowl commented Jan 17, 2019

It seems a bit odd to offer the flipped version of mapM and not mapM. And it seems quite odd to offer a version of mapM without offering a version of traverse. The foldr' is extremely surprising; I'd be astonished if that was what you really intended. The toList version should surely tolerate an Applicative constraint.

traverseSet :: (Ord b, Applicative f) => (a -> f b) -> Set a -> f (Set b)
traverseSet f = fmap fromListN . traverse f . toList

I'm not sure whether we can do better than that. The monadic version should probably look like this:

mapMSet :: (Ord b, Monad m) => (a -> m b) -> Set a -> m (Set b)
mapMSet f s0 = foldr go return s0 empty
  where
     go x r s = f x >>= \y -> r $! insert y s

or alternatively perhaps

mapMSet :: (Ord b, Monad m) => (a -> m b) -> Set a -> m (Set b)
mapMSet f = foldM go empty
  where
    go s x = f x >>= \y -> pure $! insert y s

Note: you should definitely email the libraries list with a formal proposal; in particular, naming these things can be a tad tricky. traverse and mapM would probably be "right", but have the potential to break existing code. I don't feel strongly about the names.

@hasufell
Copy link
Member Author

hasufell commented Feb 1, 2019

I cannot really judge what is more efficient or more performant, since reasoning about those things are not easy in haskell.

@treeowl
Copy link
Contributor

treeowl commented Feb 1, 2019

The monadic versions I propose insert into the accumulator set as actions are performed, which will likely be considerably more efficient in many cases. Do you understand why the foldr'-based version you proposed does not do so? The toList/fromList version also does not insert incrementally, but it doesn't need a full Monad; do you understand why Applicative isn't sufficient to work incrementally? I don't know much about your Haskell background.

@treeowl
Copy link
Contributor

treeowl commented Feb 1, 2019

Ah, I see now that you've written some very similar code yourself, in http://hackage.haskell.org/package/hpath-0.9.2/docs/src/System.Posix.Directory.Traversals.html#traverseDirectory

@hasufell
Copy link
Member Author

hasufell commented Feb 8, 2019

Do you understand why the foldr'-based version you proposed does not do so?

Yes, becauce b in (\a b -> f a >>= (\x -> fmap (insert x) b)) is the recursion, so we are fmapping into an action into an action...

However, if we never care about the Set, but only about the actions, then foldr would be more "efficient" no?

do you understand why Applicative isn't sufficient to work incrementally?

I'm not sure what you mean by that.

@hasufell
Copy link
Member Author

hasufell commented Jan 7, 2020

ping

@treeowl
Copy link
Contributor

treeowl commented Jan 7, 2020

You still haven't come up with an implementation I consider reasonable. foldr' still has no business there.

@hasufell hasufell force-pushed the forM branch 2 times, most recently from 4b996f0 to ff01a54 Compare January 7, 2020 14:29
@treeowl
Copy link
Contributor

treeowl commented Jan 7, 2020

Much better. But you'll still need to propose this on the libraries list to get community input.

@hasufell
Copy link
Member Author

hasufell commented Jan 7, 2020

Much better. But you'll still need to propose this on the libraries list to get community input.

I did

@treeowl
Copy link
Contributor

treeowl commented Jan 7, 2020

Can you link to the discussion thread here? Unfortunately I don't remember it.

@hasufell
Copy link
Member Author

hasufell commented Jan 7, 2020

Can you link to the discussion thread here? Unfortunately I don't remember it.

Actually, not I did, but someone else: https://mail.haskell.org/pipermail/libraries/2019-January/029335.html

@hasufell
Copy link
Member Author

hasufell commented Jan 7, 2020

I'm also interested in benchmarks and will test a bit I think.

@treeowl
Copy link
Contributor

treeowl commented Jan 7, 2020

Ah, yes, that "someone else" was actually me. But there wasn't any real discussion, aside from one comment by Henning. Could you try to reignite the conversation?

@hasufell
Copy link
Member Author

hasufell commented Jan 7, 2020

I compared

forM :: (Ord b, Monad m) => Set a -> (a -> m b) -> m (Set b)
forM s0 f = fmap (GHCExts.fromListN (size s0)) . traverse f . toList $ s0

with

forM :: (Ord b, Monad m) => Set a -> (a -> m b) -> m (Set b)
forM s0 f = foldr go return s0 empty
  where
     go x r s = f x >>= \y -> r $! insert y s

and

forM :: (Ord b, Monad m) => Set a -> (a -> m b) -> m (Set b)
forM s0 f = foldM go empty $ s0
  where
    go s x = f x >>= \y -> pure $! insert y s

test was:

         bench "forM" $ whnf (\s -> runIdentity $ S.forM s (\e -> pure $ e + 1)) s

toList version:

benchmarked forM
time                 125.3 μs   (124.1 μs .. 126.2 μs)
                     0.999 R²   (0.998 R² .. 0.999 R²)
mean                 124.8 μs   (123.9 μs .. 125.7 μs)
std dev              3.237 μs   (2.729 μs .. 4.030 μs)
variance introduced by outliers: 11% (moderately inflated)

foldr version:

benchmarked forM
time                 1.548 ms   (1.544 ms .. 1.553 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.539 ms   (1.535 ms .. 1.543 ms)
std dev              12.80 μs   (10.27 μs .. 16.49 μs)

foldM version:

benchmarked forM
time                 2.591 ms   (2.571 ms .. 2.624 ms)
                     0.998 R²   (0.994 R² .. 0.999 R²)
mean                 2.608 ms   (2.590 ms .. 2.638 ms)
std dev              74.54 μs   (51.70 μs .. 116.5 μs)
variance introduced by outliers: 11% (moderately inflated)

@hasufell
Copy link
Member Author

hasufell commented Jan 7, 2020

Ah, yes, that "someone else" was actually me. But there wasn't any real discussion, aside from one comment by Henning. Could you try to reignite the conversation?

Given the disinterest, does this justify two threads?

@hasufell
Copy link
Member Author

hasufell commented Jan 7, 2020

I find the benchmark results interesting and confusing.

@treeowl
Copy link
Contributor

treeowl commented Jan 7, 2020 via email

@hasufell
Copy link
Member Author

hasufell commented Jan 7, 2020

         bench "forM" $ whnf (\s -> runIdentity $ S.forM s (\e -> pure $ e + 1)) s

@treeowl
Copy link
Contributor

treeowl commented Jan 7, 2020

You're probably running into the weird special-case optimization in fromList for (strictly?) increasing lists.

@hasufell
Copy link
Member Author

hasufell commented Jan 7, 2020

bench "forM" $ whnf (\s -> runIdentity $ S.forM s (\e -> pure $ e * 12 `mod` 4)) s

makes the foldr version go faster:

benchmarked forM
time                 173.3 μs   (171.7 μs .. 175.2 μs)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 173.2 μs   (172.5 μs .. 174.2 μs)
std dev              2.743 μs   (2.188 μs .. 3.780 μs)

fromList one is still the fastest.

@andreasabel
Copy link
Member

whnf only computes the weak head normal form, you need to use nf, the full normal form. (I fell into this trap before, see agda/agda@8dae71a#commitcomment-36008099 ).

@andreasabel
Copy link
Member

My opinion on this PR:

  • I think that we should have mapM = traverse rather than two different implementation. Also, the effects should occur in order from the least set element to the greatest. Spec:
mapM f s = traverse f s = Set.fromList <$> do mapM f $ Set.toList s
  • I think we don't need for; this is anyway a misnamed function in Applicative because the reasonable analogy of forM = flip mapM is for = flip map and this is violated here.

@hasufell
Copy link
Member Author

hasufell commented Jan 10, 2020

whnf only computes the weak head normal form, you need to use nf, the full normal form. (I fell into this trap before, see agda/agda@8dae71a#commitcomment-36008099 ).

Thanks, will adjust!

* I think that we should have `mapM = traverse` rather than two different implementation.

I did that, because they have different performance characteristics I think. It appears the applicative version is faster. See above. But I don't know in which circumstances. I guess it makes sense to unify those.

* I think we don't need `for`; this is anyway a misnamed function in `Applicative` because the reasonable analogy of `forM = flip mapM` is `for = flip map` and this is violated here.

I prefer consistency here. for is already well known and I use it a lot. I don't think this PR is the place to get rid of historical decisions.

@treeowl
Copy link
Contributor

treeowl commented Jan 10, 2020

@andreasabel, could you do me a favor and keep the discussion of what to include and by what name to the libraries list? It's useful to have it all in one place and to get more exposure to comments from other users.

-- Like 'traverse' from Data.Traversable. This is less generic, since 'Set'
-- does not have a Traversable instance.
traverse :: (Ord b, Applicative m) => (a -> m b) -> Set a -> m (Set b)
traverse f s0 = fmap (GHCExts.fromListN (size s0)) . DT.traverse f . toList $ s0
Copy link
Member Author

@hasufell hasufell Jan 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know there are some subtle differences on whether to use traverse f s0 vs traverse f, e.g. inlining. What's the verdict here? Do we want it half-pointfree or fully pointfree even?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@massudaw any idea?

@treeowl
Copy link
Contributor

treeowl commented Jan 10, 2020

@hasufell, you should bring back the previous mapM at least long enough to benchmark it using a monad other than Identity! Try ST s, for example.

@hasufell
Copy link
Member Author

hasufell commented Jan 10, 2020

* I think that we should have `mapM = traverse` rather than two different implementation.  Also, the effects should occur in order from the least set element to the greatest. Spec:

In fact, this breaks ancient GHC versions where Monad is not a superclass of Applicative:
https://travis-ci.org/haskell/containers/jobs/635377304#L737

@hasufell
Copy link
Member Author

hasufell commented Jan 10, 2020

@treeowl the results are when keeping the two different implementations. This looks significant.


benchmarked forM (Identity)
time                 1.542 ms   (1.536 ms .. 1.549 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.539 ms   (1.536 ms .. 1.546 ms)
std dev              16.83 μs   (10.46 μs .. 28.81 μs)

benchmarked for (Identity)
time                 163.3 μs   (161.7 μs .. 165.0 μs)
                     0.999 R²   (0.998 R² .. 0.999 R²)
mean                 163.8 μs   (162.7 μs .. 165.1 μs)
std dev              3.992 μs   (3.276 μs .. 5.028 μs)

benchmarked forM (ST)
time                 1.758 ms   (1.750 ms .. 1.764 ms)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 1.765 ms   (1.760 ms .. 1.774 ms)
std dev              23.95 μs   (15.22 μs .. 38.66 μs)

benchmarked for (ST)
time                 277.3 μs   (274.9 μs .. 279.5 μs)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 279.8 μs   (278.0 μs .. 281.2 μs)
std dev              5.724 μs   (4.727 μs .. 7.641 μs)

@treeowl
Copy link
Contributor

treeowl commented Jan 10, 2020

Please let me see your ST benchmark code, and please include it in the benchmarks you add to the package.

@hasufell
Copy link
Member Author

Please let me see your ST benchmark code, and please include it in the benchmarks you add to the package.

ST is already included.

@treeowl
Copy link
Contributor

treeowl commented Jan 10, 2020

It looks like your ST benchmark produces elements in strictly increasing order, which bakes in an extra advantage for fromListN. Would you mind tweaking it to scatter elements around more to get something more realistic?

@treeowl
Copy link
Contributor

treeowl commented Jan 11, 2020

So how did the ordering change affect the benchmarks?

@hasufell
Copy link
Member Author

So how did the ordering change affect the benchmarks?

Not at all

@treeowl
Copy link
Contributor

treeowl commented Jan 11, 2020

Okay, so for now we don't need to have mapM, right?

@hasufell
Copy link
Member Author

hasufell commented Jan 11, 2020

Actually I was lying. The new benchmarks ran with the unified implementation. Here are the benchmarks for the explicit foldr (forM) and the fromListN (for):


benchmarked forM (Identity)
time                 1.538 ms   (1.509 ms .. 1.561 ms)
                     0.999 R²   (0.998 R² .. 0.999 R²)
mean                 1.494 ms   (1.486 ms .. 1.505 ms)
std dev              31.80 μs   (26.01 μs .. 40.72 μs)

benchmarked for (Identity)
time                 171.3 μs   (167.4 μs .. 176.8 μs)
                     0.996 R²   (0.994 R² .. 0.999 R²)
mean                 171.8 μs   (170.5 μs .. 173.4 μs)
std dev              4.873 μs   (3.835 μs .. 6.653 μs)
variance introduced by outliers: 13% (moderately inflated)

benchmarked forM (ST)
time                 9.340 μs   (9.246 μs .. 9.427 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 9.413 μs   (9.377 μs .. 9.457 μs)
std dev              132.2 ns   (107.1 ns .. 161.1 ns)

benchmarked for (ST)
time                 5.228 μs   (5.134 μs .. 5.343 μs)
                     0.997 R²   (0.996 R² .. 0.999 R²)
mean                 5.285 μs   (5.237 μs .. 5.355 μs)
std dev              202.2 ns   (151.9 ns .. 294.4 ns)

I can't make any sense out of this. Why the forM with Identity is so slow, but the ST one not... totally escapes me.

this was with:

         bench "forM (Identity)" $ nf (\s -> runIdentity $ S.forM s (\e -> pure $ e + 1)) s
        , bench "for (Identity)" $ nf (\s -> runIdentity $ S.for s (\e -> pure $ e + 1)) s
        , bench "forM (ST)" $ nf (\s -> runST $ do
              ref <- newSTRef 0
              S.forM s (\e -> do
                              modifySTRef ref (+ 1)
                              x <- readSTRef ref
                              pure $ x + e * 36 `mod` 11)) (S.map (`mod` 51) s)
        , bench "for (ST)" $ nf (\s -> runST $ do
              ref <- newSTRef 0
              S.for s (\e -> do
                             modifySTRef ref (+ 1)
                             x <- readSTRef ref
                             pure $ x + e * 36 `mod` 11)) (S.map (`mod` 51) s)

@hasufell
Copy link
Member Author

I noticed when I change the test implementation to:

         bench "forM (Identity)" $ nf (\s -> runIdentity $ S.forM s (\e -> pure $ e + 1)) s
        , bench "for (Identity)" $ nf (\s -> runIdentity $ S.for s (\e -> pure $ e + 1)) s
        , bench "forM (ST)" $ nf (\s -> runST $ do
              ref <- newSTRef 0
              S.forM s (\e -> do
                              modifySTRef ref (+ 1)
                              x <- readSTRef ref
                              pure $ x + e * 36 `mod` 11)) s
        , bench "for (ST)" $ nf (\s -> runST $ do
              ref <- newSTRef 0
              S.for s (\e -> do
                             modifySTRef ref (+ 1)
                             x <- readSTRef ref
                             pure $ x + e * 36 `mod` 11)) s

Then the results are:

benchmarked forM (Identity)
time                 1.514 ms   (1.495 ms .. 1.536 ms)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 1.496 ms   (1.490 ms .. 1.504 ms)
std dev              22.49 μs   (17.10 μs .. 31.61 μs)

benchmarked for (Identity)
time                 169.9 μs   (168.5 μs .. 171.5 μs)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 169.1 μs   (168.0 μs .. 170.6 μs)
std dev              4.032 μs   (3.006 μs .. 5.988 μs)

benchmarked forM (ST)
time                 1.744 ms   (1.735 ms .. 1.752 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.740 ms   (1.736 ms .. 1.746 ms)
std dev              16.39 μs   (10.13 μs .. 26.54 μs)

benchmarked for (ST)
time                 1.376 ms   (1.370 ms .. 1.380 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.383 ms   (1.378 ms .. 1.390 ms)
std dev              19.05 μs   (13.22 μs .. 26.90 μs)

It seems small changes to the tests have significant impact.

But the forM and for seem to converge at some point.

@hasufell
Copy link
Member Author

Okay, so for now we don't need to have mapM, right?

I think it's better to have all 4 functions. Not my decision though.

@treeowl
Copy link
Contributor

treeowl commented Jan 11, 2020

If it doesn't add functionality and doesn't add performance, then there's a really high bar to adding it. Generally speaking, this is because it makes something really important significantly more convenient (like (!?) as an infix version of lookup) or because it's rather tricky to figure out how to do it most efficiently (like Data.Sequence.intersperse). I'm a bit skeptical about adding any of these functions, but I'm leaning toward letting you have traverse if others don't speak up, since all the work you've done has lessons to teach about how to make this perform well.

@@ -136,6 +136,12 @@ module Data.Set (
-- * Map
, S.map
, mapMonotonic
#if __GLASGOW_HASKELL__ >= 710
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite confusing and annoying; please avoid APIs exporting things conditional on GHC or base versions

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hvr, don't worry; I won't let it through like that. Please join the discussion on the libraries list about whether to add some of these.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hvr there are two problems with that. 7.6 and 7.8 don't have Applicative constraint on Monad class and 7.6 does not have fromListN at all, which would result in two different implementations of the same function having different performance characteristics. How do you propose to solve it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hvr ping

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there is no response from @hvr can you @treeowl give a definite list of required changes that need to be made in order for this to be merged, including how you want multiple GHC support to happen?

@sjakobi sjakobi mentioned this pull request May 29, 2020
@hasufell
Copy link
Member Author

hasufell commented Jul 4, 2020

No actionable input. I've lost interest in this PR.

@hasufell hasufell closed this Jul 4, 2020
@treeowl treeowl reopened this Jul 4, 2020
@sjakobi
Copy link
Member

sjakobi commented Jul 15, 2020

@treeowl Since you apparently don't want to let this die, I've assigned it to you. :)

@hasufell
Copy link
Member Author

This is dead. Please open your own PR if you're interested to continue. I'm cleaning up stale PRs every once in a while.

@hasufell hasufell closed this Jun 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants