iter: document general guidance for writing iterator APIs #71901

Open, opened by @mvdan

While https://pkg.go.dev/iter does a good job of explaining the basics of iterators, it leaves out a few important pieces of information that would be really useful when writing APIs with iterators. The two most important are:

  1. Whether an exported func or method should itself be an iterator, or return an iterator.

The consensus seems to be to export funcs which return iterators, e.g. func (*Foo) Bars() iter.Seq[Bar] used like for bar := range foo.Bars(), rather than func (*Foo) Bars(yield func(Bar) bool) used like for bar := range foo.Bars. This is a bit more consistent with cases where one needs to supply parameters to obtain an iterator, as then the iterator must be a return parameter.

See #66626 (comment), for example, where @adonovan originally proposed adding methods to go/types which were directly iterators.

  2. How errors should be returned to the caller.

If an iteration can fail with an error, it's not obvious whether to return one top-level error, like func Foo() (iter.Seq[Bar], error), or to provide an error at each iteration step, like func Foo() iter.Seq2[Bar, error]. Arguments can be made either way, but I think fundamentally one can implement any reasonable semantics with either signature.

The original proposal at #61897 seemed to clearly favor iter.Seq2[Bar, error] via its func Lines(file string) iter.Seq2[string, error] example, yet none of the value-error examples or text have survived into the final godoc we have today.

As of today I think there is no clear consensus for errors; as recently as last October it was still being discussed as part of a new API proposal.

There may be other API conventions that the iter godoc should mention as well, but these two seem like the most important to me. I would suggest that we document these guidelines sooner rather than later, so that iterator APIs across the Go ecosystem can be reasonably consistent and predictable.

Activity

thepudds (Member) commented on Feb 22, 2025

> yet none of the value-error examples or text have survived into the final godoc we have today.

FWIW, that was a conscious decision. See for example the commit message here:
https://go-review.googlesource.com/c/go/+/591096

(There was a separate comment on the rationale that I couldn’t dig up immediately, but I think in short it was something like it wasn’t 100% obvious if it was the right idiom, and I think also an element of “let’s first see how people use it for real”).

dsnet (Member) commented on Feb 22, 2025

I understand the sentiment to deliberately not document until we first get experience, but I also agree with @mvdan that we should eventually (hopefully sooner rather than later) provide guidance on what to do. Doing something like iter.Seq2[Bar, error] is what I've been leaning towards lately.

jub0bs (Contributor) commented on Feb 22, 2025

@mvdan

> If an iteration can fail with an error, it's not obvious whether to return one top-level error, like func Foo() (iter.Seq[Bar], error) [...]

That error result can communicate that iterator instantiation failed, but it cannot communicate that iteration itself fails. You'd need something like func Foo() (iter.Seq[Bar], func() error) (which was proposed in some other issue), wouldn't you?

jub0bs (Contributor) commented on Feb 22, 2025

Another point of clarification that would be welcome: iterator classification. I feel that differentiating single-use iterators from all other iterators was a poor choice. "Stateless" iterators (i.e. iter.Seq[Foo] and iter.Seq2[Bar, Baz] that are pure functions) vs. all others would be more useful.

"Stateless" iterators are easy to reason about, whereas "stateful" iterators come in all shapes and forms: single-use, resumable, etc. An analogy that comes to mind is linear systems vs nonlinear systems in control theory:

> the theory of nonlinear systems is like a theory of non-elephants

gazerro (Contributor) commented on Feb 22, 2025

@jub0bs

> You'd need something like func Foo() (iter.Seq[Bar], func() error) (which was proposed in some other issue), wouldn't you?

func Foo() (iter.Seq[Bar], func() error) was proposed by @jba (#70084 comment), and it is indeed a convincing alternative to func Foo() iter.Seq2[Bar, error]:

iter, errf := Foo()
for x := range iter {
    ...
}
if err := errf(); err != nil {
    return err
}

jub0bs (Contributor) commented on Feb 22, 2025

@gazerro Yes, that's the issue I was thinking of. Thanks. I'm not sure I'd be in favour of promoting this approach, though.

mvdan (Member, Author) commented on Feb 22, 2025

Oops, of course, I meant to write something like that - so that an error (or a wrapped list of errors) can be returned once the iterator is done.

ianlancetaylor (Contributor) commented on Feb 23, 2025

This discussion is why we don't have general guidance about how to return errors. People don't yet agree.

I think a commonly known method, such as All, should return an iterator rather than be an iterator. That ensures consistency of the common method across different types. This is mentioned (briefly) at https://go.dev/blog/range-functions#standard-push-iterators. But I don't know that we need a convention for methods that are specific to a given type.

Added to the Backlog milestone on Feb 24, 2025.
Added the NeedsDecision label (feedback is required from experts, contributors, and/or the community before a change can be made) on Feb 24, 2025.

prattmic (Member) commented on Feb 24, 2025

cc @rsc

jub0bs (Contributor) commented on May 29, 2025

FWIW, I've just published https://jub0bs.com/posts/2025-05-29-pure-vs-impure-iterators-in-go/, which builds upon one of my earlier comments.

TL;DR

  • Go has now standardised iterators.
  • Iterators are powerful.
  • Being functions under the hood, iterators can be closures.
  • The classification of iterators suggested by the documentation is ambiguous.
  • Dividing iterators into two categories, “pure” and “impure”, seems to me preferable.
  • Whether iterators should be designed as “pure” whenever possible is unclear.

anuraaga (Contributor) commented on Jun 11, 2025

I was writing one of my first iterators and also found it quite difficult; more guidance would be helpful. Notably, I'm not sure what a good pattern for defer is.

My first, broken code for reading lines from an object storage file was

func readLines(object string) (iter.Seq[[]byte], error) {
	ctx, cancel := context.WithTimeout(ctx, time.Second*30)
	defer cancel() // BUG: runs when readLines returns, so ctx is canceled before iteration starts

	o := d.storage.Bucket(bucket).Object(object).Retryer(
		storage.WithBackoff(gax.Backoff{}),
		storage.WithPolicy(storage.RetryAlways),
	)

	rc, err := o.NewReader(ctx)
	if err != nil {
		return nil, fmt.Errorf("distributor: creating object reader: %w", err)
	}
	defer rc.Close() // BUG: runs when readLines returns, so rc is closed before iteration starts
	s := bufio.NewScanner(rc)
	return func(yield func([]byte) bool) {
		for s.Scan() {
			if !yield(s.Bytes()) {
				break
			}
		}
	}, nil
}

Because the defers run when readLines returns, which is before iteration starts, the reader is already closed when the loop tries to read rows, and the file appears truncated. I can't think of any use case for this kind of code, so a vet error about defer in a function that returns an iterator may be helpful.

Moving return func(yield func([]byte) bool) { to the top of the function lets the defers run at the right time, but readLines can then no longer return initialization errors; they are deferred until iteration. This didn't seem intuitive, although it doesn't seem that bad, and if it were generally accepted guidance I would probably have gone for it. The alternative is to avoid defer and call the close methods manually on each error path, deferring only once inside the iterator function; that is possible but makes the code much more cumbersome. For now, to get initialization errors before iteration and defers executed after iteration, I went back to a normal callback approach, func readLines(object string, yield func([]byte) error) error, without using iter.Seq, since it seemed simplest in the end.

adonovan (Member) commented on Jun 11, 2025

I'm not convinced that iterators are a good fit for file scanning, for several reasons.

First, when you open a file, you have a responsibility to close it when you're done. Go's push iterators do permit you to run cleanup code (rc.Close) at the end, when the Scan loop breaks. To ensure opens and closes are properly paired, the iterator should also open the file immediately before the Scan loop (not before creating the iterator) so that each iterator has its own open file and thus operates independently of all others. Opening and closing a file in an iterator is perhaps a surprising expense, but it means iterators are "multiple use", which is less surprising than "single use".

However, iterators don't have a good way to report errors, whether related to the sequence as a whole (failure to open the file) or to a given element (failure to read or decode a single record). When using bufio.Scanner, you really should check for I/O errors surfaced through s.Err() at the end of the loop. Some people define iterators over a sequence of (value, error) pairs, but it's not obvious what to assume: is each error independent, or does the first one terminate the stream? Or can there be many errors? Can both a value and an error appear together, as with some Go functions that return (T, error)?

For these reasons, I would not try to fit I/O entirely within an iterator. You shouldn't try to hide the concepts of Open and Close, and it's best to avoid observable side effects (e.g. on the state of a file descriptor or a shared Scanner) in your iterator.

