Discrete histograms/counters across steps/epochs #4736

Open
@gonnet

Description

@gonnet
Contributor

I'm currently tracking several discrete values, e.g. the number of iterations a loop has gone through, over several training steps and dumping them to a tf.summary.histogram at the end of each epoch.

My current approach amounts to storing counts of the values seen in a counts = tf.Variable(tf.zeros([max_counter_value], dtype=tf.int32), trainable=False) and then constructing a "fake" tf.summary.histogram, e.g.

      data = tf.repeat(
          tf.range(max_counter_value, dtype=tf.float32),
          repeats=counts)
      tf.summary.histogram(name, data, step=step, buckets=2 * max_counter_value - 1)

This is a bit clunky for several reasons:

  • I need to inflate the counts into the data Tensor, which can contain several thousands/millions of values, even though tf.summary.histogram probably re-computes counts internally,
  • Plotting the data in TensorBoard smears the discrete values over several buckets, so I never actually see the total count,
  • (Somewhat tangentially) I have to handle accumulating and flushing the counters myself.
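The first bullet can be illustrated without TensorFlow. The sketch below is plain Python, not the TensorBoard implementation: `inflate` and `recount` are hypothetical helper names that mimic expanding counts into a data tensor and then tallying it again, to show that the round trip only reproduces the counts that were already available.

```python
from collections import Counter


def inflate(counts):
    """Expand counts[i] copies of the value i into one flat list.

    Plain-Python stand-in for tf.repeat(tf.range(n), repeats=counts).
    """
    return [value for value, n in enumerate(counts) for _ in range(n)]


def recount(data, num_values):
    """Tally the inflated data back into per-value counts."""
    tally = Counter(data)
    return [tally.get(value, 0) for value in range(num_values)]


counts = [0, 3, 5, 0, 2]
data = inflate(counts)

# The inflated list has one element per observation, not one per distinct value.
print(len(counts), len(data))
# Re-tallying recovers exactly the counts we started from.
print(recount(data, len(counts)) == counts)
```

The memory cost of the inflated list grows with the total number of observations, while the counts themselves only grow with the number of distinct values.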

What would be really super cool would be a function named something like tf.summary.counter(value: tf.Tensor, name: str, flush_every_n_epochs:int = 1) where I can just dump in Tensors of integer types and get the discrete (unsmoothed) histograms every 'n' epochs.
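To make the requested semantics concrete, here is a minimal sketch in plain Python of what such an accumulator might do. Nothing here exists in TensorFlow or TensorBoard: the class name `DiscreteCounter` and the methods `update` and `end_epoch` are hypothetical, and only `flush_every_n_epochs` is taken from the proposal above.

```python
from collections import Counter


class DiscreteCounter:
    """Hypothetical accumulator: collects integer values, flushes every n epochs."""

    def __init__(self, name, flush_every_n_epochs=1):
        if flush_every_n_epochs < 1:
            raise ValueError("flush_every_n_epochs must be >= 1")
        self.name = name
        self.flush_every_n_epochs = flush_every_n_epochs
        self._counts = Counter()
        self._epochs_seen = 0

    def update(self, values):
        """Record a batch of integer observations (e.g. from one training step)."""
        self._counts.update(int(v) for v in values)

    def end_epoch(self):
        """Return {value: count} on a flush epoch and reset; otherwise return None."""
        self._epochs_seen += 1
        if self._epochs_seen % self.flush_every_n_epochs != 0:
            return None
        flushed = dict(sorted(self._counts.items()))
        self._counts.clear()
        return flushed


counter = DiscreteCounter("loop_iterations", flush_every_n_epochs=2)
counter.update([1, 2, 2])
first = counter.end_epoch()   # epoch 1: nothing flushed yet
counter.update([2, 3])
second = counter.end_epoch()  # epoch 2: counts from both epochs are flushed
print(first, second)
```

The state kept between calls is exactly the part that makes a built-in version harder: something has to own the counts across steps.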

I'm guessing the third part (accumulating values across steps) is probably a bit iffy since it would require maintaining some kind of state, but I'm hoping something like calling tf.summary.histogram with a set of pre-bucketed counts should be possible?
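As a rough sketch of what "pre-bucketed counts" could look like as input, the plain-Python helper below turns per-value counts into one bucket per integer. Both the `(left_edge, right_edge, count)` row layout and the unit-width buckets centered on each integer are assumptions made for illustration, and `counts_to_bucket_rows` is a hypothetical name, not an existing API.

```python
def counts_to_bucket_rows(counts, half_width=0.5):
    """Map counts[i] to a (left_edge, right_edge, count) row centered on i.

    Assumption: one bucket per integer value, so no count is split
    across neighbouring buckets.
    """
    rows = []
    for value, count in enumerate(counts):
        if count < 0:
            raise ValueError(f"negative count {count} for value {value}")
        rows.append((value - half_width, value + half_width, float(count)))
    return rows


rows = counts_to_bucket_rows([0, 3, 5])
for left, right, count in rows:
    print(left, right, count)
```

With one row per distinct value, the size of the summary depends on `max_counter_value` rather than on the number of observations.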

Cheers, Pedro

Activity

self-assigned this
on Mar 5, 2021
wchargin commented on Mar 6, 2021

@wchargin
Contributor

Hi @gonnet! Thanks for reaching out (and sorry for the delayed response;
meant to post this message earlier). See #1015, #1803, et al. for prior
discussions about how the histogram summary recording and visualization
could be improved.

The request for “stateful histograms” (put in tensors across multiple
calls/steps, flush out a histogram on demand) is new, as far as I’m
aware, but there is some prior art in the TF 1.x PR curves summary.
Making that kind of thing TF2-compatible is not simple, but histograms
may be simpler than PR curves. I’ll keep this issue open for that
request. Can’t promise that we’ll implement it any time soon, but we’ll
keep it on the backlog.

nfelt commented on Mar 15, 2021

@nfelt
Contributor

> I'm hoping something like calling tf.summary.histogram with a set of pre-bucketed counts should be possible?

I think this is basically #900. They aren't quite the same, since that issue asks to use the visualization with direct probabilities rather than a true "histogram" that represents many observations, so the semantics are a bit different, but presumably implementation-wise it more or less amounts to the same thing: a bar chart where you specify a set of values to plot and each one is associated with some range of the x-axis. (This is distinct from cases where each value is associated with actually just a single point on the x-axis, or with some categorical value, which is more like #2145.)

So we may want to consider, if we do provide the ability to pass pre-bucketed counts, whether we want to stick with a fairly restrictive option (i.e. they have to be integer counts and should semantically represent counts, e.g. be non-negative) or allow a more general case (e.g. values can be floating point, or possibly even negative).
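The two options can be contrasted with a small validation sketch in plain Python. The function `check_bucket_values` and its `strict` flag are hypothetical and only illustrate the design choice; they are not part of any existing summary API.

```python
import math


def check_bucket_values(values, strict=True):
    """Validate per-bucket values under a restrictive or a general policy.

    strict=True:  values must be non-negative whole numbers (true counts).
    strict=False: any finite number is accepted (e.g. probabilities, negatives).
    """
    checked = []
    for v in values:
        if isinstance(v, bool) or not isinstance(v, (int, float)):
            raise TypeError(f"bucket value {v!r} is not a number")
        if math.isnan(v) or math.isinf(v):
            raise ValueError(f"bucket value {v!r} is not finite")
        if strict:
            if v < 0:
                raise ValueError(f"count {v!r} is negative")
            if v != int(v):
                raise ValueError(f"count {v!r} is not a whole number")
            checked.append(int(v))
        else:
            checked.append(float(v))
    return checked


print(check_bucket_values([0, 3, 5.0]))                  # restrictive policy
print(check_bucket_values([0.25, -1.5], strict=False))   # general policy
```

The restrictive policy keeps the "histogram of observations" meaning intact, while the general one covers the probability-style use case at the cost of looser semantics.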

canbakiskan commented on Jul 1, 2021

@canbakiskan
Contributor

I tried to change histogram plots to look like bar graph plots: link

[screenshot]

Also, about the second point, if you have a predetermined number of buckets for all of your plots, you can edit this line and recompile. Then the counts will not be smeared if you pass buckets=<nb_bins> to tf.summary.histogram() or bins=<nb_bins> to torch.utils.tensorboard.writer.SummaryWriter().add_histogram(). That being said, it's a stopgap measure.

Participants: @nfelt, @wchargin, @gonnet, @canbakiskan, @arghyaganguly

Discrete histograms/counters across steps/epochs · Issue #4736 · tensorflow/tensorboard