deep learning - What does global_step mean in Tensorflow?



In this is tutorial code from TensorFlow website,

  1. could anyone help explain what does global_step mean?

    I found on the Tensorflow website written that global step is used count training steps, but I don't quite get what exactly it means.

  2. Also, what does the number 0 mean when setting up global_step?

    def training(loss,learning_rate):
        optimizer = tf.train.GradientDescentOptimizer(learning_rate)

        # Why 0 as the first parameter of the global_step tf.Variable?
        global_step = tf.Variable(0, name='global_step',trainable=False)

        train_op = optimizer.minimize(loss, global_step=global_step)

        return train_op

According to Tensorflow doc global_step: increment by one after the variables have been updated. Does that mean after one update global_step becomes 1?

3 Answers: 

global_step refers to the number of batches seen by the graph. Every time a batch is provided, the weights are updated in the direction that minimizes the loss. global_step just keeps track of the number of batches seen so far. When it is passed in the minimize() argument list, the variable is increased by one. Have a look at optimizer.minimize().

You can get the global_step value using tf.train.global_step(). Also handy are the utility methods tf.train.get_global_step or tf.train.get_or_create_global_step.

0 is the initial value of the global step in this context.


The global_step Variable holds the total number of steps during training across the tasks (each step index will occur only on a single task).

A timeline created by global_step helps us understand know where we are in the grand scheme, from each of the tasks separately. For instance, the loss and accuracy could be plotted against global_step on Tensorboard.


show you a vivid sample below:


train_op = tf.train.GradientDescentOptimizer(learning_rate=LEARNING_RATE).minimize(loss_tensor,global_step=tf.train.create_global_step())
with tf.Session() as sess:
    tf.logging.log_every_n(tf.logging.INFO,"np.mean(loss_evl)= %f at step %d",100,np.mean(loss_evl),

corresponding print

INFO:tensorflow:np.mean(loss_evl)= 1.396970 at step 1
INFO:tensorflow:np.mean(loss_evl)= 1.221397 at step 101
INFO:tensorflow:np.mean(loss_evl)= 1.061688 at step 201