Name Scope and Variable Scope


import collections
import tensorflow as tf


# Util
def get_dependent_variables(tensor):
  """Returns all variables that the tensor `tensor` depends on.
  
  Forked from: https://stackoverflow.com/a/42861919/1218716
  
  Args:
    tensor: Tensor.
    
  Returns:
    List of variables.
  """
  # Initialize
  starting_op = tensor.op
  dependent_vars = []
  queue = collections.deque()
  queue.append(starting_op)
  op_to_var = {var.op: var for var in tf.trainable_variables()}
  visited = set([starting_op])

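  # Traverse the graph backwards (breadth-first) along the ops' inputs.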
  while queue:
    op = queue.popleft()
    try:
      dependent_vars.append(op_to_var[op])
    except KeyError:
      # `op` is not a variable, so search its inputs (if any). 
      for op_input in op.inputs:
        if op_input.op not in visited:
          queue.append(op_input.op)
          visited.add(op_input.op)

  return dependent_vars

Implicit Variables

In TensorFlow, a function's parameters are often implicit: tf.layers.dense, for instance, creates its weights and biases internally. Now suppose you have defined a function by composing tf.layers.dense, like

def f(x):
  hidden = tf.layers.dense(x, 256, activation=tf.nn.relu)
  return tf.layers.dense(hidden, 1, activation=None)

And you apply it to an input:

x = tf.placeholder(shape=[None, 64], dtype='float32', name='x')
y = f(x)

The variables are indeed implicit, but recoverable:

get_dependent_variables(y)
[<tf.Variable 'dense_1/bias:0' shape=(1,) dtype=float32_ref>,
 <tf.Variable 'dense_1/kernel:0' shape=(256, 1) dtype=float32_ref>,
 <tf.Variable 'dense/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'dense/kernel:0' shape=(64, 256) dtype=float32_ref>]

When Using Variable Scope

Sometimes we want to reuse the variables. Their implicitness makes the reuse non-trivial: we have to employ a tf.variable_scope:

def f(x, reuse=None):
  with tf.variable_scope('f', reuse=reuse):
    return tf.layers.dense(x, 1, activation=tf.nn.relu)

Since the reuse flag of tf.layers.dense and friends defaults to None, inheriting the reuse mode of the enclosing variable scope, this implementation simply reuses the implicit variables of the dense layer.

y_1 = f(x, reuse=tf.AUTO_REUSE)
y_2 = f(x, reuse=tf.AUTO_REUSE)
print(get_dependent_variables(y_1) == get_dependent_variables(y_2))
True
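
A remark on the flag: tf.AUTO_REUSE creates the variables on the first call and reuses them afterwards, whereas reuse=True requires that they already exist. A minimal sketch of the difference, assuming a fresh graph:

# f(x, reuse=True)  # would raise ValueError here: the variables do not exist yet
y_1 = f(x, reuse=tf.AUTO_REUSE)  # first call: the variables are created
y_2 = f(x, reuse=True)           # they exist now, so plain reuse also works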

When NOT Using Variable Scope

It may seem convenient to abandon tf.name_scope and use tf.variable_scope everywhere instead. This is NOT the case.

Suppose we want to pass a function f(x; w) into two operators g and h, both of which accept a function and return a new function, and we want the variable w to be shared between the two results.

# Wrong implementation

def g(f):
  def _g(x):
    with tf.variable_scope('g'):
      return f(x) + 1
  return _g

def h(f):
  def _h(x):
    with tf.variable_scope('h'):
      return f(x) + 2
  return _h

We would expect the variables in f to be reused when the reuse flag is set to tf.AUTO_REUSE. But this is not what happens:

_f = lambda x: f(x, reuse=tf.AUTO_REUSE)
z_1 = g(_f)(x)
z_2 = h(_f)(x)
print(get_dependent_variables(z_1) == get_dependent_variables(z_2))
False

Let’s check the reason explicitly:

print(get_dependent_variables(z_1))
[<tf.Variable 'g/f/dense/bias:0' shape=(1,) dtype=float32_ref>,
 <tf.Variable 'g/f/dense/kernel:0' shape=(64, 1) dtype=float32_ref>]
print(get_dependent_variables(z_2))
[<tf.Variable 'h/f/dense/bias:0' shape=(1,) dtype=float32_ref>,
 <tf.Variable 'h/f/dense/kernel:0' shape=(64, 1) dtype=float32_ref>]

As we can see, the variable scopes 'g' and 'h' are prepended to the variable scope 'f', renaming the variables. When TensorFlow later tries, inside the operator h(), to reuse the variables named 'f/dense/...', it looks under the prefix 'h/' and finds names (i.e. 'h/f/dense/...') that differ from the ones created under 'g/'. Following the semantics of tf.get_variable(), it therefore creates new variables instead of reusing the existing ones.
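
The prefixing behaviour is easy to verify directly, since tf.variable_scope composes by nesting (a minimal sketch):

with tf.variable_scope('g'):
  with tf.variable_scope('f'):
    print(tf.get_variable_scope().name)
g/f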

In this case, we should use tf.name_scope to group the nodes in the graph, instead of tf.variable_scope:

# Correct implementation

def g(f):
  def _g(x):
    with tf.name_scope('g'):
      return f(x) + 1
  return _g

def h(f):
  def _h(x):
    with tf.name_scope('h'):
      return f(x) + 2
  return _h

_f = lambda x: f(x, reuse=tf.AUTO_REUSE)
z_1 = g(_f)(x)
z_2 = h(_f)(x)
print(get_dependent_variables(z_1))
[<tf.Variable 'f/dense/bias:0' shape=(1,) dtype=float32_ref>,
 <tf.Variable 'f/dense/kernel:0' shape=(64, 1) dtype=float32_ref>]
print(get_dependent_variables(z_2))
[<tf.Variable 'f/dense/bias:0' shape=(1,) dtype=float32_ref>,
 <tf.Variable 'f/dense/kernel:0' shape=(64, 1) dtype=float32_ref>]
print(get_dependent_variables(z_1) == get_dependent_variables(z_2))
True

Now everything works as expected.
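
The reason this works is that tf.get_variable, which tf.layers.dense calls under the hood, ignores tf.name_scope altogether; only tf.variable_scope affects variable names. A minimal sketch, in a fresh graph:

with tf.name_scope('scope'):
  v = tf.get_variable('v', shape=[])
  c = tf.constant(0.0, name='c')
print(v.name)  # 'v:0' -- untouched by the name scope
print(c.name)  # 'scope/c:0' -- ordinary ops are still grouped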