Here are some useful notes and tricks for TensorFlow 1.x, a powerful deep learning framework developed by Google.
Op
placeholder
tf.placeholder_with_default(
    input,
    shape,
    name=None
)
"""
Args:
  input: a Tensor. The default value to produce when the output is not fed.
  shape: a `tf.TensorShape` / list(int), possibly a partial shape.
  name (optional): tensor name.
Returns:
  A Tensor.
"""
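A minimal usage sketch (the tensor values and names are made up for illustration): the default value is used when the placeholder is not fed, and feed_dict overrides it otherwise.

import tensorflow as tf

default_val = tf.constant([[1.0, 2.0], [3.0, 4.0]])
x = tf.placeholder_with_default(default_val, shape=[None, 2], name="x")
y = x * 2

with tf.Session() as sess:
    print(sess.run(y))                               # uses the default value
    print(sess.run(y, feed_dict={x: [[5.0, 6.0]]}))  # the fed value overrides the default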
Tensor.eval(feed_dict=None, session=None)
"""
Args:
  feed_dict: a feed_dict, like in `session.run()`.
  session: the session in which to evaluate the tensor. If None, the default session is used.
Returns:
  A numpy array of values.
"""
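For example (a small sketch): inside a `with tf.Session()` block that session becomes the default, so `Tensor.eval()` behaves like `sess.run(tensor)`.

import tensorflow as tf

a = tf.constant(3.0)
b = a * 2

with tf.Session() as sess:   # the session opened by `with` becomes the default session
    print(b.eval())          # equivalent to sess.run(b)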
tf.group()
tf.group(
    *inputs,
    name=None
)
"""
Args:
  *inputs: zero or more tensors to group.
  name (optional): op name.
"""
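A small sketch of a typical use: bundle several update ops into a single op that can be run together (the variables here are just for illustration).

import tensorflow as tf

v1 = tf.Variable(0)
v2 = tf.Variable(0)
update = tf.group(tf.assign_add(v1, 1), tf.assign_add(v2, 2))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(update)              # runs both assign ops; tf.group returns no value
    print(sess.run([v1, v2]))     # [1, 2]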
Get the batch size from a placeholder
bsz = tf.shape(placeholder)[0]
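A quick sketch of why this is handy: the static batch dimension is None, but tf.shape returns the runtime value (the shapes here are arbitrary).

import numpy as np
import tensorflow as tf

placeholder = tf.placeholder(tf.float32, shape=[None, 128])  # static batch dim is unknown
bsz = tf.shape(placeholder)[0]                               # dynamic batch size (scalar int32 tensor)

with tf.Session() as sess:
    print(sess.run(bsz, feed_dict={placeholder: np.zeros((32, 128))}))  # 32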
Tensor manipulation
tf.where
tf.where(
    condition,
    x=None,
    y=None,
    name=None
)
"""
Args:
  condition: a bool tensor.
Returns:
  If x == y == None, returns the coordinates of the True elements of condition.
  Shape -> (# of True elements, rank(condition)).
"""
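A small sketch of the condition-only form, which returns the coordinates of the True elements:

import tensorflow as tf

cond = tf.constant([[True, False],
                    [False, True]])
coords = tf.where(cond)           # shape: (# of True elements, rank(cond))

with tf.Session() as sess:
    print(sess.run(coords))       # [[0 0]
                                  #  [1 1]]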
tf.gather
tf.gather slices `params` at the given `indices` along `axis`.
output shape = params.shape[:axis] + indices.shape[batch_dims:] + params.shape[axis+1:]. The middle term replaces the axis that is sliced on.
tf.gather(
    params,
    indices,
    axis=None,
    batch_dims=0,
    name=None
)
"""
Args:
  params: the tensor to gather from.
  indices: the index tensor.
  axis: the axis in params to gather indices from. Defaults to the first non-batch dimension.
  batch_dims: an integer; must be less than rank(indices).
"""
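A minimal sketch gathering rows and columns of a small matrix (values chosen just for illustration):

import tensorflow as tf

params = tf.constant([[1, 2, 3],
                      [4, 5, 6],
                      [7, 8, 9]])
rows = tf.gather(params, [2, 0], axis=0)   # [[7, 8, 9], [1, 2, 3]]
cols = tf.gather(params, [0, 2], axis=1)   # [[1, 3], [4, 6], [7, 9]]

with tf.Session() as sess:
    print(sess.run([rows, cols]))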
tf.gather_nd
Slices params using the given indices; the last dimension of indices holds coordinates into params.
It slices on the first N dims of params, where N = indices.shape[-1], i.e. the number of coordinates per slice.
indices.shape[-1] <= params.rank
if equal, each index gathers a single element.
if smaller, each index gathers a slice along the first indices.shape[-1] axes.
out.shape = indices.shape[:-1] + params.shape[indices.shape[-1]:]. indices.shape[-1] determines which dims of params are consumed by the slicing.
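A small sketch contrasting the two cases (element gathering vs. slice gathering):

import tensorflow as tf

params = tf.constant([[1, 2],
                      [3, 4]])
elems = tf.gather_nd(params, [[0, 0], [1, 1]])  # indices.shape[-1] == rank(params): elements -> [1, 4]
rows  = tf.gather_nd(params, [[1], [0]])        # indices.shape[-1] <  rank(params): rows -> [[3, 4], [1, 2]]

with tf.Session() as sess:
    print(sess.run([elems, rows]))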
tf.matrix_band_part
Copy a tensor, setting everything outside a central band in each innermost matrix to zero. That is, elements below the `num_lower`-th subdiagonal (if not -1) and above the `num_upper`-th superdiagonal (if not -1) are set to zero.
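A common use, sketched below, is building a lower-triangular (causal) mask, e.g. for attention:

import tensorflow as tf

ones = tf.ones([4, 4])
lower_tri = tf.matrix_band_part(ones, -1, 0)  # keep everything on/below the diagonal
diag_only = tf.matrix_band_part(ones, 0, 0)   # keep only the diagonal

with tf.Session() as sess:
    print(sess.run(lower_tri))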
tf.clip_by_norm
tf.clip_by_norm(t, clip_norm, axes=None, name=None)
"""
Args:
  t: the gradient tensor to be clipped.
  clip_norm: a maximum clipping norm.
  axes: the dimensions along which to compute the L2 norm. Default None means all dimensions.
  name (optional): op name.
"""
t * clip_norm / l2norm(t)    # applied only when l2norm(t) > clip_norm
# example
opt = tf.train.AdamOptimizer(learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-08)
grads_vals = opt.compute_gradients(loss)
grads_vals = [(tf.clip_by_norm(g, clip_norm), v) for g, v in grads_vals if g is not None]
train_op = opt.apply_gradients(grads_vals)
tf.clip_by_global_norm
tf.clip_by_global_norm(t_list, clip_norm, use_norm=None, name=None)
"""
Args:
  t_list: a tuple or list of tensors.
  clip_norm: a maximum clipping norm.
  use_norm (optional): specify the global norm, if already computed.
  name (optional): op name.
"""
t_list[i] * clip_norm / max(global_norm, clip_norm)
global_norm = sqrt(sum([l2norm(t)**2 for t in t_list]))
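A sketch of the typical training pattern (the toy loss and clip_norm=5.0 are assumptions for illustration):

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 10])
w = tf.Variable(tf.random_normal([10, 1]))
loss = tf.reduce_mean(tf.square(tf.matmul(x, w)))   # toy loss

opt = tf.train.AdamOptimizer(learning_rate=0.001)
grads, variables = zip(*opt.compute_gradients(loss))
clipped_grads, global_norm = tf.clip_by_global_norm(grads, clip_norm=5.0)
train_op = opt.apply_gradients(zip(clipped_grads, variables))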
tf.clip_by_average_norm
tf.clip_by_average_norm(t, clip_norm, name=None)
"""
Args:
  t: the gradient tensor to be clipped.
  clip_norm: a maximum clipping norm.
  name (optional): op name.
"""
t * clip_norm / l2norm_avg(t)
where l2norm_avg(t) = l2norm(t) / m and $m$ is the number of elements in tensor $t$.
tf.clip_by_value
tf.clip_by_value(t, clip_value_min, clip_value_max, name=None)
"""
Args:
  t: the gradient tensor.
  clip_value_min: the minimum clip value.
  clip_value_max: the maximum clip value.
  name (optional): op name.
"""
t[t > clip_value_max] = clip_value_max
t[t < clip_value_min] = clip_value_min
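For example (arbitrary values), element-wise clipping into [clip_value_min, clip_value_max]:

import tensorflow as tf

t = tf.constant([-2.0, 0.5, 3.0])
clipped = tf.clip_by_value(t, clip_value_min=-1.0, clip_value_max=1.0)

with tf.Session() as sess:
    print(sess.run(clipped))   # [-1.   0.5  1. ]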
Sample from a multinomial distribution
# method 1: (recommended)
multinom = tf.distributions.Multinomial(
    total_count=tf.constant(1, dtype=tf.float32),  # sample one for each record in the batch, that is [1, batch_size]
    probs=probs)
sampled_actions = multinom.sample()                # sample one action for each record in the batch
predicted_actions = tf.argmax(sampled_actions, axis=-1)
action_probs = sampled_actions * predicted_probs
action_probs = tf.reduce_sum(action_probs, axis=-1)
tf.while_loop with TensorArray
# loop body: write a value at the $t$-th index of the TensorArray
def body(t, output_ta):
    output_ta = output_ta.write(t, [2., 3.])
    return t + 1, output_ta

# loop condition (example: run the loop 3 times)
def condition(t, output_ta):
    return t < 3

t = tf.constant(0)
# define the TensorArray
output_ta = tf.TensorArray(dtype=tf.float32, size=1, dynamic_size=True)
# while_loop
result = tf.while_loop(condition, body, loop_vars=[t, output_ta])
last_t, last_out = result

final_out = last_out.stack()  # stack the values in the TensorArray into a single tensor
Eager execution mode
Eager execution mode supports dynamic graphs: operations are evaluated immediately, so tensor values can be printed while the graph is being built, without using tf.Session(), in version 1.x. [4]
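A minimal sketch (tf.enable_eager_execution is available in later 1.x releases and must be called at program start, before any graph ops are built):

import tensorflow as tf

tf.enable_eager_execution()

a = tf.constant([[1.0, 2.0]])
b = tf.constant([[3.0], [4.0]])
print(tf.matmul(a, b))   # the value is printed immediately, no Session needed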
List the placeholders of an imported graph
# collect all Placeholder ops from an imported graph
placeholders = [op for op in imported_graph.get_operations() if op.type == "Placeholder"]
print(len(placeholders))
for p in placeholders:
    print(p.name)
""" allow_soft_placement --------------------- Whether soft placement is allowed. If allow_soft_placement is true, an op will be placed on CPU if 1. there's no GPU implementation for the OP or 2. no GPU devices are known or registered or 3. need to co-locate with reftype input(s) which are from CPU. """ config = tf.ConfigProto( allow_soft_placement=True, # allow soft device placement log_device_placement=True# Whether device placements should be logged. )