The Parallelism in TF

Two Kinds of Parallelism Configuration in Session

intra_op_parallelism_threads controls the maximum parallel speedup for a single operator; TF uses an Eigen::ThreadPoolDevice thread pool to parallelize the computation inside one op.

inter_op_parallelism_threads controls how many ops that have no data-dependency path between each other are executed concurrently, each on a separate thread. A configuration sketch follows.
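As a minimal sketch (TF 1.x API), both knobs are set through tf.ConfigProto when the session is created; the thread counts and the matmul workload below are only illustrative values:

```python
import tensorflow as tf

# Give each individual op up to 4 intra-op threads (the Eigen pool),
# and allow at most 2 independent ops to run concurrently.
config = tf.ConfigProto(
    intra_op_parallelism_threads=4,
    inter_op_parallelism_threads=2)

a = tf.random_normal([1024, 1024])
b = tf.random_normal([1024, 1024])
c = tf.matmul(a, b)  # a single op; parallelized by the intra-op pool

with tf.Session(config=config) as sess:
    print(sess.run(c))
```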
Data Parallelism on GPUs
For data parallelism, the user needs to convert the model into a parallel version: each GPU computes its gradients independently, and the gradients are then merged by averaging. The following is a piece of cifar10_multi_gpu_train.py:
```python
...
# Get images and labels for CIFAR-10.
images, labels = cifar10.distorted_inputs()
batch_queue = tf.contrib.slim.prefetch_queue.prefetch_queue(
    [images, labels], capacity=2 * FLAGS.num_gpus)
# Calculate the gradients for each model tower.
tower_grads = []
with tf.variable_scope(tf.get_variable_scope()):
  for i in xrange(FLAGS.num_gpus):
    with tf.device('/gpu:%d' % i):
      with tf.name_scope('%s_%d' % (cifar10.TOWER_NAME, i)) as scope:
        # Dequeues one batch for the GPU
        image_batch, label_batch = batch_queue.dequeue()
        # Calculate the loss for one tower of the CIFAR model. This function
        # constructs the entire CIFAR model but shares the variables across
        # all towers.
        loss = tower_loss(scope, image_batch, label_batch)
        ....
        # Calculate the gradients for the batch of data on this CIFAR tower.
        grads = opt.compute_gradients(loss)
        # Keep track of the gradients across all towers.
        tower_grads.append(grads)

# We must calculate the mean of each gradient. Note that this is the
# synchronization point across all towers.
grads = average_gradients(tower_grads)
```
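The snippet stops at the synchronization point, so here is a rough sketch of what an average_gradients helper looks like (following the shape of the tutorial code, not quoted from it): for every variable it stacks the per-tower gradients and reduces them to their mean, pairing the result with the variable from the first tower.

```python
def average_gradients(tower_grads):
  """Averages gradients across towers.

  tower_grads: one list of (gradient, variable) pairs per tower, as
  returned by opt.compute_gradients(). Returns a single list of
  (averaged_gradient, variable) pairs.
  """
  average_grads = []
  # zip(*tower_grads) groups together the (grad, var) pairs that belong
  # to the same variable on the different towers.
  for grad_and_vars in zip(*tower_grads):
    # Stack the per-tower gradients along a new leading axis and average.
    grads = [tf.expand_dims(g, 0) for g, _ in grad_and_vars]
    grad = tf.reduce_mean(tf.concat(grads, axis=0), axis=0)
    # Variables are shared across towers, so the first tower's variable
    # is as good as any other.
    v = grad_and_vars[0][1]
    average_grads.append((grad, v))
  return average_grads
```

In the tutorial the averaged (gradient, variable) pairs are then applied once, with opt.apply_gradients(), which is why this averaging is the synchronization point across all towers.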
Links