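I am trying to implement a simple proof of concept of Hogwild in TensorFlow. Ideally, it would use Python threads to perform the updates. I already have an implementation that closely follows the Deep MNIST for Experts tutorial from the TF website, except that it uses Hogwild. The implementation is available here: https://github.com/nivwusquorum/a3c/blob/master/hogwild/mnist.py
The key part of the code is here: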
    def accuracy(session, graphs, data_iter, num_threads, train=False):
        num_total = 0
        num_correct = 0

        def process_batch(batch_x, batch_y):
            nonlocal num_correct
            nonlocal num_total
            # Borrow one set of expressions for the duration of this batch.
            with graphs.lease() as g:
                input_placeholder, output_placeholder, \
                    keep_prob_placeholder, train_step_f, num_correct_f, \
                    no_op = g
                # Run the train step only when training; otherwise run a no-op.
                batch_num_correct, _ = session.run(
                    [num_correct_f, train_step_f if train else no_op],
                    {
                        input_placeholder: batch_x,
                        output_placeholder: batch_y,
                        keep_prob_placeholder: 0.5 if train else 1.0,
                    })
                num_correct += batch_num_correct
                num_total += len(batch_x)

        with BlockOnFullThreadPool(max_workers=num_threads, queue_size=num_threads // 2) as pool:
            for i, (batch_x, batch_y) in enumerate(data_iter):
                pool.submit(process_batch, batch_x, batch_y)
            pool.shutdown(wait=True)

        return float(num_correct) / float(num_total)
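BlockOnFullThreadPool is not shown in the snippet above. For readers who want to run something similar, here is a rough sketch of how such a pool could work, with a lock around the shared counters. The names BoundedThreadPool and count_correct are hypothetical, the semaphore-based design is only an assumption about how a "block when full" pool might be built, and the batches are plain Python lists instead of TensorFlow outputs.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BoundedThreadPool:
    """Hypothetical stand-in for BlockOnFullThreadPool: submit() blocks when full."""

    def __init__(self, max_workers, queue_size):
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        # One slot per running task plus queue_size waiting tasks.
        self._slots = threading.BoundedSemaphore(max_workers + queue_size)

    def submit(self, fn, *args, **kwargs):
        self._slots.acquire()  # blocks the producer while all slots are taken
        try:
            future = self._executor.submit(fn, *args, **kwargs)
        except Exception:
            self._slots.release()
            raise
        # Free the slot once the task has finished (or failed).
        future.add_done_callback(lambda _: self._slots.release())
        return future

    def shutdown(self, wait=True):
        self._executor.shutdown(wait=wait)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.shutdown(wait=True)
        return False


def count_correct(batches, num_threads):
    """Compute accuracy over (predictions, labels) batches using worker threads."""
    lock = threading.Lock()
    totals = {"correct": 0, "total": 0}

    def process_batch(preds, labels):
        correct = sum(int(p == y) for p, y in zip(preds, labels))
        # Guard the shared counters so concurrent increments are not lost.
        with lock:
            totals["correct"] += correct
            totals["total"] += len(labels)

    with BoundedThreadPool(num_threads, num_threads // 2) as pool:
        for preds, labels in batches:
            pool.submit(process_batch, preds, labels)

    return totals["correct"] / totals["total"]
```

The lock only protects the bookkeeping counters; it has nothing to do with the lock-free variable updates that Hogwild relies on.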
Basically, I create num_threads collections of TensorFlow expressions (poorly named "graphs" in my code), in which the variables are reused and the optimizer does not use locking. The implementation has all the desired properties (parallelization speedup, convergence performance, etc.), but on every run it diverges around epoch 30, after reaching its best performance. This happens only with num_threads=10, not with num_threads=1 (a per-epoch training/validation log for num_threads=10 is available here: http://pastebin.com/J1DqAtz8).
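To make the update pattern concrete outside of TensorFlow, here is a minimal pure-Python sketch of Hogwild-style SGD: several threads read and write one shared parameter list with no locking. Everything in it is an assumption for illustration (the function name hogwild_fit, the toy data drawn from y = 2x + 1, the learning rate), and it is not the code from my repository. Under CPython's GIL it shows the unsynchronised read-modify-write pattern but not a parallel speedup.

```python
import random
import threading


def hogwild_fit(num_threads=4, steps_per_thread=5000, lr=0.05, seed=0):
    """Fit y = w*x + b by SGD from several threads sharing [w, b] without locks."""
    # Noise-free toy data from y = 2x + 1 (made up for this illustration).
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
    data = [(x, 2.0 * x + 1.0) for x in xs]

    params = [0.0, 0.0]  # shared [w, b]; every thread reads and writes it

    def worker(worker_seed):
        local_rng = random.Random(worker_seed)
        for _ in range(steps_per_thread):
            x, y = local_rng.choice(data)
            w, b = params[0], params[1]  # may already be stale when used
            err = (w * x + b) - y
            # Unsynchronised updates: another thread's write can be overwritten.
            params[0] -= lr * err * x
            params[1] -= lr * err

    threads = [
        threading.Thread(target=worker, args=(seed + 1 + i,))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return params


if __name__ == "__main__":
    w, b = hogwild_fit()
    print("w = %.4f, b = %.4f" % (w, b))
```

Because the toy data is noise-free, the gradient vanishes at the true parameters, so occasional lost updates slow this sketch down but do not move its fixed point. That is not the case for minibatch training on MNIST.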
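Does anyone know what could be causing this? Am I missing something fundamental? For example, does each graph compute its gradients in a separate array?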