I see a large difference in total runtime between spinning off 1 task and spinning off 3, 4, or more (depending on the thread resources of the machine): the more tasks I spin off, the longer the runtime gets.

I pass a data collection to each worker instance, and the worker tasks read from it concurrently. I assumed tasks could access a shared data structure without blocking as long as the operations only read or enumerate it.

My goal is to spin off multiple tasks that iterate the same shared data structure with read operations and have them finish at almost the same time, regardless of how many tasks are spun off.
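The read-only assumption itself is sound: multiple tasks may enumerate the same List&lt;T&gt; concurrently without locking, as long as nothing writes to it. A minimal sketch (names and sizes are illustrative, not from the question):

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

class SharedReadDemo
{
    static void Main()
    {
        var shared = Enumerable.Range(1, 1000).Select(i => (double)i).ToList();

        // Four tasks enumerate the same list concurrently; with no writers
        // present, this is safe and requires no locking.
        var sums = Task.WhenAll(
            Enumerable.Range(0, 4).Select(_ => Task.Run(() => shared.Sum())));

        foreach (var s in sums.Result)
            Console.WriteLine(s); // each task sees the same total: 500500
    }
}
```

Concurrent *reads* are not what makes the runtimes diverge; the cost comes from the work itself and from scheduling overhead.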
Edit: see my second code snippet below, where I spin off numberTasksToSpinOff workers and give each worker its own data set, so the same data structure is not shared across tasks/threads. Yet I still see unacceptable overhead.
Second Code Snippet:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine($"Entry Main Function Thread Id: {Thread.CurrentThread.ManagedThreadId}");

        //run
        var task = Task.Run(async () =>
        {
            Console.WriteLine($"Entry RunMe Task Thread Id: {Thread.CurrentThread.ManagedThreadId}");
            await RunMe();
            Console.WriteLine($"Exit RunMe Task Thread Id: {Thread.CurrentThread.ManagedThreadId}");
        });
        task.Wait();

        Console.WriteLine($"Exit Main Function Thread Id: {Thread.CurrentThread.ManagedThreadId}");
        Console.WriteLine("Press key to quit");
        Console.ReadLine();
    }

    private static async Task RunMe()
    {
        var watch = new Stopwatch();
        var numberTasksToSpinOff = 6;
        var numberItems = 20000;
        var random = new Random((int)DateTime.Now.Ticks);
        var dataPoints = Enumerable.Range(1, numberItems).Select(x => random.NextDouble()).ToList();
        var tasks = new List<Task>();
        var workers = new List<Worker>();

        //structure workers
        for (int i = 1; i <= numberTasksToSpinOff; i++)
        {
            workers.Add(new Worker(i, dataPoints));
        }

        //start timer
        watch.Restart();

        //spin off tasks
        foreach (var worker in workers)
        {
            tasks.Add(Task.Run(() =>
            {
                Console.WriteLine($"Entry WorkerId: {worker.WorkerId} -> New Tasks spun off with in Thread Id: {Thread.CurrentThread.ManagedThreadId}");
                worker.DoSomeWork();
                Console.WriteLine($"Exit WorkerId: {worker.WorkerId} -> New Tasks spun off with in Thread Id: {Thread.CurrentThread.ManagedThreadId}");
            }));
        }

        //completion tasks
        await Task.WhenAll(tasks);

        //stop timer
        watch.Stop();
        Console.WriteLine($"Time it took to complete in Milliseconds: {watch.ElapsedMilliseconds}");
    }
}

public class Worker
{
    public int WorkerId { get; set; }
    private List<double> _data;

    public Worker(int workerId, List<double> data)
    {
        WorkerId = workerId;
        _data = data;
    }

    public void DoSomeWork()
    {
        var indexPos = 0;
        foreach (var dp in _data)
        {
            // Copies a shrinking tail of the list on every iteration: O(n^2) work.
            var subSet = _data.Skip(indexPos).Take(_data.Count - indexPos).ToList();
            indexPos++;
        }
    }
}
Answer 0 (score: 3)
Note: the following answer is based on testing and observation, not definitive knowledge.

The more tasks you spin off, the more overhead is generated, and so the total execution time increases. But if you look at it from another angle, you will see that the number of "data points" actually processed increases with the number of tasks you start (up to the limit of the available hardware threads):

My machine (4C/8T) produces the following values with 10000 points per list:

Up to my "core limit" the amount of processed data increases significantly; up to my "thread limit" it still increases noticeably; but beyond that it decreases again, because the overhead grows while no additional hardware resources are available.
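The trade-off described above can be reproduced with a small timing harness: run the same O(n^2) tail-copy workload under an increasing number of tasks and compare elapsed times. This is a sketch; the task counts and list size are arbitrary choices, not values from the answer:

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

class OverheadDemo
{
    static void Main()
    {
        // Smaller than the question's 20000 items so the demo finishes quickly.
        var data = Enumerable.Range(1, 5000).Select(i => (double)i).ToList();

        foreach (var taskCount in new[] { 1, 2, 4, 8 })
        {
            var watch = Stopwatch.StartNew();
            var tasks = Enumerable.Range(0, taskCount)
                .Select(_ => Task.Run(() =>
                {
                    // Same per-task workload as Worker.DoSomeWork in the question.
                    for (int pos = 0; pos < data.Count; pos++)
                    {
                        var subSet = data.Skip(pos).Take(data.Count - pos).ToList();
                    }
                }))
                .ToArray();
            Task.WaitAll(tasks);
            watch.Stop();
            Console.WriteLine($"{taskCount} task(s): {watch.ElapsedMilliseconds} ms");
        }
    }
}
```

Wall-clock time per run grows with the task count (each task does the full workload), while throughput in items per second keeps improving until the hardware-thread limit is reached.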
Answer 1 (score: 2)
Have you looked at parallel tasks? Then you can do something like this.

For example:
if (workers.Any())
{
    var parallelOptions = new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount };
    Parallel.ForEach(workers, parallelOptions, DoSomeWork);
}

private static void DoSomeWork(Worker worker)
{
    // per-worker processing goes here
}
}