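I really don't understand how this can happen. I'm sure the behavior is unstable and does not reproduce the same way every time.
I have the following code: a core method, and another method that calls the core method concurrently by starting several tasks, each task making one call to the core method.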
    //here is the core method
    public static bool ImportToSql(string targetConnectionString, IEnumerable<SyncedItem> accessFileItems)
    {
        var completelyOk = true;
        using (var con = new SqlConnection(targetConnectionString))
        {
            foreach (var a in accessFileItems)
            {
                Log.WriteOnToday($"Syncing file: {a.Filename} ...");
                var query = "some query here ...";
                var retries = -1;
                while (++retries <= 10)
                {
                    if(retries > 0)
                    {
                        Thread.Sleep(60000);
                        Log.WriteOnToday($"Retrying to sync the file: \"{a.Filename}\" ({retries}/{10}) ...");
                    }
                    var r = ...;//do something here returning bool result
                    completelyOk &= r;
                    if (r)
                    {
                        break;
                    } else if(retries == 10)
                    {
                        Log.WriteOnToday($"FAILED to synce the file \"{a.Filename}\" after {10} retries");
                    }
                }
            }
        }
        return completelyOk;
    }
    //here is another method, actually the test data is grouped in 2 groups
    //corresponding to 2 Tasks
    public static async Task<bool> ParallelImportToSql(string targetConnectionString, IEnumerable<SyncedItem> accessFileItems)
    {
        IEnumerable<Task<bool>> tasks = accessFileItems.Select((e, i) => new { e, i })
            .GroupBy(e => e.i % 2, (key,items) => items.Select(e => e.e))
            .Select(g => Task.Run<bool>(() => ImportToSql(targetConnectionString, g)));
        return (await Task.WhenAll(tasks)).All(e => e);
    }
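So you can see there is a call to Log.WriteOnToday that logs "Retrying ...". I expect it to be called on every retry (in my test, r is always false) and to write exactly one line to the output file each time. So if it runs 10 times, there should be 10 lines. But in fact one line is missing from the output. I would really like to know how this can happen. Could you analyze the code here and point out what might be wrong?
Here is the output (the lines written by Log.WriteOnToday, including the "Retrying ..." ones):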
4/12/2018 5:28:05 PM - HO-IT1 >> Syncing file: \\192.168.0.144\DataKH_311\11_04_2018\cust.mdb ...
4/12/2018 5:29:05 PM - HO-IT1 >> Retrying to sync the file: "\\192.168.0.144\DataKH_311\11_04_2018\cust.mdb" (1/10) ...
4/12/2018 5:29:05 PM - HO-IT1 >> Retrying to sync the file: "\\192.168.0.144\DataKH_311\SVR\cust.mdb" (1/10) ...
4/12/2018 5:30:45 PM - HO-IT1 >> Retrying to sync the file: "\\192.168.0.144\DataKH_311\11_04_2018\cust.mdb" (2/10) ...
4/12/2018 5:32:25 PM - HO-IT1 >> Retrying to sync the file: "\\192.168.0.144\DataKH_311\11_04_2018\cust.mdb" (3/10) ...
4/12/2018 5:32:25 PM - HO-IT1 >> Retrying to sync the file: "\\192.168.0.144\DataKH_311\SVR\cust.mdb" (3/10) ...
4/12/2018 5:34:05 PM - HO-IT1 >> Retrying to sync the file: "\\192.168.0.144\DataKH_311\SVR\cust.mdb" (4/10) ...
4/12/2018 5:34:05 PM - HO-IT1 >> Retrying to sync the file: "\\192.168.0.144\DataKH_311\11_04_2018\cust.mdb" (4/10) ...
4/12/2018 5:35:45 PM - HO-IT1 >> Retrying to sync the file: "\\192.168.0.144\DataKH_311\SVR\cust.mdb" (5/10) ...
4/12/2018 5:35:45 PM - HO-IT1 >> Retrying to sync the file: "\\192.168.0.144\DataKH_311\11_04_2018\cust.mdb" (5/10) ...
4/12/2018 5:37:25 PM - HO-IT1 >> Retrying to sync the file: "\\192.168.0.144\DataKH_311\SVR\cust.mdb" (6/10) ...
4/12/2018 5:37:26 PM - HO-IT1 >> Retrying to sync the file: "\\192.168.0.144\DataKH_311\11_04_2018\cust.mdb" (6/10) ...
4/12/2018 5:39:06 PM - HO-IT1 >> Retrying to sync the file: "\\192.168.0.144\DataKH_311\SVR\cust.mdb" (7/10) ...
4/12/2018 5:39:06 PM - HO-IT1 >> Retrying to sync the file: "\\192.168.0.144\DataKH_311\11_04_2018\cust.mdb" (7/10) ...
4/12/2018 5:40:46 PM - HO-IT1 >> Retrying to sync the file: "\\192.168.0.144\DataKH_311\SVR\cust.mdb" (8/10) ...
4/12/2018 5:40:46 PM - HO-IT1 >> Retrying to sync the file: "\\192.168.0.144\DataKH_311\11_04_2018\cust.mdb" (8/10) ...
4/12/2018 5:42:26 PM - HO-IT1 >> Retrying to sync the file: "\\192.168.0.144\DataKH_311\SVR\cust.mdb" (9/10) ...
4/12/2018 5:42:26 PM - HO-IT1 >> Retrying to sync the file: "\\192.168.0.144\DataKH_311\11_04_2018\cust.mdb" (9/10) ...
4/12/2018 5:44:06 PM - HO-IT1 >> Retrying to sync the file: "\\192.168.0.144\DataKH_311\SVR\cust.mdb" (10/10) ...
4/12/2018 5:44:06 PM - HO-IT1 >> Retrying to sync the file: "\\192.168.0.144\DataKH_311\11_04_2018\cust.mdb" (10/10) ...
4/12/2018 5:44:46 PM - HO-IT1 >> FAILED to synce the file "\\192.168.0.144\DataKH_311\SVR\cust.mdb" after 10 retries
4/12/2018 5:44:46 PM - HO-IT1 >> FAILED to synce the file "\\192.168.0.144\DataKH_311\11_04_2018\cust.mdb" after 10 retries
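There are only 2 files (mdb files to be synced to SQL): \\192.168.0.144\DataKH_311\SVR\cust.mdb and \\192.168.0.144\DataKH_311\11_04_2018\cust.mdb. The second file has all 10 "Retrying" lines, as expected, but the first file has only 9: the line for the second retry (2/10) is missing for no apparent reason. If you look at the timestamps for the first file, the gap between its (1/10) and (3/10) lines is almost double the gap between any other pair of consecutive lines for the same file. So the retry iteration itself appears to have run, and it does not look like the code was simply skipped.
Missing that one log line is not very important in itself, but if something important (some other line of code) were skipped in the same way elsewhere, it could cause a very well hidden bug, which would of course be serious and unacceptable.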
The problem may be in Log.WriteOnToday. Here is the code:
    public static class Log
    {
        // Not shown in the original post; assumed to be a static string field.
        private static string _currentDirectory;
        public static string LastWrittenLogFileName { get; private set; }
        public static void WriteOnToday(string message, bool prependedWithDate = true, bool includeMachineInfo = true, string logFilePrefix = "log_")
        {
            Write(message, prependedWithDate, includeMachineInfo, string.Format(@"Logs\{0}{1:ddMMyyyy}.txt", logFilePrefix, DateTime.Now));
        }
        public static void Write(string message, bool prependedWithDate = true, bool includeMachineInfo = true, string filename = "log.txt")
        {
            //check for relative path in a simple way
            if (Path.GetPathRoot(filename).Length < 3)
            {
                if (_currentDirectory == null)
                {
                    _currentDirectory = System.IO.Path.GetDirectoryName(System.Diagnostics.Process.GetCurrentProcess().MainModule.FileName);
                }
                filename = System.IO.Path.Combine(_currentDirectory, filename);
            }
            LastWrittenLogFileName = filename;
            var dateText = "";
            var machineInfo = "";
            if (prependedWithDate) dateText = DateTime.Now + " - ";
            if (includeMachineInfo) machineInfo = Environment.MachineName + " >> ";
            message = string.Format("{0}{1}{2}", dateText, machineInfo, message);
            var dir = Path.GetDirectoryName(filename);
            if (!Directory.Exists(dir))
            {
                Directory.CreateDirectory(dir);
            }
            try
            {
                using (var sw = File.AppendText(filename))
                {
                    sw.WriteLine(message);
                }
            }
            catch
            {
                //any exception thrown while writing is silently discarded here
            }
        }
    }
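The empty catch above may be where an exception is being swallowed. Some IOException may be thrown there. Honestly, I thought that should happen very rarely (unless there is a problem with the hard drive).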
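
One explanation that fits the timestamps: the missing (2/10) line would have been written at 5:30:45 PM, the same second in which the other task wrote its own (2/10) line. File.AppendText opens the file for writing without allowing other writers, so on Windows a second thread that opens the same file at that moment can get an IOException, which the empty catch then hides. Below is a minimal sketch, assuming that is the cause, that serializes the append with a lock and reports failures instead of dropping them. The names SafeLog, Append and _sync are hypothetical and not part of the original code.

```csharp
using System;
using System.IO;

// Hypothetical helper (not from the original post) illustrating a
// thread-safe replacement for the append step inside Log.Write.
public static class SafeLog
{
    // Single process-wide lock: only one thread appends at a time.
    private static readonly object _sync = new object();

    public static void Append(string filename, string message)
    {
        lock (_sync)
        {
            try
            {
                var dir = Path.GetDirectoryName(Path.GetFullPath(filename));
                Directory.CreateDirectory(dir); // does nothing if it already exists
                using (var sw = File.AppendText(filename))
                {
                    sw.WriteLine(message);
                }
            }
            catch (IOException ex)
            {
                // Report the failure instead of silently losing the log line.
                Console.Error.WriteLine($"Log write failed: {ex.Message}");
            }
        }
    }
}
```

With something like this in place, Log.Write could call SafeLog.Append(filename, message) instead of its own try/catch block. The lock only protects against threads in the same process; if another process holds the file open, the write can still fail, but the failure would at least show up on standard error.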