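I have a scenario in which CSV files with different layouts need to be converted into a common format. Each CSV file contains student IDs, names, and scores, but the column layout varies from file to file. For example, the first CSV file lists student ID, first name, last name, and then the scores in biology, chemistry, physics, English, French, and mathematics:
Distribution type 1: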
001,John,Doe,098,099,095,088,075,096
002,Jane,Doe,099,095,096,085,095,099
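In another CSV file, the same data is laid out as first name, last name, student ID, scores out of 100 in English and French, scores out of 50 in geometry and algebra, and finally scores out of 100 in chemistry, biology, and physics:
Distribution type 2: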
John,Doe,001,088,075,048,048,099,098,095
Jane,Doe,002,085,095,050,049,095,099,096
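The output for both distributions should be:
student ID [space] first name [space] last name [space] language percentage (English + French) [space] mathematics percentage (algebra + geometry) [space] science percentage (physics + chemistry + biology)
So for both distributions above, the output is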
001 John Doe 081 096 097
002 Jane Doe 090 099 097
To convert the input in the first case, the code is as follows:
string csv1 = @"D:\MyPath\Distribution1.csv";
string outputPath = @"D:\MyPath\OutputFile.txt";
string line = string.Empty;
string outputLine = string.Empty;
string[] data = null;
// Open the writer once, outside the loop; opening it per line would overwrite the file each time.
using (StreamReader sr = new StreamReader(csv1))
using (StreamWriter sw = new StreamWriter(outputPath)) {
    while (!sr.EndOfStream) {
        line = sr.ReadLine();
        data = line.Split(new char[] {','}, StringSplitOptions.None);
        outputLine = data[0] + " "
            + data[1] + " " + data[2] + " "
            + Convert.ToString(((Convert.ToInt32(data[6]) + Convert.ToInt32(data[7])) * 100)/200) + " "
            + data[8] + " "
            + Convert.ToString(((Convert.ToInt32(data[3]) + Convert.ToInt32(data[4]) + Convert.ToInt32(data[5])) * 100) / 300);
        sw.WriteLine(outputLine);
    }
}
To process the second CSV file, it would be the following:
string csv2 = @"D:\MyPath\Distribution2.csv";
string outputPath = @"D:\MyPath\OutputFile.txt";
string line = string.Empty;
string outputLine = string.Empty;
string[] data = null;
// Open the writer once, outside the loop; opening it per line would overwrite the file each time.
using (StreamReader sr = new StreamReader(csv2))
using (StreamWriter sw = new StreamWriter(outputPath)) {
    while (!sr.EndOfStream) {
        line = sr.ReadLine();
        data = line.Split(new char[] {','}, StringSplitOptions.None);
        outputLine = data[2] + " "
            + data[0] + " " + data[1] + " "
            + Convert.ToString(((Convert.ToInt32(data[3]) + Convert.ToInt32(data[4])) * 100)/200) + " "
            + Convert.ToString(((Convert.ToInt32(data[5]) + Convert.ToInt32(data[6])) * 100)/100) + " "
            + Convert.ToString(((Convert.ToInt32(data[7]) + Convert.ToInt32(data[8]) + Convert.ToInt32(data[9])) * 100) / 300);
        sw.WriteLine(outputLine);
    }
}
The code that builds outputLine is clearly not ideal. I would like to replace it with expression trees. I am looking for input on how to replace the following code snippets:
outputLine = data[2] + " "
    + data[0] + " " + data[1] + " "
    + Convert.ToString(((Convert.ToInt32(data[3]) + Convert.ToInt32(data[4])) * 100)/200) + " "
    + Convert.ToString(((Convert.ToInt32(data[5]) + Convert.ToInt32(data[6])) * 100)/100) + " "
    + Convert.ToString(((Convert.ToInt32(data[7]) + Convert.ToInt32(data[8]) + Convert.ToInt32(data[9])) * 100) / 300);
and
outputLine = data[0] + " "
    + data[1] + " " + data[2] + " "
    + Convert.ToString(((Convert.ToInt32(data[6]) + Convert.ToInt32(data[7])) * 100)/200) + " "
    + data[8] + " "
    + Convert.ToString(((Convert.ToInt32(data[3]) + Convert.ToInt32(data[4]) + Convert.ToInt32(data[5])) * 100) / 300);
with expression trees.
Please also let me know how an expression tree can be associated with the corresponding input file. I want to invoke the expression tree that matches the file type.
Any help is much appreciated.
Answer 0 (score: 1)
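The biggest problem with your code is that there is no separation of concerns. What you should do is split your code into three parts: parsing the input, combining the scores, and writing the output.
In your case, that could look something like this: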
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// One row of a distribution type 1 file, with typed, named columns.
class StudentScores1
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Biology { get; set; }
    public int Chemistry { get; set; }
    public int Physics { get; set; }
    public int English { get; set; }
    public int French { get; set; }
    public int Mathematics { get; set; }
}

// The common output shape, independent of the input layout.
class CombinedScores
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Languages { get; set; }
    public int Mathematics { get; set; }
    public int Sciences { get; set; }
}
…
static IEnumerable<StudentScores1> ParseScores1(string inputPath)
{
    using (var sr = new StreamReader(inputPath))
    {
        while (!sr.EndOfStream)
        {
            var data = sr.ReadLine().Split(',');
            yield return
                new StudentScores1
                {
                    Id = int.Parse(data[0]),
                    FirstName = data[1],
                    LastName = data[2],
                    Biology = int.Parse(data[3]),
                    Chemistry = int.Parse(data[4]),
                    Physics = int.Parse(data[5]),
                    English = int.Parse(data[6]),
                    French = int.Parse(data[7]),
                    Mathematics = int.Parse(data[8])
                };
        }
    }
}
static int Average(params int[] inputs)
{
    return inputs.Sum() / inputs.Length;
}
static IEnumerable<CombinedScores> CombineScores1(
    IEnumerable<StudentScores1> scores)
{
    return scores.Select(
        s =>
            new CombinedScores
            {
                Id = s.Id,
                FirstName = s.FirstName,
                LastName = s.LastName,
                Languages = Average(s.English, s.French),
                // Mathematics is already a single score out of 100 in this layout.
                Mathematics = s.Mathematics,
                Sciences = Average(s.Biology, s.Chemistry, s.Physics)
            });
}
static void WriteOutput(
    IEnumerable<CombinedScores> combinedScores, string outputPath)
{
    using (var sw = new StreamWriter(outputPath))
    {
        foreach (var scores in combinedScores)
        {
            string outputLine = string.Format(
                "{0:d3} {1} {2} {3:d3} {4:d3} {5:d3}",
                scores.Id, scores.FirstName, scores.LastName,
                scores.Languages, scores.Mathematics, scores.Sciences);
            sw.WriteLine(outputLine);
        }
    }
}
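One detail worth checking: Average uses integer division, so it truncates rather than rounds. The sample output in the question shows 081 for John's languages (81.5, which matches truncation) but 097 for Jane's sciences (96.67, which matches rounding), so it is not clear which rule is intended. The sketch below only illustrates the difference; the names Averages, Truncated, and Rounded are hypothetical and not part of the code above.

```csharp
using System;
using System.Linq;

static class Averages
{
    // Same behaviour as Average above: integer division drops the fraction.
    public static int Truncated(params int[] inputs)
    {
        return inputs.Sum() / inputs.Length;
    }

    // Hypothetical alternative: round to the nearest integer, halves away from zero.
    public static int Rounded(params int[] inputs)
    {
        return (int)Math.Round(inputs.Average(), MidpointRounding.AwayFromZero);
    }
}

static class Program
{
    static void Main()
    {
        // Jane's science scores from the question: 290 / 3 = 96.67
        Console.WriteLine(Averages.Truncated(99, 95, 96)); // 96
        Console.WriteLine(Averages.Rounded(99, 95, 96));   // 97

        // John's language scores from the question: 163 / 2 = 81.5
        Console.WriteLine(Averages.Truncated(88, 75));     // 81
        Console.WriteLine(Averages.Rounded(88, 75));       // 82
    }
}
```

Whichever rule you want, it only has to change in one place, which is part of the point of the separation.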
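The three-part version is more code than your original, but it is safer, clearer, and easier to maintain. Once you have done all this, you will see that only ParseScores1 and CombineScores1 deal with the input format. You can therefore write ParseScores2, which does not need to know anything about the output format or about combining scores, and CombineScores2. With those in place, the main logic of the application might look like this: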
IEnumerable<CombinedScores> scores;
if (inputFormat1)
    scores = CombineScores1(ParseScores1(inputPath));
else
    scores = CombineScores2(ParseScores2(inputPath));
WriteOutput(scores, outputPath);
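For completeness, here is a rough sketch of what the second pair of methods could look like for distribution type 2. It is only one possible shape, under some assumptions: the class names StudentScores2 and Format2 are made up for illustration, the column order is taken from the question's description of the second file, and ParseScores2 takes a sequence of lines instead of a path so that it can be tried without a file on disk.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Common output shape, same as CombinedScores above.
class CombinedScores
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Languages { get; set; }
    public int Mathematics { get; set; }
    public int Sciences { get; set; }
}

// Hypothetical input shape for distribution type 2.
class StudentScores2
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Id { get; set; }
    public int English { get; set; }    // out of 100
    public int French { get; set; }     // out of 100
    public int Geometry { get; set; }   // out of 50
    public int Algebra { get; set; }    // out of 50
    public int Chemistry { get; set; }  // out of 100
    public int Biology { get; set; }    // out of 100
    public int Physics { get; set; }    // out of 100
}

static class Format2
{
    static int Average(params int[] inputs)
    {
        return inputs.Sum() / inputs.Length;
    }

    // Assumption: takes lines rather than a file path, to keep the sketch testable.
    public static IEnumerable<StudentScores2> ParseScores2(IEnumerable<string> lines)
    {
        foreach (var line in lines)
        {
            var data = line.Split(',');
            yield return new StudentScores2
            {
                FirstName = data[0],
                LastName = data[1],
                Id = int.Parse(data[2]),
                English = int.Parse(data[3]),
                French = int.Parse(data[4]),
                Geometry = int.Parse(data[5]),
                Algebra = int.Parse(data[6]),
                Chemistry = int.Parse(data[7]),
                Biology = int.Parse(data[8]),
                Physics = int.Parse(data[9])
            };
        }
    }

    public static IEnumerable<CombinedScores> CombineScores2(IEnumerable<StudentScores2> scores)
    {
        return scores.Select(s => new CombinedScores
        {
            Id = s.Id,
            FirstName = s.FirstName,
            LastName = s.LastName,
            Languages = Average(s.English, s.French),
            // Geometry and algebra are each out of 50, so their sum is already out of 100.
            Mathematics = s.Geometry + s.Algebra,
            Sciences = Average(s.Chemistry, s.Biology, s.Physics)
        });
    }
}

static class Program
{
    static void Main()
    {
        // First sample row of distribution type 2 from the question.
        var lines = new[] { "John,Doe,001,088,075,048,048,099,098,095" };
        foreach (var s in Format2.CombineScores2(Format2.ParseScores2(lines)))
        {
            // Prints: 001 John Doe 081 096 097
            Console.WriteLine("{0:d3} {1} {2} {3:d3} {4:d3} {5:d3}",
                s.Id, s.FirstName, s.LastName, s.Languages, s.Mathematics, s.Sciences);
        }
    }
}
```

Note that nothing here touches the output format: WriteOutput can stay exactly as it is, and only the parsing and combining steps differ between the two layouts.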