I have a list of strings (read from a file) in the order and format below, and I need to convert it into a list of class instances.
1.0.1.0.1, Type: DateTime, Value: 06/03/2013 11:06:10
1.0.1.0.2, Type: DateTime, Value: 06/03/2014 11:06:10
1.0.1.0.3, Type: DateTime, Value: 06/03/2015 11:06:10
1.0.1.0.4, Type: DateTime, Value: 06/03/2016 11:06:10
1.0.1.0.5, Type: DateTime, Value: 06/03/2017 11:06:10
1.0.1.1.1, Type: Integer, Value: 1
1.0.1.1.2, Type: Integer, Value: 2
1.0.1.1.3, Type: Integer, Value: 3
1.0.0.1.4, Type: Integer, Value: 4
1.0.1.1.5, Type: Integer, Value: 5
1.0.1.2.1, Type: String, Value: Hello
1.0.1.2.2, Type: String, Value: Hello1
1.0.1.2.3, Type: String, Value: Hello2
1.0.1.2.4, Type: String, Value: Hello3
1.0.1.2.5, Type: String, Value: Hello4
This is my class:
public class MyData
{
public DateTime DateTime {get;set;}
public int Index {get;set;}
public string Value {get;set;}
}
What I want is to convert the lines into a list of C# class instances, like this:
List<MyData> myDataList = new List<MyData>();
MyData data1 = new MyData();
data1.DateTime = DateTime.Parse("06/03/2013 11:06:10");
data1.Index = 1;
data1.Value = "Hello";
myDataList.Add(data1);
MyData data2 = new MyData();
data2.DateTime = DateTime.Parse("06/03/2014 11:06:10");
data2.Index = 2;
data2.Value = "Hello1";
myDataList.Add(data2);
and so on..
This is what I have tried so far.
List<List<string>> allLists = lines
.Select(str => new { str, token = str.Split('.') })
.Where(x => x.token.Length >= 4)
.GroupBy(x => string.Concat(x.token.Take(4)))
.Select(g => g.Select(x => x.str).ToList())
.ToList();
Do I really need to iterate, or can I modify my LINQ query to get the desired output? Here is my iteration.
foreach (var list in allLists)
{
MyData data = new MyData();
var splittedstring = list[0].Split(',').ToList();
if (splittedstring.Count == 3)
{
var valueData = splittedstring [2];
var indexof = valueData.IndexOf(':');
var value = valueData.Substring(indexof + 1);
// But Over here, how will get DateTime and Index ?
data.Value = value;
}
}
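Answer 0 (score: 1)
This is my solution, using a regular expression. It could be improved by adding conditional regex matching based on the named group for the type (a string), but I think the concept is clearer and the regex is easier to work with this way. As it stands, the date format is not validated; it is assumed to be exactly as the OP wrote it.
This solution tolerates some extra whitespace and values that contain commas, but it does not tolerate inexact matches, such as fields being added to or removed from the line in the future.
The idea is to first parse the lines into a friendlier format, then group that friendlier format by index, and return one MyData row for each group (by index).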
Regex r = new Regex(@"^(?<fieldName>(\d\.)+(?<index>\d*)), *Type: *(?<dataType>.*), *Value: (?<dataValue>.*)$");
public class MyData
{
public DateTime DateTime { get; set; }
public int Index { get; set; }
public string Value { get; set; }
}
class LogRow
{
public int Index { get; set; }
public string Type { get; set; }
public string Value { get; set; }
}
//In a parser I would rather not be too defensive, I let exceptions bubble up
IEnumerable<LogRow> ParseRows(IEnumerable<string> lines)
{
foreach (var line in lines)
{
var match = r.Matches(line).Cast<Match>().Single();
yield return new LogRow()
{
Index = int.Parse(match.Groups["index"].Value),
Type = match.Groups["dataType"].Value,
Value = match.Groups["dataValue"].Value
};
}
}
IEnumerable<MyData> RowsToData(IEnumerable<LogRow> rows)
{
var byIndex = rows.GroupBy(b => b.Index).OrderBy(b=> b.Key);
//assume that rows exist for all MyData fields for a given index
foreach (var group in byIndex)
{
var rawRow = group.ToDictionary(g => g.Type, g => g);
var date = DateTime.ParseExact(rawRow["DateTime"].Value, "dd/MM/yyyy HH:mm:ss", CultureInfo.InvariantCulture);
yield return new MyData() { Index = group.Key, DateTime = date, Value = rawRow["String"].Value };
}
}
Usage:
var myDataList = RowsToData(ParseRows(File.ReadAllLines("input.txt"))).ToList();
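To see what the regular expression captures, here is a small check against one of the sample lines. The RegexDemo class and the RowPattern name are hypothetical and exist only for illustration; the pattern itself is copied unchanged from the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexDemo // hypothetical, for illustration only
{
    // Same pattern as in the answer above.
    public static readonly Regex RowPattern = new Regex(
        @"^(?<fieldName>(\d\.)+(?<index>\d*)), *Type: *(?<dataType>.*), *Value: (?<dataValue>.*)$");

    public static void Main()
    {
        Match m = RowPattern.Match("1.0.1.0.1, Type: DateTime, Value: 06/03/2013 11:06:10");
        Console.WriteLine(m.Groups["index"].Value);     // last number of the dotted prefix
        Console.WriteLine(m.Groups["dataType"].Value);  // text after "Type:"
        Console.WriteLine(m.Groups["dataValue"].Value); // text after "Value: "
    }
}
```

The greedy `(\d\.)+` consumes every "digit plus dot" pair, so the `index` group is left with only the final number, which is what RowsToData groups on.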
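Answer 1 (score: 1)
First, fix your GroupBy: string.Concat(x.token.Take(4)) can produce ambiguous keys when the dot-separated numbers are ambiguous. For example, 1.23.4.5 and 12.3.4.5 would both produce the string "12345". Use string.Join with a non-numeric separator instead: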
.GroupBy(x => string.Join("|", x.token.Take(4)))
Now, for the main part of the question, a simple fix is to add a static method that parses a list of three strings, and to use it in the LINQ query:
List<MyData> dataList = lines
.Select(str => new { str, token = str.Split('.') })
.Where(x => x.token.Length >= 4)
.GroupBy(x => string.Join("|", x.token.Take(4)))
.Select(g => g.Select(x => x.str).ToList())
.Where(list => list.Count == 3)
.Select(MyDataFromList)
.ToList();
...
private static MyData MyDataFromList(List<string> parts) {
    if (parts.Count != 3) {
        throw new ArgumentException("Expected exactly three lines.", nameof(parts));
    }
    var byType = parts
        .Select(ToTypeAndValue)
        .ToDictionary(t => t.Item1, t => t.Item2);
    return new MyData {
        DateTime = DateTime.Parse(byType["DateTime"])
    ,   Index = int.Parse(byType["Integer"])
    ,   Value = byType["String"]
    };
}
private static Tuple<string,string> ToTypeAndValue(string s) {
    var tokens = s.Split(',');
    if (tokens.Length != 3) return null;
    // Split on the first ':' only, because DateTime values contain colons.
    var typeParts = tokens[1].Split(new[] { ':' }, 2);
    if (typeParts.Length != 2 || typeParts[0].Trim() != "Type") return null;
    var valueParts = tokens[2].Split(new[] { ':' }, 2);
    if (valueParts.Length != 2 || valueParts[0].Trim() != "Value") return null;
    return Tuple.Create(typeParts[1].Trim(), valueParts[1].Trim());
}
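Note that the code above assumes that the three types are unique (hence the use of Dictionary<string,string>). This is required because the data structure provides no other way to bind the values to the fields of MyData.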
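One thing to watch for with the sample data: the first four numbers (for example 1.0.1.0) appear to identify the property and the fifth the record, so a key built from Take(4) collects rows by type rather than by record. Below is a minimal sketch of a pure LINQ query that groups on the last number instead. It assumes that the last number is the record index, that each record has at most one row per type, and that dates are day-first; the RecordParser class and its Parse method are hypothetical names, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public class MyData
{
    public DateTime DateTime { get; set; }
    public int Index { get; set; }
    public string Value { get; set; }
}

public static class RecordParser // hypothetical helper, not from the original answer
{
    public static List<MyData> Parse(IEnumerable<string> lines)
    {
        return lines
            // Split into at most three fields so commas inside the value survive.
            .Select(line => line.Split(new[] { ',' }, 3))
            .Where(f => f.Length == 3)
            .Select(f => new
            {
                // Assumption: the last dot-separated number is the record index.
                Record = f[0].Trim().Split('.').Last(),
                // Split on the first ':' only, since date values contain colons.
                Type = f[1].Split(new[] { ':' }, 2).Last().Trim(),
                Value = f[2].Split(new[] { ':' }, 2).Last().Trim()
            })
            .GroupBy(x => x.Record)
            // Throws if a record repeats a type (see the assumption above).
            .Select(g => g.ToDictionary(x => x.Type, x => x.Value))
            // Skip incomplete records instead of throwing.
            .Where(d => d.ContainsKey("DateTime") && d.ContainsKey("Integer") && d.ContainsKey("String"))
            .Select(d => new MyData
            {
                // Assumption: day-first dates, as the other answers assume.
                DateTime = DateTime.ParseExact(d["DateTime"], "dd/MM/yyyy HH:mm:ss", CultureInfo.InvariantCulture),
                Index = int.Parse(d["Integer"], CultureInfo.InvariantCulture),
                Value = d["String"]
            })
            .ToList();
    }
}
```

A call such as RecordParser.Parse(File.ReadAllLines("input.txt")) would then return one MyData per record index, in the order the indices first appear in the file.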
Answer 2 (score: 1)
You can do this with regular expressions. It would look something like this:
public List<MyData> GetData(string str){
var regexDate = new Regex(@"\d\.\d\.\d\.\d\.(?<id>\d).*DateTime.*Value:\s*(?<val>.*)");
var regexInteger = new Regex(@"\d\.\d\.\d\.\d\.(?<id>\d).*Integer.*Value:\s*(?<val>.*)");
var regexString = new Regex(@"\d\.\d\.\d\.\d\.(?<id>\d).*String.*Value:\s*(?<val>.*)");
var dict = new Dictionary<int, MyData>();
foreach (Match myMatch in regexDate.Matches(str))
{
if (!myMatch.Success) continue;
var index = int.Parse(myMatch.Groups["id"].Value);
dict[index] = new MyData()
{
Index = index,
DateTime = DateTime.ParseExact(myMatch.Groups["val"].Value, "dd/MM/yyyy HH:mm:ss", CultureInfo.InvariantCulture)
};
}
foreach (Match myMatch in regexInteger.Matches(str))
{
if (!myMatch.Success) continue;
var index = int.Parse(myMatch.Groups["id"].Value);
dict[index].Index = Int32.Parse(myMatch.Groups["val"].Value);
}
foreach (Match myMatch in regexString.Matches(str))
{
if (!myMatch.Success) continue;
var index = int.Parse(myMatch.Groups["id"].Value);
dict[index].Value = myMatch.Groups["val"].Value;
}
return dict.Values.ToList();
}
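Answer 3 (score: 1)
I would just take the manual approach. Since the list of integers at the start of each line contains the indices of the object and the property, it is logical to use those rather than the type strings.
With a Dictionary, you can use the object index to create a new object whenever any property for it is found, and store the object under that index. Whenever another property with the same index is encountered, the object is retrieved and the property is filled in on it.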
public static List<MyData> getObj(String[] lines)
{
Dictionary<Int32, MyData> myDataDict = new Dictionary<Int32, MyData>();
const String valueStart = "Value: ";
foreach (String line in lines)
{
String[] split = line.Split(',');
// Too many fail cases; I just ignore any line that stops matching at any point.
if (split.Length < 3)
continue;
String[] numData = split[0].Trim().Split('.');
if (numData.Length < 5)
continue;
// Using the 4th number as property identifier. Could also use the
// type string, but switch/case on a numeric value is more elegant.
Int32 prop;
if (!Int32.TryParse(numData[3], out prop))
continue;
// Object index, used to reference the objects in the Dictionary.
Int32 index;
if (!Int32.TryParse(numData[4], out index))
continue;
String typeDef = split[1].Trim();
String val = split[2].TrimStart();
if (!val.StartsWith(valueStart))
continue;
val = val.Substring(valueStart.Length);
MyData data;
if (myDataDict.ContainsKey(index))
data = myDataDict[index];
else
{
data = new MyData();
myDataDict.Add(index, data);
}
switch (prop)
{
case 0:
if (!"Type: DateTime".Equals(typeDef))
continue;
DateTime dateVal;
// Don't know if this date format is correct; adapt as needed.
if (!DateTime.TryParseExact(val, "dd/MM/yyyy HH:mm:ss", System.Globalization.CultureInfo.InvariantCulture, System.Globalization.DateTimeStyles.None, out dateVal))
continue;
data.DateTime = dateVal;
break;
case 1:
if (!"Type: Integer".Equals(typeDef))
continue;
Int32 numVal;
if (!Int32.TryParse(val, out numVal))
continue;
data.Index = numVal;
break;
case 2:
if (!"Type: String".Equals(typeDef)) continue;
data.Value = val;
break;
}
}
return new List<MyData>(myDataDict.Values);
}
Answer 4 (score: 1)
Here is my approach to your problem. I have tested it, and you can test it here: Raw To Custom List
string text = rawData;
//Raw Data Is the exact data you read from textfile without modifications.
List<MyData> myDataList = new List<MyData>();
string[] eElco = text.Split( new[] { Environment.NewLine }, StringSplitOptions.None );
var tmem = eElco.Count();
var eachP = tmem / 3;
List<string> unDefVal = new List<string>();
foreach (string rw in eElco)
{
String onlyVal = rw.Split(new[] { "Value: " } , StringSplitOptions.None)[1];
unDefVal.Add(onlyVal);
}
for (int i = 0; i < eachP; i++)
{
int ind = Int32.Parse(unDefVal[i + eachP]);
DateTime oDate = DateTime.ParseExact(unDefVal[i], "dd/MM/yyyy HH:mm:ss",System.Globalization.CultureInfo.InvariantCulture);
MyData data1 = new MyData();
data1.DateTime = oDate;
data1.Index = ind;
data1.Value = unDefVal[i + eachP + eachP];
myDataList.Add(data1);
Console.WriteLine("Val1 = {0}, Val2 = {1}, Val3 = {2}",
myDataList[i].Index,
myDataList[i].DateTime,
myDataList[i].Value);
}