我正在读取CSV文件,记录被记录为字符串[]。我想获取每条记录并将其转换为自定义对象。
T GetMyObject<T>();
目前我正在通过非常慢的反射来做到这一点。我正在测试一个包含数百万条记录的515 Meg文件。解析时间不到10秒。使用Convert.ToSomeType
手动转换创建自定义对象大约需要20秒,但通过反射转换为对象大约需要4分钟。
自动处理此问题的好方法是什么?
似乎在PropertyInfo.SetValue
方法花了很多时间。我尝试缓存属性MethodInfo
setter并使用它,但它实际上更慢。
我也尝试将其转换为像Jon Skeet在这里建议的那样委托:Improving performance reflection , what alternatives should I consider,但问题是我不知道属性类型是提前的。我能够得到代表
var myObject = Activator.CreateInstance<T>();
foreach( var property in typeof( T ).GetProperties() )
{
var d = Delegate.CreateDelegate( typeof( Action<,> )
.MakeGenericType( typeof( T ), property.PropertyType ), property.GetSetMethod() );
}
这里的问题是我无法将委托转换为像Action<T, int>
这样的具体类型,因为int
的属性类型未提前知道。
答案 0 :(得分:7)
我要说的第一件事是手动编写一些示例代码 ,它会告诉您可以预期的绝对最佳情况 - 看看您当前的代码是否值得修复。
如果您正在使用PropertyInfo.SetValue
等,那么绝对可以让它更快,即使使用juts object
- HyperDescriptor可能是一个好的开始(它显着比原始反射更快,但不会使代码变得更复杂)。
为了获得最佳性能,动态IL方法是可行的方法(预编译一次);在2.0 / 3.0中,可能是DynamicMethod
,但在3.5中我赞成Expression
(使用Compile()
)。如果您想了解更多细节,请告诉我?
使用Expression
和CsvReader
的实现,它使用列标题来提供映射(它沿着相同的行发明了一些数据);它使用IEnumerable<T>
作为返回类型,以避免必须缓冲数据(因为你似乎有很多):
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;
using LumenWorks.Framework.IO.Csv;
class Entity
{
public string Name { get; set; }
public DateTime DateOfBirth { get; set; }
public int Id { get; set; }
}
static class Program {
static void Main()
{
string path = "data.csv";
InventData(path);
int count = 0;
foreach (Entity obj in Read<Entity>(path))
{
count++;
}
Console.WriteLine(count);
}
static IEnumerable<T> Read<T>(string path)
where T : class, new()
{
using (TextReader source = File.OpenText(path))
using (CsvReader reader = new CsvReader(source,true,delimiter)) {
string[] headers = reader.GetFieldHeaders();
Type type = typeof(T);
List<MemberBinding> bindings = new List<MemberBinding>();
ParameterExpression param = Expression.Parameter(typeof(CsvReader), "row");
MethodInfo method = typeof(CsvReader).GetProperty("Item",new [] {typeof(int)}).GetGetMethod();
Expression invariantCulture = Expression.Constant(
CultureInfo.InvariantCulture, typeof(IFormatProvider));
for(int i = 0 ; i < headers.Length ; i++) {
MemberInfo member = type.GetMember(headers[i]).Single();
Type finalType;
switch (member.MemberType)
{
case MemberTypes.Field: finalType = ((FieldInfo)member).FieldType; break;
case MemberTypes.Property: finalType = ((PropertyInfo)member).PropertyType; break;
default: throw new NotSupportedException();
}
Expression val = Expression.Call(
param, method, Expression.Constant(i, typeof(int)));
if (finalType != typeof(string))
{
val = Expression.Call(
finalType, "Parse", null, val, invariantCulture);
}
bindings.Add(Expression.Bind(member, val));
}
Expression body = Expression.MemberInit(
Expression.New(type), bindings);
Func<CsvReader, T> func = Expression.Lambda<Func<CsvReader, T>>(body, param).Compile();
while (reader.ReadNextRecord()) {
yield return func(reader);
}
}
}
const char delimiter = '\t';
static void InventData(string path)
{
Random rand = new Random(123456);
using (TextWriter dest = File.CreateText(path))
{
dest.WriteLine("Id" + delimiter + "DateOfBirth" + delimiter + "Name");
for (int i = 0; i < 10000; i++)
{
dest.Write(rand.Next(5000000));
dest.Write(delimiter);
dest.Write(new DateTime(
rand.Next(1960, 2010),
rand.Next(1, 13),
rand.Next(1, 28)).ToString(CultureInfo.InvariantCulture));
dest.Write(delimiter);
dest.Write("Fred");
dest.WriteLine();
}
dest.Close();
}
}
}
使用TypeConverter
而不是Parse
的第二个版本(请参阅评论):
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;
using LumenWorks.Framework.IO.Csv;
class Entity
{
public string Name { get; set; }
public DateTime DateOfBirth { get; set; }
public int Id { get; set; }
}
static class Program
{
static void Main()
{
string path = "data.csv";
InventData(path);
int count = 0;
foreach (Entity obj in Read<Entity>(path))
{
count++;
}
Console.WriteLine(count);
}
static IEnumerable<T> Read<T>(string path)
where T : class, new()
{
using (TextReader source = File.OpenText(path))
using (CsvReader reader = new CsvReader(source, true, delimiter))
{
string[] headers = reader.GetFieldHeaders();
Type type = typeof(T);
List<MemberBinding> bindings = new List<MemberBinding>();
ParameterExpression param = Expression.Parameter(typeof(CsvReader), "row");
MethodInfo method = typeof(CsvReader).GetProperty("Item", new[] { typeof(int) }).GetGetMethod();
var converters = new Dictionary<Type, ConstantExpression>();
for (int i = 0; i < headers.Length; i++)
{
MemberInfo member = type.GetMember(headers[i]).Single();
Type finalType;
switch (member.MemberType)
{
case MemberTypes.Field: finalType = ((FieldInfo)member).FieldType; break;
case MemberTypes.Property: finalType = ((PropertyInfo)member).PropertyType; break;
default: throw new NotSupportedException();
}
Expression val = Expression.Call(
param, method, Expression.Constant(i, typeof(int)));
if (finalType != typeof(string))
{
ConstantExpression converter;
if (!converters.TryGetValue(finalType, out converter))
{
converter = Expression.Constant(TypeDescriptor.GetConverter(finalType));
converters.Add(finalType, converter);
}
val = Expression.Convert(Expression.Call(converter, "ConvertFromInvariantString", null, val),
finalType);
}
bindings.Add(Expression.Bind(member, val));
}
Expression body = Expression.MemberInit(
Expression.New(type), bindings);
Func<CsvReader, T> func = Expression.Lambda<Func<CsvReader, T>>(body, param).Compile();
while (reader.ReadNextRecord())
{
yield return func(reader);
}
}
}
const char delimiter = '\t';
static void InventData(string path)
{
Random rand = new Random(123456);
using (TextWriter dest = File.CreateText(path))
{
dest.WriteLine("Id" + delimiter + "DateOfBirth" + delimiter + "Name");
for (int i = 0; i < 10000; i++)
{
dest.Write(rand.Next(5000000));
dest.Write(delimiter);
dest.Write(new DateTime(
rand.Next(1960, 2010),
rand.Next(1, 13),
rand.Next(1, 28)).ToString(CultureInfo.InvariantCulture));
dest.Write(delimiter);
dest.Write("Fred");
dest.WriteLine();
}
dest.Close();
}
}
}
答案 1 :(得分:1)
您应该创建一个DynamicMethod
或一个表达式树,并在运行时构建静态类型代码。
这会产生相当大的设置成本,但根本没有每个对象的开销 但是,这样做有些困难,并且会导致难以调试的复杂代码。