I have a char[] salary that contains data from a string. I want to convert the char[] salary to a float, but the approach I tried appears to be very slow:
float ff = float.Parse(new string(salary));
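According to the Visual Studio Performance Profiler, this line accounts for too much of the processing time.
So I would like to know whether there is a faster way to do this, since performance matters here.
The char[] is formatted as follows:
[ '1', '3', '2', ',', '2', '9' ]
It is basically a JSON-like float, converted so that each digit (and the comma) occupies one element of the char[].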
Edit:
I have reformatted the code, and it seems the performance hit actually comes from the conversion from char[] to string, not from parsing the string to a float.
Answer 0 (score: 5)
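Since this question has changed from "What is the fastest way to parse a float?" to "What is the fastest way to get a string from a char[]?", I wrote some benchmarks with BenchmarkDotNet to compare the various methods. My finding is that, if you already have a char[], you can't get any faster than passing it to the string(char[]) constructor, as you are already doing.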
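As an aside, on newer runtimes the string can be skipped altogether: .NET Core 2.1 and later offer float.Parse overloads that accept a ReadOnlySpan<char>, so a slice of a char[] can be parsed without allocating anything. This was not among the benchmarks in this answer (they ran on .NET Core 2.0 and .NET Framework 4.7.2), so treat the following as a minimal sketch rather than a measured result. The class name, method name and comma-based format object are assumptions made for illustration.

```csharp
using System;
using System.Globalization;

public static class SpanParsingSketch
{
    // Hypothetical format: invariant settings, except that ',' is the decimal separator.
    private static readonly NumberFormatInfo CommaDecimalFormat =
        new NumberFormatInfo { NumberDecimalSeparator = "," };

    // Parses count chars starting at offset without creating a string first.
    public static float ParseChars(char[] chars, int offset, int count)
    {
        var slice = new ReadOnlySpan<char>(chars, offset, count);
        return float.Parse(slice, NumberStyles.AllowDecimalPoint, CommaDecimalFormat);
    }

    public static void Main()
    {
        char[] salary = { '1', '3', '2', ',', '2', '9' };
        float value = ParseChars(salary, 0, salary.Length);
        Console.WriteLine(value.ToString(CultureInfo.InvariantCulture));
    }
}
```

Whether this actually beats new string(char[]) followed by float.Parse(string) in your scenario would need to be benchmarked.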
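You say that your input file is "read into a byte[], then the part of the byte[] that represents the float is extracted into a char[]." Since you have the bytes that make up the float text isolated in a byte[], perhaps you can improve performance by skipping the intermediate char[]. Assuming you have something equivalent to...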
byte[] floatBytes = new byte[] { 0x31, 0x33, 0x32, 0x2C, 0x32, 0x39 }; // "132,29"
...you could use Encoding.GetString()...
string floatString = Encoding.ASCII.GetString(floatBytes);
...which is roughly 1.6 times as fast as passing the result of Encoding.GetChars() to the string(char[]) constructor...
char[] floatChars = Encoding.ASCII.GetChars(floatBytes);
string floatString = new string(floatChars);
You'll find those benchmarks listed last in my results...
BenchmarkDotNet=v0.11.0, OS=Windows 10.0.17134.165 (1803/April2018Update/Redstone4)
Intel Core i7 CPU 860 2.80GHz (Max: 2.79GHz) (Nehalem), 1 CPU, 8 logical and 4 physical cores
Frequency=2732436 Hz, Resolution=365.9738 ns, Timer=TSC
.NET Core SDK=2.1.202
[Host] : .NET Core 2.0.9 (CoreCLR 4.6.26614.01, CoreFX 4.6.26614.01), 64bit RyuJIT
Clr : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3131.0
Core : .NET Core 2.0.9 (CoreCLR 4.6.26614.01, CoreFX 4.6.26614.01), 64bit RyuJIT
Method | Runtime | Categories | Mean | Scaled |
----------------------------------------------------- |-------- |----------------- |----------:|-------:|
String_Constructor_CharArray | Clr | char[] => string | 13.51 ns | 1.00 |
String_Concat | Clr | char[] => string | 192.87 ns | 14.27 |
StringBuilder_Local_AppendSingleChar_DefaultCapacity | Clr | char[] => string | 60.74 ns | 4.49 |
StringBuilder_Local_AppendSingleChar_ExactCapacity | Clr | char[] => string | 60.26 ns | 4.46 |
StringBuilder_Local_AppendAllChars_DefaultCapacity | Clr | char[] => string | 51.27 ns | 3.79 |
StringBuilder_Local_AppendAllChars_ExactCapacity | Clr | char[] => string | 49.51 ns | 3.66 |
StringBuilder_Field_AppendSingleChar | Clr | char[] => string | 51.14 ns | 3.78 |
StringBuilder_Field_AppendAllChars | Clr | char[] => string | 32.95 ns | 2.44 |
| | | | |
String_Constructor_CharPointer | Clr | void* => string | 29.28 ns | 1.00 |
String_Constructor_SBytePointer | Clr | void* => string | 89.21 ns | 3.05 |
UnsafeArrayCopy_String_Constructor | Clr | void* => string | 42.82 ns | 1.46 |
| | | | |
Encoding_GetString | Clr | byte[] => string | 37.33 ns | 1.00 |
Encoding_GetChars_String_Constructor | Clr | byte[] => string | 60.83 ns | 1.63 |
SafeArrayCopy_String_Constructor | Clr | byte[] => string | 27.55 ns | 0.74 |
| | | | |
String_Constructor_CharArray | Core | char[] => string | 13.27 ns | 1.00 |
String_Concat | Core | char[] => string | 172.17 ns | 12.97 |
StringBuilder_Local_AppendSingleChar_DefaultCapacity | Core | char[] => string | 58.68 ns | 4.42 |
StringBuilder_Local_AppendSingleChar_ExactCapacity | Core | char[] => string | 59.85 ns | 4.51 |
StringBuilder_Local_AppendAllChars_DefaultCapacity | Core | char[] => string | 40.62 ns | 3.06 |
StringBuilder_Local_AppendAllChars_ExactCapacity | Core | char[] => string | 43.67 ns | 3.29 |
StringBuilder_Field_AppendSingleChar | Core | char[] => string | 54.49 ns | 4.11 |
StringBuilder_Field_AppendAllChars | Core | char[] => string | 31.05 ns | 2.34 |
| | | | |
String_Constructor_CharPointer | Core | void* => string | 22.87 ns | 1.00 |
String_Constructor_SBytePointer | Core | void* => string | 83.11 ns | 3.63 |
UnsafeArrayCopy_String_Constructor | Core | void* => string | 35.30 ns | 1.54 |
| | | | |
Encoding_GetString | Core | byte[] => string | 36.19 ns | 1.00 |
Encoding_GetChars_String_Constructor | Core | byte[] => string | 58.99 ns | 1.63 |
SafeArrayCopy_String_Constructor | Core | byte[] => string | 27.81 ns | 0.77 |
...from running this code (requires the BenchmarkDotNet assembly and compiling with /unsafe)...
using System;
using System.Linq;
using System.Runtime.InteropServices;
using System.Text;
using BenchmarkDotNet.Attributes;

namespace StackOverflow_51584129
{
    [CategoriesColumn()]
    [ClrJob()]
    [CoreJob()]
    [GroupBenchmarksBy(BenchmarkDotNet.Configs.BenchmarkLogicalGroupRule.ByCategory)]
    public class StringCreationBenchmarks
    {
        private static readonly Encoding InputEncoding = Encoding.ASCII;
        private const string InputString = "132,29";
        private static readonly byte[] InputBytes = InputEncoding.GetBytes(InputString);
        private static readonly char[] InputChars = InputString.ToCharArray();
        private static readonly sbyte[] InputSBytes = InputBytes.Select(Convert.ToSByte).ToArray();

        private GCHandle _inputBytesHandle;
        private GCHandle _inputCharsHandle;
        private GCHandle _inputSBytesHandle;
        private StringBuilder _builder;

        [Benchmark(Baseline = true)]
        [BenchmarkCategory("char[] => string")]
        public string String_Constructor_CharArray()
        {
            return new string(InputChars);
        }

        [Benchmark(Baseline = true)]
        [BenchmarkCategory("void* => string")]
        public unsafe string String_Constructor_CharPointer()
        {
            var pointer = (char*) _inputCharsHandle.AddrOfPinnedObject();
            return new string(pointer);
        }

        [Benchmark()]
        [BenchmarkCategory("void* => string")]
        public unsafe string String_Constructor_SBytePointer()
        {
            var pointer = (sbyte*) _inputSBytesHandle.AddrOfPinnedObject();
            return new string(pointer);
        }

        [Benchmark()]
        [BenchmarkCategory("char[] => string")]
        public string String_Concat()
        {
            return string.Concat(InputChars);
        }

        [Benchmark()]
        [BenchmarkCategory("char[] => string")]
        public string StringBuilder_Local_AppendSingleChar_DefaultCapacity()
        {
            var builder = new StringBuilder();
            foreach (var c in InputChars)
                builder.Append(c);
            return builder.ToString();
        }

        [Benchmark()]
        [BenchmarkCategory("char[] => string")]
        public string StringBuilder_Local_AppendSingleChar_ExactCapacity()
        {
            var builder = new StringBuilder(InputChars.Length);
            foreach (var c in InputChars)
                builder.Append(c);
            return builder.ToString();
        }

        [Benchmark()]
        [BenchmarkCategory("char[] => string")]
        public string StringBuilder_Local_AppendAllChars_DefaultCapacity()
        {
            var builder = new StringBuilder().Append(InputChars);
            return builder.ToString();
        }

        [Benchmark()]
        [BenchmarkCategory("char[] => string")]
        public string StringBuilder_Local_AppendAllChars_ExactCapacity()
        {
            var builder = new StringBuilder(InputChars.Length).Append(InputChars);
            return builder.ToString();
        }

        [Benchmark()]
        [BenchmarkCategory("char[] => string")]
        public string StringBuilder_Field_AppendSingleChar()
        {
            _builder.Clear();
            foreach (var c in InputChars)
                _builder.Append(c);
            return _builder.ToString();
        }

        [Benchmark()]
        [BenchmarkCategory("char[] => string")]
        public string StringBuilder_Field_AppendAllChars()
        {
            return _builder.Clear().Append(InputChars).ToString();
        }

        [Benchmark(Baseline = true)]
        [BenchmarkCategory("byte[] => string")]
        public string Encoding_GetString()
        {
            return InputEncoding.GetString(InputBytes);
        }

        [Benchmark()]
        [BenchmarkCategory("byte[] => string")]
        public string Encoding_GetChars_String_Constructor()
        {
            var chars = InputEncoding.GetChars(InputBytes);
            return new string(chars);
        }

        [Benchmark()]
        [BenchmarkCategory("byte[] => string")]
        public string SafeArrayCopy_String_Constructor()
        {
            var chars = new char[InputString.Length];
            for (int i = 0; i < InputString.Length; i++)
                chars[i] = Convert.ToChar(InputBytes[i]);
            return new string(chars);
        }

        [Benchmark()]
        [BenchmarkCategory("void* => string")]
        public unsafe string UnsafeArrayCopy_String_Constructor()
        {
            fixed (char* chars = new char[InputString.Length])
            {
                var bytes = (byte*) _inputBytesHandle.AddrOfPinnedObject();
                for (int i = 0; i < InputString.Length; i++)
                    chars[i] = Convert.ToChar(bytes[i]);
                return new string(chars);
            }
        }

        [GlobalSetup(Targets = new[] { nameof(StringBuilder_Field_AppendAllChars), nameof(StringBuilder_Field_AppendSingleChar) })]
        public void SetupStringBuilderField()
        {
            _builder = new StringBuilder();
        }

        [GlobalSetup(Target = nameof(UnsafeArrayCopy_String_Constructor))]
        public void SetupBytesHandle()
        {
            _inputBytesHandle = GCHandle.Alloc(InputBytes, GCHandleType.Pinned);
        }

        [GlobalCleanup(Target = nameof(UnsafeArrayCopy_String_Constructor))]
        public void CleanupBytesHandle()
        {
            _inputBytesHandle.Free();
        }

        [GlobalSetup(Target = nameof(String_Constructor_CharPointer))]
        public void SetupCharsHandle()
        {
            _inputCharsHandle = GCHandle.Alloc(InputChars, GCHandleType.Pinned);
        }

        [GlobalCleanup(Target = nameof(String_Constructor_CharPointer))]
        public void CleanupCharsHandle()
        {
            _inputCharsHandle.Free();
        }

        [GlobalSetup(Target = nameof(String_Constructor_SBytePointer))]
        public void SetupSByteHandle()
        {
            _inputSBytesHandle = GCHandle.Alloc(InputSBytes, GCHandleType.Pinned);
        }

        [GlobalCleanup(Target = nameof(String_Constructor_SBytePointer))]
        public void CleanupSByteHandle()
        {
            _inputSBytesHandle.Free();
        }

        public static void Main(string[] args)
        {
            BenchmarkDotNet.Running.BenchmarkRunner.Run<StringCreationBenchmarks>();
        }
    }
}
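If the float text is only one part of a larger byte[] buffer, the Encoding.GetString(byte[], int, int) overload can decode that part in place, so neither a separate byte[] nor an intermediate char[] has to be extracted first. The following is a minimal sketch assuming ASCII input; the class name, the sample record and the offsets are made up for illustration, and this variant was not part of the benchmarks above.

```csharp
using System;
using System.Text;

public static class SliceDecodingSketch
{
    // Decodes count ASCII bytes starting at offset directly into a string.
    public static string DecodeSlice(byte[] buffer, int offset, int count)
    {
        return Encoding.ASCII.GetString(buffer, offset, count);
    }

    public static void Main()
    {
        // Hypothetical input record; the float text "132,29" starts at index 10.
        byte[] buffer = Encoding.ASCII.GetBytes("{\"salary\":132,29}");
        string floatString = DecodeSlice(buffer, 10, 6);
        Console.WriteLine(floatString);
    }
}
```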
Answer 1 (score: 3)
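As far as float parsing goes, there are some gains to be had depending on which overload of float.Parse() you call and what you pass to it. I ran some benchmarks comparing these overloads (note that I changed the decimal separator from ',' to '.' so that I could specify CultureInfo.InvariantCulture).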
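For example, calling an overload that takes an IFormatProvider is good for roughly a 10% performance improvement. Specifying NumberStyles.Float ("lax") for the NumberStyles parameter changes performance by about one percentage point in either direction, and, making some assumptions about the input data, specifying only NumberStyles.AllowDecimalPoint ("strict") gains a few more points. (The float.Parse(string) overload uses NumberStyles.Float | NumberStyles.AllowThousands.)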
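If the input keeps ',' as its decimal separator, the same "strict" overload can still be used by passing a NumberFormatInfo instead of CultureInfo.InvariantCulture. The following is a minimal sketch under that assumption; the class and member names are hypothetical, and this variant was not measured in the benchmarks.

```csharp
using System;
using System.Globalization;

public static class CommaDecimalParsingSketch
{
    // A culture-independent format whose only change is the decimal separator.
    private static readonly NumberFormatInfo CommaDecimalFormat =
        new NumberFormatInfo { NumberDecimalSeparator = "," };

    // "Strict" parsing: digits and a decimal separator only;
    // signs, exponents, whitespace and group separators are rejected.
    public static float ParseStrict(string text)
    {
        return float.Parse(text, NumberStyles.AllowDecimalPoint, CommaDecimalFormat);
    }

    public static void Main()
    {
        float value = ParseStrict("132,29");
        Console.WriteLine(value.ToString(CultureInfo.InvariantCulture));
    }
}
```

Creating the NumberFormatInfo once and reusing it avoids repeating that setup on every call.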
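On the subject of making assumptions about your input data: if you know the text you are working with has certain characteristics (single-byte character encoding, no invalid numbers, no negative numbers, no exponents, no need to handle NaN or positive/negative infinity, etc.), you may do well to parse directly from the bytes and forgo any unneeded special-case handling and error checking. I included a very simple implementation in my benchmarks, and its indexer-based version got a float from a byte[] far faster than float.Parse(string) could get a float from a string (Scaled 0.05 on Clr and 0.08 on Core in the results below)!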
Here are my benchmark results...
BenchmarkDotNet=v0.11.0, OS=Windows 10.0.17134.165 (1803/April2018Update/Redstone4)
Intel Core i7 CPU 860 2.80GHz (Max: 2.79GHz) (Nehalem), 1 CPU, 8 logical and 4 physical cores
Frequency=2732436 Hz, Resolution=365.9738 ns, Timer=TSC
.NET Core SDK=2.1.202
[Host] : .NET Core 2.0.9 (CoreCLR 4.6.26614.01, CoreFX 4.6.26614.01), 64bit RyuJIT
Clr : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3131.0
Core : .NET Core 2.0.9 (CoreCLR 4.6.26614.01, CoreFX 4.6.26614.01), 64bit RyuJIT
Method | Runtime | Mean | Scaled |
-------------------------------------------------------------- |-------- |-----------:|-------:|
float.Parse(string) | Clr | 145.098 ns | 1.00 |
'float.Parse(string, IFormatProvider)' | Clr | 134.191 ns | 0.92 |
'float.Parse(string, NumberStyles) [Lax]' | Clr | 145.884 ns | 1.01 |
'float.Parse(string, NumberStyles) [Strict]' | Clr | 139.417 ns | 0.96 |
'float.Parse(string, NumberStyles, IFormatProvider) [Lax]' | Clr | 133.800 ns | 0.92 |
'float.Parse(string, NumberStyles, IFormatProvider) [Strict]' | Clr | 127.413 ns | 0.88 |
'Custom byte-to-float parser [Indexer]' | Clr | 7.657 ns | 0.05 |
'Custom byte-to-float parser [Enumerator]' | Clr | 566.440 ns | 3.90 |
| | | |
float.Parse(string) | Core | 154.369 ns | 1.00 |
'float.Parse(string, IFormatProvider)' | Core | 138.668 ns | 0.90 |
'float.Parse(string, NumberStyles) [Lax]' | Core | 155.644 ns | 1.01 |
'float.Parse(string, NumberStyles) [Strict]' | Core | 150.221 ns | 0.97 |
'float.Parse(string, NumberStyles, IFormatProvider) [Lax]' | Core | 142.591 ns | 0.92 |
'float.Parse(string, NumberStyles, IFormatProvider) [Strict]' | Core | 135.000 ns | 0.87 |
'Custom byte-to-float parser [Indexer]' | Core | 12.673 ns | 0.08 |
'Custom byte-to-float parser [Enumerator]' | Core | 584.236 ns | 3.78 |
...from running this code (requires the BenchmarkDotNet assembly)...
using System;
using System.Globalization;
using BenchmarkDotNet.Attributes;

namespace StackOverflow_51584129
{
    [ClrJob()]
    [CoreJob()]
    public class FloatParsingBenchmarks
    {
        private const string InputString = "132.29";
        private static readonly byte[] InputBytes = System.Text.Encoding.ASCII.GetBytes(InputString);
        private static readonly IFormatProvider ParsingFormatProvider = CultureInfo.InvariantCulture;
        private const NumberStyles LaxParsingNumberStyles = NumberStyles.Float;
        private const NumberStyles StrictParsingNumberStyles = NumberStyles.AllowDecimalPoint;
        private const char DecimalSeparator = '.';

        [Benchmark(Baseline = true, Description = "float.Parse(string)")]
        public float SystemFloatParse()
        {
            return float.Parse(InputString);
        }

        [Benchmark(Description = "float.Parse(string, IFormatProvider)")]
        public float SystemFloatParseWithProvider()
        {
            return float.Parse(InputString, CultureInfo.InvariantCulture);
        }

        [Benchmark(Description = "float.Parse(string, NumberStyles) [Lax]")]
        public float SystemFloatParseWithLaxNumberStyles()
        {
            return float.Parse(InputString, LaxParsingNumberStyles);
        }

        [Benchmark(Description = "float.Parse(string, NumberStyles) [Strict]")]
        public float SystemFloatParseWithStrictNumberStyles()
        {
            return float.Parse(InputString, StrictParsingNumberStyles);
        }

        [Benchmark(Description = "float.Parse(string, NumberStyles, IFormatProvider) [Lax]")]
        public float SystemFloatParseWithLaxNumberStylesAndProvider()
        {
            return float.Parse(InputString, LaxParsingNumberStyles, ParsingFormatProvider);
        }

        [Benchmark(Description = "float.Parse(string, NumberStyles, IFormatProvider) [Strict]")]
        public float SystemFloatParseWithStrictNumberStylesAndProvider()
        {
            return float.Parse(InputString, StrictParsingNumberStyles, ParsingFormatProvider);
        }

        [Benchmark(Description = "Custom byte-to-float parser [Indexer]")]
        public float CustomFloatParseByIndexing()
        {
            // FOR DEMONSTRATION PURPOSES ONLY!
            // This code has been written for and only tested with
            // parsing the ASCII string "132.29" in byte form
            var currentIndex = 0;
            var boundaryIndex = InputBytes.Length;
            char currentChar;
            var wholePart = 0;

            while (currentIndex < boundaryIndex && (currentChar = (char) InputBytes[currentIndex++]) != DecimalSeparator)
            {
                var currentDigit = currentChar - '0';

                wholePart = 10 * wholePart + currentDigit;
            }

            var fractionalPart = 0F;
            var nextFractionalDigitScale = 0.1F;

            while (currentIndex < boundaryIndex)
            {
                currentChar = (char) InputBytes[currentIndex++];
                var currentDigit = currentChar - '0';

                fractionalPart += currentDigit * nextFractionalDigitScale;
                nextFractionalDigitScale *= 0.1F;
            }

            return wholePart + fractionalPart;
        }

        [Benchmark(Description = "Custom byte-to-float parser [Enumerator]")]
        public float CustomFloatParseByEnumerating()
        {
            // FOR DEMONSTRATION PURPOSES ONLY!
            // This code has been written for and only tested with
            // parsing the ASCII string "132.29" in byte form
            var wholePart = 0;
            var enumerator = InputBytes.GetEnumerator();

            while (enumerator.MoveNext())
            {
                var currentChar = (char) (byte) enumerator.Current;

                if (currentChar == DecimalSeparator)
                    break;

                var currentDigit = currentChar - '0';
                wholePart = 10 * wholePart + currentDigit;
            }

            var fractionalPart = 0F;
            var nextFractionalDigitScale = 0.1F;

            while (enumerator.MoveNext())
            {
                var currentChar = (char) (byte) enumerator.Current;
                var currentDigit = currentChar - '0';

                fractionalPart += currentDigit * nextFractionalDigitScale;
                nextFractionalDigitScale *= 0.1F;
            }

            return wholePart + fractionalPart;
        }

        public static void Main()
        {
            BenchmarkDotNet.Running.BenchmarkRunner.Run<FloatParsingBenchmarks>();
        }
    }
}
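For parsing straight from bytes without hand-rolling the digit loop, newer runtimes (.NET Core 2.1 and later, or the System.Memory package) ship System.Buffers.Text.Utf8Parser, which reads a float from UTF-8/ASCII bytes using '.' as the decimal separator. The following is a minimal sketch of that alternative, not the custom parser benchmarked above and not measured here; the class and method names are hypothetical.

```csharp
using System;
using System.Buffers.Text;
using System.Globalization;
using System.Text;

public static class Utf8ParsingSketch
{
    // Tries to parse count bytes starting at offset as a float.
    // Returns false unless the entire slice was consumed as a number.
    public static bool TryParseFloat(byte[] buffer, int offset, int count, out float value)
    {
        var slice = new ReadOnlySpan<byte>(buffer, offset, count);
        return Utf8Parser.TryParse(slice, out value, out int bytesConsumed)
            && bytesConsumed == count;
    }

    public static void Main()
    {
        byte[] floatBytes = Encoding.ASCII.GetBytes("132.29");
        if (TryParseFloat(floatBytes, 0, floatBytes.Length, out float value))
            Console.WriteLine(value.ToString(CultureInfo.InvariantCulture));
        else
            Console.WriteLine("not a valid float");
    }
}
```

Unlike the demonstration parser, this keeps error checking: input such as "132,29" stops being consumed at the comma, so the length check makes the method return false.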
Answer 2 (score: 2)
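An interesting topic for working out optimization details at home :) Good health to you all.
My goal was to convert an ASCII CSV matrix into a float matrix as fast as possible in C#. It turned out that calling string.Split() on each row and converting each term separately also adds overhead. To overcome this, I modified BACON's solution so that it parses a whole row of floats at once, used like this: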
var falist = new List<float[]>();
for (int row = 0; row < sRowList.Count; row++)
{
    var sRow = sRowList[row];
    falist.Add(CustomFloatParseRowByIndexing(nTerms, sRow.ToCharArray(), '.'));
}
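Below is the code of my row parser variant. These are the benchmark results for converting a 40x31 matrix 1000 times:
Benchmark0: Split row and Parse each term to convert to float matrix dT = 704 ms
Benchmark1: Split row and TryParse each term to convert to float matrix dT = 640 ms
Benchmark2: Split row and CustomFloatParseByIndexing to convert terms to float matrix dT = 211 ms
Benchmark3: Use CustomFloatParseRowByIndexing to convert row to float matrix dT = 120 ms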
public float[] CustomFloatParseRowByIndexing(int nItems, char[] InputBytes, char DecimalSeparator)
{
    // Convert semicolon-separated floats from InputBytes into nItems float[] result.
    // Constraints are:
    // - no scientific notation or .x allowed
    // - every row has exactly nItems values
    // - semicolon delimiter after each value
    // - terms 'u' or 'undef' or 'undefined' allowed for bad values
    // - minus sign allowed
    // - leading space allowed
    // - all terms must comply
    // FOR DEMO PURPOSE ONLY
    // based on BACON on Stackoverflow, modified to read nItems delimited float values
    // https://stackoverflow.com/questions/51584129/convert-a-float-formated-char-to-float
    var currentIndex = 0;
    var boundaryIndex = InputBytes.Length;
    bool termready, ready = false;
    float[] result = new float[nItems];
    int cItem = 0;

    while (currentIndex < boundaryIndex)
    {
        termready = false;
        if ((char)InputBytes[currentIndex] == ' ') { currentIndex++; continue; }

        char currentChar;
        var wholePart = 0;
        float sgn = 1;

        while (currentIndex < boundaryIndex && (currentChar = (char)InputBytes[currentIndex++]) != DecimalSeparator)
        {
            if (currentChar == 'u')
            {
                while ((char)InputBytes[currentIndex++] != ';') ;
                result[cItem++] = -9999.0f;
                continue;
            }
            else if (currentChar == ' ')
            {
                continue;
            }
            else if (currentChar == ';')
            {
                termready = true;
                break;
            }
            else if (currentChar == '-') sgn = -1;
            else
            {
                var currentDigit = currentChar - '0';
                wholePart = 10 * wholePart + currentDigit;
            }
        }

        var fractionalPart = 0F;
        var nextFractionalDigitScale = 0.1F;

        if (!termready)
            while (currentIndex < boundaryIndex)
            {
                currentChar = (char)InputBytes[currentIndex++];
                if (currentChar == ';')
                {
                    termready = true;
                    break;
                }
                var currentDigit = currentChar - '0';
                fractionalPart += currentDigit * nextFractionalDigitScale;
                nextFractionalDigitScale *= 0.1F;
            }

        if (termready)
        {
            result[cItem++] = sgn * (wholePart + fractionalPart);
        }
    }

    return result;
}
Answer 3 (score: 1)