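How should we read a complex csv file that mixes strings, doubles, and characters?
For example, could you provide a command that successfully extracts the numeric values from this csv file?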
For example:
yield curve data 2013-10-04
Yields in percentages per annum.
Parameters - AAA-rated bonds
Series key Parameters Description
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0 2.03555 Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 0 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA1 -2.009068 Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 1 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA2 24.54184 Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 2 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA3 -21.80556 Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 3 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.TAU1 5.351378 Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Tau 1 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.TAU2 4.321162 Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Tau 2 - Euro, provided by ECB
This is part of the information in the file. I tried csvread('yc_latest.csv', 6, 1, [6,1,6,1])
to get the value 2.03555, but it gave me the following error:
Error using dlmread (line 139)
Mismatch between file and format string.
Trouble reading number from file (row 1u, field 3u) ==> "Euro area (changing composition) -
Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous
compounding - yield error minimisation - Yield curve parameters, Beta 0
Error in csvread (line 50)
m=dlmread(filename, ',', r, c, rng);
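Answer 0 (score: 5)
I strongly recommend using the "Import Data" feature in MATLAB (it is on the "HOME" toolstrip).
Note in particular that it can also generate code for you, so that you can automate the import in the future.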
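Answer 1 (score: 2)
This is a pretty ugly solution. Unfortunately, MATLAB is quite bad at reading csv files, which makes this kind of hackery an unfortunate necessity. On the bright side, you will probably only have to write this kind of code once.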
fid = fopen('yc_latest.csv'); % open the file
% read key, value, and rest of line from each row, skipping the first six lines
contents = textscan(fid, '%s %f %[^\n]', 'HeaderLines', 6);
% unpack the fields and give them meaningful names
[seriesKey, parameters, description] = contents{:};
fclose(fid); % don't forget this!
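Once the columns are unpacked, a single parameter can be picked out by its series key. The following is only a minimal sketch: the two hard-coded rows are stand-ins for what textscan would return, and the variable names isBeta0 and beta0 are made up for illustration.

```matlab
% Stand-ins for the outputs of the textscan call (values copied from the question)
seriesKey  = {'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0'; ...
              'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA1'};
parameters = [2.03555; -2.009068];

% Logical mask of the rows whose key ends in 'BETA0'
isBeta0 = ~cellfun('isempty', regexp(seriesKey, 'BETA0$', 'once'));

% Value belonging to that key
beta0 = parameters(isBeta0);
```

Matching on the key rather than on a fixed row number keeps the lookup working if the order of the rows in the file changes.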
Answer 2 (score: 0)
An alternative to Chris's solution:
fid = fopen('yc_latest.csv');
Rows = textscan(fid, '%s', 'delimiter', '\n'); % creates a temporary cell array with the rows
fclose(fid);
Rows = Rows{1}; % textscan wraps its output in a 1x1 cell
% look for the lines that contain 'Euro':
value = strfind(Rows, 'Euro');
Idx = find(~cellfun('isempty', value));
% read the number after the series key on each of those lines
% (assumes the fields are separated by whitespace)
Columns = cellfun(@(x) sscanf(x, '%*s %f', 1), Rows(Idx));
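The indices of all rows that actually contain a Euro value are stored in Idx.
Answer 3 (score: 0)
You may want to use textscan this way.
Each line is parsed using the regular delimiters (tab, space). The format starts with %*s, where the asterisk skips the first element (YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0), then %f reads the value of interest, and finally %*[^\n] skips the rest of the line.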
fid = fopen(filename);
C = textscan(fid, '%*s%f%*[^\n]', 'HeaderLines', 6);
fclose(fid);
values = C{1};
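To try this without the original file, the same call can be run against a small temporary file. This is only a sketch: the six placeholder header lines, the tab separators, and the shortened descriptions are assumptions modelled on the excerpt in the question, not the real file layout.

```matlab
% Build a small sample file: six header lines followed by two data rows.
filename = [tempname '.csv'];
fid = fopen(filename, 'w');
for k = 1:6
    fprintf(fid, 'placeholder header line %d\n', k);
end
fprintf(fid, 'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0\t2.03555\tEuro area - Beta 0\n');
fprintf(fid, 'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA1\t-2.009068\tEuro area - Beta 1\n');
fclose(fid);

% Skip the key, read the number, ignore the rest of each line.
fid = fopen(filename);
C = textscan(fid, '%*s%f%*[^\n]', 'HeaderLines', 6);
fclose(fid);
delete(filename); % remove the temporary file

values = C{1}; % column vector with one number per data row
```

If the real file has a different number of header lines, only the 'HeaderLines' value needs to change.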