How to import numeric values from a complex CSV file into a vector in MATLAB

Asked: 2013-10-07 14:22:03

Tags: matlab csv

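I would like to know how to read a complex CSV file that mixes strings, doubles, characters, and so on.

For example, could you provide a command that successfully extracts the numeric values from this CSV file?

Click here

For example: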

yield curve data 2013-10-04     
Yields in percentages per annum.        


Parameters - AAA-rated bonds        
Series key   Parameters  Description
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0  2.03555 Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 0 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA1  -2.009068   Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 1 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA2  24.54184    Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 2 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA3  -21.80556   Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 3 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.TAU1   5.351378    Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Tau 1 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.TAU2   4.321162    Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Tau 2 - Euro, provided by ECB

This is part of the information in the file. I tried csvread('yc_latest.csv', 6, 1, [6,1,6,1]) to get the value 2.03555, but it gave me the following error:

   Error using dlmread (line 139)
    Mismatch between file and format string.
    Trouble reading number from file (row 1u, field 3u) ==> "Euro area (changing composition) -
    Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous
    compounding - yield error minimisation - Yield curve parameters, Beta 0

    Error in csvread (line 50)
        m=dlmread(filename, ',', r, c, rng);

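4 Answers:

Answer 0: (score: 5)

I strongly recommend using the "Import Data" tool in MATLAB (it is on the "HOME" toolstrip).

Note in particular that it can also generate code for you, so that you can automate the import in future.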

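Answer 1: (score: 2)

This is a rather ugly solution. Unfortunately, MATLAB is quite poor at reading CSV files, which makes this kind of hackery an unfortunate necessity. On the bright side, you will probably only have to write code like this once.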

fid = fopen('yc_latest.csv');   %// open the file

%// parse as csv, skipping the first six lines
contents = textscan(fid, '%s %f %[^\n]', 'HeaderLines', 6); 

%// unpack the fields and give them meaningful names
[seriesKey, parameters, description]   = contents{:};

fclose(fid);                    %// don't forget this!
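
Once the columns are unpacked, a single value can be picked out by its series key. The following is a minimal sketch, not part of the original answer: it writes a shortened stand-in for yc_latest.csv to a temporary file (the file contents and the names tmpFile and beta0 are assumptions made for illustration, and the fields are assumed to be whitespace-separated as in the excerpt above) and then reads it back with the same textscan call.

```matlab
% Build a small stand-in for yc_latest.csv (hypothetical, descriptions shortened).
tmpFile = [tempname '.csv'];
fid = fopen(tmpFile, 'w');
fprintf(fid, 'yield curve data 2013-10-04\n');
fprintf(fid, 'Yields in percentages per annum.\n\n\n');
fprintf(fid, 'Parameters - AAA-rated bonds\n');
fprintf(fid, 'Series key\tParameters\tDescription\n');
fprintf(fid, 'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0\t2.03555\tBeta 0 - Euro, provided by ECB\n');
fprintf(fid, 'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA1\t-2.009068\tBeta 1 - Euro, provided by ECB\n');
fclose(fid);

% Same parsing call as above: skip six header lines, then key, number, rest of line.
fid = fopen(tmpFile);
contents = textscan(fid, '%s %f %[^\n]', 'HeaderLines', 6);
fclose(fid);
delete(tmpFile);

[seriesKey, parameters, description] = contents{:};

% parameters is a numeric column vector; select one entry by its series key.
beta0 = parameters(strcmp(seriesKey, 'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0'));
```

If the real file separates fields with commas rather than whitespace, the call would presumably also need a 'Delimiter', ',' option.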

Answer 2: (score: 0)

An alternative to Chris's solution:

fid = fopen('yc_latest.csv');
Rows = textscan(fid, '%s', 'delimiter', '\n'); % temporary cell array holding the rows
fclose(fid);
Rows = Rows{1}; % textscan wraps its output in a 1x1 cell; unwrap it to get one row per cell

% look for the lines that contain the text 'Euro':
value = strfind(Rows, 'Euro');
Idx = find(~cellfun('isempty', value));

Columns = cellfun(@(x) textscan(x, '%f', 'delimiter', '\t', 'CollectOutput', 1), Rows);
Columns = cellfun(@transpose, Columns, 'UniformOutput', 0);

The indices of all rows that actually hold Euro values are stored in Idx.
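
To go from Idx to an actual numeric vector, one option is to read the second field of each matching row with sscanf. This is only a sketch under assumptions: the sample Rows below are shortened, made-up stand-ins for the file's lines, the fields are assumed to be whitespace-separated, and the name values is introduced here for illustration.

```matlab
% Shortened stand-in rows (hypothetical); the first one is a header line.
Rows = { ...
    'Series key Parameters Description'; ...
    'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0 2.03555 Euro area - Beta 0'; ...
    'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA1 -2.009068 Euro area - Beta 1'};

% Rows whose text contains 'Euro' (case-sensitive), as in the answer above.
Idx = find(~cellfun('isempty', strfind(Rows, 'Euro')));

% For each matching row, skip the first whitespace-delimited token (%*s)
% and read exactly one floating-point number (%f).
values = cellfun(@(x) sscanf(x, '%*s %f', 1), Rows(Idx));
```

This relies on every matching row having a number as its second field; a row that does not would make sscanf return an empty result and cellfun would raise an error.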

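Answer 3: (score: 0)

You may want to use textscan this way.

Each line is parsed using the usual delimiters (tab, space). The format starts with %*s, where the asterisk skips the first element (YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0), then %f reads the value of interest, and finally %*[^\n] skips the rest of the line.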

fid = fopen(filename);                                
C = textscan(fid, '%*s%f%*[^\n]', 'HeaderLines', 6); 
fclose(fid);

values   = C{1};
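
To see what each part of the format does without touching a file, the same format can be applied to text held in memory, since textscan also accepts a character vector. The two sample lines below are shortened, hypothetical stand-ins for rows of the file, and sampleText is a name made up for this sketch.

```matlab
% Two data lines in the whitespace-separated layout shown in the question
% (descriptions shortened for illustration).
sampleText = sprintf([ ...
    'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0\t2.03555\tEuro area - Beta 0\n', ...
    'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA1\t-2.009068\tEuro area - Beta 1']);

% %*s      skips the series key (the asterisk means "read but do not return")
% %f       keeps the number
% %*[^\n]  reads and discards everything up to the end of the line
C = textscan(sampleText, '%*s%f%*[^\n]');

% Only the %f field is returned, so C has a single cell holding a column vector.
values = C{1};
```

Because the skipped fields produce no output, C contains just one cell, which is why values = C{1} is all that is needed.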