Getting column data from a csv file in R

Time: 2014-07-11 20:34:50

Tags: r

I am new to R. I have a csv file with stock market data:

Daily NYSE Group Volume in NYSE Listed,,,,
,,,,
,,,,
,Trade Date,NYSE Group Shares,NYSE Group Trades,NYSE Group Dollar Volume
,,,,
2010,1/4/10,"1,425,504,460","4,628,115","$38,495,460,645 "
2010,1/5/10,"1,754,011,750","5,394,016","$43,932,043,406 "
2010,1/6/10,"1,655,507,953","5,494,460","$43,816,749,660 "
2010,1/7/10,"1,797,810,789","5,674,297","$44,104,237,184 "
2010,1/8/10,"1,545,692,647","5,008,824","$40,816,677,580 "

I read the csv file like this:

> nysedata=read.csv("/Users/a/Downloads/nyse-vol-small.csv");
> nysedata
  Daily.NYSE.Group.Volume.in.NYSE.Listed          X               X.1
1                                     NA                             
2                                     NA                             
3                                     NA Trade Date NYSE Group Shares
4                                     NA                             
5                                   2010     1/4/10     1,425,504,460
6                                   2010     1/5/10     1,754,011,750
7                                   2010     1/6/10     1,655,507,953
8                                   2010     1/7/10     1,797,810,789
9                                   2010     1/8/10     1,545,692,647
                X.2                      X.3
1                                           
2                                           
3 NYSE Group Trades NYSE Group Dollar Volume
4                                           
5         4,628,115         $38,495,460,645 
6         5,394,016         $43,932,043,406 
7         5,494,460         $43,816,749,660 
8         5,674,297         $44,104,237,184 
9         5,008,824         $40,816,677,580 

> vol=nysedata["X.2"]
> vol
                X.2
1                  
2                  
3 NYSE Group Trades
4                  
5         4,628,115
6         5,394,016
7         5,494,460
8         5,674,297
9         5,008,824

I need to get rows 5 through 9 and convert them to numbers so that I can work with them numerically. I tried these:

> vol[5,]
[1] 4,628,115
7 Levels:  4,628,115 5,008,824 5,394,016 5,494,460 ... NYSE Group Trades
> vol[5,9]
NULL

I looked through tutorials but could not find anything that does this.

1 Answer:

Answer 0: (score: 1)

# Read in the data, skipping the first 3 lines so that line 4 becomes the header
nysedata = read.csv("/Users/asubram1/Downloads/nyse-vol-small.csv", skip = 3)
# Drop the first row, which is the empty 5th line of the file
nysedata = nysedata[-1,]

# Strip the $ and , characters so the values can be converted to numeric
# (note: [[:punct:]] also removes decimal points and minus signs; this data has none)
temp = apply(nysedata[,c(1,3:5)], 2, function(x) as.numeric(gsub("[[:punct:]]","", x)))
nysedata = data.frame(temp, Trade.Date = nysedata$Trade.Date)

> nysedata
     X NYSE.Group.Shares NYSE.Group.Trades NYSE.Group.Dollar.Volume Trade.Date
1 2010        1425504460           4628115              38495460645     1/4/10
2 2010        1754011750           5394016              43932043406     1/5/10
3 2010        1655507953           5494460              43816749660     1/6/10
4 2010        1797810789           5674297              44104237184     1/7/10
5 2010        1545692647           5008824              40816677580     1/8/10
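
As a side note on the attempts in the question: vol[5,9] asks for row 5, column 9 of a one-column data frame, which is why it came back as NULL; a range of rows is written 5:9. Below is a minimal sketch of pulling just those rows out of X.2 and converting them, assuming the file is read with the default header as in the question and with stringsAsFactors = FALSE so the columns arrive as character rather than factor. The inline csv_text and the names trades_raw, trades, dollars_raw and dollars are my own, used only so the example runs without the original file.

```r
# Inline copy of the sample rows from the question (stands in for the real file path)
csv_text <- 'Daily NYSE Group Volume in NYSE Listed,,,,
,,,,
,,,,
,Trade Date,NYSE Group Shares,NYSE Group Trades,NYSE Group Dollar Volume
,,,,
2010,1/4/10,"1,425,504,460","4,628,115","$38,495,460,645 "
2010,1/5/10,"1,754,011,750","5,394,016","$43,932,043,406 "
2010,1/6/10,"1,655,507,953","5,494,460","$43,816,749,660 "
2010,1/7/10,"1,797,810,789","5,674,297","$44,104,237,184 "
2010,1/8/10,"1,545,692,647","5,008,824","$40,816,677,580 "
'

# Same layout as in the question: the title line becomes the header,
# so the data ends up in rows 5 to 9 and the trade counts in column X.2
nysedata <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Rows 5 through 9 of column X.2, as a character vector
trades_raw <- nysedata[5:9, "X.2"]

# Remove the thousands separators, then convert to numeric
trades <- as.numeric(gsub(",", "", trades_raw, fixed = TRUE))

# The dollar column also carries a $ sign and a trailing space
dollars_raw <- nysedata[5:9, "X.3"]
dollars <- as.numeric(gsub("[$, ]", "", dollars_raw))

print(trades)
print(dollars)
```

If the column has already been read in as a factor (as in the question's output), wrapping it in as.character() before the gsub() call avoids converting the underlying level codes instead of the displayed values.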