How can I cleanly remove invalid records?

Time: 2015-08-12 05:58:37

Tags: r

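I have to read a csv file in which each line is one record. In this file, any field whose data is missing is filled with "--". Any record containing such a value should be deleted. How can I do this cleanly?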

I know there is a very tedious way: compare every column against "--" and delete each row that contains that value. But that seems rather clumsy.

The csv file looks like this:

RB1511.SHF,2015-07-22,"2,001.0000","1,984.0000","2,035.0000","1,965.0000","1,997.0000","1,997.3900","1,628.0000","9,494.0000"
RB1511.SHF,2015-07-23,"1,986.0000","1,995.0000","1,995.0000","1,969.0000","1,979.0000","1,979.2700",640.0000,"9,216.0000"
RB1511.SHF,2015-07-24,"1,996.0000",--,"2,000.0000","1,975.0000",--,"1,986.9700","1,882.0000",--
ZN1607.SHF,2015-08-10,"14,305.0000","14,475.0000","14,475.0000","14,305.0000","14,360.0000","14,361.6600",6.0000,16.0000
ZN1607.SHF,2015-08-05,"14,890.0000","14,850.0000","14,890.0000","14,850.0000","14,870.0000","14,870.0000",4.0000,10.0000
ZN1607.SHF,2015-08-06,"14,750.0000","14,850.0000","14,750.0000","14,750.0000","14,750.0000",0.0000,0.0000,10.0000
CU1607.SHF,2015-07-24,--,"37,540.0000","38,200.0000",--,"37,710.0000","37,717.2200",--,336.0000

1 answer:

Answer 0 (score: 2)

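You can have read.csv() replace those values with NA through its na.strings argument. To read from your own file, just replace text = txt with your file name.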

txt <- 'RB1511.SHF,2015-07-22,"2,001.0000","1,984.0000","2,035.0000","1,965.0000","1,997.0000","1,997.3900","1,628.0000","9,494.0000"
RB1511.SHF,2015-07-23,"1,986.0000","1,995.0000","1,995.0000","1,969.0000","1,979.0000","1,979.2700",640.0000,"9,216.0000"
RB1511.SHF,2015-07-24,"1,996.0000",--,"2,000.0000","1,975.0000",--,"1,986.9700","1,882.0000",--
ZN1607.SHF,2015-08-10,"14,305.0000","14,475.0000","14,475.0000","14,305.0000","14,360.0000","14,361.6600",6.0000,16.0000
ZN1607.SHF,2015-08-05,"14,890.0000","14,850.0000","14,890.0000","14,850.0000","14,870.0000","14,870.0000",4.0000,10.0000
ZN1607.SHF,2015-08-06,"14,750.0000","14,850.0000","14,750.0000","14,750.0000","14,750.0000",0.0000,0.0000,10.0000
CU1607.SHF,2015-07-24,--,"37,540.0000","38,200.0000",--,"37,710.0000","37,717.2200",--,336.0000'


read.csv(text = txt, header = FALSE, na.strings = "--")
#           V1         V2          V3          V4          V5          V6          V7          V8         V9        V10
# 1 RB1511.SHF 2015-07-22  2,001.0000  1,984.0000  2,035.0000  1,965.0000  1,997.0000  1,997.3900 1,628.0000 9,494.0000
# 2 RB1511.SHF 2015-07-23  1,986.0000  1,995.0000  1,995.0000  1,969.0000  1,979.0000  1,979.2700   640.0000 9,216.0000
# 3 RB1511.SHF 2015-07-24  1,996.0000        <NA>  2,000.0000  1,975.0000        <NA>  1,986.9700 1,882.0000       <NA>
# 4 ZN1607.SHF 2015-08-10 14,305.0000 14,475.0000 14,475.0000 14,305.0000 14,360.0000 14,361.6600     6.0000    16.0000
# 5 ZN1607.SHF 2015-08-05 14,890.0000 14,850.0000 14,890.0000 14,850.0000 14,870.0000 14,870.0000     4.0000    10.0000
# 6 ZN1607.SHF 2015-08-06 14,750.0000 14,850.0000 14,750.0000 14,750.0000 14,750.0000      0.0000     0.0000    10.0000
# 7 CU1607.SHF 2015-07-24        <NA> 37,540.0000 38,200.0000        <NA> 37,710.0000 37,717.2200       <NA>   336.0000
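
The call above only marks the "--" fields as NA; the question also asks for those records to be dropped. A minimal sketch of that remaining step, assuming base R's complete.cases() and a shortened four-column subset of the sample rows (the names df, clean and num_cols are illustrative, not from the original answer):

```r
# Shortened subset of the sample rows from the question (first four fields only).
txt <- 'RB1511.SHF,2015-07-22,"2,001.0000","1,984.0000"
RB1511.SHF,2015-07-24,"1,996.0000",--
CU1607.SHF,2015-07-24,--,"37,540.0000"
ZN1607.SHF,2015-08-10,"14,305.0000","14,475.0000"'

# Read "--" as NA and keep text columns as character vectors.
df <- read.csv(text = txt, header = FALSE, na.strings = "--",
               stringsAsFactors = FALSE)

# Keep only the rows that have no NA in any column.
# na.omit(df) selects the same rows.
clean <- df[complete.cases(df), ]

# Optional: strip the thousands separators and convert the value columns to numeric.
num_cols <- 3:ncol(clean)
clean[num_cols] <- lapply(clean[num_cols],
                          function(x) as.numeric(gsub(",", "", x)))
```

With the full file, the same two steps apply unchanged: rows 3 and 7 of the output shown above would be the ones removed, since they are the only rows containing NA.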