I have a CSV file like this:
LocationList,Identity,Category
"New York,New York,United States","42","S"
"NA,California,United States","89","lyt"
"Hartford,Connecticut,United States","879","polo"
"San Diego,California,United States","45454","utyr"
"Seattle,Washington,United States","uytr","69"
"NA,NA,United States","87","tree"
I want to remove every 'NA' entry from the 'LocationList' column.
Expected result:
LocationList,Identity,Category
"New York,New York,United States","42","S"
"California,United States","89","lyt"
"Hartford,Connecticut,United States","879","polo"
"San Diego,California,United States","45454","utyr"
"Seattle,Washington,United States","uytr","69"
"United States","87","tree"
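The number of columns is not fixed; it may grow or shrink. In addition, I want to write the CSV file without quotes and without any escaping for the "LocationList" column.
How can I do this in R? I am new to R, so any help is appreciated.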
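Answer 0 (score: 2)
In this case, you just want to replace NA, with nothing. Note, however, that this is not the standard way to remove NA values (here NA is literal text inside a string, not a missing value). The pattern below strips one or more NA, prefixes from the start of each value. Assuming dat is your data, use: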
dat$LocationList <- gsub("^(NA,)+", "", dat$LocationList)
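If NA could also appear in the middle or at the end of a location list, splitting each value into its elements may be more robust than a pattern anchored at the start. The following is a minimal sketch, not part of the original answer: the helper name drop_na_tokens and the small example data frame are hypothetical, and it assumes the elements are separated by plain commas with no surrounding whitespace.

```r
# Hypothetical helper: drop every "NA" element from a comma-separated string.
drop_na_tokens <- function(x) {
  x <- as.character(x)                      # also accepts factor columns
  parts <- strsplit(x, ",", fixed = TRUE)   # split each value into its elements
  # Keep only the elements that are not the literal text "NA", then re-join
  vapply(parts, function(p) paste(p[p != "NA"], collapse = ","), character(1))
}

# Illustrative data (an assumption, mirroring a few rows of the question)
dat <- data.frame(
  LocationList = c("NA,California,United States",
                   "NA,NA,United States",
                   "Seattle,Washington,United States"),
  stringsAsFactors = FALSE
)
dat$LocationList <- drop_na_tokens(dat$LocationList)
```

Because the comparison is made per element, a place name that merely ends in "NA" is left untouched, which a plain substring replacement of "NA," would not guarantee.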
Answer 1 (score: 1)
Try:
my.data <- read.table(text='LocationList,Identity,Category
"New York,New York,United States","42","S"
"NA,California,United States","89","lyt"
"Hartford,Connecticut,United States","879","polo"
"San Diego,California,United States","45454","utyr"
"Seattle,Washington,United States","uytr","69"
"NA,NA,United States","87","tree"', header=TRUE, sep=",", stringsAsFactors=FALSE)
my.data$LocationList <- gsub("NA,", "", my.data$LocationList)
my.data
# LocationList Identity Category
# 1 New York,New York,United States 42 S
# 2 California,United States 89 lyt
# 3 Hartford,Connecticut,United States 879 polo
# 4 San Diego,California,United States 45454 utyr
# 5 Seattle,Washington,United States uytr 69
# 6 United States 87 tree
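If you drop the quotes when writing a regular csv file, you will not be able to read the data back in later. This is because every value in the LocationList variable contains commas, so commas would appear both inside fields and as the separators between fields. You could try write.csv2() instead, which separates fields with a semicolon (;). You can use: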
write.csv2(my.data, file="myFile.csv", quote=FALSE, row.names=FALSE)
This produces the following file:
LocationList;Identity;Category
New York,New York,United States;42;S
California,United States;89;lyt
Hartford,Connecticut,United States;879;polo
San Diego,California,United States;45454;utyr
Seattle,Washington,United States;uytr;69
United States;87;tree
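To check that a file written this way can be read back, one option is a round trip through read.csv2(), which expects semicolon separators. This is only a sketch: the temporary file and the small stand-in data frame are illustrative assumptions. Note that write.csv2() and read.csv2() also use a comma as the decimal mark, which matters if the data contains non-integer numbers.

```r
# Stand-in data (an assumption): two rows in the shape of the cleaned table
my.data <- data.frame(
  LocationList = c("New York,New York,United States", "United States"),
  Identity = c("42", "87"),
  Category = c("S", "tree"),
  stringsAsFactors = FALSE
)

tmp <- tempfile(fileext = ".csv")   # illustrative temporary file
write.csv2(my.data, file = tmp, quote = FALSE, row.names = FALSE)

# Read the file back; colClasses keeps every column as text
round.trip <- read.csv2(tmp, colClasses = "character")
```

Since the only semicolons in the file are field separators, the commas inside LocationList survive the round trip without quoting.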
(I notice now that the Identity and Category values in row 5 are presumably mixed up. You may want to swap them before writing to the file.)
# Swap Identity (column 2) and Category (column 3) in row 5
x <- my.data[5, 2]
my.data[5, 2] <- my.data[5, 3]
my.data[5, 3] <- x
rm(x)