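R: How to dynamically remove columns from a data frame

Asked: 2015-07-22 09:13:26

Tags: r dynamic dataframe multiple-columns

I have a data set with 2,200 rows. I have to remove a large number of columns (around 400, for example) at once. This operation happens very frequently, and the columns to remove change every time. The columns to be removed are listed in a text file.

This is how I have approached the problem.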

#Reading data
myData = read.csv("myDataFile.csv")

#Getting the column names which should be deleted
colToDelete = read.table("columnsToBeRemoved.txt")

#processing the names list
tempList = as.character(unlist(colToDelete))
cat(paste(shQuote(tempList, type="cmd"), collapse=","))

newDataSet = subset(myData, select = - ??)

I use cat(paste(shQuote(tempList, type="cmd"), collapse=",")) to get the list of names as a comma-separated string. The output is:

  

"04_ic_1306","06_iEC042_1314","13_iEcDH1_1363","18_iEcHS_1320","26_iEcolC_1368","31_iEcSMS35_1347","33_iECs_1301","34_iECUMN_1333","36_iEKO11_1354","39_iJO1366","47_iZ_1308","54_iSFxv_1172"

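I have tried both subset and the data.table approach, but had no luck with either. I get the error below. I think I am not passing the string to the select argument correctly.

  

Error in -a : invalid argument to unary operator

I mainly followed a previous Stack Overflow question.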

2 Answers:

Answer 0: (score: 1)

# a is the data frame; b holds the name of the column to delete
b <- "04_ic_1306"
a[, b] <- NULL

To do this iteratively, you will probably have to write a loop and keep the names in a vector:

[1] "04_ic_1306"       "06_iEC042_1314"   "13_iEcDH1_1363"   "18_iEcHS_1320"   
[5] "26_iEcolC_1368"   "31_iEcSMS35_1347" "33_iECs_1301"     "34_iECUMN_1333"  
[9] "36_iEKO11_1354"   "39_iJO1366"       "47_iZ_1308"       "54_iSFxv_1172" 
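
The loop mentioned above could look roughly like the following. This is only a minimal sketch: the data frame `a` and the vector `cols` are made-up stand-ins, not the asker's data.

```r
# Toy data frame standing in for the real data (hypothetical values)
a <- data.frame(x = 1:3, y = 4:6, z = 7:9, w = 10:12)

# Names of the columns to delete, kept in a character vector
cols <- c("y", "w")

# Delete the columns one at a time by assigning NULL to each
for (col in cols) {
  a[[col]] <- NULL
}

# Inspect the remaining column names
names(a)
```

Assigning `NULL` through `[[` modifies `a` in place, so no copy of the remaining columns has to be assembled by hand.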

Answer 1: (score: 1)

This may be a solution for you:

# Create data frame with 5 columns
df <- data.frame(a=rnorm(10), b=rnorm(10), c=rnorm(10), d=rnorm(10), e=rnorm(10))

# Select two columns to be removed
remove_col <- c("b", "d")

# Identify them in the column names
remove_col <- names(df) %in% remove_col

# Remove them using an inverse (the !) logical vector
df[,!remove_col]
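
Applied to the situation in the question, the same idea might look like the sketch below. The file contents are invented for illustration, and the names `my_data`, `to_delete` and `new_data` are hypothetical. One assumption worth checking against the real data: the column names in the question start with a digit, and `read.csv()` rewrites such names by default (for example by prefixing an "X"), so the names in the text file would no longer match unless `check.names = FALSE` is set.

```r
# Build small stand-in files so the sketch is self-contained
# (contents are invented; the real files come from the question)
data_file <- tempfile(fileext = ".csv")
cols_file <- tempfile(fileext = ".txt")
writeLines(c("04_ic_1306,06_iEC042_1314,13_iEcDH1_1363",
             "1,2,3",
             "4,5,6"), data_file)
writeLines(c("04_ic_1306", "13_iEcDH1_1363"), cols_file)

# check.names = FALSE keeps names that start with a digit unchanged
my_data <- read.csv(data_file, check.names = FALSE)

# One column name per line, read as a plain character vector
to_delete <- trimws(readLines(cols_file))

# Keep every column whose name is not listed; drop = FALSE keeps
# the result a data frame even if only one column remains
new_data <- my_data[, !(names(my_data) %in% to_delete), drop = FALSE]

names(new_data)
```

Because the selection is a logical vector built from character names, no unary minus is applied to strings, which is the operation the error message in the question complains about.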