Question

I have a data frame with the following elements, and I want to take a subset of its records.

location <- c('london', 'london', 'london', 'newyork', 'newyork', 'paris', 'delhi')
year <- c(1990, 1991, 1992, 2001, 2002, 2003, 2001)

df <- data.frame(location, year)

I also have a vector, say:

x <- c('newyork', 'delhi')

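I want to subset the data frame so that the result contains every row except those whose location is listed in x. To create a test data frame, I tried this: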

 test1 <- df[df$location %in% c('newyork','delhi'), ]

It gives me the opposite: only the rows I want to drop. Can someone help?

I expect output like this:

       location year 
       london    1990
       london    1991
       london    1992
       paris     2003

Answer 1

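Using dplyr:

library(dplyr)  # provides %>% and filter()

# keep the rows whose location is NOT one of the listed values
new_df <- df %>% 
  filter(!(location %in% c("newyork", "delhi")))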
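If dplyr is not installed, roughly the same filter can be written with base R's subset(). This is only a sketch and not part of the original answer; it assumes the df and x from the question, and the name new_df_base is made up for illustration.

```r
# Data from the question
location <- c("london", "london", "london", "newyork", "newyork", "paris", "delhi")
year <- c(1990, 1991, 1992, 2001, 2002, 2003, 2001)
df <- data.frame(location, year)

# Locations to exclude
x <- c("newyork", "delhi")

# Keep only the rows whose location is NOT in x
# (new_df_base is a hypothetical name chosen for this example)
new_df_base <- subset(df, !(location %in% x))
new_df_base
```

Passing the vector x instead of hard-coding the values means the exclusion list can change without touching the filter itself.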

Answer 2

As @ycw pointed out in the comments, negating the logical condition gives you the expected result:

location <- c('london', 'london','london', 'newyork' ,'newyork', 'paris', 'delhi')
year <- c(1990, 1991, 1992, 2001, 2002, 2003,2001)

df <- data.frame(location, year)

x <- c('newyork', 'delhi')

# add "!" in front of the subset condition to negate it
test1 <- df[!(df$location %in% x), ]

test1

Result:

  location year
1   london 1990
2   london 1991
3   london 1992
6    paris 2003
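If the negated test is needed in several places, one possible variation is to wrap it in a small "not in" operator built with base R's Negate(). The operator name %notin% and the variable test2 are hypothetical names used only for this sketch. Resetting the row names is optional; it just renumbers the rows so they no longer carry the original labels 1, 2, 3, 6.

```r
# Data from the question
location <- c("london", "london", "london", "newyork", "newyork", "paris", "delhi")
year <- c(1990, 1991, 1992, 2001, 2002, 2003, 2001)
df <- data.frame(location, year)
x <- c("newyork", "delhi")

# Hypothetical helper operator: the logical negation of %in%
`%notin%` <- Negate(`%in%`)

# Same subset as above, written with the helper
test2 <- df[df$location %notin% x, ]

# Optional: renumber the rows 1..n instead of keeping the original row labels
rownames(test2) <- NULL
test2
```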

Answer 3

If you only want to exclude a few elements from the original data frame, you can also create the subset as follows:

location <- c('london', 'london', 'london', 'newyork', 'newyork',
              'paris', 'delhi')
year <- c(1990, 1991, 1992, 2001, 2002, 2003, 2001)

df <- data.frame(location, year)

# Identify the values you wish to remove (here "newyork" and "paris")
# and precede each comparison with the NOT operator (!)
df2 <- df[!(df$location == "newyork") & !(df$location == "paris"), ]

df2

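Note that this is not an efficient approach if you plan to filter out many values, because every value needs its own comparison. In those cases, ycw's and Damian's approaches are better.

However, if you only have one or a few values to remove, the arrangement above is a simple, quick, and reasonable way to reach your goal: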

  location year
1   london 1990
2   london 1991
3   london 1992
7    delhi 2001
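To make the comparison with the %in% approach concrete, the sketch below builds the same subset both ways and checks that the two results agree for this data. The names to_drop, chained, and with_in are invented for the example.

```r
# Data from the question
location <- c("london", "london", "london", "newyork", "newyork", "paris", "delhi")
year <- c(1990, 1991, 1992, 2001, 2002, 2003, 2001)
df <- data.frame(location, year)

# Values to exclude in this answer's example (hypothetical variable name)
to_drop <- c("newyork", "paris")

# One negated comparison per value, as in the answer above
chained <- df[!(df$location == "newyork") & !(df$location == "paris"), ]

# A single negated %in% test against the whole vector
with_in <- df[!(df$location %in% to_drop), ]

# Check that both approaches select the same rows
identical(chained, with_in)
```

With the chained form, every extra value adds another comparison to the expression; with %in%, only the vector to_drop grows.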

Subsetting data in R using another vector (exclusion)
