I have this data frame:
ID Description
1 Tree fell on car
2 Tree was uprooted
3 While cutting tree, it came down
4 Tree came down
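I am trying to search one column of the data frame for weather words. I do this with several grepl calls joined by OR. However, I would like to combine grepl calls to say: "if the description contains this word AND this word, but NOT this word, then it is weather". Looking at the data frame above, "Tree came down" should be classified as weather, but "While cutting tree, it came down" has nothing to do with weather.
The code I tried, based on other Stack Overflow answers, is: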
Data$Type<-ifelse(grepl(' Tree|^Tree|-Tree|:Tree',Data$DESCRIPTION,ignore.case=TRUE)&
  grepl('^[^Cutting]*[Feel|Fell|Fall|Up Rooted|Uprooted|Came Down| Down|Knocked Onto|Caused Damage][^Cutting]*$',Data$DESCRIPTION,ignore.case=TRUE)), "weather", "Not Classified")
But this does not work. I also tried:
Data$Type<-ifelse(grepl(' Tree|^Tree|-Tree|:Tree',Data$DESCRIPTION,ignore.case=TRUE)&
  grepl('Feel|Fell|Fall|Up Rooted|Uprooted|Came Down| Down|Knocked Onto|Caused Damage',Data$DESCRIPTION,ignore.case=TRUE) &
  !grepl('Cutting',Data$DESCRIPTION,ignore.case=TRUE)), "Weather", "Not Classified")
I am expecting this result:
ID Description Type
1 Tree fell on car "Weather"
2 Tree was uprooted "Weather"
3 While cutting tree, it came down "Non-Weather"
4 Tree came down "Weather"
But neither of these works. Thanks.
Answer 0 (score: 0)
Since there are only two cases (weather and non-weather), I think a single grepl is enough:
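# The match is case-sensitive, so the lowercase "tree" in row 3 does not
# match 'Tree' and that row falls through to 'Non-Weather'.
df$Type <- sapply(df$Description, function(x)
  ifelse(grepl(pattern = 'Tree|fell|^cutting', x = x), 'Weather', 'Non-Weather'))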
[1] "Weather" "Weather" "Non-Weather" "Weather"
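For the "this word AND this word but NOT this word" rule from the question, here is a minimal sketch that keeps each condition in its own vectorized grepl call and combines them with & and !. Two things in the question's attempts are worth noting. `[^Cutting]` is a bracket expression that matches any single character other than those letters, so it does not negate the word "Cutting". The ifelse calls also have an extra closing parenthesis before the "Weather" argument. The sample data is copied from the question. The helper names `has_tree`, `has_fall` and `has_cut`, and the shortened word list, are my own assumptions.

```r
# Sample data copied from the question
Data <- data.frame(
  ID = 1:4,
  Description = c("Tree fell on car",
                  "Tree was uprooted",
                  "While cutting tree, it came down",
                  "Tree came down"),
  stringsAsFactors = FALSE
)

# One vectorized test per condition (helper names are illustrative)
has_tree <- grepl("\\btree\\b", Data$Description,
                  ignore.case = TRUE, perl = TRUE)
has_fall <- grepl("fell|fall|up ?rooted|came down|knocked onto|caused damage",
                  Data$Description, ignore.case = TRUE)
has_cut  <- grepl("cutting", Data$Description, ignore.case = TRUE)

# Weather = mentions a tree AND a fall-type word AND NOT cutting
Data$Type <- ifelse(has_tree & has_fall & !has_cut, "Weather", "Non-Weather")
print(Data)
```

Because grepl is already vectorized over the column, no sapply is needed. Keeping the three logical vectors separate also makes it easier to check which condition rejected a given row.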
Answer 1 (score: 0)
I ended up just doing something like this, which treats "Ice" as a weather word but excludes descriptions that contain "Maker".
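# The "Weather" / "Not Classified" labels are taken from the question's code
Data$Type <- ifelse(grepl('Ice$| Ice |,Ice |^Ice | Ice,',Data$DESCRIPTION,ignore.case=TRUE) &
  !grepl('Maker',Data$DESCRIPTION,ignore.case=TRUE), "Weather", "Not Classified")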
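A possible simplification is the word-boundary escape `\\b` in place of listing every delimiter around "Ice". This is only a sketch. The `desc` vector is made-up sample data and the variable names are my own. One difference in behavior: with ignore.case = TRUE, the `Ice$` alternative above also matches any description that merely ends in "ice" (for example "office"), whereas `\\bice\\b` only matches the standalone word.

```r
# Hypothetical descriptions, invented for illustration only
desc <- c("Ice on road",
          "Black ice, car slid",
          "Ice maker leaked",
          "Office flooded")

# \\b marks a word boundary, so "ice" inside "Office" does not match
is_ice   <- grepl("\\bice\\b",   desc, ignore.case = TRUE, perl = TRUE)
is_maker <- grepl("\\bmaker\\b", desc, ignore.case = TRUE, perl = TRUE)

# Weather = contains the word "ice" AND NOT the word "maker"
type <- ifelse(is_ice & !is_maker, "Weather", "Not Classified")
print(data.frame(desc, type))
```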