I have the following data frame:
df <- structure(list(traffic_Count_Street = c("16th St", "17th St",
"Agnes St", "Ayers St", "Ayers St", "Ayers St", "Ayers St", "Baldwin Blvd",
"Baldwin Blvd", "Baldwin Blvd","S Brahma Blvd"),
unit_Street = c("Baldwin Blvd", "Baldwin Blvd", "Baldwin Blvd", "Baldwin Blvd", "Baldwin Blvd",
"Baldwin Blvd", "Baldwin Blvd", "Baldwin Blvd", "Baldwin Blvd",
"Baldwin Blvd","S 14th St")), .Names = c("traffic_Count_Street", "unit_Street"
), row.names = c(NA, 11L), class = "data.frame")
   traffic_Count_Street  unit_Street
1               16th St Baldwin Blvd
2               17th St Baldwin Blvd
3              Agnes St Baldwin Blvd
4              Ayers St Baldwin Blvd
5              Ayers St Baldwin Blvd
6              Ayers St Baldwin Blvd
7              Ayers St Baldwin Blvd
8          Baldwin Blvd Baldwin Blvd
9          Baldwin Blvd Baldwin Blvd
10         Baldwin Blvd Baldwin Blvd
11        S Brahma Blvd    S 14th St
I want to return the rows where the two columns do not match but the first character of each column does match.
The expected result is:
  traffic_Count_Street unit_Street
1        S Brahma Blvd   S 14th St
I have the following, but I am not sure whether it is correct.
require(dplyr)
result = df%>%
filter(traffic_Count_Street != unit_Street & traffic_Count_Street[1] == unit_Street[1])
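Answer 0 (score: 2)
We can use substr to extract the first character of each column, compare them with ==, and filter the rows, combining that with the other comparison (!=) from the OP's code.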
df %>%
filter(substr(traffic_Count_Street, 1, 1) == substr(unit_Street, 1, 1) &
traffic_Count_Street != unit_Street)
# traffic_Count_Street unit_Street
#1 S Brahma Blvd S 14th St
Or using data.table:
library(data.table)
setDT(df)[df[,Reduce(`!=`, .SD) & substr(.SD[[1]],1,1) == substr(.SD[[2]], 1, 1)]]
# traffic_Count_Street unit_Street
#1: S Brahma Blvd S 14th St
Or using base R:
subset(df, substr(traffic_Count_Street, 1, 1) == substr(unit_Street, 1, 1) &
traffic_Count_Street != unit_Street)
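To illustrate why the answer swaps the OP's `[1]` indexing for substr, here is a minimal sketch in base R. It assumes a three-row subset of the question's data, rebuilt so the snippet runs on its own; the names first_element, first_char, differs and result are made up for this illustration. `[1]` selects the first element of a column, so the OP's condition compares only the first row's two street names, whereas substr is vectorized and yields one comparison per row.

```r
# Three rows taken from the question's data (a subset, for illustration only)
df <- data.frame(
  traffic_Count_Street = c("16th St", "Baldwin Blvd", "S Brahma Blvd"),
  unit_Street = c("Baldwin Blvd", "Baldwin Blvd", "S 14th St"),
  stringsAsFactors = FALSE
)

# `[1]` picks the first ELEMENT of each column: a single length-1 comparison
# ("16th St" == "Baldwin Blvd"), not a per-row first-character test
first_element <- df$traffic_Count_Street[1] == df$unit_Street[1]

# substr() is vectorized: one first-character comparison per row
first_char <- substr(df$traffic_Count_Street, 1, 1) == substr(df$unit_Street, 1, 1)

# Rows where the two full street names differ
differs <- df$traffic_Count_Street != df$unit_Street

# Keep rows that differ overall but share a first character
result <- df[first_char & differs, ]
print(result)
```

Because first_element is a single FALSE here, it would be recycled across every row inside filter, so the OP's original condition keeps no rows at all for this data.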
Answer 1 (score: 2)
Using data.table for its syntactic sugar:
library(data.table)
setDT(df)[substr(traffic_Count_Street, 1, 1) == substr(unit_Street, 1, 1) &
traffic_Count_Street != unit_Street]
# traffic_Count_Street unit_Street
# 1: S Brahma Blvd S 14th St