I have a sample data frame with an ID column and a value column:
ID_short Value
Boar 4
Pig 5
Duck 6
Dog 7
Cat 8
Horse 9
I have another data frame with a column that contains the same IDs, but extended with additional characters:
ID_Extended
Duck_p15
Dog32
PigGG
Horse_p12
Cat_Ok
Boar_Ko_1999_test
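I want to add this ID_Extended column to the first data frame, with each extended ID placed in the row of its matching short ID. The IDs are of class character.
Example of the desired output: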
ID Value ID_Extended
Boar 4 Boar_Ko_1999_test
Pig 5 PigGG
Duck 6 Duck_p15
Dog 7 Dog32
Cat 8 Cat_Ok
Horse 9 Horse_p12
Answer 0: (score: 2)
Here is one way to do it:
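# For each short ID, compare it with the leading characters of every
# extended ID and keep the position of the first match.
df1$ID_Extended <-
  df2$ID_Extended[sapply(df1$ID_short,
    function(x) match(x, substr(df2$ID_Extended, 1, nchar(x))))]
df1
  ID_short Value ID_Extended
1 Boar 4 Boar_Ko_1999_test
2 Pig 5 PigGG
3 Duck 6 Duck_p15
4 Dog 7 Dog32
5 Cat 8 Cat_Ok
6 Horse 9 Horse_p12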
Data:
df1 <- data.frame(
ID_short = c("Boar", "Pig", "Duck", "Dog", "Cat", "Horse"),
Value = 4:9,
stringsAsFactors = FALSE
)
df2 <- data.frame(
ID_Extended = c("Duck_p15", "Dog32", "PigGG","Horse_p12", "Cat_Ok", "Boar_Ko_1999_test"),
stringsAsFactors = FALSE
)
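The same prefix comparison can also be written with base R's startsWith() (available since R 3.3.0). The following is only a minimal sketch, not part of the original answer: the helper name match_prefix is hypothetical, and it assumes that each short ID is a prefix of at most one extended ID you care about (the first hit is kept, and NA is returned when nothing matches).

```r
df1 <- data.frame(
  ID_short = c("Boar", "Pig", "Duck", "Dog", "Cat", "Horse"),
  Value = 4:9,
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  ID_Extended = c("Duck_p15", "Dog32", "PigGG", "Horse_p12", "Cat_Ok", "Boar_Ko_1999_test"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each short ID, return the first extended ID
# that begins with it, or NA when no extended ID has that prefix.
match_prefix <- function(short, extended) {
  vapply(short, function(x) {
    hits <- extended[startsWith(extended, x)]
    if (length(hits) > 0) hits[1] else NA_character_
  }, character(1), USE.NAMES = FALSE)
}

df1$ID_Extended <- match_prefix(df1$ID_short, df2$ID_Extended)
df1
```

Note that any prefix-based approach is ambiguous when one short ID is itself the start of another (for example "Dog" and "Dogfish"), so the data should be checked for such cases first.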
Answer 1: (score: 2)
After extracting the leading substring of 'ID_Extended' from 'df2', we can use match:
df1$ID_Extended <- df2$ID_Extended[match(df1$ID_short,
sub("^([A-Z][a-z]+).*", "\\1", df2$ID_Extended))]
df1 <- structure(list(ID_short = c("Boar", "Pig", "Duck", "Dog", "Cat",
"Horse"), Value = 4:9), class = "data.frame", row.names = c(NA,
-6L))
df2 <- structure(list(ID_Extended = c("Duck_p15", "Dog32", "PigGG",
"Horse_p12", "Cat_Ok", "Boar_Ko_1999_test")), class = "data.frame",
row.names = c(NA,
-6L))
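To make the approach in this answer easier to follow, the sketch below splits the one-liner into its three steps. The intermediate names keys and idx are introduced here for illustration only; the data and the regular expression are the ones from the answer.

```r
df1 <- data.frame(
  ID_short = c("Boar", "Pig", "Duck", "Dog", "Cat", "Horse"),
  Value = 4:9,
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  ID_Extended = c("Duck_p15", "Dog32", "PigGG", "Horse_p12", "Cat_Ok", "Boar_Ko_1999_test"),
  stringsAsFactors = FALSE
)

# Step 1: keep only the leading capitalised word of each extended ID
# (one uppercase letter followed by one or more lowercase letters).
keys <- sub("^([A-Z][a-z]+).*", "\\1", df2$ID_Extended)
# keys is c("Duck", "Dog", "Pig", "Horse", "Cat", "Boar")

# Step 2: for each short ID, find its position in keys (NA if absent).
idx <- match(df1$ID_short, keys)

# Step 3: use those positions to pull the extended IDs in df1's row order.
df1$ID_Extended <- df2$ID_Extended[idx]
df1
```

The pattern assumes every short ID has the form of one uppercase letter followed by lowercase letters, as in the sample data. IDs containing digits, underscores, or further capitals would need a different pattern.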