I have two data frames in R, named house and candidates.
house
      House          Region Military_Strength
1     Stark       The North             20000
2 Targaryen    Slaver's Bay            110000
3 Lannister The Westerlands             60000
4 Baratheon  The Stormlands             40000
5    Tyrell       The Reach             30000
candidates
      House               Name  Region
1 Lannister    Jamie Lannister Westros
2     Stark         Robb Stark   North
3     Stark         Arya Stark Westros
4 Lannister    Cersi Lannister Westros
5 Targaryen Daenerys Targaryen Mereene
6 Baratheon   Robert Baratheon Westros
7   Mormont      Jorah Mormont Mereene
I want to merge the two data frames on the House column. To do that, I ran:
merge(candidates, house, by="House", sort=FALSE)
The output is:
      House               Name Region.x        Region.y Military_Strength
1 Lannister    Jamie Lannister  Westros The Westerlands             60000
2 Lannister    Cersi Lannister  Westros The Westerlands             60000
3     Stark         Robb Stark    North       The North             20000
4     Stark         Arya Stark  Westros       The North             20000
5 Targaryen Daenerys Targaryen  Mereene    Slaver's Bay            110000
6 Baratheon   Robert Baratheon  Westros  The Stormlands             40000
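Note that Tyrell (no candidate) and Mormont (no matching house) are missing from this output, because merge() keeps only keys present in both data frames unless told otherwise. A minimal sketch of the difference, assuming trimmed-down copies of the two data frames with the Region columns left out for brevity:

```r
# Trimmed-down copies of the two data frames (Region columns omitted)
house <- data.frame(
  House = c("Stark", "Targaryen", "Lannister", "Baratheon", "Tyrell"),
  Military_Strength = c(20000, 110000, 60000, 40000, 30000),
  stringsAsFactors = FALSE
)
candidates <- data.frame(
  House = c("Lannister", "Stark", "Stark", "Lannister",
            "Targaryen", "Baratheon", "Mormont"),
  Name = c("Jamie Lannister", "Robb Stark", "Arya Stark", "Cersi Lannister",
           "Daenerys Targaryen", "Robert Baratheon", "Jorah Mormont"),
  stringsAsFactors = FALSE
)

# Default merge: only houses that appear in both data frames survive
inner <- merge(candidates, house, by = "House", sort = FALSE)

# all.x = TRUE keeps every candidate; unmatched houses get NA strength
left <- merge(candidates, house, by = "House", all.x = TRUE, sort = FALSE)

print(nrow(inner))
print(nrow(left))
```

Whether the unmatched rows should be kept depends on the goal; the rest of this question works with the default merge shown above.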
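I want to remove the second candidate from each house (if there is one), but its Military_Strength should be added to the first candidate of the same house.
For example:
4 Stark Arya Stark Westros The North 20000
would be removed, but its 20000 would be added to the Military_Strength of row 3, Robb Stark. What is the proper way to do this?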
Answer 0 (score: 1)
Starting from df1, the data.frame obtained after merge(), one can proceed as follows:
# Replace each row's strength with the total for its house
df1$Military_Strength <- with(df1, ave(Military_Strength, House, FUN=sum))
# Keep only the first candidate listed for each house
df1[!duplicated(df1$House),]
#      House               Name Region.x        Region.y Military_Strength
#1 Lannister    Jamie Lannister  Westros The Westerlands            120000
#3     Stark         Robb Stark    North       The North             40000
#5 Targaryen Daenerys Targaryen  Mereene    Slaver's Bay            110000
#6 Baratheon   Robert Baratheon  Westros  The Stormlands             40000
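To see why these two steps work, here is a small self-contained sketch on a hypothetical data frame (toy and result are illustrative names, not part of the original data). ave() returns a vector the same length as its input, with each value replaced by the group-wise result of FUN, and duplicated() marks every occurrence of a House after the first:

```r
# Hypothetical stand-in for the merged data frame
toy <- data.frame(
  House = c("Lannister", "Lannister", "Stark", "Stark", "Targaryen"),
  Name = c("Jamie", "Cersi", "Robb", "Arya", "Daenerys"),
  Military_Strength = c(60000, 60000, 20000, 20000, 110000),
  stringsAsFactors = FALSE
)

# Every row gets the sum of Military_Strength within its own House
toy$Military_Strength <- ave(toy$Military_Strength, toy$House, FUN = sum)

# duplicated() is TRUE for the second and later rows of each House,
# so negating it keeps only the first candidate per house
result <- toy[!duplicated(toy$House), ]
print(result)
```

Because the sum is written to every row before the duplicates are dropped, the first candidate of each house ends up carrying the combined total. The result depends on row order: "first" means whichever candidate appears first in the data frame.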
Data used in this example:
df1 <- structure(list(House = structure(c(2L, 2L, 3L, 3L, 4L, 1L),
.Label = c("Baratheon", "Lannister", "Stark", "Targaryen"),
class = "factor"), Name = structure(c(4L, 2L, 5L, 1L, 3L, 6L),
.Label = c("Arya Stark", "Cersi Lannister", "Daenerys Targaryen",
"Jamie Lannister", "Robb Stark", "Robert Baratheon"),
class = "factor"), Region.x = structure(c(3L, 3L, 2L, 3L, 1L, 3L),
.Label = c("Mereene", "North", "Westros"), class = "factor"),
Region.y = structure(c(4L, 4L, 2L, 2L, 1L, 3L),
.Label = c("Slaver's Bay", "The North", "The Stormlands",
"The Westerlands"), class = "factor"),
Military_Strength = c(60000L, 60000L, 20000L, 20000L, 110000L,
40000L)), .Names = c("House", "Name", "Region.x", "Region.y",
"Military_Strength"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6"))