Adding a column with a different number of rows

Time: 2018-11-18 20:44:45

Tags: r

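I have two data frames that I want to merge, but I'm not sure how to merge them when one has a different number of rows. This is the first data frame: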

Species Name                 Country           Lat         Lon  
Corynosoma hannae            New Zealand     -46.5000     170.3000
Polymorphus brevis           Mexico           19.4206    -102.2060
Acanthocephala terminalis    United States    38.1806    -83.4505
Polymorphus brevis           Mexico           30.5603    -115.9420
Polymorphus brevis           Mexico           19.6728    -99.7078
Polymorphus brevis           Mexico           19.6833    -101.8830
Polymorphus brevis           Mexico           30.5603    -115.9420
Polymorphus brevis           Mexico           30.5603    -115.9420

The second data frame:

Species Name                 Country          Number of Records
Corynosoma hannae            New Zealand              3
Polymorphus brevis           Mexico                   41 
Acanthocephala terminalis    United States            1

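The second data frame records how many samples of each species were obtained in each country. I'd like to add "Number of Records" to the first data frame, essentially grouping it, so that the data frame looks like this: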

    Species Name                 Country           Lat         Lon       Number of Records  
    Corynosoma hannae            New Zealand     -46.5000     170.3000         3
    Acanthocephala terminalis    United States    38.1806    -83.4505          1
    Polymorphus brevis           Mexico           30.5603    -115.9420         41
    Polymorphus brevis           Mexico           19.6728    -99.7078
    Polymorphus brevis           Mexico           19.6833    -101.8830
    Polymorphus brevis           Mexico           30.5603    -115.9420
    Polymorphus brevis           Mexico           30.5603    -115.9420
    Acanthocephala confraterna   United States    35.6859    -83.4986           2

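So I don't want the count (41, for example) repeated on every Polymorphus brevis row. I'd like all the Polymorphus brevis samples found in Mexico to be represented by a single entry in the "Number of Records" column. Any help would be greatly appreciated. I'm trying to use this data frame to create a bubble map with the rworldmap package.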

3 answers:

Answer 0 (score: 2)

Something like this:

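library(dplyr)

# Join the counts onto df1 by species and country
left_join(df1, df2, by = c("Species Name", "Country")) %>%
  group_by(`Species Name`, Country) %>%
  mutate(
    # Convert via character in case the column was read in as a factor
    `Number of Records` = as.numeric(as.character(`Number of Records`)),
    # Keep the count only on the first row of each group
    `Number of Records` = ifelse(row_number() == 1, `Number of Records`, NA)
  ) %>%
  ungroup()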
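To try the join on the question's data, here is a minimal sketch. It assumes the data frames are called `df1` and `df2`, as in the answer, and that they were created with `check.names = FALSE` so the column names keep their spaces. Only the first five rows of the question's data are reproduced, and `result` is just an illustrative name.

```r
library(dplyr)

# First five rows from the question; check.names = FALSE keeps the spaces
# in the column names (an assumption about how the data was loaded).
df1 <- data.frame(
  `Species Name` = c("Corynosoma hannae", "Polymorphus brevis",
                     "Acanthocephala terminalis", "Polymorphus brevis",
                     "Polymorphus brevis"),
  Country = c("New Zealand", "Mexico", "United States", "Mexico", "Mexico"),
  Lat = c(-46.5000, 19.4206, 38.1806, 30.5603, 19.6728),
  Lon = c(170.3000, -102.2060, -83.4505, -115.9420, -99.7078),
  check.names = FALSE, stringsAsFactors = FALSE
)

df2 <- data.frame(
  `Species Name` = c("Corynosoma hannae", "Polymorphus brevis",
                     "Acanthocephala terminalis"),
  Country = c("New Zealand", "Mexico", "United States"),
  `Number of Records` = c(3, 41, 1),
  check.names = FALSE, stringsAsFactors = FALSE
)

result <- left_join(df1, df2, by = c("Species Name", "Country")) %>%
  group_by(`Species Name`, Country) %>%
  # Keep the count on the first row of each group, NA on the rest
  mutate(`Number of Records` = ifelse(row_number() == 1,
                                      `Number of Records`, NA)) %>%
  ungroup()

print(result)
```

The rows stay in the order of `df1`, so the count for Polymorphus brevis in Mexico lands on the first Mexican row and the later ones are `NA`.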

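Answer 1 (score: 1)

I agree with the other two answers that all that is needed is to add a new column to one data frame using information from the other. One way to do this is with the function match():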

# define 1st df:
df1 <- data.frame( 
  Observations = c("obs1", "obs2", "obs3"),
  Data = c(sample(1:20, 3))
  )

# define 2nd df:
df2 <- data.frame( 
  OtherObservations = c("obs1", "obs2", "obs3"),
  OtherData = c(1, 2, NA)
)

# now add to df1 the relevant column in df2 based on matching data in either data frame:
df1$NewColumn <- df2$OtherData[match(df1$Observations, df2$OtherObservations)]
df1
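The same idea can be carried over to the question's data. Since the lookup there depends on two columns, the sketch below builds a composite key with `paste()` and blanks the count on repeated rows with `duplicated()`. The helper names `key1` and `key2` and the `|` separator are illustrative assumptions, and only a few rows of the question's data are used.

```r
# A few rows from the question; check.names = FALSE keeps the spaces
# in the column names (an assumption about how the data was loaded).
df1 <- data.frame(
  `Species Name` = c("Corynosoma hannae", "Polymorphus brevis",
                     "Acanthocephala terminalis", "Polymorphus brevis",
                     "Polymorphus brevis"),
  Country = c("New Zealand", "Mexico", "United States", "Mexico", "Mexico"),
  check.names = FALSE, stringsAsFactors = FALSE
)

df2 <- data.frame(
  `Species Name` = c("Corynosoma hannae", "Polymorphus brevis",
                     "Acanthocephala terminalis"),
  Country = c("New Zealand", "Mexico", "United States"),
  `Number of Records` = c(3, 41, 1),
  check.names = FALSE, stringsAsFactors = FALSE
)

# Composite keys so that match() compares species and country together
key1 <- paste(df1[["Species Name"]], df1[["Country"]], sep = "|")
key2 <- paste(df2[["Species Name"]], df2[["Country"]], sep = "|")

# Look up each df1 row's count in df2
df1[["Number of Records"]] <- df2[["Number of Records"]][match(key1, key2)]

# Blank the count on every repeat of a species/country pair
df1[["Number of Records"]][duplicated(key1)] <- NA

print(df1)
```

This needs only base R, which may be convenient if dplyr is not already a dependency.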

Answer 2 (score: 0)

Based on our comments, a simpler solution may be to add a new column to data frame 1 rather than joining the two data frames:

library(dplyr)

df1 %>%
  group_by(`Species Name`, Country) %>%
  mutate(
    nRecords = ifelse(row_number() == 1, n(), NA_integer_)
  ) %>%
  ungroup()
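Note that `n()` counts the rows present in `df1` for each group, so it reproduces the numbers in the second data frame only if `df1` holds one row per record. A quick way to see the per-group row counts is sketched below. It assumes the same `df1` layout as in the question; the toy rows and the `counts` name are illustrative.

```r
library(dplyr)

# Toy subset of the question's first data frame (illustrative rows only)
df1 <- data.frame(
  `Species Name` = c("Corynosoma hannae", "Polymorphus brevis",
                     "Acanthocephala terminalis", "Polymorphus brevis",
                     "Polymorphus brevis"),
  Country = c("New Zealand", "Mexico", "United States", "Mexico", "Mexico"),
  check.names = FALSE, stringsAsFactors = FALSE
)

# One row per species/country pair, with the number of df1 rows in each
counts <- df1 %>%
  count(`Species Name`, Country, name = "nRecords")

print(counts)
```

If these counts differ from the ones in the second data frame, joining on that data frame is the safer route.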