I'm trying to create a new column that combines characters from two existing columns. My df currently looks like this:
ID Town
AK_Town_0233470 Hooper Bay
CA_Town_0603330 Avilla Beach
CA_Town_0616462 Corte Madera
CA_Town_0623042 Eureka
CA_Town_0625338 Foster City
I'm trying to create a new column (New_ID) that looks like this:
ID Town New_ID
AK_Town_0233470 Hooper Bay Hooper Bay, AK
CA_Town_0603330 Avilla Beach Avilla Beach, CA
CA_Town_0616462 Corte Madera Corte Madera, CA
CA_Town_0623042 Eureka Eureka, CA
CA_Town_0625338 Foster City Foster City, CA
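I thought tidyverse's unite might help, but I'm not simply merging the two columns: I need to append part of the ID column to the Town column, separated by a comma.
Thanks for your help!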
Answer 0 (score: 0)
The paste and substr functions make this easy, as shown below:
df <- read.table(stringsAsFactors = FALSE, header = TRUE, sep = ",",
                 strip.white = TRUE,  # drop the space that follows each comma
                 text =
"ID, Town
AK_Town_0233470, Hooper Bay
CA_Town_0603330, Avilla Beach
CA_Town_0616462, Corte Madera
CA_Town_0623042, Eureka
CA_Town_0625338, Foster City")
# Build the new ID with paste0: town name, a comma, then the first two characters of ID
df$new_id <- paste0(df$Town, ", ", substr(df$ID, 1, 2))
print(df)
# ID Town new_id
#1 AK_Town_0233470 Hooper Bay Hooper Bay, AK
#2 CA_Town_0603330 Avilla Beach Avilla Beach, CA
#3 CA_Town_0616462 Corte Madera Corte Madera, CA
#4 CA_Town_0623042 Eureka Eureka, CA
#5 CA_Town_0625338 Foster City Foster City, CA
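substr(df$ID, 1, 2) relies on the prefix always being exactly two characters long. As a minimal sketch of a variation, assuming the state code is whatever precedes the first underscore in ID (an assumption not stated in the question), sub() can strip everything from that underscore onward. The helper variable state is introduced here only for illustration.

```r
df <- data.frame(
  ID = c("AK_Town_0233470", "CA_Town_0603330", "CA_Town_0616462",
         "CA_Town_0623042", "CA_Town_0625338"),
  Town = c("Hooper Bay", "Avilla Beach", "Corte Madera", "Eureka", "Foster City"),
  stringsAsFactors = FALSE
)

# Assumption: the state code is the text before the first underscore.
# sub() replaces the first match only; "_.*$" matches from the first
# underscore to the end of the string, so only the prefix remains.
state <- sub("_.*$", "", df$ID)

# paste() with sep = ", " joins the town name and state code element-wise.
df$New_ID <- paste(df$Town, state, sep = ", ")
print(df)
```

If the prefix really is always two letters, the substr() version above is simpler and gives the same result on this data.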
Answer 1 (score: 0)
My approach is very similar to Anders's above, but I use the mutate function from dplyr.
library(dplyr)    # mutate() and %>%
library(stringr)  # str_extract()
df <- data.frame(ID = c("AK_Town_0233470", "CA_Town_0603330", "CA_Town_0616462",
                        "CA_Town_0623042", "CA_Town_0625338"),
                 Town = c("Hooper Bay", "Avilla Beach", "Corte Madera", "Eureka", "Foster City"))
df %>%
  # refer to ID directly inside mutate(); "^" anchors the match to the start of the string
  mutate(New_ID = paste0(Town, ", ", str_extract(ID, pattern = "^[[:alpha:]]{2}")))
Here is the result:
  ID Town New_ID
1 AK_Town_0233470 Hooper Bay Hooper Bay, AK
2 CA_Town_0603330 Avilla Beach Avilla Beach, CA
3 CA_Town_0616462 Corte Madera Corte Madera, CA
4 CA_Town_0623042 Eureka Eureka, CA
5 CA_Town_0625338 Foster City Foster City, CA
I used the str_extract function from the stringr package (also part of the tidyverse) instead of base R's substr(), but substr() works just as well:
df %>%
  mutate(New_ID = paste0(Town, ", ", substr(ID, 1, 2)))
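The question mentions unite, so here is a rough sketch of how that route might look. It assumes the tidyr package is installed and that the state code is the part of ID before the first underscore. The intermediate column names State and Rest are made up for this example. separate() first splits ID into pieces, and unite() then joins Town and State with a comma.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  ID = c("AK_Town_0233470", "CA_Town_0603330", "CA_Town_0616462",
         "CA_Town_0623042", "CA_Town_0625338"),
  Town = c("Hooper Bay", "Avilla Beach", "Corte Madera", "Eureka", "Foster City"),
  stringsAsFactors = FALSE
)

result <- df %>%
  # Split ID at the first underscore only; extra = "merge" keeps the
  # remainder ("Town_0233470") in one piece. remove = FALSE keeps ID.
  separate(ID, into = c("State", "Rest"), sep = "_",
           extra = "merge", remove = FALSE) %>%
  # Join Town and State as "Town, State"; remove = FALSE keeps Town.
  unite("New_ID", Town, State, sep = ", ", remove = FALSE) %>%
  # Drop the helper columns and restore the requested column order.
  select(ID, Town, New_ID)

print(result)
```

This takes more steps than the paste0() versions above, so it is mostly worthwhile when the other pieces of ID are needed as columns anyway.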