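I'm trying to work out how to split a single "Name" column in a data frame into two further columns, FirstName and LastName, in the same data frame. The challenge is that some of my names have several last names. Essentially, I want to take the first word (or element of the string) and put it in the FirstName column, then put all of the following text (minus the separating space, of course) into the LastName column.
This is my data frame "tteam":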
NAME <- c('John Doe','Peter Gynn','Jolie Hope-Douglas', 'Muhammad Arnab Halwai')
TITLE <- c("assistant", "manager", "assistant", "specialist")
tteam<- data.frame(NAME, TITLE)
My desired output is this:
FirstName <- c("John", "Peter", "Jolie", "Muhammad")
LastName <- c("Doe", "Gynn", "Hope-Douglas", "Arnab Halwai")
tteamdesire <- data.frame(FirstName, LastName, TITLE)
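I tried the following code to create a new data frame containing only the names, which would let me pull the first names out of the first column. However, I can't get the last names into any sensible order.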
names <- tteam$NAME ## puts full names into names vector
namesdf <- data.frame(do.call('rbind', strsplit(as.character(names),' ',fixed=TRUE)))
## splits out all names into a dataframe PROBLEM IS HERE!
Answer 0 (score: 7)
You can use extract from tidyr:
library(tidyr)
extract(tteam, NAME, c("FirstName", "LastName"), "([^ ]+) (.*)")
# FirstName LastName TITLE
#1 John Doe assistant
#2 Peter Gynn manager
#3 Jolie Hope-Douglas assistant
#4 Muhammad Arnab Halwai specialist
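To see what that pattern captures without loading tidyr, here is a minimal base R sketch. It reuses the NAME vector from the question; `m` and `parts` are names made up for illustration. `([^ ]+)` takes everything up to the first space and `(.*)` takes whatever follows it.

```r
NAME <- c("John Doe", "Peter Gynn", "Jolie Hope-Douglas", "Muhammad Arnab Halwai")

# regexec() reports the full match plus one entry per capture group
m <- regmatches(NAME, regexec("([^ ]+) (.*)", NAME))

# Each element of m is c(full match, group 1, group 2); stack them into a matrix
parts <- do.call(rbind, m)

FirstName <- parts[, 2]  # text before the first space
LastName  <- parts[, 3]  # everything after the first space
```

Because the first group cannot contain a space, the split always happens at the first space, which is why multi-word last names stay together.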
Answer 1 (score: 4)
Try:
> firstname = sapply(strsplit(NAME, ' '), function(x) x[1])
> firstname
[1] "John" "Peter" "Jolie" "Muhammad"
> lastname = sapply(strsplit(NAME, ' '), function(x) x[length(x)])
> lastname
[1] "Doe" "Gynn" "Hope-Douglas" "Halwai"
Or:
> ll = strsplit(NAME, ' ')
>
> firstname = sapply(ll, function(x) x[1])
> lastname = sapply(ll, function(x) x[length(x)])
>
> firstname
[1] "John" "Peter" "Jolie" "Muhammad"
> lastname
[1] "Doe" "Gynn" "Hope-Douglas" "Halwai"
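Note that `x[length(x)]` keeps only the final word, so the last row gives "Halwai" rather than the "Arnab Halwai" asked for in the question. A possible variant, sketched here on the assumption that every word after the first belongs to the last name, pastes the remaining pieces back together:

```r
NAME <- c("John Doe", "Peter Gynn", "Jolie Hope-Douglas", "Muhammad Arnab Halwai")

# Split each name on single spaces (fixed = TRUE treats " " literally)
ll <- strsplit(NAME, " ", fixed = TRUE)

# First word is the first name
firstname <- sapply(ll, function(x) x[1])

# Drop the first word and rejoin the rest with spaces
lastname <- sapply(ll, function(x) paste(x[-1], collapse = " "))
```

With this variant a name consisting of a single word would get an empty string as its last name.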
Answer 2 (score: 3)
1) sub
data.frame(FirstName = sub(" .*", "", tteam$NAME),
LastName = sub("^\\S* ", "", tteam$NAME),
tteam[-1])
2) gsubfn::read.pattern
In the NAME <- line we can omit as.character if NAME is already character (rather than factor):
library(gsubfn)
cn <- c("FirstName", "LastName")
NAME <- as.character(tteam$NAME)
cbind( read.pattern(text = NAME, pattern = "^(\\S*) (.*)", col.names = cn), tteam[-1])
Update: Revised the solution so that it works with tteam, and added a second solution.
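One edge case worth checking with the sub approach in 1): a name with no space at all matches neither pattern, so both calls return the input unchanged and the full name lands in both columns. Below is a minimal sketch of one way to guard against that. The sample vector `x`, the helper name `has_space`, and the choice of NA for a missing last name are all assumptions for illustration, not part of the original answer.

```r
x <- c("John Doe", "Muhammad Arnab Halwai", "Cher")

# TRUE where the name contains at least one space
has_space <- grepl(" ", x, fixed = TRUE)

# Remove the first space and everything after it; single-word names pass through
FirstName <- sub(" .*", "", x)

# Strip the leading word and its space only when there is a space to split on
LastName <- ifelse(has_space, sub("^\\S* ", "", x), NA_character_)
```

Whether a single-word name should count as a first name or a last name depends on the data, so the NA here is only a placeholder for whatever rule fits.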
Answer 3 (score: 0)
You can use the unglue package:
library(unglue)
unglue_unnest(tteam, NAME, "{FirstName} {LastName}")
#> TITLE FirstName LastName
#> 1 assistant John Doe
#> 2 manager Peter Gynn
#> 3 assistant Jolie Hope-Douglas
#> 4 specialist Muhammad Arnab Halwai