I have a column containing the rows below, and I want to create a new column based on the values found in each row:
Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access
Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access
Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access
Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access
Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access
Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access
Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access
Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access
Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: No home phone calls or line rental|NET: Internet access
The coding rules are:
if Telephone line rental is found, then in the new column I want to code it as V
if Fixed broadband is found, code it as B
if Mobile phone is found, code it as M
if Paid for TV service / TV and/or sport is found, code it as T
So, if we take the first row as an example:
Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access
the category would be: V, B, M
The NET: XXX | NET: XXXX parts of the string need to be ignored.
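The complete portfolio can be any combination of these 4 codes, but they must appear in the following order: V, B, M, T
I have been googling and reading, and I tried splitting the string with library(stringr), using sep = "\\|", but it doesn't work.
Any other ideas?
Regards
DPUT: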
Answer 0 (score: 1)
You can use grepl like this:
df <- read.table(text='"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access"
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access"
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access"
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access"
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access"
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access"
"Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access"
"Telephone line rental|Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: Internet access"
"Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: No home phone calls or line rental|NET: Internet access"',
header=FALSE,stringsAsFactors=FALSE)
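# Patterns that should all be coded as T (Paid for TV service / TV and/or sport)
tv <- c("Paid for TV service","TV","sport")
# For each service, emit its letter if the pattern is found, otherwise "".
# paste() joins the four pieces in the fixed order V, B, M, T using its default
# separator (a single space), so a missing service leaves an empty slot, e.g. "V B M ".
df$new_col <- paste(ifelse(grepl("Telephone line rental",df$V1),"V",""),
                    ifelse(grepl("Fixed broadband",df$V1),"B",""),
                    ifelse(grepl("Mobile phone",df$V1),"M",""),
                    ifelse(grepl(paste(tv,collapse = "|"), df$V1),"T",""))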
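If the codes should instead come out as one comma-separated string such as V, B, M, with no blank slots for missing services, a possible variant is sketched below. It is only a sketch under some assumptions: the helper name `code_services` and the `patterns` lookup are made up for illustration, and matching "TV" or "sport" anywhere in the string is just one reading of the T rule. The NET: entries are stripped first; in that pattern the pipe is escaped as `\\|` because an unescaped `|` means alternation in a regular expression.

```r
# Hypothetical lookup: code letter -> regex, listed in the required order V, B, M, T.
patterns <- c(V = "Telephone line rental",
              B = "Fixed broadband",
              M = "Mobile phone",
              T = "TV|sport")

# Hypothetical helper: returns one comma-separated code string per input element.
code_services <- function(x) {
  # Remove every "NET: ..." entry (and the pipe in front of it, if any).
  x <- gsub("\\|?NET:[^|]*", "", x)
  # One grepl() call per code; reshape so there is one row per input string
  # and one column per code, even when x has length 1.
  hits <- vapply(patterns, grepl, logical(length(x)), x = x)
  hits <- matrix(hits, nrow = length(x), dimnames = list(NULL, names(patterns)))
  # Keep only the letters whose pattern matched, in V, B, M, T order.
  apply(hits, 1, function(h) paste(names(patterns)[h], collapse = ", "))
}

rows <- c("Telephone line rental|Fixed broadband|Mobile phone|NET: No home phone calls|NET: Internet access",
          "Fixed broadband|Mobile phone|Paid for TV service|NET: No home phone calls|NET: No home phone calls or line rental|NET: Internet access")
result <- code_services(rows)
print(result)
# [1] "V, B, M" "B, M, T"
```

With the data frame from the answer this could be applied as df$new_col <- code_services(df$V1). Because the order lives in the `patterns` vector, changing the required order or adding a fifth code only means editing that one vector.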