I have a .csv file in which I created a custom dictionary for replacing words, and I loaded it into R as a data frame, for example:
word        replacement
Hello       Hi
Good        Best
Good Night  Sweet Morning
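What I want to do is scan the text in a .csv file cell by cell and, wherever a cell contains any word or phrase from my custom dictionary, replace that word or phrase with its replacement.
Please help me with the code; I am new to R.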
Answer 0 (score: 0)
# Dictionary data frame
dict <- data.frame(original = c("word", "Hello", "Good", "Good Night"),
                   replace  = c("replacement", "Hi", "Best", "Sweet Morning"),
                   stringsAsFactors = FALSE)
dict
#     original       replace
# 1       word   replacement
# 2      Hello            Hi
# 3       Good          Best
# 4 Good Night Sweet Morning
# Data frame where the words need to be replaced
df <- data.frame(col1 = c("Hello", "World", "Good", "coffee"),
                 col2 = c("Good Night", "To all my friends", "I have no", "word"),
                 stringsAsFactors = FALSE)
df
#     col1              col2
# 1  Hello        Good Night
# 2  World To all my friends
# 3   Good         I have no
# 4 coffee              word
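# Visit every cell (MARGIN = c(1, 2)) and replace it when its entire content
# exactly matches a dictionary entry; otherwise keep the cell unchanged.
# Note: apply() returns a character matrix, not a data frame.
apply(df,
      MARGIN = c(1, 2),
      FUN = function(x) {
        pos <- which(dict$original == x)
        if (length(pos) > 0) dict$replace[pos[1]] else x
      })
#      col1     col2
# [1,] "Hi"     "Sweet Morning"
# [2,] "World"  "To all my friends"
# [3,] "Best"   "I have no"
# [4,] "coffee" "replacement"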
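The code above only replaces a cell when the whole cell equals a dictionary entry. The question also asks about words or phrases contained inside longer text, so here is a minimal sketch of one way that might be handled in base R with `gsub`. The helper name `replace_phrases`, the data frame `df_text`, and its sample sentences are made up for illustration, and the dictionary here assumes the "word / replacement" header line is not itself an entry.

```r
# Dictionary without the header row (an assumption, see the note above).
dict <- data.frame(original = c("Hello", "Good", "Good Night"),
                   replace  = c("Hi", "Best", "Sweet Morning"),
                   stringsAsFactors = FALSE)

# Hypothetical sample text where the phrases appear inside longer cells.
df_text <- data.frame(col1 = c("Hello World", "Good coffee"),
                      col2 = c("Good Night to all my friends", "I have no word"),
                      stringsAsFactors = FALSE)

# Hypothetical helper: replace every dictionary phrase found in a character vector.
replace_phrases <- function(text, dict) {
  # Handle longer phrases first so "Good Night" is replaced before "Good".
  ord <- order(nchar(dict$original), decreasing = TRUE)
  for (i in ord) {
    # fixed = TRUE treats the phrase as literal text, not as a regular expression.
    text <- gsub(dict$original[i], dict$replace[i], text, fixed = TRUE)
  }
  text
}

# Apply the helper column by column; df_text[] keeps the data frame structure.
df_text[] <- lapply(df_text, replace_phrases, dict = dict)
df_text
```

Two caveats with this sketch: literal matching also hits substrings of longer words (for example "Goodness" would become "Bestness"), and because the replacements run one after another, a replacement that itself contains a dictionary phrase would be replaced again. If either matters for your data, a regular expression with word boundaries may be a better fit.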