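I am trying to mimic Excel's INDEX and MATCH formula in R using which() inside a for loop, but the result is NA.
In Excel, the formula with INDEX and MATCH puts the data in the right order, but in R it does not work well. Here is an Excel-like sample of the data:
I can join words from the TRUNK column based on the numbers in the HEAD column.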
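The formula in the HEAD column uses INDEX to pull data from TRUNK and, based on its number [2] (i.e. [są]), matches the HEAD word with [balony]. In other words, the formula builds two-word phrases from the table:
=INDEX(PARSER!B:B;(MATCH(PARSER!G3; PARSER!A:A; 0)))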
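For readers more familiar with R than Excel, the formula's logic can be sketched with base R's match(), which plays the role of MATCH with exact matching (the trailing 0), while vector indexing plays the role of INDEX. The vectors below are a hypothetical three-row excerpt, not the actual PARSER sheet, and the names nr, trunk and head_id are chosen only for this illustration.

```r
# Hypothetical excerpt standing in for the PARSER sheet (assumption, not the real data)
nr      <- c(1, 2, 3)                      # column A: row identifiers
trunk   <- c("balony", "są", "swobodne")   # column B: words
head_id <- c(2, 4, 2)                      # column G: identifier of each word's head

# MATCH(G; A:A; 0): position of each head identifier within nr (NA if absent)
pos <- match(head_id, nr)

# INDEX(B:B; pos): the word stored at that position
head_word <- trunk[pos]

print(head_word)
```

An identifier with no matching row (4 in this excerpt) yields NA instead of an error, which is roughly what Excel reports as #N/A.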
Now in R, I can read the data, create data.frames, and write a for loop to fill a new table with the head and main words, but it does not work well.
graf <- read.csv("graf.txt", sep = "\t", quote = "\t", header = FALSE)
names(graf)[1] = "nr"
names(graf)[2] = "trunk"
names(graf)[3] = "lemat"
names(graf)[4] = "head"
nrheaddf = cbind.data.frame(graf$head,as.character(graf$trunk))
names(nrheaddf)[1] = "HEAD"
names(nrheaddf)[2] = "TRUNK"
nrtrunkdf = cbind.data.frame(graf$nr,as.character(graf$trunk))
names(nrtrunkdf)[1] = "NR"
names(nrtrunkdf)[2] = "TRUNK"
as.character(nrheaddf$TRUNK[6]) #BALONY
which(nrtrunkdf$NR == as.character(nrheaddf$HEAD[6])) #7
nrtrunkdf$TRUNK[which(nrtrunkdf$NR == as.character(nrheaddf$HEAD[6]))[1]] #są
grafi <- as.numeric(count(graf))
JOINER <- data.frame(matrix(nrow = grafi, ncol = 2))
joinv <- list()
for (i in grafi) {
joinv <- nrtrunkdf$V2[which(nrheaddf$V1 == nrtrunkdf$V1[i])][1]
JOINER[i] <- joinv
}
Error in `[<-.data.frame`(`*tmp*`, i, value = NULL) :
  new columns would leave holes after existing columns
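One possible reading of this error, offered as a sketch rather than a definitive diagnosis: after the renaming, nrtrunkdf has no V1 or V2 columns, so nrtrunkdf$V2 is NULL; for (i in grafi) runs only once, with i equal to the row count; and JOINER[i] addresses column i rather than row i. A loop corrected along those lines might look like the following, which uses a hypothetical three-row stand-in for graf.txt.

```r
# Hypothetical stand-in for graf.txt (assumption: columns nr, trunk, head)
graf <- data.frame(
  nr    = c(1, 2, 3),
  trunk = c("balony", "są", "swobodne"),
  head  = c(2, 4, 2),
  stringsAsFactors = FALSE
)

n <- nrow(graf)                       # row count without needing dplyr::count()
JOINER <- data.frame(HEAD  = rep(NA_character_, n),
                     TRUNK = graf$trunk,
                     stringsAsFactors = FALSE)

for (i in seq_len(n)) {               # visit every row, not only the last one
  # first row whose nr equals this row's head; NA when there is none
  hit <- which(graf$nr == graf$head[i])[1]
  JOINER$HEAD[i] <- graf$trunk[hit]   # assign to one cell, not to column i
}

print(JOINER)
```

Rows whose head number has no counterpart in nr keep NA in the HEAD column.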
head(WSD$Lemma)
"ktoś" "go" "krokodyl" "myśleć" "barwić"
"szkło"
head(KEYWORDS$V1)
"ktoś go" "krokodyl się"
"ja myślę" "barwione szkło"
"mi się" "nieokreślone kształty"
WSDKEY <- as.data.frame(cbind.na(WSD$Lemma,KEYWORDS$V1), stringsAsFactors = FALSE)
But this solution does not work:
get_head <- function(i){
if (!(i %in% WSDKEY$V2))
return(NA)
else
head <- WSDKEY[WSDKEY$V2 == i, 'V1']
return(as.character(head))
}
Answer 0 (score: 0)
Is this what you mean?
library(dplyr)
# The used Data
my_data <- read.table(text = "nr TRUNK lemat HEAD
1 balony balon 2
2 są być 4
3 swobodne swobodny 2
4 ale ale 14
5 w w 4
6 ramach rama 5
7 długości długość 6
8 sznurka sznurek 7
9 [ [ 14
10 '#' '#' 9", header = TRUE)
my_data
my_data %>%
mutate(HEAD = my_data[HEAD, 'TRUNK']) %>% # replace the numbers with the values from TRUNK
mutate(joined_text = paste(HEAD, TRUNK)) %>% # paste the text together in a new column
select(HEAD, TRUNK, joined_text) # select the needed columns
Then I get:
# HEAD TRUNK joined_text
# są balony są balony
# ale są ale są
# są swobodne są swobodne
# <NA> ale NA ale
# ale w ale w
# w ramach w ramach
# ramach długości ramach długości
# długości sznurka długości sznurka
# <NA> [ NA [
# [ # [ #
Update:
If you do not want to rely on row indices, here is another approach that works:
# define a function to find and extract the right HEAD
get_head <- function(i){
if (!(i %in% my_data$nr)) {
return(NA) # no row has this nr
}
# take the TRUNK word from the row whose nr equals i
head <- my_data[my_data$nr == i, 'TRUNK']
return(as.character(head))
}
# replace with the new values
my_data$HEAD <- sapply(my_data$HEAD, get_head)
# now concatenate the text and select the columns you want
my_data %>%
mutate(joined_text = paste(HEAD, TRUNK)) %>% # paste the text together in a new column
select(HEAD, TRUNK, joined_text)
This approach also works if you want to match strings rather than numbers.
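As an illustration of matching on strings, here is a minimal sketch with a hypothetical lookup table keyed by lemma. The names lookup, key, value and get_value, as well as the sample rows, are assumptions made for this example and are not taken from the question's files.

```r
# Hypothetical lookup table keyed by strings (assumed names and rows)
lookup <- data.frame(
  key   = c("balon", "być", "swobodny"),
  value = c("balony", "są", "swobodne"),
  stringsAsFactors = FALSE
)

# Same idea as get_head, but comparing character keys
get_value <- function(k) {
  if (!(k %in% lookup$key)) {
    return(NA_character_)             # no row with this key
  }
  lookup$value[lookup$key == k][1]    # first exact match only
}

queries <- c("być", "rama", "balon")
result <- vapply(queries, get_value, character(1), USE.NAMES = FALSE)

print(result)
```

Using vapply with character(1) instead of sapply is a design choice here: it guarantees a character vector even if a key were to match nothing or the input were empty.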