File 1

ABILITY 3

DEVELOPS 5

ENVIRONMENTAL 4

File 2

ABILITY 5

DEVELOPS 7

ENVIRONMENTAL 1

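So basically I want to fill the max occurrence vector with the maximum values from files 1 and 2. For example, in the max occurrence vector, the max occurrence for ENVIRONMENTAL should change to 4 (the maximum after scanning file 1 and file 2). Here is my code: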

# Find the largest frequency of each keyword by searching the keyword sets

# Start by defining and initializing the max occurrence vector
keywordslength <- length(keywords)
keywordmax <- data.frame(keywords)
keywordmax$Max <- 0

# Read the keyword set; V1 holds the keyword, V2 its frequency
ksearch1 <- read.csv("set1.csv", header = FALSE, sep = ",")
ksearch1$V1 <- toupper(ksearch1$V1)

# Now scan ksearch1 for each keyword in turn
for (i in seq_len(keywordslength)) {
    # Establish the keyword
    testkey <- keywords[i]
    testmax <- 0

    # Scan ksearch1 and keep the frequency recorded for this keyword
    for (j in seq_len(nrow(ksearch1))) {
        if (ksearch1[j, 1] == testkey) {
            testmax <- ksearch1[j, 2]
        }
    }

    # Update the stored maximum only when this file's count is larger
    if (keywordmax$Max[i] < testmax) {
        keywordmax$Max[i] <- testmax
    }
}

Answer 1

This should work.

Create the keyword list and the two files:

keywords <- as.data.frame(c("Ability","Develops","Environmental"))
max_occur <- data.frame(keywords,c(0,0,0))
file1 <- data.frame(keywords,c(3,5,4))
file2 <- data.frame(keywords,c(5,7,1))

Rename the columns properly:

colnames(file1) <- c("V1","V2")
colnames(file2) <- c("V1","V2")
colnames(keywords) <- c("V1")
colnames(max_occur) <- c("V1","V2")

Sort the data frames by keyword:

# order() returns the row positions in sorted order; sort() would return the values themselves
keywords <- as.data.frame(keywords[order(keywords$V1, decreasing = FALSE),])
max_occur <- as.data.frame(max_occur[order(max_occur$V1, decreasing = FALSE),])
file1 <- file1[order(file1$V1, decreasing = FALSE),]
file2 <- file2[order(file2$V1, decreasing = FALSE),]

Rename the columns again, because the as.data.frame() conversion above changes them:

colnames(keywords) <- c("V1")
colnames(max_occur) <- c("V1","V2")

Find the maximum and store it in max_occur:

for(i in 1:length(keywords$V1)){
  max_occur$V2[i] <- max(max_occur$V2[i],file1$V2[i],file2$V2[i])  
}
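
The same element-wise maximum can also be taken without an explicit loop. A minimal sketch, assuming the data frames list the same keywords in the same row order (as they do after the sort step); the sample data is re-created here, with a helper vector `kw` that is not in the original, so the block runs on its own:

```r
# Same sample counts as above, already in matching row order
kw <- c("ABILITY", "DEVELOPS", "ENVIRONMENTAL")
max_occur <- data.frame(V1 = kw, V2 = c(0, 0, 0))
file1 <- data.frame(V1 = kw, V2 = c(3, 5, 4))
file2 <- data.frame(V1 = kw, V2 = c(5, 7, 1))

# pmax() takes the parallel (element-wise) maximum of its arguments
max_occur$V2 <- pmax(max_occur$V2, file1$V2, file2$V2)
print(max_occur)
```

This only works when row i refers to the same keyword in every data frame, which is why the sort step matters.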

Let me know if the keywords are not present in every file, and I will change the code slightly. From what you posted, they all appear in each file.
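
If a keyword is missing from some file, one possible adjustment is to look each keyword up by name with match() instead of relying on row order. This is a minimal sketch, not the code from the answer above: `count_for` is a hypothetical helper, and the `file2` shown here is invented sample data in which DEVELOPS is absent. A missing keyword is assumed to count as 0.

```r
keywords <- c("ABILITY", "DEVELOPS", "ENVIRONMENTAL")

# Hypothetical sample data: file2 has no DEVELOPS row and uses a different order
file1 <- data.frame(V1 = c("ABILITY", "DEVELOPS", "ENVIRONMENTAL"), V2 = c(3, 5, 4))
file2 <- data.frame(V1 = c("ENVIRONMENTAL", "ABILITY"), V2 = c(1, 5))

# Look up each keyword's count in one file; an absent keyword counts as 0
count_for <- function(f, keys) {
  counts <- f$V2[match(keys, f$V1)]
  counts[is.na(counts)] <- 0
  counts
}

# Element-wise maximum across the files, aligned by keyword
max_occur <- data.frame(
  V1 = keywords,
  V2 = pmax(count_for(file1, keywords), count_for(file2, keywords))
)
print(max_occur)
```

Because the lookup is by keyword, the files do not need to be sorted first.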
