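I am learning how to use for loops in R, but this task seems a bit beyond what I can manage.
I have a number of files named in the format "collar41361_41365.0.x.csv" and want to run a series of calculations on each one, with the results added as new columns in the same file.
So far I have only done this one file at a time, but I would like to automate it for all of the "collar41361_41365.0.x.csv" files.
Below is a small sample of what a "collar41361_41365.0.x.csv" file looks like: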
> collaraccuracy<-fread("collar41361_41365.0.8.csv",stringsAsFactors = F)
> print(collaraccuracy)
   V1 observed predicted probability results1 results2       results
1:  1  Head-up Vigilance   0.2727273 NEGATIVE     TRUE TRUE_NEGATIVE
2:  2  Head-up   Grazing   0.7272727 NEGATIVE     TRUE TRUE_NEGATIVE
3:  3  Head-up   Grazing   0.7272727 NEGATIVE     TRUE TRUE_NEGATIVE
4:  4  Head-up   Grazing   0.5454545 NEGATIVE     TRUE TRUE_NEGATIVE
5:  5  Head-up   Grazing   0.7272727 NEGATIVE     TRUE TRUE_NEGATIVE
I need to count the total number of "TRUE_POSITIVES" (TP), "FALSE_POSITIVES" (FP), "TRUE_NEGATIVES" (TN) and "FALSE_NEGATIVES" (FN), and then compute a series of measures, such as:
1) accuracy = (tn + tp)/(tn + tp + fn + fp)
2) precision = tp/(tp + fp)
3) recall = tp/(tp + fn)
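As a quick check of the three formulas, here is a minimal sketch that wraps them in a function. The name `classification_metrics` and the counts passed to it are made up for illustration and do not come from the collar files.

```r
# Hypothetical helper: the function name and argument order are illustrative only.
classification_metrics <- function(tp, fp, tn, fn) {
  c(
    accuracy  = (tn + tp) / (tn + tp + fn + fp),
    precision = tp / (tp + fp),
    recall    = tp / (tp + fn)
  )
}

# Made-up counts, used only to exercise the formulas.
m <- classification_metrics(tp = 8, fp = 2, tn = 85, fn = 5)
print(m)

# When a file contains only TRUE_NEGATIVE rows (as in the five sample rows),
# precision and recall are 0/0, which R reports as NaN.
m0 <- classification_metrics(tp = 0, fp = 0, tn = 5, fn = 0)
print(m0)
```

The `NaN` case is worth keeping in mind: a file with no positive predictions will not produce an error, only `NaN` in the precision column.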
This is how I do it when analysing a single file:
collaraccuracy<-fread("collar41361_41365.0.8.csv",stringsAsFactors = F)
tp<-length(grep("TRUE_POSITIVE", collaraccuracy$results))
fp<-length(grep("FALSE_POSITIVE", collaraccuracy$results))
tn<-length(grep("TRUE_NEGATIVE", collaraccuracy$results))
fn<-length(grep("FALSE_NEGATIVE", collaraccuracy$results))
accuracy = (tn+tp)/(tn+tp+fn+fp)
accuracy
precision = tp/(tp+fp)
precision
recall = tp/(tp+fn)
recall
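One possible alternative for the counting step is exact matching with `==` instead of `grep()`, which matches substrings. The sketch below runs on a made-up vector standing in for `collaraccuracy$results`, and it assumes the column only ever holds the four labels shown.

```r
# Made-up results vector standing in for collaraccuracy$results.
results <- c("TRUE_NEGATIVE", "TRUE_NEGATIVE", "FALSE_POSITIVE",
             "TRUE_POSITIVE", "FALSE_NEGATIVE", "TRUE_NEGATIVE")

# sum() over a logical vector counts the TRUE entries,
# so each line counts the rows that equal one label exactly.
tp <- sum(results == "TRUE_POSITIVE")
fp <- sum(results == "FALSE_POSITIVE")
tn <- sum(results == "TRUE_NEGATIVE")
fn <- sum(results == "FALSE_NEGATIVE")

print(c(tp = tp, fp = fp, tn = tn, fn = fn))
```

For these four labels both approaches give the same counts, since none of the labels is a substring of another; exact matching just makes that independent of how the labels happen to be spelled.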
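I would like to create a for loop that will:
1) Read all files named in the format "collar41361_41365.0.x.csv" and calculate the values of accuracy, precision and recall for each file.
2) Create three columns titled "accuracy", "precision" and "recall" in each file and paste the results of the formulas underneath.
Any help is sincerely appreciated!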
Answer 0 (score: 1)
Something like this should work. I am not sure I fully understood the expected output.
# setwd('') # set this to the folder where your csv files are
# change 'file.csv' to 'collar41361_41365.0' to match your files
# (pattern is a regular expression, so an unescaped '.' matches any character)
f <- list.files(path = getwd(), full.names = FALSE, pattern = 'file.csv')
dfs <- list()
for (i in seq_along(f)) {  # seq_along() runs zero times if no files match
  collaraccuracy <- data.table::fread(f[i], stringsAsFactors = FALSE)
  # count each outcome type in the results column
  tp <- length(grep("TRUE_POSITIVE", collaraccuracy$results))
  fp <- length(grep("FALSE_POSITIVE", collaraccuracy$results))
  tn <- length(grep("TRUE_NEGATIVE", collaraccuracy$results))
  fn <- length(grep("FALSE_NEGATIVE", collaraccuracy$results))
  # append the results as new columns (each single value is repeated on every row)
  collaraccuracy$accuracy <- (tn + tp) / (tn + tp + fn + fp)
  collaraccuracy$precision <- tp / (tp + fp)
  collaraccuracy$recall <- tp / (tp + fn)
  # you may want to write them to a different directory
  data.table::fwrite(collaraccuracy, file = paste0('new', f[i]))
}
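The same idea can also be sketched as a function applied to each file, with the output going to a separate folder so the originals are never overwritten. This is only an illustration under stated assumptions: the helper name `add_metrics`, the temporary folders, the second file name and the demo data are all made up, and base R `read.csv()`/`write.csv()` stand in for `fread()`/`fwrite()` so the sketch runs without extra packages.

```r
# Hypothetical helper: read one file, add the three metric columns,
# and write a copy with the same file name into out_dir.
add_metrics <- function(path, out_dir) {
  d <- read.csv(path, stringsAsFactors = FALSE)
  tp <- sum(d$results == "TRUE_POSITIVE")
  fp <- sum(d$results == "FALSE_POSITIVE")
  tn <- sum(d$results == "TRUE_NEGATIVE")
  fn <- sum(d$results == "FALSE_NEGATIVE")
  d$accuracy  <- (tn + tp) / (tn + tp + fn + fp)
  d$precision <- tp / (tp + fp)  # NaN when there are no positive predictions
  d$recall    <- tp / (tp + fn)  # NaN when there are no actual positives
  out_path <- file.path(out_dir, basename(path))
  write.csv(d, out_path, row.names = FALSE)
  out_path
}

# Demo on made-up data in temporary folders, so no real file is touched.
in_dir  <- file.path(tempdir(), "collar_in")
out_dir <- file.path(tempdir(), "collar_out")
dir.create(in_dir, showWarnings = FALSE)
dir.create(out_dir, showWarnings = FALSE)
demo <- data.frame(results = c("TRUE_POSITIVE", "FALSE_POSITIVE",
                               "TRUE_NEGATIVE", "TRUE_NEGATIVE"))
write.csv(demo, file.path(in_dir, "collar41361_41365.0.8.csv"), row.names = FALSE)
write.csv(demo, file.path(in_dir, "collar41361_41365.0.9.csv"), row.names = FALSE)

# Escaped dots make the regular expression match literal '.' characters only.
files <- list.files(in_dir, pattern = "^collar41361_41365\\.0\\..*\\.csv$",
                    full.names = TRUE)
written <- vapply(files, add_metrics, character(1), out_dir = out_dir)
print(read.csv(written[[1]]))
```

To use it on real data, `in_dir` would point at the folder holding the collar files and the demo lines that create sample CSVs would be dropped. Keeping the per-file work in one function makes it easy to test on a single file before running it over the whole folder.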