Question

I have the following model summary in a txt file (T1.txt):

=== Summary ===

Correctly Classified Instances         423               88.6792 %
Incorrectly Classified Instances        54               11.3208 %
Kappa statistic                          0.6766
Mean absolute error                      0.0854
Root mean squared error                  0.2656
Relative absolute error                 38.4098 %
Root relative squared error             79.9279 %
Coverage of cases (0.95 level)          91.6143 %
Mean rel. region size (0.95 level)      36.1985 %
Total Number of Instances              477     

=== Confusion Matrix ===

   a   b   c   <-- classified as
 357  20   7 |   a = 1
  12  37  11 |   b = 2
   3   1  29 |   c = 3

I want to extract the final matrix into a data frame (df1):

> df1
       a   b   c   
     357  20   7 
      12  37  11 
       3   1  29

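Note that the model behind the txt file no longer exists (I only have the txt file). In addition, the matrix size can vary from one file to another, and the number of rows need not equal the number of columns.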

Answer 1

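We can read the file with readLines, use grep to find the line containing 'Confusion Matrix', subset the lines that follow it, strip the trailing annotations ("<-- classified as" and "| a = 1") with gsub, and parse what is left with read.table: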

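# Read the whole file as a character vector, one element per line
lines <- readLines('T1.txt', warn=FALSE)
# Index of the "=== Confusion Matrix ===" title line
i1 <- grep('Confusion Matrix', lines)
# Skip the title and the blank line after it, drop everything from "<-" or "|" onward
read.table(text=gsub('(<-|\\|).*', '', 
        lines[(i1+2):length(lines)]), header=TRUE)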
#    a  b  c
#1 357 20  7
#2  12 37 11
#3   3  1 29
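If the file might contain further sections after the matrix, the same idea can be wrapped in a small function that stops at the first blank line following the matrix. This is only a sketch: the name extract_confusion is made up for illustration, and it assumes the matrix block is a contiguous run of non-blank lines after the title and that the header line has as many labels as each row has counts. The sample lines are copied from the question so the example runs without the file.

```r
# Hypothetical helper (name chosen for illustration): pull the confusion
# matrix out of Weka-style summary text given as a character vector of lines.
extract_confusion <- function(lines) {
  # Locate the section title; expect exactly one, and not on the last line
  start <- grep("=== Confusion Matrix ===", lines, fixed = TRUE)
  stopifnot(length(start) == 1, start < length(lines))

  rest <- lines[(start + 1):length(lines)]
  nonblank <- nzchar(trimws(rest))
  stopifnot(any(nonblank))

  # The block runs from the first non-blank line to the line before the
  # next blank line (or to the end of the input if there is none)
  first <- which(nonblank)[1]
  blanks_after <- which(!nonblank & seq_along(rest) > first)
  last <- if (length(blanks_after) > 0) blanks_after[1] - 1 else length(rest)
  block <- rest[first:last]

  # Remove "<-- classified as" from the header and "| a = 1" from the rows
  block <- sub("(<--|\\|).*$", "", block)
  read.table(text = block, header = TRUE)
}

# Sample input taken from the question; with a real file use
# extract_confusion(readLines("T1.txt", warn = FALSE))
sample_lines <- c(
  "=== Summary ===",
  "",
  "Total Number of Instances              477     ",
  "",
  "=== Confusion Matrix ===",
  "",
  "   a   b   c   <-- classified as",
  " 357  20   7 |   a = 1",
  "  12  37  11 |   b = 2",
  "   3   1  29 |   c = 3",
  ""
)

df1 <- extract_confusion(sample_lines)
print(df1)
```

Because nothing in the function depends on the number of rows or columns, it should cope with matrices of different sizes as long as the layout assumptions above hold.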
