I am using the R Learner node in KNIME. I want to discretize a matrix that looks like this:
> my_matrix= as(knime.in,"matrix");
> dput(head(my_matrix, 5))
structure(c("KS", "OH", "NJ", "OH", "OK", "128", "107", "137",
" 84", " 75", "415", "415", "415", "408", "415", "No", "No",
"No", "Yes", "Yes", "Yes", "Yes", "No", "No", "No", "25", "26",
" 0", " 0", " 0", "265.1", "161.6", "243.4", "299.4", "166.7",
"110", "123", "114", " 71", "113", "45.07", "27.47", "41.38",
"50.90", "28.34", "197.4", "195.5", "121.2", " 61.9", "148.3",
" 99", "103", "110", " 88", "122", "16.78", "16.62", "10.30",
" 5.26", "12.61", "244.7", "254.4", "162.6", "196.9", "186.9",
" 91", "103", "104", " 89", "121", "11.01", "11.45", " 7.32",
" 8.86", " 8.41", "10.0", "13.7", "12.2", " 6.6", "10.1", " 3",
" 3", " 5", " 7", " 3", "2.70", "3.70", "3.29", "1.78", "2.73",
"1", "1", "0", "2", "3", "False", "False", "False", "False",
"False"), .Dim = c(5L, 20L), .Dimnames = list(c("Row0", "Row1",
"Row2", "Row3", "Row4"), c("State", "Account length", "Area code",
"International plan", "Voice mail plan", "Number vmail messages",
"Total day minutes", "Total day calls", "Total day charge", "Total eve minutes",
"Total eve calls", "Total eve charge", "Total night minutes",
"Total night calls", "Total night charge", "Total intl minutes",
"Total intl calls", "Total intl charge", "Customer service calls",
"Churn")))
I use the following code to discretize the matrix:
require(arules)
#require(arulesViz)
my_matrix= as(knime.in,"matrix");
my_rows= nrow(my_matrix);
my_cols= ncol(my_matrix);
#discretize(x, method="interval", categories = 3, labels = NULL,
# ordered=FALSE, onlycuts=FALSE, ...)
typeof(my_matrix)
vector = my_matrix[,2]
my_matrix[,2] = discretize(vector, method="interval", categories = 3, labels=c("length0","length1","length2"))
my_matrix[,3] = ...
etc...
At this line:
my_matrix[,2] = discretize(vector, method="interval", categories = 3, labels=c("length0","length1","length2"))
I get the following error:
Error in seq.default(from = min(x, na.rm = TRUE), to = max(x, na.rm = TRUE), : 'from' cannot be NA, NaN or infinite
If I add sum(is.na(vector)) like this:
vector = my_matrix[,2]
sum(is.na(vector))
my_matrix[,2] = discretize(vector, method="interval", categories = 3, labels=c("length0","length1","length2"))
I get:
> sum(is.na(vector))
[1] 0
So there are no NA elements in the vector. However, typeof(my_matrix) is "character". If I print the vector, I get the following:
> vector = my_matrix[,2]
> sum(is.na(vector))
[1] 0
> head(vector, 20)
Row0 Row1 Row2 Row3 Row4 Row5 Row6 Row7 Row8 Row9 Row10 Row11 Row12
"128" "107" "137" " 84" " 75" "118" "121" "147" "117" "141" " 65" " 74" "168"
Row13 Row14 Row15 Row16 Row17 Row18 Row19
" 95" " 62" "161" " 85" " 93" " 76" " 73"
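Answer 0 (score: 0)
The problem is that your vector consists of character strings, not numbers. Ideally, you would fix this before the data reaches R; KNIME nodes for this kind of conversion do exist.
But you can also replace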
vector = my_matrix[,2]
with
vector = as.numeric(my_matrix[,2])
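To see why the NA check in the question passes while the values are still unusable as numbers, here is a minimal sketch on a toy vector copied from the first five rows of the dput() output. It assumes base R only: cut() with breaks = 3 is used purely as a stand-in for equal-width binning and is not the arules discretize() call from the question. The names numeric_vector, binned and result are illustrative.

```r
# Toy stand-in for one column of the character matrix from the question
# (values copied from the dput() output, including the padded " 84").
vector <- c(Row0 = "128", Row1 = "107", Row2 = "137", Row3 = " 84", Row4 = " 75")

typeof(vector)        # "character": the values are strings, not numbers
sum(is.na(vector))    # 0: strings are not NA, so this check cannot reveal the problem

# as.numeric() parses the strings and tolerates the leading spaces.
numeric_vector <- as.numeric(vector)
stopifnot(!anyNA(numeric_vector))   # stops here if some string was not a number

# Assumption: base R cut() stands in for equal-width binning into 3 intervals;
# it is not the arules discretize() call used in the question.
binned <- cut(numeric_vector, breaks = 3,
              labels = c("length0", "length1", "length2"))

# as.character() so that the label strings are what gets stored.
result <- as.character(binned)
print(result)
```

If the conversion ever produces NA for some rows, that would suggest the column contains text that is not a number, and it is probably worth inspecting those rows before binning.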