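Factors are a kind of vector in R whose elements are categorical values that can also be ordered. The values are stored internally as integers with labeled levels.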
# In R:
> x = c( "high" , "medium" , "low" , "high" , "medium" )
> xf = factor( x )
> xf
[1] high medium low high medium
Levels: high low medium
> as.numeric(xf)
[1] 1 3 2 1 3
> xfo = factor( x , levels=c("low","medium","high") , ordered=TRUE )
> xfo
[1] high medium low high medium
Levels: low < medium < high
> as.numeric(xfo)
[1] 3 2 1 3 2
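I checked the Julia documentation and John Myles White's Comparing Julia and R’s Vocabularies (which may be obsolete), and there seems to be no such concept as a factor.
Are factors used that often, and what is Julia's way of handling this problem?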
Answer 0 (score: 3)
The PooledDataArray in the DataFrames package is one possible counterpart to R's factors. The following implements your example with it:
julia> using DataFrames # install with Pkg.add("DataFrames") if required
julia> x = ["high" , "medium" , "low" , "high" , "medium"];
julia> xf = PooledDataArray(x)
5-element DataArrays.PooledDataArray{ASCIIString,UInt32,1}:
"high"
"medium"
"low"
"high"
"medium"
julia> xf.refs
5-element Array{UInt32,1}:
0x00000001
0x00000003
0x00000002
0x00000001
0x00000003
julia> xfo = PooledDataArray(x,["low","medium","high"]);
julia> xfo.refs
5-element Array{UInt32,1}:
0x00000003
0x00000002
0x00000001
0x00000003
0x00000002
Answer 1 (score: 0)
The {{1}} from CategoricalArrays.jl is similar to factors.
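To make the comparison with the R example concrete, here is a minimal sketch of the same data in CategoricalArrays.jl. It assumes a reasonably recent version of the package, in which `categorical` accepts the `levels` and `ordered` keywords and `levelcode` is available. The variable names simply mirror the ones used in the question.

```julia
using CategoricalArrays  # install with Pkg.add("CategoricalArrays") if required

x = ["high", "medium", "low", "high", "medium"]

# Unordered categorical vector; with no explicit levels given,
# the levels default to the sorted unique values.
xf = categorical(x)
xf_levels = levels(xf)      # the level labels
xf_codes = levelcode.(xf)   # integer position of each element's level

# Ordered categorical vector with an explicit level order,
# analogous to factor(x, levels=..., ordered=TRUE) in R.
xfo = categorical(x, levels=["low", "medium", "high"], ordered=true)
xfo_codes = levelcode.(xfo)

# Elements of an ordered categorical vector compare by level position.
high_gt_low = xfo[1] > xfo[3]
```

Here `levelcode.(xf)` plays the role of `as.numeric(xf)` in R, and the ordered variant should allow comparisons such as `xfo[1] > xfo[3]` in the same way an ordered factor does.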