What is Julia's solution to R's factor concept?

Time: 2016-01-03 01:06:34

Tags: julia

  

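Factors are a type of vector in R whose elements are categorical values that can also be ordered. The values are stored internally as integers with labeled levels.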

# In R:
> x = c( "high" , "medium" , "low" , "high" , "medium" )

> xf = factor( x )
> xf
[1] high     medium low     high     medium
Levels: high low medium

> as.numeric(xf)
[1] 1 3 2 1 3

> xfo = factor( x , levels=c("low","medium","high") , ordered=TRUE )
> xfo
[1] high     medium low     high     medium
Levels: low < medium < high

> as.numeric(xfo)
[1] 3 2 1 3 2

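I checked the Julia documentation and John Myles White's Comparing Julia and R's Vocabularies (which may be obsolete), and there seems to be no concept like factor. Are factors used that often, and what is Julia's solution to this problem?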

2 answers:

Answer 0 (score: 3)

PooledDataArray from the DataFrames package is one possible counterpart to R's factors. The following implements your example with it:

julia> using DataFrames # install with Pkg.add("DataFrames") if required

julia> x = ["high" , "medium" , "low" , "high" , "medium"];

julia> xf = PooledDataArray(x)
5-element DataArrays.PooledDataArray{ASCIIString,UInt32,1}:
 "high"  
 "medium"
 "low"   
 "high"  
 "medium"

julia> xf.refs
5-element Array{UInt32,1}:
 0x00000001
 0x00000003
 0x00000002
 0x00000001
 0x00000003

julia> xfo = PooledDataArray(x,["low","medium","high"]);

julia> xfo.refs
5-element Array{UInt32,1}:
 0x00000003
 0x00000002
 0x00000001
 0x00000003
 0x00000002

Answer 1 (score: 0)

{{1}} from CategoricalArrays.jl is similar to R's factors.
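
As a rough sketch only, here is how the question's example might look with CategoricalArrays.jl. It is not taken from the answer itself, and it assumes a reasonably recent version of the package that exports `categorical`, `levels`, `levels!`, `ordered!` and `levelcode`:

```julia
using CategoricalArrays  # install with Pkg.add("CategoricalArrays") if required

x = ["high", "medium", "low", "high", "medium"]

# Unordered categorical array; levels default to sorted order,
# comparable to factor(x) in R.
xf = categorical(x)
levels(xf)         # level labels: "high", "low", "medium"
levelcode.(xf)     # integer codes: 1 3 2 1 3, like as.numeric(xf) in R

# Explicit level order plus ordering, comparable to
# factor(x, levels=c("low","medium","high"), ordered=TRUE) in R.
xfo = categorical(x)
levels!(xfo, ["low", "medium", "high"])  # set the level order
ordered!(xfo, true)                      # mark the array as ordered
levelcode.(xfo)    # integer codes: 3 2 1 3 2, like as.numeric(xfo) in R

# With an ordered array, comparisons follow the level order.
xfo[1] > xfo[2]    # "high" > "medium"
```

Setting the level order with `levels!` and then calling `ordered!` keeps the two steps separate, which mirrors the `levels=` and `ordered=` arguments in the R call.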