Vectorized concatenation of two or more columns of a DataFrame in Julia

Time: 2018-01-23 00:40:43

Tags: dataframe julia

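I have a Julia DataFrame with several String and Int columns. I would like to paste them together horizontally, in a vectorized way, to produce a single column. In R, I would use paste. Is this possible in Julia?

The desired output is not what an hcat or vcat operation would produce. The goal is to create a new string column whose rows are "x1[i]:x2[i]", where x1[i] and x2[i] are the corresponding row elements of columns x1 and x2 of the DataFrame object.

Julia example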

# tested in Julia v0.5.0 and v0.6.2
# example data frame
using DataFrames
y = DataFrame(x1 = [1,2,3], x2 = ["A","B","C"])

# goal: make column ["1:A", "2:B", "3:C"]
# desired output format for one row
join( [ y[1,:x1], y[1,:x2] ], ":" ) # > "1:A"

# doesn't work with vectors, makes one long string
# (0.5) > "[1,2,3]:String[\"A\",\"B\",\"C\"]"
# (0.6) > "Any[1, 2, 3]:Any[\"A\", \"B\", \"C\"]"
join([y[:,:x1], y[:,:x2]], ":")

# default broadcast operation doesn't work either
# (0.5) > ERROR: MethodError: no method matching size(::String)
# (0.6) > 2-element Array{String,1}:
#           "1:2:3"
#           "A:B:C"
join.([y[:,:x1], y[:,:x2]], ":")

R example

# same data structure as before
y = data.frame(x1 = c(1:3), x2 = c("A", "B", "C"))

# desired output format with 'paste'
paste(y$x1, y$x2, sep = ":") # > "1:A" "2:B" "3:C"

1 Answer:

Answer 0 (score: 1)

Possible alternatives are:

  1. ["$(r[:x1]):$(r[:x2])" for r in eachrow(y)]

  2. [join(Array(r),":") for r in eachrow(y)]

  3. mapslices(x->join(x,":"),(Array(y)),2)

  4. map(x->join(x,":"),zip(y[:x1],string.(y[:x2])))

  5. [string(y[:x1][i])*":"*string(y[:x2][i]) for i=1:nrow(y)]

They are not all equivalent in performance (option 5 is the fastest, but also the most specific).
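
One further option, not among those listed in the answer, is to broadcast `string` over the columns, which is probably the closest analogue to R's paste. The following is only a minimal sketch: it uses plain vectors in place of the DataFrame columns so that it runs without DataFrames, and the names x1, x2 and pasted are illustrative.

```julia
# Plain vectors standing in for the columns x1 and x2 of the DataFrame
# (assumption: the columns behave like ordinary vectors under broadcasting).
x1 = [1, 2, 3]
x2 = ["A", "B", "C"]

# Broadcasting `string` combines the elements row by row;
# the scalar ":" is reused as the separator for every row.
pasted = string.(x1, ":", x2)
```

With the DataFrame from the question, the equivalent call would presumably be string.(y[:x1], ":", y[:x2]), or string.(y.x1, ":", y.x2) with the column access syntax of newer DataFrames releases.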