Question

I have a DataFrame in Julia with hundreds of columns, and I want to insert a column after the first one.

For example, in this DataFrame:

df = DataFrame(
  colour = ["green","blue"],
  shape = ["circle", "triangle"],
  border = ["dotted", "line"]
)

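I want to insert an area column after colour, but without referring specifically to shape and border (which in my real case are hundreds of different columns).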

df[!, :area] = [1, 2]

In this example I could use the following (but it refers specifically to shape and border):

df = df[:, [:colour, :area, :shape, :border]] # refers to shape and border by name

Answer 1

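Well, congratulations on finding a workaround yourself, but there is a built-in function that is semantically clearer and possibly a little faster: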

using DataFrames

df = DataFrame(
  colour = ["green","blue"],
  shape = ["circle", "triangle"],
  border = ["dotted", "line"]
)

insert!(df, 3, [1,2], :area)

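Here 3 is the index the new column should have after the insertion, [1,2] is its content, and :area is its name. You can find more detailed documentation by typing ?insert! in the REPL after loading the DataFrames package.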
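For readers on a recent DataFrames.jl release: as far as I know, the four-argument insert! method for data frames was deprecated and later removed, with insertcols! taking its place. A minimal sketch of the same insertion, assuming DataFrames.jl 1.x, and using position 2 so that the new column lands directly after the first one, as the question asks:

```julia
using DataFrames

df = DataFrame(
  colour = ["green", "blue"],
  shape = ["circle", "triangle"],
  border = ["dotted", "line"]
)

# Assumes DataFrames.jl 1.x: insert column :area at position 2,
# i.e. directly after the first column, without naming the other columns.
insertcols!(df, 2, :area => [1, 2])

println(names(df))
```

The resulting column order is colour, area, shape, border.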

It is worth noting that the ! is part of the function name. It's a Julia convention indicating that the function will modify its arguments.
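The convention is easiest to see with a pair of Base functions. A small sketch using sort and sort! (chosen here only for illustration; they are not part of the original answer):

```julia
v = [3, 1, 2]

sorted_copy = sort(v)   # no "!": returns a new sorted vector, v is left unchanged
println(v)              # v still holds its original order

sort!(v)                # with "!": sorts v in place
println(v)              # v itself is now sorted
```

The ! is purely a naming convention; the language does not enforce it, so it serves as a hint to the reader rather than a guarantee.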

Answer 2

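While writing up the question I found a solution myself (as often happens). I am still posting the question here to keep it on record, both for myself and for others to use.

It is enough to save the column names before "adding" the new column: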

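using DataFrames

df = DataFrame(
  colour = ["green","blue"],
  shape = ["circle", "triangle"],
  border = ["dotted", "line"]
)
dfnames = propertynames(df)   # column names (as Symbols) before the new column is added
df[!, :area] = [1,2]          # the new column is appended as the last column

# first column, then the new column, then the remaining original columns
df = df[:, vcat(dfnames[1:1], :area, dfnames[2:end])]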
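
A possible variation on the same idea, assuming DataFrames.jl 1.x: append the column and then reorder with select!, where the trailing : stands for all columns not already listed. This is a sketch rather than part of the original answer; it relies on my understanding that multi-column selectors such as : silently skip columns that were already selected.

```julia
using DataFrames

df = DataFrame(
  colour = ["green", "blue"],
  shape = ["circle", "triangle"],
  border = ["dotted", "line"]
)

df.area = [1, 2]   # appended as the last column

# Keep column 1 first, then :area, then every remaining column in its existing order.
select!(df, 1, :area, :)

println(names(df))
```

This avoids keeping a separate copy of the column names, at the cost of depending on the selector rules of select!.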

Answer 3

rows = size(df)[1]    # size(df) returns the tuple (rows, columns) of the DataFrame

insertcols!(df,       # DataFrame to be changed
    1,                # insert as column 1
    :Day => 1:rows,   # populate "Day" with 1, 2, 3, ...
    makeunique=true)  # if a column named Day already exists, name the new one Day_1
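
Applied to the question's example, the same function can place area right after the first column. A minimal sketch, assuming a DataFrames.jl version in which insertcols! accepts the after keyword (to my knowledge, 1.0 and later):

```julia
using DataFrames

df = DataFrame(
  colour = ["green", "blue"],
  shape = ["circle", "triangle"],
  border = ["dotted", "line"]
)

# With after=true the new column goes after column 1 instead of taking its place,
# so no other column has to be named.
insertcols!(df, 1, :area => [1, 2]; after=true)

println(names(df))
```

Passing a column name instead of the integer (for example :colour) should work the same way when referring to the first column by name is acceptable.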

How to insert a column at a specific position in a Julia DataFrame (without referring to existing column names)
