I am trying to rearrange the columns of a DataFrame by putting a few columns first and then all the other columns after them.
With R's dplyr, this is easy. It looks like:
library(dplyr)
df = tibble(col1 = c("a", "b", "c"),
            id = c(1, 2, 3),
            col2 = c(2, 4, 6),
            date = c("1 Feb", "2 Feb", "3 Feb"))
df2 = select(df,
             id, date, everything())
With Python's pandas, this is what I tried:
import pandas as pd
df = pd.DataFrame({
"col1": ["a", "b", "c"],
"id": [1, 2, 3],
"col2": [2, 4, 6],
"date": ["1 Feb", "2 Feb", "3 Feb"]
})
# using sets
cols = df.columns.tolist()
cols_1st = {"id", "date"}
cols = set(cols) - cols_1st
cols = list(cols_1st) + list(cols)
# wrong column order
df2 = df[cols]
# using lists
cols = df.columns.tolist()
cols_1st = ["id", "date"]
cols = [c for c in cols if c not in cols_1st]
cols = cols_1st + cols
# right column order, but is there a better way?
df3 = df[cols]
The pandas way feels rather cumbersome, but I am still new to it. Is there a better way?
Answer 0 (score: 3)
You can use df.drop:
>>> df = pd.DataFrame({
...     "col1": ["a", "b", "c"],
...     "id": [1, 2, 3],
...     "col2": [2, 4, 6],
...     "date": ["1 Feb", "2 Feb", "3 Feb"]
... })
>>> df
  col1  id  col2   date
0    a   1     2  1 Feb
1    b   2     4  2 Feb
2    c   3     6  3 Feb
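>>> cols_1st = ["id", "date"]
>>> # drop(columns=...) is the keyword form of drop(cols_1st, 1);
>>> # pandas 2.0 no longer accepts the axis as a positional argument
>>> df[cols_1st + list(df.drop(columns=cols_1st))]
   id   date col1  col2
0   1  1 Feb    a     2
1   2  2 Feb    b     4
2   3  3 Feb    c     6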
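If this reordering comes up often, the one-liner above could be wrapped in a small helper. The following is only a sketch: the name move_to_front is made up for illustration and is not part of pandas, and it assumes every label in the list exists in the DataFrame (drop raises a KeyError otherwise).

```python
import pandas as pd


def move_to_front(df, first):
    """Return df with the columns in `first` leading and the rest in their original order.

    Hypothetical helper for illustration, not a pandas API.
    """
    first = list(first)
    # Iterating over a DataFrame yields its column labels, so list(...)
    # gives the remaining column names in their original order.
    rest = list(df.drop(columns=first))
    return df[first + rest]


df = pd.DataFrame({
    "col1": ["a", "b", "c"],
    "id": [1, 2, 3],
    "col2": [2, 4, 6],
    "date": ["1 Feb", "2 Feb", "3 Feb"],
})
df2 = move_to_front(df, ["id", "date"])
print(list(df2.columns))
```

The helper returns a new DataFrame and leaves the original column order of df untouched.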
Answer 1 (score: 1)
It is as easy as in R, using datar:
>>> from datar.all import c, f, tibble, select, everything
>>> df = tibble(col1 = c("a", "b", "c"),
... id = c(1, 2, 3),
... col2 = c(2, 4, 6),
... date = c("1 Feb", "2 Feb", "3 Feb"))
>>>
>>> df2 = select(df,
... f.id, f.date, everything())
>>>
>>> df2
id date col1 col2
<int64> <object> <object> <int64>
0 1 1 Feb a 2
1 2 2 Feb b 4
2 3 3 Feb c 6
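I am the author of the package. Feel free to submit issues if you have any questions.
Answer 2 (score: 0)
In general, the best translation between R and Python pandas goes through base R, which follows the same semantics, such as logical indexing on a vector (here, the column names). Note the similarity below between the negation and the in functions (! with %in% in R, ~ with isin in pandas):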
# R
mycols <- c("id", "date")
df2 <- df[c(mycols, colnames(df)[!colnames(df) %in% c(mycols)])]
# PANDAS (OLDER, NON-RECOMMENDED WAY)
mycols = ["id", "date"]
df2 = df[mycols + df.columns[~df.columns.isin(mycols)].tolist()]
# PANDAS (CURRENT, RECOMMENDED WAY WITH reindex)
df2 = df.reindex(mycols + df.columns[~df.columns.isin(mycols)].tolist(),
axis='columns')
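
The two pandas variants above rely on a df defined elsewhere. As a self-contained sketch (reusing the sample data from the question, with the variable names rest, by_getitem and by_reindex chosen here only for illustration), the following checks that both produce the same result when every requested column exists:

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["a", "b", "c"],
    "id": [1, 2, 3],
    "col2": [2, 4, 6],
    "date": ["1 Feb", "2 Feb", "3 Feb"],
})
mycols = ["id", "date"]

# Boolean mask over the column Index: True for every column NOT in mycols.
# Masking keeps the surviving labels in their original order.
rest = df.columns[~df.columns.isin(mycols)].tolist()

by_getitem = df[mycols + rest]
by_reindex = df.reindex(mycols + rest, axis="columns")

print(rest)
print(by_getitem.equals(by_reindex))
```

One practical difference between the two: if a label in the list is missing from the DataFrame, plain indexing raises a KeyError, whereas reindex adds that column filled with NaN. Which behavior is preferable depends on whether a missing column should be treated as an error.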