Question

I have a fruit dataset with the columns Name, Colour, Weight, Size, and Seeds.

         Fruit dataset

         Name     Colour    Weight  Size   Seeds   Unnamed

         Apple    Apple     Red     10.0   Big     Yes  

         Apple    Apple     Red     5.0    Small   Yes  

         Pear     Pear      Green   11.0   Big     Yes  

         Banana   Banana    Yellow  4.0    Small   Yes  

         Orange   Orange    Orange  5.0    Small   Yes

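The problem is that the Colour column is a duplicate of the Name column, and the values are shifted one column to the right, which creates a useless column (Unnamed) holding the values that belong in the Seeds column. Is there a simple way to remove the duplicated values in Colour and shift the remaining column values, from Weight onward, back one column to the left? I hope I'm not confusing anyone here.

Desired result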

         Fruit dataset

         Name     Colour  Weight Size    Seeds   Unnamed(will be dropped)

         Apple    Red     10.0   Big     Yes  

         Apple    Red     5.0    Small   Yes  

         Pear     Green   11.0   Big     Yes  

         Banana   Yellow  4.0    Small   Yes  

         Orange   Orange  5.0    Small   Yes

Answer 1

You can do it like this:

In [23]: df
Out[23]:
     Name  Colour  Weight  Size  Seeds Unnamed
0   Apple   Apple     Red  10.0    Big     Yes
1   Apple   Apple     Red   5.0  Small     Yes
2    Pear    Pear   Green  11.0    Big     Yes
3  Banana  Banana  Yellow   4.0  Small     Yes
4  Orange  Orange  Orange   5.0  Small     Yes

In [24]: cols = df.columns[:-1]

In [25]: cols
Out[25]: Index(['Name', 'Colour', 'Weight', 'Size', 'Seeds'], dtype='object')

In [26]: df = df.drop('Colour', axis=1)

In [27]: df.columns = cols

In [28]: df
Out[28]:
     Name  Colour  Weight   Size Seeds
0   Apple     Red    10.0    Big   Yes
1   Apple     Red     5.0  Small   Yes
2    Pear   Green    11.0    Big   Yes
3  Banana  Yellow     4.0  Small   Yes
4  Orange  Orange     5.0  Small   Yes
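
The steps above can be collected into a self-contained sketch. The DataFrame here is rebuilt by hand from the table in the question, and `fixed` is only an illustrative name. The axis is passed via the `columns=` keyword, since recent pandas versions no longer accept it as a second positional argument to `drop`.

```python
import pandas as pd

# Misaligned data, rebuilt by hand from the question's table
df = pd.DataFrame({
    "Name": ["Apple", "Apple", "Pear", "Banana", "Orange"],
    "Colour": ["Apple", "Apple", "Pear", "Banana", "Orange"],
    "Weight": ["Red", "Red", "Green", "Yellow", "Orange"],
    "Size": [10.0, 5.0, 11.0, 4.0, 5.0],
    "Seeds": ["Big", "Small", "Big", "Small", "Small"],
    "Unnamed": ["Yes"] * 5,
})

# Keep every header except the last one ("Unnamed")
cols = df.columns[:-1]

# Drop the duplicated column, then relabel the remaining five columns
fixed = df.drop(columns="Colour")
fixed.columns = cols

print(fixed)
```

Because only the labels are reassigned, each column keeps its original dtype (for example, Weight stays numeric).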

Answer 2

If you would like to reorder the columns without changing their contents, user EdChum has already answered that. See below.

In:
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': np.random.randn(3), 'b': np.random.randn(3), 'c': np.random.randn(3)})
df

Out:
          a         b         c
0 -0.682446 -0.200654 -1.609470
1 -1.998113  0.806378  1.252384
2 -0.250359  3.774708  1.100771

In:
cols = list(df)
cols[1], cols[0] = cols[0], cols[1]
cols

Out:
['b', 'a', 'c']

In:
df = df.loc[:, cols]  # .ix was removed from pandas; .loc selects by label
df

Out:
          b         a         c
0 -0.200654 -0.682446 -1.609470
1  0.806378 -1.998113  1.252384
2  3.774708 -0.250359  1.100771
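
The same swap can be sketched with fixed values, so the result is reproducible. The integers below are made up for illustration. Note that selecting by a reordered list of labels moves each column together with its label, so it rearranges columns rather than re-pairing headers with values as the question asks.

```python
import pandas as pd

# Made-up integer data so the result is deterministic
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})

cols = list(df)                      # ['a', 'b', 'c']
cols[1], cols[0] = cols[0], cols[1]  # swap the first two labels

# Selecting with the reordered label list returns the columns in that order
swapped = df[cols]                   # equivalent to df.loc[:, cols]
print(swapped)
```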

Answer 3

You can use pandas shift: df.shift(-1, axis=1)

Example df:

import pandas as pd

df = pd.DataFrame({
    "Fruit": ["Apples", "Oranges", "Bananas", "Apples", "Oranges", "Bananas"],
    "Amount": [4, 1, 2, 2, 4, 5],
    "City": ["SF", "SF", "SF", "Montreal", "Montreal", "Montreal"]
})

# cast all columns to object to prevent NaN being forced by incompatible dtypes
df[df.columns] = df[df.columns].astype('object')

# shift: to shift the value
# dropna(axis=1): to drop column with NaN result 
df.shift(-1, axis=1).dropna(axis=1)
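
Applied to the fruit data from the question, the shift only needs to cover the columns to the right of Name. The following is a minimal sketch under that assumption; the DataFrame is rebuilt by hand from the question's table, and `tail` and `fixed` are illustrative names.

```python
import pandas as pd

# Misaligned data, rebuilt by hand from the question's table
df = pd.DataFrame({
    "Name": ["Apple", "Apple", "Pear", "Banana", "Orange"],
    "Colour": ["Apple", "Apple", "Pear", "Banana", "Orange"],
    "Weight": ["Red", "Red", "Green", "Yellow", "Orange"],
    "Size": [10.0, 5.0, 11.0, 4.0, 5.0],
    "Seeds": ["Big", "Small", "Big", "Small", "Small"],
    "Unnamed": ["Yes"] * 5,
})

# Shift everything except Name one column to the left;
# casting to object first avoids dtype conflicts between columns
tail = df.drop(columns="Name").astype(object).shift(-1, axis=1)

# The last column is now all NaN, so drop it
tail = tail.dropna(axis=1, how="all")

# Reattach Name and let pandas re-infer dtypes (Weight becomes numeric again)
fixed = pd.concat([df[["Name"]], tail], axis=1).infer_objects()
print(fixed)
```

Unlike relabelling the headers, this moves the values themselves, which is why the intermediate cast to `object` and the final `infer_objects()` call are needed.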

Shift columns to the left in a pandas DataFrame
