pandas DataFrame: a dropped column reappears

Date: 2014-01-30 13:31:17

Tags: python pandas

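Apologies if I'm doing something silly, but I'm really puzzled by this: I pass a DataFrame to a function, and inside that function I add a column and then drop it. Nothing strange so far, but after the function returns, the DataFrame in the global namespace still shows the column that was added and dropped. If I declare the DataFrame as global instead, this doesn't happen...

This test code shows the problem in all four combinations of Python 3.3.3 / 2.7.6 and pandas 0.13.0 / 0.12.0: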

#!/usr/bin/python
import pandas as pd

# FUNCTION DFcorr
def DFcorr(df):
    # Calculate column of accumulated elements
    df['SUM']=df.sum(axis=1)
    print('DFcorr: DataFrame after add column:')
    print(df)
    # Drop column of accumulated elements
    df=df.drop('SUM',axis=1)
    print('DFcorr: DataFrame after drop column:')
    print(df)  

# FUNCTION globalDFcorr
def globalDFcorr():
    global C
    # Calculate column of accumulated elements
    C['SUM']=C.sum(axis=1)
    print('globalDFcorr: DataFrame after add column:')
    print(C)
    # Drop column of accumulated elements
    print('globalDFcorr: DataFrame after drop column:')
    C=C.drop('SUM',axis=1)
    print(C)  

######################### MAIN #############################
C = pd.DataFrame.from_items([('A', [1, 2]), ('B', [3 ,4])], orient='index', columns=['one', 'two'])
print('\nMAIN: Initial DataFrame:')
print(C)
DFcorr(C)
print('MAIN: DataFrame after call to DFcorr')
print(C)

C = pd.DataFrame.from_items([('A', [1, 2]), ('B', [3 ,4])], orient='index', columns=['one', 'two'])
print('\nMAIN: Initial DataFrame:')
print(C)
globalDFcorr()
print('MAIN: DataFrame after call to globalDFcorr')
print(C)

Here is the output:

MAIN: Initial DataFrame:
   one  two
A    1    2
B    3    4

[2 rows x 2 columns]
DFcorr: DataFrame after add column:
   one  two  SUM
A    1    2    3
B    3    4    7

[2 rows x 3 columns]
DFcorr: DataFrame after drop column:
   one  two
A    1    2
B    3    4

[2 rows x 2 columns]
MAIN: DataFrame after call to DFcorr
   one  two  SUM
A    1    2    3
B    3    4    7

[2 rows x 3 columns]

MAIN: Initial DataFrame:
   one  two
A    1    2
B    3    4

[2 rows x 2 columns]
globalDFcorr: DataFrame after add column:
   one  two  SUM
A    1    2    3
B    3    4    7

[2 rows x 3 columns]
globalDFcorr: DataFrame after drop column:
   one  two
A    1    2
B    3    4

[2 rows x 2 columns]
MAIN: DataFrame after call to globalDFcorr
   one  two
A    1    2
B    3    4

[2 rows x 2 columns]

What am I missing? Many thanks!

1 Answer:

Answer 0 (score: 4)

Note this line in DFcorr:

df=df.drop('SUM',axis=1)

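The df.drop method returns a new DataFrame. It does not modify the original df.

Inside DFcorr, df is just a local variable. Assigning to df rebinds the local name and does not affect the global variable C. Only mutations of the object df refers to affect C. That is why the added SUM column (a mutation) shows up in C, while the drop (a reassignment) does not.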
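The difference between rebinding and mutating can be checked directly with an identity test. The following is a minimal sketch, not the asker's exact script: the function names rebind_only and mutate_in_place are made up for illustration, and the frame is built with the plain pd.DataFrame constructor because from_items was removed from later pandas releases.

```python
import pandas as pd

def make_frame():
    # Same data as the question, built with the regular constructor
    return pd.DataFrame({'one': [1, 3], 'two': [2, 4]},
                        index=['A', 'B'], columns=['one', 'two'])

def rebind_only(df):
    # Hypothetical stand-in for DFcorr
    df['SUM'] = df.sum(axis=1)      # mutates the object the caller passed in
    df = df.drop('SUM', axis=1)     # rebinds the local name to a NEW DataFrame
    return df

def mutate_in_place(df):
    # Hypothetical variant that only ever mutates the caller's object
    df['SUM'] = df.sum(axis=1)
    df.drop('SUM', axis=1, inplace=True)
    return df

C = make_frame()
r1 = rebind_only(C)
print(r1 is C)              # False: drop() returned a different object
print(list(C.columns))      # ['one', 'two', 'SUM']

D = make_frame()
r2 = mutate_in_place(D)
print(r2 is D)              # True: still the caller's object
print(list(D.columns))      # ['one', 'two']
```
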

So you can make DFcorr behave more like globalDFcorr by changing that line to:

df.drop('SUM',axis=1, inplace=True)
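If the goal is for the function to leave the caller's DataFrame untouched altogether, another option is to work on a copy and return the result. This is only a sketch of that alternative, not part of the original answer; the name dfcorr_on_copy is hypothetical.

```python
import pandas as pd

def dfcorr_on_copy(df):
    # Hypothetical variant of DFcorr that never touches the caller's DataFrame
    work = df.copy()                    # independent copy of the data
    work['SUM'] = work.sum(axis=1)      # helper column exists only on the copy
    return work.drop('SUM', axis=1)     # new DataFrame without the helper column

C = pd.DataFrame({'one': [1, 3], 'two': [2, 4]},
                 index=['A', 'B'], columns=['one', 'two'])
cleaned = dfcorr_on_copy(C)
print(list(C.columns))      # ['one', 'two']
print(cleaned is C)         # False
```

The caller then decides whether to keep the result, for example with C = dfcorr_on_copy(C).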