Question

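Disclaimer: I'm new to Python.

Does anyone know why np.round() works on a pandas DataFrame copied from, for example, df.describe(), but fails on a DataFrame that I create manually from separately defined index and column label lists?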

df = {}
index = [...index list...]
columns = [...key list...]
df = pd.DataFrame(index=index, columns=columns)

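If I then insert values and apply np.round(), I get "AttributeError: rint". If I instead copy df.describe(), change some values, and then call np.round(), it works fine. Both are DataFrames, so I don't understand why the behavior differs.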

Code example

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'foo1' : np.random.randn(5),'foo2' : np.random.randn(5)})
df1.iloc[:,0] = np.round(df1.iloc[:,0],decimals=3) # works fine
df1

df2 = {}
index = ['foo1','foo2','foo3']
columns = ['oof1','oof2']
df2 = pd.DataFrame(index=index, columns=columns)

num = 0
for i in df1.median():
   df2.ix[0,num] = df1.median()[num]
   df2.ix[1,num] = df1.median()[num]
   df2.ix[2,num] = df1.median()[num]
   num += 1

np.around(df2.ix[:,0],decimals=3) # gives 'AttributeError: rint'

Answer 1

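Thanks for providing an example.

The problem is that the columns of the second DataFrame have dtype object. You can convert them to floats with the following code: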

In [47]: df2 = df2.convert_objects(convert_numeric=True)

In [48]: np.around(df2.ix[:,0],decimals=3)
Out[48]: 
foo1   -0.039
foo2   -0.039
foo3   -0.039
Name: oof1, dtype: float64
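
The cause can be checked directly by looking at dtypes before and after the conversion. The following is only a sketch under some assumptions: it targets a recent pandas release, where convert_objects and .ix are no longer available (both were removed in pandas 1.0), so it substitutes pd.to_numeric and .iloc for the calls used above. The fill value 0.123456 is made up for illustration.

```python
import numpy as np
import pandas as pd

index = ['foo1', 'foo2', 'foo3']
columns = ['oof1', 'oof2']

# A frame built from labels only has dtype object in every column.
df2 = pd.DataFrame(index=index, columns=columns)
print(df2.dtypes)

# Fill it cell by cell, as in the question; the columns stay object.
for row in range(len(index)):
    for col in range(len(columns)):
        df2.iloc[row, col] = 0.123456  # illustrative value, not from the question
print(df2.dtypes)

# Convert each column to a numeric dtype, then round.
# pd.to_numeric stands in here for convert_objects(convert_numeric=True).
df2_numeric = df2.apply(pd.to_numeric)
rounded = np.around(df2_numeric.iloc[:, 0], decimals=3)
print(rounded)
```

If every column is known to hold plain numbers, df2.astype(float) should do the same job in one call.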

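In general, it is better to build the whole DataFrame in one go rather than cell by cell as was done for df2. For example, you could do something like this, which avoids having to convert dtypes at all.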

In [50]: df2 = pd.DataFrame({'oof1': df1['foo1'].median(),
    ...:                     'oof2': df1['foo2'].median()}, index=index)

In [51]: np.around(df2.ix[:,0],decimals=3)
Out[51]: 
foo1   -0.039
foo2   -0.039
foo3   -0.039
Name: oof1, dtype: float64
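
As a self-contained variant of the same idea, the sketch below swaps the random data for a few made-up numbers so the result is reproducible, and rounds with the DataFrame.round method instead of going through NumPy. The values in df1 are assumptions for illustration only.

```python
import pandas as pd

# Made-up data standing in for df1 from the question.
df1 = pd.DataFrame({'foo1': [0.11111, 0.22222, 0.33333],
                    'foo2': [1.55555, 2.66666, 3.77777]})
index = ['foo1', 'foo2', 'foo3']

# Scalar values in the dict are broadcast to every row of the given index,
# so both columns are float64 from the start.
df2 = pd.DataFrame({'oof1': df1['foo1'].median(),
                    'oof2': df1['foo2'].median()}, index=index)
print(df2.dtypes)

# Round every column to three decimals in one call.
df2_rounded = df2.round(3)
print(df2_rounded)
```

Because the columns never pass through dtype object, no conversion step is needed before rounding.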
