Performing data analysis on a pivoted DataFrame in Pandas

Time: 2013-04-04 17:11:06

Tags: python database pandas dataframe

I am loading data from a database and creating a DataFrame from it:

db_resultset = self.result.fetchall()
df = DataFrame(db_resultset)
df.columns = self.result.keys()
pivoted_data = df.pivot(index='id', columns='item')

    data =
    id  item  val
     1    A    10
     2    A    25
     1    B    12
     1    C    15
     2    C    2
     1    D    7
     2    D    9
     ...

    pivoted_data =
         A    B    C    D
    1   10   12   15    7
    2   25  NaN    2    9
    ...

I would like to compute pairwise correlations, with something like pivoted_data.corr(), but that call results in the following error:

  File "/.../pandas/core/frame.py", line 4469, in corr
    numeric_df = self._get_numeric_data()
  File "/.../pandas/core/frame.py", line 4989, in _get_numeric_data
    return self.ix[:, []]
  File "/.../pandas/core/indexing.py", line 34, in __getitem__
    return self._getitem_tuple(key)
  File "/.../pandas/core/indexing.py", line 224, in _getitem_tuple
    retval = retval.ix._getitem_axis(key, axis=i)
  File "/.../pandas/core/indexing.py", line 342, in _getitem_axis
    return self._getitem_iterable(key, axis=axis)
  File "/.../pandas/core/indexing.py", line 408, in _getitem_iterable
    not isinstance(keyarr[0], tuple)):

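What is the best way to run this kind of analysis on a set of data like this? I thought about converting pivoted_data back into a DataFrame, but that does not seem like an ideal solution.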

** EDIT:

In response to Jeff's comment:

pivoted_data.get_dtype_counts() =
object    319
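
An all-object result here suggests that the values arrived from the database as generic Python objects (for example strings or Decimal instances) rather than native numbers, so corr() may find no numeric columns to work on. Below is a minimal sketch of one possible workaround, not a confirmed diagnosis of the traceback above. The sample rows are hypothetical stand-ins for the real result set, and the assumption that the values are numeric strings is mine:

```python
import pandas as pd

# Hypothetical rows standing in for the database result set. The values
# are strings here to mimic a driver that does not return native numbers
# (an assumption, the original post does not say what the driver returns).
rows = [
    (1, "A", "10"), (2, "A", "25"), (1, "B", "12"),
    (1, "C", "15"), (2, "C", "2"), (1, "D", "7"), (2, "D", "9"),
]
df = pd.DataFrame(rows, columns=["id", "item", "val"])

# Convert the value column to a numeric dtype before pivoting;
# entries that cannot be parsed become NaN.
df["val"] = pd.to_numeric(df["val"], errors="coerce")

# Passing values= keeps the pivoted columns flat (A, B, C, D).
pivoted_data = df.pivot(index="id", columns="item", values="val")

# Pairwise correlation between the item columns.
corr = pivoted_data.corr()
print(pivoted_data.dtypes)
print(corr)
```

In recent pandas releases get_dtype_counts() is no longer available; pivoted_data.dtypes.value_counts() gives a comparable summary.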

1 Answer:

Answer 0: (score: 0)

I am not sure the rows are being read into the DataFrame correctly. Try:

df = pd.DataFrame.from_records(db_curr.fetchall(),
                               index=["id", "item"],
                               columns=[col_desc[0] for col_desc in db_curr.description])
df = df.unstack()

The last line produces the pivoted data.
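
The approach in this answer can be tried end to end with a small sketch. It assumes an in-memory SQLite table as a stand-in for the real database; the table name data, its schema, and the sample rows are hypothetical, and db_curr is taken to be a DB-API cursor as in the snippet above. Selecting the val column before unstacking is an addition here, made so the resulting columns are flat rather than a MultiIndex:

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory table standing in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER, item TEXT, val REAL)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [(1, "A", 10), (2, "A", 25), (1, "B", 12),
     (1, "C", 15), (2, "C", 2), (1, "D", 7), (2, "D", 9)],
)
db_curr = conn.execute("SELECT id, item, val FROM data")

# Build the frame with (id, item) as a MultiIndex, taking the column
# names from the cursor description as in the answer.
df = pd.DataFrame.from_records(
    db_curr.fetchall(),
    index=["id", "item"],
    columns=[col_desc[0] for col_desc in db_curr.description],
)

# Unstack the inner index level (item) into columns.
pivoted = df["val"].unstack()
print(pivoted)
print(pivoted.corr())

conn.close()
```

Because the val column is declared REAL in this sketch, the values come back as floats and the pivoted frame is numeric; with a different driver or column type an explicit conversion may still be needed.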