KeyError in a for loop over a pandas DataFrame

Date: 2015-07-01 20:01:27

Tags: python for-loop pandas dataframe keyerror

I am putting my data into a Bokeh layout for a heatmap, but I get KeyError: '1'. It occurs on the line num_calls = pivot_table[m][y]. Does anyone know why this happens?

pivot_table.head()
Out[101]:
Month          1    2    3    4    5    6    7    8    9
CompanyName
Company 1    182  270  278  314  180  152  110  127  129
Company 2    163  147  192  142  186  231  214  130  112
Company 3    126   88   99  139   97   97   96   37   79
Company 4     84   89   71   95   80   89   83   88  104
Company 5     91   96   94   66   81   77   87   83   68

Month         10   11   12
CompanyName
Company 1    117  127   81
Company 2    117   93  101
Company 3    116  111   95
Company 4     93   78   64
Company 5     83   95   65

Here is the section of code leading up to the error:

pivot_table = pivot_table.reset_index()
pivot_table['CompanyName'] = [str(x) for x in pivot_table['CompanyName']]
Companies = list(pivot_table['CompanyName'])
months = ["1","2","3","4","5","6","7","8","9","10","11","12"]
pivot_table = pivot_table.set_index('CompanyName')

# this is the colormap from the original plot
colors = ["#75968f", "#a5bab7", "#c9d9d3", "#e2e2e2", "#dfccce",
    "#ddb7b1", "#cc7878", "#933b41", "#550b1d" ]

# Set up the data for plotting. We will need to have values for every
# pair of year/month names. Map the rate to a color.
month = []
company = []
color = []
rate = []
for y in Companies:
    for m in months:
        month.append(m)
        company.append(y)
        num_calls = pivot_table[m][y]
        rate.append(num_calls)
        color.append(colors[min(int(num_calls)-2, 8)])

Output of pivot_table.info():

pivot_table.info()
<class 'pandas.core.frame.DataFrame'>
Index: 46 entries, Company1 to LastCompany
Data columns (total 12 columns):
1.0     46 non-null float64
2.0     46 non-null float64
3.0     46 non-null float64
4.0     46 non-null float64
5.0     46 non-null float64
6.0     46 non-null float64
7.0     46 non-null float64
8.0     46 non-null float64
9.0     46 non-null float64
10.0    46 non-null float64
11.0    46 non-null float64
12.0    46 non-null float64
dtypes: float64(12)
memory usage: 4.5+ KB

And, as requested:

pivot_table.columns
Out[103]: Index([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0], dtype='object')

The Bokeh code is also here: http://bokeh.pydata.org/en/latest/docs/gallery/unemployment.html

2 answers:

Answer 0: (score: 1)

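Try changing the inner loop to

for m in pivot_table.columns:

It looks like you could achieve the same thing without any loops, though. You loop over the row index and the column index to access each entry individually and append it to a list, so rate is just a list of all the elements in the DataFrame. You can build that list directly from the DataFrame's values.

Am I missing something here?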
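A loop-free version could look roughly like the sketch below. It is only an illustration: the small pivot_table is a made-up stand-in for the real one, and using to_numpy, np.repeat and np.tile is one possible approach rather than necessarily what this answer had in mind.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's pivot_table (values invented).
pivot_table = pd.DataFrame(
    [[182.0, 270.0, 1.0], [163.0, 0.0, 192.0]],
    index=pd.Index(["Company 1", "Company 2"], name="CompanyName"),
    columns=pd.Index([1.0, 2.0, 3.0], name="Month"),
)

colors = ["#75968f", "#a5bab7", "#c9d9d3", "#e2e2e2", "#dfccce",
          "#ddb7b1", "#cc7878", "#933b41", "#550b1d"]

values = pivot_table.to_numpy()
n_companies, n_months = values.shape

# Row-major flattening visits cells in the same order as the nested loops:
# every month of the first company, then every month of the second, and so on.
rate = values.ravel().tolist()
company = np.repeat(pivot_table.index.to_numpy(), n_months).tolist()
month = np.tile(pivot_table.columns.to_numpy(), n_companies).tolist()

# Same color rule as the loop: colors[min(int(num_calls) - 2, 8)].
color_index = np.minimum(values.ravel().astype(int) - 2, 8)
color = np.array(colors)[color_index].tolist()
```

The four lists come out in the same order as in the nested loops, so they can be passed to the plotting code unchanged.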

Answer 1: (score: 1)

I tried the following code and it works on my machine. I used .loc to avoid potential key errors.

import pandas as pd
import numpy as np

# just following your previous post to simulate your data
np.random.seed(0)
dates = np.random.choice(pd.date_range('2015-01-01 00:00:00', '2015-06-30 00:00:00', freq='1h'), 10000)
company = np.random.choice(['company' + x for x in '1 2 3 4 5'.split()], 10000)
df = pd.DataFrame(dict(recvd_dttm=dates, CompanyName=company)).set_index('recvd_dttm').sort_index()
df['C'] = 1
df.columns = ['CompanyName', '']
result = df.groupby([lambda idx: idx.month, 'CompanyName']).agg({df.columns[1]: sum}).reset_index()
result.columns = ['Month', 'CompanyName', 'counts']
pivot_table = result.pivot(index='CompanyName', columns='Month', values='counts')




colors = ["#75968f", "#a5bab7", "#c9d9d3", "#e2e2e2", "#dfccce",
    "#ddb7b1", "#cc7878", "#933b41", "#550b1d" ]

month = []
company = []
color = []
rate = []
for y in pivot_table.index:
    for m in pivot_table.columns:
        month.append(m)
        company.append(y)
        num_calls = pivot_table.loc[y, m]
        rate.append(num_calls)
        color.append(colors[min(int(num_calls)-2, 8)])
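
As for why the original loop fails: judging by the pivot_table.columns output in the question, the column labels are floats (1.0 to 12.0), while months holds strings such as "1", so pivot_table["1"] finds no matching label. The sketch below reproduces this on a made-up two-company table. Relabeling the columns as strings is just one possible workaround and is my assumption, not something taken from the question.

```python
import pandas as pd

# Hypothetical miniature of the question's pivot_table: float column labels
# stored in an object-dtype Index, as in the posted pivot_table.columns output.
pivot_table = pd.DataFrame(
    [[182.0, 270.0], [163.0, 147.0]],
    index=["Company 1", "Company 2"],
    columns=pd.Index([1.0, 2.0], dtype=object),
)

# The string "1" is not among the float labels, so the lookup raises KeyError.
try:
    pivot_table["1"]["Company 1"]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# One possible workaround: relabel the columns as the strings the loop expects.
relabeled = pivot_table.copy()
relabeled.columns = [str(int(c)) for c in relabeled.columns]
num_calls = relabeled["1"]["Company 1"]
```

Iterating over pivot_table.columns, as in the loop above, avoids the problem without relabeling, because the loop then uses whatever labels the DataFrame actually has.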