Question

total_val_count = dataset[attr].value_counts()
for i in range(len(total_val_count.index)):
    print(total_val_count[i])  # raises KeyError: 0 on the first iteration

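I wrote this code to count the occurrences of each distinct value of an attribute in a DataFrame. The problem is that I cannot access the first value using index 0: I get KeyError: 0 on the first iteration of the loop.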

total_val_count contains the correct values, as shown below:

34 2887
4 2708
13 2523
35 2507
33 2407
3 2404
36 2382
26 2378
16 2282
22 2187
21 2141
12 2104
25 2073
5 2052
15 2044
17 2040
14 2027
28 1984
27 1980
23 1979
24 1960
30 1953
29 1936
31 1884
18 1877
7 1858
37 1767
20 1762
11 1740
8 1722
6 1693
32 1692
10 1662
9 1576
19 1308
2 1266
1 175
38 63
dtype: int64

Answer 1

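total_val_count is a Series. The index of the Series holds the distinct values found in dataset[attr], and the values of the Series are the number of times each of those values occurs in dataset[attr].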
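To make that concrete, here is a minimal sketch. The sample data is made up for illustration and is not the asker's dataset; the names column, counts, index_labels and occurrences are likewise hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for dataset[attr]
column = pd.Series([34, 34, 34, 4, 4, 13])
counts = column.value_counts()

index_labels = counts.index.tolist()  # the distinct values found in the column
occurrences = counts.tolist()         # how many times each of them occurs

print(index_labels)  # [34, 4, 13]
print(occurrences)   # [3, 2, 1]
```

Note that none of the index labels is 0, even though the Series has a first row.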

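When you index the Series with total_val_count[i], Pandas looks up i in the index and returns the associated value. In other words, total_val_count[i] indexes by index label, not by ordinal position. Think of a Series as a mapping from index to values. With plain indexing such as total_val_count[i], it behaves more like a dict than a list.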

You get a KeyError because 0 is not a value in the index. To index by ordinal position, use total_val_count.iloc[i].
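The difference between label lookup and positional lookup can be sketched as follows. The Series here is built by hand with three of the labels from the question, purely as an illustration.

```python
import pandas as pd

# Hypothetical counts whose index labels are 34, 4 and 13 (there is no label 0)
counts = pd.Series([2887, 2708, 2523], index=[34, 4, 13])

by_label = counts[34]         # label lookup: the count stored under label 34
by_position = counts.iloc[0]  # positional lookup: the first row

try:
    counts[0]                 # looks for the label 0, which does not exist
    raised = False
except KeyError:
    raised = True

print(by_label, by_position, raised)  # 2887 2887 True
```

Both lookups return the same count here only because label 34 happens to sit in the first row.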

That said, using for i in range(len(total_val_count.index)), or equivalently for i in range(len(total_val_count)), is not recommended. Instead of

for i in range(len(total_val_count)):
    print(total_val_count.iloc[i])

you could use

for value in total_val_count.values:
    print(value)

This is more readable, and it gives you the desired value directly as the variable value instead of the more cumbersome total_val_count.iloc[i].

Here is an example showing how to iterate over the values, over the keys, and over both keys and values:

import pandas as pd

s = pd.Series([1, 2, 3, 2, 2])
total_val_count = s.value_counts()

print(total_val_count)
# 2    3
# 3    1
# 1    1
# dtype: int64
# (the relative order of the tied keys 3 and 1 can vary between pandas versions)

for value in total_val_count.values:
    print(value)
    # 3
    # 1
    # 1

for key in total_val_count.keys():
    print(key)
    # 2
    # 3
    # 1

# Series.iteritems() was removed in pandas 2.0; items() is the replacement
for key, value in total_val_count.items():
    print(key, value)
    # 2 3
    # 3 1
    # 1 1

for i in range(len(total_val_count)):
    print(total_val_count.iloc[i])
    # 3
    # 1
    # 1

Not getting 0 index from pandas value_counts()
