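I am trying to create an element frequency bar chart with matplotlib. To do this, I need to count how often each value from a list of flags occurs in a pandas DataFrame column. Below is a rough sketch of the code and data from my notebook: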
# list of filtered values
filtered = [200, 201, 201, 201, 201, 201,
211, 211, 211, 211, 211, 211, 211, 211, 211, 211, 211, 211, 211, 211,
237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237,
237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237,
237, 237, 237, 237, 237, 237, 237, 237, 250, 250, 250, 250, 250, 250, 250,
250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250,
250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250,
250, 250, 250, 250, 254]
# list of flags to use for filtering
flags = [200, 201, 211, 237, 239, 250, 254, 255]
# this line was only written for testing
flags_dict = {200:0,201:0,211:0,237:0,239:0,250:0,254:0,255:0}
freq = filtered.value_counts()
"""
Expected flags_dict:
200: 1
201: 5
211: 14
237: 38
239: 0
250: 41
254: 1
255: 0
"""
"""
These are the values from the real dataframe but they do not take into
account the other flags in the flags list
freq:
250.0 7682
211.0 3734
200.0 1483
239.0 180
201.0 34
"""
Answer 0 (score: 1)
Use isin, assuming filtered is a Series.
In [1]: filtered[filtered.isin(flags)].value_counts().reindex(flags, fill_value=0)
Out[1]: 200 1
201 5
211 14
237 38
239 0
250 41
254 1
255 0
dtype: int64
To get a dictionary, just append to_dict
In [2]: filtered[filtered.isin(flags)].value_counts().reindex(flags, fill_value=0).to_dict()
Out[2]: {200: 1, 201: 5, 211: 14, 237: 38, 239: 0, 250: 41, 254: 1, 255: 0}
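For anyone who wants to run the above outside an IPython session, here is a minimal self-contained sketch. It assumes filtered holds the same values as the list in the question (built compactly here as value * count, which is my shorthand and not the asker's code) and wraps it in a pd.Series first, since a plain list has no value_counts.

```python
import pandas as pd

# Same values as the question's list, built compactly (value * count).
filtered = pd.Series(
    [200] * 1 + [201] * 5 + [211] * 14 + [237] * 38 + [250] * 41 + [254] * 1
)
flags = [200, 201, 211, 237, 239, 250, 254, 255]

# Keep only flagged values, count them, then reindex so that flags
# with no occurrences show up with a count of 0, in the order of `flags`.
counts = filtered[filtered.isin(flags)].value_counts().reindex(flags, fill_value=0)

# Convert the Series to a plain {flag: count} dictionary.
flags_dict = counts.to_dict()
print(flags_dict)
```

With matplotlib installed, counts.plot.bar() should draw the frequency bar chart directly from the reindexed Series, so the dictionary step is only needed if you want the dict itself.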
Answer 1 (score: 0)
I just came up with this, but there must be a better/faster way to do it
.Range("I1").Value = iVal = Application.WorksheetFunction.CountIf(Range("A3:A103"), "CrisisNameTextBox1.Value")
Answer 2 (score: 0)
If I understand correctly, this is what you need:
import pandas as pd
filtered = [200, 201, 201, 201, 201, 201, 211, 211, 211, 211, 211, 211, 211, 211, 211,
211, 211, 211, 211, 211,
237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237,
237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237,
237, 237, 237, 237, 237, 237, 237, 237, 250, 250, 250, 250, 250, 250, 250,
250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250,
250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250,
250, 250, 250, 250, 254]
filtered = pd.Series(filtered)
freq = filtered.value_counts(sort=False)
flags = [200, 201, 211, 237, 239, 250, 254, 255]
flags_dict = {}
# look up each flag's count; flags that never occur get 0
for flag in flags:
    try:
        flags_dict[flag] = freq[flag]
    except KeyError:
        flags_dict[flag] = 0
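If the data is already a plain Python list, the try/except loop can probably be avoided altogether. The sketch below swaps in collections.Counter from the standard library instead of pandas (this is an alternative, not what the answer above does); it assumes the same values as the question's list, again built compactly as value * count.

```python
from collections import Counter

# Same values as the question's list, built compactly (value * count).
filtered = [200] * 1 + [201] * 5 + [211] * 14 + [237] * 38 + [250] * 41 + [254] * 1
flags = [200, 201, 211, 237, 239, 250, 254, 255]

# Counter returns 0 for keys it has never seen, so no try/except is needed.
counts = Counter(filtered)

# Build the result in the order of `flags`, including flags that never occur.
flags_dict = {flag: counts[flag] for flag in flags}
print(flags_dict)
```

The dictionary keeps the order of flags, so its keys and values can be passed straight to a bar chart call.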