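I have a DataFrame with three variables, and for each variable I would like to build a dictionary of the relative counts of each label.
I easily wrote a for loop that outputs exactly what I want, but my lambda produces strange results.
Here is the data: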
In [3]:
import pandas as pd
raw_data = {
    'category1': ['Red', 'Red', 'Red', 'Green'],
    'category2': ['Plane', 'Plane', 'Plane', 'Car'],
    'category3': ['Orange', 'Orange', 'Orange', 'Banana'],
}
df = pd.DataFrame(raw_data)
df
Out[3]:
  category1 category2 category3
0       Red     Plane    Orange
1       Red     Plane    Orange
2       Red     Plane    Orange
3     Green       Car    Banana
This for loop produces the exact output I want:
In [4]:
forloop = {}
for column in df:
    forloop[column] = df[column].value_counts(normalize=True).to_dict()
forloop
Out[4]:
{'category1': {'Green': 0.25, 'Red': 0.75},
 'category2': {'Car': 0.25, 'Plane': 0.75},
 'category3': {'Banana': 0.25, 'Orange': 0.75}}
However, my lambda version fails for some unknown reason.
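Answer 0 (score: 1)
I can't actually work out what is going wrong here, other than that the dict returned by the call is not being unpacked. Here is a roundabout way to achieve what you want: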
In [86]:
ratio = lambda x: x.value_counts(normalize=True)
output_lambda = df.apply(lambda x: [x.value_counts().to_dict()]).apply(lambda x: x[0]).to_dict()
output_lambda
Out[86]:
{'category1': {'Green': 1, 'Red': 3},
'category2': {'Car': 1, 'Plane': 3},
'category3': {'Banana': 1, 'Orange': 3}}
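It looks like it binds the function object as the column values rather than unpacking it into a dict. What I did above was return the value_counts result wrapped in a list and then call apply a second time to unpack the single-element list. This forces the dict into a single-element list in the initial apply call: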
In [87]:
output_lambda = df.apply(lambda x: [x.value_counts().to_dict()])
output_lambda
Out[87]:
category1 [{'Green': 1, 'Red': 3}]
category2 [{'Plane': 3, 'Car': 1}]
category3 [{'Banana': 1, 'Orange': 3}]
dtype: object
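If the nested dict is all that is needed, the double apply can be skipped altogether. The following is a minimal sketch, assuming the same df as in the question; the name relative_counts is only illustrative, and normalize=True is used because the question asks for relative counts rather than raw counts:

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'category1': ['Red', 'Red', 'Red', 'Green'],
    'category2': ['Plane', 'Plane', 'Plane', 'Car'],
    'category3': ['Orange', 'Orange', 'Orange', 'Banana'],
})

# Build one inner dict of relative frequencies per column.
# No apply is involved, so pandas never has to decide how to
# wrap the returned dicts into a Series or DataFrame.
relative_counts = {
    column: df[column].value_counts(normalize=True).to_dict()
    for column in df
}
print(relative_counts)
```

This is essentially the for loop from the question written as a dict comprehension, so it should give the same nested dict.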
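Answer 1 (score: 1)
I guess the problem is that the object returned by the lambda function cannot be converted by pandas into a Series or DataFrame (though a pandas expert should confirm this).
A slight modification of the code achieves almost the same result: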
ratio = lambda x: x.value_counts(normalize=True)
output_lambda = df.apply(ratio).to_dict()
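The difference is that every inner dict then contains every label, with nan for the labels that do not occur in that column. If you do not want the nan values in output_lambda, you can use the solution proposed in this answer: https://stackoverflow.com/a/26033302/4709400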
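One possible way to drop the nan entries is sketched below. It assumes the same df and ratio as above and is not necessarily the approach taken in the linked answer; it simply calls dropna on each column before converting it to a dict:

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'category1': ['Red', 'Red', 'Red', 'Green'],
    'category2': ['Plane', 'Plane', 'Plane', 'Car'],
    'category3': ['Orange', 'Orange', 'Orange', 'Banana'],
})

ratio = lambda x: x.value_counts(normalize=True)

# df.apply(ratio) aligns all columns on the union of their labels,
# so a label missing from a column appears there as NaN.
# Dropping the NaN values column by column leaves only real labels.
output_lambda = {
    column: values.dropna().to_dict()
    for column, values in df.apply(ratio).items()
}
print(output_lambda)
```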