I have some dataframes, generated with the following code:
from collections import defaultdict
import pandas as pd
mydict = { ('x305', 'BoxType1-1'): { 'box': 'x305', 'box#': '0', 'boxCode': 'Z8', 'version': '00.00' },
('x305', 'BoxType1-2'): { 'box': 'x305', 'box#': '0', 'boxCode': 'K8', 'version': '01.00' },
('x307', 'BoxType1-1'): { 'box': 'x307', 'box#': '0', 'boxCode': 'Z8', 'serialNo': 'None', 'version': '00.00' },
('x307', 'BoxType2-1'): { 'box': 'x307', 'box#': '0', 'boxCode': 'Z8', 'serialNo': 'None', 'version': '00.00' },
('x403', 'BoxType1-1'): { 'box': 'x403', 'box#': '0', 'boxCode': 'Z8', 'bla': 'None', 'version': '00.00' },
('x405', 'BoxType1-2'): { 'box': 'x405', 'box#': '0', 'boxCode': 'Z8', 'serialNo': 'None', 'version': '00.00' },
('x405', 'BoxType2-1'): { 'box': 'x405', 'box#': '0', 'boxCode': 'Z8', 'version': '00.00' },
('x510', 'BoxType1-3'): { 'box': 'x510', 'box#': '0', 'boxCode': 'Z8', 'version': '01.00' } }
boxTypes = [ 'BoxType1', 'BoxType2' ]
dataframes = defaultdict( set )
for boxType in boxTypes:
    dataframes[ boxType ] = pd.DataFrame.from_dict( { ( box, bt ): mydict[ ( box, bt ) ]
                                                      for box, bt in mydict.keys()
                                                      if boxType in bt },
                                                    orient='index' )
print( dataframes[ 'BoxType1' ] )
                  box  version  box#  boxCode   bla  serialNo
x305 BoxType1-1  x305    00.00     0       Z8   NaN       NaN
     BoxType1-2  x305    01.00     0       K8   NaN       NaN
x307 BoxType1-1  x307    00.00     0       Z8   NaN      None
x403 BoxType1-1  x403    00.00     0       Z8  None       NaN
x405 BoxType1-2  x405    00.00     0       Z8   NaN      None
x510 BoxType1-3  x510    01.00     0       Z8   NaN       NaN
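Now I am trying to find a way to count how often a particular tuple of row values occurs across the whole dataframe. For example, I would like a function like this: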
def countRowTuples( df, columns ):
    '''
    Count occurrences of row tuple in dataframe
    and return a new dataframe with a count column at the end
    '''
df2 = countRowTuples( dataframes['BoxType1'], columns=[ 'boxCode', 'bla', 'version' ] )
df2 =
                  box  version  box#  boxCode   bla  serialNo  count
x305 BoxType1-1  x305    00.00     0       Z8   NaN       NaN      3
     BoxType1-2  x305    01.00     0       K8   NaN       NaN      1
x307 BoxType1-1  x307    00.00     0       Z8   NaN      None      3
x403 BoxType1-1  x403    00.00     0       Z8  None       NaN      1
x405 BoxType1-2  x405    00.00     0       Z8   NaN      None      3
x510 BoxType1-3  x510    01.00     0       Z8   NaN       NaN      1
Alternatively, the function could drop the original index and the duplicate rows, and return a dataframe like this:
df2 =
   version  boxCode   bla  count
1    00.00       Z8   NaN      3
2    01.00       K8   NaN      1
3    00.00       Z8  None      1
4    01.00       Z8   NaN      1
Does anyone have any good insight into how I could solve this?
I tried the following, but it keeps returning an empty dataframe :(.
df = dataframes[ 'BoxType1' ]
print( df.groupby(df.columns.tolist()).size().reset_index().rename(columns={0:'count'}) )
Empty DataFrame
Columns: [box, version, box#, boxCode, bla, serialNo, count]
Index: []
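Answer 0 (score: 2)
Because of the null values, the group by does not perform the count you want: rows with NaN in any grouping column are dropped. Try this: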
dataframes['BoxType1'].fillna("NaN").groupby(["version", "boxCode", "bla"]).size().reset_index(name="count")
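To get the first output asked for in the question (the original rows with a count column appended), the same fill-then-group idea can be wrapped in a helper. This is only a sketch: count_row_tuples, its placeholder parameter, and the small stand-in dataframe are illustrative names and data, not part of the original answer, and it assumes the placeholder string does not already occur as a real value in the grouped columns.

```python
import numpy as np
import pandas as pd

# Small stand-in for dataframes['BoxType1'], keeping only the grouped columns.
df = pd.DataFrame(
    {
        "version": ["00.00", "01.00", "00.00", "00.00", "00.00", "01.00"],
        "boxCode": ["Z8", "K8", "Z8", "Z8", "Z8", "Z8"],
        "bla": [np.nan, np.nan, np.nan, "None", np.nan, np.nan],
    }
)


def count_row_tuples(df, columns, placeholder="NaN"):
    """Return a copy of df with a 'count' column holding how often each
    row's combination of values in `columns` occurs (hypothetical helper)."""
    # Replace missing values so groupby does not drop those rows.
    keys = df[columns].fillna(placeholder)
    # transform("size") broadcasts each group's size back to every row.
    counts = keys.groupby(columns)[columns[0]].transform("size")
    out = df.copy()
    # Assign by position so the original index is left untouched.
    out["count"] = counts.to_numpy()
    return out


result = count_row_tuples(df, ["boxCode", "bla", "version"])
print(result)
```

Newer pandas versions (1.1 and later) also accept dropna=False in groupby, which keeps NaN keys without needing a placeholder.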
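Answer 1 (score: 0)
One approach is to add another column filled with ones, group by the fields for which you want distinct values (you need to fill those NaNs with some value first), and sum the added column.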
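A minimal sketch of that approach, under a few assumptions: the stand-in dataframe below only mimics the relevant columns of dataframes['BoxType1'], and the helper column name count and the fill value "NaN" are arbitrary choices made for illustration.

```python
import numpy as np
import pandas as pd

# Small stand-in for dataframes['BoxType1'], keeping only the grouped columns.
df = pd.DataFrame(
    {
        "version": ["00.00", "01.00", "00.00", "00.00", "00.00", "01.00"],
        "boxCode": ["Z8", "K8", "Z8", "Z8", "Z8", "Z8"],
        "bla": [np.nan, np.nan, np.nan, "None", np.nan, np.nan],
    }
)

columns = ["version", "boxCode", "bla"]

summary = (
    df[columns]
    .fillna("NaN")      # give missing values a groupable placeholder
    .assign(count=1)    # helper column of ones
    .groupby(columns, as_index=False)["count"]
    .sum()              # summing the ones yields the number of rows per group
)
print(summary)
```

This produces the second form asked for in the question: one row per distinct combination, without the original index.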