Question

I have a DataFrame (df):

                 id                       company          sector currency           price 
0     BBG.MTAA.MS.S                  MEDIASET SPA  Communications      EUR        4.334000
1    BBG.MTAA.TIT.S            TELECOM ITALIA SPA  Communications      EUR        1.091000    
2    BBG.XETR.DTE.S       DEUTSCHE TELEKOM AG-REG  Communications      EUR       15.460000   
3   BBG.XLON.BARC.S                  BARCLAYS PLC       Financial      GBp        3.414498    
4    BBG.XLON.BTA.S                  BT GROUP PLC  Communications      GBp        5.749122    
5   BBG.XLON.HSBA.S             HSBC HOLDINGS PLC       Financial      GBp        6.716041    
6   BBG.XLON.LLOY.S      LLOYDS BANKING GROUP PLC       Financial      GBp        1.027752    
7   BBG.XLON.STAN.S        STANDARD CHARTERED PLC       Financial      GBp        9.707300    
8   BBG.XLON.TRIL.S        THOMSON REUTERS UK LTD  Communications      GBp             NaN         
9    BBG.XLON.VOD.S            VODAFONE GROUP PLC  Communications      GBp        3.035487    
10  BBG.XMCE.BBVA.S  BANCO BILBAO VIZCAYA ARGENTA       Financial      EUR        7.866000

I can create a pivot table on the sector field to find out how many companies are in each sector, using the following code:

sectorPivot = df.pivot_table(index=['sector'], aggfunc='count')

It looks like this:

                currency  id     company
sector                            
Communications         6   6           6
Financial              5   5           5

However, I would like to filter out the companies whose price is NaN, so that I end up with a pivot table that looks like this:

                currency  id     company
sector                            
Communications         5   5           5
Financial              5   5           5

(Note that the count for the Communications sector drops from 6 to 5 because one of the stocks in that sector has a NaN price.)

Can someone let me know how I can do this?

Many thanks

Answer 1

Use dropna(subset=['price']) before your pivot. It removes every row whose price is NaN, so those companies are no longer counted.

df.dropna(subset=['price']).pivot_table(index=['sector'], aggfunc='count')
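To see the effect end to end, here is a minimal sketch on four of the rows from the question (a reduced sample, not the full data). The variable names sector_pivot and price_counts are only illustrative. The groupby line is an alternative I am assuming may also suit you, since count() skips NaN values, but it returns a single column of counts rather than one column per field.

```python
import numpy as np
import pandas as pd

# Reduced sample taken from the question's table (4 of the 11 rows).
df = pd.DataFrame({
    "id": ["BBG.XETR.DTE.S", "BBG.XLON.TRIL.S", "BBG.XLON.VOD.S", "BBG.XLON.BARC.S"],
    "company": ["DEUTSCHE TELEKOM AG-REG", "THOMSON REUTERS UK LTD",
                "VODAFONE GROUP PLC", "BARCLAYS PLC"],
    "sector": ["Communications", "Communications", "Communications", "Financial"],
    "currency": ["EUR", "GBp", "GBp", "GBp"],
    "price": [15.46, np.nan, 3.035487, 3.414498],
})

# Drop the rows whose price is NaN first, then count per sector.
sector_pivot = df.dropna(subset=["price"]).pivot_table(index=["sector"], aggfunc="count")
print(sector_pivot)

# Alternative sketch: count() ignores NaN, so this counts priced companies per sector.
price_counts = df.groupby("sector")["price"].count()
print(price_counts)
```

With this sample, THOMSON REUTERS UK LTD is dropped, so Communications is counted as 2 and Financial as 1. Note that dropna returns a new DataFrame and leaves df itself unchanged.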
