Crosstab of one column where a third column matches

Time: 2018-12-17 17:19:58

Tags: python python-3.x pandas dataframe crosstab

I am trying to build a crosstab of one column, pairing rows wherever a third column matches. Take this sample data:

import pandas as pd

df = pd.DataFrame({'demographic': ['A', 'B', 'B', 'A', 'C', 'C'],
                   'id_match': ['101', '101', '201', '201', '26', '26'],
                   'time': ['10', '10', '16', '16', '1', '1']})

Where id_match matches, I want the sum of time for a crosstab of the demographic column against itself. The output would look like this:

  A  B  C
A 0  52 0
B 52 0  0
C 0  0  2

Hopefully this makes sense; if not, please leave a comment. Thanks, J

1 answer:

Answer 0: (score: 1)

You can solve this with merge and crosstab:

u = df.reset_index()
v = u.merge(u, on='id_match').query('index_x != index_y')
r = pd.crosstab(v.demographic_x, v.demographic_y, v.time_x.astype(int) + v.time_y.astype(int), aggfunc='sum')
print(r)

demographic_y     A     B    C
demographic_x
A               NaN  52.0  NaN
B              52.0   NaN  NaN
C               NaN   NaN  4.0

If you need the NaNs filled with zeros, you can use fillna.
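The answer names fillna for this step but does not show the call. The following is a minimal sketch that rebuilds the same crosstab and fills the gaps. The names `total` and `filled`, and the cast back to integers with `astype(int)`, are assumptions added for illustration and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'demographic': ['A', 'B', 'B', 'A', 'C', 'C'],
                   'id_match': ['101', '101', '201', '201', '26', '26'],
                   'time': ['10', '10', '16', '16', '1', '1']})

# Keep the original row number so a row is never paired with itself.
u = df.reset_index()

# Self-merge on id_match: every row is paired with every row sharing its id.
v = u.merge(u, on='id_match').query('index_x != index_y')

# time is stored as strings, so convert before adding the two sides of a pair.
total = v.time_x.astype(int) + v.time_y.astype(int)

r = pd.crosstab(v.demographic_x, v.demographic_y, values=total, aggfunc='sum')

# Replace missing combinations with 0 and cast back to integers (assumed step).
filled = r.fillna(0).astype(int)
print(filled)
```

Because the self-merge keeps both orderings of each pair, the C/C cell counts the pair with id_match '26' twice, which is why it comes out as 4 rather than the 2 shown in the question.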