What is a good way to visualize common string values from multiple dataframe columns?

Time: 2019-06-28 04:27:42

Tags: python pandas matplotlib data-visualization

I have multiple dataframes (one per city), each with a "Name" column holding the names of the organizations in that city.

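How can I visualize the names common to each pair of cities, and the names common to all cities, in a way that is easy to understand?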

Example:

  df1            df2

  Name           Name       
'Apollo'        'Kims'
'MedWorks'      'AIMs'
'Cradle'        'Apollo'
'Kims'          'Bronte Co'
'Collins'       'Cradle'

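Each city has more than 10 values (names). I am not sure whether Venn diagrams can work with string values, and even if they can, they would not fit all the data in a readable format.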

I tried this as suggested, but I got:

TypeError: unsupported operand type(s) for -: 'str' and 'str'

1 answer:

Answer 0: (score: 2)

Use matplotlib_venn:

import matplotlib.pyplot as plt
import pandas as pd
from matplotlib_venn import venn2

# df1 and df2 are the per-city dataframes from the question
set1 = set(df1['Name'])
set2 = set(df2['Name'])

venn = venn2([set1, set2])
# Replace the default region counts with the actual names in each region
venn.get_label_by_id('100').set_text('\n'.join(map(str, set1 - set2)))  # only in df1
venn.get_label_by_id('110').set_text('\n'.join(map(str, set1 & set2)))  # in both
venn.get_label_by_id('010').set_text('\n'.join(map(str, set2 - set1)))  # only in df2
# venn.get_label is quoted from https://stackoverflow.com/questions/55717203/plot-actual-set-items-in-python-not-the-number-of-items
plt.show()

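
With more than two cities, or with many names per city, a Venn diagram can get crowded. As a complement to the answer above, here is a minimal sketch that computes the pairwise and all-city overlaps with plain Python sets. The city keys and the third city's names are hypothetical and invented for illustration; in practice each set would presumably come from `set(df['Name'])` for that city's dataframe.

```python
from itertools import combinations

# Hypothetical stand-ins for set(df['Name']) per city.
# city1 and city2 mirror the question's example; city3 is invented.
name_sets = {
    "city1": {"Apollo", "MedWorks", "Cradle", "Kims", "Collins"},
    "city2": {"Kims", "AIMs", "Apollo", "Bronte Co", "Cradle"},
    "city3": {"Apollo", "Collins", "AIMs"},
}

# Names shared by each pair of cities.
pairwise = {
    (a, b): sorted(name_sets[a] & name_sets[b])
    for a, b in combinations(name_sets, 2)
}

# Names present in every city.
common_to_all = sorted(set.intersection(*name_sets.values()))

# Membership table: for each name, the list of cities it appears in.
all_names = sorted(set.union(*name_sets.values()))
membership = {
    name: [city for city, names in name_sets.items() if name in names]
    for name in all_names
}

for pair, names in pairwise.items():
    print(pair, names)
print("all cities:", common_to_all)
```

The `membership` mapping is one possible starting point for a name-by-city grid, which may stay readable in cases where the region labels of a Venn diagram would overflow.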