Suppose you have:
import pandas as pd
x = pd.Series(["A", "B", "A", "A", None, "B", "A", None], dtype="category")
y = pd.Series([1, 2, 3, None, 1, 2, 3, 2])
If you call pd.crosstab(x, y, dropna=False), you get:
col_0  1.0  2.0  3.0
row_0
A        1    0    2
B        0    2    0
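This omits the three (x, y) pairs in which one of the values is null. (The dropna parameter is misleadingly named.) How can I create a contingency table that includes these values, like the one below?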
col_0  1.0  2.0  3.0  NaN
row_0
A        1    0    2    1
B        0    2    0    0
NaN      1    1    0    0
Answer 0 (score: 1)
Would converting the NaN values to strings work?
import numpy as np
pd.crosstab(x.replace(np.nan, 'NaN'), y.replace(np.nan, 'NaN'), dropna=False)
Result:
col_0  1.0  2.0  3.0  NaN
row_0
A        1    0    2    1
B        0    2    0    0
NaN      1    1    0    0
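The same idea can also be written so that it does not depend on how replace treats a categorical Series. This is only a minimal sketch, not the code from the answer above: it assumes string labels are acceptable on both axes, and the name MISSING and its value are hypothetical choices made for illustration.

```python
import pandas as pd

x = pd.Series(["A", "B", "A", "A", None, "B", "A", None], dtype="category")
y = pd.Series([1, 2, 3, None, 1, 2, 3, 2])

# Hypothetical placeholder label for missing entries (an assumption, pick any
# string that cannot collide with a real value).
MISSING = "missing"

# Convert the categorical to plain objects first, so a label that is not
# among the original categories can be filled in.
x_labels = x.astype(object).fillna(MISSING)

# Turn every y value into a string label; nulls get the placeholder.
y_labels = y.map(lambda v: MISSING if pd.isna(v) else str(v))

# With no nulls left, crosstab counts every (x, y) pair.
table = pd.crosstab(x_labels, y_labels)
print(table)
```

The trade-off is that the column labels are now strings such as "1.0" rather than floats, so convert them back if the table is used for further numeric indexing.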