How to create tuples for each cell from two pandas dataframes?

Time: 2019-10-22 02:40:33

Tags: python pandas

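I want to compare the summaries (`describe()` output) of two pandas dataframes. One idea is to build a tuple for each cell from the two dataframes and inspect the values side by side, but I am struggling to do that.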

Setup

import numpy as np
import pandas as pd
import seaborn as sns

df = sns.load_dataset('iris').iloc[:,:-1]
df1 = df.describe().T
df2 = df.sample(50).describe().T

Output

df1
              count      mean       std  min  25%   50%  75%  max
sepal_length  150.0  5.843333  0.828066  4.3  5.1  5.80  6.4  7.9
sepal_width   150.0  3.057333  0.435866  2.0  2.8  3.00  3.3  4.4
petal_length  150.0  3.758000  1.765298  1.0  1.6  4.35  5.1  6.9
petal_width   150.0  1.199333  0.762238  0.1  0.3  1.30  1.8  2.5

df2
              count   mean       std  min    25%   50%    75%  max
sepal_length   50.0  5.884  0.804924  4.4  5.100  5.85  6.475  7.9
sepal_width    50.0  3.086  0.452661  2.2  2.825  3.00  3.375  4.4
petal_length   50.0  3.842  1.761967  1.2  1.600  4.60  5.100  6.9
petal_width    50.0  1.256  0.773320  0.1  0.400  1.40  1.975  2.4

Required:

Tuples like this, and so on:
              count   mean       std  min    25%   50%    75%  max
sepal_length   (50.0,150.0)    
sepal_width    
petal_length   
petal_width    tuples for all the cells.

Question

I would also greatly appreciate other ways of comparing these two dataframes, such as plotting.

1 answer:

Answer 0: (score: 1)

You can do it like this:

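# Each cell becomes (df2 value, df1 value), i.e. (sample, full data), rounded to 2 decimals
data = [[(round(v2, 2), round(v1, 2)) for v1, v2 in zip(df1[c], df2[c])]
        for c in df1.columns]

# data holds one row per statistic, so build the frame that way and transpose it back
comparisons = pd.DataFrame(data, columns=df1.index, index=df1.columns).T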
comparisons
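A possible variation on the same idea, sketched with a small made-up frame instead of the iris data so that it runs without seaborn: zip the two columns inside a dict comprehension, which avoids the final transpose. The helper name `pair_cells` and the toy data are assumptions for illustration only, not part of the answer above.

```python
import pandas as pd


def pair_cells(left, right, ndigits=2):
    """Return a frame whose cells are (left, right) tuples.

    Hypothetical helper: it assumes both frames have the same columns
    and the same row order, because values are paired by position.
    """
    return pd.DataFrame(
        {
            c: list(zip(left[c].round(ndigits).tolist(),
                        right[c].round(ndigits).tolist()))
            for c in left.columns
        },
        index=left.index,
    )


# Toy stand-ins for df1 (all rows) and df2 (a subset); the values are invented
full = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0], "b": [10.0, 20.0, 30.0, 40.0]})
df1 = full.describe().T
df2 = full.iloc[:2].describe().T

# (subset, full) order, matching the layout asked for in the question
comparisons = pair_cells(df2, df1)
print(comparisons)
```

Calling `.tolist()` before zipping converts the values to plain Python floats, so the tuples print without NumPy type wrappers.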

Comparing means and medians

import numpy as np
import pandas as pd
import seaborn as sns


df = sns.load_dataset('iris').iloc[:,:-1]

df1 = df.describe().T
df2 = df.sample(50,random_state=100).describe().T

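# Put the two summaries side by side; the '_1' suffix marks the full-data columns
summary = pd.concat([df1.rename(columns=lambda x: x + '_1'), df2], axis=1)

# Highlight the smaller of each mean pair and of each median pair, row by row
(summary[['mean_1', 'mean', '50%_1', '50%']]
 .style.highlight_min(subset=['mean_1', 'mean'], axis=1, color='gray')
 .highlight_min(subset=['50%_1', '50%'], axis=1, color='tomato'))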

The tuple-building code (`comparisons`) gives:

                      count          mean           std         min  \
sepal_length  (50.0, 150.0)  (5.88, 5.84)   (0.8, 0.83)  (4.4, 4.3)   
sepal_width   (50.0, 150.0)  (3.09, 3.06)  (0.45, 0.44)  (2.2, 2.0)   
petal_length  (50.0, 150.0)  (3.84, 3.76)  (1.76, 1.77)  (1.2, 1.0)   
petal_width   (50.0, 150.0)   (1.26, 1.2)  (0.77, 0.76)  (0.1, 0.1)   

                      25%          50%          75%         max  
sepal_length   (5.1, 5.1)  (5.85, 5.8)  (6.47, 6.4)  (7.9, 7.9)  
sepal_width   (2.82, 2.8)   (3.0, 3.0)  (3.38, 3.3)  (4.4, 4.4)  
petal_length   (1.6, 1.6)  (4.6, 4.35)   (5.1, 5.1)  (6.9, 6.9)  
petal_width    (0.4, 0.3)   (1.4, 1.3)  (1.98, 1.8)  (2.4, 2.5)
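If tuples are not strictly required, one alternative sketch keeps the two summaries numeric by placing them side by side under a two-level column index, which makes differences easy to compute. The labels `full` and `sample` and the toy data below are assumptions for illustration. With the iris data, `df1` and `df2` from the setup would take their place.

```python
import pandas as pd

# Toy stand-ins for df1 (all rows) and df2 (a subset); the values are invented
full = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0], "b": [10.0, 20.0, 30.0, 40.0]})
df1 = full.describe().T
df2 = full.iloc[:2].describe().T

# Label each summary with an outer column level, then move the statistic
# name to the outer level so the two versions of each statistic sit together
side_by_side = (
    pd.concat([df1, df2], axis=1, keys=["full", "sample"])
    .swaplevel(axis=1)
    .sort_index(axis=1)
)

# Because the cells stay numeric, the gap between the summaries is a subtraction
diff = df2 - df1

print(side_by_side[["mean", "50%"]])
print(diff[["mean", "50%"]])
```

Unlike a frame of tuples, this layout still supports sorting, arithmetic, and plotting on the individual values.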