Get distinct column values and union the DataFrames

Time: 2016-01-19 14:18:28

Tags: python pandas unique distinct concat

I am trying to convert the following SQL statement to pandas:

    SELECT DISTINCT table1.[Name], table1.[Phno] FROM table1
    UNION
    SELECT DISTINCT table2.[Name], table2.[Phno] FROM table2
    UNION
    SELECT DISTINCT table3.[Name], table3.[Phno] FROM table3;

I now have three DataFrames: table1, table2, and table3.

table1   
     Name        Phno
0  Andrew  6175083617
1  Andrew  6175083617
2   Frank  7825942358
3   Jerry  3549856785
4     Liu  9659875695
table2
     Name        Phno
0   Sandy  7859864125
1  Nikhil  9526412563
2   Sandy  7859864125
3    Tina  7459681245
4   Surat  9637458725
table3
     Name        Phno
0   Patel  9128257489
1    Mary  3679871478
2  Sandra  9871359654
3    Mary  3679871478
4    Hali  9835167465

Now I need to get the distinct rows of these DataFrames, union them, and end up with this output:

sample output
      Name        Phno
0   Andrew  6175083617
1    Frank  7825942358
2    Jerry  3549856785
3      Liu  9659875695
4    Sandy  7859864125
5   Nikhil  9526412563
6     Tina  7459681245
7    Surat  9637458725
8    Patel  9128257489
9     Mary  3679871478
10  Sandra  9871359654
11    Hali  9835167465

I tried to get the unique values of one DataFrame, table1, as follows:

table1_unique = pd.unique(table1.values.ravel()) #which gives me 
table1_unique
array(['Andrew', 6175083617L, 'Frank', 7825942358L, 'Jerry', 3549856785L,
   'Liu', 9659875695L], dtype=object)

But this gives me a flat array. I even tried converting it back into a DataFrame:

table1_unique1 = pd.DataFrame(table1_unique)
table1_unique1
            0
0      Andrew
1  6175083617
2       Frank
3  7825942358
4       Jerry
5  3549856785
6         Liu 
7  9659875695

How can I get the unique rows as a DataFrame, so that I can concatenate them to match my sample output? I hope this is clear. Thanks!

1 answer:

Answer 0: (score: 1)

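import pandas as pd

# Keep the two columns of interest and drop repeated rows within each table.
a = table1[['Name', 'Phno']].drop_duplicates()
b = table2[['Name', 'Phno']].drop_duplicates()
c = table3[['Name', 'Phno']].drop_duplicates()
# Stack the results; ignore_index=True renumbers the rows from 0.
result = pd.concat([a, b, c], ignore_index=True)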
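
To expand on why this works: `pd.unique(table1.values.ravel())` flattens the table into a single 1-D array, so the pairing between each Name and its Phno is lost. `drop_duplicates()` compares whole rows and returns a DataFrame. The sketch below is self-contained; it rebuilds the three tables from the question by hand (that literal construction is an assumption for illustration). It calls `drop_duplicates()` once, after `pd.concat`, which should also remove any row repeated across tables, as SQL `UNION` does.

```python
import pandas as pd

# Tables rebuilt by hand from the question's printed data (illustrative only).
table1 = pd.DataFrame({
    "Name": ["Andrew", "Andrew", "Frank", "Jerry", "Liu"],
    "Phno": [6175083617, 6175083617, 7825942358, 3549856785, 9659875695],
})
table2 = pd.DataFrame({
    "Name": ["Sandy", "Nikhil", "Sandy", "Tina", "Surat"],
    "Phno": [7859864125, 9526412563, 7859864125, 7459681245, 9637458725],
})
table3 = pd.DataFrame({
    "Name": ["Patel", "Mary", "Sandra", "Mary", "Hali"],
    "Phno": [9128257489, 3679871478, 9871359654, 3679871478, 9835167465],
})

frames = [table1, table2, table3]

result = (
    pd.concat([df[["Name", "Phno"]] for df in frames])  # stack all rows
    .drop_duplicates()                                   # keep first copy of each (Name, Phno) row
    .reset_index(drop=True)                              # renumber rows from 0
)

print(result)
```

Deduplicating once after the concat is a design choice rather than a requirement. Deduplicating each table first, as in the answer above, gives the same result whenever no row appears in more than one table.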