比较一个数据框中的手机号码和另一数据框中的手机号码

时间:2018-10-18 08:59:53

标签: python pandas numpy dataframe

我的目的是导入两个excel文件,一个具有我的电话号码历史记录,另一个具有一些工作号码。

我想将工作号码与电话号码历史记录中的电话号码进行比较,并在日期和关联的电话通话时间中存储一个新的矩阵。

目前我正在按照以下说明手动进行操作,有人可以帮我吗?

谢谢大家。

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt; plt.rcdefaults()
import os 
clear = lambda:os.system('cls')
clear()

xls = pd.ExcelFile("C:\ - location")
df1 = pd.read_excel(xls, 'RawData', dtype=    {'Date':np.datetime64, 'Type':str}, header=None)
df2 = pd.read_excel(xls, 'WorkNumbers',0)

dR = df1.as_matrix()
dWN = df2.as_matrix()

Ewen = df1[(df1['Number'] == #mobile number#)]
Alex = df1[(df1['Number'] == #mobile number#)]
Nirmal = df1[(df1['Number'] == #mobile number#)]
Chris = df1[(df1['Number'] == #mobile number#)]
ChrisM = df1[(df1['Number'] == #mobile number#)]
Austofix = df1[(df1['Number'] == #mobile number#)]
Simon = df1[(df1['Number'] == #mobile number#)]
Tony = df1[(df1['Number'] == #mobile number#)]

Trial = [Ewen, Alex, Nirmal, Chris, ChrisM, Austofix, Simon, Tony]
enter code heredf3 = pd.concat(Trial)

Photo of Matrix / DataFrames Goal

数据示例:

df1:


Date            Type            Number          Duration
03/10/18    National Mobile 8156665498  4.00
03/10/18    National Mobile 8156665499  27.00
03/10/18    National Mobile 8156665500  21.00
02/10/18    National Mobile 8156665501  47.00
02/10/18    National Mobile 45687823456 47.00
02/10/18    National Mobile 45687823457 35.00
02/10/18    National Mobile 45687823458 55.00
30/09/18    National Mobile 45687823459 1.00
30/09/18    National Mobile 45687823460 41.00
30/09/18    CallForward to VoiceMail    8156665507  1.00
30/09/18    National Mobile 8156665508  3.00
29/09/18    National Mobile 8156665509  16.00
29/09/18    National Mobile 8156665510  2.00
29/09/18    National Mobile 8156665511  3.00
29/09/18    National Mobile 8156665512  2.00
28/09/18    13nnnn  8156665513  14.00
28/09/18    National Mobile 8156665514  25.00


df2:

WNumber name
45687823456 Ewen
45687823457 alex
45687823458 nirmal
45687823459 chris
45687823460 chris m

2 个答案:

答案 0 :(得分:0)

编辑:根据示例数据添加了左/右键。

这有效吗?使用pd.merge

df1 = pd.DataFrame([['04123', '60mins'], ['04723', '30mins'], ['04568', '10mins']], columns=['Number', 'Duration'])
df2 = pd.DataFrame([['Bob', '04123'], ['James', '04723'], ['Someone', '04567']], columns=['Name', 'WNumber'])

df_result = pd.merge(df2, df1, left_on=['WNumber'], right_on=['Number'])

它产生

    Name WNumber Number Duration
0    Bob   04123  04123   60mins
1  James   04723  04723   30mins

可以删除重复的列。

答案 1 :(得分:0)

我在Antoine下进行了此操作,它吐出了KeyError:'Number',这是我的excel列标题之一?有什么想法吗?

xls = pd.ExcelFile("C:\Users...… MobileData.xlsx")
df1 = pd.read_excel(xls, 'RawData',0)
df2 = pd.read_excel(xls, 'WorkNumbers',0)

df_result = pd.merge(df2, df1, on=['Number'])