我的目的是导入两个excel文件,一个具有我的电话号码历史记录,另一个具有一些工作号码。
我想将工作号码与电话号码历史记录中的电话号码进行比较,并在日期和关联的电话通话时间中存储一个新的矩阵。
目前我正在按照以下说明手动进行操作,有人可以帮我吗?
谢谢大家。
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt; plt.rcdefaults()
import os
clear = lambda:os.system('cls')
clear()
xls = pd.ExcelFile("C:\ - location")
df1 = pd.read_excel(xls, 'RawData', dtype= {'Date':np.datetime64, 'Type':str}, header=None)
df2 = pd.read_excel(xls, 'WorkNumbers',0)
dR = df1.as_matrix()
dWN = df2.as_matrix()
Ewen = df1[(df1['Number'] == #mobile number#)]
Alex = df1[(df1['Number'] == #mobile number#)]
Nirmal = df1[(df1['Number'] == #mobile number#)]
Chris = df1[(df1['Number'] == #mobile number#)]
ChrisM = df1[(df1['Number'] == #mobile number#)]
Austofix = df1[(df1['Number'] == #mobile number#)]
Simon = df1[(df1['Number'] == #mobile number#)]
Tony = df1[(df1['Number'] == #mobile number#)]
Trial = [Ewen, Alex, Nirmal, Chris, ChrisM, Austofix, Simon, Tony]
enter code heredf3 = pd.concat(Trial)
Photo of Matrix / DataFrames Goal
数据示例:
df1:
Date Type Number Duration
03/10/18 National Mobile 8156665498 4.00
03/10/18 National Mobile 8156665499 27.00
03/10/18 National Mobile 8156665500 21.00
02/10/18 National Mobile 8156665501 47.00
02/10/18 National Mobile 45687823456 47.00
02/10/18 National Mobile 45687823457 35.00
02/10/18 National Mobile 45687823458 55.00
30/09/18 National Mobile 45687823459 1.00
30/09/18 National Mobile 45687823460 41.00
30/09/18 CallForward to VoiceMail 8156665507 1.00
30/09/18 National Mobile 8156665508 3.00
29/09/18 National Mobile 8156665509 16.00
29/09/18 National Mobile 8156665510 2.00
29/09/18 National Mobile 8156665511 3.00
29/09/18 National Mobile 8156665512 2.00
28/09/18 13nnnn 8156665513 14.00
28/09/18 National Mobile 8156665514 25.00
df2:
WNumber name
45687823456 Ewen
45687823457 alex
45687823458 nirmal
45687823459 chris
45687823460 chris m
答案 0 :(得分:0)
编辑:根据示例数据添加了左/右键。
这有效吗?使用pd.merge。
df1 = pd.DataFrame([['04123', '60mins'], ['04723', '30mins'], ['04568', '10mins']], columns=['Number', 'Duration'])
df2 = pd.DataFrame([['Bob', '04123'], ['James', '04723'], ['Someone', '04567']], columns=['Name', 'WNumber'])
df_result = pd.merge(df2, df1, left_on=['WNumber'], right_on=['Number'])
它产生
Name WNumber Number Duration
0 Bob 04123 04123 60mins
1 James 04723 04723 30mins
可以删除重复的列。
答案 1 :(得分:0)
我在Antoine下进行了此操作,它吐出了KeyError:'Number',这是我的excel列标题之一?有什么想法吗?
xls = pd.ExcelFile("C:\Users...… MobileData.xlsx")
df1 = pd.read_excel(xls, 'RawData',0)
df2 = pd.read_excel(xls, 'WorkNumbers',0)
df_result = pd.merge(df2, df1, on=['Number'])