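I am parsing two different JSON files and writing the data to two Excel files. I then merge the data from the two Excel files on a column. But when I try to perform a groupby, it drops two columns. Here is the sample output: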
Ep_sg_id Ep_ip Ep_netmask Uuid \
0 36bc01bf 10.202.221.133 255.255.255.255 NaN
1 36bc01bf 10.202.220.141 255.255.255.255 NaN
2 cf564ff3 17.39.68.0 255.255.255.128 NaN
3 001d2bd5 17.176.253.64 255.255.255.192 001d2bd5
4 NaN NaN NaN 0448d01f
5 NaN NaN NaN 0d928eff
6 NaN NaN NaN 06306991
7 NaN NaN NaN 11003dc5
8 NaN NaN NaN 0a7509ea
Name
0 NaN
1 NaN
2 NaN
3 VIP
4 ADMIN_HOSTS
5 DB-EXTERNAL
6 CORP
7 POD1-DB
8 UAT
Ep_sg_id Ep_ip Ep_netmask
0 36bc01bf 10.202.221.133 255.255.255.255
1 36bc01bf 10.202.220.141 255.255.255.255
2 cf564ff3 17.39.68.0 255.255.255.128
3 001d2bd5 17.176.253.64 255.255.255.192
Uuid Name
0 001d2bd5 VIP
1 0448d01f ADMIN_HOSTS
2 0d928eff DB-EXTERNAL
3 06306991 CORP
4 11003dc5 POD1-DB
5 0a7509ea UAT
Ep_ip Ep_netmask
Ep_sg_id
001d2bd5 17.176.253.64 255.255.255.192
36bc01bf 10.202.221.133,10.202.220.141 255.255.255.255,255.255.255.255
cf564ff3 17.39.68.0 255.255.255.128
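The first is the combined data of both. The second and third are the individual dataframes. The last one is the result after I perform the groupby. Uuid and Name are gone. I don't know how to override the nuisance-column exclusion behaviour.
Here is my code: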
#!/usr/bin/python
# -*- coding: utf-8 -*-
import xlwt
import json
from xlutils.copy import copy
import xlrd
import pandas as pd
import numpy as np
with open('ep1.txt', 'r') as f:
    js = json.loads(f.read())
with open('sc1.txt', 'r') as f1:
    js2 = json.loads(f1.read())
book = xlwt.Workbook(encoding="utf-8")
book1 = xlwt.Workbook(encoding="utf-8")
sheet1 = book.add_sheet("Sheet 1", cell_overwrite_ok=True)
sheet2 = book1.add_sheet("Sheet 1", cell_overwrite_ok=True)
sheet1.write(0, 0, 'Ep_sg_id')
sheet1.write(0, 1, 'Ep_ip')
sheet1.write(0, 2, 'Ep_netmask')
sheet2.write(0, 0, 'Uuid')
sheet2.write(0, 1, 'Name')
p = 1
for i, j in js.items():
    sg_id = js[i]['Ep_sg_id']
    ip = js[i]['Ep_ip']
    netmask = js[i]['Ep_netmask']
    sheet1.write(p, 0, sg_id)
    sheet1.write(p, 1, ip)
    sheet1.write(p, 2, netmask)
    p = p + 1
q = 1
for i, j in js2.items():
    uuid = js2[i]['Sg']['Uuid']
    name = js2[i]['Sg']['Name']
    sheet2.write(q, 0, uuid)
    sheet2.write(q, 1, name)
    q = q + 1
book.save('new.xls')
book1.save('new1.xls')
df = pd.read_excel('new.xls')
df1 = pd.read_excel('new1.xls')
mergedDf = df.merge(df1, how='outer', left_on='Ep_sg_id', right_on='Uuid')
print mergedDf
mergedDf['Uuid'] = mergedDf['Uuid'].replace("", np.nan)
mergedDf['Name'] = mergedDf['Name'].replace("", np.nan)
mergedDf = mergedDf.groupby('Ep_sg_id').agg(','.join)
print df
print
print df1
print
print mergedDf
mergedDf.to_excel('final_excel.xls', index=False)
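Answer 0 (score: 0)
Automatic exclusion of nuisance columns is the default behaviour, so you can make a copy of those columns from the dataframe first, like:
extra = mergedDf[['Ep_sg_id', 'Uuid', 'Name']].copy()
Then perform the groupby as you do at the moment:
mergedDf = mergedDf.groupby('Ep_sg_id').agg(','.join)
and then finally merge the dataframes.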
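The approach in this answer can be sketched end to end as follows. This is a minimal sketch rather than the answerer's exact code: the two input frames are rebuilt inline from the sample output in the question instead of being read from new.xls and new1.xls, and the names merged, grouped and final are made up for illustration. The dropna/drop_duplicates step and the explicit column selection before agg are assumptions added here. The first keeps the final merge from repeating rows, and the second avoids relying on automatic nuisance-column exclusion, which may not be available in every pandas version.

```python
import pandas as pd

# Sample data rebuilt from the question's printed output (assumption:
# in the real script these come from new.xls and new1.xls).
df = pd.DataFrame({
    "Ep_sg_id": ["36bc01bf", "36bc01bf", "cf564ff3", "001d2bd5"],
    "Ep_ip": ["10.202.221.133", "10.202.220.141", "17.39.68.0", "17.176.253.64"],
    "Ep_netmask": ["255.255.255.255", "255.255.255.255",
                   "255.255.255.128", "255.255.255.192"],
})
df1 = pd.DataFrame({
    "Uuid": ["001d2bd5", "0448d01f", "0d928eff", "06306991", "11003dc5", "0a7509ea"],
    "Name": ["VIP", "ADMIN_HOSTS", "DB-EXTERNAL", "CORP", "POD1-DB", "UAT"],
})

merged = df.merge(df1, how="outer", left_on="Ep_sg_id", right_on="Uuid")

# Step 1: set aside the columns that ','.join cannot handle (they contain NaN).
# Keeping one row per Ep_sg_id is an added assumption so the merge in step 3
# does not duplicate rows.
extra = (
    merged[["Ep_sg_id", "Uuid", "Name"]]
    .dropna(subset=["Ep_sg_id"])
    .drop_duplicates(subset="Ep_sg_id")
)

# Step 2: group only the string columns and join their values per group.
grouped = (
    merged.groupby("Ep_sg_id")[["Ep_ip", "Ep_netmask"]]
    .agg(",".join)
    .reset_index()
)

# Step 3: merge the saved columns back onto the grouped result.
final = grouped.merge(extra, on="Ep_sg_id", how="left")
print(final)
```

Rows whose Ep_sg_id is NaN (the Uuid entries with no matching endpoint) do not appear in the result, because groupby skips NaN keys by default, as the question's last output already shows. If those rows are needed, they would have to be appended separately.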