Generating an Excel report of historical data with Python pandas

Time: 2018-01-30 03:54:40

Tags: python pandas dataframe

I want to generate an Excel report using Python pandas. I have customer JSON data like the following, where "id" is unique.

customer_day1 = [{"id": "1", "name": "John", "ip": "10.1.1.1"},
                 {"id": "2", "name": "Peter", "ip": "10.1.1.2"}]
customer_day2 = [{"id": "1", "name": "John", "ip": "10.1.1.10"},
                 {"id": "3", "name": "Nancy", "ip": "10.1.1.3"}]

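I want to generate an Excel report with the following details:

  1. Highlight rows for new customers
  2. Highlight deleted customers
  3. Highlight customers whose values changed between the two dates
  4. Identify the differences between the two days' data and generate a report containing the details above.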

1 answer:

Answer 0: (score: 1)

I was able to find the differences using pandas DataFrames. Reference: http://pbpython.com/excel-diff-pandas.html

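from io import StringIO

import pandas as pd

def report_diff(x):
    # x is a pair (day-1 value, day-2 value) for a single cell
    return x[0] if x[0] == x[1] else '{} ---> {}'.format(*x)

def has_change(row):
    # Flag a row "Y" if any of its cells contains the change marker
    if "--->" in row.to_string():
        return "Y"
    else:
        return "N"

customer_day1 = '[{"id": "1","name": "John","ip": "10.1.1.1"},{"id":"2", "name":"Peter", "ip": "10.1.1.2"}]'
customer_day2 = '[{"id": "1","name": "John","ip": "10.1.1.10"},{"id": "3", "name":"Nancy", "ip": "10.1.1.3"}]'

# Recent pandas versions expect a file-like object rather than a literal JSON string
df1 = pd.read_json(StringIO(customer_day1))
df2 = pd.read_json(StringIO(customer_day2))
df1.set_index('id', inplace=True)
df2.set_index('id', inplace=True)

# pd.Panel was removed in pandas 0.25, so align both frames on the union of
# ids and compare them cell by cell instead
ids = df1.index.union(df2.index)
df1, df2 = df1.reindex(ids), df2.reindex(ids)
df_output = pd.DataFrame(
    {col: [report_diff(pair) for pair in zip(df1[col], df2[col])] for col in df1.columns},
    index=ids)
df_output['has_change'] = df_output.apply(has_change, axis=1)

# The context manager saves the file on exit (writer.save() was removed in pandas 2.0)
with pd.ExcelWriter("Report_1.xlsx", engine='xlsxwriter') as writer:
    df_output.to_excel(writer, sheet_name="report")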
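
The code above only flags rows that differ. It does not separate new, deleted and changed customers, and it does not highlight anything. One possible way to cover those points is sketched below, using an outer merge with indicator=True. This is not part of the original answer: the helper names classify_customers and write_highlighted, the status labels, the colours and the output file name are all assumptions made for illustration, and the sketch assumes both days have the same value columns.

```python
import pandas as pd

# Sample data copied from the question.
day1 = pd.DataFrame([
    {"id": "1", "name": "John", "ip": "10.1.1.1"},
    {"id": "2", "name": "Peter", "ip": "10.1.1.2"},
])
day2 = pd.DataFrame([
    {"id": "1", "name": "John", "ip": "10.1.1.10"},
    {"id": "3", "name": "Nancy", "ip": "10.1.1.3"},
])

def classify_customers(day1, day2, key="id"):
    """Label each customer as new, deleted, changed or unchanged (hypothetical helper)."""
    # The outer merge keeps every id; the _merge column records where each id was found.
    merged = day1.merge(day2, on=key, how="outer",
                        suffixes=("_day1", "_day2"), indicator=True)
    status = merged["_merge"].astype(str).map(
        {"left_only": "deleted", "right_only": "new", "both": "unchanged"})
    in_both = status == "unchanged"
    # A customer present on both days counts as changed if any value column differs.
    changed = pd.Series(False, index=merged.index)
    for col in day1.columns.drop(key):
        changed |= in_both & (merged[col + "_day1"] != merged[col + "_day2"])
    merged["status"] = status.mask(changed, "changed")
    return merged.drop(columns="_merge").set_index(key)

def write_highlighted(report, path):
    """Write the report with one background colour per status (assumes xlsxwriter is installed)."""
    colors = {"new": "#C6EFCE", "deleted": "#FFC7CE", "changed": "#FFEB9C"}  # arbitrary colours
    with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
        report.to_excel(writer, sheet_name="report")
        workbook, worksheet = writer.book, writer.sheets["report"]
        formats = {s: workbook.add_format({"bg_color": c}) for s, c in colors.items()}
        # Row 0 holds the header, so data rows start at 1.
        for row_number, status in enumerate(report["status"], start=1):
            if status in formats:
                worksheet.set_row(row_number, None, formats[status])

report = classify_customers(day1, day2)
print(report["status"])
```

Calling write_highlighted(report, "Report_2.xlsx") should then produce the coloured sheet. A row format only applies to cells that have no format of their own, so cells that pandas writes with its own styling, such as the header, may keep that styling.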