How do you split Google Analytics source/medium paths in Python?

Asked: 2017-06-16 14:26:18

Tags: python google-analytics google-analytics-api

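I have a year of data from the Google Analytics Multi-Channel Funnels API. A sample is below. The source/medium paths vary in length, and I am looking for a way to split each path on the ">" separator so that every channel gets its own column.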

20160101    google / organic
20160101    bing / organic
20160101    google / organic > google / organic
20160101    google / organic > google / organic
20160101    (direct) / (none) > (direct) / (none)
20160101    (direct) / (none) > online.fliphtml5.com / referral
20160101    google / organic > google / organic > (direct) / (none)
20160101    google / organic > (direct) / (none) > google / organic
20160101    google / organic > online.fliphtml5.com / referral > (direct) / (none)
20160101    (direct) / (none) > (direct) / (none) > (direct) / (none)
20160101    pinterest.com / referral > (direct) / (none) > (direct) / (none)
20160101    google / organic > (direct) / (none) > (direct) / (none) > google / organic
20160101    bing / organic > (direct) / (none) > (direct) / (none) > (direct) / (none)
20160101    google / organic > (direct) / (none) > (direct) / (none) > (direct) / (none)

Below is an example of the format I want the data in. How can this be done in Python?

Source_Med_Path_1 Source_Med_Path_2....Source_Med_Path_72
google / cpc          direct            google / organic

1 Answer:

Answer 0: (score: 0)

You can do this with pandas and the apply() function.

http://pandas.pydata.org/pandas-docs/version/0.18.1/generated/pandas.Series.apply.html

My code reads the source/medium paths from a CSV file, but it can easily be adapted to work with API results.

import pandas as pd


def main():
    # Read the original data from CSV (expects 'transaction' and 'source' columns)
    data = pd.read_csv('source.csv')

    # Split each path on the '>' separator and strip the surrounding spaces
    splitdata = data['source'].apply(
        lambda x: pd.Series([part.strip() for part in x.split('>')]))
    # Join the split columns onto the transaction column; both share the same
    # index, so the join_axes argument (removed in pandas 1.0) is not needed
    data = pd.concat([data['transaction'], splitdata], axis=1)

    # Rename the split columns to Source_Med_Path_1, Source_Med_Path_2, ...
    cols = ['transaction']
    for i in range(len(data.columns) - 1):
        cols.append('Source_Med_Path_' + str(i + 1))

    data.columns = cols

    # Output the data
    print(data)
    data.to_csv('output.csv')

if __name__ == '__main__':
    main()
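
As a possible alternative to apply(), the vectorized Series.str.split method with expand=True can do the split in one call. The sketch below is only an illustration: it builds a small DataFrame from a few rows in the question instead of reading a file, and the column names 'date' and 'source' are assumptions, not names from the original data.

```python
import pandas as pd

# Sample rows copied from the question. The column names 'date' and
# 'source' are assumed here for illustration only.
data = pd.DataFrame({
    "date": ["20160101", "20160101", "20160101"],
    "source": [
        "google / organic",
        "(direct) / (none) > online.fliphtml5.com / referral",
        "google / organic > (direct) / (none) > google / organic",
    ],
})

# Split every path on '>' into one column per step. Shorter paths are
# padded with missing values on the right.
paths = data["source"].str.split(">", expand=True)

# Remove the spaces left around each step by the ' > ' separator.
paths = paths.apply(lambda col: col.str.strip())

# Name the new columns Source_Med_Path_1, Source_Med_Path_2, ...
paths.columns = [f"Source_Med_Path_{i + 1}" for i in range(paths.shape[1])]

# Put the date column back in front of the split columns.
result = pd.concat([data[["date"]], paths], axis=1)
print(result)
```

The number of Source_Med_Path columns follows the longest path in the data, so a full year of paths would produce as many columns as its longest path has steps.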