How do I go from subscription data to transaction data for use in Tableau?

Time: 2017-11-14 14:59:33

Tags: python sql excel tableau saas

I have a large amount of subscription data that looks like this:

customer_name    start_date   end_date     subscription_amount
A                1-7-2017     31-10-2017   4 USD/month
B                1-8-2017     30-09-2017   2 USD/month
C                1-10-2017    30-11-2017   3 USD/month

I need to convert it into transaction data, so the end result should look like this:

customer_name    payment_date    amount
A                1-7-2017        4 USD
A                1-8-2017        4 USD
A                1-9-2017        4 USD
A                1-10-2017       4 USD
B                1-8-2017        2 USD
B                1-9-2017        2 USD
C                1-10-2017       3 USD
C                1-11-2017       3 USD

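I need this conversion in order to run an analysis in Tableau, but an Excel solution would also be acceptable. I don't want to do it by hand; I'm looking for an automated solution using SQL or Python (I'm not familiar with either).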

2 answers:

Answer 0: (score: 0)

Using Python (you will need to adjust the data formatting to match your question exactly, but the idea is here): http://rextester.com/OENHUT92986

Use this list comprehension to list the first day of every month between two dates: listMonths = [dt.strftime("%Y-%m-01") for dt in rrule(MONTHLY, dtstart=dtstart, until=until)]

# Requires the third-party python-dateutil package
from dateutil import parser
from dateutil.rrule import rrule, MONTHLY


data=[
      {"customer_name":"A" ,"start_date":"2017-07-01","end_date":"2017-10-31","subscription_amount":"4 USD/month"},
      {"customer_name":"B" ,"start_date":"2017-08-01","end_date":"2017-09-30","subscription_amount":"2 USD/month"},
      {"customer_name":"C" ,"start_date":"2017-10-01","end_date":"2017-11-30","subscription_amount":"3 USD/month"}
     ]

for datum in data:
    # Parse the subscription period boundaries
    dtstart = parser.parse(datum["start_date"])
    until = parser.parse(datum["end_date"])

    # One entry per month in the period, formatted as the first day of that month
    listMonths = [dt.strftime("%Y-%m-01") for dt in rrule(MONTHLY, dtstart=dtstart, until=until)]

    for month in listMonths:
        print(datum["customer_name"], month, datum["subscription_amount"])

This produces:

A 2017-07-01 4 USD/month
A 2017-08-01 4 USD/month
A 2017-09-01 4 USD/month
A 2017-10-01 4 USD/month
B 2017-08-01 2 USD/month
B 2017-09-01 2 USD/month
C 2017-10-01 3 USD/month
C 2017-11-01 3 USD/month
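
To match the format in the question exactly (D-M-YYYY dates, amounts without "/month"), the same idea can be sketched with only the standard library. The function name `expand_subscriptions` and the tuple layout are hypothetical, chosen for illustration; the sketch assumes one payment on the first day of every calendar month between the start and end dates.

```python
import csv
import io
from datetime import datetime


def expand_subscriptions(rows):
    """Yield one (customer, payment_date, amount) tuple per subscription month.

    Assumes day-month-year date strings such as "1-7-2017" and a payment
    on the first day of every calendar month from start to end.
    """
    for name, start, end, amount in rows:
        first = datetime.strptime(start, "%d-%m-%Y")
        last = datetime.strptime(end, "%d-%m-%Y")
        year, month = first.year, first.month
        while (year, month) <= (last.year, last.month):
            yield name, f"1-{month}-{year}", amount.replace("/month", "")
            # Step to the next month, rolling over after December.
            year, month = (year + 1, 1) if month == 12 else (year, month + 1)


subscriptions = [
    ("A", "1-7-2017", "31-10-2017", "4 USD/month"),
    ("B", "1-8-2017", "30-09-2017", "2 USD/month"),
    ("C", "1-10-2017", "30-11-2017", "3 USD/month"),
]

transactions = list(expand_subscriptions(subscriptions))

# Write the result as CSV text, which Tableau or Excel can import.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["customer_name", "payment_date", "amount"])
writer.writerows(transactions)
print(buffer.getvalue())
```

Writing to a file instead of `io.StringIO` (for example with `open("transactions.csv", "w", newline="")`) gives a file that can be loaded into Tableau as a text data source.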

Answer 1: (score: 0)

Here is a query in MS-SQL:

IF OBJECT_ID('tempdb..#subscriptions') IS NOT NULL DROP TABLE #subscriptions
IF OBJECT_ID('tempdb..#Calendar') IS NOT NULL DROP TABLE #Calendar

DECLARE @min_date AS DATE
DECLARE @max_date AS DATE

CREATE TABLE #subscriptions
(
    customer_name CHAR,
    startdate DATE,
    end_date DATE,
    subscription_amount MONEY,
    currencyPeriod VARCHAR(10)
)

CREATE TABLE #Calendar
(
    Date_f DATE
)

INSERT INTO #subscriptions VALUES ('A', CAST('2017-7-01 00:00:00.000' AS DATE), CAST('2017-07-10 00:00:00.000' AS DATE), 4, 'USD/month')
INSERT INTO #subscriptions VALUES ('B', CAST('2017-8-01 00:00:00.000' AS DATE), CAST('2017-08-04 00:00:00.000' AS DATE), 2, 'USD/month')
INSERT INTO #subscriptions VALUES ('C', CAST('2017-10-01 00:00:00' AS DATE), CAST('2017-10-02 00:00:00.000' AS DATE), 3, 'USD/month')

-- Get the MIN and MAX dates across all records
SET @min_date = (SELECT MIN(startdate) FROM #subscriptions)
SET @max_date = (SELECT MAX(end_date) FROM #subscriptions)

-- Build a temporary calendar covering the range between the MIN and MAX dates
WHILE @min_date <= @max_date
BEGIN
    INSERT INTO #Calendar (Date_f)
    SELECT @min_date

    -- Advances one day at a time, so the calendar (and the result) has one row per day
    SET @min_date = DATEADD(day, 1, @min_date)
END

-- Final result
SELECT B.customer_name, A.Date_f AS 'payment_date',
       CONCAT(CAST(subscription_amount AS INT), ' ', 'USD') AS 'subscription_amount'
FROM #Calendar A
INNER JOIN #subscriptions B
    ON A.Date_f BETWEEN startdate AND end_date
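
The calendar-join idea can also be tried without a SQL Server instance. Below is a minimal sketch using Python's built-in `sqlite3` module, under several assumptions: SQLite syntax differs from MS-SQL (a recursive CTE replaces the WHILE loop, and `date(..., '+1 month')` replaces `DATEADD`), the calendar steps by month rather than by day so that each subscription month yields one row, and the rows use the dates from the question rather than the sample values in this answer. Table and column names mirror the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subscriptions ("
    "customer_name TEXT, startdate TEXT, end_date TEXT, subscription_amount INTEGER)"
)
conn.executemany(
    "INSERT INTO subscriptions VALUES (?, ?, ?, ?)",
    [
        ("A", "2017-07-01", "2017-10-31", 4),
        ("B", "2017-08-01", "2017-09-30", 2),
        ("C", "2017-10-01", "2017-11-30", 3),
    ],
)

# Overall date range, the counterpart of @min_date and @max_date.
min_date, max_date = conn.execute(
    "SELECT MIN(startdate), MAX(end_date) FROM subscriptions"
).fetchone()

# The recursive CTE builds a monthly calendar; the join keeps the calendar
# dates that fall inside each subscription period.
query = """
WITH RECURSIVE calendar(date_f) AS (
    SELECT ?
    UNION ALL
    SELECT date(date_f, '+1 month') FROM calendar
    WHERE date(date_f, '+1 month') <= ?
)
SELECT s.customer_name, c.date_f AS payment_date,
       s.subscription_amount || ' USD' AS amount
FROM calendar AS c
INNER JOIN subscriptions AS s
    ON c.date_f BETWEEN s.startdate AND s.end_date
ORDER BY s.customer_name, c.date_f
"""
rows = conn.execute(query, (min_date, max_date)).fetchall()
for row in rows:
    print(*row)
```

This monthly calendar only lines up with billing dates when every subscription starts on the first of a month, as in the sample data; other start days would need a different calendar step or join condition.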