Python Pandas: Equivalent of a SQL Tally / Number Table

Asked: 2018-02-13 23:25:32

Tags: python sql-server pandas numpy

I came across an interesting technique and am trying to find a pandas analogue for it, and to test it if possible:

SQL Server: Create New Records from Enumerated Date

select s.id
    , DATEADD(day, t.N - 1, s.transaction_dt)
    , s.measures
from @Something s
join cteTally t on t.N <= s.units
order by s.id
    , s.transaction_dt
    , t.N

As a set-based solution, this looks very interesting. My question is: is it possible to replicate something like this in pandas?

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.merge_asof.html

1 Answer:

Answer 0 (score: 1)

Sure, this is possible. I worked from the SQL example you referenced:

import pandas as pd

df = pd.DataFrame([[1, '2018-01-01', 4, 30.5],
                   [1, '2018-01-03', 4, 26.3],
                   [2, '2018-01-01', 3, 12.7],
                   [2, '2018-01-03', 3, 8.8]],
                  columns=['id', 'transaction_dt', 'units', 'measures'])

# Repeat each row `units` times. The original index labels are kept,
# which is why they appear duplicated in the output below.
df_out = df.loc[df.index.repeat(df['units'])]
print(df_out)

#    id transaction_dt  units  measures
# 0   1     2018-01-01      4      30.5
# 0   1     2018-01-01      4      30.5
# 0   1     2018-01-01      4      30.5
# 0   1     2018-01-01      4      30.5
# 1   1     2018-01-03      4      26.3
# 1   1     2018-01-03      4      26.3
# 1   1     2018-01-03      4      26.3
# 1   1     2018-01-03      4      26.3
# 2   2     2018-01-01      3      12.7
# 2   2     2018-01-01      3      12.7
# 2   2     2018-01-01      3      12.7
# 3   2     2018-01-03      3       8.8
# 3   2     2018-01-03      3       8.8
# 3   2     2018-01-03      3       8.8
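
Note that the output above only repeats each row; it does not yet shift transaction_dt by t.N - 1 days the way the DATEADD call in the SQL does. Below is a minimal sketch of two ways that step might be added, assuming transaction_dt is first converted with pd.to_datetime. The function names expand_by_repeat and expand_by_tally are hypothetical, and how='cross' assumes pandas 1.2 or later.

```python
import pandas as pd

df = pd.DataFrame([[1, '2018-01-01', 4, 30.5],
                   [1, '2018-01-03', 4, 26.3],
                   [2, '2018-01-01', 3, 12.7],
                   [2, '2018-01-03', 3, 8.8]],
                  columns=['id', 'transaction_dt', 'units', 'measures'])
# Assumption: the date column is parsed so that day offsets can be added.
df['transaction_dt'] = pd.to_datetime(df['transaction_dt'])


def expand_by_repeat(frame):
    """Repeat each row `units` times and shift the date by 0..units-1 days."""
    frame = frame.reset_index(drop=True)  # one unique label per source row
    out = frame.loc[frame.index.repeat(frame['units'])].copy()
    # Position of each copy within its source row: 0, 1, ..., units - 1.
    # This plays the role of (t.N - 1) in the SQL.
    offset = out.groupby(level=0).cumcount().to_numpy()
    out['transaction_dt'] = out['transaction_dt'] + pd.to_timedelta(offset, unit='D')
    return out.reset_index(drop=True)


def expand_by_tally(frame):
    """Same result via an explicit tally table, mirroring the SQL join."""
    frame = frame.reset_index(drop=True)
    # Tally table: N = 1 .. max(units).
    tally = pd.DataFrame({'N': range(1, int(frame['units'].max()) + 1)})
    # Cross join, then keep only pairs with N <= units (the SQL join condition).
    out = frame.merge(tally, how='cross')
    out = out[out['N'] <= out['units']]
    # Mirror the SQL "order by s.id, s.transaction_dt, t.N" before shifting dates.
    out = out.sort_values(['id', 'transaction_dt', 'N']).copy()
    out['transaction_dt'] = out['transaction_dt'] + pd.to_timedelta(out['N'] - 1, unit='D')
    return out.drop(columns='N').reset_index(drop=True)


print(expand_by_repeat(df))
print(expand_by_tally(df))
```

With the sample data, the first source row (id 1, 2018-01-01, units 4) becomes four rows dated 2018-01-01 through 2018-01-04. The repeat version avoids building the intermediate cross product, so it is likely the cheaper choice when units varies widely; the tally version stays closer to the shape of the SQL.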