Question

I am using Python with a connection to SQL Server. I selected the following data from SQL Server:

CustomerNumber    TransactionDate
1                 2/3/2019
1                 12/4/2019
1                 12/17/2019
2                 1/4/2019
2                 4/4/2019
3                 7/5/2019
4                 7/7/2019
4                 9/5/2019
4                 9/15/2019
4                 10/15/2019

I want to convert this into arrays grouped by CustomerNumber:

[1 2/3/2019 12/4/2019 12/17/2019]
[2 1/4/2019 4/4/2019]
[3 7/5/2019]
[4 7/7/2019 9/5/2019 9/15/2019 10/15/2019]

I am a Python beginner, so I would appreciate your feedback. Thank you for your help.

Answer 1

Assuming this is a pandas DataFrame, here is one way to do it with pandas:

s=df.groupby('CustomerNumber').TransactionDate.apply(list).reset_index()
s
Out[49]: 
   CustomerNumber                              TransactionDate
0               1            [2/3/2019, 12/4/2019, 12/17/2019]
1               2                         [1/4/2019, 4/4/2019]
2               3                                   [7/5/2019]
3               4  [7/7/2019, 9/5/2019, 9/15/2019, 10/15/2019]
l=(s.CustomerNumber.apply(lambda x : [x])+s.TransactionDate).tolist()
l
Out[50]: 
[[1, '2/3/2019', '12/4/2019', '12/17/2019'],
 [2, '1/4/2019', '4/4/2019'],
 [3, '7/5/2019'],
 [4, '7/7/2019', '9/5/2019', '9/15/2019', '10/15/2019']]
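
To try this without a database, here is a minimal self-contained sketch. It assumes the dates are stored as plain strings and rebuilds the question's table by hand; the variable name `rows` is only illustrative. It iterates over the groups directly instead of going through `apply(list)`, which gives the same nested lists:

```python
import pandas as pd

# Hand-built copy of the table from the question (dates kept as strings).
df = pd.DataFrame({
    "CustomerNumber": [1, 1, 1, 2, 2, 3, 4, 4, 4, 4],
    "TransactionDate": [
        "2/3/2019", "12/4/2019", "12/17/2019",
        "1/4/2019", "4/4/2019",
        "7/5/2019",
        "7/7/2019", "9/5/2019", "9/15/2019", "10/15/2019",
    ],
})

# Iterating over the grouped column yields (customer, Series of dates) pairs;
# each pair becomes one list: the customer number followed by its dates.
rows = [
    [int(customer), *dates]
    for customer, dates in df.groupby("CustomerNumber")["TransactionDate"]
]

for row in rows:
    print(row)
```

The rows have different lengths, so a list of lists is probably a more natural container here than a rectangular NumPy array.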

Answer 2

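I suggest reading the data into a pandas DataFrame with pandas.read_sql (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_sql.html). Make sure you pass the appropriate arguments, in particular the 'con' parameter, which specifies the connection to your SQL database.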
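
As a rough sketch of the `read_sql` call, the example below uses an in-memory SQLite database as a stand-in, since a SQL Server instance is not available here. The table name `transactions` and the sample rows are assumptions for illustration only; with SQL Server you would pass your own connection object as `con` instead.

```python
import sqlite3

import pandas as pd

# Stand-in database: in-memory SQLite instead of SQL Server (assumption).
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE transactions (CustomerNumber INTEGER, TransactionDate TEXT)"
)
con.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [(1, "2/3/2019"), (1, "12/4/2019"), (2, "1/4/2019")],
)

# Read the query result straight into a DataFrame; `con` is the connection.
df = pd.read_sql(
    "SELECT CustomerNumber, TransactionDate FROM transactions", con=con
)
con.close()

print(df)
```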

Once the data is in a pandas DataFrame with two columns (CustomerNumber and TransactionDate), this becomes a simple groupby operation:

df.groupby(['CustomerNumber'])['TransactionDate'].apply(list)

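This groups the rows by CustomerNumber and collects the dates for each unique customer number into a list. The result is a Series indexed by CustomerNumber.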
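
The grouped result can then be turned into ordinary Python containers. The following is a minimal sketch on a shortened, hand-built sample (the data and the names `grouped`, `by_customer`, and `rows` are illustrative assumptions, not part of the original answer):

```python
import pandas as pd

# Shortened sample of the question's data, dates kept as strings.
df = pd.DataFrame({
    "CustomerNumber": [1, 1, 2, 3],
    "TransactionDate": ["2/3/2019", "12/4/2019", "1/4/2019", "7/5/2019"],
})

# Series indexed by CustomerNumber; each value is a list of dates.
grouped = df.groupby(["CustomerNumber"])["TransactionDate"].apply(list)

# Lookup table: customer number -> list of dates.
by_customer = grouped.to_dict()

# The layout asked for in the question: customer number first, then dates.
rows = [[customer, *dates] for customer, dates in grouped.items()]

print(by_customer)
print(rows)
```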

About arrays in Python (NumPy)