I want to build a pivot table and fill in the missing dates. This is the code I wrote.
import pandas as pd

user_dct = {100: "Tom", 101: "Jon", 102: "Daisy"}
for key, value in user_dct.items():
    file = './' + value + '.csv'
    df = pd.read_csv(file)
Each df looks like this:
         Date   ID  Name  Score  Rank
0  2011-01-12  100   Tom     40     C
1  2011-01-14  100   Tom     60     B
2  2011-01-19  100   Tom     80     A
・
・
・
         Date   ID  Name  Score  Rank
0  2011-01-12  101   Jon     30     C
1  2011-01-14  101   Jon     50     C
2  2011-01-19  101   Jon     60     B
・
・
・
user_dct = {100: "Tom", 101: "Jon", 102: "Daisy"}
dfs = []
for key, value in user_dct.items():
    file = './' + value + '.csv'
    dfs.append(pd.read_csv(file, parse_dates=['Date']))
df = pd.concat(dfs, ignore_index=True)
df = df.sort_values(['Date', 'ID']).set_index(['Date', 'ID'])
date_df = pd.DataFrame({'Date': pd.date_range('2011-01-01', '2011-12-31', freq='1D').strftime('%Y-%m-%d')})
df = pd.merge(df, date_df, on='Date', how='outer').fillna(0)
My ideal output is:
                 Name  Score  Rank
Date       ID
2011-01-01 100    Tom      0     0
           101    Jon      0     0
           102  Daisy      0     0
・
・
・
2011-01-12 100    Tom     40     C
           101    Jon     30     C
           102  Daisy     90     S
2011-01-14 100    Tom     60     B
           101    Jon     50     C
           102  Daisy     90     S
2011-01-19 100    Tom     80     A
           101    Jon     60     B
           102  Daisy     80     A
・
・
・
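What is wrong with my code, and how should I fix it? Why does an int type error occur? I changed sort_values and set_index, but the error did not go away.
Answer 0 (score: 2)
I think you need: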
SELECT E.ID, E.EnrolmentDate, A.NoOfMonths
FROM dbo.Enrolments E
OUTER APPLY
(
SELECT DATEDIFF(mm, E.EnrolmentDate, MIN(A.AppointmentDate)) AS NoOfMonths
FROM dbo.Appointments A
WHERE A.EnrolmentId = E.ID
) A
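For the pandas question itself, here is a minimal sketch of one possible approach; it is not taken from the original answer. One thing that looks problematic in the question's code is that date_df['Date'] holds strings (because of strftime) while the CSV dates are parsed as datetime64, so the merge keys have different dtypes. Instead of merging, the sketch builds every (Date, ID) pair with MultiIndex.from_product and reindexes onto it. The in-memory frames tom and jon are hypothetical stand-ins for the CSV files, and the names full_index, names and result are made up for illustration.

```python
import pandas as pd

# Hypothetical in-memory stand-ins for the per-user CSV files in the question.
dates_seen = pd.to_datetime(["2011-01-12", "2011-01-14", "2011-01-19"])
tom = pd.DataFrame({"Date": dates_seen, "ID": 100, "Name": "Tom",
                    "Score": [40, 60, 80], "Rank": ["C", "B", "A"]})
jon = pd.DataFrame({"Date": dates_seen, "ID": 101, "Name": "Jon",
                    "Score": [30, 50, 60], "Rank": ["C", "C", "B"]})
df = pd.concat([tom, jon], ignore_index=True)

# Every (Date, ID) pair for the year; the dates stay datetime64, not strings.
all_dates = pd.date_range("2011-01-01", "2011-12-31", freq="D")
full_index = pd.MultiIndex.from_product(
    [all_dates, df["ID"].unique()], names=["Date", "ID"]
)

# ID -> Name lookup, so rows created by the reindex still get a real name.
names = df.drop_duplicates("ID").set_index("ID")["Name"]

result = (
    df.set_index(["Date", "ID"])[["Score", "Rank"]]
      # Rank will mix letters with the filler 0, so keep it as object dtype.
      .astype({"Rank": object})
      # Rows missing from the data are created and filled with 0.
      .reindex(full_index, fill_value=0)
)
result.insert(0, "Name", result.index.get_level_values("ID").map(names))

print(result.loc["2011-01-12"])
```

With real files, the two hand-built frames would be replaced by the pd.read_csv(file, parse_dates=['Date']) loop from the question. The reindex step assumes each (Date, ID) pair appears at most once in the data.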