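I am merging two DataFrames, which I can do. The trouble I'm having is showing the merged data only on specific records. Both DataFrames have an ID and a Date, but only one date should have a corresponding response, and I still want to show both records. Any help you can provide would be greatly appreciated.
For example: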
ID | Date | Name | Question_1 | Response_1
12 | 12/4/2018 | John | question text | response text
12 | 1/1/2019 | John | question text | response text
16 | 2/23/2019 | Carol | question text | response text
23 | 3/01/2019 | Gary | question text | response text
This is what I need:
ID | Date | Name | Question_1 | Response_1
12 | 12/4/2018 | John | question text | response text
12 | 1/1/2019 | John | |
16 | 2/23/2019 | Carol | question text | response text
23 | 3/01/2019 | Gary | question text | response text
Code:
import pandas as pd

def data_validate(files, study):
    # Read the two input CSV files
    df1 = pd.read_csv(files[0])
    df2 = pd.read_csv(files[1])
    # Left join on ID, keeping every row of df1
    df_merge = pd.merge(df1, df2, on='ID', how='left')
    df_merge.to_csv('results.csv', index=False)
    print(df_merge)
Answer 0 (score: 3)
First, convert your Date column using to_datetime:
df.Date = pd.to_datetime(df.Date)
Then use duplicated to build a mask:
s = df.ID.duplicated()
df[['Question_1', 'Response_1']] = df[['Question_1', 'Response_1']].mask(s, '')
df
Out[287]:
ID Date Name Question_1 Response_1
0 12 2018-12-04 John questiontext responsetext
1 12 2019-01-01 John
2 16 2019-02-23 Carol questiontext responsetext
3 23 2019-03-01 Gary questiontext responsetext
If you do not use sort_values first, this assumes your DataFrame is already sorted, because duplicated flags every occurrence of an ID after the first one.
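For anyone who wants to try this end to end, here is a minimal sketch of the same idea with the sort step made explicit. The sample rows are hypothetical (made up to resemble the question's table and deliberately out of order), and sorting by ID and then Date is an assumption about which row should keep its response: the earliest date per ID.

```python
import pandas as pd

# Hypothetical sample data resembling the merged result in the question,
# deliberately left unsorted to show why the sort step matters.
df = pd.DataFrame({
    "ID": [12, 16, 12, 23],
    "Date": ["1/1/2019", "2/23/2019", "12/4/2018", "3/01/2019"],
    "Name": ["John", "Carol", "John", "Gary"],
    "Question_1": ["question text"] * 4,
    "Response_1": ["response text"] * 4,
})

# Parse the dates so they sort chronologically rather than as strings.
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")

# Assumption: the earliest date per ID is the row that keeps its response.
df = df.sort_values(["ID", "Date"]).reset_index(drop=True)

# duplicated() is True for every occurrence of an ID after the first.
dup = df["ID"].duplicated()

# Blank out the question/response columns on those repeated rows.
cols = ["Question_1", "Response_1"]
df[cols] = df[cols].mask(dup, "")

print(df)
```

If a different row per ID should keep the response (say, the latest date), change the sort order or pass keep='last' to duplicated.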