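I'm working on a student project in machine learning, using Python and pandas to analyze web data. For this I need to convert multiple rows of data (all from one session) into a single row. Sessions vary in length, and each row contains five values: referer, ip, time, requestAdress, session, which I want to store in columns.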
import pandas as pd

df_row = pd.DataFrame()
# number of rows in this session; used as the column suffix and counted down
length_row = len(df_work[df_work['session'] == session])
for row in df_work[df_work.session == session].itertuples():  # tuple = (Index, referer, ip, time, requestAdress, session)
    name = ['referer', 'ip', 'time', 'requestAdress', 'session']
    for i in range(1, len(row)):  # start at 1 to skip the index in row[0]
        df_row[str(name[i-1]) + str(length_row)] = row[i]
        print(row[i])
    length_row -= 1
print(df_row)
The output is:
https://www.google.de/
x5d80e060.dyn.telefonica.de
2016-07-06 03:41:02
/kuenstlerbedarf/oelfarben/
-8730846718325754703
Empty DataFrame
Columns: [referer28, ip28, time28, requestAdress28, session28, referer27, ip27, time27, requestAdress27, session27, referer26, ip26, time26, requestAdress26, session26, referer25, ip25, time25, requestAdress25, session25, referer24, ip24, time24, requestAdress24, session24, referer23, ip23, time23, requestAdress23, session23, referer22, ip22, time22, requestAdress22, session22, referer21, ip21, time21, requestAdress21, session21, referer20, ip20, time20, requestAdress20, session20, referer19, ip19, time19, requestAdress19, session19, referer18, ip18, time18, requestAdress18, session18, referer17, ip17, time17, requestAdress17, session17, referer16, ip16, time16, requestAdress16, session16, referer15, ip15, time15, requestAdress15, session15, referer14, ip14, time14, requestAdress14, session14, referer13, ip13, time13, requestAdress13, session13, referer12, ip12, time12, requestAdress12, session12, referer11, ip11, time11, requestAdress11, session11, referer10, ip10, time10, requestAdress10, session10, referer9, ip9, time9, requestAdress9, session9, ...]
Index: []
So the dynamic naming of the columns works, but the DataFrame stays empty. The only related material I found on this problem were this and this question.
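The behaviour can be reproduced in isolation. The sketch below assumes the cause is that `pd.DataFrame()` starts with an empty index, so a scalar assigned to a new column is broadcast over zero rows. The `one_row` frame and its `index=[0]` are illustrative only and not part of my original code.

```python
import pandas as pd

# A frame created without data has no row labels at all.
empty = pd.DataFrame()
# The scalar is broadcast over the existing rows, and there are none,
# so only the column label is created.
empty["referer28"] = "https://www.google.de/"

# Illustrative contrast: a frame that already has one row label.
one_row = pd.DataFrame(index=[0])
one_row["referer28"] = "https://www.google.de/"

print(len(empty), list(empty.columns))
print(len(one_row), one_row.loc[0, "referer28"])
```

This would explain why the column names appear in the printed output while `Index: []` stays empty.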
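I'd like to know why the assignment df_row[str(name[i-1]) + str(length_row)] = row[i] does not work, and how I can achieve my goal of filling the dynamically named columns with the given values.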
Thanks a lot in advance!
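For reference, here is a minimal sketch of the result I am after, under some assumptions: the sample `df_work` is made-up placeholder data, `session_to_row` is a hypothetical helper name, and the suffixes count up from 1 rather than down from the session length as in my loop. It collects the values in a plain dict first and builds the one-row DataFrame at the end, instead of assigning scalars to an empty frame.

```python
import pandas as pd

# Hypothetical placeholder data standing in for df_work.
df_work = pd.DataFrame({
    "referer": ["ref-a", "ref-b", "ref-c"],
    "ip": ["host-1", "host-1", "host-2"],
    "time": ["t1", "t2", "t3"],
    "requestAdress": ["/a/", "/b/", "/c/"],
    "session": [1, 1, 2],
})


def session_to_row(df, session):
    """Flatten all rows of one session into a single-row DataFrame."""
    rows = df[df["session"] == session]
    values = {}
    # index=False drops the index, so each tuple lines up with df.columns
    for n, row in enumerate(rows.itertuples(index=False), start=1):
        for name, value in zip(df.columns, row):
            values[f"{name}{n}"] = value
    # A list holding one dict yields exactly one row.
    return pd.DataFrame([values])


df_row = session_to_row(df_work, 1)
print(df_row)
```

Building the dict first also avoids growing the DataFrame one column at a time, which may matter for sessions with many rows.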