Reshaping repeated row values into columns - python

Time: 2018-12-23 21:44:56

Tags: python python-3.x pandas text jupyter-notebook

I am trying to reshape my data using pandas.melt

Here is my txt file

2017/11/14(Tue)
23:20   Aditya Laksana S.   hahaha
23:20   Aditya Laksana S.   [Sticker]
23:20   Veronika Xaveria    [Sticker]
2017/12/14(Thu)
24:12   Veronika Xaveria    xxxxxxxx
24:14   Aditya Laksana S.   weeee
24:15   Aditya Laksana S.   [Sticker]

I want the data to look like

2017/11/14(Tue) 23:20   Aditya Laksana S.   hahaha
2017/11/14(Tue) 23:20   Aditya Laksana S.   [Sticker]
2017/11/14(Tue) 23:20   Veronika Xaveria    [Sticker]
2017/12/14(Thu) 24:12   Veronika Xaveria    xxxxxxxx
2017/12/14(Thu) 24:14   Aditya Laksana S.   weeee
2017/12/14(Thu) 24:15   Aditya Laksana S.   [Sticker]

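1 Answer:

Answer 0: (score: 1)

If I understand what you are looking for, and what your current dataframe actually looks like, I think you can split the dataframe at the date rows and use update. I don't think this is the most efficient solution, though, since it iterates through a list of dfs.

Assuming this df (I'm also assuming it is not a MultiIndex, since you didn't specify that it is):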

             0             1
0   2017/11/14(Tue)       NaN
1   23:20                 Aditya Laksana S. hahaha
2   23:20                 Aditya Laksana S. [Sticker]
3   23:20                 Veronika Xaveria [Sticker]
4   2017/12/14(Thu)       NaN
5   24:12:00              Veronika Xaveria xxxxxxxx
6   24:14:00              Aditya Laksana S. weeee
7   24:15:00              Aditya Laksana S. [Sticker]
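For anyone who wants to reproduce the steps, here is a minimal sketch of how the frame above could be built by hand. The values are copied from the table; the construction itself is an assumption, since the question does not show how the file was actually loaded.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the assumed frame: column 0 holds either a
# date or a time, column 1 holds "name message" (NaN on the date-only rows).
df = pd.DataFrame({
    0: ["2017/11/14(Tue)", "23:20", "23:20", "23:20",
        "2017/12/14(Thu)", "24:12:00", "24:14:00", "24:15:00"],
    1: [np.nan,
        "Aditya Laksana S. hahaha",
        "Aditya Laksana S. [Sticker]",
        "Veronika Xaveria [Sticker]",
        np.nan,
        "Veronika Xaveria xxxxxxxx",
        "Aditya Laksana S. weeee",
        "Aditya Laksana S. [Sticker]"],
})
print(df)
```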

Then:

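import pandas as pd

# find the index of the date rows, assuming each one contains a weekday abbreviation
idx = list(df[df[0].str.contains('Mon|Tue|Wed|Thu|Fri|Sat|Sun')].index)

# collect the date strings stored at those rows (assumes a default 0..n-1 index)
values = list(df.iloc[idx, 0].values)

# split the dataframe at each date row, giving the same chunks as np.split(df, idx[1:])
# this assumes that the first row contains a date
dfs = [df.iloc[start:stop] for start, stop in zip(idx, idx[1:] + [len(df)])]

# prefix column 0 of every chunk with that chunk's date, then update df in place
df.update(pd.concat([values[i] + ' ' + dfs[i][0] for i in range(len(dfs))]).to_frame(0))

# drop the date-only rows, whose column 1 is NaN
df = df.dropna()
df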

              0                     1
1   2017/11/14(Tue) 23:20       Aditya Laksana S. hahaha
2   2017/11/14(Tue) 23:20       Aditya Laksana S. [Sticker]
3   2017/11/14(Tue) 23:20       Veronika Xaveria [Sticker]
5   2017/12/14(Thu) 24:12:00    Veronika Xaveria xxxxxxxx
6   2017/12/14(Thu) 24:14:00    Aditya Laksana S. weeee
7   2017/12/14(Thu) 24:15:00    Aditya Laksana S. [Sticker]
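
Since the answer itself notes that looping over a list of dfs is probably not the most efficient route, here is a loop-free sketch of the same idea. It is not part of the original answer: it swaps the split-and-update steps for a forward fill, and it assumes the same frame layout and the same weekday-abbreviation test for date rows. The names `is_date`, `dates` and `out` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Same assumed frame as in the answer above.
df = pd.DataFrame({
    0: ["2017/11/14(Tue)", "23:20", "23:20", "23:20",
        "2017/12/14(Thu)", "24:12:00", "24:14:00", "24:15:00"],
    1: [np.nan, "Aditya Laksana S. hahaha", "Aditya Laksana S. [Sticker]",
        "Veronika Xaveria [Sticker]", np.nan, "Veronika Xaveria xxxxxxxx",
        "Aditya Laksana S. weeee", "Aditya Laksana S. [Sticker]"],
})

# Rows whose first column contains a weekday abbreviation are treated as date rows.
is_date = df[0].str.contains("Mon|Tue|Wed|Thu|Fri|Sat|Sun")

# Keep the date on date rows, NaN elsewhere, then carry each date forward
# onto the message rows that follow it.
dates = df[0].where(is_date).ffill()

# Keep only the message rows and prefix their time with the carried date.
out = df.loc[~is_date].copy()
out[0] = dates[~is_date] + " " + out[0]
print(out)
```

On the sample data this should leave the same six rows (index 1, 2, 3, 5, 6, 7) as the output above, without building intermediate frames.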