Compare two lists in Python and get the corresponding items

Time: 2018-02-08 04:39:13

Tags: python excel pandas dataframe arraylist

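I have two columns: one contains text strings and the other contains the times at which those strings are displayed. In the example below, text appears over time, and words disappear one by one as new text is added. Here is an example: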

Time (s)    Text string
5   This example
7   This example
10  example
11  example is cool
15  is cool
16  cool
17  
19  That example is
20  example is
21  is awesome
23  awesome
24  

I want to extract the time at which each word disappears. Here is the result I want:

Disappeared time (s)    Text
10  This
15  example
16  is
17  cool
20  That
21  example
23  is
24  awesome

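How can I write Python code to do this? I am a beginner in Python, so code examples and ideas for approaching the problem would be helpful. Thank you very much in advance!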

1 answer:

Answer 0: (score: 1)

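Use str.get_dummies to build one 0/1 indicator column per word, then keep the cells where a word changes from 1 to 0: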

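# df is assumed to hold the two columns shown in the question
# Build one 0/1 indicator column per word, indexed by time
df = df.set_index('Time (s)')['Text string'].str.get_dummies(' ')
print(df)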
          That  This  awesome  cool  example  is
Time (s)                                        
5            0     1        0     0        1   0
7            0     1        0     0        1   0
10           0     0        0     0        1   0
11           0     0        0     1        1   1
15           0     0        0     1        0   1
16           0     0        0     1        0   0
17           0     0        0     0        0   0
19           1     0        0     0        1   1
20           0     0        0     0        1   1
21           0     0        1     0        0   1
23           0     0        1     0        0   0
24           0     0        0     0        0   0
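
The second step looks for cells that flip from 1 to 0 between consecutive rows. A minimal sketch of that idea on a single column, using a made-up Series named `shown` (not part of the original data):

```python
import pandas as pd

# Hypothetical presence flags for one word, indexed by time.
shown = pd.Series([1, 1, 0, 0, 1, 0], index=[5, 7, 10, 11, 15, 16])

# Value at the previous timestamp; bfill makes the first row compare with itself.
previous = shown.shift().bfill()

# Times at which the word was present one row earlier and is absent now.
dropped = shown.index[(previous == 1) & (shown == 0)]
print(list(dropped))
```

Back-filling the first shifted row means a word that is absent from the very first row is not reported as having disappeared there.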

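# Keep only the cells that differ from the previous row and are now 0,
# i.e. the word was shown at the previous time and is gone at this one
df1 = (df.where(df.ne(df.shift().bfill()) & df.eq(0))
        .stack()
        .dropna()   # in case stack() keeps the NaN cells
        .rename_axis(('Disappeared time (s)','Text'))
        .reset_index()
        .drop(0, axis=1))
print(df1)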
   Disappeared time (s)     Text
0                    10     This
1                    15  example
2                    16       is
3                    17     cool
4                    20     That
5                    21  example
6                    23       is
7                    24  awesome
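
For anyone who wants to run this end to end, here is a self-contained sketch of the same idea. It assumes the blank rows are empty strings and wraps the logic in a helper named `disappearance_times`, which is a name made up for this example. It collects the results with a plain loop instead of `stack()`, so it is a variation on the code above rather than a copy of it.

```python
import pandas as pd

# Sample data from the question; blank rows are assumed to be empty strings.
data = pd.DataFrame({
    "Time (s)": [5, 7, 10, 11, 15, 16, 17, 19, 20, 21, 23, 24],
    "Text string": [
        "This example", "This example", "example", "example is cool",
        "is cool", "cool", "", "That example is", "example is",
        "is awesome", "awesome", "",
    ],
})


def disappearance_times(frame, time_col="Time (s)", text_col="Text string"):
    """Return one row per (time, word) where a word stops being displayed."""
    # One 0/1 column per word, indexed by time.
    present = frame.set_index(time_col)[text_col].str.get_dummies(" ")
    # Presence at the previous timestamp; the first row is compared with itself.
    previous = present.shift().bfill()
    # True where a word was shown at the previous timestamp but not at this one.
    vanished = previous.eq(1) & present.eq(0)
    rows = [
        (time, word)
        for time, flags in vanished.iterrows()
        for word in flags[flags].index
    ]
    return pd.DataFrame(rows, columns=["Disappeared time (s)", "Text"])


result = disappearance_times(data)
print(result)
```

If several words disappear at the same time, they are listed in column order, which is alphabetical because `get_dummies` sorts the words.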