Question

I am trying to append the time column value to plotList wherever the dup value is False.

DF =

 lat                time      trip_id     diff  shifted  Segment    dup 
-7.12040 2015-12-24 02:03:10  18060.0  0.00003  0.00000        1  False 
-7.12043 2015-12-24 02:03:12  18060.0  0.00000  0.00003        2  False 
-7.12043 2015-12-24 02:03:14  18060.0  0.00003  0.00003        2   True 
-7.12046 2015-12-24 02:03:16  18060.0  0.00003  0.00003        2   True 
-7.12049 2015-12-24 02:03:19  18060.0  0.00003  0.00000        3  False 
-7.12052 2015-12-24 02:03:22  18060.0  0.00000 -0.00473        4  False

Code =

plotList=[]
def pullLine(row):
    if row['dup'] == False:
        plotList.append(row['time'])
pullLine(df)

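I thought this might work, but I get the error ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

Can anyone explain a) what is happening here, and b) what I can do to avoid it? I don't understand how asking whether something is False can be ambiguous.

Many thanks.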

Answer 1

I think you can do it this way:

plotList = df.loc[df['dup'] == False, 'time'].values

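You are passing the whole DataFrame as an argument to your function, but treating it as a single row. As a result, row['dup'] == False produces a boolean Series with one value per row, and an if statement cannot reduce that to a single True or False, which is why the ValueError is raised.

Depending on what you want to get, an array or a list: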

In [167]: df.loc[df['dup'] == False, 'time'].values
Out[167]:
array(['2015-12-24 02:03:10', '2015-12-24 02:03:12', '2015-12-24 02:03:19',
       '2015-12-24 02:03:22'], dtype=object)

In [168]: df.loc[df['dup'] == False, 'time'].tolist()
Out[168]:
['2015-12-24 02:03:10',
 '2015-12-24 02:03:12',
 '2015-12-24 02:03:19',
 '2015-12-24 02:03:22']
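
To make the cause of the error concrete, here is a minimal sketch. It rebuilds only the time and dup columns from the question (the other columns are left out for brevity), reproduces the ValueError, and then shows a row-by-row version of the function. The names mask, pull_line, plot_list and error_message are made up for this illustration.

```python
import pandas as pd

# Only the two relevant columns from the question's DataFrame.
df = pd.DataFrame({
    "time": ["2015-12-24 02:03:10", "2015-12-24 02:03:12",
             "2015-12-24 02:03:14", "2015-12-24 02:03:16",
             "2015-12-24 02:03:19", "2015-12-24 02:03:22"],
    "dup": [False, False, True, True, False, False],
})

# Comparing a whole column gives one boolean per row, not a single bool.
mask = df["dup"] == False
print(mask.tolist())

# Putting that Series in an `if` is what raises the ValueError.
try:
    if mask:
        pass
except ValueError as err:
    error_message = str(err)
    print(error_message)

# Calling the function once per row works, because each row holds a
# single dup value. It is slower than the boolean mask shown above.
plot_list = []

def pull_line(row):
    if not row["dup"]:
        plot_list.append(row["time"])

for _, row in df.iterrows():
    pull_line(row)

print(plot_list)
```

The loop is shown only to mirror the original function; for anything but a small DataFrame the df.loc filter is probably the better choice.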

Answer 2

I would just filter on the dup column using the negation operator ~, since you are filtering for False.

>>> df[~df.dup].time
0    2015-12-24 02:03:10
1    2015-12-24 02:03:12
4    2015-12-24 02:03:19
5    2015-12-24 02:03:22
Name: time, dtype: object

If you really want it in list format:

>>> df[~df.dup].time.tolist()
['2015-12-24 02:03:10',
 '2015-12-24 02:03:12',
 '2015-12-24 02:03:19',
 '2015-12-24 02:03:22']
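
As a quick check, the sketch below compares the == False filter with the ~ filter on a cut-down copy of the question's data. It assumes the dup column has a boolean dtype, which is the case where ~ acts as a logical NOT; the names by_comparison and by_negation are made up for this illustration.

```python
import pandas as pd

# Only the two relevant columns from the question's DataFrame.
df = pd.DataFrame({
    "time": ["2015-12-24 02:03:10", "2015-12-24 02:03:12",
             "2015-12-24 02:03:14", "2015-12-24 02:03:16",
             "2015-12-24 02:03:19", "2015-12-24 02:03:22"],
    "dup": [False, False, True, True, False, False],
})

# Assumption: dup is stored as booleans, so `~` inverts it row by row.
print(df["dup"].dtype)

by_comparison = df.loc[df["dup"] == False, "time"].tolist()
by_negation = df[~df["dup"]]["time"].tolist()

# Both filters should select the same rows.
print(by_comparison == by_negation)
```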

Appending to a list after a Pandas if/else statement
