Question

This works (using Pandas 12 dev)

table2=table[table['SUBDIVISION'] =='INVERNESS']

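Then I realized I needed to select the field using "starts with", since I was missing a bunch of rows. So, following the Pandas docs as closely as I could, I tried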

criteria = table['SUBDIVISION'].map(lambda x: x.startswith('INVERNESS'))
table2 = table[criteria]

and got AttributeError: 'float' object has no attribute 'startswith'

So I tried an alternate syntax, with the same result

table[[x.startswith('INVERNESS') for x in table['SUBDIVISION']]]

Reference: http://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-indexing Section 4: "List comprehensions and the map method of Series can also be used to produce more complex criteria:"

What am I missing?

Answer 1

You can use the str.startswith Series method to give more consistent results:

In [11]: s = pd.Series(['a', 'ab', 'c', 11, np.nan])

In [12]: s
Out[12]:
0      a
1     ab
2      c
3     11
4    NaN
dtype: object

In [13]: s.str.startswith('a', na=False)
Out[13]:
0     True
1     True
2    False
3    False
4    False
dtype: bool

and the boolean indexing will work just fine (I prefer to use loc, but it works just the same without it):

In [14]: s.loc[s.str.startswith('a', na=False)]
Out[14]:
0     a
1    ab
dtype: object

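It looks like at least one of the elements in your Series/column is a float, which doesn't have a startswith method, hence the AttributeError. The list comprehension should raise the same error...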
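A minimal sketch of that failure and the fix, assuming a hypothetical table whose SUBDIVISION column contains a missing value (the data and the LOT column are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data: the NaN is a float, which is what breaks x.startswith(...)
table = pd.DataFrame({
    "SUBDIVISION": ["INVERNESS", "INVERNESS POINT", "OTHER", np.nan],
    "LOT": [1, 2, 3, 4],
})

# The map/lambda approach from the question fails on the NaN entry
try:
    table["SUBDIVISION"].map(lambda x: x.startswith("INVERNESS"))
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# The .str accessor tolerates missing values; na=False maps them to False
mask = table["SUBDIVISION"].str.startswith("INVERNESS", na=False)
table2 = table.loc[mask]
```

Here table2 keeps only the rows whose SUBDIVISION begins with INVERNESS, and the row with the missing value is dropped rather than raising.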

Answer 2

To use startswith for a specific column value

     Type                      Date  diff
0     Car 2019-01-06 21:44:09+00:00     1
1   Train 2019-01-06 19:44:09+00:00     4
2   Train 2019-01-02 19:44:09+00:00     0
3     Car 2019-01-08 06:44:09+00:00     3
4     Car 2019-01-06 18:44:09+00:00     1
5   Train 2019-01-04 19:44:09+00:00     2
6     Car 2019-01-05 16:34:09+00:00     0
7   Train 2019-01-08 19:44:09+00:00     6
8     Car 2019-01-07 14:44:09+00:00     2
9     Car 2019-01-06 11:44:09+00:00     1
10  Train 2019-01-10 19:44:09+00:00     8
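As a sketch of how this might be applied to the Type column of a frame like the one above (the name df and the shortened data are assumptions, not part of the original answer):

```python
import pandas as pd

# Shortened, made-up version of the frame shown above
df = pd.DataFrame({
    "Type": ["Car", "Train", "Train", "Car"],
    "diff": [1, 4, 0, 3],
})

# Keep only the rows whose Type starts with "Car"
cars = df[df["Type"].str.startswith("Car", na=False)]
```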

Answer 3

To retrieve all the rows which start with the required string

dataFrameOut = dataFrame[dataFrame['column name'].str.match('string')]

To retrieve all the rows which contain the required string

dataFrameOut = dataFrame[dataFrame['column name'].str.contains('string')]
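One caveat worth keeping in mind: str.match interprets its argument as a regular expression anchored at the start of each value, and str.contains also treats it as a regular expression by default. For strings containing characters such as "." or "(", re.escape or regex=False may be needed. A small sketch with made-up data:

```python
import re
import pandas as pd

# Made-up values; None stands in for a missing entry
s = pd.Series(["a.b", "axb", "b a.b", None])

# "." is a regex wildcard here, so "axb" also matches
as_regex = s.str.match("a.b", na=False)

# Escaping the pattern gives a literal "starts with" test
escaped = s.str.match(re.escape("a.b"), na=False)

# regex=False makes contains do a plain substring search
literal_contains = s.str.contains("a.b", regex=False, na=False)
```

For a plain prefix test, str.startswith avoids the regex question entirely, since it compares literal text.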

Answer 4

You can use apply to easily apply any string matching function to your column elementwise.

table2=table[table['SUBDIVISION'].apply(lambda x: x.startswith('INVERNESS'))]

This assumes that your "SUBDIVISION" column is of the correct type (string)
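If the column may hold non-string values such as NaN (as in the question), a type guard inside the lambda is one way to avoid the AttributeError. A sketch with made-up data:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value
table = pd.DataFrame({"SUBDIVISION": ["INVERNESS HILLS", "GLEN", np.nan]})

# isinstance short-circuits, so startswith is only called on real strings
mask = table["SUBDIVISION"].apply(
    lambda x: isinstance(x, str) and x.startswith("INVERNESS")
)
table2 = table[mask]
```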

pandas select from Dataframe using startswith
