Question

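Here is some dummy data I created for the question. I have two questions about it:

Why is split called through str in the first part of the query but not in the second?
How does [0] pick up the first row in part 1 but the first element of each row in part 2?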

chess_data = pd.DataFrame({"winner": ['A:1','A:2','A:3','A:4','B:1','B:2']})

chess_data.winner.str.split(":")[0]
['A', '1']

chess_data.winner.map(lambda n: n.split(":")[0])
0    A
1    A
2    A
3    A
4    B
5    B
Name: winner, dtype: object

Answer 1

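chess_data is a DataFrame.
chess_data.winner is a Series.
chess_data.winner.str is an accessor to methods that are string-specific and, to a degree, optimized.
chess_data.winner.str.split is one such method.
chess_data.winner.map is a different method. It takes a dictionary or a callable and either calls the callable with each element of the Series or calls the dictionary's get method on each element of the Series.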

When you use chess_data.winner.str.split, pandas loops over the elements and performs a kind of str.split on each one. map is a cruder way of doing the same thing.
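A minimal sketch of that equivalence, assuming the chess_data frame from the question. The names via_accessor and via_map are only illustrative and do not appear in the original.

```python
import pandas as pd

chess_data = pd.DataFrame({"winner": ['A:1', 'A:2', 'A:3', 'A:4', 'B:1', 'B:2']})

# .str.split applies a string split to every element and returns a Series of lists
via_accessor = chess_data.winner.str.split(":")

# map with a callable does the same work through an explicit Python function
via_map = chess_data.winner.map(lambda s: s.split(":"))

# Compare the underlying values element by element
print(via_accessor.tolist() == via_map.tolist())   # True
```

In both cases the result still has one list per row, so nothing has been picked out of the lists yet.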

With your data:

chess_data.winner.str.split(':')

0    [A, 1]
1    [A, 2]
2    [A, 3]
3    [A, 4]
4    [B, 1]
5    [B, 2]
Name: winner, dtype: object

To get the first element of each list, use the string accessor again:

chess_data.winner.str.split(':').str[0]

0    A
1    A
2    A
3    A
4    B
5    B
Name: winner, dtype: object

This is equivalent to what the map call does:

chess_data.winner.map(lambda x: x.split(':')[0])

You can also use a list comprehension:

chess_data.assign(new_col=[x.split(':')[0] for x in chess_data.winner])

  winner new_col
0    A:1       A
1    A:2       A
2    A:3       A
3    A:4       A
4    B:1       B
5    B:2       B

Answer 2

chess_data.winner.str.split(":")[0] 
['A', '1']

is the same as

chess_data.winner.str.split(":").loc[0] 
['A', '1']

And

chess_data.winner.map(lambda n: n.split(":")[0])
0    A
1    A
2    A
3    A
4    B
5    B
Name: winner, dtype: object

is the same as

chess_data.winner.str.split(":").str[0]
0    A
1    A
2    A
3    A
4    B
5    B
Name: winner, dtype: object

which is the same as

pd.Series([x.split(':')[0] for x in chess_data.winner], name=chess_data.winner.name) 
0    A
1    A
2    A
3    A
4    B
5    B
Name: winner, dtype: object

Answer 3

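This is explained in the documentation under Indexing using str.

The .str[index] notation indexes into each string by position, whereas [index] selects from the Series according to its index.

Using the example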

s = pd.Series(['A', 'B', 'C', 'Aaba', 'Baca', np.nan,'CABA', 'dog', 'cat'])

s.str[3]

returns the character at position 3 of each row (NaN where the string is too short or the value is missing)

0    NaN
1    NaN
2    NaN
3      a
4      a
5    NaN
6      A
7    NaN
8    NaN

whereas

s[3]

returns

'Aaba'
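The same distinction can be sketched on the data from the question. This assumes the default integer index, so the label 0 happens to be the first row. The names parts, first_row and first_of_each are illustrative only.

```python
import pandas as pd

chess_data = pd.DataFrame({"winner": ['A:1', 'A:2', 'A:3', 'A:4', 'B:1', 'B:2']})
parts = chess_data.winner.str.split(":")

# Plain [0] looks up the label 0 in the Series index: one row, a Python list
first_row = parts[0]

# .str[0] indexes into every element: a Series with one value per row
first_of_each = parts.str[0]

print(type(first_row).__name__)   # list
print(first_of_each.tolist())     # ['A', 'A', 'A', 'A', 'B', 'B']
```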

Answer 4

Use the apply method to extract the first value from each list in the split Series:

chess_data.winner.str.split(':')
Out: 
0    [A, 1]
1    [A, 2]
2    [A, 3]
3    [A, 4]
4    [B, 1]
5    [B, 2]
Name: winner, dtype: object

chess_data.winner.str.split(':').apply(lambda x: x[0])
Out:
0    A
1    A
2    A
3    A
4    B
5    B
Name: winner, dtype: object

When you use

chess_data.winner.str.split(":")[0]

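you only get the first item of the resulting Series. .apply(), on the other hand, applies a function to every value in the Series and returns another Series.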
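On the question's data, apply and .str[0] give the same result. One place they may differ is missing values. The sketch below is an aside under stated assumptions: the Series with a None entry is a hypothetical variant of the question's data, not something from the original post.

```python
import pandas as pd

# Hypothetical variant of the question's data with one missing value
winner = pd.Series(['A:1', 'B:2', None], name="winner")
parts = winner.str.split(":")

# .str[0] leaves the missing entry missing in the result
via_str = parts.str[0]
print(via_str.isna().tolist())   # [False, False, True]

# apply calls the lambda on the missing value too, which is not subscriptable
try:
    parts.apply(lambda x: x[0])
    apply_failed = False
except TypeError:
    apply_failed = True
print(apply_failed)   # True
```

If the column can contain missing values, the .str accessor is therefore the more forgiving of the two.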

Using str with split in pandas
