Question

I have a mixed-type pd.Series, for example [100, 50, 0, foo, bar, baz]

When I run pd.Series.str.isnumeric() on it,

I get [NaN, NaN, NaN, False, False, False]

Why is this happening? Shouldn't it return True for the first three elements of the series?

Answer 1

Using the string accessor converts your numbers to NaN, and this happens before you even get to call isnumeric:

import pandas as pd
s = pd.Series([100, 50, 0, 'foo', 'bar', 'baz'])
s.str[:]  # non-string elements come back as NaN

0    NaN
1    NaN
2    NaN
3    foo
4    bar
5    baz
dtype: object

So the NaN values are still there when you call isnumeric. Convert with astype first:

s.astype(str).str.isnumeric()

0     True
1     True
2     True
3    False
4    False
5    False
dtype: bool
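
One caveat worth illustrating: str.isnumeric tests each character, so after astype(str) a negative number or a decimal is reported as False because of the '-' or '.' character. The sketch below uses a made-up series to show this, and contrasts it with pd.to_numeric(errors="coerce"), which is offered here only as a possible alternative if "can this be parsed as a number" is what you actually want. It is not part of the original answer.

```python
import pandas as pd

# Hypothetical example data: an int, a negative int, a float and a string.
s = pd.Series([100, -5, 2.5, "foo"])

# str.isnumeric is a per-character test, so '-' and '.' make it return False.
as_text = s.astype(str).str.isnumeric()

# Assumed alternative: a value counts as numeric if pd.to_numeric can parse it.
parsable = pd.to_numeric(s, errors="coerce").notna()

print(as_text.tolist())
print(parsable.tolist())
```

Which of the two checks is appropriate depends on whether you care about digit-only strings or about values that can be interpreted as numbers.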

Answer 2

Pandas string methods closely follow the Python string methods:

str.isnumeric(100)    # TypeError
str.isnumeric('100')  # True
str.isnumeric('a10')  # False

Any type that would raise an error gives NaN instead. According to the Python docs, str.isnumeric applies only to strings:

str.isnumeric()
Return True if all characters in the string are numeric characters, and there is at least one character, False otherwise.
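
To make the "error becomes NaN" rule concrete, here is a minimal pure-Python sketch. The helper name isnumeric_or_nan is hypothetical, and this is only an emulation of the observed behaviour, not how pandas implements it internally.

```python
import math

def isnumeric_or_nan(value):
    """Hypothetical helper: mimic the observed 'TypeError becomes NaN' rule."""
    try:
        # Calling the unbound method raises TypeError for non-str arguments.
        return str.isnumeric(value)
    except TypeError:
        return math.nan

# The same three inputs used in the snippet above.
results = [isnumeric_or_nan(v) for v in [100, '100', 'a10']]
print(results)
```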

According to the pandas docs, pd.Series.str.isnumeric is equivalent to str.isnumeric:

Series.str.isnumeric()
Check whether all characters in each string in the Series/Index are numeric. Equivalent to str.isnumeric().

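Your series has "object" dtype, a catch-all type that holds pointers to arbitrary Python objects. These may be a mix of strings, integers, and so on. You should therefore expect NaN values wherever a string is not found.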
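
A quick way to check this on the series from the question is to compare each element's Python type with where the NaN values appear. This is a small sketch using only isinstance and the .str accessor; the variable names are illustrative.

```python
import pandas as pd

s = pd.Series([100, 50, 0, "foo", "bar", "baz"])

# Each element of an object-dtype Series keeps its own Python type.
is_str = s.map(lambda x: isinstance(x, str))

# The .str accessor leaves NaN wherever the element is not a string.
result = s.str.isnumeric()

print(s.dtype)
print(is_str.tolist())
print(result.isna().tolist())
```

The NaN positions line up exactly with the elements that are not str instances.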

To accommodate numeric types, you need to convert to strings explicitly, e.g. given a series s:

s.astype(str).str.isnumeric()