Question

How can I count the elements in a column that are valid integers?

This is what I was able to come up with:

import re
import pandas as pd
pd.Series(["a","2","z","123","a","oops"]).apply(lambda x: x and re.match(r"^\d+$",x) and 1).sum()
==> 2.0

and

def isint(x):
    try:
        int(x)
        return 1
    except ValueError:
        return 0

pd.Series(["a","2","z","123","a","oops"]).apply(isint).sum()
==> 2

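Clearly, the second method is better (it returns an int and is easy to extend to other types such as dates, floats, etc.), but I wonder whether there is an even better way that would not require me to write my own function.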

Answer 1

The .str attribute of a Series provides vectorized string methods:

>>> ser = pd.Series(["a","2","z","123","a","oops"])
>>> ser.str.isdigit().sum()
2
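
One caveat, as far as I can tell: str.isdigit() only accepts strings made up entirely of digit characters, so signed values such as "-7" or "+4" are not counted. If those should count as integers, a regex with Series.str.fullmatch (available in recent pandas versions) is one possible sketch. The extra sample values and the pattern below are my own assumptions, not part of the question:

```python
import pandas as pd

# Sample data from the question plus two signed values (added for illustration)
ser = pd.Series(["a", "2", "z", "123", "a", "oops", "-7", "+4"])

# isdigit() ignores the signed strings
digit_count = int(ser.str.isdigit().sum())

# Optional sign followed by one or more digits; na=False treats
# missing values as non-matches instead of propagating NaN
signed_count = int(ser.str.fullmatch(r"[+-]?\d+", na=False).sum())

print(digit_count, signed_count)
```

Here digit_count covers only "2" and "123", while signed_count also includes "-7" and "+4".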

Answer 2

I'd use the pd.to_numeric() method:

In [62]: pd.to_numeric(s, errors='coerce')
Out[62]:
0      NaN
1      2.0
2      NaN
3    123.0
4      NaN
5      NaN
dtype: float64

In [63]: pd.to_numeric(s, errors='coerce').count()
Out[63]: 2
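
Note that this counts anything that parses as a number, so a string like "1.5" would be included too. If only whole numbers should count, one rough sketch is to additionally check that the parsed value has no fractional part. The "1.5" entry is added here purely for illustration, and this check would also accept a string like "2.0", which may or may not be what you want:

```python
import pandas as pd

# Sample data from the question plus one float-like string (added for illustration)
s = pd.Series(["a", "2", "z", "123", "a", "oops", "1.5"])

# Unparseable strings become NaN
nums = pd.to_numeric(s, errors="coerce")

# Counts every value that parsed as a number, including "1.5"
numeric_count = int(nums.count())

# Keep only parsed values with no fractional part
integer_count = int((nums.notna() & (nums % 1 == 0)).sum())

print(numeric_count, integer_count)
```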

Answer 3

You can do it like this:

isint = lambda x: all([ord(i) >= 48 and ord(i) < 58 for i in str(x)])
pd.Series(["a","2","z","123","a","oops"]).apply(isint).sum()
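
One thing to watch for: all() over an empty list is True, so the lambda above would treat an empty string as a valid integer. A small variant that guards against that might look like the following (the name is_ascii_int and the extra empty-string entry are just for illustration):

```python
import pandas as pd

def is_ascii_int(x):
    """Return True only for non-empty strings made of ASCII digits 0-9."""
    text = str(x)
    # The explicit emptiness check avoids all() returning True for ""
    return text != "" and all("0" <= ch <= "9" for ch in text)

# Sample data from the question plus an empty string (added for illustration)
data = pd.Series(["a", "2", "z", "123", "a", "oops", ""])

# The original lambda, kept for comparison; it counts "" as valid
isint = lambda x: all([ord(i) >= 48 and ord(i) < 58 for i in str(x)])

original_count = int(data.apply(isint).sum())
guarded_count = int(data.apply(is_ascii_int).sum())

print(original_count, guarded_count)
```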

Counting valid integers in a column of strings
