How do I count the elements in a column that are valid integers?
This is what I was able to come up with:
import re
import pandas as pd
pd.Series(["a","2","z","123","a","oops"]).apply(lambda x: x and re.match(r"^\d+$",x) and 1).sum()
==> 2.0
and
def isint(x):
    try:
        int(x)
        return 1
    except ValueError:
        return 0
pd.Series(["a","2","z","123","a","oops"]).apply(isint).sum()
==> 2
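Obviously, the second approach is better (it returns an int and generalizes easily to other types: dates, floats, etc.), but I wonder whether there is a better way that does not require me to write my own function.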
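To make the "generalizes to other types" point concrete, here is a minimal sketch of the try/except approach with the parser passed in as an argument. The helper name count_parsable is hypothetical, and the extra "1.5" element is added only for illustration:

```python
import pandas as pd

def count_parsable(series, parser):
    """Count elements that `parser` accepts without raising ValueError or TypeError."""
    def ok(x):
        try:
            parser(x)
            return True
        except (ValueError, TypeError):
            return False
    return int(series.apply(ok).sum())

ser = pd.Series(["a", "2", "z", "123", "1.5", "oops"])
n_int = count_parsable(ser, int)      # counts "2" and "123"
n_float = count_parsable(ser, float)  # counts "2", "123" and "1.5"
print(n_int, n_float)
```

Any callable that raises on bad input can be passed as the parser, which is what makes this version easy to extend to dates or floats.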
Answer 0 (score: 6)
The .str attribute of a Series provides vectorized string methods:
>>> ser = pd.Series(["a","2","z","123","a","oops"])
>>> ser.str.isdigit().sum()
2
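One caveat worth checking against your data: str.isdigit() is True only for strings made up entirely of digit characters, so signed values such as "-5" are not counted. If those should count, a regular expression is one option. This is a sketch that assumes pandas 1.1 or later (for Series.str.fullmatch) and a column containing only strings; the sample values are made up for illustration:

```python
import pandas as pd

ser = pd.Series(["a", "2", "z", "123", "-5", "+7", "1.5"])

# Digits only: "-5" and "+7" are rejected because of the sign character.
digits_only = ser.str.isdigit()

# Optional sign followed by one or more digits; the whole string must match.
signed = ser.str.fullmatch(r"[+-]?\d+", na=False)

n_digits_only = int(digits_only.sum())
n_signed = int(signed.sum())
print(n_digits_only, n_signed)
```

Here only "2" and "123" pass the isdigit() check, while the regular expression also accepts "-5" and "+7"; "1.5" is rejected by both.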
Answer 1 (score: 4)
I would use the pd.to_numeric() method:
In [62]: pd.to_numeric(s, errors='coerce')
Out[62]:
0      NaN
1      2.0
2      NaN
3    123.0
4      NaN
5      NaN
dtype: float64
In [63]: pd.to_numeric(s, errors='coerce').count()
Out[63]: 2
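Note that pd.to_numeric() accepts any numeric string, so a value like "1.5" would be counted as well. If only whole numbers should count, one possible refinement is to check the fractional part after the conversion. This is a sketch, with "1.5" added to the sample data for illustration:

```python
import pandas as pd

s = pd.Series(["a", "2", "z", "123", "a", "oops", "1.5"])

# Non-numeric strings become NaN.
nums = pd.to_numeric(s, errors="coerce")
n_numeric = int(nums.count())  # counts every parsable number, including 1.5

# Keep only values with no fractional part; NaN compares as False.
is_whole = nums.notna() & (nums % 1 == 0)
n_integers = int(is_whole.sum())
print(n_numeric, n_integers)
```

This treats strings such as "2.0" or "1e3" as integers too, since they parse to whole-number floats; whether that is desirable depends on the data.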
Answer 2 (score: 0)
You could do this:
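# True only for non-empty strings of ASCII digits (code points 48 to 57);
# without the emptiness check, all() over an empty string would return True.
isint = lambda x: str(x) != "" and all(48 <= ord(i) < 58 for i in str(x))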
pd.Series(["a","2","z","123","a","oops"]).apply(isint).sum()