I have a DataFrame (df):
df = pd.DataFrame({'A' : [54321, 'it is 54322', 'is it 54323 or 4?', np.NaN]})
I can extract the numbers in it with:
df['B'] = df.A.replace(regex={'[^\w]':'','^\D+':'','\D+':' '}).str.split('\s')
                   A           B
0              54321         NaN
1        it is 54322     [54322]
2  is it 54323 or 4?  [54323, 4]
3                NaN         NaN
But when I try to find the highest number in each row:
df['C'] = df['B'].apply(lambda x : max(x))
I get:
TypeError: 'float' object is not iterable
Answer 0 (score: 1)
Use a lambda function with if-else, and convert the values to integers so that max compares them numerically rather than as strings. Rows where B is NaN (a float, not a list) are returned as NaN, which avoids the TypeError:
f = lambda x : max(int(y) for y in x) if isinstance(x, list) else np.nan
df['C'] = df['B'].apply(f)
print (df)
                   A           B        C
0              54321         NaN      NaN
1        it is 54322     [54322]  54322.0
2  is it 54323 or 4?  [54323, 4]  54323.0
3                NaN         NaN      NaN
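To illustrate why the integer conversion matters, here is a minimal sketch with made-up values (the list `digits` is hypothetical and not taken from the question's data). Strings are compared character by character, so the text maximum can differ from the numeric one:

```python
# Hypothetical digit strings, as str.split would produce them
digits = ['9', '54323']

text_max = max(digits)                     # compared as text: '9' sorts after '5'
number_max = max(int(d) for d in digits)   # compared as integers

print(text_max)
print(number_max)
```

With the question's data the two happen to agree, but that is a coincidence of the values.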
Alternatively, use Series.str.extractall, which returns its matches under a MultiIndex, convert them to int, and take the max over the first index level.
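The code for the extractall approach did not survive in this copy of the answer, so the following is only a sketch of how it could look, not the answerer's original. It assumes a recent pandas, where the per-level maximum is taken with `groupby(level=0).max()`; the pattern `(\d+)` and the cast of column A to `str` are my own choices.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [54321, 'it is 54322', 'is it 54323 or 4?', np.nan]})

# Cast to str so the plain integer in row 0 is searched as well;
# the missing value contains no digits and yields no matches.
matches = df['A'].astype(str).str.extractall(r'(\d+)')[0].astype(int)

# extractall indexes its result by (original row, match number),
# so grouping on level 0 gives one maximum per original row.
df['C'] = matches.groupby(level=0).max()
print(df)
```

Unlike the lambda version, this sketch also picks up the bare integer in row 0, because the column is cast to strings before matching. Rows with no digits have no entry in `matches` and end up as NaN after index alignment.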
Answer 1 (score: 1)
Another solution:
import re
df['B'] = df['A'].apply(lambda x: pd.Series(re.findall(r'\d+', str(x))).astype(float).max())
print(df)
Prints:
                   A        B
0              54321  54321.0
1        it is 54322  54322.0
2  is it 54323 or 4?  54323.0
3                NaN      NaN