Finding the maximum value in a pandas DataFrame column of lists

Time: 2020-10-29 12:34:50

Tags: python pandas max

I have a DataFrame (df):

df = pd.DataFrame({'A' : [54321, 'it is 54322', 'is it 54323 or 4?', np.nan]})

I can extract the numbers from it with:

df['B'] = df.A.replace(regex={'[^\w]':'','^\D+':'','\D+':' '}).str.split('\s')

                   A           B
0              54321         NaN
1        it is 54322     [54322]
2  is it 54323 or 4?  [54323, 4]
3                NaN         NaN

But when I try to find the largest number in each row:

df['C'] = df['B'].apply(lambda x : max(x))

I get:

TypeError: 'float' object is not iterable

2 answers:

Answer 0: (score: 1)

Use a lambda function with if-else so that the NaN rows are skipped, and convert the values to integers so that max compares them numerically:

f = lambda x : max(int(y) for y in x) if isinstance(x, list) else np.nan
df['C'] = df['B'].apply(f)
print (df)
                   A           B        C
0              54321         NaN      NaN
1        it is 54322     [54322]  54322.0
2  is it 54323 or 4?  [54323, 4]  54323.0
3                NaN         NaN      NaN

Or use Series.str.extractall, which returns a MultiIndex, convert the matches to int, and take the max per first index level.
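
The code for the Series.str.extractall variant did not survive on this page, so the following is only a sketch of how that approach could look, not the answerer's original code. It assumes a pandas version where groupby(level=0) is available, casts column A to str so the plain integer in row 0 is searched as well, and uses the illustrative names matches and row_max.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [54321, 'it is 54322', 'is it 54323 or 4?', np.nan]})

# Cast to str so the integer in row 0 is searched too.
# The missing value in row 3 contains no digits, so it produces no matches.
matches = df['A'].astype(str).str.extractall(r'(\d+)')

# extractall returns a DataFrame indexed by (original row, match number).
# Grouping on the first level gives one maximum per original row.
row_max = matches[0].astype(int).groupby(level=0).max()

# Assignment aligns on the index, so rows without any match become NaN.
df['C'] = row_max
print(df)
```

Because the cast to str happens first, row 0 gets a value here, unlike the lambda version, which works on column B where row 0 is already NaN.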

Answer 1: (score: 1)

Another solution:

import re
df['B'] = df['A'].apply(lambda x: pd.Series(re.findall(r'\d+', str(x))).astype(float).max())
print(df)

Prints:

                   A        B
0              54321  54321.0
1        it is 54322  54322.0
2  is it 54323 or 4?  54323.0
3                NaN      NaN
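
To make it easier to see why the NaN row does not raise here, the one-liner above can be spelled out step by step. This is a minimal sketch: the helper name max_number is hypothetical, and the explicit dtype=object is an addition that only serves to keep the empty Series construction free of warnings on older pandas versions.

```python
import re

import numpy as np
import pandas as pd


def max_number(value):
    """Same steps as the one-liner above, written out for a single cell."""
    # str() makes ints and NaN searchable; str(np.nan) is 'nan', which has no digits.
    digits = re.findall(r'\d+', str(value))
    # An empty match list gives an empty Series, whose max() is NaN.
    return pd.Series(digits, dtype=object).astype(float).max()


cells = [54321, 'it is 54322', 'is it 54323 or 4?', np.nan]
results = [max_number(v) for v in cells]
print(results)
```

Building a Series for every cell is convenient but not cheap; for large frames the extractall approach is likely to scale better.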