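Simplest feature mapper in Python

Time: 2013-07-21 09:05:58

Tags: python pandas

I am trying to create the simplest possible feature mapper in Python 3. Two goals: get the best performance and learn how to program in Python :)

Here is my code, which does not work: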

import pandas as pd
source = pd.DataFrame({'Country' : ['USA', 'USA', 'Russia','USA'], 
                  'City' : ['New-York1', 'New-York', 'Sankt-Petersburg', 'New-York']})

#trim column value selecting first two symbols
def s_trim(x):
    return x[:2]

#make new column from two selecting first two symbols from each
def s_trim_concat(x,y):
    return '%s-%s' % (x[:2],y[:2])

features = [
    ('trim',['Country'],s_trim),
    ('trim1',['Country','City'],s_trim_concat),
    ('trim2',['City','Country'],s_trim_concat)
    ]

for feature_name, columns, func in features:
    source[feature_name] = source[columns].apply(func, axis=1)

print(source)

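Update: the code works now, but I had to complicate the functions, so I am still looking for a good solution that lets me use simple functions without type conversion: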

import pandas as pd
source = pd.DataFrame({'Country' : ['USA', 'USA', 'Russia','USA'], 
                  'City' : ['New-York1', 'New-York', 'Sankt-Petersburg', 'New-York']})

#trim column value selecting first two symbols
def s_trim(x):
    return x.str[:2]

#make new column from two selecting first two symbols from each
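def s_trim_concat(row):
    # .iloc makes the positional access explicit; plain row[0] on a
    # label-indexed row is deprecated in recent pandas versions
    x = row.iloc[0]
    y = row.iloc[1]
    return '%s-%s' % (x[:2],y[:2])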

features = [
    ('trim',['Country'],s_trim),
    ('trim1',['Country','City'],s_trim_concat),
    ('trim2',['City','Country'],s_trim_concat)
    ]

for feature_name, columns, func in features:
    if len(columns) == 1:
        source[feature_name] = source[columns].apply(func)
    else:
        source[feature_name] = source[columns].apply(func, axis=1)
print(source)
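One way to keep both functions as plain string functions is to skip `DataFrame.apply` and iterate over the selected columns with `zip`, unpacking each row into positional arguments. The sketch below is only one possible approach; `add_features` is a hypothetical helper name that does not appear in the original code.

```python
import pandas as pd

source = pd.DataFrame({
    'Country': ['USA', 'USA', 'Russia', 'USA'],
    'City': ['New-York1', 'New-York', 'Sankt-Petersburg', 'New-York'],
})

# Plain scalar functions: they receive ordinary Python strings.
def s_trim(x):
    return x[:2]

def s_trim_concat(x, y):
    return '%s-%s' % (x[:2], y[:2])

features = [
    ('trim', ['Country'], s_trim),
    ('trim1', ['Country', 'City'], s_trim_concat),
    ('trim2', ['City', 'Country'], s_trim_concat),
]

def add_features(df, features):
    # Hypothetical helper: zip the selected columns so that each row becomes
    # a tuple, then unpack that tuple into the function's arguments.
    for feature_name, columns, func in features:
        rows = zip(*(df[c] for c in columns))
        df[feature_name] = [func(*row) for row in rows]
    return df

result = add_features(source.copy(), features)
print(result)
```

Because the unpacking adapts to the number of columns, the loop needs no special case for single-column features.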

2 Answers:

Answer 0 (score: 0)

I think the problem is that you are passing a list to s_trim_concat instead of two separate arguments.
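To make that diagnosis concrete, the following sketch inspects what `apply(..., axis=1)` hands to the function. Strictly speaking it is a single pandas Series per row rather than a list, but the effect is the same: the two-argument function gets only one argument. The `describe` function and the shortened two-row frame are illustrative assumptions, not part of the original code.

```python
import pandas as pd

source = pd.DataFrame({
    'Country': ['USA', 'Russia'],
    'City': ['New-York', 'Sankt-Petersburg'],
})

def s_trim_concat(x, y):
    return '%s-%s' % (x[:2], y[:2])

def describe(row):
    # Report what apply actually passes in for each row.
    return '%s with %d values' % (type(row).__name__, len(row))

received = source[['Country', 'City']].apply(describe, axis=1)
print(received.tolist())

try:
    source[['Country', 'City']].apply(s_trim_concat, axis=1)
    error = None
except TypeError as exc:
    # The two-argument function was called with a single row argument.
    error = exc
print(error)
```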

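Could you provide a sample of the final output you expect for this example? First, I need to clarify: which key should the value returned from s_trim_concat be associated with?

Update

Try this: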

import pandas as pd
source = pd.DataFrame({'Country' : ['USA', 'USA', 'Russia','USA'], 
                  'City' : ['New-York1', 'New-York', 'Sankt-Petersburg', 'New-York']})

#trim column value selecting first two symbols
def s_trim(x):
    return x[:2]

#make new column from two selecting first two symbols from each
def s_trim_concat(x,y):
    return '%s-%s' % (x[:2],y[:2])

features = [
    ('trim',['Country'],s_trim),
    ('trim1',['Country','City'],s_trim_concat),
    ('trim2',['City','Country'],s_trim_concat)
    ]

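for feature_name, columns, func in features:
    # The Python 2 built-in apply() does not exist in Python 3; unpack each
    # row's values into separate positional arguments instead.
    source[feature_name] = source[columns].apply(lambda row: func(*row), axis=1)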

print(source)

Answer 1 (score: 0)

I may have found a solution:

import pandas as pd
source = pd.DataFrame({'Country' : ['USA', 'USA', 'Russia','USA'], 
                  'City' : ['New-York1', 'New-York', 'Sankt-Petersburg', 'New-York']})

#trim column value selecting first two symbols
def s_trim(x):
    return x.str[:2]

#make new column from two selecting first two symbols from each
def s_trim_concat(x,y):
    return '%s-%s' % (x[:2],y[:2])

features = [
    ('trim',['Country'],s_trim),
    ('trim1',['Country','City'],s_trim_concat),
    ('trim2',['City','Country'],s_trim_concat)
    ]

for feature_name, columns, func in features:
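    # .iloc makes the positional access explicit; plain x[0] on a
    # label-indexed row is deprecated in recent pandas versions
    source[feature_name] = source[columns].apply(
        func if len(columns) == 1
        else lambda x: func(x.iloc[0], x.iloc[1]), axis=1)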
print(source)
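Since the question also asks about performance, a column-wise variant may be worth considering: if every feature function accepts whole columns (Series) instead of single values, each column can be passed as its own argument and no per-row `apply` is needed. This is a sketch under that assumption. It trades the plain string functions for `.str` accessors, which is the complication the question hoped to avoid.

```python
import pandas as pd

source = pd.DataFrame({
    'Country': ['USA', 'USA', 'Russia', 'USA'],
    'City': ['New-York1', 'New-York', 'Sankt-Petersburg', 'New-York'],
})

# Column-wise versions: each argument is a whole pandas Series.
def s_trim(x):
    return x.str[:2]

def s_trim_concat(x, y):
    return x.str[:2] + '-' + y.str[:2]

features = [
    ('trim', ['Country'], s_trim),
    ('trim1', ['Country', 'City'], s_trim_concat),
    ('trim2', ['City', 'Country'], s_trim_concat),
]

for feature_name, columns, func in features:
    # Pass each selected column as its own positional argument.
    source[feature_name] = func(*(source[c] for c in columns))

print(source)
```

The loop body is the same for one or many columns, so the `len(columns) == 1` branch disappears.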