If...elif statements in python / pandas

Time: 2016-04-13 14:09:57

Tags: python-2.7 if-statement pandas split

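I am writing a script that sorts people's names. I have this working with the csv module, but because it will be tied into a larger pandas project, I thought I would convert it.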

I need to split a single name field into first, middle, and last name fields. The original field has the first name first, for example: Richard Wayne Van Dyke.

I can split the name, but I want "Van Dyke" to be the last name.

Here is my code using the csv module:

import csv

with open('inputfil.csv') as inf:
    docs = csv.reader(inf)
    next(docs, None)  # skip the header row
    for i in docs:
        #print i
        fullname = i[1]  # the name is in the second column of the input file
        namelist = fullname.split(' ')
        firstname = namelist[0]
        middlename = namelist[1]
        if len(namelist) == 2:
            lastname = namelist[1]
            middlename = ''
        elif len(namelist) == 3:
            lastname = namelist[2]
        elif len(namelist) == 4:
            lastname = namelist[2] + " " + namelist[3]  # gets "Van Dyke" in lastname
        print "First: " + firstname + " middle: " + middlename + " last: " + lastname
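The length-based branching above can also be written as a small function, which makes it easier to reuse later. The following is only a minimal sketch, not the code from the original script: `split_name` and the `PARTICLES` set are hypothetical names, and treating words such as "Van" or "Del" as the start of the surname is an assumption about the data.

```python
# Hypothetical helper; the particle list is an illustrative assumption,
# not something taken from the original script.
PARTICLES = {"van", "von", "del", "de", "la"}

def split_name(fullname):
    """Return (first, middle, last) for a 'First [Middle] Last' string."""
    parts = fullname.split()
    if not parts:
        return ("", "", "")
    if len(parts) == 1:
        return (parts[0], "", "")
    first, rest = parts[0], parts[1:]
    # The surname starts at the first particle found after the first name;
    # if there is no particle, it is just the final word.
    start = len(rest) - 1
    for idx, word in enumerate(rest):
        if word.lower() in PARTICLES:
            start = idx
            break
    middle = " ".join(rest[:start])
    last = " ".join(rest[start:])
    return (first, middle, last)

for name in ["Richard Wayne Van Dyke", "Gary Del Barco", "Dave Allen Smith"]:
    print(split_name(name))
```

Because the rule keys on the particle rather than on the word count, the same function covers both the four-word "Van Dyke" case and the three-word "Del Barco" case.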

Here is the pandas-based code I am struggling with:

import pandas as pd

df = pd.DataFrame({'Name':['Richard Wayne Van Dyke','Gary Del Barco','Dave Allen Smith']})
df = df.fillna('')
df = df.astype(unicode)
splits = df['Name'].str.split(' ', expand=True)

df['firstName'] = splits[0]
if  splits[2].notnull and splits[3].isnull:#this works for Bret Allen Cardwell

    df['lastName'] = splits[2]
    df['middleName'] = splits[1]
    print "Case 1: First: " + df['firstName'] + " middle: " +df['middleName'] + " last: " + df['lastName']
elif splits[2].all() == 'Del':#trying to get last name of "Del Barco"
    print 'del'
    df['middleName'] = ''
    df['lastName'] = splits[2] + " " + splits[3]
    print "Case 2: First: " + df['firstName'] + " middle: " +df['middleName'] + " last: " + df['lastName']

elif splits[3].notnull: #trying to get last name of "Van Dyke"
    df['middleName'] = splits[1]
    df['lastName'] = splits[2] + " " + splits[3]
    print "Case 3: First: " + df['firstName'] + " middle: " +df['middleName'] + " last: " + df['lastName']

I am missing something basic.
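One likely cause is that a plain `if` tests a single truth value, while `splits[2]` is a whole column. In addition, `splits[2].notnull` without parentheses refers to the method itself, which is always truthy, and in "Gary Del Barco" the word "Del" sits in `splits[1]`, not `splits[2]`. A common alternative is to build one boolean mask per case and assign with `.loc`. The sketch below is written in Python 3 syntax, so it drops the `unicode` cast, and it assumes the data contains only the three name shapes shown and that at least one name has four words (otherwise column 3 would not exist).

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Richard Wayne Van Dyke',
                            'Gary Del Barco',
                            'Dave Allen Smith']})

# expand=True pads shorter names with missing values in the extra columns.
splits = df['Name'].str.split(' ', expand=True)

df['firstName'] = splits[0]
# Default case: First Middle Last.
df['middleName'] = splits[1]
df['lastName'] = splits[2]

# Each mask is a boolean Series with one True/False per row.
four_words = splits[3].notnull()   # note the call parentheses
del_name = (splits[1] == 'Del') & splits[3].isnull()

# Four words: the last two words form the surname ("Van Dyke").
df.loc[four_words, 'lastName'] = (
    splits.loc[four_words, 2] + ' ' + splits.loc[four_words, 3]
)

# "Del" in second position: no middle name, surname is "Del Barco".
df.loc[del_name, 'middleName'] = ''
df.loc[del_name, 'lastName'] = (
    splits.loc[del_name, 1] + ' ' + splits.loc[del_name, 2]
)

print(df)
```

The masks replace the `if`/`elif` chain: each row is handled by whichever mask is true for it, so the three names no longer compete for a single branch.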

1 Answer:

Answer 0 (score: 0)

library(XML)

doc <- xmlParse("Input.xml")

stringdata <- t(xpathSApply(doc, "//String", xmlAttrs))
df <- data.frame(stringdata, stringsAsFactors = FALSE)

# CONVERT CHARACTER COLUMNS TO NUMERIC
df[, c(1,3:6)] <- sapply(df[, c(1,3:6)], function(x) as.numeric(x))
head(df)

#          WC     CONTENT HEIGHT WIDTH VPOS HPOS
# 1 0.8520000       SHELL     30    92  472  902
# 2 0.5462500    MAATVELD     32   150  475 1016
# 3 0.5287500    RIJKSWEG     34   150  511  901
# 4 0.2966667         A20     31    55  515 1073
# 5 0.4427273 NIEUWERKERK     36   207  550  900
# 6 0.2633333         A/D     31    54  557 1130