Pandas: select rows based on a function of a column

Time: 2019-06-21 12:51:23

Tags: python pandas

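I am trying to learn pandas. I have found examples of how to construct a pandas DataFrame and how to add columns, and they work fine. Now I want to learn how to select all rows based on the value of a column. I have found several examples that select rows where a column's value is smaller or greater than some number. My question is how to make a more general selection: first compute a function of a column, then select all rows for which the function's value is greater or smaller than some number.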

import names
import numpy as np
import pandas as pd
from datetime import date
import random

def randomBirthday(startyear, endyear):
    T1 = date.today().replace(day=1, month=1, year=startyear).toordinal()
    T2 = date.today().replace(day=1, month=1, year=endyear).toordinal()
    return date.fromordinal(random.randint(T1, T2))

def age(birthday):
    today = date.today()
    return today.year - birthday.year - ((today.month, today.day) < (birthday.month, birthday.day))

N_PEOPLE = 20
dict_people = { }
dict_people['gender'] = np.array(['male','female'])[np.random.randint(0, 2, N_PEOPLE)]
dict_people['names'] = [names.get_full_name(gender=g) for g in dict_people['gender']]

peopleFrame = pd.DataFrame(dict_people)

# Example 1: Add new columns to the data frame
peopleFrame['birthday'] = [randomBirthday(1920, 2020) for i in range(N_PEOPLE)]

# Example 2: Select all people with a certain age
peopleFrame.loc[age(peopleFrame['birthday']) >= 20]

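This code works except for the last line. What is the correct way to write that line? I have considered adding an extra column holding the values of the function age and then selecting on that column. That works, but I wonder whether I have to do it that way. What if I don't want to store a person's age and only want to use it for the selection?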

1 answer:

Answer 0: (score: 2)

Use Series.apply:

peopleFrame.loc[peopleFrame['birthday'].apply(age) >= 20]
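The original line most likely fails because age() receives the whole Series and tries to read birthday.year on it, whereas Series.apply calls age() once per element and returns a Series that can be compared with 20 to form a boolean mask. Below is a minimal, self-contained sketch of that idea. The small frame `people`, the sample names and dates, the optional `today` parameter, and the fixed `reference_day` are assumptions added here for reproducibility; they are not part of the question's code.

```python
from datetime import date

import pandas as pd


def age(birthday, today=None):
    """Age in whole years on `today` (defaults to the current date)."""
    if today is None:
        today = date.today()
    # Subtract one year if this year's birthday has not happened yet.
    return today.year - birthday.year - (
        (today.month, today.day) < (birthday.month, birthday.day)
    )


# Hypothetical hand-written data standing in for peopleFrame.
people = pd.DataFrame({
    "names": ["Ann", "Bob", "Cid"],
    "birthday": [date(1950, 5, 17), date(2010, 1, 2), date(1999, 6, 21)],
})

# Fixed reference date (an assumption) so the selection does not change over time.
reference_day = date(2019, 6, 21)

# apply() evaluates age() element by element; the result is a temporary Series.
ages = people["birthday"].apply(lambda b: age(b, reference_day))

# Comparing the Series with a number gives a boolean mask for .loc.
adults = people.loc[ages >= 20]

print(adults)
```

The ages exist only in the temporary Series `ages`; no extra column is added to the frame, which is what the question asked for.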