How to loop in Python with a condition

Asked: 2018-11-06 03:31:31

Tags: python pandas loops

I have been trying to code this for a while. Here is a sample dataframe:

import pandas as pd
import numpy as np

capacity = 500
s = pd.Series(['School 1','School 2', 'School 3','School 4', 'School 5'])
p = pd.Series(['132', '458', '333', '300', '258'])
d = pd.Series(['1', '2', '3', '4', '5'])

df = pd.DataFrame(np.c_[s,p,d],columns = ['School Name','Population', 'Distance'])

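What I want is a loop that keeps subtracting 'Population' from 'capacity' as long as the capacity is not exceeded. The schools must be checked in order of 'Distance'.

Example: Since 'School 1' is the nearest school, 132 is subtracted from 500, leaving 368. 'School 2' is the second nearest, but its population exceeds 368 (458 > 368), so the loop stops there and does not go on to check the third nearest school, 'School 3'.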

After that, the school name should be assigned to another column.

The final result would be:

s = pd.Series(['School 1','School 2', 'School 3','School 4', 'School 5'])
p = pd.Series(['132', '458', '333', '300', '258'])
d = pd.Series(['1', '2', '3', '4', '5'])
sn = pd.Series(['School 1', 0, 0 ,0 ,0])
df2 = pd.DataFrame(np.c_[s,p,d,sn],columns = ['School Name','Population', 'Distance','Included'])

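I have been trying to do this since yesterday and still have no idea how to do it other than manually. I am still a beginner in Python.

Thanks for your help!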

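1 Answer:

Answer 0: (score: 2)

Based on your question, I assume you want only a single school name: the last school that fits before the capacity is exceeded. This can be done as follows: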

import pandas as pd
import numpy as np

capacity = 500

s = pd.Series(['School 1','School 2', 'School 3','School 4', 'School 5'])
p = pd.Series(['132', '458', '333', '300', '258'])
d = pd.Series(['1', '2', '3', '4', '5'])
df = pd.DataFrame(np.c_[s,p,d],columns = ['School Name','Population', 'Distance'])

# converting population to integer values
p = p.astype('int')

# placeholder to store school name
school_name = None

for idx, val in enumerate(p):
    # keep assigning school name until capacity is exceeded
    capacity -= val
    if capacity < 0:
        break
    school_name = s[idx]

# add included column
df['included'] = np.where(df['School Name'] == school_name, df['School Name'], 0)

You can then print df to check that it works:

>>> df
  School Name Population Distance    included
0    School 1        132        1    School 1
1    School 2        458        2           0
2    School 3        333        3           0
3    School 4        300        4           0
4    School 5        258        5           0
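Note that the loop above walks the rows in their stored order, which happens to match the distance order in the sample data. If the rows might not be sorted, one option is to sort by 'Distance' first. The following is only a minimal sketch under that assumption; the function name last_school_within_capacity and the shuffled three-row frame are made up for illustration and are not part of the original question.

```python
import pandas as pd

def last_school_within_capacity(df, capacity):
    """Return the last school, in order of distance, that still fits (hypothetical helper)."""
    # The sample data stores numbers as strings, so convert before sorting
    ordered = df.assign(
        Population=df["Population"].astype(int),
        Distance=df["Distance"].astype(int),
    ).sort_values("Distance")

    remaining = capacity
    school_name = None
    for name, population in zip(ordered["School Name"], ordered["Population"]):
        remaining -= population
        if remaining < 0:
            # this school does not fit, so stop checking further schools
            break
        school_name = name
    return school_name

# Illustrative data: same schools as the question, deliberately out of order
shuffled = pd.DataFrame({
    "School Name": ["School 3", "School 1", "School 2"],
    "Population": ["333", "132", "458"],
    "Distance": ["3", "1", "2"],
})
result = last_school_within_capacity(shuffled, 500)
print(result)
```

Working on a local copy of the capacity (remaining) also leaves the original value untouched, so the function can be called more than once.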

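However, if you want to keep all the schools up to the point where the capacity is exceeded, only a small change to the program above is needed. Just replace the placeholder and the loop as shown below: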

school_names = []    # placeholder will be a list now
for idx, val in enumerate(p):
    capacity -= val
    if capacity < 0:
        break
    school_names.append(s[idx])    # keep adding schools that do not exceed capacity to the list

# Instead of equality, check if school name is in your list
df['included'] = np.where(df['School Name'].isin(school_names), df['School Name'], 0)

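Now, if you keep capacity = 500 and change the populations to p = pd.Series(['132', '128', '333', '300', '258']), so that the second school has 128 instead of 458, both School 1 and School 2 will be included.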
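For comparison, the same selection can probably be expressed without an explicit loop by using a running total. This is a sketch of an alternative technique, not the approach from the answer above. It assumes the rows are already sorted by distance and that populations are never negative, so the running total only grows and the mask can never switch back to True after the first school that does not fit.

```python
import numpy as np
import pandas as pd

capacity = 500
df = pd.DataFrame({
    "School Name": ["School 1", "School 2", "School 3", "School 4", "School 5"],
    "Population": ["132", "128", "333", "300", "258"],
    "Distance": ["1", "2", "3", "4", "5"],
})

# Running total of population in row order (assumed to be distance order)
running_total = df["Population"].astype(int).cumsum()

# True for every school that still fits within the capacity
fits = running_total <= capacity

# Same convention as above: school name if included, otherwise 0
df["included"] = np.where(fits, df["School Name"], 0)
print(df)
```

With these populations the running totals are 132, 260 and 593, so only the first two schools fall within the capacity of 500.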