I am building lists based on certain conditions.
This is what it looks like:
def time_price_pair(a, b):
    if 32400 <= a and a < 32940:
        a_list = []
        a_list.append(b)
    elif 32940 <= a and a < 33480:
        b_list = []
        b_list.append(b)
    elif 33480 <= a and a < 34020:
        c_list = []
        c_list.append(b)
    ......
    ......
    ......
    elif 52920 <= a and a < 53460:
        some_list = []
        some_list.append(b)
The bounds of each condition increase by 540, e.g. [32400,32940,33480,34020,34560,35100,35640,36180,36720,37260,37800,38340,38880,39420 .... 53460]
The list names are not important.
Answer 0 (score: 1)
You can use a for loop with an incrementing variable i and keep updating the bounds as you go. Something like this:
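def time_price_pair(a, b):
    lo = 32400   # lower bound of the first bin
    hi = 32940   # upper bound of the first bin
    inc = 540    # bin width
    n_bins = (53460 - lo) // inc  # 39 bins in total
    for i in range(n_bins):
        if lo + inc * i <= a < hi + inc * i:
            a_list = [b]  # the list for bin i
            return i, a_list
    return None  # a falls outside [32400, 53460)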
Answer 1 (score: 1)
I would use a dict to store these lists of values, and a little arithmetic to work out which list each number belongs in:
from collections import defaultdict

lists = defaultdict(list)  # bin index -> list of values

def time_price_pair(a, b):
    if 32400 <= a < 53460:
        i = (a - 32400) // 540  # floor division keeps the index an integer
        lists[i].append(b)
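As a quick check of the arithmetic, here is a minimal self-contained sketch of the same idea for Python 3. The sample values and the names `START`, `STOP`, and `STEP` are my own and only for illustration. It uses floor division so that the dict keys are integer bin indices.

```python
from collections import defaultdict

START, STOP, STEP = 32400, 53460, 540  # bounds and bin width from the question

lists = defaultdict(list)  # bin index -> list of b values


def time_price_pair(a, b):
    """Append b to the list of the 540-wide bin that contains a."""
    if START <= a < STOP:
        lists[(a - START) // STEP].append(b)


# Hypothetical sample data, not from the question
samples = [(32400, 1.0), (32939, 2.0), (32940, 3.0), (53459, 4.0), (53460, 5.0)]
for a, b in samples:
    time_price_pair(a, b)

result = dict(lists)
print(result)
```

The last sample, 53460, lies outside the half-open range [32400, 53460) and is silently ignored, which matches the chain of `elif` conditions in the question.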
Answer 2 (score: 0)
A dictionary can be used to hold all of the time-range bins that are actually used, as follows:
import collections

time_prices = [(32401, 20), (32402, 30), (32939, 42), (32940, 10), (32941, 15), (40000, 123), (40100, 234)]
dPrices = collections.OrderedDict()

for atime, aprice in time_prices:
    abin = 32400 + ((atime - 32400) // 540) * 540    # For bins as times
    #abin = (atime - 32400) // 540 + 1    # For bins starting from 1
    dPrices.setdefault(abin, []).append(aprice)

# Display results
for atime, prices in dPrices.items():
    print atime, prices
This gives you the following output:
32400 [20, 30, 42]
32940 [10, 15]
39960 [123, 234]
Or, for an individual bin:
print dPrices[32400]
[20, 30, 42]
Tested with Python 2.7.
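For Python 3, a minimal equivalent sketch might look like the following. It assumes Python 3.7 or later, where a plain dict preserves insertion order, so `OrderedDict` is not needed. The name `d_prices` is my own.

```python
time_prices = [(32401, 20), (32402, 30), (32939, 42), (32940, 10),
               (32941, 15), (40000, 123), (40100, 234)]

d_prices = {}  # bin start time -> list of prices, in first-seen order
for atime, aprice in time_prices:
    # Round the time down to the start of its 540-wide bin
    abin = 32400 + ((atime - 32400) // 540) * 540
    d_prices.setdefault(abin, []).append(aprice)

# Display results
for atime, prices in d_prices.items():
    print(atime, prices)
```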
Answer 3 (score: 0)
The high-level pandas function pd.cut looks like a simple and good fit for your purpose.
import pandas as pd
import numpy as np
# simulate your data
# ==================================
np.random.seed(0)
a = np.random.randint(32400, 53439, size=1000000)
b = np.random.randn(1000000)
# put them in dataframe
df = pd.DataFrame(dict(a=a, b=b))
print(df)
a b
0 35132 -0.4605
1 43199 -0.9469
2 42245 0.2580
3 52048 -0.7309
4 45523 -0.4334
5 41625 2.0155
6 53157 -1.4712
7 46516 -0.1715
8 47335 -0.6594
9 47830 -1.0391
... ... ...
999990 39754 0.8771
999991 34779 0.7030
999992 37836 0.5409
999993 44330 -0.6747
999994 41078 -1.1368
999995 38752 1.6121
999996 42155 -0.1139
999997 49018 -0.1737
999998 45848 -1.2640
999999 50669 -0.4367
# processing
# ===================================
rng = np.arange(32400, 53461, 540)
# your custom labels
labels = np.arange(1, len(rng))
# use pd.cut()
%time df['cat'] = pd.cut(df.a, bins=rng, right=False, labels=labels)
CPU times: user 52.5 ms, sys: 16 µs, total: 52.5 ms
Wall time: 51.6 ms
print(df)
a b cat
0 35132 -0.4605 6
1 43199 -0.9469 20
2 42245 0.2580 19
3 52048 -0.7309 37
4 45523 -0.4334 25
5 41625 2.0155 18
6 53157 -1.4712 39
7 46516 -0.1715 27
8 47335 -0.6594 28
9 47830 -1.0391 29
... ... ... ..
999990 39754 0.8771 14
999991 34779 0.7030 5
999992 37836 0.5409 11
999993 44330 -0.6747 23
999994 41078 -1.1368 17
999995 38752 1.6121 12
999996 42155 -0.1139 19
999997 49018 -0.1737 31
999998 45848 -1.2640 25
999999 50669 -0.4367 34
[1000000 rows x 3 columns]
# groupby
grouped = df.groupby('cat')['b']
# access to a particular group using your user_defined key
grouped.get_group(1).values
array([ 0.4525, -0.7226, -0.981 , ..., 0.0985, -1.4286, -0.2257])
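To make the bin labels easier to check by hand, here is a small sketch of the same `pd.cut` call on a handful of values. The sample data and the names `edges`, `cats`, and `first_bin` are my own and are not part of the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample, small enough to verify manually
df = pd.DataFrame({"a": [32400, 32939, 32940, 40000, 53459],
                   "b": [1.0, 2.0, 3.0, 4.0, 5.0]})

edges = np.arange(32400, 53461, 540)  # 40 edges -> 39 bins
labels = np.arange(1, len(edges))     # bin labels 1..39

# right=False makes each bin [lower, upper), as in the question
df["cat"] = pd.cut(df["a"], bins=edges, right=False, labels=labels)

cats = df["cat"].tolist()
first_bin = df.loc[df["cat"] == 1, "b"].tolist()
print(cats)
print(first_bin)
```

With `right=False`, the value 32940 falls into the second bin rather than the first, which mirrors the `32940 <= a` condition in the question.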