Question

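I am trying to compare a datetime object with dates stored in a pandas Series. For each element of the Series that matches the datetime object passed in, the corresponding demand value is appended to a list. The demand values are numpy.float64.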

import datetime as dt
import pandas as pd

date_chosen = dt.datetime(2019, 4, 2)
raw_csv = pd.read_csv(data_series, sep=',', na_values=missing_values)

demand_s = pd.to_numeric(raw_csv['DEMAND'])          # extracts demand
date_series = pd.to_datetime(raw_csv['DATE'])        # extracts date

demand_needed = []                        # which demand values match the date_chosen
day = date_series.dt.day                  # only includes day 
for i in day:
    if day[i] == date_chosen.day:         # if element in day is same as chosen one
        demand_needed.append(demand_s[i]) # append matching element 

print(type(date_chosen.day))              # = int
print(type(day[2]))                       # = numpy.int64

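The code runs without errors, but the problem is that demand_needed ends up empty. date_chosen.day is a standard int, while the elements of day are numpy.int64. How can I compare an int with a numpy.int64?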

Answer 1

In your for loop, i is the value of each row in the Series day, not its index. Your loop should therefore look more like this:

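import datetime as dt
import pandas as pd

date_chosen = dt.datetime(2019, 4, 2)
raw_csv = pd.read_csv(data_series, sep=',', na_values=missing_values)

demand_s = pd.to_numeric(raw_csv['DEMAND'])
date_series = pd.to_datetime(raw_csv['DATE'])

demand_needed = []
day = date_series.dt.day
# items() yields (index label, value) pairs; iteritems() was removed in pandas 2.0
for idx, d in day.items():
    if d == date_chosen.day:
        # idx is an index label, so look it up with .loc
        demand_needed.append(demand_s.loc[idx])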
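
As a rough illustration of the point above, the following sketch uses a small made-up Series (the dates are hypothetical, not taken from your CSV) to show what plain iteration and items() each produce. It also suggests that comparing an int with a NumPy integer is not the problem, since the two compare by value:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the DATE column of the CSV.
dates = pd.to_datetime(pd.Series(["2019-04-02", "2019-04-03", "2019-05-02"]))
day = dates.dt.day

# Iterating a Series directly yields its values (here 2, 3, 2),
# not its index labels (0, 1, 2).
values = list(day)

# items() yields (index label, value) pairs instead.
pairs = list(day.items())

# A plain int and a NumPy integer compare by value; no conversion is needed.
same = bool(np.int64(2) == 2)

print(values, pairs, same)
```

In the original loop, each day value was reused as a row label, so day[i] looked up some unrelated row rather than the current element.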

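But IIUC, a better solution is to use .loc with a boolean mask instead of iterating:

# date_series is the parsed DATE column; raw_csv['DATE'] itself still holds strings
demand_needed = raw_csv.loc[date_series.dt.day.eq(date_chosen.day), 'DEMAND']

Or, if you need the output as a list rather than a Series, use:

demand_needed = raw_csv.loc[date_series.dt.day.eq(date_chosen.day), 'DEMAND'].tolist()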
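
For a self-contained version you can run without your file, here is a minimal sketch that reads a tiny in-memory CSV. The CSV contents, the use of parse_dates, and the variable names same_day and same_date are assumptions made for illustration. It also shows one way to match the full calendar date rather than only the day of the month, in case that is what you actually want:

```python
import datetime as dt
import io

import pandas as pd

# Hypothetical CSV contents; the real file and its values will differ.
csv_text = "DATE,DEMAND\n2019-04-02,10.5\n2019-04-03,11.0\n2019-05-02,12.5\n"
raw_csv = pd.read_csv(io.StringIO(csv_text), parse_dates=["DATE"])

date_chosen = dt.datetime(2019, 4, 2)

# Match on the day of the month only, as in the question.
same_day = raw_csv["DATE"].dt.day.eq(date_chosen.day)
demand_by_day = raw_csv.loc[same_day, "DEMAND"].tolist()

# Match the full calendar date instead (time of day is dropped by normalize()).
same_date = raw_csv["DATE"].dt.normalize().eq(pd.Timestamp(date_chosen))
demand_by_date = raw_csv.loc[same_date, "DEMAND"].tolist()

print(demand_by_day, demand_by_date)
```

Matching on the day alone also picks up 2019-05-02 in this sample, which may or may not be intended.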

Comparing a datetime object with elements of a pandas Series
