Why is the data empty when I add it to a DataFrame column?

Time: 2018-12-14 08:40:35

Tags: python pandas

I use a filter to check conditions in a DataFrame so that I can flag the matching rows.

filtering = (dfsamen.shift(0).moving == 'movingToclose') & (more conditions)
dffilter = pd.DataFrame(data=filtering, columns=['filter'])
dffilter['DateTime'] = dfsamen['DateTime']

Output:

filtering

4     False
5     False
6      True
7      True

dffilter

4    False 2018-06-03 06:33:38.593
5    False 2018-06-03 06:33:39.197
6     True 2018-06-03 06:33:40.597
7     True 2018-06-03 06:33:41.800

But then I used the same code with different conditions, and it does not work:

filtering2 = (dfsamen.shift(0).Input5==1) | (more conditions)
dffilter2 = pd.DataFrame(data=filtering2, columns=['filter2'])
dffilter2['DateTime'] = dfsamen['DateTime']

Output:

filtering2

4     False
5      True
6      True
7      True

dffilter2 (before adding DateTime)

Empty DataFrame
Columns: [filter2]
Index: []

dffilter2 (with DateTime)

4      NaN 2018-06-03 06:33:38.593
5      NaN 2018-06-03 06:33:39.197
6      NaN 2018-06-03 06:33:40.597
7      NaN 2018-06-03 06:33:41.800

So why does my data disappear in the second filter, even though filtering2 contains data and I pass that data to the column?

1 answer:

Answer 0: (score: 1)

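The problem is in your DataFrame constructor: a default RangeIndex can be created, so the two DataFrames may end up with different indexes. Column assignment aligns on index labels, so the data does not line up, and rows whose index values differ get NaN in the column.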
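The alignment behavior described above can be reproduced with a minimal sketch. The data and the names `aligned` and `positional` are made up for illustration and are not from the question; the point is only that assigning a Series aligns on index labels, while assigning its `.values` is positional.

```python
import pandas as pd

# Source frame with a non-default index, similar in shape to dfsamen
dfsamen = pd.DataFrame(
    {"DateTime": pd.date_range("2015-01-01", periods=3)},
    index=[4, 5, 6],
)

# Target frame with a default RangeIndex (0, 1, 2)
dffilter = pd.DataFrame({"filter": [True, False, False]})

# Assigning a Series aligns on index labels: 0, 1, 2 never match 4, 5, 6,
# so every row of the new column is missing (NaT for datetimes)
aligned = dffilter.copy()
aligned["DateTime"] = dfsamen["DateTime"]

# Assigning the underlying array ignores labels and fills by position
positional = dffilter.copy()
positional["DateTime"] = dfsamen["DateTime"].values

print(aligned)
print(positional)
```

Positional assignment only makes sense when both objects have the same length and the same row order, which is the case here.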

The solution is to convert the values to a numpy array:

filtering = (dfsamen.shift(0).moving == 'movingToclose') & (more conditions)

dffilter = pd.DataFrame(data=filtering.values, columns=['filter'])
dffilter['DateTime'] = dfsamen['DateTime'].values
print(dffilter)

Example:

import pandas as pd

# Sample data with a non-default index (4, 5, 6)
dfsamen = pd.DataFrame({
    'A': list('abc'),
    'DateTime': pd.date_range('2015-01-01', periods=3),
    'C': [7, 8, 9]
}, index=[4, 5, 6])

print(dfsamen)
   A   DateTime  C
4  a 2015-01-01  7
5  b 2015-01-02  8
6  c 2015-01-03  9

filtering = dfsamen.A == 'a'

dffilter = pd.DataFrame(data=filtering.values, columns=['filter'])
dffilter['DateTime'] = dfsamen['DateTime'].values
print(dffilter)
   filter   DateTime
0    True 2015-01-01
1   False 2015-01-02
2   False 2015-01-03

Or use Series.to_frame to convert the Series to a one-column DataFrame, which keeps the original index:

filtering = dfsamen.A == 'a'

dffilter = filtering.to_frame('filter')
dffilter['DateTime'] = dfsamen['DateTime'].values
print(dffilter)
   filter   DateTime
4    True 2015-01-01
5   False 2015-01-02
6   False 2015-01-03
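One further detail that the answer does not state, offered here only as a possible explanation of why the first filter worked and the second did not: a boolean Series built from a single column keeps that column's name, and passing a named Series to the constructor together with a different `columns` label appears to select a column that does not exist. The sketch below uses made-up data, and the frames `mismatched` and `fixed` are hypothetical names. Both `.values` and `to_frame` avoid the issue because neither looks up the Series name.

```python
import pandas as pd

# Made-up data; the column names mirror the question, the values do not
dfsamen = pd.DataFrame(
    {"Input5": [0, 1, 1, 1], "moving": ["a", "b", "movingToclose", "c"]},
    index=[4, 5, 6, 7],
)

# Conditions on one column only: the result keeps the name 'Input5'
named = (dfsamen["Input5"] == 1) | (dfsamen["Input5"] == 2)

# Conditions on two different columns: the result has no name
unnamed = (dfsamen["Input5"] == 1) | (dfsamen["moving"] == "movingToclose")

print(named.name, unnamed.name)

# Named Series plus a different column label: 'filter2' is not found,
# so the resulting column carries no data
mismatched = pd.DataFrame(data=named, columns=["filter2"])
print(mismatched)

# to_frame renames the Series instead of looking the name up
fixed = named.to_frame("filter2")
print(fixed)
```

If this is what happened in the question, checking `filtering2.name` before building the DataFrame would confirm it.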