I have a dataset like this:
import pandas as pd

# Daily timestamps with two flag columns, all initialised to 0
a = pd.DataFrame({'time': pd.date_range(start='2016-03-10', end='2019-03-10'),
                  'a': [0 for _ in range(1096)],
                  'b': [0 for _ in range(1096)]})
indices_a = [0, 1, 3, 6, 10, 15, 20, 40, 50, 70, 100, 400, 700]
indices_b = [0, 1, 3, 6, 10, 15, 20, 40, 50, 70, 100, 400, 700]
# Set the flags to 1 at the chosen row positions
a.loc[indices_a, 'a'] = 1
a.loc[indices_b, 'b'] = 1
The above creates a DataFrame in which columns a and b are 0 everywhere except at the listed indices, where they are 1.
What I want to do is use pandas functions to go through each column, find the rows where the value is 1, and then build another DataFrame like the example below:
time | category
2018-03-10 | a
2018-02-10 | a
2018-04-10 | a
2018-05-10 | a
2018-06-10 | b
2018-07-10 | b
2018-08-10 | b
2018-09-10 | b
2018-10-10 | b
My attempt:
Answer 0 (score: 2)
IIUC, you need melt and .query:
b = a.melt(id_vars='time',var_name='category').query('value == 1')\
.drop('value',axis=1)
print(b)
time category
0 2016-03-10 a
1 2016-03-11 a
3 2016-03-13 a
6 2016-03-16 a
10 2016-03-20 a
15 2016-03-25 a
20 2016-03-30 a
40 2016-04-19 a
50 2016-04-29 a
70 2016-05-19 a
100 2016-06-18 a
400 2017-04-14 a
700 2018-02-08 a
1096 2016-03-10 b
1097 2016-03-11 b
1099 2016-03-13 b
1102 2016-03-16 b
1106 2016-03-20 b
1111 2016-03-25 b
1116 2016-03-30 b
1136 2016-04-19 b
1146 2016-04-29 b
1166 2016-05-19 b
1196 2016-06-18 b
1496 2017-04-14 b
1796 2018-02-08 b
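To make the two steps easier to follow, here is a minimal sketch of the same melt-then-filter idea on a tiny frame. The five-row frame `df` and the names `long_df` and `result` are made up for illustration and are not from the question; the trailing `reset_index` is an optional extra, not part of the answer above.

```python
import pandas as pd

# Hypothetical 5-row frame with the same layout as the question's data
df = pd.DataFrame({
    'time': pd.date_range(start='2016-03-10', periods=5),
    'a': [1, 0, 1, 0, 0],
    'b': [0, 0, 1, 0, 1],
})

# Step 1: melt stacks columns a and b into long form,
# giving one (time, category, value) row per original cell
long_df = df.melt(id_vars='time', var_name='category', value_name='value')

# Step 2: keep only rows whose flag is 1, then drop the helper column
result = (
    long_df.query('value == 1')
           .drop(columns='value')
           .reset_index(drop=True)  # optional: renumber the rows from 0
)

print(result)
```

Without the `reset_index` call, the rows keep their positions from the melted frame, which is why the index in the output above jumps from 700 to 1096 where category b begins.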