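I wrote a for loop. I want it to scan the values in the "Type" column of my DataFrame and, whenever it sees the string "LEC", print the corresponding time from the "Schedule" column of that row.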
import pandas as pd

pd.set_option('display.max_columns', None)
d = pd.read_html("https://www.bu.edu/phpbin/course-search/section/?t=casma124")
d = pd.concat(d)
number_of_rows = 1  # number of rows in dataframe
index_range = list(range(number_of_rows))
d = d.loc[:, ["Section", "Type", "Schedule", "Location"]]
print(d)
for i in d.loc[:, 'Type']:
    if d.loc[i, 'Type']:
        print(d.loc[i, 'Schedule'])
    Section  Type  Schedule               Location
0   A1       LEC   MWF 1:25 pm-2:15 pm    STO B50
1   A1       NaN   R 6:30 pm-8:30 pm      ROOM
2   A2       LEC   MWF 12:20 pm-1:10 pm   STO B50
3   A2       NaN   R 6:30 pm-8:30 pm      ROOM
4   A3       LEC   TR 12:30 pm-1:45 pm    STO B50
5   A3       NaN   R 6:30 pm-8:30 pm      ROOM
6   B1       DIS   T 2:00 pm-3:15 pm      EPC 207
7   B2       DIS   T 3:30 pm-4:45 pm      EPC 207
8   B3       DIS   T 5:00 pm-6:15 pm      EPC 207
9   B4       DIS   R 2:00 pm-3:15 pm      EPC 207
10  B5       DIS   M 2:30 pm-3:45 pm      CAS 324
11  B6       DIS   W 2:30 pm-3:45 pm      CAS 324
12  B7       DIS   R 3:30 pm-4:45 pm      EPC 207
13  SA1      IND   MTWR 1:00 pm-3:00 pm   MCS B29
14  SA2      IND   MTR 6:00 pm-8:30 pm    COM 217
15  SB1      IND   MTWR 11:00 am-1:00 pm  PSY B51
16  SB2      IND   MTR 6:00 pm-8:30 pm    PSY B37
17  A1       LEC   MWF 11:15 am-12:05 pm  STO
18  A1       NaN   R 6:30 pm-8:30 pm      NaN
19  A2       LEC   MWF 2:30 pm-3:20 pm    STO
20  A2       NaN   R 6:30 pm-8:30 pm      NaN
21  A3       LEC   TR 8:00 am-9:15 am     STO
22  A3       NaN   R 6:30 pm-8:30 pm      NaN
23  B1       DIS   M 4:30 pm-5:45 pm      NaN
24  B2       DIS   T 12:30 pm-1:45 pm     NaN
25  B3       DIS   T 3:30 pm-4:45 pm      NaN
26  B4       DIS   W 8:30 am-9:45 am      CAS
27  B5       DIS   W 4:30 pm-5:45 pm      NaN
28  B6       DIS   R 12:30 pm-1:45 pm     NaN
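Answer 0 (score: 0)
I asked a similar question about accessing rows based on a condition in a particular column and storing them in separate worksheets. You may find Manish Chaudhary's answer to my question helpful:
Using Openpyxl to create multiple custom spreadsheets
In the end, I gave up on Openpyxl and did the task with pandas instead.
Edit:
I came up with some code that creates a separate spreadsheet file for each class type. It may not be exactly what you asked for, but it could be a good starting point: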
import pandas as pd

# read_html returns a list of DataFrames, one per HTML table on the page
d = pd.read_html("https://www.bu.edu/phpbin/course-search/section/?t=casma124")
d = pd.concat(d)
d = d.loc[:, ["Section", "Type", "Schedule", "Location"]]
print(d)

# Missing values in "Type" are real NaN, not the string 'NaN',
# so relabel them before matching on text
type_labels = d["Type"].fillna("NaN")
types = ('LEC', 'NaN', 'DIS', 'IND')
for t in types:
    df = d[type_labels.str.startswith(t)]
    if df.empty:
        continue  # nothing of this type, skip writing an empty file
    df.to_excel(t + '.xlsx', index=False)  # writing .xlsx needs openpyxl installed
    print('Sheet {} saved successfully!'.format(t))
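As for the loop in the question: `for i in d.loc[:, 'Type']` iterates over the values of the column ("LEC", NaN, ...), not over the row labels, so `d.loc[i, 'Type']` ends up looking for a row labelled "LEC" and fails. A minimal sketch of one way around that is a boolean mask on the "Type" column. The small DataFrame below is only a stand-in copied from the first rows of the printed output so the snippet runs without the network, and the names `is_lec`, `lec_schedules` and `loop_schedules` are mine, not from the question. The `str.strip()` call is there on the assumption that the scraped values might carry stray whitespace.

```python
import numpy as np
import pandas as pd

# Stand-in for the scraped table (a few rows copied from the printed output)
d = pd.DataFrame({
    "Section": ["A1", "A1", "A2", "A2", "B1"],
    "Type": ["LEC", np.nan, "LEC", np.nan, "DIS"],
    "Schedule": [
        "MWF 1:25 pm-2:15 pm",
        "R 6:30 pm-8:30 pm",
        "MWF 12:20 pm-1:10 pm",
        "R 6:30 pm-8:30 pm",
        "T 2:00 pm-3:15 pm",
    ],
    "Location": ["STO B50", "ROOM", "STO B50", "ROOM", "EPC 207"],
})

# Boolean mask: True where Type is "LEC" after stripping whitespace.
# Rows where Type is NaN compare as False, so they are dropped.
is_lec = d["Type"].str.strip() == "LEC"
lec_schedules = d.loc[is_lec, "Schedule"]
print(lec_schedules.tolist())

# The same thing as an explicit loop, iterating over rows rather than values
loop_schedules = []
for idx, row in d.iterrows():
    if row["Type"] == "LEC":
        loop_schedules.append(row["Schedule"])
```

The mask version is usually the more idiomatic choice in pandas; the loop is shown only because it is closer to what the question started with.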