I have some hyperlinks stored in a local Excel file. They are all in the same column. For example:
| A                                 |
| ----------------------------------|
| http://vocab.getty.edu/tgn/8699749|
| http://vocab.getty.edu/tgn/8704811|
| http://vocab.getty.edu/tgn/8702341|
| http://vocab.getty.edu/tgn/1063874|
| http://vocab.getty.edu/tgn/1063880|
| http://vocab.getty.edu/tgn/7032551|
|-----------------------------------|
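Each link points to a page from which I want to extract the information associated with the field xl:prefLabel and store the result in column B.
Could openpyxl be the solution?
The expected result should look like this: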
| A                                 | B                      |
|-----------------------------------|------------------------|
| http://vocab.getty.edu/tgn/8699749| tgn_term:1005671253-fr |
| http://vocab.getty.edu/tgn/8704811| tgn_term:1005683546-de |
| http://vocab.getty.edu/tgn/8702341| tgn_term:1005684314    |
| http://vocab.getty.edu/tgn/1063874| tgn_term:64447         |
| http://vocab.getty.edu/tgn/1063880| tgn_term:64453         |
| http://vocab.getty.edu/tgn/7032551| tgn_term:1001213640    |
|-----------------------------------|------------------------|
Answer 0 (score: 0)
A quick solution is to read the column with pandas and index into it:
import pandas as pd
from urllib import request

# Read the sheet without treating the first row as a header
all_hyperlinks = pd.read_excel(path_to_excel_file, index_col=None, header=None)
first_hl = all_hyperlinks.loc[0, 0]  # Get the first hyperlink
contents = request.urlopen(first_hl).read()  # Raw bytes of the page behind the link
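
The snippet above only downloads the first link. To go through every row and fill column B, something along these lines might work. It is a rough sketch under several assumptions that are not from the question: that appending ".nt" to a Getty URL returns the record as N-Triples, that xl: stands for the SKOS-XL namespace, and that tgn_term: abbreviates http://vocab.getty.edu/tgn/term/. The helper names (extract_pref_labels, fetch_ntriples, fill_column_b) are made up for illustration, and openpyxl has to be installed for the last step.

```python
import urllib.request

# Assumptions (not stated in the question): xl: expands to the SKOS-XL
# namespace, and tgn_term: abbreviates the URI base below.
XL_PREF_LABEL = "http://www.w3.org/2008/05/skos-xl#prefLabel"
TGN_TERM_BASE = "http://vocab.getty.edu/tgn/term/"


def extract_pref_labels(ntriples_text, subject_uri):
    """Return the xl:prefLabel objects of subject_uri as tgn_term:... strings."""
    labels = []
    for line in ntriples_text.splitlines():
        # An N-Triples line looks like: <subject> <predicate> <object> .
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        subject, predicate, rest = parts
        if subject != "<%s>" % subject_uri or predicate != "<%s>" % XL_PREF_LABEL:
            continue
        # Drop the trailing " ." and the angle brackets around the object URI
        obj = rest.rstrip().rstrip(".").strip().strip("<>")
        if obj.startswith(TGN_TERM_BASE):
            obj = "tgn_term:" + obj[len(TGN_TERM_BASE):]
        labels.append(obj)
    return labels


def fetch_ntriples(url, timeout=30):
    """Download the N-Triples form of a record (assumes the '.nt' suffix works)."""
    with urllib.request.urlopen(url + ".nt", timeout=timeout) as response:
        return response.read().decode("utf-8")


def fill_column_b(path_to_excel_file, fetch=fetch_ntriples):
    """Write the xl:prefLabel of each link in column A into column B."""
    # Third-party import kept local so the parser above works without openpyxl
    from openpyxl import load_workbook

    workbook = load_workbook(path_to_excel_file)
    sheet = workbook.active
    for row in range(1, sheet.max_row + 1):
        url = sheet.cell(row=row, column=1).value
        if not url:
            continue  # skip empty cells
        url = str(url).strip()
        labels = extract_pref_labels(fetch(url), url)
        sheet.cell(row=row, column=2, value="; ".join(labels))
    workbook.save(path_to_excel_file)
```

Calling fill_column_b(path_to_excel_file) would then overwrite the file in place, so it is probably worth trying it on a copy first. The fetch argument is there so the download step can be swapped out, for example for a cached or rate-limited version.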