Python: wrapping a list into rows and converting types with Beautiful Soup

Time: 2018-09-19 14:42:23

Tags: python list beautifulsoup

Here is my code:

from urllib.request import urlopen  # b_soup_1.py
from bs4 import BeautifulSoup

# Treasury Yield Curve web site, known to be HTML code
html = urlopen('https://www.treasury.gov/resource-center/'
               'data-chart-center/interest-rates/Pages/'
               'TextView.aspx?data=yieldYear&year=2018')

# create the BeautifulSoup object (BeautifulSoup Yield Curve)
bsyc = BeautifulSoup(html.read(), "lxml")

# save it to a file that we can edit
#fout = open('bsyc_temp.txt', 'wt', encoding='utf-8')

#fout.write(str(bsyc))

#fout.close()

# so get a list of all table tags
table_list = bsyc.findAll('table')


# to findAll as a dictionary attribute
tc_table_list = bsyc.findAll('table',
                      { "class" : "t-chart" } )

# only 1 t-chart table, so grab it
tc_table = tc_table_list[0]
# what are this table's components/children?
# tag tr means table row, containing table data
# what are the children of those rows?
# we have found the table data!
# just get the contents of each cell
print('\nthe contents of the children of the t-chart table:')
daily_yield_curves_temp = []
daily_yield_curves = []
for c in tc_table.children:
    for r in c.children:
        for i in r.contents:
            daily_yield_curves_temp.append(i)
for x in range(len(daily_yield_curves_temp) // 12):
    daily_yield_curves.append(daily_yield_curves_temp[12 * x : 12 * x + 12])

print(daily_yield_curves)

The output is:

  

[['Date', '1 mo', '3 mo', '6 mo', '1 yr', '2 yr', '3 yr', '5 yr', '7 yr', '10 yr', '20 yr', '30 yr'], ['01/02/18', '1.29', '1.44', '1.61', '1.83', '1.92', '2.01', '2.25', '2.38', '2.46', '2.64', '2.81'], ['01/03/18', '1.29', '1.41', '1.59', '1.81', '1.94', '2.02', '2.25', '2.37', '2.44', '2.62', '2.78'], ['01/04/18', '1.28', '1.41', '1.60', '1.82', '1.96', '2.05', '2.27', '2.38', '2.46', '2.62', '2.79'] ........

However, I want the output to look like this:

daily_yield_curves = [
        [ … header list … ],
        [ … first data list … ],
        …
        [ … final data list … ]
    ]
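One possible way to get the one-row-per-line layout sketched above is to build the text by hand instead of printing the nested list directly. The following is only a minimal sketch: `format_rows` and the `sample` data are hypothetical names introduced for illustration, and the sample values are taken from the first rows shown in the question.

```python
def format_rows(rows, name="daily_yield_curves", indent=4):
    """Return a string that shows one inner list per line.

    `rows` is assumed to be a list of lists (header first, then data rows).
    """
    pad = " " * indent
    lines = [f"{name} = ["]
    for i, row in enumerate(rows):
        # Every row except the last one is followed by a comma.
        comma = "," if i < len(rows) - 1 else ""
        lines.append(f"{pad}{row!r}{comma}")
    lines.append("]")
    return "\n".join(lines)


# Hypothetical, shortened sample shaped like the data in the question.
sample = [
    ['Date', '1 mo', '3 mo'],
    ['01/02/18', 1.29, 1.44],
]
print(format_rows(sample))
```

The standard library's `pprint` module is another option, though it chooses its own line breaks based on a width setting, so the result may not match this layout exactly.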
  

['Date', '1 mo', '3 mo', '6 mo', '1 yr', '2 yr', '3 yr', '5 yr', '7 yr', '10 yr', '20 yr', '30 yr']

Below that should be one list per data row, with each interest-rate value converted from a string to a float:

  

['01/02/18', 1.29, 1.44, 1.61, 1.83, 1.92, 2.01, 2.25, 2.38, 2.46, 2.64, 2.81] ... ['09/14/18', 2.02, 2.16, 2.33, 2.56, 2.78, 2.85, 2.90, 2.96, 2.99, 3.07, 3.13]

Please help me make this change.
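The string-to-float conversion described above could be sketched roughly as follows. It assumes the first row is the header, that every other row has the date in its first cell, and that every remaining cell parses as a number; a cell such as 'N/A' would raise `ValueError` and would need separate handling. `convert_rows` and `sample` are hypothetical names, not part of the original code.

```python
def convert_rows(rows):
    """Return a new list of rows with the rate cells converted to float.

    Assumes rows[0] is the header and each later row looks like
    [date, rate, rate, ...] with every cell given as a string.
    """
    if not rows:
        return []
    header, *data = rows
    # Keep the header as plain strings.
    converted = [[str(cell).strip() for cell in header]]
    for row in data:
        date, *rates = row
        # The date stays a string; every other cell becomes a float.
        converted.append([str(date).strip()] + [float(r) for r in rates])
    return converted


# Hypothetical, shortened sample shaped like the output in the question.
sample = [
    ['Date', '1 mo', '3 mo'],
    ['01/02/18', '1.29', '1.44'],
    ['01/03/18', '1.29', '1.41'],
]
print(convert_rows(sample))
```

Applied to the question's code, this would be called as `convert_rows(daily_yield_curves)` after the loop that groups the cells into rows of 12. Since the strings Beautiful Soup returns are `str` subclasses, `float()` should accept them directly.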

0 answers:

No answers yet.