Python 2.7: print list output without quotes, with the output extracted from HTML

Time: 2017-10-28 09:10:55

Tags: python-2.7 list beautifulsoup

I want the output to look like Mon 5:00 pm to 12:00 am, with all the single quotes and spaces removed from the printed check list. Code:

import glob
import os
from bs4 import BeautifulSoup

for path in glob.glob(os.path.join("C:\\Users\\test", "*.html")):
    soup = BeautifulSoup(open(path), 'html.parser')
    # Take the text of the first <table class="table"> and split it on whitespace
    hours = soup.find_all('table', {'class': 'table'})[0].get_text().strip().split()
    check = [i.encode('utf-8').strip().replace("-", "to") for i in hours]
    print check

Current output:

['Mon', '5:00', 'pm', 'to', '12:00', 'am', 'Tue', '5:00', 'pm', 'to', '12:00', 'am']

2 answers:

Answer 0 (score: 0)

If the output is consistent, you can join the items of the output list to get the desired result.

My output:

Mon 5:00 pm to 12:00 am, Tue 5:00 pm to 12:00 am
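
A minimal sketch of the join approach, assuming `check` holds the list from the question and that every day contributes exactly six tokens (day, opening time, am/pm, `to`, closing time, am/pm). `format_hours` and `TOKENS_PER_DAY` are hypothetical names introduced only for illustration:

```python
check = ['Mon', '5:00', 'pm', 'to', '12:00', 'am',
         'Tue', '5:00', 'pm', 'to', '12:00', 'am']

# Assumption: each day is described by exactly six tokens.
TOKENS_PER_DAY = 6

def format_hours(tokens, size=TOKENS_PER_DAY):
    # Join each group of `size` tokens with spaces, then join the days with ", ".
    days = [" ".join(tokens[i:i + size]) for i in range(0, len(tokens), size)]
    return ", ".join(days)

result = format_hours(check)
print(result)
```

If the comma between days is not needed, `" ".join(check)` alone is enough. The fixed group size only works while the scraped table stays this regular; a row with a different shape (for example a day marked as closed) would need different handling.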


Answer 1 (score: 0)

This is how I did it, though there may be a better way.

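check = str(check)                    # "['Mon', '5:00', ..., 'am']"
check = check.strip('[]')             # drop the surrounding brackets
check = check.strip("'")              # drop the first and last quote
check = check.replace("', '", " ")    # str() of a list separates items with "', '"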
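
As a quick check of this string-based approach, the sketch below runs it on the list from the question and compares the result with `" ".join(check)`. It assumes `check` contains plain strings; the variable names `text` and `joined` are introduced here for illustration. Note that `str()` on a list separates items with `', '` (quote, comma, space, quote), so that is the exact text that has to be replaced.

```python
check = ['Mon', '5:00', 'pm', 'to', '12:00', 'am',
         'Tue', '5:00', 'pm', 'to', '12:00', 'am']

text = str(check)                 # "['Mon', '5:00', ..., 'am']"
text = text.strip('[]')           # remove the surrounding brackets
text = text.strip("'")            # remove the first and last quote
text = text.replace("', '", " ")  # turn each item separator into a space

joined = " ".join(check)          # same string without going through repr
print(text)
print(text == joined)
```

Going through the list's string representation is fragile if an item itself contains a quote or a bracket, which is one reason joining the items directly tends to be the safer choice.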