BeautifulSoup: the data I extract from an HTML table does not print in the same table format. Can I keep the table format?

Time: 2016-10-27 14:05:27

Tags: python-2.7 selenium-webdriver beautifulsoup

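I have an HTML test report file that contains a list of test cases. Each test case sits in its own row of an HTML table. I have managed to extract the test case from each row of the table. When I write the results into my email code, they are not laid out as a table the way they are in the HTML. I would like to keep the row and column grid lines so that the result displays nicely as a table.

My method for extracting the data is: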

def extract_testcases_from_report_htmltestrunner():
    filename = (r"E:\test_runners project\selenium_regression_test\TestReport\ClearCore_Automated_GUI_Regression_Project_TestReport.html")
    html_report_part = open(filename,'r')
    soup = BeautifulSoup(html_report_part, "html.parser")
    for div in soup.select("#result_table tr div.testcase"):
        yield div.text.strip().encode('utf-8'), div.find_next("a").text.strip().encode('utf-8')

When I write this into my email code, the output I get is:

test_000001_login_valid_user
pass
test_000002_select_a_project
pass
test_000003_verify_Lademo_CRM_DataPreview_is_present
pass
test_000004_view_data_preview_Lademo_CRM_and_test_scrollpage
pass

If possible, I would like the output in the following format, or in the same table format as the HTML:

test_000001_login_valid_user                                  pass
test_000002_select_a_project                                  pass
test_000003_verify_Lademo_CRM_DataPreview_is_present          pass
test_000004_view_data_preview_Lademo_CRM_and_test_scrollpage  pass

The HTML snippet is:

    <div class='heading'>
    <h1>Test Report</h1>
    <p class='attribute'><strong>Start Time:</strong> 2016-10-27 10:06:59</p>
    <p class='attribute'><strong>Duration:</strong> 0:57:01.842000</p>
    <p class='attribute'><strong>Status:</strong> Pass 93</p>

    <p class='description'>Selenium - ClearCore Regression Project  Automated Test</p>
</div>
<p id='show_detail_line'>Show
    <a href='javascript:showCase(0)'>Summary</a>
    <a href='javascript:showCase(1)'>Failed</a>
    <a href='javascript:showCase(2)'>All</a>
</p>
<table id='result_table'>
    <colgroup>
        <col align='left' />
        <col align='right' />
        <col align='right' />
        <col align='right' />
        <col align='right' />
        <col align='right' />
    </colgroup>
    <tr id='header_row'>
        <td>Test Group/Test case</td>
        <td>Count</td>
        <td>Pass</td>
        <td>Fail</td>
        <td>Error</td>
        <td>View</td>
    </tr>

    <tr class='passClass'>
        <td>Regression_TestCase.RegressionProject_TestCase</td>
        <td>47</td>
        <td>47</td>
        <td>0</td>
        <td>0</td>
        <td><a href="javascript:showClassDetail('c1',47)">Detail</a></td>
    </tr>

    <tr id='pt1.1' class='hiddenRow'>
        <td class='none'><div class='testcase'>test_000001_login_valid_user</div></td>
        <td colspan='5' align='center'>

            <!--css div popup start-->
            <a class="popup_link" onfocus='this.blur();' href="javascript:showTestDetail('div_pt1.1')" >
        pass</a>

            <div id='div_pt1.1' class="popup_window">
                <div style='text-align: right; color:red;cursor:pointer'>
                    <a onfocus='this.blur();' onclick="document.getElementById('div_pt1.1').style.display = 'none' " >
           [x]</a>
                </div>
                <pre>

pt1.1: *** test_login_valid_user ***
test login with a valid user - Passed


                </pre>
            </div>
            <!--css div popup end-->

        </td>
    </tr>

    <tr id='pt1.2' class='hiddenRow'>
        <td class='none'><div class='testcase'>test_000002_select_a_project</div></td>
        <td colspan='5' align='center'>

            <!--css div popup start-->
            <a class="popup_link" onfocus='this.blur();' href="javascript:showTestDetail('div_pt1.2')" >
        pass</a>

            <div id='div_pt1.2' class="popup_window">
                <div style='text-align: right; color:red;cursor:pointer'>
                    <a onfocus='this.blur();' onclick="document.getElementById('div_pt1.2').style.display = 'none' " >
           [x]</a>
                </div>
                <pre>

pt1.2: *** test_login_valid_user ***
test login with a valid user - Passed

                </pre>
            </div>
            <!--css div popup end-->

        </td>
    </tr>

    <tr id='pt1.3' class='hiddenRow'>
        <td class='none'><div class='testcase'>test_000057_run_clean_and_match_process</div></td>
        <td colspan='5' align='center'>

            <!--css div popup start-->
            <a class="popup_link" onfocus='this.blur();' href="javascript:showTestDetail('div_pt1.3')" >
        pass</a>

            <div id='div_pt1.3' class="popup_window">
                <div style='text-align: right; color:red;cursor:pointer'>
                    <a onfocus='this.blur();' onclick="document.getElementById('div_pt1.3').style.display = 'none' " >
           [x]</a>
                </div>
                <pre>

pt1.3: *** test_login_valid_user ***
test login with a valid user - Passed

                </pre>
            </div>
            <!--css div popup end-->

        </td>
    </tr>

    <tr id='pt1.4' class='hiddenRow'>
        <td class='none'><div class='testcase'>test_000058_view_all_records_report_CRM_CRM2_ESCR</div></td>
        <td colspan='5' align='center'>

            <!--css div popup start-->
            <a class="popup_link" onfocus='this.blur();' href="javascript:showTestDetail('div_pt1.4')" >
        pass</a>

            <div id='div_pt1.4' class="popup_window">
                <div style='text-align: right; color:red;cursor:pointer'>
                    <a onfocus='this.blur();' onclick="document.getElementById('div_pt1.4').style.display = 'none' " >
           [x]</a>
                </div>
                <pre>

pt1.4: *** test_login_valid_user ***
test login with a valid user - Passed

*** Test view_all_records_report - CRM, CRM2, ESCR ***

                </pre>
            </div>
            <!--css div popup end-->

        </td>
    </tr>

    <tr id='pt1.5' class='hiddenRow'>
        <td class='none'><div class='testcase'>test_000059_view_matches_report_CRM_CRM2_ESCR</div></td>
        <td colspan='5' align='center'>

            <!--css div popup start-->
            <a class="popup_link" onfocus='this.blur();' href="javascript:showTestDetail('div_pt1.5')" >
        pass</a>

            <div id='div_pt1.5' class="popup_window">
                <div style='text-align: right; color:red;cursor:pointer'>
                    <a onfocus='this.blur();' onclick="document.getElementById('div_pt1.5').style.display = 'none' " >
           [x]</a>
                </div>
                <pre>

pt1.5: *** test_login_valid_user ***
test login with a valid user - Passed

*** Test view_all_records_report - CRM, CRM2, ESCR ***

                </pre>
            </div>
            <!--css div popup end-->

        </td>
    </tr>

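Is this possible?

The word pass goes onto a new line. It would help if I could put it in a separate column, or after some padding spaces.

The word pass is inside an <a> tag in the HTML. The following line of code finds it. When I extract it, can I add a few spaces or put it in another column?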

yield div.text.strip().encode('utf-8'), div.find_next("a").text.strip().encode('utf-8')

The snippet of my code that builds the email message is:

msg = MIMEText("\n ClearCore Automated GUI Project Test Report \n " + "\n" +
               "".join([' - '.join(seq) for seq in extract_status_from_report_htmltestrunner()]) + "\n\n" +
               '\n'.join([elem
                          for seq in extract_testcases_from_report_htmltestrunner()
                          for elem in seq]) + "\n" +
               "\n Report location = : \\\storage-1\Testing\Selenium_Test_Report_Results\ClearCore\Selenium VM \n" + "\n")

My code for extracting the status from the report is:

def extract_status_from_report_htmltestrunner():
    filename = (
    r"E:\test_runners 2 edit project\selenium_regression_test\TestReport\ClearCore_Automated_GUI_Regression_Project_TestReport.html")
    html_report_part = open(filename, 'r')
    soup = BeautifulSoup(html_report_part, "html.parser")
    div_heading = soup.find('div', {'class': 'heading'})
    p_status = div_heading.find('strong', text='Status:').parent
    p_status.find(text=True, recursive=False)
    print p_status.text
    return p_status.text

Thanks, Riaz

1 Answer:

Answer 0: (score: 0)

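from bs4 import BeautifulSoup

# `text` is assumed to hold the HTML of the report as a string
soup = BeautifulSoup(text, 'lxml')
# each test case lives in a <tr class='hiddenRow'> row
trs = soup.find_all(class_='hiddenRow')
for tr in trs:
    row1 = tr.find('td').get_text(strip=True)   # test case name
    row2 = tr.find('a').get_text(strip=True)    # pass/fail status
    # left-align the name in 55 columns, right-align the status in 5
    print('{:<55}{:>5}'.format(row1, row2))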

Out:

test_000001_login_valid_user                            pass
test_000002_select_a_project                            pass
test_000057_run_clean_and_match_process                 pass
test_000058_view_all_records_report_CRM_CRM2_ESCR       pass
test_000059_view_matches_report_CRM_CRM2_ESCR           pass
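
The fixed width of 55 works for the names above, but a longer test name would push its status out of line, and padded columns only line up in an email client that uses a monospaced font. As a rough sketch rather than a definitive fix, the helpers below take (name, status) pairs, size the text column from the longest name, and also build an HTML table with visible borders for use in an HTML email. The function names format_rows_as_text and format_rows_as_html are made up for illustration, and the sample rows are copied from the report above.

```python
from email.mime.text import MIMEText
from xml.sax.saxutils import escape


def format_rows_as_text(rows):
    """Return one line per (name, status) pair with the status column aligned."""
    rows = list(rows)
    # Pad every name to the length of the longest one.
    width = max([len(name) for name, _ in rows] or [0])
    return "\n".join(name.ljust(width) + "  " + status for name, status in rows)


def format_rows_as_html(rows):
    """Return an HTML table with grid lines for (name, status) pairs."""
    # Inline styles, because many mail clients ignore <style> blocks.
    cell = '<td style="border: 1px solid #999; padding: 2px 8px;">{}</td>'
    body = "\n".join(
        "<tr>" + cell.format(escape(name)) + cell.format(escape(status)) + "</tr>"
        for name, status in rows)
    return '<table style="border-collapse: collapse;">\n' + body + "\n</table>"


# Sample rows taken from the report; in the question these would come
# from the generator that yields (test case name, status) pairs.
rows = [
    ("test_000001_login_valid_user", "pass"),
    ("test_000002_select_a_project", "pass"),
    ("test_000057_run_clean_and_match_process", "pass"),
]

text_body = format_rows_as_text(rows)
html_body = format_rows_as_html(rows)

# The second argument sets the subtype, so the client renders the table.
msg = MIMEText(html_body, "html")
print(text_body)
```

In the email code from the question, the line break comes from the nested comprehension, which flattens each (name, status) pair before '\n'.join puts every element on its own line. Joining one formatted line per pair, as format_rows_as_text does, keeps the name and the status on the same row.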