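I'm hoping someone can show me how to create a pandas DataFrame that contains only the text from the second column of the table below, without the first two rows or the left-hand column. The solution needs to work for multiple similar tables.
I thought pd.read_html(LOTable.prettify(),skiprows=2, flavor='bs4')
which builds a list of DataFrames from the HTML (skipping two rows), was the way to go, but the resulting data structure is too hard for this newbie to understand or to reshape into something simpler.
Does anyone have a way to process the result, or can you recommend another way to refine the data, so that I end up with a single column containing only the text I need?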
<table cellpadding="5" cellspacing="0" class="borders" width="100%">
<tr>
<th colspan="2">
Learning Outcomes
</th>
</tr>
<tr>
<td class="info" colspan="2">
On successful completion of this module the learner will be able to:
</td>
</tr>
<tr>
<td style="width:10%;">
LO1
</td>
<td>
Demonstrate an awareness of the important role of Financial Accounting information as an input into the decision making process.
</td>
</tr>
<tr>
<td style="width:10%;">
LO2
</td>
<td>
Display an understanding of the fundamental accounting concepts, principles and conventions that underpin the preparation of Financial statements.
</td>
</tr>
<tr>
<td style="width:10%;">
LO3
</td>
<td>
Understand the various formats in which information in relation to transactions or events is recorded and classified.
</td>
</tr>
<tr>
<td style="width:10%;">
LO4
</td>
<td>
Apply a knowledge of accounting concepts,conventions and techniques such as double entry to the posting of recorded information to the T accounts in the Nominal Ledger.
</td>
</tr>
<tr>
<td style="width:10%;">
LO5
</td>
<td>
Prepare and present the financial statements of a Sole Trader in prescribed format from a Trial Balance accompanies by notes with additional information.
</td>
</tr>
</table>
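Answer 0 (score: 1)
First option
Use iloc
read_html returns a list of DataFrames, so pick the table with [0] first; iloc then gets rid of the first column:
pd.read_html(LOTable.prettify(),skiprows=2, flavor='bs4')[0].iloc[:, 1:]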
Explanation
...iloc[:, 1:]
#       ^  ^
#       |   \
#  says to   says to take columns
#  take all  from position 1 onward
#  rows
If you only need that single column (returned as a Series), use
pd.read_html(LOTable.prettify(),skiprows=2, flavor='bs4')[0].iloc[:, 1]
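To make the difference between the two iloc forms concrete, here is a minimal sketch. It uses a small hand-built DataFrame as a hypothetical stand-in for the table that read_html(...)[0] would return; the cell texts are placeholders, not the real table content.

```python
import pandas as pd

# Hypothetical stand-in for the parsed table: column 0 holds the labels,
# column 1 holds the text we actually want.
df = pd.DataFrame({
    0: ["LO1", "LO2", "LO3"],
    1: ["outcome one", "outcome two", "outcome three"],
})

# All rows, every column from position 1 onward: still a DataFrame.
without_first = df.iloc[:, 1:]

# All rows, only the column at position 1: a Series.
second_only = df.iloc[:, 1]

print(type(without_first).__name__, without_first.shape)
print(type(second_only).__name__, second_only.tolist())
```

The slice 1: keeps the result two-dimensional, which is convenient if a one-column DataFrame is what you want; the plain integer 1 collapses it to a Series.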
The code I ran
import pandas as pd

htm = """<table cellpadding="5" cellspacing="0" class="borders" width="100%">
<tr>
<th colspan="2">
Learning Outcomes
</th>
</tr>
<tr>
<td class="info" colspan="2">
On successful completion of this module the learner will be able to:
</td>
</tr>
<tr>
<td style="width:10%;">
LO1
</td>
<td>
Demonstrate an awareness of the important role of Financial Accounting information as an input into the decision making process.
</td>
</tr>
<tr>
<td style="width:10%;">
LO2
</td>
<td>
Display an understanding of the fundamental accounting concepts, principles and conventions that underpin the preparation of Financial statements.
</td>
</tr>
<tr>
<td style="width:10%;">
LO3
</td>
<td>
Understand the various formats in which information in relation to transactions or events is recorded and classified.
</td>
</tr>
<tr>
<td style="width:10%;">
LO4
</td>
<td>
Apply a knowledge of accounting concepts,conventions and techniques such as double entry to the posting of recorded information to the T accounts in the Nominal Ledger.
</td>
</tr>
<tr>
<td style="width:10%;">
LO5
</td>
<td>
Prepare and present the financial statements of a Sole Trader in prescribed format from a Trial Balance accompanies by notes with additional information.
</td>
</tr>
</table> """
pd.read_html(htm,skiprows=2, flavor='bs4')[0].iloc[:, 1:]
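Since the question also asks for other approaches that cope with several similar tables, here is a rough alternative sketch that does not use read_html at all. It relies only on the standard library's html.parser and keeps the second cell of every row that has exactly two td cells, so the th header row and the single-cell colspan row drop out without needing skiprows. The class name SecondCellParser and the sample markup are made up for illustration, and the sketch assumes the tables are not nested.

```python
from html.parser import HTMLParser


class SecondCellParser(HTMLParser):
    """Collect the text of the second <td> in every row with exactly two <td> cells."""

    def __init__(self):
        super().__init__()
        self.tables = []    # one list of strings per <table>
        self._row = None    # cell texts of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            # Collapse newlines and repeated spaces inside the cell.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            # Header (<th>) rows have no <td>; colspan rows have only one.
            if len(self._row) == 2 and self.tables:
                self.tables[-1].append(self._row[1])
            self._row = None
            self._cell = None


# Made-up sample with two tables shaped like the one in the question.
sample = """
<table>
<tr><th colspan="2">Learning Outcomes</th></tr>
<tr><td colspan="2">On successful completion ...</td></tr>
<tr><td>LO1</td><td> First outcome. </td></tr>
<tr><td>LO2</td><td>
 Second outcome.
</td></tr>
</table>
<table>
<tr><td>LO1</td><td>Another module outcome.</td></tr>
</table>
"""

parser = SecondCellParser()
parser.feed(sample)
parser.close()
print(parser.tables)
```

Each inner list can then be wrapped in a one-column frame, for example pd.DataFrame({"outcome": texts}), where "outcome" is just a placeholder column name.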