When parsing a simple XML document (UTF-8 encoded), xml.etree.ElementTree.fromstring raises UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 1022-1023: invalid continuation byte
Here is my code:
import xml.etree.ElementTree as ET
posts_file = open(posts_path, "r", encoding="utf-8")
count = 0
line = posts_file.read()
root = ET.fromstring(line)
Here is the XML file:
<row Id="376095"
PostTypeId="2" ParentId="376081"
CreationDate="2008-12-17T21:28:45.560"
Score="103"
Body="<pre><code>$('#mytable tr').each(function() {
 var customerId = $(this).find("td:first").html(); 
});
</code></pre>

<p>What you are doing is iterating through all the trs in the table, finding the first td in the current tr in the loop, and extracting its inner html.</p>

<p>To select a particular cell, you can reference them with an index:</p>

<pre><code>$('#mytable tr').each(function() {
 var customerId = $(this).find("td").eq(2).html(); 
});
</code></pre>

<p>In the above code, I will be retrieving the value of the <strong>third row</strong> (the index is zero-based, so the first cell index would be 0)</p>

<hr>

<p>Here's how you can do it without jQuery:</p>

<pre><code>var table = document.getElementById('mytable'), 
 rows = table.getElementsByTagName('tr'),
 i, j, cells, customerId;

for (i = 0, j = rows.length; i &lt; j; ++i) {
 cells = rows[i].getElementsByTagName('td');
 if (!cells.length) {
 continue;
 }
 customerId = cells[0].innerHTML;
}
</code></pre>

<p></p>
"
OwnerUserId="44084"
OwnerDisplayName="Dreas"
LastEditorUserId="880797"
LastEditorDisplayName="Dreas"
LastEditDate="2011-11-04T16:25:28.717"
LastActivityDate="2011-11-04T16:25:28.717"
CommentCount="6" />
I am using Python 3.6.2.
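
One way to narrow this down, offered as a sketch rather than a confirmed diagnosis: a UnicodeDecodeError is raised when bytes are decoded, so it can help to read the file in binary mode and look at the bytes around the reported position. The helper name `find_invalid_utf8` and the `sample` bytes below are hypothetical and only illustrate the idea; they are not taken from the actual posts file.

```python
def find_invalid_utf8(data: bytes, context: int = 10):
    """Return (start, end, nearby_bytes) for the first invalid UTF-8
    sequence in data, or None if data decodes cleanly."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        lo = max(exc.start - context, 0)
        return exc.start, exc.end, data[lo:exc.end + context]
    return None


# Hypothetical sample: 0xE9 is "é" in Latin-1, but on its own it is not
# a valid UTF-8 sequence, so decoding fails at that byte.
sample = b'<row Id="1" OwnerDisplayName="Andr\xe9s" />'
bad = find_invalid_utf8(sample)
print(bad)

# Well-formed UTF-8 input produces no report.
clean = find_invalid_utf8('<row Id="1" OwnerDisplayName="Andrés" />'.encode("utf-8"))
print(clean)
```

On the real file, the same helper could be called on the result of `open(posts_path, "rb").read()` to see which bytes are not valid UTF-8.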