I am trying to write all the hits for a tag to a csv file in Python. My string is:
<pre class="CodeRay highlight"><code data-lang="java"><span class="annotation">@CDIUI</span>(<span class="string"><span class="delimiter">"</span><span class="content">cdievents</span><span class="delimiter">"</span></span>)
<span class="annotation">@Theme</span>(<span class="string"><span class="delimiter">"</span><span class="content">valo</span><span class="delimiter">"</span></span>)
<span class="directive">public</span> <span class="type">class</span> <span class="class">CDIEventUI</span> <span class="directive">extends</span> UI {
<span class="annotation">@Inject</span>
InputPanel inputPanel;
<span class="annotation">@Inject</span>
DisplayPanel displayPanel;
<span class="annotation">@Override</span>
<span class="directive">protected</span> <span class="type">void</span> init(VaadinRequest request) {
Layout content =
<span class="keyword">new</span> HorizontalLayout(inputPanel, displayPanel);
setContent(content);
}
}</code></pre>
The Python code I use to write the hits to the csv file is:
hits = soup.find_all("pre", "CodeRay highlight")# "programlisting")
f = open('extractedsuorceTEST2.csv','ab')
writer = csv.writer(f)
writer.writerow(('page', hits[0].text.encode('UTF-8').replace('Â',' ')))
With this code, hits[0] is:
'@CDIUI("cdievents")\n@Theme("valo")\npublic class CDIEventUI extends UI {\n @Inject\n InputPanel inputPanel;\n\n @Inject\n DisplayPanel displayPanel;\n\n @Override\n protected void init(VaadinRequest request) {\n Layout content =\n new HorizontalLayout(inputPanel, displayPanel);\n setContent(content);\n }\n}'
But the result written to the csv file is:
@CDIUI(""cdievents"")
@Theme(""valo"")
public class CDIEventUI extends UI {
@Inject
InputPanel inputPanel;
@Inject
DisplayPanel displayPanel;
@Override
protected void init(VaadinRequest request) {
Layout content =
It should be:
@CDIUI("cdievents")
@Theme("valo")
public class CDIEventUI extends UI {
@Inject
InputPanel inputPanel;
@Inject
DisplayPanel displayPanel;
@Override
protected void init(VaadinRequest request) {
Layout content =
new HorizontalLayout(inputPanel, displayPanel);
setContent(content);
}
}
Can anyone suggest a solution? Thanks.
Answer 0 (score: 0)
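You have to be careful not to abandon the file or the csv writer object without closing it; otherwise buffered output may never reach the file.
Try changing your code to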
hits = soup.find_all("pre", "CodeRay highlight")# "programlisting")
# The with block closes (and flushes) the file when it exits.
with open('extractedsuorceTEST2.csv','ab') as f:
    writer = csv.writer(f)
    writer.writerow(('page', hits[0].text.encode('UTF-8').replace('Â',' ')))
If it still fails, check the field size limit (csv.field_size_limit) and increase it as needed.
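To check that nothing is actually lost, it can help to read the file back with the csv module rather than judging by the raw text. Below is a minimal sketch in Python 3 syntax (the code above is Python 2 style). The `snippet` string and the temporary file name are made up for illustration and only stand in for `hits[0].text`. The doubled quotes (`""cdievents""`) in the raw file are the standard CSV way of escaping a quote inside a quoted field, so they should disappear again when the row is parsed.

```python
import csv
import os
import tempfile

# Hypothetical sample standing in for hits[0].text from the question.
snippet = '@CDIUI("cdievents")\n@Theme("valo")\npublic class CDIEventUI extends UI {\n}'

# Hypothetical file location, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "extracted_demo.csv")

# Python 3: text mode with newline='' lets the csv module handle line endings.
with open(path, "a", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(("page", snippet))
# Leaving the with block closes the file, so all buffered rows are on disk.

# Raw file contents: quotes inside the field appear doubled.
with open(path, newline="", encoding="utf-8") as f:
    raw = f.read()

# Parsed contents: the multi-line field comes back as one value.
with open(path, newline="", encoding="utf-8") as f:
    rows = list(csv.reader(f))

# Calling field_size_limit() with no argument returns the current limit;
# passing a value sets a new one. It matters when reading very large fields.
old_limit = csv.field_size_limit()
csv.field_size_limit(old_limit * 2)
new_limit = csv.field_size_limit()
```

If `rows[0][1]` equals the original text, the file itself is fine, and a display that looks truncated is more likely caused by the program used to view the multi-line cell than by the writer.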