Python writes only part of a string to a CSV row

Asked: 2016-02-19 22:27:05

Tags: python csv beautifulsoup

I am trying to write every hit for a tag to a CSV file in Python. My string is:

<pre class="CodeRay highlight"><code data-lang="java"><span class="annotation">@CDIUI</span>(<span class="string"><span class="delimiter">"</span><span class="content">cdievents</span><span class="delimiter">"</span></span>)
<span class="annotation">@Theme</span>(<span class="string"><span class="delimiter">"</span><span class="content">valo</span><span class="delimiter">"</span></span>)
<span class="directive">public</span> <span class="type">class</span> <span class="class">CDIEventUI</span> <span class="directive">extends</span> UI {
    <span class="annotation">@Inject</span>
    InputPanel inputPanel;

    <span class="annotation">@Inject</span>
    DisplayPanel displayPanel;

    <span class="annotation">@Override</span>
    <span class="directive">protected</span> <span class="type">void</span> init(VaadinRequest request) {
        Layout content =
            <span class="keyword">new</span> HorizontalLayout(inputPanel, displayPanel);
        setContent(content);
    }
}</code></pre>

The Python code I use to write the hits to the CSV file is:

hits =  soup.find_all("pre", "CodeRay highlight")# "programlisting")
f = open('extractedsuorceTEST2.csv','ab')
writer = csv.writer(f)
writer.writerow(('page', hits[0].text.encode('UTF-8').replace('Â',' ')))

With this code, hits[0].text is:

'@CDIUI("cdievents")\n@Theme("valo")\npublic class CDIEventUI extends UI {\n    @Inject\n    InputPanel inputPanel;\n\n    @Inject\n    DisplayPanel displayPanel;\n\n    @Override\n    protected void init(VaadinRequest request) {\n        Layout content =\n            new HorizontalLayout(inputPanel, displayPanel);\n        setContent(content);\n    }\n}'

But the result written to the CSV file is:

@CDIUI(""cdievents"")
@Theme(""valo"")
public class CDIEventUI extends UI {
    @Inject
    InputPanel inputPanel;

    @Inject
    DisplayPanel displayPanel;

    @Override
    protected void init(VaadinRequest request) {
        Layout content =

It should be:

@CDIUI("cdievents")
@Theme("valo")
public class CDIEventUI extends UI {
    @Inject
    InputPanel inputPanel;

    @Inject
    DisplayPanel displayPanel;

    @Override
    protected void init(VaadinRequest request) {
        Layout content =
            new HorizontalLayout(inputPanel, displayPanel);
        setContent(content);
    }
}

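Can anyone suggest a solution? Thanks.

1 answer:

Answer 0 (score: 0)

Be careful not to leave the file open. If the file is never closed, buffered output may never be flushed to disk, which would explain a row that stops partway through.

Try changing the code to: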

hits = soup.find_all("pre", "CodeRay highlight")
# The with block closes the file on exit, which flushes any buffered rows.
with open('extractedsuorceTEST2.csv','ab') as f:
    writer = csv.writer(f)
    writer.writerow(('page', hits[0].text.encode('UTF-8').replace('Â',' ')))

If the row is still cut off, check the field size limit (csv.field_size_limit) and increase it if necessary.
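
For anyone on Python 3, here is a minimal sketch of the same idea, not the asker's actual script. The names snippet, write_row and read_rows are hypothetical, and snippet is a shortened stand-in for hits[0].text. It assumes the file is opened in text mode with newline='' as the csv module documentation recommends, and it raises the reader's field size limit only when asked to.

```python
import csv

# Hypothetical stand-in for hits[0].text from the question (shortened).
snippet = (
    '@CDIUI("cdievents")\n'
    '@Theme("valo")\n'
    'public class CDIEventUI extends UI {\n'
    '    @Inject\n'
    '    InputPanel inputPanel;\n'
    '}'
)


def write_row(path, row):
    """Append one row; the with block closes and flushes the file."""
    # Python 3: text mode, newline='' so the csv module controls line endings.
    with open(path, 'a', newline='', encoding='utf-8') as f:
        csv.writer(f).writerow(row)


def read_rows(path, min_field_limit=None):
    """Read all rows back, optionally raising the reader's field size limit."""
    if min_field_limit is not None:
        # csv.field_size_limit() with no argument returns the current limit.
        csv.field_size_limit(max(csv.field_size_limit(), min_field_limit))
    with open(path, newline='', encoding='utf-8') as f:
        return list(csv.reader(f))
```

Calling write_row('out.csv', ('page', snippet)) and then read_rows('out.csv') should give back the multi-line text unchanged. In the raw file the field is wrapped in quotes and each embedded quote is doubled (""cdievents""), which is normal CSV escaping and not corruption; a CSV reader turns it back into a single quote. The line breaks inside the field are preserved as well, so a plain text editor shows one record spread over several lines.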