Question

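Below is my code for reading a .csv file that contains about 25 lines. The output is the same for every line. What I would like to achieve is a random order for each "row". Here is the code: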

import random

f_in = open("input.csv",'r')

f_out = open('output.txt', 'w')
for line in f_in.readlines():
    f_out.write('<p>' + random.choice(list(open('content.txt'))).replace("\n", "").replace(".", "").replace("?", "").strip().capitalize() + ' <a href="' +
                line.replace("\n", "").split(",")[0]+"" + '">' + line.replace("\n", "").split(",")[1]+"" + '</a> ' + random.choice(list(open('content.txt'))).replace("\n", "").strip().lower() + '</p>' + 
                #
                '<p>' + random.choice(list(open('content.txt'))).replace("\n", "").replace(".", "").replace("?", "").strip().capitalize() + ' <a href="' +
                line.replace("\n", "").split(",")[2]+"" + '">' + line.replace("\n", "").split(",")[3]+"" + '</a> ' + random.choice(list(open('content.txt'))).replace("\n", "").strip().lower() + '</p>' + 
                #
                '<p>' + random.choice(list(open('content.txt'))).replace("\n", "").replace(".", "").replace("?", "").strip().capitalize() + ' <a href="' +
                line.replace("\n", "").split(",")[4]+"" + '">' + line.replace("\n", "").split(",")[5]+"" + '</a> ' + random.choice(list(open('content.txt'))).replace("\n", "").strip().lower() + '</p>' +
                #
                '\n')    
f_in.close()
f_out.close()

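This outputs text a link texttext a link texttext a link text, which is fine and is what I want, but I need line 2 to come out in a different order, and line 3, and so on.

For example, the first output, read from line 1, would be columns AB CD EF; what I would like is for line 2 to output columns EF AB CD. So the output needs to be reordered for every line in the .csv file, rather than being AB CD EF for each of the 25 lines.

I am not very advanced in Python, and my code could probably be written differently; this is just the best way I know to achieve this. Could someone help me get working code that produces this kind of output? Thank you.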

Input data from the CSV file:

Line 1 --> Column A http://domain.com Column B my anchor text 1 Column C http://domain.com Column D my anchor text 2 Column E http://domain.com Column F my anchor text 3
Line 2 --> Column A http://domain.com Column B my anchor text 1 Column C http://domain.com Column D my anchor text 2 Column E http://domain.com Column F my anchor text 3
Line 3 --> Column A http://domain.com Column B my anchor text 1 Column C http://domain.com Column D my anchor text 2 Column E http://domain.com Column F my anchor text 3

CSV data

http://domain.com,anchor text 1,http://domain2.com,anchor text 2,http://domain3.com,anchor text 3
http://domain.com,anchor text 1,http://domain2.com,anchor text 2,http://domain3.com,anchor text 3
http://domain.com,anchor text 1,http://domain2.com,anchor text 2,http://domain3.com,anchor text 3

Expected output by line

Line 1 --> Column A and B Column E and F Column C and D
Line 2 --> Column E and F Column A and B Column C and D
Line 3 --> Column C and D Column E and F Column A and B

Answer 1

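I think one way to do what you are asking is to group each row's domain/text pairs into tuples and then shuffle that list.

Here is some code that reads from the CSV file, shuffles the domain/text pairs of each row, and writes both a text file and a CSV file with the shuffled rows: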

import random
import csv

# newline="" lets the csv module handle line endings itself
with open("input.csv", newline="") as infile, \
        open("output.csv", "w", newline="") as outcsv, \
        open("output.txt", "w") as outtxt:
    csvreader = csv.reader(infile)
    csvwriter = csv.writer(outcsv)
    for row in csvreader:
        # Group the row into (url, anchor text) pairs: (A, B), (C, D), (E, F)
        random_pairs = [(row[2*i], row[2*i + 1]) for i in range(len(row) // 2)]
        random.shuffle(random_pairs)  # a fresh random order for every row
        outline = []
        for pair in random_pairs:
            outtxt.write('<a href="' + pair[0] + '">' + pair[1] + '</a>')
            outline.extend(pair)
        outtxt.write('\n')
        csvwriter.writerow(outline)

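Using the CSV data you provided produces output like the following (the order is random, so it varies from run to run):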

output.txt:

<a href="http://domain3.com">anchor text 3</a><a href="http://domain2.com">anchor text 2</a><a href="http://domain.com">anchor text 1</a>
<a href="http://domain3.com">anchor text 3</a><a href="http://domain.com">anchor text 1</a><a href="http://domain2.com">anchor text 2</a>
<a href="http://domain2.com">anchor text 2</a><a href="http://domain3.com">anchor text 3</a><a href="http://domain.com">anchor text 1</a>

output.csv:

http://domain3.com,anchor text 3,http://domain2.com,anchor text 2,http://domain.com,anchor text 1
http://domain3.com,anchor text 3,http://domain.com,anchor text 1,http://domain2.com,anchor text 2
http://domain2.com,anchor text 2,http://domain3.com,anchor text 3,http://domain.com,anchor text 1
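
The grouping-and-shuffling idea can also be tried without touching any files. The following is a minimal sketch, not part of the original answer: the helper names `shuffle_pairs` and `render_row` and the `rng` parameter are made up for illustration, and the sample rows are held in an in-memory string instead of input.csv.

```python
import csv
import io
import random


def shuffle_pairs(row, rng=random):
    """Return the row's (url, text) pairs in a random order."""
    # Even positions hold URLs, odd positions hold anchor texts.
    pairs = list(zip(row[0::2], row[1::2]))
    # sample() with k=len(pairs) returns a new shuffled list
    # and leaves `pairs` itself untouched.
    return rng.sample(pairs, k=len(pairs))


def render_row(row, rng=random):
    """Render one CSV row as a line of <a> tags in shuffled order."""
    return "".join(
        '<a href="{0}">{1}</a>'.format(url, text)
        for url, text in shuffle_pairs(row, rng)
    )


# In-memory stand-in for input.csv, using the sample data from the question.
LINE = ("http://domain.com,anchor text 1,"
        "http://domain2.com,anchor text 2,"
        "http://domain3.com,anchor text 3\n")
SAMPLE = LINE * 3

for row in csv.reader(io.StringIO(SAMPLE)):
    # Each call shuffles independently, so rows can come out in different orders.
    print(render_row(row))
```

Passing a seeded `random.Random(...)` instance as `rng` makes a run repeatable, which may help when checking the output by hand. With only three pairs there are just six possible orders, so two rows will sometimes land on the same order by chance.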

Answer 2

I have tried to break the code down into functions; it should be easier to understand and maintain.

import csv
import random

LOREM_IPSUM = "content.txt"
LINK_TEXT   = "input.csv"
OUTPUT      = "output.phtml"

def csv_rows(fname, **kwargs):
    # newline="" lets the csv module handle line endings itself
    with open(fname, newline="") as inf:
        incsv = csv.reader(inf, **kwargs)
        for row in incsv:
            yield row

def by_twos(iterable):
    # given (a, b, c, d, ...) returns ((a,b), (c,d), ...)
    # zip pulls two items at a time from the same iterator
    return zip(*([iter(iterable)]*2))

def a(href, *content):
    return "<a href=\"{0}\">{1}</a>".format(href, " ".join(content))

def p(*content):
    return "<p>{0}</p>".format(" ".join(content))

def br():
    return "<br/>"

def main():
    with open(LOREM_IPSUM) as inf:
        lines = (line.strip() for line in inf)
        content = [line.capitalize() for line in lines if line]
    randtxt = lambda: random.choice(content)

    with open(OUTPUT, "w") as outf:
        for row in csv_rows(LINK_TEXT):
            links = [a(href, text) for href,text in by_twos(row)]
            random.shuffle(links)    # randomize order
            paras = (p(randtxt(), link, randtxt()) for link in links)
            outf.write("".join(paras))
            outf.write(br())

if __name__=="__main__":
    main()
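
The pairing idiom inside by_twos is compact enough to be puzzling, so here is a small stand-alone sketch of how it behaves. It is written for Python 3 and spells the same idea as zip(it, it); the sample row is taken from the question, and the odd-length list is an extra case added here for illustration.

```python
def by_twos(iterable):
    """Given (a, b, c, d, ...) yield (a, b), (c, d), ..."""
    it = iter(iterable)
    # Both arguments are the SAME iterator, so each tuple that zip builds
    # consumes two consecutive items from it.
    return zip(it, it)


row = ["http://domain.com", "anchor text 1",
       "http://domain2.com", "anchor text 2",
       "http://domain3.com", "anchor text 3"]

for href, text in by_twos(row):
    print(href, "->", text)

# A trailing item without a partner is silently dropped.
print(list(by_twos([1, 2, 3, 4, 5])))  # [(1, 2), (3, 4)]
```

Because an unpaired trailing value is discarded without any error, a CSV row with an odd number of fields would lose its last field; checking len(row) % 2 before pairing is one way to catch that.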

Python: output in random order from readlines()
