Merging two scripts into one new script (appending columns to a csv)

Asked: 2013-10-29 22:03:05

Tags: python python-2.7 csv

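I have two scripts that each create a new column in a csv: each script opens the csv and appends one new column. Ideally, rather than saving the csv as csv1, then opening csv1 and re-saving it as csv2, I would like to do it all in one step.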

SCRIPT1

with open("inputcsv1.csv", "r") as input_file:
    header = input_file.readline()[:-1] #this is to remove trailing '\n'
    header += ",Table exists?"
    output_lines = [header]

    for line in input_file:
         output_lines.append(line[:-1])
         if 'table' in line.split(",")[3]:
             output_lines[-1]+=",table exists"
         else:
             output_lines[-1]+=",No table found"

with open("outputcsv1.csv", "w") as output_file:
    output_file.write("\n".join(output_lines))   

SCRIPT2

with open("outputcsv1.csv", "r") as input_file:
    header = input_file.readline()[:-1] #this is to remove trailing '\n'
    header += ",Are you sure Table exists?"
    output_lines = [header]

    for line in input_file:
         output_lines.append(line[:-1])
         if 'table' in line.split(",")[3]:
             output_lines[-1]+=",table definitely exists"
         else:
             output_lines[-1]+=",No table was not found"

with open("outputcsv2.csv", "w") as output_file:
   output_file.write("\n".join(output_lines))   

The two scripts above are the ones I run on a very simple example csv.

Example inputcsv1.csv

title1,title2,title3,Table or no table?,title4
data,text,data,the cat sits on the table,text,data
data,text,data,tables are made of wood,text,data
data,text,data,the cat sits on the television,text,data
data,text,data,the dog chewed the table leg,text,data
data,text,data,random string of words,text,data
data,text,data,table seats 25 people,text,data
data,text,data,I have no idea why I made this example about tables,text,data
data,text,data,,text,data

Desired output csv:

title1,title2,title3,Table or no table?,title4,Table exists?,Are you sure Table exists?
data,text,data,the cat sits on the table,text,data,table exists,table definitely exists
data,text,data,tables are made of wood,text,data,table exists,table definitely exists
data,text,data,the cat sits on the television,text,data,No table found,No table was not found
data,text,data,the dog chewed the table leg,text,data,table exists,table definitely exists
data,text,data,random string of words,text,data,No table found,No table was not found
data,text,data,table seats 25 people,text,data,table exists,table definitely exists
data,text,data,I have no idea why I made this example about tables,text,data,table exists,table definitely exists
data,text,data,,text,data,No table found,No table was not found

In trying to merge the two scripts, I tried the following code:

with open("inputcsv1.csv", "r") as input_file:
    header = input_file.readline()[:-1] #this is to remove trailing '\n'
    header2 = input_file.readline()[:-2] #this is to remove trailing '\n'
    header += ",Table exists?"
    header2 += ",Are you sure table exists?"
    output_lines = [header]
    output_lines2 = [header2]

    for line in input_file:
        output_lines.append(line[:-1])
        if 'table' in line.split(",")[3]:
            output_lines[-1]+=",table exists"
        else:
            output_lines[-1]+=",No table found"

    for line in input_file:
        output_lines.append(line[:-2])
        if 'table' in line.split(",")[3]:
            output_lines2[-2]+=",table definitely exists"
        else:
            output_lines2[-2]+=",No table was not found"

with open("TestMurgedOutput.csv", "w") as output_file:
    output_file.write("\n".join(output_lines).join(output_lines2))

It doesn't raise an error, but it only outputs the following in the new csv.

data,text,data,the cat sits on the table,text,dat,Are you sure table exists?

I'm not sure why, although I'm not confident in my use of .join. Any constructive comments would be much appreciated.

2 answers:

Answer 0 (score: 3)

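I think this is close to what you're looking for; it's what I meant by putting the if statements from both scripts inside a single for loop. It could be optimized, but I tried to keep it simple so that you can easily see what is being done.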

with open("inputcsv1.csv", "rt") as input_file:
    header = input_file.readline()[:-1]  # remove trailing newline
    # add a title to the header for each of the two new columns
    header += ",Table exists?,Are you sure table exists?"
    output_lines = [header]

    for line in input_file:
        line = line[:-1]  # remove trailing newline
        cols = line.split(',')  # split line in columns based on delimiter
        # add first column
        if 'table' in cols[3]:
            line += ",table exists"
        else:
            line += ",No table found"
        # add second column
        if 'table' in cols[3]:
            line += ",table definitely exists"
        else:
            line += ",No table was not found"
        output_lines.append(line)

with open("TestMurgedOutput.csv", "wt") as output_file:
    output_file.write("\n".join(output_lines))

Contents of the TestMurgedOutput.csv file that is created:

title1,title2,title3,Table or no table?,title4,Table exists?,Are you sure table exists?
data,text,data,the cat sits on the table,text,data,table exists,table definitely exists
data,text,data,tables are made of wood,text,data,table exists,table definitely exists
data,text,data,the cat sits on the television,text,data,No table found,No table was not found
data,text,data,the dog chewed the table leg,text,data,table exists,table definitely exists
data,text,data,random string of words,text,data,No table found,No table was not found
data,text,data,table seats 25 people,text,data,table exists,table definitely exists
data,text,data,I have no idea why I made this example about tables,text,data,table exists,table definitely exists
data,text,data,,text,data,No table found,No table was not found
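One possible way to tidy up the single-loop approach is to let the standard-library csv module do the splitting and joining. The following is only a minimal sketch, not a drop-in replacement: the function name add_table_columns and its text_col parameter are hypothetical names introduced here, and the sample data is held in memory so the block runs on its own. It is written for Python 3, which is an assumption, since the question is tagged python-2.7.

```python
import csv
import io


def add_table_columns(rows, text_col=3):
    """Yield each row with the two new columns appended.

    `rows` is any iterable of lists (for example a csv.reader);
    the first row is treated as the header.
    """
    rows = iter(rows)
    header = next(rows, None)
    if header is None:  # empty input: nothing to emit
        return
    yield header + ["Table exists?", "Are you sure table exists?"]
    for row in rows:
        # guard against short rows before indexing the text column
        if len(row) > text_col and "table" in row[text_col]:
            yield row + ["table exists", "table definitely exists"]
        else:
            yield row + ["No table found", "No table was not found"]


# A few lines of the example csv, kept in memory for illustration.
sample = (
    "title1,title2,title3,Table or no table?,title4\n"
    "data,text,data,the cat sits on the table,text,data\n"
    "data,text,data,the cat sits on the television,text,data\n"
    "data,text,data,,text,data\n"
)
result = list(add_table_columns(csv.reader(io.StringIO(sample))))
```

To use this on real files in Python 3, open both files with newline="" and pass csv.reader(input_file) to the function, then hand the result to csv.writer(output_file).writerows(...). Under Python 2.7 the csv module expects the files to be opened in binary mode ("rb" and "wb") instead.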

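Answer 1 (score: 0)

Your output_lines2 list contains only one element, because every line in the file has already been read by the first for loop, so the second loop never runs. The join therefore has no effect on it, and the write statement outputs that single element of output_lines2. Try this: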

with open("test.csv", "r") as input_file:
    header = input_file.readline()[:-1]  # this is to remove trailing '\n'
    header += ",Table exists?"
    header += ",Are you sure Table exists?"
    output_lines = [header]
    for line in input_file:
        output_lines.append(line[:-1])
        if 'table' in line.split(",")[3]:
            output_lines[-1] += ",table exists"
        else:
            output_lines[-1] += ",No table found"
        if 'table' in line.split(",")[3]:
            output_lines[-1] += ",table definitely exists"
        else:
            output_lines[-1] += ",No table was not found"
with open("output.csv", "w") as output_file:
    output_file.write("\n".join(output_lines))
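To see why the attempt in the question writes only a single line, it may help to reproduce the two effects in isolation. The sketch below uses an in-memory io.StringIO object as a stand-in for the opened csv (an assumption made so the example runs without any files). It shows that a file object is exhausted after one full pass, and that str.join treats the string it is called on as the separator, so joining a one-element list simply returns that element.

```python
import io

# Stand-in for the opened csv file (illustrative data only).
input_file = io.StringIO("header\nrow1\nrow2\n")

header = input_file.readline()

# The first loop consumes every remaining line of the file...
first_pass = [line for line in input_file]

# ...so a second loop over the same file object has nothing left to read.
second_pass = [line for line in input_file]

# str.join puts the string it is called on *between* the items of the list.
# With a single item there is nothing to separate, so the item comes back
# unchanged and the separator never appears in the result.
joined = "a\nb\nc".join(["only element"])
```

In the question's code, "\n".join(output_lines) plays the role of the separator and output_lines2 is the one-element list, which is why only the modified second line of the file ends up in the output.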