如何将新列添加到CSV文件?

时间:2012-06-17 10:10:50

标签: python csv python-3.x

我有几个CSV个文件,如下所示:

Input
Name        Code
blackberry  1
wineberry   2
rasberry    1
blueberry   1
mulberry    2

我想为所有CSV文件添加一个新列,以便它看起来像这样:

Output
Name        Code    Berry
blackberry  1   blackberry
wineberry   2   wineberry
rasberry    1   rasberry
blueberry   1   blueberry
mulberry    2   mulberry

我到目前为止的脚本是这样的:

import csv
with open(input.csv,'r') as csvinput:
    with open(output.csv, 'w') as csvoutput:
        writer = csv.writer(csvoutput)
        for row in csv.reader(csvinput):
            writer.writerow(row+['Berry'])

(Python 3.2)

但是在输出中,脚本会跳过每一行,而新列中只包含Berry:

Output
Name        Code    Berry
blackberry  1   Berry

wineberry   2   Berry

rasberry    1   Berry

blueberry   1   Berry

mulberry    2   Berry

10 个答案:

答案 0 :(得分:66)

这可以让你知道该怎么做:

>>> v = open('C:/test/test.csv')
>>> r = csv.reader(v)
>>> row0 = r.next()
>>> row0.append('berry')
>>> print row0
['Name', 'Code', 'berry']
>>> for item in r:
...     item.append(item[0])
...     print item
...     
['blackberry', '1', 'blackberry']
['wineberry', '2', 'wineberry']
['rasberry', '1', 'rasberry']
['blueberry', '1', 'blueberry']
['mulberry', '2', 'mulberry']
>>> 

编辑,请注意py3k中必须使用next(r)

感谢您接受答案。在这里你有一个奖金(你的工作脚本):

import csv

with open('C:/test/test.csv','r') as csvinput:
    with open('C:/test/output.csv', 'w') as csvoutput:
        writer = csv.writer(csvoutput, lineterminator='\n')
        reader = csv.reader(csvinput)

        all = []
        row = next(reader)
        row.append('Berry')
        all.append(row)

        for row in reader:
            row.append(row[0])
            all.append(row)

        writer.writerows(all)

请注意

  1. lineterminator中的csv.writer参数。默认情况下是 设置为'\r\n',这就是你有双倍间距的原因。
  2. 使用列表追加所有行并将其写入 用writerows一次拍摄。如果你的文件非常非常大 可能不是一个好主意(RAM),但对于普通文件我认为是 更快,因为I / O更少。
  3. 如本文评论中所示,请注意,而不是 嵌套两个with语句,您可以在同一行中执行:

    open('C:/test/test.csv','r')为csvinput,打开('C:/test/output.csv','w')为csvoutput:

答案 1 :(得分:42)

我很惊讶没有人建议熊猫。尽管使用像Pandas这样的一组依赖关系似乎比这么简单的任务所需要的更加严厉,但它产生了一个非常短的脚本,而Pandas是一个很棒的库,用于执行各种CSV(实际上是所有数据类型)数据操作。无法与4行代码争论:

import pandas as pd
csv_input = pd.read_csv('input.csv')
csv_input['Berries'] = csv_input['Name']
csv_input.to_csv('output.csv', index=False)

查看Pandas Website了解详情!

output.csv的内容:

Name,Code,Berries
blackberry,1,blackberry
wineberry,2,wineberry
rasberry,1,rasberry
blueberry,1,blueberry
mulberry,2,mulberry

答案 2 :(得分:7)

import csv
with open('input.csv','r') as csvinput:
    with open('output.csv', 'w') as csvoutput:
        writer = csv.writer(csvoutput)

        for row in csv.reader(csvinput):
            if row[0] == "Name":
                writer.writerow(row+["Berry"])
            else:
                writer.writerow(row+[row[0]])

也许这样的事情是你想要的?

此外,csv代表逗号分隔值。所以,你需要用逗号分隔你的价值观,我认为:

Name,Code
blackberry,1
wineberry,2
rasberry,1
blueberry,1
mulberry,2

答案 3 :(得分:3)

我用过熊猫,效果很好...... 当我使用它时,我不得不打开一个文件并为其添加一些随机列,然后只保存回同一个文件。

此代码添加了多个列条目,您可以根据需要进行编辑。

import pandas as pd

csv_input = pd.read_csv('testcase.csv')         #reading my csv file
csv_input['Phone1'] = csv_input['Name']         #this would also copy the cell value 
csv_input['Phone2'] = csv_input['Name']
csv_input['Phone3'] = csv_input['Name']
csv_input['Phone4'] = csv_input['Name']
csv_input['Phone5'] = csv_input['Name']
csv_input['Country'] = csv_input['Name']
csv_input['Website'] = csv_input['Name']
csv_input.to_csv('testcase.csv', index=False)   #this writes back to your file

如果您希望该单元格值无法复制,那么首先要手动在csv文件中创建一个空列,就像您将其命名为 Hours 那么,现在你可以在上面的代码中添加这一行,

csv_input['New Value'] = csv_input['Hours']

或者我们可以在不添加手册列的情况下完成

csv_input['New Value'] = ''    #simple and easy

我希望它有所帮助。

答案 4 :(得分:1)

我没有看到你在哪里添加新列,但试试这个:

    import csv
    i = 0
    Berry = open("newcolumn.csv","r").readlines()
    with open(input.csv,'r') as csvinput:
        with open(output.csv, 'w') as csvoutput:
            writer = csv.writer(csvoutput)
            for row in csv.reader(csvinput):
                writer.writerow(row+","+Berry[i])
                i++

答案 5 :(得分:1)

此代码足以满足您的要求,我已对示例代码进行了测试。

import csv

with open(in_path, 'r') as f_in, open(out_path, 'w') as f_out:
    csv_reader = csv.reader(f_in, delimiter=';')
    writer = csv.writer(f_out)

    for row in csv_reader:
    writer.writerow(row + [row[0]]

答案 6 :(得分:1)

是的,这是一个古老的问题,但可能会有所帮助

import csv
import uuid

# read and write csv files
with open('in_file','r') as r_csvfile:
    with open('out_file','w',newline='') as w_csvfile:

        dict_reader = csv.DictReader(r_csvfile,delimiter='|')
        #add new column with existing
        fieldnames = dict_reader.fieldnames + ['ADDITIONAL_COLUMN']
        writer_csv = csv.DictWriter(w_csvfile,fieldnames,delimiter='|')
        writer_csv.writeheader()


        for row in dict_reader:
            row['ADDITIONAL_COLUMN'] = str(uuid.uuid4().int >> 64) [0:6]
            writer_csv.writerow(row)

答案 7 :(得分:1)

使用不带标题名称的python在现有的csv文件中添加新列

  default_text = 'Some Text'
# Open the input_file in read mode and output_file in write mode
    with open('problem-one-answer.csv', 'r') as read_obj, \
    open('output_1.csv', 'w', newline='') as write_obj:
# Create a csv.reader object from the input file object
    csv_reader = reader(read_obj)
# Create a csv.writer object from the output file object
    csv_writer = csv.writer(write_obj)
# Read each row of the input csv file as list
    for row in csv_reader:
# Append the default text in the row / list
        row.append(default_text)
# Add the updated row / list to the output file
        csv_writer.writerow(row)

谢谢

答案 8 :(得分:1)

如果文件很大,则可以将pandas.read_csvchunksize参数一起使用,该参数允许按块读取数据集:

import pandas as pd

INPUT_CSV = "input.csv"
OUTPUT_CSV = "output.csv"
CHUNKSIZE = 1_000 # Maximum number of rows in memory

header = True
mode = "w"
for chunk_df in pd.read_csv(INPUT_CSV, chunksize=CHUNKSIZE):
    chunk_df["Berry"] = chunk_df["Name"]
    # You apply any other transformation to the chunk
    # ...
    chunk_df.to_csv(OUTPUT_CSV, header=header, mode=mode)
    header = False # Do not save the header for the other chunks
    mode = "a" # 'a' stands for append mode, all the other chunks will be appended

如果要就地更新文件,可以使用一个临时文件并在最后将其删除

import pandas as pd

INPUT_CSV = "input.csv"
TMP_CSV = "tmp.csv"
CHUNKSIZE = 1_000 # Maximum number of rows in memory

header = True
mode = "w"
for chunk_df in pd.read_csv(INPUT_CSV, chunksize=CHUNKSIZE):
    chunk_df["Berry"] = chunk_df["Name"]
    # You apply any other transformation to the chunk
    # ...
    chunk_df.to_csv(TMP_CSV, header=header, mode=mode)
    header = False # Do not save the header for the other chunks
    mode = "a" # 'a' stands for append mode, all the other chunks will be appended

os.replace(TMP_CSV, INPUT_CSV)

答案 9 :(得分:0)

要将新列添加到现有CSV文件(带有标题)中,如果要添加的列具有足够少的值,则此功能很方便(有点类似于@joaquin的解决方案)。该功能采用

  1. 现有CSV文件名
  2. 输出CSV文件名(将具有更新的内容)和
  3. 带有标题名称和列值的列表
def add_col_to_csv(csvfile,fileout,new_list):
    with open(csvfile, 'r') as read_f, \
        open(fileout, 'w', newline='') as write_f:
        csv_reader = csv.reader(read_f)
        csv_writer = csv.writer(write_f)
        i = 0
        for row in csv_reader:
            row.append(new_list[i])
            csv_writer.writerow(row)
            i += 1 

示例:

new_list1 = ['test_hdr',4,4,5,5,9,9,9]
add_col_to_csv('exists.csv','new-output.csv',new_list1)

现有CSV文件: enter image description here

输出(已更新)CSV文件: enter image description here