Merging two spreadsheets with Python: columns in the new sheet alternate between the source files

Time: 2015-01-15 17:26:11

Tags: python python-2.7 csv

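I'd like to write Python code that merges two spreadsheets in .csv format so that the first column of the new sheet comes from either source sheet, and all the other new columns alternate between the two source sheets.

Here is an example (shown in spreadsheet form):

Source 1: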

    (A) name 1  (A) name 2  (A) name 3  (A) name 4
class 1             
class 2             
class 3             
class 4     

Source 2:

    (B) name 1  (B) name 2  (B) name 3  (B) name 4
class 1             
class 2             
class 3             
class 4     

Desired result:

    (A) name 1  (B) name 1  (A) name 2  (B) name 2  (A) name 3  (B) name 3  (A) name 4  (B) name 4
class 1                             
class 2                             
class 3                             
class 4                             

Edit:

As requested, here is a sample of my data (shown in .csv format).

Table 1:

,(F) Abies amabilis,(F) Abies balsamea,(F) Abies bifolia,(F) Abies concolor,(F) Abies fraseri,(F) Abies grandis,(F) Abies lasiocarpa,(F) Abies magnifica,(F) Abies procera,(F) Larix decidua,(F) Larix laricina,(F) Picea abies,(F) Picea engelmannii,(F) Picea glauca,(F) Picea mariana,(F) Picea pungens,(F) Picea sitchensis,(F) Pinus albicaulis,(F) Pinus aristata,(F) Pinus attenuata,(F) Pinus banksiana,(F) Pinus cembroides,(F) Pinus clausa,(F) Pinus contorta,(F) Pinus coulteri,(F) Pinus echinata,(F) Pinus edulis,(F) Pinus elliottii,(F) Pinus engelmannii,(F) Pinus flexilis,(F) Pinus halepensis,(F) Pinus jeffreyi,(F) Pinus lambertiana,(F) Pinus leiophylla,(F) Pinus longaeva,(F) Pinus monophylla,(F) Pinus monticola,(F) Pinus mugo,(F) Pinus muricata,(F) Pinus palustris,(F) Pinus ponderosa,(F) Pinus pumila,(F) Pinus pungens,(F) Pinus quadrifolia,(F) Pinus radiata,(F) Pinus resinosa,(F) Pinus rigida,(F) Pinus serotina,(F) Pinus strobiformis,(F) Pinus strobus,(F) Pinus sylvestris,(F) Pinus taeda,(F) Pinus thunbergii,(F) Pinus torreyana,(F) Pinus virginiana,(F) Pseudotsuga macrocarpa,(F) Pseudotsuga menziesii,(F) Tsuga canadensis,(F) Tsuga heterophylla,(F) Tsuga mertensiana
48,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
52,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
58,0,0,0,1,0,0,1,0,0,1,0,1,0,1,1,1,0,1,1,0,0,0,0,1,0,0,1,0,0,1,0,0,0,0,1,1,0,1,0,0,1,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,1,0,0,0

Table 2:

,(M) Abies amabilis,(M) Abies balsamea,(M) Abies bifolia,(M) Abies concolor,(M) Abies fraseri,(M) Abies grandis,(M) Abies lasiocarpa,(M) Abies magnifica,(M) Abies procera,(M) Larix decidua,(M) Larix laricina,(M) Picea engelmannii,(M) Picea glauca,(M) Picea mariana,(M) Picea pungens,(M) Picea sitchensis,(M) Pinus albicaulis,(M) Pinus aristata,(M) Pinus attenuata,(M) Pinus banksiana,(M) Pinus cembroides,(M) Pinus clausa,(M) Pinus contorta,(M) Pinus coulteri,(M) Pinus echinata,(M) Pinus edulis,(M) Pinus elliottii,(M) Pinus engelmannii,(M) Pinus flexilis,(M) Pinus halepensis,(M) Pinus jeffreyi,(M) Pinus lambertiana,(M) Pinus leiophylla,(M) Pinus longaeva,(M) Pinus monophylla,(M) Pinus monticola,(M) Pinus muricata,(M) Pinus palustris,(M) Pinus ponderosa,(M) Pinus pumila,(M) Pinus pungens,(M) Pinus quadrifolia,(M) Pinus radiata,(M) Pinus resinosa,(M) Pinus rigida,(M) Pinus serotina,(M) Pinus strobiformis,(M) Pinus strobus,(M) Pinus sylvestris,(M) Pinus thunbergii,(M) Pinus torreyana,(M) Pinus virginiana,(M) Tsuga canadensis,(M) Tsuga heterophylla,(M) Tsuga mertensiana
48,0,0,1,1,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,1,0,0,1,0,0,1,0,0,0,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
52,0,0,1,1,0,0,0,0,0,0,0,1,0,0,0,0,1,0,1,0,0,0,1,0,0,1,0,0,1,0,1,0,0,1,1,1,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1
58,0,0,1,0,0,1,1,0,0,0,0,1,0,0,1,0,1,0,0,0,0,0,1,0,0,1,0,0,1,0,0,0,0,1,1,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

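I'm a very new coder, so what I've tried is hardly worth mentioning. My first assumption was that I could join the sheets with zip, the way it works for lists. I also thought I could do something like this (pseudocode):

for line in "Source 1.csv" and row in "Source 2.csv":
        #then split the lines into lists and write to an outfile using list indices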

Thanks in advance for your help!

1 answer:

Answer 0 (score: 1)

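I think you're on the right track with zip(), but it's a little tricky because it returns a sequence of pairs, one value from each source file, rather than a flat row. The code below gets around that by flattening the nested sequence. You can also use zip() (or itertools.izip()) to iterate over the rows of the two csv files in parallel.

Note that I generally try to use the csv module when working with files in that format, since it usually saves a lot of time and trouble and is fairly easy to use.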
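To see what the pairing and flattening steps do on their own, here is a minimal sketch on two in-memory rows. The helper name `interleave_row` and the values `a1`, `b1`, and so on are made up for illustration; the sketch assumes the first item of each row is the shared row label, which is kept only once.

```python
from itertools import chain

def interleave_row(row1, row2):
    """Return the label from row1, then a1, b1, a2, b2, ... (hypothetical helper)."""
    label = row1[:1]                    # first column, taken from row1 only
    paired = zip(row1[1:], row2[1:])    # (a1, b1), (a2, b2), ...
    return label + list(chain.from_iterable(paired))  # flatten the pairs

# Made-up rows standing in for one line from each source file.
row_a = ["class 1", "a1", "a2", "a3"]
row_b = ["class 1", "b1", "b2", "b3"]

print(interleave_row(row_a, row_b))
# ['class 1', 'a1', 'b1', 'a2', 'b2', 'a3', 'b3']
```

zip() produces the pairs and chain.from_iterable() turns them back into one flat sequence, which is the shape csv.writer expects for a row.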

import csv
import itertools

# Python 2.7: the csv module expects files opened in binary mode.
with open("Source 1.csv", "rb") as source1, \
     open("Source 2.csv", "rb") as source2, \
     open("merged_output.csv", "wb") as merged_output:

    source1_reader = csv.reader(source1, delimiter=',')
    source2_reader = csv.reader(source2, delimiter=',')
    merged_output_writer = csv.writer(merged_output, delimiter=',')

    # Read both files in parallel, one row from each at a time.
    for row1, row2 in itertools.izip(source1_reader, source2_reader):
        # Keep the row label once, then alternate the remaining columns.
        pairs = itertools.izip(row1[1:], row2[1:])
        merged_output_writer.writerow(
            row1[:1] + list(itertools.chain.from_iterable(pairs)))
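For readers on Python 3, where itertools.izip() no longer exists and the built-in zip() is already lazy, the same idea might look like the sketch below. The function name `merge_alternating` is hypothetical, and the demonstration uses made-up in-memory data rather than the real files. It keeps the first column (the row label) once instead of interleaving it, which is an assumption based on the layout requested in the question.

```python
import csv
import io
from itertools import chain

def merge_alternating(source1, source2, merged_output):
    """Write rows whose data columns alternate between the two sources.

    All three arguments are open text-mode file objects (hypothetical helper).
    """
    reader1 = csv.reader(source1)
    reader2 = csv.reader(source2)
    writer = csv.writer(merged_output)
    # zip() stops at the shorter input, for rows and for columns alike.
    for row1, row2 in zip(reader1, reader2):
        paired = zip(row1[1:], row2[1:])
        writer.writerow(row1[:1] + list(chain.from_iterable(paired)))

# Small in-memory demonstration with made-up data.
src1 = io.StringIO(",(A) name 1,(A) name 2\nclass 1,0,1\n")
src2 = io.StringIO(",(B) name 1,(B) name 2\nclass 1,1,0\n")
out = io.StringIO()
merge_alternating(src1, src2, out)
print(out.getvalue())
```

With real files, the csv documentation recommends opening them in text mode with newline="", for example open("Source 1.csv", newline=""). Because zip() pairs columns purely by position and stops at the shorter row, this only gives the intended result when both files list the same columns in the same order. The two sample tables in the question do not appear to (Table 2 has no Picea abies column, for instance), so that data would probably need its columns matched by name first.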