Combining Excel spreadsheets using a dictionary

Time: 2017-04-14 12:08:35

Tags: python excel dictionary

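I've put together a simple example in which I'm trying to merge two spreadsheets. The goal is to create one spreadsheet with 'Name of City', 'State' and 'Population' as its three columns. I think the way to do this is with dictionaries.

I had a go at it myself, and this is what I have so far.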

code data
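
For illustration only, the dictionary idea described above could be sketched in plain Python as follows. This is not the asker's code: the function name `merge_by_city` is hypothetical, and the two `*_rows` lists are made-up stand-ins for rows that have already been read from the two spreadsheets.

```python
def merge_by_city(state_rows, population_rows):
    """Join two lists of (city, value) pairs on the city name."""
    # Build one lookup dictionary per sheet, keyed by city name.
    states = dict(state_rows)
    populations = dict(population_rows)

    merged = []
    for city, state in states.items():
        # .get() returns None when a city is missing from the second sheet.
        merged.append((city, state, populations.get(city)))
    return merged


# Made-up sample rows standing in for the two spreadsheets.
state_rows = [("Sydney", "NSW"), ("Melbourne", "VIC")]
population_rows = [("Melbourne", 200000), ("Sydney", 1000000)]

for row in merge_by_city(state_rows, population_rows):
    print(row)
```

Because the lookup is by key, the two sheets do not need to list the cities in the same order.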

2 answers:

Answer 0: (score: 3)

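Do you know the pandas package?

You can read the data from the Excel files into a pandas DataFrame with pandas.read_excel, then merge the two DataFrames on the Name of City column.

Here is a short example showing how easy it is to merge two DataFrames with pandas: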

In [1]: import pandas as pd
In [3]: df1 = pd.DataFrame({'Name of City': ['Sydney', 'Melbourne'],
   ...:                     'State': ['NSW', 'VIC']})    
In [4]: df2 = pd.DataFrame({'Name of City': ['Sydney', 'Melbourne'],
   ...:                     'Population': [1000000, 200000]})
In [5]: result = pd.merge(df1, df2, on='Name of City')
In [6]: result
Out[6]:
  Name of City State  Population
0       Sydney   NSW     1000000
1    Melbourne   VIC      200000
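
Applied to actual files, the steps described in this answer might look like the sketch below. The function name `merge_city_sheets` and the file paths passed to it are placeholders made up for illustration; the column name `Name of City` comes from the question. Passing `how="outer"` is an optional choice that keeps cities present in only one sheet, whereas the default inner join drops them.

```python
import pandas as pd


def merge_city_sheets(states_path, population_path, output_path):
    """Merge two Excel sheets on the 'Name of City' column."""
    # read_excel loads the first sheet of each workbook by default.
    states = pd.read_excel(states_path)
    population = pd.read_excel(population_path)

    # how="outer" keeps cities that appear in only one of the two sheets.
    merged = pd.merge(states, population, on="Name of City", how="outer")

    # index=False leaves the DataFrame's row index out of the output file.
    merged.to_excel(output_path, index=False)
    return merged
```

A call such as `merge_city_sheets("states.xlsx", "population.xlsx", "merged.xlsx")` (placeholder names) would write the combined table. Reading and writing `.xlsx` files requires an Excel engine such as openpyxl to be installed.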

Answer 1: (score: 0)

Maybe this?

import os
import xlrd  # note: xlrd 2.0+ reads only .xls; opening .xlsx needs xlrd < 2.0
import xlsxwriter

file_name = input("Enter the destination file name (without extension): ")
merged_file_name = file_name + ".xlsx"
dest_book = xlsxwriter.Workbook(merged_file_name)
dest_sheet_1 = dest_book.add_worksheet()
dest_row = 1
header_written = False
path = input("Enter the path of the folder to scan: ")
for root, dirs, files in os.walk(path):
    files = [f for f in files if f.endswith('.xlsx')]
    for xlsfile in files:
        print("File in mentioned folder is: " + xlsfile)
        temp_book = xlrd.open_workbook(os.path.join(root, xlsfile))
        temp_sheet = temp_book.sheet_by_index(0)
        # copy the header row once, from the first file only
        if not header_written:
            for col_index in range(temp_sheet.ncols):
                value = temp_sheet.cell_value(0, col_index)
                dest_sheet_1.write(0, col_index, value)
            header_written = True
        # append the data rows of every file below the header
        for row_index in range(1, temp_sheet.nrows):
            for col_index in range(temp_sheet.ncols):
                value = temp_sheet.cell_value(row_index, col_index)
                dest_sheet_1.write(dest_row, col_index, value)
            dest_row = dest_row + 1
dest_book.close()
book = xlrd.open_workbook(merged_file_name)
sheet = book.sheet_by_index(0)
print("number of rows in destination file are: ", sheet.nrows)
print("number of columns in destination file are: ", sheet.ncols)

It looks like this should work too.

import pandas as pd

# filenames
excel_names = ["xlsx1.xlsx", "xlsx2.xlsx", "xlsx3.xlsx"]

# read them in
excels = [pd.ExcelFile(name) for name in excel_names]

# turn them into dataframes
frames = [x.parse(x.sheet_names[0], header=None, index_col=None) for x in excels]

# delete the first row for all frames except the first
# i.e. remove the header row -- assumes it's the first
frames[1:] = [df[1:] for df in frames[1:]]

# concatenate them..
combined = pd.concat(frames)

# write it out
combined.to_excel("c.xlsx", header=False, index=False)

How to concatenate three excels files xlsx using python?
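
A possible variation on the concatenation above, sketched under the assumption that every file has the same header row: let pandas parse the headers and pass `ignore_index=True` so the combined frame gets a fresh row index. The in-memory frames below are made-up stand-ins for what `pd.read_excel(name)` would return for each file.

```python
import pandas as pd

# Stand-ins for [pd.read_excel(name) for name in excel_names].
frames = [
    pd.DataFrame({"Name of City": ["Sydney"], "State": ["NSW"]}),
    pd.DataFrame({"Name of City": ["Melbourne"], "State": ["VIC"]}),
]

# Columns are aligned by header name, so no header row has to be dropped.
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

With parsed headers, the result could then be written with `combined.to_excel("c.xlsx", index=False)`, keeping the header row in the output.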