在Excel中将表转换为列表

时间:2014-11-06 23:07:18

标签: python excel excel-vba vba

我有很多excel表,其中数据排列在表格中,如下所示

Original Table 我想将每个表转换为另一个类似列表的格式,如下所示。

Required format

在此表中:

Dir -> Name of  the tab
Year -> Same for entire table
DOM -> Day of the month
DOW -> Day of the week.
Hour -> Column label in original table
Traffic Count -> Value in the original table

有近1000张此类床单。每张表中的数据位于同一位置。做这个的最好方式是什么?我应该写一个VBA脚本还是Excel中的任何东西我可以用来让我的生活更轻松?

1 个答案:

答案 0 :(得分:2)

我使用python和xlrd模块

解决了这个问题
import xlrd
import numpy as np
from os import listdir, chdir
import calendar
import re

direction = {'N':0,'S':1,'E':2,'W':3}
rend = [0, 40, 37, 40, 39, 40, 39, 40, 40, 39, 40, 39, 40]

# Initialize the matrix to store final result
aData = np.matrix(np.zeros((0,7)))

# get all the xlsx files from the directory.
filenames = [ f for f in listdir('.') if f.endswith('.xlsx') ]

# for each .xlsx in the current directory
for file in filenames:
    # The file names are in the format gdot_39446_yyyy_mm.xlsx
    # yyyy is the year and mm is the month number with 0 -Jan and 11 - Dec
    # I extracted the month and year info from the file name

    ms = re.search('.+_.+_.+_([0-9]+)\.',file,re.M).group(1)
    month = int(ms) + 1
    year = int(file[11:15])

    # open the workbook
    workbook = xlrd.open_workbook(file)

    # the workbook has three sheets. I want information from 
    # sheet2 and sheet3 (indexed by 1 adn 2 resp.)
    for i in range(1,3):
        sheet = workbook.sheet_by_index(i)
        di =  sheet.name[-1]
        data = [[sheet.cell_value(r,c) for c in range(2,26)] for r in range(9,rend[month])]

        mData = np.matrix(data)
        mData[np.where(mData=='')] = 0 # some cells are blank. Insert 0 in those cells
        n,d = mData.shape
        rows = n * d

        rData = np.matrix(np.zeros((rows,7)))
        rData[:,0].fill(direction[di])
        rData[:,1].fill(year)
        rData[:,2].fill(month)
        for i in range(rows):
            rData[i,3] = (i/24) + 1
            rData[i,4] = calendar.weekday(year,month,(i/24) + 1)
            rData[i,5] = i%24

        for i in range(n):
            rData[i*24:((i+1)*24),6] = mData[i,:].T

        aData = np.vstack((aData,rData))

    np.savetxt("alldata.csv",aData, delimiter=',', fmt='%s')