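I have many Excel sheets in which the data is arranged in a table, as shown below.
I want to convert each sheet into another, list-like format, as shown below.
In this table: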
Dir -> Name of the tab
Year -> Same for entire table
DOM -> Day of the month
DOW -> Day of the week.
Hour -> Column label in original table
Traffic Count -> Value in the original table
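There are nearly 1000 such sheets. The data is in the same location in every sheet. What is the best way to do this? Should I write a VBA script, or is there something in Excel I can use to make my life easier?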
Answer 0 (score: 2)
I solved this using Python and the xlrd module:
import xlrd
import numpy as np
from os import listdir, chdir
import calendar
import re
direction = {'N':0,'S':1,'E':2,'W':3}
rend = [0, 40, 37, 40, 39, 40, 39, 40, 40, 39, 40, 39, 40]
# Initialize the matrix to store final result
aData = np.matrix(np.zeros((0,7)))
# get all the xlsx files from the directory.
filenames = [ f for f in listdir('.') if f.endswith('.xlsx') ]
# for each .xlsx in the current directory
for file in filenames:
    # The file names are in the format gdot_39446_yyyy_mm.xlsx
    # yyyy is the year and mm is the month number with 0 - Jan and 11 - Dec
    # I extracted the month and year info from the file name
    ms = re.search(r'.+_.+_.+_([0-9]+)\.', file, re.M).group(1)
    month = int(ms) + 1
    year = int(file[11:15])
    # open the workbook (note: xlrd 2.0+ reads only .xls, so .xlsx needs xlrd < 2.0)
    workbook = xlrd.open_workbook(file)
    # the workbook has three sheets. I want information from
    # sheet2 and sheet3 (indexed by 1 and 2 resp.)
    for i in range(1,3):
        sheet = workbook.sheet_by_index(i)
        # the last character of the tab name is the direction (N, S, E or W)
        di = sheet.name[-1]
        # columns 2..25 hold the 24 hourly counts; rows 9..rend[month]-1 hold the days
        data = [[sheet.cell_value(r,c) for c in range(2,26)] for r in range(9,rend[month])]
        mData = np.matrix(data)
        mData[np.where(mData=='')] = 0 # some cells are blank. Insert 0 in those cells
        n,d = mData.shape
        rows = n * d
        # columns: direction, year, month, day of month, day of week, hour, count
        rData = np.matrix(np.zeros((rows,7)))
        rData[:,0].fill(direction[di])
        rData[:,1].fill(year)
        rData[:,2].fill(month)
        for j in range(rows):
            # integer division: each block of 24 rows is one day
            day = j // 24 + 1
            rData[j,3] = day
            rData[j,4] = calendar.weekday(year, month, day)
            rData[j,5] = j % 24
        for k in range(n):
            rData[k*24:((k+1)*24),6] = mData[k,:].T
        aData = np.vstack((aData,rData))
np.savetxt("alldata.csv",aData, delimiter=',', fmt='%s')
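
The reshaping step is the core of the script, and it can be tried without any Excel files. Below is a minimal sketch of that step on plain nested lists. It assumes each inner list is one day and each position in it is one hour. The function name `wide_to_long` and the sample grid are hypothetical and are not part of the script above.

```python
import calendar

def wide_to_long(grid, direction, year, month):
    """Flatten a day-by-hour grid into one row per (day, hour) cell.

    grid[d][h] is the count for day d+1 at hour h (hypothetical layout).
    Each output row is (dir, year, month, dom, dow, hour, count).
    Blank cells ('') are replaced by 0, as in the script above.
    """
    rows = []
    for day_index, day_counts in enumerate(grid):
        dom = day_index + 1
        # calendar.weekday returns 0 for Monday through 6 for Sunday
        dow = calendar.weekday(year, month, dom)
        for hour, count in enumerate(day_counts):
            value = 0 if count == '' else count
            rows.append((direction, year, month, dom, dow, hour, value))
    return rows

# Made-up two-day, three-hour grid, purely for illustration.
sample = [[5, '', 7], [1, 2, 3]]
long_rows = wide_to_long(sample, 'N', 2014, 1)
```

With real data, each inner list would hold 24 hourly values, and `direction` would come from the tab name.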