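First of all, I want to stress that I am a Python beginner. I use the code below to manipulate some data from a CSV file. I know it is not the prettiest code and I could probably make it more elegant, but it works up to a point, which is why I am opening this question.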
import csv
from numpy import interp
from operator import sub
import math
import pandas as pd
from Tkinter import *
import Tkinter as tk
import tkFileDialog as filedialog
root = Tk()
root.withdraw()
filename= filedialog.askopenfilename( initialdir="C:/", title="select file", filetypes=(("CSV files", "*.CSV"), ("all files", "*.*")))
id_uri = []
ore = []
minute = []
zile = []
activi = []
listx = []
listsa = []
list_ore = []
listspi = []
listspf = []
list_min = []
zile_luna = 0
test = []
nume = []
# Read the attendance log and the list of active employees.
with open(filename) as p, open('activi.csv') as a:
    reader = csv.reader(p, delimiter=',')
    for row in reader:
        id_uri.append(row[0])   # employee code
        ore.append(row[1])      # hour
        minute.append(row[2])   # minute
        zile.append(row[3])     # day
    reader = csv.reader(a)
    for row in reader:
        activi.append(row[0])   # employee code
        nume.append(row[1])     # employee name
id_uri = map(int, id_uri)
ore = map(float, ore)
minute = map(float, minute)
minute = interp(minute,[0,60],[0,100])
ore = ore + minute/100
zile = map(int, zile)
activi = map(int, activi)
zile_luna = len(set(zile))+1
minim = 0
maxim = 0
def pontaj():
    # Compute the time worked by each active employee on day z (set by the loop below).
    global listx
    global listsa
    global listspi
    global listspf
    global list_ore
    global list_min
    global maxim
    global minim
    for x in range(3):
        for y in range(len(id_uri)):
            if zile[y] == z:
                if activi[x] == id_uri[y]:
                    listx.append(ore[y])
        # Worked time: latest clock time of the day minus the earliest one.
        minim = min(listx)
        maxim = max(listx)
        listsa.append(maxim-minim)
        listx = []
    # Split the decimal hours into whole hours and minutes.
    listspi = [int(i) for i in listsa]
    listspf = [i%1 for i in listsa]
    for i in range(len(listspf)):
        listspf[i] = round(listspf[i], 2)
        listspf[i] = listspf[i]*100
        listspf[i] = interp(listspf[i],[0,100],[0,60])
        listspf[i] = int(listspf[i])
    list_ore.append(listspi)
    list_min.append(listspf)
    listsa = []
for z in range(1,zile_luna):
    pontaj()
# Convert the hour and minute values to strings.
for sublst in list_ore:
    for item in range(len(sublst)):
        sublst[item] = str(sublst[item])
for sublst in list_min:
    for item in range(len(sublst)):
        sublst[item] = str(sublst[item])
# Join each hour value with its minute value as "hours:minutes".
for i in range(len(list_ore)):
    for j in range(len(list_ore[i])):
        list_ore[i][j] = list_ore[i][j] + ':' + list_min[i][j]
df = pd.DataFrame(list_ore)
df = df.T
nume = pd.Series(nume)
df['e'] = nume.values
df.to_csv('pontaj.csv', index = False, header = False)
print df
The CSV file I read all the information from looks like this (employee code, hour, minute, day):
23,5,00,1
23,6,00,1
24,7,00,1
25,8,00,1
24,9,00,1
25,11,00,1
24,7,00,2
25,8,00,2
24,9,00,2
25,11,00,2
23,5,00,4
23,6,00,4
24,7,00,4
25,8,00,4
24,9,00,4
25,11,00,4
I have another CSV file with the employee codes and names, which looks like this:
23,aqwe
24,beww
25,cwww
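Basically it is an attendance logger: it compares the information in one CSV with the other, finds each employee's minimum and maximum clock time for a given day, subtracts the minimum from the maximum, and puts the result in a list that is written to another CSV.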
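A minimal sketch of that calculation on the day 1 rows of the sample file, to make the intent concrete. It works in whole minutes instead of the decimal hours used in my script, and the names `rows`, `times` and `worked` are made up for this illustration only.

```python
from collections import defaultdict

# Day 1 rows copied from the sample CSV: (employee code, hour, minute, day).
rows = [
    (23, 5, 0, 1), (23, 6, 0, 1), (24, 7, 0, 1),
    (25, 8, 0, 1), (24, 9, 0, 1), (25, 11, 0, 1),
]

# Collect every clock time, in minutes since midnight, per (employee, day) pair.
times = defaultdict(list)
for employee, hour, minute, day in rows:
    times[(employee, day)].append(hour * 60 + minute)

# Worked time is the latest clock time minus the earliest one.
worked = {key: max(vals) - min(vals) for key, vals in times.items()}

for (employee, day), minutes in sorted(worked.items()):
    print("employee %d, day %d: %d:%02d" % (employee, day, minutes // 60, minutes % 60))
```

Keying the results by (employee, day) means a value is looked up by who and when, not by its position in a list.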
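The thing is, if every employee is present on a given day, everything goes well: it calculates the hours attended and puts them in the CSV. But what happens if an employee skips a day? As I found out, it breaks the calculation, because the code requires all the data to be consistent and in perfect order.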
The data written to the CSV file must end up looking like this:
day1 day2 day3
hours hours hours employee_a
hours hours hours employee_b
hours hours hours employee_c
But if a day is skipped, the hours get mixed up.
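A small sketch of where the sample data seems to stop lining up, using only the columns from the sample CSV. The names `zile`, `zile_luna`, `id_uri`, `ore` and `listx` mirror my script; `looped_days` and `actual_days` are made up for this illustration. It only shows the mismatch and is not a fix.

```python
# Columns of the sample CSV, one entry per row.
id_uri = [23, 23, 24, 25, 24, 25, 24, 25, 24, 25, 23, 23, 24, 25, 24, 25]
ore = [5, 6, 7, 8, 9, 11, 7, 8, 9, 11, 5, 6, 7, 8, 9, 11]
zile = [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 4, 4, 4, 4, 4, 4]

# The script loops over as many consecutive numbers as there are distinct days.
zile_luna = len(set(zile)) + 1
looped_days = list(range(1, zile_luna))

# The days that actually occur in the file.
actual_days = sorted(set(zile))

print(looped_days)
print(actual_days)

# Employee 23 has no rows on day 2, so nothing is collected for that pair.
listx = [ore[y] for y in range(len(id_uri)) if zile[y] == 2 and id_uri[y] == 23]
print(listx)
```

The loop visits day 3, which has no rows at all, and never reaches day 4. For employee 23 on day 2 the collected list is empty, and `min()` or `max()` of an empty list raises `ValueError` in Python.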
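I tried a few different approaches, but none of them worked. I realize the problem comes from my simplistic way of thinking, but as I said, I only started using Python a few days ago.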
Do you have any suggestions on how to improve the code so that it accounts for an employee's missed days and produces data like this:
day1 day2 day3
1:20 2:30 3:40 employee_a
1:20 2:30 3:40 employee_b
0:0 2:30 3:40 employee_c
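To illustrate the layout above, here is a rough sketch that builds those rows from worked time stored per (employee code, day). The values in `worked`, the `names` mapping, the `days` list and the helper `fmt` are all hypothetical, chosen only to reproduce the example table; employee 25 has no entry for day 1, standing in for a skipped day.

```python
# Hypothetical worked time in minutes per (employee code, day).
worked = {
    (23, 1): 80, (23, 2): 150, (23, 3): 220,
    (24, 1): 80, (24, 2): 150, (24, 3): 220,
    (25, 2): 150, (25, 3): 220,
}
names = {23: "employee_a", 24: "employee_b", 25: "employee_c"}
days = [1, 2, 3]

def fmt(minutes):
    """Format whole minutes as hours:minutes, for example 80 becomes '1:20'."""
    return "%d:%d" % (minutes // 60, minutes % 60)

rows = []
for code in sorted(names):
    # dict.get falls back to 0 minutes when the employee has no entry for that day.
    cells = [fmt(worked.get((code, day), 0)) for day in days]
    rows.append(cells + [names[code]])

for row in rows:
    print(" ".join(row))
```

Because each cell is looked up by employee and day, a missing entry becomes `0:0` in its own column and the other columns stay where they are.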
Any suggestions would be greatly appreciated, thank you!