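I have a Python program that reads an Excel document. I need to keep only the first occurrence of certain column combinations. For example, with the sample data shown below: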
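I want to drop/skip the duplicate found in the third row of the data below and write the remaining rows to a CSV file. The function I have tried so far did not work.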
   A   | B
-------------
1. 200 | 201
2. 200 | 202
3. 200 | 201
4. 200 | 203
5. 201 | 201
6. 201 | 202
.............
Answer 0: (score: 2)
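mylist = [] is used twice, and assigning a single value to it makes this difficult. It should look something like this:
mylist = []
for row in range(1, number_of_rows):
    # Collect the (column A, column B) pair of every data row
    mylist.append((sheet.cell_value(row, 0), sheet.cell_value(row, 1)))
myset = set(mylist)
Note that a set is unordered. If you want to see the results in order, also check this.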
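Because a set discards insertion order, one common way to keep only the first occurrence of each pair is to remember the pairs already seen while building the result. The following is a minimal sketch of that idea; the helper name first_occurrences is hypothetical, and the sample rows are taken from the table in the question.

```python
def first_occurrences(pairs):
    """Return the pairs in their original order, dropping later repeats."""
    seen = set()   # pairs encountered so far (fast membership test)
    result = []    # first occurrences, in input order
    for pair in pairs:
        if pair not in seen:
            seen.add(pair)
            result.append(pair)
    return result


# Sample rows from the question's table (columns A and B).
rows = [(200, 201), (200, 202), (200, 201), (200, 203), (201, 201), (201, 202)]
print(first_occurrences(rows))
# → [(200, 201), (200, 202), (200, 203), (201, 201), (201, 202)]
```

The third row is skipped because its pair (200, 201) already appeared in the first row.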
Answer 1: (score: 2)
This works for me, in Python 2.7:
import xlrd

def validateExcel(filename):
    xls = xlrd.open_workbook(filename)
    for sheet in xls.sheets():
        number_of_rows = sheet.nrows
        mylist = []
        # Start at row 1 to skip the header row
        for row in range(1, number_of_rows):
            mylist.append((sheet.cell_value(row, 0), sheet.cell_value(row, 1)))
        # Deduplicate, then restore first-occurrence order
        myset = sorted(set(mylist), key=mylist.index)
    # With several sheets, only the last sheet's result is returned
    return myset
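The expression sorted(set(mylist), key=mylist.index) restores first-occurrence order by looking up the first index of each unique pair, which means a linear search per pair. As a hedged alternative, assuming Python 3.7 or later (where dicts preserve insertion order, unlike the Python 2.7 mentioned above), dict.fromkeys gives the same result in a single pass. The sample data is taken from the question.

```python
# Pairs from the question's table (columns A and B).
mylist = [(200, 201), (200, 202), (200, 201), (200, 203), (201, 201), (201, 202)]

# The answer's approach: deduplicate, then sort by first index in the original list.
by_index = sorted(set(mylist), key=mylist.index)

# Alternative (Python 3.7+): dict keys are unique and keep insertion order.
by_dict = list(dict.fromkeys(mylist))

print(by_index == by_dict)
# → True
```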
Answer 2: (score: 2)
Here is my solution. It removes the duplicates and creates a new file without them.
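As an illustration of the idea described here (and not the answerer's actual code), the sketch below writes rows to a CSV target while skipping any row whose key-column combination has already been written. The function name write_unique_rows and its key_columns parameter are hypothetical; the demonstration uses an in-memory buffer, and a real file opened with open(path, "w", newline="") should work the same way.

```python
import csv
import io


def write_unique_rows(rows, out_file, key_columns=(0, 1)):
    """Write rows as CSV, skipping repeated key-column combinations.

    Returns the number of rows actually written.
    """
    seen = set()
    writer = csv.writer(out_file)
    written = 0
    for row in rows:
        key = tuple(row[i] for i in key_columns)
        if key in seen:
            continue  # a row with this combination was already written
        seen.add(key)
        writer.writerow(row)
        written += 1
    return written


# In-memory demonstration with the first four rows of the question's table.
buffer = io.StringIO()
count = write_unique_rows([[200, 201], [200, 202], [200, 201], [200, 203]], buffer)
print(count)
print(buffer.getvalue())
```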
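Answer 3: (score: 1)
This should append each row (called a sublist in this example) to the mylist list if it is not already there. That gives you a deduplicated list in the order the rows were found in the xlsx file. If you can, it may be worth taking a look at the pandas library. If not, this should help: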
import xlrd

def validateExcel(filename):
    xls = xlrd.open_workbook(filename)
    for sheet in xls.sheets():
        number_of_rows = sheet.nrows
        number_of_columns = sheet.ncols
        mylist = []
        # Start at row 1 to skip the header row
        for row in range(1, number_of_rows):
            sublist = [sheet.cell_value(row, col) for col in range(0, number_of_columns)]
            # Keep only the first occurrence of each row
            if sublist not in mylist:
                mylist.append(sublist)
        print(mylist)
    # With several sheets, only the last sheet's rows are returned
    return mylist
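Edit:
If you have an xlsx file with multiple sheets, you can use a dict to store the deduplicated row data, with the sheet name as the key, and then pass that dict to a CSV-writing function: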
import csv
import xlrd

def validateExcel(filename):
    outputDict = {}
    xls = xlrd.open_workbook(filename)
    sheetCount = 0
    for sheet in xls.sheets():
        number_of_rows = sheet.nrows
        number_of_columns = sheet.ncols
        sheetname = sheet.name
        # Fall back to the sheet's position if it has no name
        if not sheetname:
            sheetname = str(sheetCount)
        outputDict[sheetname] = []
        # Start at row 1 to skip the header row
        for row in range(1, number_of_rows):
            sublist = [sheet.cell_value(row, col) for col in range(0, number_of_columns)]
            if sublist not in outputDict[sheetname]:
                outputDict[sheetname].append(sublist)
        print(outputDict[sheetname])
        sheetCount += 1
    return outputDict

# will go through the generated dictionary and write the data to csv files
def writeToFiles(generatedDictionary):
    for key in generatedDictionary:
        # Python 3: text mode with newline="" (on Python 2, open with mode "wb" instead)
        with open(key + ".csv", "w", newline="") as csvFile:
            writer = csv.writer(csvFile)
            writer.writerows(generatedDictionary[key])
If you can use pandas, something like this may work:
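import pandas as pd

# pd.ExcelFile exposes the sheet names and parses each sheet into a DataFrame
xls = pd.ExcelFile(filename)
for name in xls.sheet_names:
    sheetDataFrame = xls.parse(name)
    # Drop rows that repeat an earlier row exactly; the first occurrence is kept
    filtered = sheetDataFrame.drop_duplicates()
    filtered.to_csv(name + ".csv")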
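By default, drop_duplicates compares entire rows. Since the question is about a combination of specific columns, the subset parameter may be closer to what is needed. The sketch below uses a small hypothetical DataFrame that mirrors the question's table, with an extra "note" column (an assumption added for illustration) to show that other columns do not affect which rows count as duplicates.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's table, plus a "note" column.
df = pd.DataFrame({
    "A": [200, 200, 200, 200, 201, 201],
    "B": [201, 202, 201, 203, 201, 202],
    "note": ["a", "b", "c", "d", "e", "f"],
})

# Keep the first row for each (A, B) combination; "note" is ignored
# when deciding what counts as a duplicate.
filtered = df.drop_duplicates(subset=["A", "B"], keep="first")
print(filtered)
```

Here the third row (note "c") is dropped because its (A, B) pair repeats the first row.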