Question

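I have a CSV file with 5 columns and 3 rows. The columns are separated by tabs and the rows by newlines. Some elements are empty. I need to find every column that is empty in all rows. The file is here: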

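My code is below. The problem is that it does not work for the last column: if the last column is empty, i.e. there is no value after the last tab in a row, it is still counted as a non-empty string. I checked the length of "eachElement". Strangely, the length is 1 for rows 1 and 2, but row 3 shows an empty string. It seems to count the newline after the last tab in the last column of the first two rows (hence the length of 1), but logically it should not, because I am using "for line in content". So each line should contain only that line, without the "\n".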

import sys
import array

rowIndex = -1
countEmptyCol = array.array('i',(0 for i in range(0,5)))    #this creates a signed int array of 5 elements and assigns 0 to each
listEmptyColumns = []   #contains index of columns that are empty for all records

#Get number of empty values for each columns in the array
with open("D:\TU Ilmenau\L1T2\Labs\DDM\Python\database.csv", "r", 1) as file:
    content = file.readlines()
    for line in content:
        rowIndex += 1
        colIndex = -1
        for eachElement in line.split("\t"):
            colIndex += 1
            if not eachElement:
                #increases the value of index by 1
                countEmptyCol.insert(colIndex, countEmptyCol.pop(colIndex) + 1)

numOfRows = rowIndex + 1

#Compare if number of empty values for each column is equal to the number of total rows
for idx, val in enumerate(countEmptyCol):
    if val == numOfRows:
        listEmptyColumns.append(idx)
print listEmptyColumns

Answer 1

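Each line still ends with the newline character \n, because readlines() keeps the line terminator. That makes the last field "\n" (length 1) instead of an empty string. Strip it inside the for loop: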

for line in content:
    line = line.rstrip('\n')
    rowIndex += 1
    colIndex = -1
    ...

I tried this and it works.
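
For reference, here is a minimal self-contained sketch of the whole check with the newline stripped. The function name find_empty_columns, its parameters, and the sample rows are made up for illustration and are not from the question; a plain list stands in for the array module. It assumes every row has the expected number of tab-separated fields, so a field that is missing entirely is not counted as empty.

```python
def find_empty_columns(lines, num_cols):
    """Return the indexes of columns that are empty in every row."""
    empty_counts = [0] * num_cols
    num_rows = 0
    for line in lines:
        # Strip the line terminator first; otherwise the last field of
        # every row except possibly the final one is "\n", not "".
        fields = line.rstrip("\r\n").split("\t")
        num_rows += 1
        for col_index, field in enumerate(fields[:num_cols]):
            if not field:
                empty_counts[col_index] += 1
    # A column is empty everywhere if its count equals the row count.
    return [i for i, count in enumerate(empty_counts) if count == num_rows]


# Hypothetical sample data: 3 rows, 5 tab-separated columns.
# Columns 1 and 4 are empty in every row; the last row has no "\n".
sample = ["a\t\tb\tc\t\n", "d\t\t\te\t\n", "\t\tf\tg\t"]

print("x\t\n".split("\t"))            # ['x', '\n'] -> last field has length 1
print(find_empty_columns(sample, 5))  # [1, 4]
```

With a real file, the lines could come from iterating over the open file object, which yields the same newline-terminated strings as readlines().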
