Question

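I am learning Python and come from a C background. Apologies if my question is naive, too simple, or shows too little effort.

In the code below, I want to practice for a future problem: deleting specific rows by way of a 'set' data structure. But first of all, it fails to match and delete the contents of the set.

Also, a second issue is an error in the output (o/p). It can be checked by making the indented block work (the one currently disabled inside the triple-quoted string).

The trimmed data file, marks_trim.csv, is: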

"Anaconda Systems Campus Placement",,,,,,
"Conducted on:",,,"30 Feb 2011",,,
"Sno","Maths","CS","GK","PROG","Comm","SEL"
1,"NA","NA","NA",4,0,0

import csv, sys, re, random, os, time, io, StringIO

datfile = sys.argv[1]
outfileName = sys.argv[2]
outfile = open(outfileName, "w")

count = 0
removal_list = set()
tmp = list()
i = 0
re_pattern = r"\d+"  # raw string, so \d reaches the regex engine unchanged

with open(datfile, 'r') as fp:
    reader1 = csv.reader(fp)
    for row in reader1:
        if re.match(re_pattern, row[0]):
            for cols in row:
                removal_list.add(tuple(cols)) #as tuple is hashable

print "::row>>>>>>",row
print "::removal_list>>>>>>>>",removal_list

convert = list(removal_list)
print "<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>"
print convert

f = open(datfile, 'r')
reader2 = csv.reader(f)

print ""
print "Removal List Starts"
print removal_list
print "Removal List Ends\n"

new_a_buf =  io.BytesIO() # StringIO.StringIO() : both 'io' & StringIO' work
writer = csv.writer(new_a_buf)
rr = ""
j = 0

for row in reader2:
    if row not in convert:   # removal_list: not used as list not hashable
        writer.writerow(row)  #outfile.write(new_a_buf)
        '''
        #below code using char array isn't used as it doesn't copy structure of csv file
    for cols in row:  #at indentation level of "if row not in  convert", stated above
        if cols not in convert:   # removal_list: not used as list not hashable
            for j in range(0,len(cols)):
                rr+=cols[j]  #at indentation level of "if cols not in convert:"
        outfile.write(rr)  # at the indentation level of 'if'
        print "<<<<<<<<<<<<<<<<", rr

f = open(outfile, 'r')
reader2 = csv.reader(f)
        '''

new_a_buf.seek(0)
reader2 = csv.reader(new_a_buf)
for row in reader2:
    print row

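Issue/Problem:

The error common to both outputs (o/p), i.e. whether using the char array or the csv.writer object, is that the rows present in removal_list are not removed.

In addition, the approach that uses the char array to retrieve the remaining rows fails with this error: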

Traceback (most recent call last):
  File "test_list_in_set.py", line 51, in <module>
    f = open(outfile, 'r')
TypeError: coercing to Unicode: need string or buffer, file found

Answer 1

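I didn't read all of your code, but most of it doesn't seem relevant. The error is about opening a file: open takes a filename, but you are passing outfile, which is already a file object. You should close that file first, then pass outfileName to open.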
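The point can be illustrated with a minimal sketch. It is written for Python 3, where the same mistake still raises a TypeError but with a different message than the Python 2 one quoted in the question. The temporary directory and the name outfile_name are assumptions standing in for the outfileName read from sys.argv[2].

```python
import os
import tempfile

# Hypothetical stand-in for the outfileName taken from sys.argv[2] in the question.
out_dir = tempfile.mkdtemp()
outfile_name = os.path.join(out_dir, "out.csv")

outfile = open(outfile_name, "w")
outfile.write("1,NA,NA,NA,4,0,0\n")

# Passing the file object instead of its name reproduces the mistake.
try:
    open(outfile, "r")
    error_type = None
except TypeError as exc:
    error_type = type(exc).__name__
    print("open(outfile) failed:", exc)

# Fix: close the writer (which also flushes it), then reopen by name.
outfile.close()
with open(outfile_name, "r") as f:
    contents = f.read()

print(contents)
```

Closing before reopening matters for more than the TypeError: until the writer is closed or flushed, the data may still sit in its buffer and not be visible to a second reader.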

Answer 2

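Got it, sadly on my own. The change, apart from no longer storing the individual cols, was to turn removal_list into an array (a list) and then append whole rows to it with: removal_list.append(row)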
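Why storing whole rows helps can be seen in a small sketch. It is a Python 3 variation, not the asker's exact code: instead of a list with append, it keeps the set the question wanted by converting each row to a tuple. The DATA string is modelled on marks_trim.csv, and split_rows is a hypothetical helper name.

```python
import csv
import io
import re

# Sample data modelled on marks_trim.csv from the question.
DATA = (
    '"Anaconda Systems Campus Placement",,,,,,\n'
    '"Conducted on:",,,"30 Feb 2011",,,\n'
    '"Sno","Maths","CS","GK","PROG","Comm","SEL"\n'
    '1,"NA","NA","NA",4,0,0\n'
)

# tuple() on a string splits it into characters, so the original set
# never contained anything that could compare equal to a whole row.
assert tuple("NA") == ("N", "A")


def split_rows(text):
    """Return (kept, removal); rows whose first cell starts with a digit are removed."""
    rows = list(csv.reader(io.StringIO(text)))
    # Store each matching row whole, as a hashable tuple.
    removal = {tuple(row) for row in rows if row and re.match(r"\d+", row[0])}
    kept = [row for row in rows if tuple(row) not in removal]
    return kept, removal


kept, removal = split_rows(DATA)
for row in kept:
    print(row)
```

A list of rows, as in the answer, works equally well for small files; the set of tuples only makes the membership test cheaper when many rows are to be removed.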

Bump!

Python TypeError: coercing to Unicode: need string or buffer, file found
