For example, I have a tuple such as
tup = [['P Y T F EY EN', 'p y t h o n'], ['R O K', 'r o x']]
I then split the tuple into lists such as
lst1 = [['P', 'Y', 'T', 'F', 'EY', 'EN'], ['R', 'O', 'K']]
lst2 = [['p', 'y', 't', 'h', 'o', 'n'], ['r', 'o', 'x']]
I have three conditions. First, the length of the first element of each tuple must equal the length of the second element:
for i in tup:
    if not len(tup[0].split()) == len(tup[1].split()) :
        count += 1
        break
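The second condition is that, for each element of lst1, every character in that element must appear in another document, for example a CSV file:
for i in lst1:
    for j in i:
        if j not in file:
            count += 1
            break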
The third condition is that, for each element of lst2, every character must likewise appear in another document:
for i in lst2:
    for j in i:
        if j not in other_file:
            count += 1
            break
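As you can see, I want the count to increase whenever one of the conditions is broken. I also don't want the counts to overlap: once a condition is broken and the count has been incremented, processing should skip to the next row.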
Answer 0 (score: 0)
Maybe this will help.
I'm assuming the files are small enough to be read in one go:
with open('doc1.csv', 'r') as f:  # read all of doc1.csv now
    doc1 = f.read()
with open('doc2.csv', 'r') as f:  # read all of doc2.csv now
    doc2 = f.read()
count = 0  # count of all rows in tup that are invalid
for item in tup:
    l1 = item[0].split()  # list version of the first string
    l2 = item[1].split()  # list version of the second string
    # A row is invalid if the lengths differ, if any item of l1 is
    # missing from doc1, or if any item of l2 is missing from doc2
    if (len(l1) != len(l2)
            or not all(char in doc1 for char in l1)
            or not all(char in doc2 for char in l2)):
        count += 1
print(count)
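One caveat with the snippet above: `char in doc1` is a substring test on the whole file text, so an item such as `EY` would also match inside a longer token. Below is a minimal sketch of a stricter variant. It assumes each CSV file simply lists the allowed symbols in its cells (that layout is a guess, not something the question states), and the helper names `load_symbols` and `count_invalid` as well as the in-memory stand-ins for the two files are made up for illustration.

```python
import csv
import io


def load_symbols(fileobj):
    """Collect every non-empty cell of a CSV file object into a set."""
    symbols = set()
    for row in csv.reader(fileobj):
        for cell in row:
            cell = cell.strip()
            if cell:
                symbols.add(cell)
    return symbols


def count_invalid(pairs, allowed1, allowed2):
    """Count pairs that break at least one condition; each pair counts once."""
    count = 0
    for first, second in pairs:
        l1 = first.split()
        l2 = second.split()
        if (len(l1) != len(l2)
                or any(s not in allowed1 for s in l1)
                or any(s not in allowed2 for s in l2)):
            count += 1
    return count


# In-memory stand-ins for doc1.csv and doc2.csv (hypothetical contents)
doc1 = io.StringIO("P,Y,T,F,EY,EN\nR,O\n")
doc2 = io.StringIO("p,y,t,h,o,n\nr,x\n")
allowed1 = load_symbols(doc1)
allowed2 = load_symbols(doc2)

tup = [['P Y T F EY EN', 'p y t h o n'], ['R O K', 'r o x']]
print(count_invalid(tup, allowed1, allowed2))
```

With real files, `load_symbols` would be called on an opened file instead of the `io.StringIO` objects. Using a set makes each membership test an exact match rather than a substring search.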
Answer 1 (score: 0)
First, your example has two problems:
1) tup is a list, not a tuple;
2) tup[0] = ['P Y T F EY EN', 'p y t h o n'] and tup[1] = ['R O K', 'r o x'].
Both of them are lists, so split() cannot be called on them.
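To see the second point concretely, here is a small check using the `tup` from the question. Indexing the outer list yields an inner list, and only the strings inside that list have a `split()` method. The variable names `first_row`, `error` and `tokens` are introduced here just for the demonstration.

```python
tup = [['P Y T F EY EN', 'p y t h o n'], ['R O K', 'r o x']]

first_row = tup[0]            # a list of two strings, not a string
try:
    first_row.split()
    error = None
except AttributeError as exc:
    error = exc               # lists have no split() method

# Splitting has to happen on the strings inside each row
tokens = first_row[0].split()
print(type(first_row).__name__, error is not None, tokens)
```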
If you want to compute the total count, you can do it in one statement:
print(sum([len(i[0].split()) != len(i[1].split()) for i in tup] +
          [j not in file for i in lst1 for j in i] +
          [j not in other_file for i in lst2 for j in i]))
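Note that the sum above adds one for every failed check, so a row that breaks several conditions is counted several times. If each row should be counted at most once, as the question asks, a possible one-statement variant is sketched below. It assumes the two documents have already been loaded into collections of allowed symbols; the names `allowed1` and `allowed2`, their contents, and the extra third row are made up for illustration.

```python
tup = [['P Y T F EY EN', 'p y t h o n'], ['R O K', 'r o x'], ['A B', 'a']]
# Hypothetical contents of the two reference documents
allowed1 = {'P', 'Y', 'T', 'F', 'EY', 'EN', 'R', 'O', 'A', 'B'}
allowed2 = {'p', 'y', 't', 'h', 'o', 'n', 'r', 'a'}

# Each row contributes True (1) if it breaks any condition, else False (0)
count = sum(
    len(a.split()) != len(b.split())
    or any(s not in allowed1 for s in a.split())
    or any(s not in allowed2 for s in b.split())
    for a, b in tup
)
print(count)
```

Because `or` short-circuits, a row stops being checked as soon as one condition fails, which matches the "skip to the next row" behaviour described in the question.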