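I have a set of columns, read from a .csv file, that contain credit card numbers mixed in with other numbers that are not card numbers. I want to first pick out candidate card numbers with a regular expression, then pass each candidate to a function that runs a Luhn check to see whether it is a valid credit card number. If the function returns true, I append the index of that row to a list. Later I will use those index values with .iloc to fetch the full rows.
Here is what I have so far: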
import re

import numpy as np
import pandas as pd

# doLuhn(number_string) is defined elsewhere and returns True for a valid card number
data = pd.read_csv("fetched_data.csv")
summ = data['summary']
values = np.array(summ)
creditcards = []
regex_match_index_list = []
Validcardsfound = 0
no_duplicate_list = []
for i in range(len(values)):
    temp = re.findall(r'(\b(?:\d[ -]*?){13,16}\b)', str(values[i]))
    if temp:
        for each in temp:
            if doLuhn(str(each)) is True:
                # print("In the loop")
                creditcards.append(each)
                Validcardsfound = Validcardsfound + 1
                regex_match_index_list.append(i)
            elif doLuhn(str(each)) is False:
                pass
                # print(str(each))
            else:
                pass
My question is: how do I drop duplicate cards so that the index is appended only once per card?
Thanks in advance!
Answer 0 (score: 0)
Using a set is one way to do this:
import re

import numpy as np
import pandas as pd

# doLuhn(number_string) is the asker's own function; it is not defined here
data = pd.read_csv("fetched_data.csv")
summ = data['summary']
values = np.array(summ)
creditcards = set()  # unique valid card numbers seen so far
regex_match_index_list = []
Validcardsfound = 0
no_duplicate_list = []
for i in range(len(values)):
    temp = re.findall(r'(\b(?:\d[ -]*?){13,16}\b)', str(values[i]))
    if temp:
        for each in temp:
            if doLuhn(str(each)):
                # Add unique valid credit card numbers to the set
                if each not in creditcards:
                    creditcards.add(each)  # add new card to set
                    Validcardsfound = Validcardsfound + 1  # increment number of unique cards found
                    regex_match_index_list.append(i)  # append index of new card found
print(creditcards)  # credit cards found
print(regex_match_index_list)  # index of credit cards in values array
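One caveat with comparing the raw regex matches: the same card written as "4111 1111 1111 1111" and "4111-1111-1111-1111" would count as two different cards. Below is a minimal sketch that strips separators before deduplicating and records the first row index for each card. The names luhn_ok and unique_card_rows are hypothetical, and luhn_ok is only an assumed stand-in for the asker's doLuhn, which is not shown in the question. The sample strings use well-known test card numbers, not real data.

```python
import re

# Same pattern as in the question, without the outer capture group
CARD_PATTERN = re.compile(r'\b(?:\d[ -]*?){13,16}\b')


def luhn_ok(number: str) -> bool:
    """Hypothetical stand-in for doLuhn: standard Luhn checksum over the digits."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one
    for pos, digit in enumerate(reversed(digits)):
        if pos % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def unique_card_rows(values):
    """Map each distinct valid card (digits only) to the first row index where it appears."""
    first_seen = {}
    for i, text in enumerate(values):
        for match in CARD_PATTERN.findall(str(text)):
            card = re.sub(r'\D', '', match)  # strip spaces and hyphens
            if luhn_ok(card) and card not in first_seen:
                first_seen[card] = i
    return first_seen


sample = [
    "paid with 4111 1111 1111 1111 yesterday",
    "no card here",
    "same card again: 4111-1111-1111-1111",
    "invalid 1234567812345678",
    "other card 5500005555555559",
]
found = unique_card_rows(sample)
print(found)  # {'4111111111111111': 0, '5500005555555559': 4}
```

With the real DataFrame, something like data.iloc[sorted(set(found.values()))] should then fetch the matching rows. The set() call is there because two different cards can sit on the same row.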