Question

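I have several columns, read from a .csv file, that contain credit card numbers as well as other numbers that are not credit card numbers. I want to first filter out candidate card numbers with a regular expression, then pass each candidate to a function that performs a Luhn check to see whether it is a valid credit card number. If the function returns true, I append the index of that card to a list. Later I will use those index values with .iloc to fetch the entire rows.

This is what I have done so far: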

import re

import numpy as np
import pandas as pd

# doLuhn(number) is my own Luhn-check function (definition not shown here)
data = pd.read_csv("fetched_data.csv")
summ = data['summary']
values = np.array(summ)
creditcards = []
regex_match_index_list = []
Validcardsfound = 0
no_duplicate_list = []
for i in range(len(values)):
    # Find runs of 13 to 16 digits, optionally separated by spaces or hyphens
    temp = re.findall(r'(\b(?:\d[ -]*?){13,16}\b)', str(values[i]))
    for each in temp:
        if doLuhn(str(each)):
            creditcards.append(each)
            Validcardsfound = Validcardsfound + 1
            regex_match_index_list.append(i)

My question is how to remove duplicate cards and then append the index values.

Thanks in advance!

Answer 1

Using a set might be one way to do this:

import re

import numpy as np
import pandas as pd

# doLuhn is the asker's Luhn-check function (not shown in the question)
data = pd.read_csv("fetched_data.csv")
summ = data['summary']
values = np.array(summ)

creditcards = set()
regex_match_index_list = []
Validcardsfound = 0
no_duplicate_list = []

for i in range(len(values)):

    temp = re.findall(r'(\b(?:\d[ -]*?){13,16}\b)', str(values[i]))

    for each in temp:
        if doLuhn(str(each)):
            # Add unique valid credit card numbers to set
            if each not in creditcards:
                creditcards.add(each)  # add new card to set
                Validcardsfound = Validcardsfound + 1  # increment number of unique cards found
                regex_match_index_list.append(i)  # append index of new card found

print(creditcards) # credit cards found
print(regex_match_index_list) # index of credit cards in values array
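
One caveat: the set above compares the matched strings exactly as the regex returns them, so the same number written once with spaces and once with hyphens would be counted as two different cards. Below is a minimal sketch of stripping the separators before the duplicate check. It is only an illustration under some assumptions: `luhn_ok` is a hypothetical stand-in for the asker's `doLuhn` (whose real definition is not shown), `find_unique_cards` and `CARD_PATTERN` are names made up here, and a plain list of made-up strings stands in for `data['summary']`.

```python
import re

# Same pattern as in the question, without the outer capture group
CARD_PATTERN = re.compile(r'\b(?:\d[ -]*?){13,16}\b')


def luhn_ok(number: str) -> bool:
    """Hypothetical stand-in for doLuhn: standard Luhn checksum on the digits."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_unique_cards(values):
    """Return (cards, row_indices) for the first occurrence of each valid card."""
    seen = set()
    cards = []
    row_indices = []
    for i, text in enumerate(values):
        for match in CARD_PATTERN.findall(str(text)):
            normalized = re.sub(r'[ -]', '', match)  # drop separators before comparing
            if luhn_ok(normalized) and normalized not in seen:
                seen.add(normalized)
                cards.append(normalized)
                row_indices.append(i)
    return cards, row_indices


# Made-up sample rows standing in for data['summary']
rows = [
    "paid with 4111 1111 1111 1111 yesterday",
    "no card here",
    "card 4111-1111-1111-1111 again, plus 5500005555555559",
    "invalid 1234567890123456",
]
cards, row_indices = find_unique_cards(rows)
print(cards)
print(row_indices)
```

With a real DataFrame, the returned `row_indices` list could then be passed to `data.iloc[row_indices]` to pull the full rows, as planned in the question. Note that a row index can appear more than once if a single row contains several distinct cards.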
