Question

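I have written some code for a function that counts how many times a substring occurs in a string. The second part of the function should return the index of each occurrence.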

The string is stored in a .csv file, and I run into a problem when I try to return the indices of the substring.

1. Import the text.csv file

import csv
data = open('text.csv', 'r')
read_data = csv.reader(data)

2. Complete the function counter. The function should return the number of times the substring occurs and its indices

def counter(substring):
  ss_counter = 0
  for substring in read_data:
    ss_counter = ss_counter + 1
    print('Counter = ', ss_counter)
    print('Index = ', substring.index)

3. Do not edit the code below

counter("TCA")

The error I get from .index is:

<built-in method index of list object at 0x7f4519700208>

Answer 1

Even if you fix the problem I mentioned in the comments, the code still will not work the way you expect.

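The read_data variable is an object that iterates over the lines of the file (see the documentation for csv.reader). So inside the function, when you write for substring in read_data, the substring variable (which overwrites the parameter) holds a single row as a list, where each element is one of the comma-separated values.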
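
To make the point about rows concrete, here is a small sketch. It uses io.StringIO with made-up sample text in place of text.csv, because the question does not show the file's contents; the sample data is an assumption.

```python
import csv
import io

# Hypothetical sample standing in for text.csv; the real contents are unknown.
sample = io.StringIO("ATCATCA,GGTCA\nTTTT,TCA\n")
reader = csv.reader(sample)

# Each row comes back as a list of strings, one string per field.
rows = list(reader)
print(rows)

# A reader is an iterator: once consumed, a second pass yields nothing.
second_pass = list(reader)
print(second_pass)
```

The second point matters if counter is called more than once on the same reader: after the first call there are no rows left to read.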

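Check the list method index in the documentation. You have to call index() on the list and pass it the substring you are looking for. But because you overwrote the substring you are looking for, that is no longer possible.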
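
As a rough illustration of how list.index behaves, the sketch below uses a hypothetical row (the values are invented for the example). It shows why the question's code printed a method object, and that index compares whole elements rather than searching inside them.

```python
# Hypothetical row, shaped like what csv.reader would produce.
row = ["ATCATCA", "GGTCA", "TCA"]

# Without parentheses you only get the bound method object, not a result.
method = row.index
print(callable(method))

# Called with a value, list.index returns the position of the first equal element.
position = row.index("TCA")
print(position)

# It compares whole elements, so text inside a field is not found.
try:
    row.index("GG")
    found = True
except ValueError:
    found = False
print(found)
```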

Note that the substring may occur two or more times, which your current code does not account for.

One solution might be:

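def counter(substring):
    ss_counter = 0
    for row in read_data:
        # list.index raises ValueError when the value is absent, so check first.
        # Note: this matches whole fields only, not text inside a field.
        if substring in row:
            ss_counter = ss_counter + 1
            print('Counter = ', ss_counter)
            print('Index = ', row.index(substring))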

I will let you figure out how to account for multiple occurrences in a given row.
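
One possible way to handle repeated occurrences inside a single string is to call str.find repeatedly, restarting one position after each hit. This is only a sketch; the helper name find_all is hypothetical and not part of the original exercise.

```python
def find_all(text, substring):
    """Return the start index of every occurrence, overlapping ones included."""
    indices = []
    start = text.find(substring)
    while start != -1:
        indices.append(start)
        # Restart one character later so overlapping matches are also found.
        start = text.find(substring, start + 1)
    return indices


print(find_all("ATCATCA", "TCA"))
print(find_all("aaaa", "aa"))
```

Restarting at start + len(substring) instead of start + 1 would skip overlapping matches, if that is the behaviour you want.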

Answer 2

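Your for loop reuses substring as its loop variable, so inside the loop substring actually refers to a row of the CSV file, not the original substring you want to search for. You want to iterate over each row of read_data, then scan each row for a match at every possible starting point (beginning at index 0). I also suggest passing read_data as a second argument rather than using a global variable. Note that this function counts overlapping matches as separate occurrences (for example, with substring = 'aa' and read_data = ['aaaa'], it reports three occurrences in the first row).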

def counter(substring, read_data):
    ss_counter = 0
    # read_data is expected to be a list of strings, one per row
    for row in read_data:
        # Try every start index at which the substring could still fit in the row
        for i in range(0, len(row) - len(substring) + 1):
            if row[i:i+len(substring)] == substring:
                ss_counter += 1
                print('Counter = ', ss_counter)
                print('Index = ', i)

counter("TCA", read_data)

Edit: changed read_data to a list of rows (a list of strings).
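
Putting the pieces together, a version that returns the count and the indices instead of printing them could look roughly like the sketch below. It is one possible approach, not the exercise's official solution: it takes the rows as a second argument (so the call differs from the fixed counter("TCA") line in the question), it searches inside every field of every row, and the sample data is invented.

```python
import csv
import io


def counter(substring, rows):
    """Return (count, positions); positions holds (row, field, index) tuples."""
    positions = []
    for row_number, row in enumerate(rows):
        for field_number, field in enumerate(row):
            start = field.find(substring)
            while start != -1:
                positions.append((row_number, field_number, start))
                # Step by one so overlapping matches are counted too.
                start = field.find(substring, start + 1)
    return len(positions), positions


# Hypothetical sample standing in for text.csv.
sample = io.StringIO("ATCATCA,GGTCA\nTTTT,TCA\n")
count, positions = counter("TCA", csv.reader(sample))
print(count, positions)
```

Returning the values rather than printing them lets the caller decide how to display or reuse the result.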

Finding the index of a substring in a CSV file
