Question

I want to delete rows in a CSV file if a field partially matches another field.

For example:

I want only one entry for each of these three. It should return only:

serial       book name                     author     

1.          Ramakrishna Kathamrita Vol1     Sri M     
2.          Ramakrishna Kathamrita Vol2     Sri M     
3.          Ramakrishna Kathamrita Vol3     Sri M

Is there any way we can do this in Python?

Edit: (29-12-2017 17:05)

Sorry for not being clear.

We can set the following criteria.

If the book name contains n words, then at least the first n-1 words should match.
If the first n-1 words match, the row is deleted after asking the user.
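
As a rough sketch of this criterion: the function name `shares_prefix` is hypothetical, and it assumes words are separated by whitespace and compared case-sensitively.

```python
def shares_prefix(name_a: str, name_b: str) -> bool:
    """Return True if the first n-1 words of name_a equal the leading words of name_b.

    n is the number of words in name_a.
    """
    words_a = name_a.split()
    words_b = name_b.split()
    prefix_len = len(words_a) - 1
    if prefix_len < 1:
        # A single-word (or empty) name has no prefix to compare.
        return False
    return words_a[:prefix_len] == words_b[:prefix_len]


print(shares_prefix("Ramakrishna Kathamrita Vol1", "Ramakrishna Kathamrita Vol2"))
print(shares_prefix("Ramakrishna Kathamrita Vol1", "Some Other Book"))
```

The first call compares "Ramakrishna Kathamrita" on both sides, so that pair would be a candidate for the y/n prompt.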

The idea is clear:

We may also get the word count:

my_string1 = "Ramakrishna Kathamrita Vol1"
my_string2 = "Ramakrishna Kathamrita Vol2"

# Split each book name into words.
splitted1 = my_string1.split()
splitted2 = my_string2.split()

# If the first two words match, ask the user whether to delete the row.
if splitted1[0] == splitted2[0] and splitted1[1] == splitted2[1]:
    answer = input("Delete the row? (y/n) ")

Now, how do we implement this 1) for a CSV file and 2) deleting the row when asked?

Answer 1

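If a field partially matches another field.

You can use a string distance algorithm. The StringDist module may be useful, but you need to define your criterion for similarity.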
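
Here is a minimal sketch of that idea. It uses `difflib.SequenceMatcher` from the standard library rather than StringDist, so nothing extra needs installing. The function names, the 0.97 threshold, the `confirm` callback and the sample rows are all assumptions for illustration, not something taken from your data.

```python
import csv
import io
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1], computed case-insensitively."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def drop_similar_rows(rows, column, threshold=0.8, confirm=None):
    """Keep a row unless it resembles an already-kept row and the user agrees to drop it."""
    if confirm is None:
        # Default: ask on the console and treat only 'y' as consent.
        def confirm(kept, candidate):
            prompt = f"{candidate!r} looks like {kept!r}. Delete it? (y/n) "
            return input(prompt).strip().lower() == "y"

    kept_rows = []
    for row in rows:
        value = row[column]
        duplicate = any(
            similarity(kept[column], value) >= threshold and confirm(kept[column], value)
            for kept in kept_rows
        )
        if not duplicate:
            kept_rows.append(row)
    return kept_rows


# Hypothetical sample data standing in for the CSV file.
sample = io.StringIO(
    "book name,author\n"
    "Ramakrishna Kathamrita Vol1,Sri M\n"
    "Ramakrishna Kathamrita Vol 1,Sri M\n"
    "Ramakrishna Kathamrita Vol2,Sri M\n"
)
rows = list(csv.DictReader(sample))

# Auto-confirm for the demo; leave out `confirm` to be prompted with y/n instead.
cleaned = drop_similar_rows(rows, "book name", threshold=0.97,
                            confirm=lambda kept, candidate: True)
for row in cleaned:
    print(row["book name"])
```

For a real file you would replace `io.StringIO(...)` with `open(path, newline="")` and write `cleaned` back out with `csv.DictWriter`. The threshold is the part you have to tune: set it too low and Vol1 and Vol2 will be flagged as duplicates of each other, which is why keeping the y/n prompt is probably a good idea.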
