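This is adapted from an online dataset of "consumer complaints". The data was modified in Excel and Notepad++. That manipulation left an "extra" quotation mark directly after each "index number" [1,2,3 ...] that follows the string "VALUES(X)". I only want to remove this extra quote while keeping the sequential index numbers, which range from one to five digits. This is preparation for use with a proprietary database with 1.35 million lines of code.
This clumsy adaptation of a regex will "find" a string that includes the quote, but the "replace" code that keeps the index number eludes me. Any help would be greatly appreciated.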
REGEX
\s\(([0-9])",|\s\(([0-9][0-9])",|\s\(([0-9][0-9][0-9])",|\s\(([0-9][0-9][0-9][0-9])",|\s\(([0-9][0-9][0-9][0-9][0-9])",
DATA STRINGS
INSERT INTO Complaints VALUES (1","2013-07-29","consumer loan","managing the loan or lease","Wells Fargo & Company","VA","24540","phone","2013-07-30","closed with explanation","468882");
INSERT INTO Complaints VALUES (2","2013-07-29","bank account or service","using a debit or ATM card","Wells Fargo & Company","CA","95992","web","2013-07-31","closed with explanation","468889");
INSERT INTO Complaints VALUES (3","2013-07-29","bank account or service","account opening, closing, or management","Santander Bank US","NY","10065","fax","2013-07-31","closed","468879");
Answer 0 (score: 1)
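Find:
VALUES \((\d+)"
The inner parentheses capture the digits (\d), matched one or more times (+), up to the " that follows.
Replace with:
VALUES \($1
Here $1 is the corresponding captured value, so the index number is kept and the extra quote is dropped.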
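If the file is too large to edit comfortably in Notepad++, the same find-and-replace can be sketched in Python. This is only an illustration, not part of the original answer: the function name strip_extra_quote and the shortened sample line are made up for the example, and the {1,5} bound simply reflects the one-to-five-digit range mentioned in the question.

```python
import re

# "VALUES (" is captured as group 1 and the 1-5 digit index as group 2;
# the stray quote after the digits is matched but not captured.
EXTRA_QUOTE = re.compile(r'(VALUES \()(\d{1,5})"')


def strip_extra_quote(line: str) -> str:
    """Drop the quote that follows the index number, keeping the number."""
    # Only the first match per line is replaced, since each INSERT
    # statement has a single index number.
    return EXTRA_QUOTE.sub(r"\1\2", line, count=1)


# Hypothetical shortened sample in the same shape as the question's data.
sample = 'INSERT INTO Complaints VALUES (1","2013-07-29","consumer loan");'
print(strip_extra_quote(sample))
# prints: INSERT INTO Complaints VALUES (1,"2013-07-29","consumer loan");
```

Lines that do not contain the pattern are returned unchanged, so the function can be applied to every line of the file. The five-way alternation in the question could likewise be collapsed to a single \d{1,5} group.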
Answer 1 (score: 1)