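I have an SSIS package that imports data from a .csv file. Every entry in this file uses a double quote (") as the text qualifier, but double quotes also appear inside the values. I have also set a comma (,) as the column delimiter. I can't give you the original data I'm working with, but here is a sample of how my data arrives in the Flat File Source: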
"ID-1","A "B"", C, D, E","Today"
"ID-2","A, B, C, D, E,F","Yesterday"
"ID-3","A and nothing else","Today"
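As you can see, the second column can contain quotes (and commas), which break my SSIS import and raise an error pointing at that row. I'm not really familiar with regular expressions, but I've heard they might help in this case.
In my view, I need to replace all double quotes (") with single quotes ('), except for ...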
","
Can you guys help me out? That would be great!
Thanks in advance!
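Answer 0 (score: 1)
To replace double quotes with single quotes according to your specification, use this simple regex. It tolerates whitespace at the beginning and/or end of a line.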
string pattern = @"(?<!^\s*|,)""(?!,""|\s*$)";
string resultString = Regex.Replace(subjectString, pattern, "'", RegexOptions.Multiline);
Here is an explanation of the pattern:
// (?<!^\s*|,)"(?!,"|\s*$)
//
// Options: ^ and $ match at line breaks
//
// Assert that it is impossible to match the regex below with the match ending at this position (negative lookbehind) «(?<!^\s*|,)»
//    Match either the regular expression below (attempting the next alternative only if this one fails) «^\s*»
//       Assert position at the beginning of a line (at beginning of the string or after a line break character) «^»
//       Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s*»
//          Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
//    Or match regular expression number 2 below (the entire group fails if this one fails to match) «,»
//       Match the character “,” literally «,»
// Match the character “"” literally «"»
// Assert that it is impossible to match the regex below starting at this position (negative lookahead) «(?!,"|\s*$)»
//    Match either the regular expression below (attempting the next alternative only if this one fails) «,"»
//       Match the characters “,"” literally «,"»
//    Or match regular expression number 2 below (the entire group fails if this one fails to match) «\s*$»
//       Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s*»
//          Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
//       Assert position at the end of a line (at the end of the string or before a line break character) «$»
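To see what the pattern does on the sample rows from the question, here is a minimal sketch as a small console program. The class name QuoteFixDemo and the helper FixQuotes are hypothetical names chosen for illustration; only the pattern and the Regex.Replace call come from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuoteFixDemo
{
    // Same pattern as in the answer: a double quote that is neither an
    // opening qualifier (at line start or right after a comma) nor a
    // closing qualifier (right before ," or at line end).
    private const string Pattern = @"(?<!^\s*|,)""(?!,""|\s*$)";

    // Hypothetical helper: replaces every matching quote with a single quote.
    public static string FixQuotes(string text)
    {
        return Regex.Replace(text, Pattern, "'", RegexOptions.Multiline);
    }

    public static void Main()
    {
        // The three sample rows from the question, separated by line breaks.
        string input =
            "\"ID-1\",\"A \"B\"\", C, D, E\",\"Today\"\n" +
            "\"ID-2\",\"A, B, C, D, E,F\",\"Yesterday\"\n" +
            "\"ID-3\",\"A and nothing else\",\"Today\"";

        Console.WriteLine(FixQuotes(input));
        // "ID-1","A 'B'', C, D, E","Today"
        // "ID-2","A, B, C, D, E,F","Yesterday"
        // "ID-3","A and nothing else","Today"
    }
}
```

Only the three quotes inside the second field of the first row are rewritten. Note that the pattern decides purely by the neighbouring characters, so a quote inside a value that directly follows a comma, or directly precedes ,", would be taken for a qualifier and left untouched.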
Answer 1 (score: 0)
Answer 2 (score: 0)
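When loading a CSV that contains both double quotes and commas, there is a limitation: extra double quotes get added, and the data itself also ends up enclosed in double quotes, which you can check in the preview of the source file. So add a Derived Column task and give it the following expression:
REPLACE(REPLACE(REPLACE(RIGHT(SUBSTRING(TRIM(COL2),1,LEN(COL2) - 1),LEN(COL2) - 2)," ","@"),"\"\"","\""),"@"," ")
The bold part removes the double quotes that enclose the data.
Try this and let me know if it helps.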
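For readers who find the nested expression hard to follow, here is a rough C# sketch of what it appears to be doing: drop the first and last character of the trimmed value, then collapse doubled quotes. QualifierStripDemo and StripQualifiers are hypothetical names, not part of SSIS, and the sketch assumes the value really is wrapped in one pair of quotes. It leaves out the temporary "@" substitution from the expression.

```csharp
using System;

public static class QualifierStripDemo
{
    // Hypothetical helper mirroring the derived column expression:
    // removes the enclosing pair of characters and turns "" into ".
    // Assumption: the first and last characters are the qualifiers;
    // they are dropped unconditionally, as in the expression.
    public static string StripQualifiers(string raw)
    {
        string value = raw.Trim();
        if (value.Length < 2)
        {
            return value; // too short to carry a pair of qualifiers
        }

        // Equivalent of RIGHT(SUBSTRING(...)): keep everything except
        // the first and the last character.
        string inner = value.Substring(1, value.Length - 2);

        // Equivalent of REPLACE(..., "\"\"", "\""): unescape doubled quotes.
        return inner.Replace("\"\"", "\"");
    }

    public static void Main()
    {
        // Raw value: "A ""B"", C, D, E"
        Console.WriteLine(StripQualifiers("\"A \"\"B\"\", C, D, E\""));
        // A "B", C, D, E
    }
}
```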
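Answer 3 (score: 0)
Use " as the text qualifier for the CSV destination.
Before inserting the values into the CSV destination, add a Derived Column expression:
REPLACE(REPLACE([Column1],",",""),"\"","")
This will keep the " in your text field.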