CSV file content, replace using RegEx

Asked: 2017-07-13 11:51:28

Tags: csv coldfusion coldfusion-9 cfml lucee

I have a CSV file that contains rows like the following:

"Jakins, Ann-Margaret",Ms.,Ann-Margaret, ,Jakins,Ms. Ann-Margaret Jakins,""Callawera"Property""Callawera"Property""allawera",Thallon,4497,Australia,Queensland

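Is there a way to remove the quotes inside the field "Callawera Property Callawera Property allawera", that is, the ones between the two enclosing quotes? Is there a regular expression that selects what lies between the two enclosing quotes, so that the result looks like this: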

"Jakins, Ann-Margaret",Ms.,Ann-Margaret, ,Jakins,Ms. Ann-Margaret Jakins,"Callawera Property Callawera Property allawera",Thallon,4497,Australia,Queensland

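2 answers:

Answer 0 (score: 1)

If the whole file has already been assembled, you cannot fix quotes that sit between two enclosing quotes. No regular expression (or even a human) can consistently and reliably determine the intended boundaries of every field in the CSV.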
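To illustrate why the boundaries cannot be recovered, here is a minimal sketch. It assumes the file was produced by wrapping each value in quotes without escaping, which is a guess about how the broken file was written. The helper naiveCsvLine is hypothetical and exists only for this demonstration. Two different sets of fields produce exactly the same line, so nothing in the text tells a parser which one was meant.

```cfml
<cfscript>
// Hypothetical helper: wraps each value in quotes WITHOUT escaping
// embedded quotes, mimicking how a malformed CSV might have been written.
function naiveCsvLine(required array fieldValues) {
    var parts = [];
    var i = 0;
    for (i = 1; i lte arrayLen(arguments.fieldValues); i = i + 1) {
        arrayAppend(parts, '"' & arguments.fieldValues[i] & '"');
    }
    return arrayToList(parts, ",");
}

// One field whose value itself contains quote-comma-quote ...
oneField = ['a","b'];
// ... versus two separate fields.
twoFields = ["a", "b"];

lineFromOne = naiveCsvLine(oneField);
lineFromTwo = naiveCsvLine(twoFields);

// compare() returns 0 when the two strings are identical.
writeOutput("Identical lines: " & (compare(lineFromOne, lineFromTwo) eq 0));
</cfscript>
```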

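You can pull tricks with a RegEx that happen to work for a limited data set, but they will not generally work for every possible data set.

So you have to fix this when the file is assembled. For each field, replace every " (double quote) with "" (two double quotes) and make sure the whole field is enclosed in double quotes.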
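The escaping rule above can be sketched in CFML as follows. This is a minimal illustration, not the answerer's code: csvQuoteField and csvLine are hypothetical helper names, and the sample values are made up to resemble the row in the question.

```cfml
<cfscript>
// Hypothetical helper: double every embedded double quote,
// then wrap the whole field in double quotes.
function csvQuoteField(required string fieldValue) {
    return '"' & replace(arguments.fieldValue, '"', '""', "all") & '"';
}

// Hypothetical helper: build one CSV line from an array of raw values.
function csvLine(required array fieldValues) {
    var quoted = [];
    var i = 0;
    for (i = 1; i lte arrayLen(arguments.fieldValues); i = i + 1) {
        arrayAppend(quoted, csvQuoteField(arguments.fieldValues[i]));
    }
    return arrayToList(quoted, ",");
}

sampleRow = ["Jakins, Ann-Margaret", 'The "Callawera" Property', "Thallon"];

// Prints: "Jakins, Ann-Margaret","The ""Callawera"" Property","Thallon"
writeOutput(csvLine(sampleRow));
</cfscript>
```

Quoting every field, even those without commas or quotes, keeps the writer simple and is accepted by common CSV parsers.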

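Answer 1 (score: 0)

CSV is a somewhat non-standard format. Excel can read files that contain double quotes and carriage returns, but it often fails to export them correctly. ColdFusion has the same problem, unless you use opencsv, a CSV (comma-separated values) parser library for Java.

If you have not used Java before, opencsv can be a bit complicated. I wrote a demo script here: https://gist.github.com/JamoCA/6062864

Note: If you are on Windows, CSVed is a powerful CSV file editor that can handle any CSV file with any delimiter.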

<!---
Convert CSV file to a ColdFusion query object using opencsv.
    Requirements:
    - ColdFusion 8+  ( http://en.wikipedia.org/wiki/Adobe_ColdFusion )
    - opencsv - free parser library for Java ( http://opencsv.sourceforge.net/ )
        opencsv supports all the basic csv-type things you're likely to want to do:
        - Arbitrary numbers of values per line
        - Ignoring commas in quoted elements
        - Handling quoted entries with embedded carriage returns (ie entries that span multiple lines)
        - Configurable separator and quote characters (or use sensible defaults)
        - Read all the entries at once, or use an Iterator style model
        - Creating csv files from String[] (ie. automatic escaping of embedded quote chars)

NOTE: To use opencsv in ColdFusion, make the jar available in one of these ways:
    - Copy "opencsv-2.3.jar" to the "ColdFusion Class Path" (in CFAdmin > Server Settings > Java and JVM)
    - (ColdFusion 10+) Specify a custom Java library path in Application.cfc, without dynamic loading
      http://help.adobe.com/en_US/ColdFusion/10.0/Developing/WSe61e35da8d318518-106e125d1353e804331-7ffe.html
    - Use JavaLoader http://javaloader.riaforge.org/
--->

<!--- Configure CSV file & delimiter --->
<cfset CSVFile = "c:\sampleCSVFile.csv">
<cfset Delimiter = ",">

<!--- Read file using opencsv --->
<cfscript>
fileReader = createobject("java","java.io.FileReader");
fileReader.init(CSVFile);
csvReader = createObject("java","au.com.bytecode.opencsv.CSVReader");
csvReader.init(fileReader, Delimiter);
ArrData = csvReader.readAll();
csvReader.close();
fileReader.close();
</cfscript>

<!--- Determine if any records exist --->
<cfif not arraylen(ArrData)>
     <p>No data in file.</p>
     <cfexit>
<!--- Row 1 holds the column names, so fewer than 2 rows means there are no records --->
<cfelseif arraylen(ArrData) lt 2>
     <p>No records.</p>
     <cfexit>
</cfif>

<!--- Convert 2 dimensional array of rows & columns to a ColdFusion query --->
<cfscript>
GetResults = QueryNew(ArrayToList(ArrData[1]));
Rows = arraylen(ArrData);
Fields = arraylen(ArrData[1]);
for(thisRow=2; thisRow lte Rows; thisRow = thisRow + 1){
    queryaddrow(GetResults);
    for(thisField=1; thisField lte Fields; thisField = thisField + 1){
        QuerySetCell(GetResults, ArrData[1][thisfield], ArrData[thisRow][thisfield]);
    }
}
</cfscript>

<cfsetting enablecfoutputonly="No">
<cfdump var="#GetResults#">
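The script above only reads a CSV file. For the writing side, the opencsv feature list mentions creating CSV output from a String[] with automatic escaping of embedded quote characters. A minimal sketch of that follows. It assumes the same opencsv 2.3 jar is on the class path and uses its CSVWriter class with default settings. The sample values are made up, and the result is written to an in-memory StringWriter rather than to a file.

```cfml
<cfscript>
// Collect the CSV output in memory (a java.io.FileWriter could be used for a real file).
stringWriter = createObject("java", "java.io.StringWriter").init();

// Assumes opencsv 2.3 is on the class path, as in the reader script above.
csvWriter = createObject("java", "au.com.bytecode.opencsv.CSVWriter").init(stringWriter);

// Made-up sample row: one value has a comma, one has embedded double quotes.
sampleRow = ["Jakins, Ann-Margaret", 'The "Callawera" Property', "Thallon"];

// writeNext expects a Java String[], so cast the ColdFusion array.
csvWriter.writeNext(javaCast("string[]", sampleRow));
csvWriter.close();

// With default settings the embedded quotes should come out doubled.
writeOutput(htmlEditFormat(stringWriter.toString()));
</cfscript>
```

Letting the library quote and escape each field avoids hand-written string replacement and keeps reading and writing symmetric.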