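Remove unwanted commas from a CSV

Time: 2013-01-17 16:08:25

Tags: csv

For example, given "2,881,423", how can I remove the commas? I have millions of rows to process. Can this be done as a batch operation? I can use any tool, on either PC or Mac.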

"Position","Value",
"1","1",
"2","1",
"3","1",
"4","2",
"5","2",

...

"2,881,423","19",
"2,881,424","22",
"2,881,425","23",
"2,881,426","23",
"2,881,427","25",
"2,881,428","25",
"2,881,429","25",

...

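The above is a sample of the CSV content.

3 answers:

Answer 0 (score: 2)

The following code will do the job. It loops through all files in a folder that match a given mask: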

Sub RemoveCommas()

Dim RegX_Comma As Object
'
Dim FileStream As Object
Dim FileContent As String
Dim SourceFolder As String
Dim FileName As String
'
Set RegX_Comma = CreateObject("VBScript.RegExp")
RegX_Comma.Pattern = "(\d),(?=\d)" 'Comma between any digits (VBScript.RegExp has no lookbehind, so the leading digit is captured instead)
RegX_Comma.IgnoreCase = True
RegX_Comma.Global = True

Set FileStream = CreateObject("ADODB.Stream")
SourceFolder = "D:\DOCUMENTS\" 'Must be specified with trailing "\"

FileName = Dir(SourceFolder & "*.txt") 'Specify ANY mask using wildcards, e.g. "*.csv"
Do While FileName <> ""

    FileStream.Open
    FileStream.Charset = "ASCII" 'Change encoding as required
    FileStream.LoadFromFile (SourceFolder & FileName)
    FileContent = RegX_Comma.Replace(FileStream.ReadText, "$1") 'Keep the captured digit, drop the comma
    FileStream.Position = 0
    FileStream.WriteText FileContent
    FileStream.SetEOS
    FileStream.SaveToFile SourceFolder & FileName, 2 'Will overwrite the existing file
    FileStream.Close

FileName = Dir
Loop

End Sub

Adjust the code as needed, following the inline comments.

Good luck!
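
For readers on a Mac, where VBA with ADODB is not available, the same idea (remove every comma that sits between two digits, for each file matching a mask) can be sketched in Python. This is only an illustration, not part of the original answer: the names `strip_digit_commas` and `process_folder`, the default mask, and the default encoding are all assumptions. Like the regex approach above, it relies on the fields being quoted as in the sample; two adjacent unquoted numeric fields such as 1,2 would be merged.

```python
import re
from pathlib import Path

# A comma with a digit on both sides, e.g. the two commas in 2,881,423
DIGIT_COMMA = re.compile(r"(?<=\d),(?=\d)")


def strip_digit_commas(text: str) -> str:
    """Remove every comma that sits directly between two digits."""
    return DIGIT_COMMA.sub("", text)


def process_folder(folder: str, mask: str = "*.csv", encoding: str = "ascii") -> int:
    """Rewrite each matching file in place and return how many files were processed.

    Hypothetical helper: folder, mask and encoding are placeholders to adjust.
    """
    count = 0
    for path in sorted(Path(folder).glob(mask)):
        # newline="" keeps the file's original line endings untouched
        with open(path, encoding=encoding, newline="") as src:
            original = src.read()
        with open(path, "w", encoding=encoding, newline="") as dst:
            dst.write(strip_digit_commas(original))
        count += 1
    return count
```

Because the files are overwritten in place, it is safer to try this on a copy of the folder first.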

Answer 1 (score: 2)

In Python:

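import csv

# Python 3: open in text mode; newline="" lets the csv module handle line endings itself
with open("myfile.csv", newline="") as infile, open("output.csv", "w", newline="") as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    for row in reader:
        # A quoted field such as "2,881,423" arrives as one item, so only commas inside values are removed
        writer.writerow([item.replace(",", "") for item in row])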
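
To see what this approach does to the rows from the question, here is a small in-memory sketch. The function name `strip_commas` and the `sample` string are assumptions made for illustration; the logic is the same reader/writer loop as above.

```python
import csv
import io

# A few lines in the same shape as the question's file (note the trailing comma)
sample = '"Position","Value",\n"2,881,423","19",\n"2,881,424","22",\n'


def strip_commas(src, dst):
    """Copy CSV rows from src to dst, removing commas inside each field."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        writer.writerow([item.replace(",", "") for item in row])


out = io.StringIO()
strip_commas(io.StringIO(sample), out)
print(out.getvalue())
# Position,Value,
# 2881423,19,
# 2881424,22,
```

Note that the writer's default quoting drops quotes that are no longer needed. If the output should stay fully quoted, passing `quoting=csv.QUOTE_ALL` to `csv.writer` is one option.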

Answer 2 (score: 0)

Since your goal is to use the data in R, you can do the replacement after reading the data into R:

df <- read.csv("Path/To/File.csv")
df$varname <- as.numeric(gsub(",", "", df$varname))

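where df is your data frame and varname is the name of the variable. This does not check whether a comma sits between two digits, so make sure you only pass it variables that you want to be numeric, not string columns where the commas are actually part of the data.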
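
If you would rather have that check made automatically, one possible guard is to convert a value only when it looks like a number with thousands separators. The sketch below is in Python rather than R and is purely illustrative: the pattern and the name `to_number_if_grouped` are assumptions, and it handles integers only.

```python
import re

# Optional sign, 1-3 digits, then one or more groups of ",ddd"
THOUSANDS = re.compile(r"[+-]?\d{1,3}(,\d{3})+")


def to_number_if_grouped(value: str):
    """Return an int when value looks like 2,881,423; otherwise return value unchanged."""
    if THOUSANDS.fullmatch(value):
        return int(value.replace(",", ""))
    return value
```

With this guard, a value such as "Smith, John" or "1,2" is left alone, while "2,881,423" becomes the integer 2881423.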

Here is a similar question that asks how to solve this from within R:

How to read data when some numbers contain commas as thousand separator?