For example, I have "2,881,423"; how do I remove the "," from it? I have millions of rows to process. Is there a way to do this in bulk? I can use any tool on PC or Mac.
"Position","Value",
"1","1",
"2","1",
"3","1",
"4","2",
"5","2",
...
"2,881,423","19",
"2,881,424","22",
"2,881,425","23",
"2,881,426","23",
"2,881,427","25",
"2,881,428","25",
"2,881,429","25",
...
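The above is some of the content of the CSV.
Answer 0 (score: 2)
The following code will do the job. It loops over every file in a folder that matches a given mask: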
Sub RemoveCommas()
    Dim RegX_Comma As Object
    '
    Dim FileStream As Object
    Dim FileContent As String
    Dim SourceFolder As String
    Dim FileName As String
    '
    Set RegX_Comma = CreateObject("VBScript.RegExp")
    'Comma between any digits. VBScript.RegExp has no lookbehind,
    'so the leading digit is captured and restored with "$1".
    RegX_Comma.Pattern = "(\d),(?=\d)"
    RegX_Comma.IgnoreCase = True
    RegX_Comma.Global = True
    Set FileStream = CreateObject("ADODB.Stream")
    SourceFolder = "D:\DOCUMENTS\" 'Must be specified with trailing "\"
    FileName = Dir(SourceFolder & "*.txt") 'Specify ANY mask using wildcards, e.g. "*.csv"
    Do While FileName <> ""
        FileStream.Open
        FileStream.Charset = "ASCII" 'Change encoding as required
        FileStream.LoadFromFile (SourceFolder & FileName)
        FileContent = RegX_Comma.Replace(FileStream.ReadText, "$1")
        FileStream.Position = 0
        FileStream.WriteText FileContent
        FileStream.SetEOS
        FileStream.SaveToFile SourceFolder & FileName, 2 'Will overwrite the existing file
        FileStream.Close
        FileName = Dir
    Loop
End Sub
Modify the code as needed, following the inline comments.
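Since `VBScript.RegExp` and `ADODB.Stream` are Windows components, here is a rough sketch of the same "comma between two digits" rule in Python for anyone on a Mac. The names `THOUSANDS_COMMA` and `strip_thousands_commas` are made up for illustration. It assumes the numbers are quoted as in the sample; in a file with unquoted numeric fields such as `1,2` it would merge neighbouring fields.

```python
import re

# A comma that has a digit immediately before and immediately after it.
# Python's re module supports the lookbehind, so no capture group is needed.
THOUSANDS_COMMA = re.compile(r"(?<=\d),(?=\d)")


def strip_thousands_commas(text):
    """Return text with every digit,digit comma removed."""
    return THOUSANDS_COMMA.sub("", text)


sample = '"2,881,423","19",'
print(strip_thousands_commas(sample))  # "2881423","19",
```

The commas that separate the quoted fields are left alone because they sit between quote characters, not digits.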
Good luck!
Answer 1 (score: 2)
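In Python:
import csv

# Python 3: open CSV files in text mode with newline="" (the "rb"/"wb" modes were for Python 2)
with open("myfile.csv", newline="") as infile, open("output.csv", "w", newline="") as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    for row in reader:
        # Strip every comma inside each field; the commas separating fields are untouched
        writer.writerow([item.replace(",", "") for item in row])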
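If you want the output to keep its double quotes, a variation along these lines might help. It is only a sketch: `clean_csv` and the file paths are hypothetical names, and it assumes commas should be stripped from every field. It reads and writes one row at a time, so a file with millions of rows is never held in memory all at once.

```python
import csv


def clean_csv(src_path, dst_path):
    """Copy src_path to dst_path, removing commas inside every field."""
    with open(src_path, newline="") as infile, \
            open(dst_path, "w", newline="") as outfile:
        reader = csv.reader(infile)
        # QUOTE_ALL writes every field in double quotes, like the sample data.
        writer = csv.writer(outfile, quoting=csv.QUOTE_ALL)
        for row in reader:
            writer.writerow([item.replace(",", "") for item in row])
```

You would call it as `clean_csv("myfile.csv", "output.csv")`. One thing to be aware of: the trailing comma on each line of the sample is read as an empty last field, so with `QUOTE_ALL` it is written back as `""`.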
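Answer 2 (score: 0)
Since your goal is to use the data in R, you can do the replacement after reading the data into R:
df <- read.csv("Path/To/File.csv", stringsAsFactors = FALSE)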
df$varname <- as.numeric(gsub(",", "", df$varname))
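where df is your data frame and varname is the name of the variable. This does not check whether the comma sits between two digits, so make sure you only pass it variables that you want as numbers, not string columns in which the comma is genuinely part of the data.
Here is a similar question that asks how to solve this from within R: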
How to read data when some numbers contain commas as thousand separator?