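I have an Excel sheet in which one column holds a lot of data in the form of Python dictionaries that came from a SQL database. I don't have access to the original database, and because the order of the keys/values differs on every row of the CSV, I can't use the LOAD DATA LOCAL INFILE command to import the CSV back into SQL. When I export the Excel sheet to CSV, I get: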
"{""first_name"":""John"",""last_name"":""Smith"",""age"":30}"
"{""first_name"":""Tim"",""last_name"":""Johnson"",""age"":34}"
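What is the best way to remove the " before and after the curly braces, as well as the extra " around the keys/values?
I also need to leave the integers, which have no quotes, alone.
I tried importing this into Python with the json module so that I can print specific keys, but I can't import the rows while they contain the doubled double quotes. Ultimately I need the data saved in a file that looks like this: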
{"first_name":"John","last_name":"Smith","age":30}
{"first_name":"Tim","last_name":"Johnson","age":34}
Any help is much appreciated!
Answer 0 (score: 2)
Easy:
text = re.sub(r'"(?!")', '', text)
Given the input file TEST.TXT:
"{""first_name"":""John"",""last_name"":""Smith"",""age"":30}"
"{""first_name"":""Tim"",""last_name"":""Johnson"",""age"":34}"
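The script:
import re
# Read the whole file; the with-block closes it afterwards.
with open("TEST.TXT", "r") as f:
    text_in = f.read()
# Remove every double quote that is not followed by another double quote.
text_out = re.sub(r'"(?!")', '', text_in)
print(text_out)
produces the following output: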
{"first_name":"John","last_name":"Smith","age":30}
{"first_name":"Tim","last_name":"Johnson","age":34}
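Since the question mentions wanting to print specific keys with the json module, here is a small sanity check of the regex approach. It is only a sketch: the inline `raw` string is an assumption standing in for the contents of TEST.TXT, and `records` is a name introduced for illustration.

```python
import json
import re

# Sample rows standing in for the contents of TEST.TXT (assumed inline data).
raw = (
    '"{""first_name"":""John"",""last_name"":""Smith"",""age"":30}"\n'
    '"{""first_name"":""Tim"",""last_name"":""Johnson"",""age"":34}"\n'
)

# Drop every double quote that is not immediately followed by another one.
cleaned = re.sub(r'"(?!")', '', raw)

# Each non-empty line should now be a standalone JSON object.
records = [json.loads(line) for line in cleaned.splitlines() if line]

# Print specific keys; age is parsed as an int because it was never quoted.
for rec in records:
    print(rec["first_name"], rec["age"])
```

One caveat worth checking against real data: an empty string value, which the export would write as `""""`, comes out of this regex as `"""` and would no longer parse as JSON.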
Answer 1 (score: 2)
This should do it:
import re
# Copy old.csv to new.csv, stripping the unwanted quotes line by line.
with open('old.csv') as old, open('new.csv', 'w') as new:
    new.writelines(re.sub(r'"(?!")', '', line) for line in old)
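Answer 2 (score: 1)
I think you are overthinking this. Why not just replace the data?
l = list()
with open('foo.txt') as f:
    for line in f:
        # Collapse doubled quotes, then drop the quotes wrapping the braces.
        l.append(line.replace('""','"').replace('"{','{').replace('}"','}'))
s = ''.join(l)
print(s)  # or save it to a file
It produces: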
{"first_name":"John","last_name":"Smith","age":30}
{"first_name":"Tim","last_name":"Johnson","age":34}
A list is used to store the intermediate lines, followed by a call to .join, to improve performance, as explained in "Good way to append to a string".
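For the "save it to a file" variant, a minimal sketch could stream the cleaned lines straight into the output file. The helper name `clean_line`, the temporary directory, and the file name `foo_clean.txt` are assumptions made so the example runs on its own; they are not part of the answer.

```python
import os
import tempfile

def clean_line(line):
    """Undo the CSV quoting on one exported row (hypothetical helper)."""
    return line.replace('""', '"').replace('"{', '{').replace('}"', '}')

# Build a throwaway input file so the sketch is self-contained.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, 'foo.txt')
dst = os.path.join(workdir, 'foo_clean.txt')
with open(src, 'w') as f:
    f.write('"{""first_name"":""John"",""last_name"":""Smith"",""age"":30}"\n')
    f.write('"{""first_name"":""Tim"",""last_name"":""Johnson"",""age"":34}"\n')

# Stream line by line; writelines consumes the generator lazily,
# so no intermediate list or joined string is built.
with open(src) as fin, open(dst, 'w') as fout:
    fout.writelines(clean_line(line) for line in fin)

# Read the result back to show what was written.
with open(dst) as f:
    result = f.read()
print(result, end='')
```

Writing to a separate file keeps the original export untouched, which makes it easy to rerun the cleanup if a replacement turns out to be wrong.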
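Answer 3 (score: 1)
If the input file is as shown, and given the small size you mentioned, you can load the whole file into memory, do the replacements, and then save it. IMHO you don't need a regex for this. The simplest code to do it is:
# filename is assumed to hold the path of the exported CSV.
with open(filename) as f:
    text = f.read()
text = text.replace('""', '"')
text = text.replace('"{', '{')
text = text.replace('}"', '}')
# Overwrite the same file with the cleaned text.
with open(filename, "w") as f:
    f.write(text)
I tested it with the sample input and it produces: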
{"first_name":"John","last_name":"Smith","age":30}
{"first_name":"Tim","last_name":"Johnson","age":34}
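which is exactly what you want.
If you prefer, you can also pack the code together and write:
with open(inputFilename) as fin:
    with open(outputFilename, "w") as fout:
        fout.write(fin.read().replace('""','"').replace('"{','{').replace('}"','}'))
But I think the first version is clearer, and both apply exactly the same replacements.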
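Answer 4 (score: 1)
You can actually use the csv module and a regex to do this:
st='''\
"{""first_name"":""John"",""last_name"":""Smith"",""age"":30}"
"{""first_name"":""Tim"",""last_name"":""Johnson"",""age"":34}"\
'''
import csv, re
data=[]
# st is a plain string, so csv.reader iterates over it one character at a time.
reader=csv.reader(st, dialect='excel')
for line in reader:
    data.extend(line)
# Wrap every word (numbers included) in double quotes.
s=re.sub(r'(\w+)',r'"\1"',''.join(data))
# Put each {...} object on its own line.
s=re.sub(r'({[^}]+})',r'\1\n',s).strip()
print(s)
This prints (note that the integer values end up quoted as well):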
{"first_name":"John","last_name":"Smith","age":"30"}
{"first_name":"Tim","last_name":"Johnson","age":"34"}
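If the integers need to stay unquoted, as the question asks, one alternative is to hand csv.reader whole lines and let json do the parsing. This is a sketch of a different technique, not the method used in the answer above; the inline `st` data and the names `rows`, `records`, and `out` are assumptions for illustration.

```python
import csv
import json

# Same sample rows as in the question, one exported CSV row per line.
st = (
    '"{""first_name"":""John"",""last_name"":""Smith"",""age"":30}"\n'
    '"{""first_name"":""Tim"",""last_name"":""Johnson"",""age"":34}"\n'
)

# Feed csv.reader a list of lines; each row is then a single field whose
# doubled quotes have already been collapsed by the csv module.
rows = [row[0] for row in csv.reader(st.splitlines(), dialect='excel') if row]

# Parse each field as JSON so that individual keys can be read.
records = [json.loads(field) for field in rows]

# Re-serialize compactly, one object per line, keeping integers unquoted.
out = '\n'.join(json.dumps(rec, separators=(',', ':')) for rec in records)
print(out)
```

Because the unquoting is done by the csv module rather than by string substitution, values that themselves contain quotes or braces should survive, though that is worth confirming on the real export.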