I am trying to understand exactly what the quoting and doublequote parameters of pandas.read_csv mean. Suppose I have the following data:
['name', 'age', 'position']
['tom', 14, 'vp']
['jared', 100, 'head, sales']
pandas has four quoting options:
QUOTE_MINIMAL (0) [default]
QUOTE_ALL(1)
QUOTE_NONNUMERIC(2)
QUOTE_NONE(3)
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html
If quotechar is ", how would each of the four options above interpret this data?
Answer 0 (score: 3)
You can test this yourself by writing the data out with the standard-library csv writer:
import csv

DATA = [
    ['name', 'age', 'position'],
    ['tom', 14, 'vp'],
    ['jared', 100, 'head, sales'],
]

# newline='' lets the csv module control the line endings itself.
with open('test_min.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    for row in DATA:
        writer.writerow(row)

with open('test_all.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, delimiter=',', quotechar='"', quoting=csv.QUOTE_ALL)
    for row in DATA:
        writer.writerow(row)

with open('test_nonnumeric.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, delimiter=',', quotechar='"', quoting=csv.QUOTE_NONNUMERIC)
    for row in DATA:
        writer.writerow(row)

# QUOTE_NONE never quotes, so the 'head, sales' row raises csv.Error
# because no escapechar is set; the first two rows are still written.
with open('test_quotenone.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, delimiter=',', quotechar='"', quoting=csv.QUOTE_NONE)
    try:
        for row in DATA:
            writer.writerow(row)
    except csv.Error as exc:
        print('QUOTE_NONE failed:', exc)
Here is what you will see:
QUOTE_NONE
name,age,position
tom,14,vp
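Note: the output is incomplete. The jared row contains the delimiter, so with QUOTE_NONE the writer needs an escapechar; without one it raises csv.Error and the row is never written.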
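To make the QUOTE_NONE case concrete, here is a minimal sketch that sets an escape character so the third row can be written. The backslash is an illustrative choice, not something the original answer specifies, and the data is written to an in-memory buffer instead of a file.

```python
import csv
import io

rows = [
    ['name', 'age', 'position'],
    ['tom', 14, 'vp'],
    ['jared', 100, 'head, sales'],
]

buffer = io.StringIO()
# Assumption: backslash as the escape character; any character that does
# not appear in the data would serve the same purpose.
writer = csv.writer(buffer, delimiter=',', quoting=csv.QUOTE_NONE, escapechar='\\')
writer.writerows(rows)

escaped_output = buffer.getvalue()
print(escaped_output)
```

With an escapechar set, the comma inside head, sales is written with the escape character in front of it instead of being quoted. Whoever reads the file back has to be given the same escapechar.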
QUOTE_NONNUMERIC
"name","age","position"
"tom",14,"vp"
"jared",100,"head, sales"
Note: 14 and 100 are not quoted.
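The question asks how these options are interpreted when reading, so the sketch below feeds the QUOTE_NONNUMERIC output back through the standard-library csv.reader. It uses csv.reader as a stand-in, not pandas.read_csv, whose parser may differ in details; the variable names are illustrative.

```python
import csv
import io

nonnumeric_text = (
    '"name","age","position"\n'
    '"tom",14,"vp"\n'
    '"jared",100,"head, sales"\n'
)

# With QUOTE_NONNUMERIC, csv.reader converts every unquoted field to float
# and leaves quoted fields as strings.
reader = csv.reader(io.StringIO(nonnumeric_text), quotechar='"',
                    quoting=csv.QUOTE_NONNUMERIC)
parsed_rows = list(reader)

for row in parsed_rows:
    print(row)
```

The quoted header fields stay strings, while the unquoted 14 and 100 come back as 14.0 and 100.0.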
QUOTE_MINIMAL
name,age,position
tom,14,vp
jared,100,"head, sales"
Note: only head, sales is quoted, because it is the only field that contains the delimiter.
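To see what the quoting option changes on the reading side, the following sketch parses the QUOTE_MINIMAL output twice with the standard-library csv.reader, once with QUOTE_MINIMAL and once with QUOTE_NONE. The helper parse is hypothetical and exists only for this illustration. pandas.read_csv accepts the same csv.QUOTE_* constants, but its behavior is not checked here.

```python
import csv
import io

minimal_text = (
    'name,age,position\n'
    'tom,14,vp\n'
    'jared,100,"head, sales"\n'
)

def parse(text, quoting):
    """Parse CSV text with the given quoting mode (illustrative helper)."""
    return list(csv.reader(io.StringIO(text), delimiter=',',
                           quotechar='"', quoting=quoting))

# QUOTE_MINIMAL: quotes are recognized, so the comma inside them is data.
minimal_rows = parse(minimal_text, csv.QUOTE_MINIMAL)
# QUOTE_NONE: quotes are ordinary characters, so the comma splits the field.
none_rows = parse(minimal_text, csv.QUOTE_NONE)

print(minimal_rows[2])
print(none_rows[2])
```

Under QUOTE_NONE the last row splits into four fields and the quote characters stay in the data, which is why a file written with quoting generally should not be read with QUOTE_NONE.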
QUOTE_ALL
"name","age","position"
"tom","14","vp"
"jared","100","head, sales"
Note: everything is quoted, regardless of type.
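The question also asks about doublequote, which the examples above do not exercise because none of the fields contains a quote character. The sketch below uses a made-up field with embedded quotes and the standard-library csv module. With doublequote=True (the default) an embedded quotechar is represented by doubling it. With doublequote=False the reader instead expects it to be preceded by an escapechar, assumed here to be a backslash.

```python
import csv
import io

# Hypothetical row whose second field contains the quote character.
row = ['jared', 'the "head" of sales']

# Writing with the defaults (QUOTE_MINIMAL, doublequote=True) doubles
# each embedded quote and wraps the field in quotes.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
doubled_line = buffer.getvalue()
print(doubled_line)

# Reading with the defaults collapses the doubled quotes again.
doubled_back = next(csv.reader(io.StringIO(doubled_line)))

# With doublequote=False, embedded quotes are marked by an escapechar.
escaped_line = 'jared,"the \\"head\\" of sales"\n'
escaped_back = next(csv.reader(io.StringIO(escaped_line),
                               doublequote=False, escapechar='\\'))
print(doubled_back, escaped_back)
```

Both inputs parse back to the same row; the two settings only differ in how a literal quote inside a quoted field is spelled in the file.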