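I have a CSV file in which all the columns end up packed into a single column: the values are wrapped in quotes and separated by commas.
The fields in each row are comma-separated, and two consecutive commas mean a value is missing. I want to split the columns on these delimiters. When a value is quoted, the commas inside the quotes should not be treated as separators, because the value is an address.
Here is a sample of the data (from the CSV; I converted it to a dictionary to show it):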
{'Store code,"Biz","Add","Labels","TotalSe","DirectSe","DSe","TotalVe","SeVe","MaVe","Totalac","Webact","Dions","Ps"': {0: ',,,,"Numsearching","Numsearchingbusiness","Numcatprod","Numview","Numviewed","Numviewed2","Numaction","Numwebsite","Numreques","Numcall"',
1: 'Nora,"Ora","Sgo, Mp, 2000",,111,44,33,121,1232,53411,4,5,,3',
2: 'mc11,"21 old","tjis that place, somewher, Netherlands, 2434",,3245,325,52454,3432,243,4353,343,23,23,18'}}
This is what I have tried so far, but I am a bit stuck:
disc = pd.read_csv('/content/gdrive/My Drive/blank/blank.csv',delimiter='",')
Answer 0 (score: 1)
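I used plain string operations to remove the " at both ends of every line and then convert each doubled "" into a single ". That gives a CSV that can be loaded with read_csv().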
import pandas as pd

with open('Sample - Sheet1.csv') as f1, open('temp.csv', 'w') as f2:
    for row in f1:
        row = row.strip()             # remove the trailing "\n"
        row = row[1:-1]               # remove the " on both ends
        row = row.replace('""', '"')  # convert "" into "
        f2.write(row + '\n')

df = pd.read_csv('temp.csv')
print(len(df.columns))
print(df)
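To check the idea without touching the file system, the same clean-up can be sketched in memory. The `raw` string below is a hypothetical, shortened stand-in for the file (each line wrapped in quotes, inner quotes doubled), and `unwrap_line` is an illustrative helper name, not part of pandas.

```python
import io

import pandas as pd

# Hypothetical sample imitating the file layout: every line is wrapped
# in quotes and the inner quotes are doubled.
raw = (
    '"Store code,""Biz"",""Add"",""Labels"",""TotalSe"""\n'
    '"Nora,""Ora"",""Sgo, Mp, 2000"",,111"\n'
    '"mc11,""21 old"",""tjis that place, somewher, Netherlands, 2434"",,3245"\n'
)


def unwrap_line(line):
    """Strip the outer quotes from one line and un-double the inner quotes."""
    line = line.rstrip("\r\n")
    # Only strip when the line really is wrapped in quotes.
    if len(line) >= 2 and line[0] == '"' and line[-1] == '"':
        line = line[1:-1]
    return line.replace('""', '"')


cleaned = "\n".join(unwrap_line(line) for line in raw.splitlines())
df = pd.read_csv(io.StringIO(cleaned))
print(df.shape)
print(df["Add"].tolist())
```

The guard on the outer quotes is a small addition over the file-based loop: a line that is not wrapped is left untouched instead of losing its first and last characters.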
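Another approach: read the file with the csv module, so that each line comes back as a single field, and write that field out as a plain string.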
import csv
import pandas as pd

with open('Sample - Sheet1.csv', newline='') as f1, open('temp.csv', 'w') as f2:
    reader = csv.reader(f1)
    for row in reader:
        f2.write(row[0] + '\n')  # the whole line is the first (and only) field

df = pd.read_csv('temp.csv')
print(len(df.columns))
print(df)
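The csv-module variant can also skip the temporary file by parsing twice in memory: once to unwrap each line, once to split the real columns. This is only a sketch; `raw_text` is a hypothetical, shortened sample in the same wrapped layout, and it assumes every non-empty line parses to exactly one outer field.

```python
import csv
import io

import pandas as pd

# Hypothetical sample: each line is one quoted CSV field that itself
# contains a complete CSV row.
raw_text = (
    '"Store code,""Biz"",""Add"",""Labels"",""TotalSe"""\n'
    '"Nora,""Ora"",""Sgo, Mp, 2000"",,111"\n'
    '"mc11,""21 old"",""tjis that place, somewher, Netherlands, 2434"",,3245"\n'
)

# First pass: csv.reader removes the outer quotes and un-doubles the inner
# ones, so row[0] is the original CSV line. Blank lines yield [] and are skipped.
outer = csv.reader(io.StringIO(raw_text))
inner_lines = [row[0] for row in outer if row]

# Second pass: let pandas parse the recovered lines as ordinary CSV.
df2 = pd.read_csv(io.StringIO("\n".join(inner_lines)))
print(len(df2.columns))
print(df2)
```

Compared with slicing off the first and last character by hand, this leaves the unquoting to the csv parser.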