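I have a CSV like the one below and need to split each row into multiple rows, based on the values from column 3 onward, so that they can be loaded into a database.
Due to restrictions, I can only use the csv module (import csv) for this, and that is where I am stuck: when I write an insert query, it does not pick up all the rows. It only takes the last record from each for loop and inserts that into the table.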
1,2,3,4,5
10,20,30,50
100,200,300,400
Intended logic:
column 4 in the table is 'y' if the value comes from column 3, else 'n'
Expected output:
1,2,3,y
1,2,4,n
1,2,5,n
10,20,30,y
10,20,50,n
100,200,300,y
100,200,400,n
Here is my code:
import csv
import os

# Test-new to clean.csv
fRead = open("clean.csv")
csv_r = csv.reader(fRead)

# skip the first two lines
leave = 0
for record in csv_r:
    if leave < 2:
        leave += 1
        continue

    # store the values of columns 3, 4 and 5 in a list
    JMU = []
    for t in [2, 3, 4]:
        if record[t] not in ["", "NA"]:
            JMU.append(record[t].strip())

    if len(JMU) == 0:
        pass
    else:
        # check if the name starts or ends with WRK
        isWRK = "Table"
        for data in JMU:
            print(data)
            if data[:3].lower() == "wrk" or data[-3:].lower() == "wrk":
                isWRK = "Work"
                print(isWRK)
            else:
                isWRK = "table"

        # check if column 2 value is "Yes" or "No"
        fourthColumn = "N"
        if record[2] not in ["", "NA"]:
            if record[3].strip().lower() == "no":
                fourthColumn = "I"
            else:
                fourthColumn = "N"

        for i in JMU:
            if record[2] == i:
                newRecord = [record[0], record[1], i, fourthColumn, isWRK]
            elif record[3] == i:
                newRecord = [record[0], record[1], i, "N", isWRK]
            else:
                newRecord = [record[0], record[1], i, "N", isWRK]

        # this runs once per input row, after the loop over JMU has finished
        print("insert into table (column_a,column_b,column_c,column_d,column_e) values (%s,%s,%s,%s,%s)" % (record[0], record[1], record[2], record[3], record[4]))

fRead.close()
Answer 0 (score: 1)
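I am assuming you want to keep the first two columns constant and create a new row for each subsequent number on the same input line.
Initially I came up with this one-liner awk command: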
$ cat data
1,2,3,4,5
10,20,30,50
100,200,300,400
$ awk -F, -v OFS=, '{for(i=3;i<=NF;i++) print $1, $2, $i, (i==3?"y":"n")}' data
1,2,3,y
1,2,4,n
1,2,5,n
10,20,30,y
10,20,50,n
100,200,300,y
100,200,400,n
Then I replicated it in Python using the csv module:
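import csv

with open('data', 'r') as f:
    reader = csv.reader(f)
    for row in reader:
        # convert every field to int (assumes purely numeric data)
        l = list(map(int, row))
        # emit one output row per value from column 3 onward
        for i in range(2, len(l)):
            print(l[0], l[1], l[i], 'y' if i == 2 else 'n', sep=',')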
Here is a sample run, which produces the same output as awk:
1,2,3,y
1,2,4,n
1,2,5,n
10,20,30,y
10,20,50,n
100,200,300,y
100,200,400,n