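Pad CSV values with zeros

Time: 2013-08-28 03:53:30

Tags: csv

I have this CSV file that I want to sort on fields 20 and 21. For example, the data in these fields looks like P1 and PK5. My challenge is that when I sort on these fields, the rows do not come out in the order I want. It seems I have to pad each of these fields to the length of the longest value in that field.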

OrderNum,MerrillRecipientID,CustomerClass,MerrillItemNum,PODTemplateID,GridCode,AetnaDocID,MemberID,FirstName,MI,LastName,Address1,Address2,Address3,City,State,Zip,Country,OEL,PalletNum,PckgNum,IMBCode,ProcDate
"M394993","M39499300010000001","0GH","3GH000503","PDP","BO","1011250","MEBB04CB","Name","","Name","address","","","City","SC","29170-2043","","*******AUTO**SCH 5-DIGIT 29033","P1","PK5","2031100094470495539729170204309","3GH000503","August 26, 2013"
"M394993","M39499300010000002","0GH","3GH000503","PDP","BO","1011572","MEBB07GB","Name","G","Name","address","","","City","SC","29020-2912","","*********AUTO**SCH 3-DIGIT 290","P1","PK1","3031100094470495580529020291210","3GH000503","August 26, 2013"
"M394993","M39499300010000003","0GH","3GH000503","PDP","BO","1011693","MEBB08MP","Name","B","Name","address","","","City","SC","29061-9447","","*********AUTO**SCH 3-DIGIT 290","P1","PK2","3031100094470495583729061944757","3GH000503","August 26, 2013"
"M394993","M39499300010000004","0GH","3GH000503","PDP","BO","1011751","MEBB097M","Name","A","Name","address","","","City","SC","29645-0433","","*************AUTO**3-DIGIT 296","P1","PK31","3031100094470495629629645043333","3GH000503","August 26, 2013"
"M394993","M39499300010000005","0GH","3GH000503","PDP","BO","1012075","MEBB0K4L","Name","E","Name","address","","","City","SC","29682-9634","","*************AUTO**3-DIGIT 296","P1","PK33","3031100094470495637929682963428","3GH000503","August 26, 2013"
"M394993","M39499300010000006","0GH","3GH000503","PDP","BO","1012437","MEBB0TWQ","Name","R","Name","address","","","City","SC","29505-3030","","*******AUTO**SCH 5-DIGIT 29501","P1","PK24","2031100094470495556429505303050","3GH000503","August 26, 2013"
"M394993","M39499300010000007","0GH","3GH000503","PDP","BO","1012750","MEBB0YJY","Name","L","Name","address","","","City","SC","29642-3006","","***********AUTO**5-DIGIT 29642","P1","PK38","2031100094470495567529642300601","3GH000503","August 26, 2013"

So from the data above, I need the file to look like this:

OrderNum,MerrillRecipientID,CustomerClass,MerrillItemNum,PODTemplateID,GridCode,AetnaDocID,MemberID,FirstName,MI,LastName,Address1,Address2,Address3,City,State,Zip,Country,OEL,PalletNum,PckgNum,IMBCode,ProcDate
"M394993","M39499300010000001","0GH","3GH000503","PDP","BO","1011250","MEBB04CB","Name","","Name","address","","","City","SC","29170-2043","","*******AUTO**SCH 5-DIGIT 29033","P1","PK05","2031100094470495539729170204309","3GH000503","August 26, 2013"
"M394993","M39499300010000002","0GH","3GH000503","PDP","BO","1011572","MEBB07GB","Name","G","Name","address","","","City","SC","29020-2912","","*********AUTO**SCH 3-DIGIT 290","P1","PK01","3031100094470495580529020291210","3GH000503","August 26, 2013"
"M394993","M39499300010000003","0GH","3GH000503","PDP","BO","1011693","MEBB08MP","Name","B","Name","address","","","City","SC","29061-9447","","*********AUTO**SCH 3-DIGIT 290","P1","PK02","3031100094470495583729061944757","3GH000503","August 26, 2013"
"M394993","M39499300010000004","0GH","3GH000503","PDP","BO","1011751","MEBB097M","Name","A","Name","address","","","City","SC","29645-0433","","*************AUTO**3-DIGIT 296","P1","PK31","3031100094470495629629645043333","3GH000503","August 26, 2013"
"M394993","M39499300010000005","0GH","3GH000503","PDP","BO","1012075","MEBB0K4L","Name","E","Name","address","","","City","SC","29682-9634","","*************AUTO**3-DIGIT 296","P1","PK33","3031100094470495637929682963428","3GH000503","August 26, 2013"
"M394993","M39499300010000006","0GH","3GH000503","PDP","BO","1012437","MEBB0TWQ","Name","R","Name","address","","","City","SC","29505-3030","","*******AUTO**SCH 5-DIGIT 29501","P1","PK24","2031100094470495556429505303050","3GH000503","August 26, 2013"
"M394993","M39499300010000007","0GH","3GH000503","PDP","BO","1012750","MEBB0YJY","Name","L","Name","address","","","City","SC","29642-3006","","***********AUTO**5-DIGIT 29642","P1","PK38","2031100094470495567529642300601","3GH000503","August 26, 2013"

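The P1 field could be as long as P100, in which case I would need to pad P1 to P001. In practice each field only needs to be padded to the maximum length found in it. I can sort the file on the two fields, but I don't know how to pad them.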
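
To make the problem concrete, here is a minimal Python sketch of why the unpadded values sort out of order and how padding to the longest value fixes it. The `pad` helper and its `prefix_len` parameter are hypothetical names used only for this illustration, and the sketch assumes the values always start with a fixed two-letter prefix such as PK.

```python
# Plain string comparison works character by character, so "PK31" sorts before "PK5".
codes = ["PK5", "PK1", "PK2", "PK31", "PK33", "PK24", "PK38"]
unpadded_order = sorted(codes)

def pad(code, width, prefix_len=2):
    """Zero-pad the digits after a fixed-length prefix (hypothetical helper)."""
    return code[:prefix_len] + code[prefix_len:].zfill(width - prefix_len)

# Pad every value to the length of the longest one, then sort again.
width = max(len(code) for code in codes)
padded_order = sorted(pad(code, width) for code in codes)

print(unpadded_order)  # PK5 ends up last
print(padded_order)    # PK05 now sits between PK02 and PK24
```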

Thanks in advance for your help.

1 answer:

Answer 0: (score: 1)

OK, since nothing else has been suggested, here is a quick Python (2.x or 3.x) script that does what you need:

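import sys
import csv

reader = csv.reader(sys.stdin)
writer = csv.writer(sys.stdout, quoting=csv.QUOTE_ALL)

# Read everything up front: the longest value must be known before padding.
rows = list(reader)
# Index 20 is the 21st field (PckgNum), e.g. "PK5".
# Only this field is padded; the 20th field (index 19, PalletNum) is left as is.
max_len = max(len(row[20]) for row in rows[1:])

writer.writerow(rows[0])  # header row; QUOTE_ALL quotes these fields too
for row in rows[1:]:
    # Insert one zero after the "PK" prefix until the value reaches max_len.
    while len(row[20]) < max_len:
        row[20] = 'PK0' + row[20][2:]
    writer.writerow(row)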
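
The script above pads only the 21st field and hardcodes the PK prefix, while the question also mentions the 20th field (P1 becoming P001). One possible generalization is sketched below, in Python 3. The names `CODE_RE` and `pad_columns` are made up for this illustration, and the sketch assumes that each value is a non-digit prefix followed by digits; values that do not fit that pattern are left alone.

```python
import re

# Split a value such as "PK5" into a non-digit prefix and a trailing number.
CODE_RE = re.compile(r'^(\D*)(\d+)$')

def pad_columns(rows, columns):
    """Zero-pad the numeric part of each listed column to that column's longest value.

    rows is a list of lists with the header row already removed; it is
    modified in place and also returned. columns holds 0-based indexes.
    """
    if not rows:
        return rows
    for col in columns:
        width = max(len(row[col]) for row in rows)
        for row in rows:
            match = CODE_RE.match(row[col])
            if match:  # values that do not fit the pattern stay untouched
                prefix, digits = match.groups()
                row[col] = prefix + digits.zfill(width - len(prefix))
    return rows

# Small demo on two-column rows; with the real file the indexes would be 19 and 20.
demo = pad_columns([["P1", "PK5"], ["P100", "PK31"]], columns=[0, 1])
print(demo)
```

In the script above, this would replace the `while` loop: call `pad_columns(rows[1:], [19, 20])` before writing the rows out.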

If you save this as pad.py, you can use it like this:

$ cat /path/to/my_csv_file.csv | python /path/to/pad.py > /path/to/my_new_csv_file.csv

This creates my_new_csv_file.csv in the format you need. Because the script reads from stdin and writes to stdout, you can use it in many different ways to suit your purposes.
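
Since the goal in the question is to sort on the two fields once they are padded, the same stream-in, stream-out pattern can be reused for the sorting step. The following is only a sketch in Python 3: `sort_rows` and `sort_csv` are hypothetical names, the input is assumed to have a header row, and the values are assumed to be padded already so that plain string comparison gives the intended order.

```python
import csv
import io

def sort_rows(rows, columns=(19, 20)):
    """Return rows ordered by the given 0-based columns, compared as plain strings."""
    return sorted(rows, key=lambda row: tuple(row[c] for c in columns))

def sort_csv(infile, outfile, columns=(19, 20)):
    """Copy a CSV stream to outfile, keeping the header first and sorting the rest."""
    reader = csv.reader(infile)
    header = next(reader)  # assumes the input has at least a header row
    writer = csv.writer(outfile, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(sort_rows(list(reader), columns))

# Demo on an in-memory two-column file; sys.stdin and sys.stdout work the same way.
source = io.StringIO("a,b\nP1,PK31\nP1,PK05\nP1,PK24\n")
target = io.StringIO()
sort_csv(source, target, columns=(0, 1))
result = target.getvalue()
print(result)
```

Passing `sys.stdin` and `sys.stdout` instead of the in-memory buffers would let this step sit in a pipeline right after pad.py.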

Hope this helps.