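I have a CSV file. I also have some rules that must be applied to it, and a new CSV has to be created based on those rules.
So it can go one of two ways:
This is what I have so far: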
def applyRules(directory):
    FILES = []
    for f in listdir(OUTPUT_DIR):
        writer = csv.writer(open("%s%s" % (DZINE_DIR, f), "wb"))
        for rule in Substring.objects.filter(source_file=f):
            from_column = rule.from_column
            to_column = rule.to_column
            reader = csv.DictReader(open("%s%s" % (OUTPUT_DIR, f)))
            headers = reader.fieldnames
            for row in reader:
                if rule.get_rule_type_display() == "substring":
                    string = rule.string.split(",")
                    # alter value
                    row[to_column] = string[0] + row[from_column] + string[1]
            if rule.from_column == rule.to_column:
                print rule.from_column
            else:
                print rule.from_column
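Each rule has a from_column and a to_column. If the two are the same, the column stays where it is, but its data must be updated by the rule, which in this case just adds a string before and after the current value.
When to_column is different, it is simply a new column that holds the changed data.
So at the moment I am only changing the values in the dict, and I don't know how to get them back out into a new CSV.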
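To make the two cases concrete, here is a minimal sketch of the substring rule applied to a single row dict. The helper name `apply_substring_rule` and the sample data are hypothetical and only for illustration. It assumes the rule string has the form "prefix,suffix", as in the code above.

```python
def apply_substring_rule(row, from_column, to_column, rule_string):
    """Return a copy of row with the prefix/suffix rule applied.

    rule_string is assumed to look like "prefix,suffix".
    """
    parts = rule_string.split(",")
    updated = dict(row)  # leave the caller's row untouched
    updated[to_column] = parts[0] + row[from_column] + parts[1]
    return updated


row = {"name": "widget", "price": "10"}

# Same column: the existing value is replaced in place.
same = apply_substring_rule(row, "name", "name", "<,>")

# Different column: a new key appears, which becomes a new CSV column.
new = apply_substring_rule(row, "name", "label", "<,>")
```

In the second case the set of keys grows, so whatever writes the CSV has to know about the extra column name before the first row is written.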
Answer 0 (score: 1)
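If you open the output file as a DictWriter() object, you can write out the altered dictionaries very easily. You do need to determine the extra field names up front: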
import csv
import os

# Python 2 csv wants binary mode; on Python 3 use open(..., newline='') instead
with open(os.path.join(OUTPUT_DIR, f), 'rb') as rfile:
    reader = csv.DictReader(rfile)
    headers = reader.fieldnames
    rules = Substring.objects.filter(source_file=f).all()

    # pre-process the rules to determine the headers
    for rule in rules:
        from_column = rule.from_column
        to_column = rule.to_column
        if from_column not in headers:
            # problem; perhaps raise an error?
            raise ValueError("unknown column: %s" % from_column)
        if to_column not in headers:
            headers.append(to_column)

    with open(os.path.join(DZINE_DIR, f), "wb") as wfile:
        writer = csv.DictWriter(wfile, fieldnames=headers)
        writer.writeheader()  # emit the header row, including any new columns
        for row in reader:
            for rule in rules:
                from_column = rule.from_column
                to_column = rule.to_column
                if rule.get_rule_type_display() == "substring":
                    string = rule.string.split(",")
                    row[to_column] = string[0] + row[from_column] + string[1]
            # write the altered row once all rules have been applied
            writer.writerow(row)
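For trying the same approach without Django or files on disk, here is a self-contained sketch in Python 3 that works on in-memory text. The `apply_rules` function and the `(from_column, to_column, "prefix,suffix")` tuples are hypothetical stand-ins for the Substring model, and the sample data is made up.

```python
import csv
import io


def apply_rules(source_text, rules):
    """Apply prefix/suffix rules to CSV text and return the new CSV text.

    rules is a list of (from_column, to_column, "prefix,suffix") tuples,
    a hypothetical stand-in for the Substring model instances.
    """
    reader = csv.DictReader(io.StringIO(source_text))
    headers = list(reader.fieldnames)

    # Work out the full header list before any row is written.
    for from_column, to_column, _ in rules:
        if from_column not in headers:
            raise ValueError("unknown column: %s" % from_column)
        if to_column not in headers:
            headers.append(to_column)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=headers, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        for from_column, to_column, rule_string in rules:
            parts = rule_string.split(",")
            row[to_column] = parts[0] + row[from_column] + parts[1]
        writer.writerow(row)
    return out.getvalue()


source = "name,price\nwidget,10\ngadget,20\n"
rules = [("name", "name", "<,>"), ("price", "label", "$, USD")]
result = apply_rules(source, rules)
print(result)
```

The first rule rewrites the name column in place, and the second adds a label column after the existing ones. Rules are applied in order, so a later rule that reads a column changed by an earlier rule will see the changed value.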