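I have a directory of CSV files (with several subfolders). I want to remove the first two lines from every CSV file before uploading the files to a database (SQL Server). I started with a small subset of CSV files in a single folder (no subfolders) and used the following Python script. The script runs without errors, but the lines are not removed from the files. What am I missing?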
import glob
import csv
myfiles = glob.glob("C:\Data\*.csv")
for file in myfiles:
    lines = open(file).readlines()
    open(file, 'w').writelines(lines[1:])
Here is my sample data:
"Title: Distribution of Nonelderly Population by Household Employment Status | The Henry J. Kaiser Family Foundation"
"Timeframe: 2015"
"Location","At Least 1 Full Time Worker","Part Time Workers","Non Workers","Total"
"United States","0.82","0.08","0.10","1.00"
"Alabama","0.79","0.06","0.15","1.00"
"Alaska","0.85","0.06","0.09","1.00"
"Arizona","0.80","0.08","0.12","1.00"
"Arkansas","0.78","0.07","0.15","1.00"
"California","0.81","0.08","0.10","1.00"
I want the edited output CSV files to keep the same directory structure. Any help would be greatly appreciated.
Answer 0 (score: 0)
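Try this. It walks every subfolder with os.walk and drops the first two lines of each CSV file; note that lines[1:] in your script drops only one line, and glob.glob on a single folder does not descend into subfolders: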
import os
# Change this to your CSV file base directory
base_directory = 'C:\\Data'
for dir_path, dir_name_list, file_name_list in os.walk(base_directory):
    for file_name in file_name_list:
        # If this is not a CSV file
        if not file_name.endswith('.csv'):
            # Skip it
            continue
        file_path = os.path.join(dir_path, file_name)
        # Read the whole file first, then overwrite it in place
        with open(file_path, 'r') as ifile:
            line_list = ifile.readlines()
        with open(file_path, 'w') as ofile:
            # Write back everything except the first two lines
            ofile.writelines(line_list[2:])
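Note: avoid using file as a variable name. In Python 2 it shadows the built-in file type (Python 3 no longer has that built-in, but the name is still easy to misread).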
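If you would rather keep the original files untouched, a variation is to write the trimmed copies into a separate output folder that mirrors the input layout. The following is only a minimal sketch under some assumptions: the function name strip_leading_lines and the parameters src_root, dst_root and n_skip are made up for illustration, and the files are assumed to be readable with the platform's default text encoding.

```python
from pathlib import Path


def strip_leading_lines(src_root, dst_root, n_skip=2):
    """Copy every .csv under src_root to dst_root without its first n_skip lines.

    The folder layout below src_root is recreated below dst_root.
    Returns the list of output paths that were written.
    """
    src_root = Path(src_root)
    dst_root = Path(dst_root)
    written = []
    # sorted() materializes the listing before any output file is created
    for src_path in sorted(src_root.rglob("*.csv")):
        dst_path = dst_root / src_path.relative_to(src_root)
        dst_path.parent.mkdir(parents=True, exist_ok=True)
        # newline="" leaves the original line endings untouched
        with open(src_path, "r", newline="") as src:
            lines = src.readlines()
        with open(dst_path, "w", newline="") as dst:
            dst.writelines(lines[n_skip:])
        written.append(dst_path)
    return written
```

For example, strip_leading_lines(r"C:\Data", r"C:\Data_clean") would leave C:\Data as it is and write the trimmed files under C:\Data_clean (the second path is a placeholder). Passing the same folder for both arguments overwrites the files in place, like the script above; in that case run it only once, since each run removes another two lines.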