I have a compiled JSON file that I converted to a CSV file:
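    import glob
    import pandas as pd

    appended_data = []
    for file in glob.glob('data-part.json'):
        # error_bad_lines is a read_csv option, not a read_json one, so it is not passed here
        dfjson = pd.read_json(file, encoding='utf-8', lines=True, dtype=str)
        appended_data.append(dfjson)
    # Combine the per-file frames and write them out as a single CSV
    appended_data = pd.concat(appended_data)
    appended_data.to_csv("data.csv", index=False)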
But when I open the converted CSV file, it looks like this (snippet below):
    color  gear_type       oil_type          material          date_purchase
    []     ['Helical']     ['Synthetic']     ['Composite']     20201505
    []     ['Axle']        ['High Mileage']  ['Asphalt']       20201505
    nan    ['Front-Axle']  ['Synthetic']     ['Vulcanised']    20201505
    nan    ['Bevel']       ['Conventional']  ['Carbon black']  20201505
But I need the CSV file to look like this (because I need to run some searches on it):
    color  gear_type   oil_type      material      date_purchase
    nan    Helical     Synthetic     Composite     20201505
    nan    Axle        High Mileage  Asphalt       20201505
    nan    Front-Axle  Synthetic     Vulcanised    20201505
    nan    Bevel       Conventional  Carbon black  20201505
How can I catch this junk ('[', ']', etc.) and normalize the data?
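For illustration, here is a minimal sketch of one possible cleanup. It assumes each affected cell holds either a real Python list or the string form of one (such as "['Helical']"), and that an empty list should become a missing value. The helper name unwrap_cell and the sample DataFrame are hypothetical, made up here to mirror the snippet above. They are not part of the original code.

```python
import ast

import numpy as np
import pandas as pd


def unwrap_cell(value):
    """Turn a list-like cell such as "['Helical']" into plain text; "[]" becomes NaN."""
    # A cell may hold a real list or its string form, depending on how the data was loaded.
    if isinstance(value, str) and value.startswith("[") and value.endswith("]"):
        try:
            value = ast.literal_eval(value)
        except (ValueError, SyntaxError):
            return value  # not a valid list literal, leave it untouched
    if isinstance(value, (list, tuple)):
        # Join multiple items with a comma; an empty list carries no value at all.
        return ", ".join(str(item) for item in value) if value else np.nan
    return value


# Hypothetical sample mirroring the snippet in the question.
df = pd.DataFrame(
    {
        "color": ["[]", "[]", np.nan, np.nan],
        "gear_type": ["['Helical']", "['Axle']", "['Front-Axle']", "['Bevel']"],
        "oil_type": ["['Synthetic']", "['High Mileage']", "['Synthetic']", "['Conventional']"],
        "material": ["['Composite']", "['Asphalt']", "['Vulcanised']", "['Carbon black']"],
        "date_purchase": ["20201505"] * 4,
    }
)

# Apply the helper to every cell, column by column.
cleaned = df.apply(lambda column: column.map(unwrap_cell))
print(cleaned.to_csv(index=False))
```

Parsing with ast.literal_eval rather than stripping characters keeps values that legitimately contain brackets or quotes intact. If a simple strip is enough for the data at hand, something like a per-column str.strip("[]'") might also work, though it would not handle lists with more than one item.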