Concatenating two CSV files

Asked: 2019-04-30 05:06:12

Tags: python-3.x pandas data-analysis

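I have two CSV files like the ones below. The data are essentially key-value pairs. If I read a file into a pandas DataFrame, the first row ("Multi-function Steering Wheel") is treated as the column header. But this is actually raw data with no header row.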

Multi-function Steering Wheel   Yes
Power Adjustable Exterior Rear View Mirror  Yes
Touch Screen    Yes
Automatic Climate Control   Yes
Anti Lock Braking System    Yes
Alloy Wheels    Yes
Fog Lights - Front  Yes
Fog Lights - Rear   Yes
.................

Engine Type T-Jet Petrol Engine
Engine Description  1.4-litre 123.2bhp 16V T-Jet Petrol Engine
Engine Displacement(cc) 1368
No. of cylinder 4
Maximum Power   123.2bhp@5000rpm
Maximum Torque  208Nm@2000-3500rpm
Valves Per Cylinder 4
Valve Configuration DOHC
Fuel Supply System  MPFI
Bore x Stroke   No
Compression Ratio   No
...........
...........

If I join these two tables, I need to get the following table.

Multi-function Steering Wheel   Yes
Power Adjustable Exterior Rear View Mirror  Yes
Touch Screen    Yes
Automatic Climate Control   Yes
Anti Lock Braking System    Yes
Alloy Wheels    Yes
Fog Lights - Front  Yes
Fog Lights - Rear   Yes
Engine Type T-Jet Petrol Engine
Engine Description  1.4-litre 123.2bhp 16V T-Jet Petrol Engine
Engine Displacement(cc) 1368
No. of cylinder 4
Maximum Power   123.2bhp@5000rpm
Maximum Torque  208Nm@2000-3500rpm
Valves Per Cylinder 4
Valve Configuration DOHC
Fuel Supply System  MPFI
Bore x Stroke   No
Compression Ratio   No

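I have more than 10 CSV files of this kind. I need to combine all of them into a single table in the format above. I tried concatenation, but it did not give the result I expected. Can anyone explain how to do this in pandas? Any help would be appreciated. Thanks.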

1 Answer:

Answer 0 (score: 0)

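The columns of CSV (comma-separated values) text must be separated by commas (,), not spaces. By default pandas splits on commas, so without a valid delimiter it treats each whole line as a single field.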
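If the files turn out to be tab-delimited rather than space-padded (an assumption; the question does not show the raw bytes), pandas can read them without any rewriting. The sketch below uses an in-memory sample, and the column names "feature" and "value" are made up for illustration. Passing header=None stops pandas from using the first data row as the header.

```python
import io

import pandas as pd

# Assumption: key and value are separated by a single tab in the real file.
sample = "Touch Screen\tYes\nAlloy Wheels\tYes\nEngine Displacement(cc)\t1368\n"

df = pd.read_csv(
    io.StringIO(sample),          # replace with a file path for real data
    sep="\t",                     # split on tabs instead of commas
    header=None,                  # the file has no header row
    names=["feature", "value"],   # hypothetical column names
    dtype=str,                    # keep values such as "1368" as text
)
print(df)
```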

You can use the re (regular expression) module to replace those spaces with commas and produce proper CSV text.

import re
text = """Multi-function Steering Wheel   Yes
Power Adjustable Exterior Rear View Mirror  Yes
Engine Description  1.4-litre 123.2bhp 16V T-Jet Petrol Engine
Engine Displacement(cc) 1368
No. of cylinder 4
Maximum Power   123.2bhp@5000rpm
... ...
"""

# Replace the last run of spaces in each line with a comma
p = re.compile(r' +(?=[^ ]+$)', re.MULTILINE)
replaced = p.sub(',', text)
print(replaced)

This gives you output like the following:

Multi-function Steering Wheel,Yes
Power Adjustable Exterior Rear View Mirror,Yes
Engine Description  1.4-litre 123.2bhp 16V T-Jet Petrol,Engine
Engine Displacement(cc),1368
No. of cylinder,4
Maximum Power,123.2bhp@5000rpm
...,...

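Note that if you want a line such as "Engine Description 1.4-litre 123.2bhp 16V T-Jet Petrol Engine" to be split as "Engine Description,1.4-litre 123.2bhp 16V T-Jet Petrol Engine", you will have to edit those lines by hand, because the code above mechanically replaces the last run of spaces in each line.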
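One possible way to avoid the manual edits is to split at the first run of two or more whitespace characters (or a tab) instead of the last run. This is only a sketch, and it assumes that words inside a key are separated by single spaces while the key and value are separated by something wider. Lines that do not follow that pattern are left with an empty value. The helper name split_key_value is made up for illustration.

```python
import csv
import io
import re

text = """Multi-function Steering Wheel   Yes
Engine Description  1.4-litre 123.2bhp 16V T-Jet Petrol Engine
Maximum Power   123.2bhp@5000rpm
"""

# Assumption: key and value are separated by 2+ whitespace characters or a tab.
SEPARATOR = re.compile(r"\s{2,}|\t")


def split_key_value(line):
    """Split a line into (key, value) at the first wide gap."""
    parts = SEPARATOR.split(line.strip(), maxsplit=1)
    if len(parts) == 2:
        return parts[0], parts[1]
    return parts[0], ""  # no wide gap found; leave the value empty


rows = [split_key_value(line) for line in text.splitlines() if line.strip()]

# csv.writer quotes any field that happens to contain a comma.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
print(buffer.getvalue())
```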

Instead of Python, you can also just use a text editor such as VSCode (see https://code.visualstudio.com/docs/editor/codebasics#_search-and-replace).
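Once each file is valid two-column CSV, stacking ten or more of them into one table could look roughly like this. It is a sketch under assumptions: the in-memory sources stand in for the real files, and the column names "feature" and "value" are invented for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the cleaned files. With files on disk you would
# build a list of paths instead, e.g. with glob.glob on your own directory.
sources = [
    io.StringIO("Touch Screen,Yes\nAlloy Wheels,Yes\n"),
    io.StringIO("Engine Displacement(cc),1368\nValve Configuration,DOHC\n"),
]

# header=None keeps the first data row from being used as the column header.
frames = [
    pd.read_csv(src, header=None, names=["feature", "value"], dtype=str)
    for src in sources
]

# Stack the frames vertically and renumber the rows from 0.
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

Here pd.concat appends rows because every frame has the same two columns. ignore_index=True is used so the row labels do not restart at 0 for each file.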