How can Pandas read_csv handle the sep character inside parentheses?

Time: 2017-02-22 15:19:17

Tags: python pandas dataframe delimiter

I have a raw file with about 20k columns that looks something like this:

number|colour|(a|1)|animal
1|green|x|dog
2|blue|y|cat
3|red|z|owl 

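When I use read_csv('raw.csv', sep='|'), this creates a dataframe with an extra column, because the (a|1) column gets split.

I tried using the quotechar parameter, but it only accepts a single character. Any help would be appreciated.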

1 answer:

Answer 0: (score: 3)

Based on the sample data you provided, the extra delimiter only appears in the header row. So you can supply your own column names with the names keyword and tell Pandas to skip the header row, like this:

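import pandas as pd

# Skip the original header row and supply the column names explicitly,
# so the '|' inside "(a|1)" is never parsed as a separator.
df = pd.read_csv('raw.csv', sep='|', skiprows=1, names=["number", "colour", "(a|1)", "animal"])
print(df)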
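
With around 20k columns, typing the names by hand is probably not practical. The following is a minimal sketch of one way to build the list from the header line itself, assuming parentheses in the header are balanced and never nested. The regular expression and the in-memory `raw` string (standing in for raw.csv) are illustrative assumptions, not part of the original answer.

```python
import io
import re

import pandas as pd

# Sample data from the question, held in memory instead of raw.csv.
raw = (
    "number|colour|(a|1)|animal\n"
    "1|green|x|dog\n"
    "2|blue|y|cat\n"
    "3|red|z|owl\n"
)

# Split the header on '|' only when the pipe is not inside parentheses.
# Assumption: parentheses are balanced and not nested.
header = raw.splitlines()[0]
names = re.split(r"\|(?![^()]*\))", header)

# Same idea as above: skip the header row and pass the names explicitly.
df = pd.read_csv(io.StringIO(raw), sep="|", skiprows=1, names=names)
print(df)
```

For a real file, the header could be read with a plain `open('raw.csv')` and `readline()` before calling read_csv on the path. If the data rows also contained pipes inside parentheses, this approach would not be enough, since only the header is handled specially.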