parse text into different columns in pandas

Time: 2017-03-02 23:41:11

Tags: python pandas url

I have a dataframe containing the query part of multiple URLs.

For example:

in=2015-09-19&stars_4=yes&min=4&a=3&city=New+York,+NY,+United+States&out=2015-09-20&search=1\n

in=2015-09-14&stars_3=yes&min=4&a=3&city=London,+United+Kingdom&out=2015-09-15&search=1\n

in=2015-09-26&Filter=175&min=5&a=2&city=New+York,+NY,+United+States&out=2015-09-27&search=2\n

My desired dataframe should be:

    in         Filter   stars  min  a  max  city  country  out          search
--------------------------------------------------------------------------------
    2015-09-19  NAN    stars_4  4   3  NAN   NY     US     2015-09-20      1
    2015-09-14  NAN    stars_3  4   3  NAN  LONDON  UK     2015-09-15      1
    2015-09-26  175     NAN     5   2  NAN   NY     US     2015-09-27      2

Is there an easy way to do this using regex?

Any help will be much appreciated! Thanks in advance!

1 answer:

Answer 0 (score: 1)

A quick and dirty solution is to use a list comprehension:

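import pandas as pd

# One dict per line: split on '&' into pairs, then on the first '=' into key and value.
with open('data_file.txt') as f:
    json_data = [dict(pair.split('=', 1) for pair in line.strip().split('&'))
                 for line in f if line.strip()]

df = pd.DataFrame.from_records(json_data)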

This won't solve your location classification problem, but it will give you a better dataframe to work with.
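As a possible alternative to splitting by hand or using regex, the standard library's `urllib.parse.parse_qsl` also decodes the `+` signs into spaces. The sketch below is only one way to go about it. The helper name `parse_query`, the folding of `stars_N=yes` into a single `stars` column, and the rule that the country is whatever follows the last comma in `city` are all assumptions made for illustration, not something stated in the question.

```python
from urllib.parse import parse_qsl

import pandas as pd

# Sample query strings taken from the question.
queries = [
    "in=2015-09-19&stars_4=yes&min=4&a=3&city=New+York,+NY,+United+States&out=2015-09-20&search=1",
    "in=2015-09-14&stars_3=yes&min=4&a=3&city=London,+United+Kingdom&out=2015-09-15&search=1",
    "in=2015-09-26&Filter=175&min=5&a=2&city=New+York,+NY,+United+States&out=2015-09-27&search=2",
]


def parse_query(query):
    """Turn one query string into a flat dict of column -> value (hypothetical helper)."""
    row = {}
    for key, value in parse_qsl(query.strip()):
        if key.startswith("stars_"):
            # Assumption: fold stars_3=yes, stars_4=yes, ... into one "stars" column.
            row["stars"] = key
        else:
            row[key] = value
    # Assumption: the country is the text after the last comma in "city".
    city = row.get("city")
    if city and "," in city:
        row["city"], row["country"] = (part.strip() for part in city.rsplit(",", 1))
    return row


df = pd.DataFrame([parse_query(q) for q in queries])
print(df)
```

Keys missing from a given query (such as `Filter` in the first two rows) come out as NaN. This does not abbreviate the locations to codes like NY or US; that would need a separate lookup table. A column that never appears in the data, such as `max`, could be added afterwards with `df.reindex(columns=[...])`.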