Python: add a new column to a dataframe using a loop

Time: 2016-07-12 11:07:17

Tags: python loops pandas

I have a column from a dataframe:

url    
http://www.utt66.ru/cat/13972/
https://yandex.ru/yandsearch?clid=2186618&text=%D0%BA%D0%B0%D0%BB%D1%8C%D1%8F%D0%BD%D1%8B%20khalil%20mamoon%20%D0%BA%D1%83%D0%BF%D0%B8%D1%82%D1%8C%20
http://yandex.ru/clck/jsredir?from=yandex.ru%3Byandsearch%3Bweb%3B%3B&text=&etext=1086.0aUeAWfTXcRUqS1ZI5nAyK9j2Mh6yse_NXu9LQ-mWhfjitIqsM3jlM3ks26caUTmE7X0u6HNg06uUGrA-9hHsQ.b9a03b0331f4f4f46de76151fcec7f63060561ea&uuid=&state=zRrSeA3aY6h77jydLxkYiPMTcIdGYjUu&data=UlNrNmk5WktYejR0eWJFYk1LdmtxdXhXMmR3bGZ5cW51SlVKblgwaTEzLXhaSkRmbDA2bTlsYUlpS3BROEhmaWNzNlNBR2ZSdzQ2RWVuRDk3MlFiYlRZRmxvZWZRY3NzLXVuNDl5Wm1GTlJQY2l3dHVlSzNDYms5M2hWNlZTUlRqcXBpc1JaczB3WkdleUc3YkNCYUxKcDl2akdTTTBwTVZXZ0doZ0cxRm94UkFyWHpUUjlnRkJ0bVhKY2JQbzdFWlBUejFwTlhTZEZhNW13ZE5UaC11RG1pdHhnN0RLSHEtakgtVFlvTWgta2JyaER0UGZ0UGxPU2VHaTVmNTVRN05lMFRrbmRZcC1Fd3lqeGdnb19XdGJyV1B4Y1BDZ3dVTmFhSF9jeldtV084ZjFhZFNoRkJfV3ZMOUo1WThicldXTWxELW1LeWlTbkdXSERSa2FGZ2Exa2NLSV9JeFAycQ&b64e=2&sign=6b914765fe78a3e68831c2b6828cf669&keyno=0&cst=AiuY0DBWFJ4BWM_uhLTTxGXxuvzKw7y-ntWZO3bTFIfl2v7d7v1QVNH8SEZECUcbUXMsM6Evbji9wDNUntadIeCfVACZaypi8Lv4SNxdaMz8Q5Ps4fuuccAF_tr0UwFikC6ag-nx6lB2WcA1pp9L6qSe3TipCWsrgE_JueAu4dtg3RwQcdxvk1V6q6p9QR8G8excLNwINU9EI2wQGEAfURn9Z_2zuPKTYZVucqZkm0IAcMpofvG3mJuY6EmScb54roqiYzcQA3O4eohhR0h2YGB66J-005U2&ref=orjY4mGPRjk5boDnW0uvlpAgqs5Jg3quKLfGKhgcZzlBh-w_NInSOY4el2-CiPHYpxLL_oT04HCjaJObTz07DhIg7d1EhIq2tEBI8RCSnOAHKMgOOSZIejHTFTVm7ArLNYUFK3CqIaug6TVqqCR2ki_XYek72UAw_aawNfMWBZpRptChlirBwodvzexIkSVO&l10n=ru&cts=1466022252275&mc=4.058813890331201
https://www.google.ru/search?ie=UTF-8&hl=ru&q=%D0%BA%D0%B0%D0%BB%D1%8C%D1%8F%D0%BD%D1%8B%20khalil%20mamoon%20%D0%BA%D1%83%D0%BF%D0%B8%D1%82%D1%8C&gws_rd=ssl
https://vk.com/im?peers=52525981_172627017_275902975_339455414_c107&sel=203575078
https://www.google.ru/search?ie=UTF-8&hl=ru&q=%D0%BA%D0%B0%D0%BB%D1%8C%D1%8F%D0%BD%D1%8B%20khalil%20mamoon%20%D0%BA%D1%83%D0%BF%D0%B8%D1%82%D1%8C&gws_rd=ssl
http://www.utt66.ru/cat/13972/
http://www.utt66.ru/contacts/
https://vk.com/im?peers=52525981_172627017_203575078_275902975_339455414&sel=c107
http://www.utt66.ru/feedback/

I need to extract the search query from the URLs that contain yandex.ru/yandsearch or google.ru/search, and I use this code:

    if '//www.google.ru/search?' not in urls[i] and '//www.google.ru/search?' in urls[i - 1]:
        get1 = urlparse(urls[i - 1])
        dict1 = parse_qs(get1[4])
        search_val = dict1['q'][0]
        searching_val.append(search_val)
    elif '//yandex.ru/yandsearch?' not in urls[i] and '//yandex.ru/yandsearch?' in urls[i - 1]:
        get2 = urlparse(urls[i - 1])
        dict2 = parse_qs(get2[4])
        search_val = dict2['text'][0]
        searching_val.append(search_val)

What I want is: when such a URL is found, write its search query into the next row. But after df['search_term'] = searching_val I get an incorrect result. What am I doing wrong?

1 answer:

Answer 0: (score: 0)

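In your loop, you don't append any value when neither address pattern matches. Add an else branch that appends an empty string or a null value: the number of elements in searching_val must equal the number of rows in the dataframe (or in the {{1}} column, if you prefer). Otherwise the list is shorter than the column and the values no longer line up with the rows they belong to.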
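The fix described above could look roughly like the sketch below. It is only an illustration: the function name extract_search_terms is hypothetical, the imports assume Python 3 (urllib.parse; on Python 2 the same functions live in the urlparse module), and treating the first row as having no previous URL is an assumption, since the question does not show the loop header.

```python
from urllib.parse import urlparse, parse_qs

GOOGLE = '//www.google.ru/search?'
YANDEX = '//yandex.ru/yandsearch?'

def extract_search_terms(urls):
    """Return one entry per URL so the result has the same length as the input."""
    searching_val = []
    for i, url in enumerate(urls):
        # Assumption: the first row has no previous URL to inspect.
        prev = urls[i - 1] if i > 0 else ''
        if GOOGLE not in url and GOOGLE in prev:
            # Google keeps the search text in the 'q' parameter.
            searching_val.append(parse_qs(urlparse(prev).query)['q'][0])
        elif YANDEX not in url and YANDEX in prev:
            # Yandex keeps the search text in the 'text' parameter.
            searching_val.append(parse_qs(urlparse(prev).query)['text'][0])
        else:
            # No match: still append a placeholder to keep rows aligned.
            searching_val.append('')
    return searching_val
```

Because every iteration appends exactly one element, len(extract_search_terms(urls)) always equals len(urls), which is what the column assignment needs.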

Also, convert searching_val to a pandas.Series, then assign it to a column.
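As a minimal sketch of that assignment, assuming searching_val already holds one entry per row: the sample data and the non-default index below are made up for illustration, and passing index=df.index is an addition of this sketch rather than something the answer spells out.

```python
import pandas as pd

# Hypothetical sample frame with a non-default index.
df = pd.DataFrame(
    {'url': ['http://a.example/', 'http://b.example/', 'http://c.example/']},
    index=[10, 11, 12],
)
searching_val = ['', 'foo bar', '']  # one entry per row

# Column assignment aligns a Series on its index labels, so reuse
# the frame's index to keep the values in row order.
df['search_term'] = pd.Series(searching_val, index=df.index)
```

Without the index argument the Series would be labelled 0, 1, 2, and assigning it to a frame indexed 10, 11, 12 would fill the new column with NaN, because pandas matches on labels rather than on position.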