Python - How to extract multiple URLs from a .csv element

Time: 2017-05-23 06:41:01

Tags: python csv

My data in the .csv looks like this:

{'ID': 'NN00', 'URL': 'http://....', 'Pic': "['http://....', 'http://....', ...]"}

I want to extract the URLs in 'Pic'. How should I do that?

I tried this:

    for i, row in enumerate(reader):
        for j, ele in enumerate(row['Pic']):
            print(ele)

I get the individual characters, one at a time, instead of the URLs.

What should I do?

Here is my code:

import csv
import json

with open('WB_INTENTION_with_pic.csv', encoding='utf-8', errors='ignore') as csvfile:
    fieldnames = ['ID', 'URL', 'Pic']
    reader = csv.DictReader(csvfile)
    for i, row in enumerate(reader):
        pic = json.loads(row['Pic'])
        for p in pic:
            print(p)

In some rows ['Pic'] is empty, "[]", and in others it is "['http://....', 'http://....', ...]".

My sample data

2 answers:

Answer 0: (score: 1)

That's because the Pic element is not a list:

'Pic': "['http://...', 'http://...',... ]"

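It is a string, so iterating over it yields one character at a time. You need to parse it into a list first, for example as JSON.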

for i, row in enumerate(reader):
    # parse the string into a list (json.loads requires the field to be valid JSON)
    pic = json.loads(row['Pic'])
    for p in pic:
        ....
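One caveat: `json.loads` only accepts double-quoted strings, so a field written with single quotes, as in the question's sample, raises `json.JSONDecodeError`. A minimal sketch of an alternative, assuming the field is always a Python-style list literal, is `ast.literal_eval` from the standard library. The helper name `parse_pic` and the sample rows below are hypothetical, chosen only for illustration:

```python
import ast
import json

def parse_pic(field):
    """Parse a string such as "['http://a', 'http://b']" into a list of URLs."""
    # literal_eval evaluates Python literals only (no arbitrary code),
    # and it accepts single-quoted strings, unlike json.loads.
    return list(ast.literal_eval(field.strip()))

# Hypothetical rows, shaped like what csv.DictReader would yield.
rows = [
    {'ID': 'NN00', 'Pic': "['http://abc.xyz', 'http://pqr.lmn']"},
    {'ID': 'NN01', 'Pic': "[]"},  # empty list, as some rows in the question
]

for row in rows:
    for url in parse_pic(row['Pic']):
        print(url)

# json.loads rejects the single-quoted form.
try:
    json.loads(rows[0]['Pic'])
    json_ok = True
except json.JSONDecodeError:
    json_ok = False
```

The empty "[]" rows need no special case here, since they parse to an empty list and the inner loop simply does not run.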

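Answer 1: (score: 0)

First check that your csv is formatted correctly. Otherwise the csv reader will not work properly. I have built a sample csv based on your example -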

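#sample csv
ID,URL,Pic
NN00,http://...,"['http://abc.xyz', 'http://pqr.lmn', 'http://456.123']"
NN01,http://...,"['http://wdc.xyz', 'http://23fpwedr.lmn', 'http://423156.123']"
NN02,http://...,"['http://zazbxcec.xy32z', 'http://pq24f23r.lmn', 'http://45dw6.123']"

The Pic field contains commas, so it has to be wrapped in double quotes. Otherwise the reader splits it into several columns. Now just read the csv row by row -

import csv

with open('urls.csv', newline='') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        # drop the surrounding brackets, then split on commas
        for url in row['Pic'][1:-1].split(','):
            # trim whitespace and the surrounding single quotes
            print(url.strip().strip("'"))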

=>output
http://abc.xyz
http://pqr.lmn
http://456.123
http://wdc.xyz
http://23fpwedr.lmn
http://423156.123
http://zazbxcec.xy32z
http://pq24f23r.lmn
http://45dw6.123
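On the formatting point in this answer: a field that itself contains commas is only read back as one column if it is quoted. A minimal sketch of this, assuming you control how the file is written, is to let `csv.DictWriter` do the quoting. The row below is hypothetical sample data, and an in-memory `io.StringIO` buffer stands in for a real file:

```python
import csv
import io

# Hypothetical sample row; the Pic value contains a comma.
rows = [
    {'ID': 'NN00', 'URL': 'http://...', 'Pic': "['http://abc.xyz', 'http://pqr.lmn']"},
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=['ID', 'URL', 'Pic'])
writer.writeheader()
writer.writerows(rows)  # fields containing commas are quoted automatically

text = buffer.getvalue()
print(text)

# Reading the same text back keeps Pic as a single column.
buffer.seek(0)
read_back = list(csv.DictReader(buffer))
print(read_back[0]['Pic'])
```

With the default quoting mode, only fields that need it (here, Pic) are wrapped in double quotes, so the other columns stay unchanged.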