Question

I have a list in this format:

[u' ', u'Address :',u'Sadar Bazaar',u'new Delhi,India',u' ',u'Name :',u'Saun-Jean',u' ',u'Occupation :',u'Developer',u'Hacker',u' ']

I want to insert the records into a database.

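Here is my idea of how to do it:

1) Take all the items between two u' ' entries.

2) The second item, u'Address :', defines the database field, and the items up to the next u' ' define the data, like

 'Address :','Sadar Bazaar','new Delhi,India'

3) Repeat this process for all the items.

There may be other, better ideas.

But I don't know how to do this in Python. Can someone help me?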

Edit: here is how I build the list:

for tr in browser.find_elements_by_xpath("//tbody//tbody//tr"):
 tds=tr.find_elements_by_tag_name('td')
 if tds:
  data.append([td.text for td in tds])

Answer 1

This would be much better stored as a dictionary:

d={}
d['Address'] = ['sadar bazaar', ...]
d['Name'] = [ 'saun-jean', ... ]
...

Or perhaps as a list of dictionaries (or class instances):

[ {'Address' : 'sadar bazaar', 'Name': 'saun-jean'}, { ... } ]

To convert your list into a dictionary like the first form above, you could do something like this:

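from collections import defaultdict

d = defaultdict(list)
a = iter(yourlist)
key = None
for elem in a:
    if elem == u' ':
        # A blank marks a new record; the next item is the field name.
        # The default avoids StopIteration on the trailing blank.
        key = next(a, None)
        if key is not None:
            key = key.rstrip(' :')
    elif key is not None:
        d[key].append(elem)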
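
The same splitting, which is also what the steps in the question describe, might be sketched with itertools.groupby, run here against the sample list from the question. The function name parse_records is hypothetical, and stripping the trailing ' :' from the field names is an assumption about what the keys should look like.

```python
from itertools import groupby

def parse_records(items, sep=u' '):
    """Group the items between separators into {field: [values]}."""
    records = {}
    # groupby yields alternating runs of separator and non-separator items.
    for is_sep, run in groupby(items, key=lambda item: item == sep):
        if is_sep:
            continue
        run = list(run)
        # The first item of each run is the field name, e.g. u'Address :'.
        field = run[0].rstrip(' :')
        records[field] = run[1:]
    return records

sample = [u' ', u'Address :', u'Sadar Bazaar', u'new Delhi,India', u' ',
          u'Name :', u'Saun-Jean', u' ',
          u'Occupation :', u'Developer', u'Hacker', u' ']
result = parse_records(sample)
```

Because groupby collapses consecutive separators into a single run, leading, trailing, or doubled blanks need no special handling.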

Answer 2

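lis=[u' ', u'Address :',u'Sadar Bazaar',u'new Delhi,India',u' ',u'Name :',u'Saun-Jean',u' ',u'Occupation :',u'Developer',u'Hacker',u' ']
# Join the non-blank items into one string, then split on the ':' after each field name.
strs=' '.join(str(x).strip() for x in lis if str(x).strip())
lis1=strs.split(':')
dic={}
for i,x in enumerate(lis1[:-1]):
    if x.strip():
        # The last word before a ':' is the field name.
        temp_lis=x.strip().split()
        if i+1 <len(lis1)-1:
            # The next chunk ends with the following field name; drop it.
            dic[temp_lis[-1]]=' '.join(lis1[i+1].split()[:-1])
        else:
            # The final chunk is all value.
            dic[temp_lis[-1]]=' '.join(lis1[i+1].split())

print(dic)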

Output:

{'Occupation': 'Developer Hacker', 'Name': 'Saun-Jean', 'Address': 'Sadar Bazaar new Delhi,India'}

Answer 3

You have to build the dictionary in the first place, which can be done with something like this (rough code):

data = {'Address' : '',
        'Name' : '',
        'Occupation' : ''}
for item in tds:
    data['Address'] = item[0]
    data['Name'] = item[1]
    data['Occupation'] = item[2]

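Of course, this assumes the tds within each tr are always in the same position. You can then use the "data" dictionary to pull out values by name.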
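
As a rough illustration of that positional approach, the sketch below pairs each row's cells with a fixed tuple of field names. The names FIELDS and row_to_dict are hypothetical, the column order is an assumption, and the single row is made up from the values in the question rather than taken from the real page.

```python
# Assumed column order; adjust to match the actual table.
FIELDS = ('Address', 'Name', 'Occupation')

def row_to_dict(cells):
    """Pair each cell with its field name by position."""
    return dict(zip(FIELDS, cells))

# One illustrative row, shaped like [td.text for td in tds] in the question.
rows = [[u'Sadar Bazaar, new Delhi,India', u'Saun-Jean', u'Developer']]
records = [row_to_dict(row) for row in rows]
```

Building one dictionary per row keeps every record, instead of overwriting a single dictionary on each pass through the loop.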

Answer 4

Using your data as it is:

l = [u' ', u'Address :',u'Sadar Bazaar',u'new Delhi,India',u' ',u'Name :',u'Saun-Jean',u' ',u'Occupation :',u'Developer',u'Hacker',u' ']

entries = {}
key = ''
for i in range(len(l)):
    if l[i] == u' ' and i + 1 < len(l):
        key = l[i + 1].replace(':', '').strip()
        entries[key] = []
    elif l[i].replace(':', '').strip() in entries:
        # this item is the field name itself, already stored as a key
        continue
    else:
        # skip the trailing blank entry
        if l[i] != u' ':
            entries[key].append(l[i])

print(entries)

Output as a dictionary:

{u'Occupation': [u'Developer', u'Hacker'], u'Name': [u'Saun-Jean'], u'Address': [u'Sadar Bazaar', u'new Delhi,India']}
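
Since the goal in the question is to insert the records into a database, here is a minimal sketch of that last step using the standard-library sqlite3 module. The table name people, its column names, the in-memory database, and the choice to join multi-part values with ', ' are all assumptions made for illustration; the question does not say which database or schema is in use.

```python
import sqlite3

# A dictionary shaped like the output above.
entries = {u'Address': [u'Sadar Bazaar', u'new Delhi,India'],
           u'Name': [u'Saun-Jean'],
           u'Occupation': [u'Developer', u'Hacker']}

conn = sqlite3.connect(':memory:')  # in-memory database, for illustration only
conn.execute('CREATE TABLE people (address TEXT, name TEXT, occupation TEXT)')

# Join multi-part values into one string per column (an assumed convention).
row = tuple(u', '.join(entries[field])
            for field in (u'Address', u'Name', u'Occupation'))

# Placeholders let the driver handle quoting of the values.
conn.execute('INSERT INTO people (address, name, occupation) VALUES (?, ?, ?)', row)
conn.commit()

stored = conn.execute('SELECT address, name, occupation FROM people').fetchall()
```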

Parsing a list in a certain format

4 answers: