Question

I have this data, but I cannot get at the values in the serviceTypes and crowding columns:

   id          name        modeName  disruptions  lineStatuses  serviceTypes                                        crowding
0  piccadilly  Piccadilly  tube      []           []            [{'$type': 'Tfl.Api.Presentation.Entities.Line...  {'$type': 'Tfl.Api.Presentation.Entities.Crowd...
1  victoria    Victoria    tube      []           []            [{'$type': 'Tfl.Api.Presentation.Entities.Line...  {'$type': 'Tfl.Api.Presentation.Entities.Crowd...
2  bakerloo    Bakerloo    tube      []           []            [{'$type': 'Tfl.Api.Presentation.Entities.Line...  {'$type': 'Tfl.Api.Presentation.Entities.Crowd...
3  central     Central     tube      []           []            [{'$type': 'Tfl.Api.Presentation.Entities.Line...  {'$type': 'Tfl.Api.Presentation.Entities.Crowd.

I tried the following code:

def split(x, index):
    try:
        return x[index]
    except:
        return None
dflines['serviceTypes'] = dflines.serviceTypes.apply(lambda x:split(x,0))
dflines['crowding'] = dflines.crowding.apply(lambda x:split(x,1))

def values(x):
    try:
        return ';'.join('{}'.format(val) for  val in x.values())
    except:
        return None
m = dflines['serviceTypes'].apply(lambda x:values(x))
dflines1 = m.str.split(';', expand=True)
dflines1.columns = dflines['serviceTypes'][0].keys()
dflines2 = dflines1[['name']]
dflines2

But I get this error:

AttributeError                            Traceback (most recent call last)
<ipython-input-108-8f4bb6ac731a> in <module>
     14 m = dflines['serviceTypes'].apply(lambda x:values(x))
     15 dflines1 = m.str.split(';', expand=True)
---> 16 dflines1.columns = dflines['serviceTypes'][0].keys()
     17 dflines2 = dflines1[['name']]
     18 dflines2

AttributeError: 'str' object has no attribute 'keys'

Can anyone help me?

Answer 1

You can pull a pandas column out into a list-like object (a Series) like this:

service_types = dflines['serviceTypes']

The first value of the column is now the first element of service_types:

first_value = service_types[0]

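pandas works differently from a dictionary. I think you may be trying to treat the DataFrame as if it were a dictionary. I apologize if I have misunderstood or oversimplified.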
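The AttributeError in the question says the cell being indexed is a str, not a dict. One possible cause (an assumption, since the question does not show how dflines was built) is that the column holds the text form of each list, for example after a round trip through CSV. A minimal sketch for checking the cell type and, if needed, parsing the text back into Python objects with ast.literal_eval. The toy frame and its values are invented for illustration:

```python
import ast
import pandas as pd

# Toy stand-in for dflines; the cell is deliberately a string that only
# looks like a list of dicts (values are made up for illustration).
toy = pd.DataFrame({
    "id": ["piccadilly"],
    "serviceTypes": ["[{'$type': 'example.Type', 'name': 'Regular'}]"],
})

# Step 1: look at what a single cell really is.
cell = toy["serviceTypes"][0]
cell_type = type(cell).__name__

# Step 2: if cells are strings, parse them into real lists/dicts.
# Non-string cells are passed through unchanged.
parsed = toy["serviceTypes"].apply(
    lambda v: ast.literal_eval(v) if isinstance(v, str) else v
)

# Now list indexing and dict methods behave as expected.
first_dict = parsed[0][0]
keys = list(first_dict.keys())
print(cell_type, keys)
```

If type(cell) already reports list or dict on the real data, the parsing step is unnecessary and the problem lies elsewhere, for example in re-running a cell that overwrites the column.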

Edit:

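OK, it looks like each value in service_types (above) is a list of dictionaries. To write a column that contains only the type, you need to index into the list first and then into the dictionary.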

service_types = dflines['serviceTypes']
types_alone = []
for i in service_types:
    # i is a list of dicts: take the first dict, then its '$type' value
    types_alone.append(i[0]['$type'])
dflines['new_column'] = types_alone
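The same idea can be applied to both columns without a manual loop. The following is only a sketch on a toy frame: it assumes serviceTypes cells are lists of dicts and crowding cells are plain dicts, as the truncated output in the question suggests. The helper names first_item_field and dict_field, the 'name' key, and all values are hypothetical.

```python
import pandas as pd

# Toy stand-in for dflines; keys other than '$type' and all values are invented.
toy = pd.DataFrame({
    "id": ["piccadilly", "victoria"],
    "serviceTypes": [
        [{"$type": "example.LineServiceType", "name": "Regular"}],
        [],  # an empty list, to show the fallback
    ],
    "crowding": [
        {"$type": "example.Crowding"},
        {"$type": "example.Crowding"},
    ],
})

def first_item_field(cell, key):
    """Return cell[0][key] for a non-empty list of dicts, else None."""
    if isinstance(cell, list) and cell and isinstance(cell[0], dict):
        return cell[0].get(key)
    return None

def dict_field(cell, key):
    """Return cell[key] for a dict cell, else None."""
    return cell.get(key) if isinstance(cell, dict) else None

# List of dicts: index into the list, then into the dict.
toy["serviceType_name"] = toy["serviceTypes"].apply(
    lambda c: first_item_field(c, "name")
)
# Plain dict: index into the dict directly.
toy["crowding_type"] = toy["crowding"].apply(lambda c: dict_field(c, "$type"))

print(toy[["id", "serviceType_name", "crowding_type"]])
```

Checking the cell type explicitly, rather than wrapping the lookup in a bare except, means an unexpected cell shape yields None without hiding unrelated errors.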
