Question

My dataset looks like this:

    film_title      writers        actors

0                                  Leonardo Dicaprio, Jason Statham, Dwayne Johnson...
1                                  Jack Nicholson, Robert De Niro, Denzel Washington...
2                                  Jack Nicholson, Jason Statham, Dwayne Johnson...

The '...' means the cell contains more actors. I am trying to put all the actors into a single list, without duplicates. So far I have the following code:

actorsList = df_final.actors.str.split(', ') #which splits the cells into multiple lists

#print(actorsList) will print this:
['Leonardo Dicaprio', 'Jason Statham', 'Dwayne Johnson'...]
['Jack Nicholson', 'Robert De Niro', 'Denzel Washington'...]
['Jack Nicholson', 'Jason Statham', 'Dwayne Johnson'...]

So

print(actorsList[0]) #will print the first list: ['Leonardo Dicaprio', 'Jason Statham', 'Dwayne Johnson'...]

Then I tried to iterate over this list again and store each actor's name (without duplicates, since an actor can appear in several films):

#ITERATE THROUGH ONE LIST
for i in range(len(actorsList[0])):
    txt = actorsList[0][i].split(', ')
    print(txt)

This prints something like:

['Leonardo Dicaprio']
['Jason Statham']
['Dwayne Johnson']
and so on
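
As a side note, each element of actorsList[0] is already a single name, so the extra .split(', ') only wraps it in a one-item list. Below is a minimal sketch of collecting names without duplicates from several such lists. It uses made-up sample lists in place of the real column, and the names `rows`, `seen` and `unique_actors` are hypothetical:

```python
# Hypothetical sample data standing in for the split lists shown above.
rows = [
    ["Leonardo Dicaprio", "Jason Statham", "Dwayne Johnson"],
    ["Jack Nicholson", "Robert De Niro", "Denzel Washington"],
    ["Jack Nicholson", "Jason Statham", "Dwayne Johnson"],
]

unique_actors = []  # keeps first-seen order
seen = set()        # fast membership check for names already stored

for names in rows:
    for name in names:
        if name not in seen:
            seen.add(name)
            unique_actors.append(name)

print(unique_actors)
```

The set only tracks what has been seen; the list is what preserves the order in which actors first appear.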

I am trying to do this for every list, but I end up with this error:

       23 for i in range(len(actorsList)-1):
  ---> 24     for j in range(len(actorsList[i])):
       25         txt = actorsList[i][j].split(', ')
       26         print(txt)

       TypeError: object of type 'float' has no len()

I should also mention that it does run and print results for a while, but then it stops with this error.
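
A likely cause, though this is an assumption since the full data is not shown: at least one row has a missing value in the actors column. pandas represents a missing value as NaN, which is a float, and .str.split leaves it as NaN instead of producing a list, so len() fails when the loop reaches that row. That would also explain why some rows print before the error. The sketch below reproduces this on a small hypothetical DataFrame and shows one possible way to build the unique list by dropping missing cells first:

```python
import pandas as pd

# Hypothetical stand-in for df_final; the second row has a missing actors cell.
df_final = pd.DataFrame({
    "actors": [
        "Leonardo Dicaprio, Jason Statham, Dwayne Johnson",
        float("nan"),
        "Jack Nicholson, Jason Statham, Dwayne Johnson",
    ]
})

actors_split = df_final["actors"].str.split(", ")
# The missing cell stays NaN (a float), so len() on it raises TypeError.
print(type(actors_split[1]))

unique_actors = (
    df_final["actors"]
    .dropna()          # skip rows with no actors
    .str.split(", ")   # each cell becomes a list of names
    .explode()         # one name per row
    .unique()          # drop duplicates, keeping first-seen order
    .tolist()
)
print(unique_actors)
```

Separately, range(len(actorsList)-1) stops one row early, so the original loop would skip the last film even without the error.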

pandas and Python TypeError: object of type 'float' has no len()

0 answers: