Question

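I am trying to use pandas to create a DataFrame from a downloaded .csv file. Every time I try to build my predictors DataFrame, it blanks out one of the columns I am looking for. I downloaded the .csv file from here: https://perso.telecom-paristech.fr/eagan/class/igr204/datasets It is the fourth file, named "film.csv".

I have done this the following way before with other datasets, and it worked perfectly. This time my data is being removed, and I don't know why.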

import pandas as pd

file=pd.read_csv('film.csv',sep=';',encoding="ISO 8859-1")
#print(file)
df=pd.DataFrame(file)

df=df.dropna(axis=0,how='any')

predictors=pd.DataFrame(df.Director,df.Length)
#prints directors as NaN
print(predictors)

#prints both columns fully
print(df.Director)
print(df.Length)

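Printing the predictors DataFrame above prints the Length column correctly, but every value in the Director column is NaN. All I want is a DataFrame with just the two columns Director and Length. Any help would be greatly appreciated!

Edit:

These are the first 10 lines of the csv file.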

Year;Length;Title;Subject;Actor;Actress;Director;Popularity;Awards
INT;INT;STRING;CAT;CAT;CAT;CAT;INT;BOOL;STRING
1990;111;Tie Me Up! Tie Me Down!;Comedy;Banderas, Antonio;Abril, Victoria;Almodóvar, Pedro;68;No
1991;113;High Heels;Comedy;Bosé, Miguel;Abril, Victoria;Almodóvar, Pedro;68;No
1983;104;Dead Zone, The;Horror;Walken, Christopher;Adams, Brooke;Cronenberg, David;79;No
1979;122;Cuba;Action;Connery, Sean;Adams, Brooke;Lester, Richard;6;No
1978;94;Days of Heaven;Drama;Gere, Richard;Adams, Brooke;Malick, Terrence;14;No
1983;140;Octopussy;Action;Moore, Roger;Adams, Maud;Glen, John;68;No
1984;101;Target Eagle;Action;Connors, Chuck;Adams, Maud;Loma, José Antonio de la;14;No
1989;99;American Angels: Baptism of Blood, The;Drama;Bergen, Robert D.;Adams, Trudy;Sebastian, Beverly;28;No

Answer 1

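The problem is in this line:

predictors=pd.DataFrame(df.Director,df.Length)

The second positional argument of pd.DataFrame is index, so df.Length is used as the row labels rather than as a second column. The Director values are then realigned to those labels, and because the labels do not match the original index, they come out as NaN.

To create a new DataFrame from the old one, use something like:

predictors=df[['Director','Length']]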
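The NaN values can be reproduced without the file. The sketch below is a minimal illustration, assuming a small hand-built DataFrame as a hypothetical stand-in for film.csv (three rows copied from the sample in the question). It also assumes Length holds strings, which is what the INT;INT;STRING type row in the file would likely cause. With integer lengths the same misalignment happens, although a few labels could match the index by coincidence.

```python
import pandas as pd

# Hypothetical stand-in for a few rows of film.csv (values from the sample in the question).
# Length is stored as strings here; that is an assumption about how the real file is parsed.
df = pd.DataFrame(
    {
        "Length": ["111", "113", "104"],
        "Director": ["Almodóvar, Pedro", "Almodóvar, Pedro", "Cronenberg, David"],
    }
)

# The second positional argument of pd.DataFrame is `index`, not a second column.
# The Director Series is realigned to the labels "111", "113", "104",
# none of which exist in its own index (0, 1, 2), so every value becomes NaN.
broken = pd.DataFrame(df.Director, df.Length)

# Selecting with a list of column names keeps both columns and their values.
predictors = df[["Director", "Length"]]

print(broken)
print(predictors)
```

In the first printout Length appears on the left only because it has become the index, which is why it looks like a correctly filled column next to an empty Director column.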
