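I am trying to read SPSS files (.sav). The code below reads .sav files, but I have run into a very strange error: when I try to read another .sav file, it fails with the following error: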
Traceback (most recent call last):
  File "C:\Users\fatihshen\Documents\Merjek Project\Predictive_Analytics\sav_reader.py", line 28, in <module>
    read_spss_file(file_path)
  File "C:\Users\fatihshen\Documents\Merjek Project\Predictive_Analytics\sav_reader.py", line 10, in read_spss_file
    records = reader.all()
  File "C:\Users\fatihshen\AppData\Local\Programs\Python\Python36-32\lib\site-packages\savReaderWriter\savReaderNp.py", line 541, in all
    return self.to_structured_array(filename)
  File "C:\Users\fatihshen\AppData\Local\Programs\Python\Python36-32\lib\site-packages\savReaderWriter\savReaderNp.py", line 122, in _convert_datetimes
    array = func(self, *args)
  File "C:\Users\fatihshen\AppData\Local\Programs\Python\Python36-32\lib\site-packages\savReaderWriter\savReaderNp.py", line 148, in _convert_missings
    array = func(self, *args)
  File "C:\Users\fatihshen\AppData\Local\Programs\Python\Python36-32\lib\site-packages\savReaderWriter\savReaderNp.py", line 531, in to_structured_array
    array = np.fromiter(self, self.trunc_dtype, self.nrows)
  File "C:\Users\fatihshen\AppData\Local\Programs\Python\Python36-32\lib\site-packages\savReaderWriter\helpers.py", line 17, in fget_memoized
    setattr(self, attr_name, fget(self))
  File "C:\Users\fatihshen\AppData\Local\Programs\Python\Python36-32\lib\site-packages\savReaderWriter\savReaderNp.py", line 376, in trunc_dtype
    return np.dtype(obj)
ValueError: title already used as a name or title.
Here is my code:
import savReaderWriter as spss
import pandas as pd

my_df = None

def read_spss_file(file_name):
    # Store the result in the module-level variable my_df
    global my_df
    with spss.SavReaderNp(file_name) as reader:
        # Read all records into a NumPy structured array
        records = reader.all()
    my_df = pd.DataFrame(records)

file_path = "dataset/child_abilities.sav"
read_spss_file(file_path)
print(my_df)
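The .sav file opens fine in SPSS. However, some .sav files fail with this Python code (the same code works for most other .sav files).
Here is the file you can use: child abilities
Any idea what is going on here? I would appreciate your help.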
Answer 0 (score: 1)
There is a simpler way to read an SPSS file into a pd.DataFrame: the pd.read_spss(filepath) function. It can handle your file.
import pandas as pd
file_path = "./child_abilities.sav"
df = pd.read_spss(file_path)
Note that you must have pyreadstat installed.
% pip install pyreadstat
or
% conda install -c conda-forge pyreadstat
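As for why the original code fails: the traceback ends in np.dtype(obj), so the ValueError is raised by NumPy itself. NumPy raises it when a field title in a structured dtype collides with a name or title that is already in use. A plausible cause, which I have not verified against child_abilities.sav, is that savReaderWriter passes the SPSS variable labels as titles and two variables in this file share a label. The sketch below only reproduces that NumPy behavior; build_dtype and try_build are hypothetical helper names, and the variable names and labels are made up for illustration.

```python
import numpy as np


def build_dtype(names, labels):
    """Build a structured dtype, using labels as field titles (illustrative only)."""
    spec = {
        "names": list(names),
        "formats": ["f8"] * len(names),
        "titles": list(labels),
    }
    return np.dtype(spec)


def try_build(names, labels):
    """Return NumPy's error message if the spec is rejected, else None."""
    try:
        build_dtype(names, labels)
    except ValueError as exc:
        return str(exc)
    return None


# Distinct labels: NumPy accepts the dtype.
ok = try_build(["v1", "v2"], ["Reading score", "Math score"])

# Two variables sharing one label (assumed scenario): NumPy rejects the dtype.
clash = try_build(["v1", "v2"], ["Score", "Score"])

print(ok)
print(clash)
```

If this is indeed the cause, it would also explain why the same code works for most other .sav files: it only breaks when labels collide.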