Question

所以我发现这个页面很好地解释了不同类型的变量： http://www.spss-tutorials.com/spss-variable-types-and-formats/

我想知道，当我导出数据时，数字和字符串类型是如何区分的？数字和字符串是否映射到任何代码？

我想用Python解析SPSS数据。

Answer 1

如果你注意到下面代码生成的打印输出，你会发现字符串变量被“导出”到字符串（不足为奇），数字变量被转换/导出到浮点数。

日期变量也会转换为浮点数，日期表示为自1582年10月14日以来经过的秒数 - 这是SPSS存储日期变量的方式，但在SPSS中则有{{3}可以将日期变量设置为显示为（当然内部存储的浮点值保持不变）。

输入文件变量格式：

enter image description here

输入文件数据视图

enter image description here

将SPSS数据读入Python并打印结果的代码：

get file="C:\Program Files\IBM\SPSS\Statistics\23\Samples\English\Employee data.sav".
begin program. 
import spss, spssdata 
allfiles = spssdata.Spssdata().fetchall() 
print "\n".join([str(i) for i in allfiles])
end program.

<强>输出：

namedTuple(1.0, u'm  ', 11654150400.0, 15.0, 3.0, 57000.0, 27000.0, 98.0, 144.0, 0.0)
namedTuple(2.0, u'm  ', 11852956800.0, 16.0, 1.0, 40200.0, 18750.0, 98.0, 36.0, 0.0)
namedTuple(3.0, u'f  ', 10943337600.0, 12.0, 1.0, 21450.0, 12000.0, 98.0, 381.0, 0.0)
namedTuple(4.0, u'f  ', 11502518400.0, 8.0, 1.0, 21900.0, 13200.0, 98.0, 190.0, 0.0)
namedTuple(5.0, u'm  ', 11749363200.0, 15.0, 1.0, 45000.0, 21000.0, 98.0, 138.0, 0.0)
namedTuple(6.0, u'm  ', 11860819200.0, 15.0, 1.0, 32100.0, 13500.0, 98.0, 67.0, 0.0)
namedTuple(7.0, u'm  ', 11787552000.0, 15.0, 1.0, 36000.0, 18750.0, 98.0, 114.0, 0.0)
namedTuple(8.0, u'f  ', 12103948800.0, 12.0, 1.0, 21900.0, 9750.0, 98.0, 0.0, 0.0)
namedTuple(9.0, u'f  ', 11463897600.0, 15.0, 1.0, 27900.0, 12750.0, 98.0, 115.0, 0.0)
namedTuple(10.0, u'f  ', 11465712000.0, 12.0, 1.0, 24000.0, 13500.0, 98.0, 244.0, 0.0)
namedTuple(11.0, u'f  ', 11591424000.0, 16.0, 1.0, 30300.0, 16500.0, 98.0, 143.0, 0.0)
namedTuple(12.0, u'm  ', 12094012800.0, 8.0, 1.0, 28350.0, 12000.0, 98.0, 26.0, 1.0)
namedTuple(13.0, u'm  ', 11920867200.0, 15.0, 1.0, 27750.0, 14250.0, 98.0, 34.0, 1.0)
namedTuple(14.0, u'f  ', 11561529600.0, 15.0, 1.0, 35100.0, 16800.0, 98.0, 137.0, 1.0)
namedTuple(15.0, u'm  ', 11987654400.0, 12.0, 1.0, 27300.0, 13500.0, 97.0, 66.0, 0.0)
...
...

如何在SPSS中确定数字和字符串变量？

1 个答案: