Question

I'm having trouble converting multiple .dta files to .csv at once using pandas in Python. Can you help me solve this? The .dta files are spread across four different folders.

Answer 1

The pandas.io module provides a read_stata function: http://pandas.pydata.org/pandas-docs/dev/generated/pandas.io.stata.read_stata.html

This reads a single Stata file into a DataFrame. From there, you can use the DataFrame's .to_csv method to save a new file in the format you need.
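For a single file, the read-then-write step might look like the sketch below. The helper name dta_to_csv and the choice of index=False (which leaves the DataFrame's row index out of the CSV) are assumptions for illustration, not something the question specified.

```python
import pandas as pd


def dta_to_csv(dta_path, csv_path):
    """Read one Stata file and write its contents out as CSV."""
    df = pd.read_stata(dta_path)
    # index=False omits the DataFrame's row index from the output file
    # (an assumption here; drop the argument if you want the index kept).
    df.to_csv(csv_path, index=False)
    return df
```

Calling dta_to_csv('survey.dta', 'survey.csv') with your own paths would write the CSV next to wherever you point it; the file names here are placeholders.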

As for getting all the data in your directories, I think your fastest path forward looks something like this (untested):

import glob
import os
import pandas

# list every folder that holds .dta files (placeholder paths; add the rest)
my_directories = ['/path/to/first', '/path/to/second', '/path/to/nth']
for my_dir in my_directories:
    # collect all the Stata files in this folder
    stata_files = glob.glob(os.path.join(my_dir, '*.dta'))
    for dta_path in stata_files:
        # get the file path/name without the ".dta" extension
        file_name, file_extension = os.path.splitext(dta_path)

        # read your data (pass any extra read_stata options here)
        df = pandas.read_stata(dta_path)

        # save the data and never think about stata again :)
        df.to_csv(file_name + '.csv')
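
If some of the .dta files sit in nested subfolders, a variant built on pathlib can walk each folder recursively. This is only a sketch: the function name convert_dta_tree is hypothetical, and index=False is again an assumption about the output you want.

```python
from pathlib import Path

import pandas as pd


def convert_dta_tree(root_dirs):
    """Convert every .dta file under each root folder to a .csv beside it."""
    written = []
    for root in root_dirs:
        # rglob descends into subfolders; sorted() keeps the order stable
        for dta_path in sorted(Path(root).rglob("*.dta")):
            csv_path = dta_path.with_suffix(".csv")
            pd.read_stata(dta_path).to_csv(csv_path, index=False)
            written.append(csv_path)
    return written
```

The function returns the list of CSV paths it wrote, which makes it easy to check that all four folders were actually covered.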
