Dynamically generating DataFrames from another DataFrame

Time: 2019-06-17 14:50:10

Tags: python pandas

I need help generating a dynamic list of files to be opened by pandas.read_csv().

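from os import listdir
from os.path import isfile, join
import pandas as pd

# _FilePath (the folder to scan) is assumed to be defined before this point
_ExtensionToLookFor = '.csv'
# Setting up boolean to check for CSV
_isCsv = None
# Returning all files in folder ** If the folder needs to be changed - FilePath #
_FileReturn = [_Files for _Files in listdir(_FilePath) if isfile(join(_FilePath,_Files))]
_FileReturn = pd.DataFrame(_FileReturn)
_FileReturn.columns = ['Files']
# Returning only CSV files (endswith avoids '.' being treated as a regex wildcard) #
_FileReturn = _FileReturn[_FileReturn['Files'].str.endswith(_ExtensionToLookFor)]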

class SpcFormat:
    def __init__(self,_FileReturn):
        self._FileReturn = _FileReturn
    def __DataframeCreation__(self):
        _FileReturn = self._FileReturn
        # Iterate over the file names, not the DataFrame's column labels
        for i in _FileReturn['Files']:
            StartInt = 1

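I'm having a hard time with this. Near the bottom, I'm trying to iterate over the list and name each DataFrame after its position in the count.

So in pseudocode it should look something like this: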

For Files in _FileReturn:
Create New DataFrame(StartInt) = Pandas.Read_csv(DataFrame(IntPosition)+_ExtensionToLookFor)
StartInt ++ // Add One

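Thanks!

*Edit: For clarity, what I want to do is check a folder, return all files in it, filter by a specific file type, and then dynamically create DataFrames with a naming format based on the number of CSV files retrieved.*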
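The first half of that workflow (list a folder, keep only one file type) can be sketched as below. This is a minimal illustration rather than the code from the post: the helper name `list_files_with_extension` and the throwaway demo folder are assumptions, and `pathlib` is used in place of `listdir`/`isfile`.

```python
import tempfile
from pathlib import Path


def list_files_with_extension(folder, extension=".csv"):
    """Return the sorted paths of regular files in `folder` whose suffix is `extension`."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == extension.lower()
    )


# Demo on a temporary folder (hypothetical file names, not the real share)
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.csv", "b.CSV", "notes.txt", "csv_report.txt"):
        (Path(tmp) / name).write_text("x,y\n1,2\n")
    found = [p.name for p in list_files_with_extension(tmp)]
```

Comparing the suffix, instead of searching for '.csv' anywhere in the name, keeps files such as `csv_report.txt` out of the result.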

_FilePath = r'\\Ezquest\Quality Control\Transend Programs\ConversionTest'
# Returning all files in folder ** If the folder needs to be changed - FilePath #
_FileReturn  = glob(_FilePath + '\\' + '*.csv')
#_FileReturn = [_Files for _Files in listdir(_FilePath) if isfile(join(_FilePath,_Files))]
_FileReturn = pd.DataFrame(_FileReturn)
_FileReturn.columns = ['Files']

# Returning only CSV files #
_Files  = {
        'csv_' + str(_FileReturnName): pd.read_csv(_FileReturn['Files'],sep=',',encoding='latin')
        for _FileReturnName in range(len(_FileReturn['Files']))
          }

The code above incorporates part of @J. Doe's answer, although this is what I get back:

_Files  = {
        'csv_' + str(_FileReturnName): pd.read_csv(_FileReturn['Files'],sep=',',encoding='latin')
        for _FileReturnName in range(len(_FileReturn['Files']))
          }
Traceback (most recent call last):

  File "<ipython-input-3-a7027d4eb492>", line 3, in <module>
    for _FileReturnName in range(len(_FileReturn['Files']))

  File "<ipython-input-3-a7027d4eb492>", line 3, in <dictcomp>
    for _FileReturnName in range(len(_FileReturn['Files']))

  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 702, in parser_f
    return _read(filepath_or_buffer, kwds)

  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 413, in _read
    filepath_or_buffer, encoding, compression)

  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\common.py", line 232, in get_filepath_or_buffer
    raise ValueError(msg.format(_type=type(filepath_or_buffer)))

ValueError: Invalid file path or buffer object type: <class 'pandas.core.series.Series'>

Any further help would be greatly appreciated!

1 Answer:

Answer 0: (score: 1)

You could consider the following:

import pandas as pd
from glob import glob

path = 'path/to/csv/folder/'
# get all CSV files in the folder
files = glob(path + '*.csv')
# build a dictionary with one DataFrame per CSV file, keyed 'csv_0', 'csv_1', ...
df_dict = {'csv_' + str(k): pd.read_csv(files[k], sep=',', encoding='latin')
           for k in range(len(files))}
# then access each DataFrame by its key, e.g. df_dict['csv_0']
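The ValueError in the question arises because `_FileReturn['Files']` is the entire Series of paths, while `pd.read_csv` expects a single path (or buffer) per call. Below is a minimal sketch of reading one path per iteration, both from a plain list and from a DataFrame column like the one in the question. The helper name `load_csvs` and the temporary demo files are assumptions made for illustration.

```python
import tempfile
from glob import glob
from pathlib import Path

import pandas as pd


def load_csvs(folder, **read_csv_kwargs):
    """Read every *.csv in `folder` into a dict keyed 'csv_0', 'csv_1', ..."""
    files = sorted(glob(str(Path(folder) / "*.csv")))
    return {
        f"csv_{k}": pd.read_csv(path, **read_csv_kwargs)
        for k, path in enumerate(files)
    }


# Demo with two small files in a temporary folder (hypothetical data)
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "first.csv").write_text("a,b\n1,2\n")
    Path(tmp, "second.csv").write_text("a,b\n3,4\n5,6\n")

    df_dict = load_csvs(tmp, sep=",", encoding="latin")

    # If the paths already live in a DataFrame column, iterate over the
    # column so that read_csv receives one string at a time, not the Series.
    file_frame = pd.DataFrame({"Files": sorted(glob(str(Path(tmp) / "*.csv")))})
    from_frame = {
        f"csv_{k}": pd.read_csv(path, sep=",", encoding="latin")
        for k, path in enumerate(file_frame["Files"])
    }
```

Sorting the paths makes the `csv_<n>` numbering reproducible, since `glob` does not guarantee an order.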