Question

I'm trying to learn Python / pandas. I'm working through "Foundations for Analytics with Python" but have run into trouble.

Using

input_file = sys.argv[1]

gives the result

File "C:\Users\longr\Desktop\pfile\1excel_introspect_workbook.py", line 11
    input_file = sys.argv[1]
IndexError: list index out of range

In previous exercises, replacing this call with

input_file = 'supplier_data.csv'

worked... [for a csv file]. I also tried the source code from GitHub and got the same error. All my files [.py / .xlsx / .csv] are in C:\Users\longr\Desktop\pfile\.... but I'm at a loss.

Can anyone help?


import sys
from xlrd import open_workbook

input_file = sys.argv[1]

workbook = open_workbook(input_file)
print('Number of worksheets:', workbook.nsheets)
for worksheet in workbook.sheets():
    print("Worksheet name:", worksheet.name, "\tRows:", worksheet.nrows, "\tColumns:", worksheet.ncols)

Answer 1

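sys.argv[1] is the first command-line argument you type when running the script, right after the Python file name (sys.argv[0] is the script name itself). Suppose your script is named example.py; you would normally run it like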

python example.py

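Run that way, sys.argv holds only the script name, so sys.argv[1] raises IndexError. If you want to get the csv file as argv[1], you need to run the script like this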

python example.py supplier_data.csv

Now argv[0] == 'example.py' and argv[1] == 'supplier_data.csv', both as strings.
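
If you also want the script to keep working when it is started without an argument (for example from an IDE's Run button), one option is to check the length of sys.argv before indexing it. This is only a minimal sketch: the helper name pick_input_file and the fallback to 'supplier_data.csv' are my own assumptions for illustration, not something from the book.

```python
import sys

# Fallback file name; taken from the question, used here only as an assumed default.
DEFAULT_INPUT = "supplier_data.csv"


def pick_input_file(argv, default=DEFAULT_INPUT):
    """Return argv[1] if a command-line argument was given, else the default."""
    # argv[0] is always the script name; real arguments start at index 1.
    if len(argv) > 1:
        return argv[1]
    return default


if __name__ == "__main__":
    input_file = pick_input_file(sys.argv)
    print("Reading from:", input_file)
```

With this in place, both python example.py and python example.py supplier_data.csv run without the IndexError.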

Answer 2

After further hunting, I found this: https://www.youtube.com/watch?v=kWaerL6-OiU, which solved my problem of reading in multiple Excel worksheets

#import numpy as np
import pandas as pd
import glob

#### Combine (concatenate) multiple Excel files in a given folder into one DataFrame; each Excel file may have multiple sheets.
#### All sheets in a single Excel file are first combined into one DataFrame, then the DataFrames for all the Excel workbooks in the folder
#### are combined into a single DataFrame. The combined DataFrame can then be exported to a single Excel sheet.


#path = r'C:\Users\Tchamna\Downloads\UTRC_DATA\495GowanusSpeedData20152016'
path = r'C:\Users\Tchamna\Downloads\UTRC_DATA\test'

filenames = glob.glob(path + "/*.xlsx")
print(filenames)

### Dataframe Initialization
concat_all_sheets_all_files = pd.DataFrame()


for file in filenames:

        ### Get all the sheets in a single Excel file using the pd.read_excel command, with sheet_name=None
        ### Note that the result is a dictionary mapping each sheet name to its DataFrame
        ### Help can be found here: https://pandas.pydata.org/pandas-docs...

        df = pd.read_excel(file, sheet_name=None, skiprows=None,nrows=None,usecols=None,header = 0,index_col=None)
        #df = pd.read_excel(file, sheet_name=None, skiprows=0,nrows=34,usecols=105,header = 9,index_col=None)

        #print(df)

        ### Use pd.concat command to Concatenate pandas objects as a Single Table.
        concat_all_sheets_single_file = pd.concat(df,sort=False)



        ### Stack this file's combined sheets under the data gathered from the previous files.
        ### DataFrame.append was removed in pandas 2.0, so pd.concat is used here instead.

        concat_all_sheets_all_files = pd.concat([concat_all_sheets_all_files, concat_all_sheets_single_file], sort=False)
        #print(concat_all_sheets_all_files)
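
To see what pd.concat does with the dictionary that read_excel returns when sheet_name=None, here is a small sketch that needs no Excel files. The sheet names, column names, and values are made up for illustration, and the in-memory dictionary is only a stand-in for the real read_excel result.

```python
import pandas as pd

# Stand-in for pd.read_excel(file, sheet_name=None): a dict of sheet name -> DataFrame.
# Sheet names, columns, and values are hypothetical.
sheets = {
    "Sheet1": pd.DataFrame({"supplier": ["A", "B"], "cost": [10, 20]}),
    "Sheet2": pd.DataFrame({"supplier": ["C"], "cost": [30]}),
}

# Concatenating a dict stacks the frames and uses the dict keys as the outer index level.
combined = pd.concat(sheets, names=["sheet", "row"], sort=False)

# Move the sheet name out of the index into an ordinary column and renumber the rows.
flat = combined.reset_index(level="sheet").reset_index(drop=True)
print(flat)
```

Keeping the sheet name as a column makes it possible to tell afterwards which worksheet each row came from.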

Problem importing a csv file into pandas: how to avoid "IndexError: list index out of range"
