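I am using pandas to read a sheet. After reading the sheet, I get an empty row between the values.
So I need to find the index of that row, delete all the rows below it, and then create a new data frame.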
from xlrd import open_workbook
import pandas as pd
from pandas import ExcelWriter

pathbook = open_workbook("S:\\1. DIRECTORY MASTER\\FINANCIAL RESEARCH\\Data Initiative - PROJECTS\\Market Rollout\\"
                         "Modified Files\\2016\\2016A-3032 - CA.xlsx")
pathbook_sheet = pathbook.sheet_by_name("1-Rollout")
file = "S:\\1. DIRECTORY MASTER\\FINANCIAL RESEARCH\\Data Initiative - PROJECTS\\Market Rollout\\" \
       "Modified Files\\2016\\2016A-3032 - CA.xlsx"

for rowidx in range(pathbook_sheet.nrows):
    row = pathbook_sheet.row(rowidx)
    for colidx, cell in enumerate(row):
        if cell.value == "Canadian Market":
            print("Sheet Name:", pathbook_sheet.name)
            print("Row Number:", rowidx)
            CADvalue = int(rowidx)
            CADvalue += 1
            print(CADvalue)
            reading_book = pd.read_excel(file, sheet_name="1-Rollout",
                                         skiprows=CADvalue, index_col=0).iloc[:12]

write = ExcelWriter("Final" + ".xlsx")
reading_book.to_excel(write, 'Sheet1', index=False)
write.save()
Sample output in the Excel file I get:
Sales                      2016  2017  2018  2019  2020  2021
Units Sold                    0     0     0     4    14    37
Unit Sale Price            1285  1285  1285  1285  1285  1285
Unit Profit                4000  4000  4000  4000  4000  4000

Rest of the World Market
So there is an empty row within the last 3 rows.
Answer 0: (score: 0)
The solution depends on what "empty" means. If it is just an empty string, as in '', then the code to find the index is:
empty = ''
idx_first_empty_row = reading_book.index[reading_book.iloc[:, 0] == empty][0]
This works if the cell in the first column is an empty string. If "empty" instead means NaN, replace that line with:
import numpy as np
idx_first_empty_row = reading_book.index[np.isnan(reading_book.iloc[:, 0])][0]
This works if the dtype of the column is a numeric numpy type, such as np.float64.
If the dtype is not a numeric numpy type, try the following:
# np.where returns positions, so this is a positional index (use it with .iloc)
idx_first_empty_row = np.where(reading_book.iloc[:, 0].isnull().values)[0][0]
Depending on the data type in the column, you can also try the following:
idx_first_empty_row = reading_book.index[reading_book.iloc[:, 0].isnull().values][0]
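To make the dtype distinction above concrete, here is a minimal sketch on three small Series. They are hypothetical stand-ins for the first column of reading_book, not data from the question. It suggests that np.isnan is only usable on float data, that isnull() also handles object (string) data, and that an empty string needs an equality test because it is not NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the first column of reading_book.
numeric_col = pd.Series([1285.0, 4000.0, np.nan, 7.0])
object_col = pd.Series(["Units Sold", "Unit Profit", None, "Rest of the World Market"])
string_col = pd.Series(["Units Sold", "Unit Profit", "", "Rest of the World Market"])

# np.isnan works on float data.
first_nan_numeric = numeric_col.index[np.isnan(numeric_col.to_numpy())][0]

# On object (string) data np.isnan raises TypeError, while isnull() still works.
try:
    np.isnan(object_col.to_numpy())
    isnan_failed = False
except TypeError:
    isnan_failed = True
first_nan_object = object_col.index[object_col.isnull().to_numpy()][0]

# An empty string is not NaN, so it needs an equality test instead.
first_empty_string = string_col.index[(string_col == "").to_numpy()][0]

print(first_nan_numeric, isnan_failed, first_nan_object, first_empty_string)
```

If it is unclear which kind of "empty" the sheet produces, checking reading_book.dtypes first is probably the quickest way to choose between these variants.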
Answer 1: (score: 0)
# First, find NaN entries in the column at position 1
blank_row_bool = reading_book.iloc[:, 1].isna()
# Next, get the position of the first NaN entry
blank_row_index = [i for i, x in enumerate(blank_row_bool) if x][0]
# Finally, restrict the dataframe to the rows before the first NaN entry
# (iloc[:n] already excludes row n, so no extra -1 is needed)
reading_book = reading_book.iloc[:blank_row_index]
Or, in one line:
reading_book = reading_book.iloc[:[i for i, x in enumerate(reading_book.iloc[:, 1].isna()) if x][0]]
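One limitation of the approach above is that taking element [0] fails with an IndexError when the column has no NaN at all. A minimal sketch of a guarded variant follows. The helper name truncate_at_first_blank and the sample frame are hypothetical, made up here to mirror the sample output in the question, and the frame is assumed to have NaN in the blank row.

```python
import numpy as np
import pandas as pd


def truncate_at_first_blank(df, col_pos=0):
    """Return the rows of df that come before the first NaN in column col_pos."""
    is_blank = df.iloc[:, col_pos].isna().to_numpy()
    if not is_blank.any():
        return df  # no blank row found, nothing to cut
    first_blank = int(is_blank.argmax())  # position of the first True
    return df.iloc[:first_blank]


# Hypothetical frame mirroring the sample output from the question.
sample = pd.DataFrame(
    {
        "2016": [0, 1285, 4000, np.nan, np.nan],
        "2017": [0, 1285, 4000, np.nan, np.nan],
    },
    index=["Units Sold", "Unit Sale Price", "Unit Profit", np.nan,
           "Rest of the World Market"],
)

trimmed = truncate_at_first_blank(sample)
print(trimmed)
```

Here the blank row sits at position 3, so iloc[:3] keeps the three rows above it. A frame without any blank row is returned unchanged.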