Question

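I have a piece of code that looks at every character in lines 38 through 193 of "STIF" that equals '¡' and checks whether its position (as an integer Row_Col) is in data['Row_Col']. If it is, the character stays '¡' and its row in data is recorded under "Build" (in the code below, by setting data['Build'] to 1 for that row). Otherwise, the character is changed to '*'.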

STIF = ...
data = pd.DataFrame(...)

for row in range(38,193,1):
    char_row = list(STIF[row])
    for col in range(0,157,1):
        Row_Col = ((row + row_offset) * 1000) + (col + col_offset)
        if char_row[col] == '¡'.decode('windows-1252'):
            if Row_Col in data['Row_Col'].values:
                char_row[col] = '¡'.decode('windows-1252')
                data['Build'].loc[data['Row_Col'] == Row_Col] = 1
            else:
                char_row[col] = '*'.decode('windows-1252')
    char_row[155] = '\r'
    char_row[156] = '\n'
    STIF[row] = u''.join(char_row)

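There are ~24,000 characters in STIF and ~14,000 rows in data. This code looks like it should execute very quickly, but it takes a long time to run. What is slowing it down?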
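Two candidates stand out, though neither has been profiled here: `Row_Col in data['Row_Col'].values` scans the whole array on every '¡' character, and `data['Row_Col'] == Row_Col` builds a boolean mask over all rows of data on every match. Below is a minimal sketch of a variant that avoids both, written for Python 3 and assuming that `lines` is a mutable list of strings standing in for STIF and that data already has a "Build" column. The function name `mark_builds` and its parameters are hypothetical, and the `'\r'`/`'\n'` padding at columns 155 and 156 is left out for brevity.

```python
import pandas as pd

MARK = "\u00a1"  # the inverted exclamation mark

def mark_builds(lines, data, row_offset, col_offset,
                first_row=38, last_row=193, width=157):
    """Hypothetical helper: same marking logic, with cheaper lookups."""
    # Build the lookup once; set membership is O(1) on average,
    # whereas `x in ndarray` is a linear scan each time.
    known = set(data["Row_Col"].tolist())
    matched = set()

    for row in range(first_row, last_row):
        chars = list(lines[row])
        for col in range(min(width, len(chars))):
            if chars[col] != MARK:
                continue
            row_col = (row + row_offset) * 1000 + (col + col_offset)
            if row_col in known:
                matched.add(row_col)  # remember it, update data later
            else:
                chars[col] = "*"
        lines[row] = "".join(chars)

    # One vectorized update instead of one boolean mask per match.
    data.loc[data["Row_Col"].isin(list(matched)), "Build"] = 1
    return lines, data
```

If this is the right diagnosis, the work drops from roughly (number of '¡' characters) x (rows in data) comparisons to a single pass over the text plus one `isin` call. Using `data.loc[mask, "Build"]` also avoids the chained assignment in `data['Build'].loc[...] = 1`, which pandas may apply to a copy.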

Thanks in advance for any suggestions!

Python text analysis script is slow

0 answers: