Question

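I have a question about the speed of my code. I came up with a solution, but it is very slow. Does anyone have an idea how to make it smarter/faster?

I have an original df (about 100k rows). I want to multiply each row 4x (copy the row) and then just change one column, Stats1 (for example, add 1). Stats1 is always 5 in the original df.

DF: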

      Stats 1  Stats 2  Stats 3  Stats 4  Stats 5
Row 1       5        5        8        7        3
Row 2       5        8        3        7        9
Row 3       5        5        1        2        6

Output:
      Stats 1  Stats 2  Stats 3  Stats 4  Stats 5
Row 1       5        5        8        7        3
Row 1       6        5        8        7        3
Row 1       7        5        8        7        3
Row 1       8        5        8        7        3
Row 2       5        8        3        7        9
Row 2       6        8        3        7        9
Row 2       7        8        3        7        9
Row 2       8        8        3        7        9
Row 3       5        5        1        2        6
Row 3       6        5        1        2        6
Row 3       7        5        1        2        6
Row 3       8        5        1        2        6

This code works, but it is very slow.

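import pandas as pd

new_df = pd.DataFrame()
for i in range(len(df)):
    # Repeat row i four times (pd.concat replaces DataFrame.append, which pandas 2.0 removed)
    new = pd.concat([df.iloc[[i]]] * 4, ignore_index=True)
    for j in range(4):
        # Set Stats1 to 5, 6, 7, 8 in turn; a single .loc call avoids chained assignment
        new.loc[j, "Stats1"] = 5 + j
    # Growing new_df inside the loop copies it every time, which is what makes this slow
    new_df = pd.concat([new_df, new])
new_df.reset_index(inplace=True, drop=True)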

Thanks

Answer 1

Take a look at this:

df = pd.DataFrame(data={'Stats 1': [5, 5, 5],
                        'Stats 2': [5, 8, 5],
                        'Stats 3': [8, 3, 1],
                        'Stats 4': [7, 7, 2],
                        'Stats 5': [3, 9, 6]},
                  index=pd.Index(data=['Row 1', 'Row 2', 'Row 3']))

This code should run faster:

df_new = pd.concat([df] * 4).sort_index()

generator = (i for i in range(0, 4))
col = pd.Series(generator)

df_new.reset_index(inplace=True, drop=True)
df_new['Stats 1'] = df_new['Stats 1'] + pd.concat([col] * int(len(df_new) / 4)).reset_index(drop=True)

Result:

      Stats 1  Stats 2  Stats 3  Stats 4  Stats 5
    0       5        5        8        7        3
    1       6        5        8        7        3
    2       7        5        8        7        3
    3       8        5        8        7        3
    4       5        8        3        7        9
    5       6        8        3        7        9
    6       7        8        3        7        9
    7       8        8        3        7        9
    8       5        5        1        2        6
    9       6        5        1        2        6
    10      7        5        1        2        6
    11      8        5        1        2        6
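
The same idea can also be sketched with NumPy, which keeps the original row labels instead of resetting the index. This is only an illustration under assumptions: the names `n_copies` and `offsets` are made up for the example, and the approach has not been timed against the real 100k-row data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"Stats 1": [5, 5, 5],
     "Stats 2": [5, 8, 5],
     "Stats 3": [8, 3, 1],
     "Stats 4": [7, 7, 2],
     "Stats 5": [3, 9, 6]},
    index=["Row 1", "Row 2", "Row 3"],
)

n_copies = 4  # hypothetical name: how many copies of each row to produce

# Positions 0,0,0,0,1,1,1,1,... so the copies of each row stay adjacent.
positions = np.repeat(np.arange(len(df)), n_copies)
df_new = df.iloc[positions].copy()

# Offsets 0,1,2,3 repeated once per original row, added by position.
offsets = np.tile(np.arange(n_copies), len(df))
df_new["Stats 1"] = df_new["Stats 1"].to_numpy() + offsets

print(df_new)
```

Working with positions rather than labels means the sketch does not depend on the index being unique or sorted.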

Hope this helps!

Answer 2

If the order of the output rows doesn't matter, you can try the following. The spaces in the column names have to be removed for this to work with the keyword arguments in assign.

df.columns = [name.replace(' ', '') for name in df.columns]
new_df = pd.concat([df.assign(Stats1=lambda x: x.Stats1 + i) for i in range(4)])

Basically, we concatenate the same dataframe 4 times, but on each iteration we increase the Stats1 column by the current increment. You will have to compare the performance on your original dataset.
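
One possible way to run that comparison is sketched below. It uses synthetic data as a stand-in for the real dataset (100,000 rows, random values, Stats1 fixed at 5), and the function names `with_assign` and `with_sorted_concat` are made up for the example. Timings will differ by machine and pandas version, so none are quoted here.

```python
import time

import numpy as np
import pandas as pd

# Synthetic stand-in for the real data (an assumption, not the asker's dataset).
n_rows = 100_000
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.integers(1, 10, size=(n_rows, 5)),
    columns=["Stats1", "Stats2", "Stats3", "Stats4", "Stats5"],
)
df["Stats1"] = 5


def with_assign(frame):
    # Four shifted copies stacked one after another (rows are not interleaved).
    return pd.concat([frame.assign(Stats1=lambda x: x.Stats1 + i) for i in range(4)])


def with_sorted_concat(frame):
    # Four plain copies, sorted so copies of a row are adjacent, then offset 0..3.
    out = pd.concat([frame] * 4).sort_index().reset_index(drop=True)
    out["Stats1"] = out["Stats1"] + np.tile(np.arange(4), len(frame))
    return out


for func in (with_assign, with_sorted_concat):
    start = time.perf_counter()
    result = func(df)
    elapsed = time.perf_counter() - start
    print(f"{func.__name__}: {elapsed:.3f} s, {len(result)} rows")
```

For steadier numbers, each function could be run several times, for example with the standard `timeit` module.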

Faster way to create a new df using a for loop and append

2 answers: