Iterating over a pandas DataFrame

Time: 2018-11-25 09:11:31

Tags: python-3.x pandas loops numpy na

I have a strange problem: the result does not change from one iteration to the next. The code is as follows:

import pandas as pd
import numpy as np

X = np.arange(10,100)
Y = X[::-1]
Z = np.array([X,Y]).T

df = pd.DataFrame(Z ,columns = ['col1','col2'])
dif = df['col1'] - df['col2']

for gap in range(100):
    Up = dif > gap
    Down = dif < -gap

    df.loc[Up,'predict'] = 'Up'
    df.loc[Down,'predict'] = 'Down'

    df_result = df.dropna()
    Total = df.shape[0]
    count = df_result.shape[0]
    ratio = count/Total
    print(f'Total: {Total}; count: {count}; ratio: {ratio}')

The result is always

Total: 90; count: 90; ratio: 1.0

That is not what I expect. Thanks in advance.

1 Answer:

Answer 0 (score: 1)

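I found the root of the problem 5 minutes after posting this question. The 'predict' column persists across iterations: with gap = 0 every row gets labelled (dif is always odd here, so it is never 0), and later iterations only overwrite a subset of rows with the same labels, so dropna() never removes anything. I just needed to reset the DataFrame to its original version at the start of each iteration.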

import pandas as pd
import numpy as np

X = np.arange(10,100)
Y = X[::-1]
Z = np.array([X,Y]).T

df = pd.DataFrame(Z ,columns = ['col1','col2'])
df2 = df.copy()  # added this line to preserve the original df
dif = df['col1'] - df['col2']

for gap in range(100):
    df = df2.copy()  # reset the altered df back to the original
    Up = dif > gap
    Down = dif < -gap

    df.loc[Up,'predict'] = 'Up'
    df.loc[Down,'predict'] = 'Down'

    df_result = df.dropna()
    Total = df.shape[0]
    count = df_result.shape[0]
    ratio = count/Total
    print(f'Total: {Total}; count: {count}; ratio: {ratio}')
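
As an alternative to copying the DataFrame on every iteration, here is a minimal sketch that builds a fresh label Series for each gap, so nothing can carry over from the previous one. It assumes the same data as above; `labelled_ratio` is a hypothetical helper name introduced for illustration, not part of pandas.

```python
import numpy as np
import pandas as pd

X = np.arange(10, 100)
Y = X[::-1]
df = pd.DataFrame({"col1": X, "col2": Y})
dif = df["col1"] - df["col2"]


def labelled_ratio(dif, gap):
    """Return (total, count, ratio) of rows labelled 'Up' or 'Down' for one gap.

    Hypothetical helper: a new Series is created on every call, so labels
    from an earlier gap are never reused.
    """
    predict = pd.Series(np.nan, index=dif.index, dtype=object)
    predict[dif > gap] = "Up"
    predict[dif < -gap] = "Down"
    total = len(predict)
    count = int(predict.notna().sum())  # rows that received a label
    return total, count, count / total


results = [labelled_ratio(dif, gap) for gap in range(100)]
for total, count, ratio in results:
    print(f"Total: {total}; count: {count}; ratio: {ratio}")
```

With this version the original df is never modified, so no reset is needed, and the count shrinks as gap grows instead of staying at 90.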