Pandas DataFrame: creating columns with a loop

Time: 2019-11-20 03:06:30

Tags: python pandas

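I am trying to add new columns and fill them with data using a for loop. The loop reads values from the Price column and should insert the first 1000 of them into a new DataFrame column; after 1000 Price values it should start a new column for the next 1000 values, and so on.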

import pandas as pd

import matplotlib.pyplot as plt

data_frame = pd.read_csv('candle_data.csv', names=['Time', 'Symbol','Side', 'Size', 'Price','1','2','3','4','5'])
price_df = pd.DataFrame()
count_tick = 0
count_candle = 0
for price in data_frame['Price']:
    if count_tick < 1000:
        price_df[count_candle] = price
        count_tick +=1
    elif count_tick == 1000:
        count_tick = 0
        count_candle +=1


price_df.head()

1 answer:

Answer 0: (score: 0)

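You don't have to iterate over the DataFrame; slicing does the job. See the sample code below. I loaded a DataFrame with 100 rows and created column "col3" from the first 50 rows of "col1", then column "col4" from the last 50 rows of "col1". You can adapt the code to point at your own columns and the values you need.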

import pandas as pd
import numpy as np

if __name__ == '__main__':
    # Two sample columns of 100 evenly spaced values each
    col1 = np.linspace(0, 100, 100)
    col2 = np.linspace(100, 200, 100)
    data = {'col1': col1, 'col2': col2}
    df = pd.DataFrame(data)
    # col3 takes rows 0-49 of col1, col4 takes rows 50-99
    df['col3'] = df['col1'][0:50]
    df['col4'] = df['col1'][50:100]
    print(df)
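One detail of the slicing example above is worth spelling out: pandas aligns on the index when a Series is assigned to a column, so "col4" holds NaN in rows 0-49 rather than starting at the top. The sketch below illustrates this and shows one possible way to pack the values from row 0, using `reset_index(drop=True)`. The names `nan_count` and `packed` are made up for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': np.linspace(0, 100, 100)})

# Assignment aligns on the index, so rows outside the slice become NaN.
df['col4'] = df['col1'][50:100]
nan_count = int(df['col4'].isna().sum())

# Dropping the original index labels lets both slices start at row 0.
packed = pd.DataFrame({
    'col3': df['col1'][0:50].reset_index(drop=True),
    'col4': df['col1'][50:100].reset_index(drop=True),
})

print(nan_count)
print(packed.shape)
```

Whether the aligned or the packed layout is preferable depends on whether the new columns need to stay lined up with the original rows.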

Solution 2, based on the information added in the comments

import pandas as pd
import numpy as np

if __name__ == '__main__':
    pd.set_option('display.width', 100000)
    pd.set_option('display.max_columns', 500)
    # Partition size; a small value (20) is used for this example
    part_size = 20
    # Generate numbers for the data frame
    col1 = np.linspace(0, 100, 100)
    col2 = np.linspace(100, 200, 100)

    # Create the initial data frame
    data = {'col1': col1, 'col2': col2}
    df = pd.DataFrame(data)
    n_rows = df.shape[0]
    # Number of new columns needed (rounded up so a partial last chunk is kept)
    n_parts = -(-n_rows // part_size)
    new_cols = {}
    # Initialize slicing bounds
    low = 0
    high = part_size
    print(n_rows)
    for i in range(n_parts):
        if high >= n_rows:
            # Last chunk: take everything that is left
            new_cols['col_name_here{0}'.format(i)] = df[low:]['col1']
            break
        else:
            new_cols['col_name_here{0}'.format(i)] = df[low:high]['col1']
            low = high
            high += part_size
    df = df.assign(**new_cols)
    print(df)
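If the goal is the layout described in the question (each block of consecutive Price values in its own column, starting at row 0), an alternative to building the slices in a loop is to reshape the underlying array. This is only a sketch under assumptions: `chunk_into_columns` is a hypothetical helper, the values are assumed to be numeric, and a short last chunk is padded with NaN. A small chunk size of 4 stands in for the 1000 used in the question.

```python
import numpy as np
import pandas as pd

def chunk_into_columns(series, chunk_size):
    """Hypothetical helper: lay consecutive chunks of a Series out as columns."""
    values = series.to_numpy(dtype=float)
    # Ceiling division, so a partial last chunk still gets a column.
    n_chunks = -(-len(values) // chunk_size)
    # Pad with NaN up to a whole number of chunks.
    padded = np.full(n_chunks * chunk_size, np.nan)
    padded[:len(values)] = values
    # Each row of the reshaped array is one chunk; transpose so chunks become columns.
    return pd.DataFrame(padded.reshape(n_chunks, chunk_size).T)

prices = pd.Series(np.arange(10, dtype=float))  # stand-in for data_frame['Price']
candles = chunk_into_columns(prices, 4)
print(candles)
```

With real data, the call would presumably look like `chunk_into_columns(data_frame['Price'], 1000)`; the resulting columns are labelled 0, 1, 2, ... by default.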