Question

I have a pandas Series in which each row holds a different piece of survey text. For example:

df = pd.read_csv('survey_data.csv', header=None)

0 a comment
1 another comment
2 this what the person thought
3 what they felt
4 some more

I want to turn the series into a dataframe with three columns and save it as a csv file.

So the new df would be:

a comment       another comment   this what the person thought
what they felt  some more

I don't actually care if the order gets mixed up. I will then output it to a csv file.

I have tried many different approaches, and my current one is:

col_cnt = 1
df.dropna(inplace = True)  # removing null values to avoid errors
new_df = pd.DataFrame()
data = []

for index, row in df.iterrows():
    data.append(row)
    if col_cnt == 3: # we have done the three rows
        new_df.loc[len(new_df)]=list(data[1], data[2], data[3])
        col_cnt = 0
        data = [] # clear the list now that you have written it to the new df
    col_cnt = col_cnt + 1 #increment col counter for next row

    # need to write the remainder somehow

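I get the error: IndexError: list index out of range

Update

This code, which I found and modified, works! But I can only get two columns in the correct order, not the three I want. Changing the 2 in range to 3 returns only one column.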

new_df = pd.DataFrame()

index = 1
for i in range(0, len(df), 2):
    new_df['Column' + str(index)] = df[0].iloc[i:i+3].reset_index(drop=True)
    index += 1

Answer 1

If your dataframe has 6 rows, as shown below:

0 a comment
1 another comment
2 this what the person thought
3 what they felt
4 some more
5 some more

then you can do this to get what you want:

np.reshape(df.values, (-1, 3))

You can use the result to get a dataframe with column names.
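
The reshape above only works when the number of rows is a multiple of 3, and the question's sample has 5 rows. One way to handle that is sketched below. It is not the answerer's code: the sample data, the empty-string padding, and the column names Column1 to Column3 are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the single-column data read from survey_data.csv.
df = pd.DataFrame({0: ["a comment", "another comment",
                       "this what the person thought",
                       "what they felt", "some more"]})

n_cols = 3
values = df[0].dropna().to_numpy(dtype=object)

# reshape needs a length divisible by n_cols, so pad the tail with
# empty strings (an assumed filler; NaN would work as well).
pad = (-len(values)) % n_cols
padded = np.concatenate([values, np.array([""] * pad, dtype=object)])

# Column names are placeholders, not taken from the original post.
new_df = pd.DataFrame(np.reshape(padded, (-1, n_cols)),
                      columns=["Column1", "Column2", "Column3"])
print(new_df)
```

From there, new_df.to_csv('output.csv', index=False) would write the result to disk, where output.csv is a placeholder file name.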
