I have a numpy array of shape (3, 50):
data = np.array([[0, 3, 0, 2, 0, 0, 1, 2, 2, 0, 1, 0, 0, 0, 0, 0, 0, 2, 1, 2, 0, 0,
0, 0, 0, 0, 0, 0, 0, 2, 1, 0, 0, 0, 0, 0, 1, 0, 0, 7, 0, 0, 0, 0,
1, 1, 2, 0, 0, 2],
[0, 0, 0, 0, 0, 3, 0, 1, 6, 1, 1, 0, 0, 0, 0, 2, 0, 0, 1, 0, 1, 0,
3, 0, 0, 0, 0, 0, 0, 5, 2, 2, 2, 1, 0, 0, 1, 0, 1, 3, 2, 0, 0, 0,
0, 0, 2, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 2, 0, 1, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0]])
and the following column names:
new_cols = [f'description_word_{i+1}_count' for i in range(50)]
I am trying to add them as new columns to an existing dataframe like this:
df[new_cols] = data
But I get this error:
KeyError: "None of [Index(['description_word_1_count', 'description_word_2_count',\n 'description_word_3_count', 'description_word_4_count',\n 'description_word_5_count', 'description_word_6_count',\n 'description_word_7_count', 'description_word_8_count',\n 'description_word_9_count', 'description_word_10_count',\n 'description_word_11_count', 'description_word_12_count',\n 'description_word_13_count', 'description_word_14_count',\n 'description_word_15_count', 'description_word_16_count',\n 'description_word_17_count', 'description_word_18_count',\n 'description_word_19_count', 'description_word_20_count',\n 'description_word_21_count', 'description_word_22_count',\n 'description_word_23_count', 'description_word_24_count',\n 'description_word_25_count', 'description_word_26_count',\n 'description_word_27_count', 'description_word_28_count',\n 'description_word_29_count', 'description_word_30_count',\n 'description_word_31_count', 'description_word_32_count',\n 'description_word_33_count', 'description_word_34_count',\n 'description_word_35_count', 'description_word_36_count',\n 'description_word_37_count', 'description_word_38_count',\n 'description_word_39_count', 'description_word_40_count',\n 'description_word_41_count', 'description_word_42_count',\n 'description_word_43_count', 'description_word_44_count',\n 'description_word_45_count', 'description_word_46_count',\n 'description_word_47_count', 'description_word_48_count',\n 'description_word_49_count', 'description_word_50_count'],\n dtype='object')] are in the [columns]"
I also don't understand where it is finding a '\n' character in my column names.
Meanwhile, creating a new dataframe from the same data works fine:
new_df = pd.DataFrame(data=data, columns=new_cols)
Does anyone know what causes the error?
Answer 0 (score: 2)
Suppose you have a df like this:
df = pd.DataFrame({'person': [1,1,1], 'event': ['A','B','C']})
You can then add the new columns like this:
import pandas as pd
import numpy as np
data = np.array([[0, 3, 0, 2, 0, 0, 1, 2, 2, 0, 1, 0, 0, 0, 0, 0, 0, 2, 1, 2, 0, 0,
0, 0, 0, 0, 0, 0, 0, 2, 1, 0, 0, 0, 0, 0, 1, 0, 0, 7, 0, 0, 0, 0,
1, 1, 2, 0, 0, 2],
[0, 0, 0, 0, 0, 3, 0, 1, 6, 1, 1, 0, 0, 0, 0, 2, 0, 0, 1, 0, 1, 0,
3, 0, 0, 0, 0, 0, 0, 5, 2, 2, 2, 1, 0, 0, 1, 0, 1, 3, 2, 0, 0, 0,
0, 0, 2, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 2, 0, 1, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0]])
new_cols = [f'description_word_{i+1}_count' for i in range(50)]
df[new_cols] = pd.DataFrame(data, index=df.index)
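I think the problem is that you are using the syntax for assigning a single Series when you actually need to assign several Series at once, in other words a DataFrame. Wrapping the array in pd.DataFrame with index=df.index gives pandas a value whose rows line up with the existing rows of df.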
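As a complement to the assignment above, here is a minimal sketch of an alternative that avoids assigning to a list of missing column names altogether: build a labelled frame and join it with pd.concat. The small 3x4 array and the names new_part and result are placeholders of mine, not from the question; the same pattern should apply to the (3, 50) array. As for the '\n' in the message, it most likely comes from the line-wrapped repr of the Index being embedded in the exception string, not from the column names themselves.

```python
import numpy as np
import pandas as pd

# Small stand-in for the (3, 50) array from the question (assumption: 3 rows, 4 columns)
data = np.arange(12).reshape(3, 4)
new_cols = [f'description_word_{i+1}_count' for i in range(data.shape[1])]

# Same example frame as in the answer
df = pd.DataFrame({'person': [1, 1, 1], 'event': ['A', 'B', 'C']})

# Build a labelled frame that shares df's index, so rows align one-to-one
new_part = pd.DataFrame(data, columns=new_cols, index=df.index)

# Join column-wise; df itself is left unchanged
result = pd.concat([df, new_part], axis=1)

print(result.shape)
```

Passing index=df.index matters when df does not have the default 0..n-1 index: without it, pandas would align on mismatched labels and fill the new columns with NaN.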