How do I remove square brackets from a DataFrame?

Asked: 2020-06-23 09:07:51

Tags: python-3.x pandas dataframe

I am trying to compare predicted values against actual values.

Edit: this is the initial dataframe.

Here is my attempt:

import pandas as pd
from sklearn import linear_model
    
reg = linear_model.LinearRegression()
reg.fit(df[['Op1', 'Op2', 'S2', 'S3', 'S4', 'S7', 'S8', 'S9', 'S11', 'S12','S13', 'S14', 'S15', 'S17', 'S20', 'S21']], df.unit)

predicted = []
actual = []
for i in range(1,len(df.unit.unique())):
    xp = df[(df.unit == i) & (df.cycles == len(df[df.unit == i].cycles))]
    xa = xp.cycles.values
    xp = xp.values[0,2:].reshape(1,-2)
    predicted.append(reg.predict(xp))
    actual.append(xa)

And to display the dataframe:

data = {'Actual cycles': actual, 'Predicted cycles': predicted }
df_2 = pd.DataFrame(data)


df_2.head()

I get this output:

  Actual cycles      Predicted cycles
0         [192]    [56.7530579842869]
1         [287]   [50.76877712361329]
2         [179]   [42.72575900074571]
3         [189]  [42.876506912637524]
4         [269]   [47.40087182743173]

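Ignoring the fact that the values are far apart, how do I remove the square brackets from the dataframe? And is there a neater way to write my code? Thanks!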

2 answers:

Answer 0 (score: 1)

print(df_2)

  Actualcycles       Predictedcycles
0        [192]    [56.7530579842869]
1        [287]   [50.76877712361329]
2        [179]   [42.72575900074571]
3        [189]  [42.876506912637524]
4        [269]   [47.40087182743173]

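# .str.strip only acts on string cells, so cast first; this also covers list/array cells.
# The result holds strings, not numbers.
df = df_2.apply(lambda x: x.astype(str).str.strip('[]'))
print(df)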
  Actualcycles     Predictedcycles
0          192    56.7530579842869
1          287   50.76877712361329
2          179   42.72575900074571
3          189  42.876506912637524
4          269   47.40087182743173
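
After stripping, the cells are still strings. If numeric columns are needed, a follow-up conversion could look like the sketch below. It assumes the cells start out as bracketed strings, as in the printout above; the three sample rows are copied from that printout, and the names `stripped` and `numeric` are only illustrative.

```python
import pandas as pd

# Sample frame with bracketed string cells (values copied from the output above).
df_2 = pd.DataFrame({
    'Actualcycles': ['[192]', '[287]', '[179]'],
    'Predictedcycles': ['[56.7530579842869]',
                        '[50.76877712361329]',
                        '[42.72575900074571]'],
})

# Remove the leading '[' and trailing ']' from every cell.
stripped = df_2.apply(lambda x: x.str.strip('[]'))

# Convert each column from strings to numbers.
numeric = stripped.apply(pd.to_numeric)
print(numeric.dtypes)
```

With numeric columns, arithmetic such as the difference between the actual and predicted values becomes possible.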

Answer 1 (score: 0)

Here is a minimal example of a "cycles" column with brackets:

import pandas as pd

df = pd.DataFrame({
    'cycles' : [[192], [287], [179], [189], [269]]
})

This code gives you the column without the brackets:

df['cycles'] = df['cycles'].str[0]

The output looks like this:

print(df)

   cycles
0     192
1     287
2     179
3     189
4     269
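
The brackets in the question most likely appear because each cell holds a one-element array rather than a plain number. The sketch below shows two ways to unwrap such cells. It assumes every array has exactly one element; `actual_arrays` and `predicted_arrays` are hypothetical stand-ins for the lists built in the question's loop, filled with values copied from its output.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the lists collected in the question's loop:
# each entry is a one-element array.
actual_arrays = [np.array([192]), np.array([287])]
predicted_arrays = [np.array([56.7530579842869]), np.array([50.76877712361329])]

# Option 1: unwrap each array while building the frame.
df_2 = pd.DataFrame({
    'Actual cycles': [a[0] for a in actual_arrays],
    'Predicted cycles': [p[0] for p in predicted_arrays],
})

# Option 2: unwrap the cells of a frame that already contains arrays.
boxed = pd.DataFrame({
    'Actual cycles': actual_arrays,
    'Predicted cycles': predicted_arrays,
})
unboxed = boxed.apply(lambda col: col.map(lambda cell: cell[0]))

print(df_2)
print(unboxed)
```

In the question's loop, Option 1 would correspond to appending `reg.predict(xp)[0]` and `xa[0]` instead of the arrays themselves, provided each filter matches exactly one row.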