Pandas: assigning values with loc in a MultiIndex DataFrame

Time: 2017-12-10 08:53:15

Tags: python pandas python-2.7 dataframe indexing

I initialized a DataFrame like this:

df = pd.DataFrame(columns=["stockname","timestamp","price","volume"])
df.timestamp = pd.to_datetime(df.timestamp, format = "%Y-%m-%d %H:%M:%S:%f")
df.set_index(['stockname', 'timestamp'], inplace = True)

In the real program the data arrives as a stream, but for the sake of this example let me read it from a file:

filehandle = open("datasource")

for line in filehandle:
    line = line.rstrip()
    data = line.split(",")
    stockname = data[4]
    price = float(data[3])
    timestamp = pd.to_datetime(data[0], format = "%Y-%m-%d %H:%M:%S:%f")
    volume = int(data[6])

    df.loc[stockname, timestamp] = [price, volume]

filehandle.close()

print(df)

But this raises an error:

ValueError: cannot set using a multi-index selection indexer with a different length than the value

3 Answers:

Answer 0: (score: 5)

Specify the names of the columns you are assigning to, i.e.

df = pd.DataFrame(columns=["a","b","c","d"])
df.set_index(['a', 'b'], inplace = True)

df.loc[('3','4'),['c','d']] = [4,5]

df.loc[('4','4'),['c','d']] = [3,1]

      c    d
a b          
3 4  4.0  5.0
4 4  3.0  1.0
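
When the rows arrive one at a time, as in the question's loop, growing the frame with loc on every line can get slow. A minimal sketch of an alternative, not part of the answer above: collect the parsed rows in a list and build the MultiIndex DataFrame once at the end. The `stream` object here is a hypothetical stand-in for the question's `datasource` file, the sample rows are the ones used in the read_csv example of this answer, and Python 3 is assumed.

```python
import io
import pandas as pd

# Hypothetical stand-in for open("datasource"); replace with the real file handle.
stream = io.StringIO(
    "2017-12-08 15:29:58:740657,245.0,426001,248.65,APPL,190342,2075673,249.35,244.2\n"
    "2017-12-08 16:29:58:740657,245.0,426001,248.65,GOOGL,190342,2075673,249.35,244.2\n"
    "2017-12-08 18:29:58:740657,245.0,426001,248.65,GOOGL,190342,2075673,249.35,244.2\n"
)

rows = []
for line in stream:
    data = line.rstrip().split(",")
    # Field positions follow the question's loop:
    # 0 = timestamp, 3 = price, 4 = stockname, 6 = volume.
    rows.append({
        "stockname": data[4],
        "timestamp": pd.to_datetime(data[0], format="%Y-%m-%d %H:%M:%S:%f"),
        "price": float(data[3]),
        "volume": int(data[6]),
    })

# Build the frame once, then move the two key columns into the index.
df = pd.DataFrame(rows, columns=["stockname", "timestamp", "price", "volume"])
df = df.set_index(["stockname", "timestamp"])
print(df)
```

This avoids setting with enlargement altogether, so the length-mismatch error from the question cannot occur.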

If you have a comma-separated file, you can use read_csv instead, i.e.:

import io
import pandas as pd
st = '''2017-12-08 15:29:58:740657,245.0,426001,248.65,APPL,190342,2075673,249.35,244.2
        2017-12-08 16:29:58:740657,245.0,426001,248.65,GOOGL,190342,2075673,249.35,244.2
        2017-12-08 18:29:58:740657,245.0,426001,248.65,GOOGL,190342,2075673,249.35,244.2
        '''
# Instead of `io.StringIO(st)`, pass the name of your data source
df = pd.read_csv(io.StringIO(st),header=None)
# Now set the index and select what you want 
df.set_index([0,4])[[1,7]]

                                   1       7
 0                          4                   
2017-12-08 15:29:58.740657 APPL   245.0  249.35
2017-12-08 16:29:58.740657 GOOGL  245.0  249.35
2017-12-08 18:29:58.740657 GOOGL  245.0  249.35
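
To end up with the layout from the question (index on stockname and timestamp, columns price and volume), the read_csv approach can be adapted roughly as follows. This is a sketch under assumptions: the column positions are taken from the question's loop (0 = timestamp, 3 = price, 4 = stockname, 6 = volume), the timestamp format string is the one used in the question, and Python 3 is assumed.

```python
import io
import pandas as pd

# Sample rows from the answer above; in practice pass the file path to read_csv.
raw = (
    "2017-12-08 15:29:58:740657,245.0,426001,248.65,APPL,190342,2075673,249.35,244.2\n"
    "2017-12-08 16:29:58:740657,245.0,426001,248.65,GOOGL,190342,2075673,249.35,244.2\n"
    "2017-12-08 18:29:58:740657,245.0,426001,248.65,GOOGL,190342,2075673,249.35,244.2\n"
)

# Read only the four fields the question uses and give them readable names.
df = pd.read_csv(io.StringIO(raw), header=None, usecols=[0, 3, 4, 6])
df = df.rename(columns={0: "timestamp", 3: "price", 4: "stockname", 6: "volume"})

# The microseconds are separated by ":" rather than ".", so parse with an explicit format.
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%Y-%m-%d %H:%M:%S:%f")

df = df.set_index(["stockname", "timestamp"])
print(df)
```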

Answer 1: (score: 1)

I think what you are looking for is:

df.loc[a,b,:] = [c,d]

Here is an example using your DataFrame:

for i in range(3):
    for j in range(3):
        df.loc[(str(i),str(j)),:] = [i,j]

Output:

     c  d
a b      
0 0  0  0
  1  0  1
  2  0  2
1 0  1  0
  1  1  1
  2  1  2
2 0  2  0
  1  2  1
  2  2  2

Answer 2: (score: 1)

You may want to use … to avoid this error.