I have a function, myfunc, that performs a calculation on two columns of a pandas DataFrame. Its output is a NumPy array.
def myfunc(df, args):
    import numpy
    return numpy.array([df.iloc[:,args[0]].sum,df.iloc[:,args[1]].sum])
This function is called inside rolling_df_apply:
def rolling_df_apply(df, myfunc, window, *args):
    import pandas
    # One single-row frame per window, labeled by the window's last index
    result = pandas.concat(pandas.DataFrame(myfunc(df.iloc[i:window+i],args), index=[df.index[i+window-1]]) for i in range(0,len(df)-window+1))
    return result
I run it with:
import numpy
import pandas
df=pandas.DataFrame(numpy.random.randint(5,size=(5,2)))
window=3
args = [0,1]
result = rolling_df_apply(df, myfunc, window, *args)
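This raises a ValueError in pandas.concat(): Shape of passed values is (1, 2), indices imply (1, 1). What has to change to make this run? Which index implies the shape (1, 1)? All the DataFrames being concatenated should have shape (1, 2).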
Answer 0 (score: 0)
In myfunc, .sum should be .sum().
Since myfunc returns an array of length 2,
pandas.DataFrame(myfunc(df.iloc[i:window+i],args), index=[df.index[i+window-1]])
is essentially the same as
pd.DataFrame([0,1], index=[0])
which raises
ValueError: Shape of passed values is (1, 2), indices imply (1, 1)
The error is saying that the value [0,1] implies 1 row and 2 columns, while the index implies 1 row and 1 column.
One way to fix this is to pass a dict instead of a list:
In [191]: pd.DataFrame({'a':0,'b':1}, index=[0])
Out[191]:
   a  b
0  0  1
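To see the shape mismatch in isolation, here is a minimal sketch. The array `values` is a hypothetical stand-in for what myfunc returns, and the 2-D reshape is an alternative not mentioned in the answer above, for cases where you would rather keep returning a NumPy array than switch to a dict.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the length-2 array that myfunc returns
values = np.array([3, 4])

# A 1-D array of length 2 combined with a single index label does not fit
try:
    pd.DataFrame(values, index=[0])
    raised = False
except ValueError:
    raised = True

# Option 1 (from the answer): a dict of scalars gives one row, two columns
row_from_dict = pd.DataFrame({'a': values[0], 'b': values[1]}, index=[0])

# Option 2 (assumption, not from the answer): reshape the array to 1 x 2
row_from_2d = pd.DataFrame(values.reshape(1, -1), index=[0])
```

Both options produce a frame of shape (1, 2), which is what pandas.concat needs for each window. The dict version also names the columns, while the reshape version keeps the default integer column labels.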
# Minimal fix: myfunc returns a dict, so each window becomes one row with columns 'a' and 'b'
import pandas as pd
import numpy as np

def myfunc(df, args):
    # Call .sum() (not .sum) so the values are numbers rather than bound methods
    return {'a':df.iloc[:,args[0]].sum(), 'b':df.iloc[:,args[1]].sum()}

def rolling_df_apply(df, myfunc, window, *args):
    # Build a one-row frame per window, labeled by the window's last index
    frames = [pd.DataFrame(myfunc(df.iloc[i:window+i],args),
                           index=[df.index[i+window-1]])
              for i in range(0,len(df)-window+1)]
    result = pd.concat(frames)
    return result

np.random.seed(2015)
df = pd.DataFrame(np.random.randint(5,size=(5,2)))
window=3
args = [0,1]
result = rolling_df_apply(df, myfunc, window, *args)
print(result)
With this minimal change, the code yields
   a  b
2  7  6
3  7  5
4  3  3
However, it would be more efficient to replace myfunc and rolling_df_apply with a call to pd.rolling_sum:
result = pd.rolling_sum(df, window=3).dropna(axis=0)
This produces the same result.
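Note that the top-level pd.rolling_sum function was deprecated and later removed from pandas, so it is not available in current releases. A rough equivalent, assuming a recent pandas version, uses the DataFrame.rolling method. The sketch below also recomputes the window sums explicitly in `expected` (a name introduced here for illustration) so the two approaches can be compared.

```python
import numpy as np
import pandas as pd

np.random.seed(2015)
df = pd.DataFrame(np.random.randint(5, size=(5, 2)))
window = 3

# DataFrame.rolling(...).sum() is the method-based replacement for pd.rolling_sum;
# the first window-1 rows are NaN (incomplete windows) and are dropped
result = df.rolling(window=window).sum().dropna(axis=0)

# Reference computation: sum each window explicitly, one row per window
expected = np.array([df.iloc[i:i + window].sum().values
                     for i in range(len(df) - window + 1)])

print(result)
```

The rolling result holds floats rather than integers, because the intermediate NaN rows force a float dtype, and its columns keep the original labels 0 and 1 instead of 'a' and 'b'. The values themselves match the explicit window sums.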