I am analysing a Facebook conversation, and I want to know how many messages each person sent during each hour of the day. Using pandas, I did
data['n_msg_by_hour'] = df.groupby(['author', df['date'].dt.hour])['_id'].count()
The returned Series object has the following desired form:
Djézeune 0 4866
1 4549
2 4463
3 3841
4 2560
5 1029
6 396
7 239
8 76
9 56
10 40
11 88
12 340
13 685
14 1253
15 1712
16 2224
17 2650
18 2439
19 2951
20 3347
21 3575
22 4696
23 4741
Vinssan 0 108
1 129
2 84
3 72
4 8
5 17
6 4
7 1
8 1
9 1
11 4
12 26
13 37
14 81
15 114
16 92
17 123
18 83
19 95
20 58
21 112
22 87
23 109
Name: _id, dtype: int64
However, when I call
data['n_msg_by_hour'].to_dict()
I get a dictionary with tuples as keys, like this:
{
('Djézeune', 0):4866,
('Djézeune', 1):4549,
('Djézeune', 10):40,
('Djézeune', 11):88,
('Djézeune', 12):340,
('Djézeune', 13):685,
('Djézeune', 14):1253,
...
('Vinssan', 0):108,
('Vinssan', 1):129,
('Vinssan', 10):0,
('Vinssan', 11):4,
('Vinssan', 12):26,
('Vinssan', 13):37,
('Vinssan', 14):81,
}
But I would like a nested dictionary, so that I can then dump it to JSON:
{
'Djézeune':{0:4866, 1:4549, 10:40, 11:88, 12:340, 13:685, 14:1253 ...},
'Vinssan':{0:108, 1:129, 10:0, 11:4, 12:26, 13:37, 14:81 ...}
}
Is it possible to achieve this easily, for example with the level option of groupby or a similar function, without iterating over the dictionary keys?
Each row of the DataFrame includes the author, date and _id columns used in the groupby call.
Answer 0: (score: 1)
This is probably easiest to do by grouping on the first level of the index and iterating over the resulting Series:
In [320]: s = pd.Series(np.random.random(48), index=pd.MultiIndex.from_product([["DJ", "Vin"], range(24)]))
In [321]: d = {k: v.droplevel(0).to_dict() for k, v in s.groupby(level=0)}
In [322]: d
Out[322]:
{'DJ': {0: 0.8731657595223525,
        1: 0.6806768452816228,
        2: 0.6376297431476246,
        ...
        23: 0.9995968607512785},
 'Vin': {0: 0.19255930821536904,
         1: 0.944802244484905,
         2: 0.1171672201795304,
         ...
         23: 0.7387196132363647}}
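For completeness, here is a minimal sketch of the same approach applied to data shaped like the question's and carried through to JSON. The small DataFrame is a made-up stand-in (its rows are not from the original conversation), and the names n_msg_by_hour, nested and as_json are only illustrative. The explicit int() calls are a precaution so that the json module receives plain Python integers regardless of the pandas version.

```python
import json

import pandas as pd

# Hypothetical stand-in for the conversation data: one row per message.
df = pd.DataFrame(
    {
        "_id": [1, 2, 3, 4, 5],
        "author": ["Djézeune", "Djézeune", "Djézeune", "Vinssan", "Vinssan"],
        "date": pd.to_datetime(
            [
                "2020-01-01 00:05",
                "2020-01-01 00:40",
                "2020-01-02 13:10",
                "2020-01-01 13:00",
                "2020-01-03 23:59",
            ]
        ),
    }
)

# Same grouping as in the question: message count per author and hour of day.
n_msg_by_hour = df.groupby(["author", df["date"].dt.hour])["_id"].count()

# Group on the first index level, drop that level, and turn each piece
# into a plain {hour: count} dictionary.
nested = {
    author: {int(hour): int(n) for hour, n in counts.droplevel(0).items()}
    for author, counts in n_msg_by_hour.groupby(level=0)
}

# JSON object keys are always strings, so the integer hours become "0", "13", ...
as_json = json.dumps(nested, ensure_ascii=False, indent=2)
print(as_json)
```

Note that hours with no messages for an author simply do not appear in that author's inner dictionary, and that reading the JSON back yields string hour keys rather than integers.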