I am trying to convert a complex JSON (nested format) to CSV.
{
"caudal": [
{"ts": 1612746051248, "value": "0.0"},
{"ts": 1612745450856, "value": "0.0"},
{"ts": 1612744250898, "value": "0.0"},
{"ts": 1612743650861, "value": "0.0"},
{"ts": 1612743050821, "value": "0.0"}
],
"FreeHeap": [
{"ts": 1612746051248, "value": "247564"},
{"ts": 1612745450856, "value": "247564"},
{"ts": 1612744250898, "value": "247564"},
{"ts": 1612743650861, "value": "247564"},
{"ts": 1612743050821, "value": "247564"}
],
"MinimoFreeHeap": [
{"ts": 1612746051248, "value": "237440"},
{"ts": 1612745450856, "value": "237440"},
{"ts": 1612744250898, "value": "237440"},
{"ts": 1612743650861, "value": "237440"},
{"ts": 1612743050821, "value": "237440"}
]
}
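The JSON files my program has to handle contain many more records, but I have shortened this one to simplify the analysis. I tried using the pandas library, as follows: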
import pandas as pd

with open('read.json') as f_input:
    df = pd.read_json(f_input)

df.to_csv('out.csv', encoding='utf-8', index=False)
I get the following result:
caudal,FreeHeap,MinimoFreeHeap
"{'ts': 1612746051248, 'value': '0.0'}","{'ts': 1612746051248, 'value': '247564'}","{'ts': 1612746051248, 'value': '237440'}"
"{'ts': 1612745450856, 'value': '0.0'}","{'ts': 1612745450856, 'value': '247564'}","{'ts': 1612745450856, 'value': '237440'}"
"{'ts': 1612744250898, 'value': '0.0'}","{'ts': 1612744250898, 'value': '247564'}","{'ts': 1612744250898, 'value': '237440'}"
"{'ts': 1612743650861, 'value': '0.0'}","{'ts': 1612743650861, 'value': '247564'}","{'ts': 1612743650861, 'value': '237440'}"
"{'ts': 1612743050821, 'value': '0.0'}","{'ts': 1612743050821, 'value': '247564'}","{'ts': 1612743050821, 'value': '237440'}"
As you can see, each cell contains something like:
"{'ts': 1612743050821, 'value': '247564'}"
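As I understand it, that is another JSON. Is there a simple way to add a column named timestamp (ts) and put only the value in the cell where this JSON currently sits?
I believe this would be the right approach. My goal is to convert the information contained in the JSON to CSV so that it is easier for third parties (a database or an AI algorithm) to consume. But if you can think of a more convenient approach or format, I am willing to change my initial idea. I have to admit I am new to this world.
I thought about walking through the JSON and doing the conversion by hand, but it is hard to associate the measurements that share the same timestamp.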
Answer 0 (score: 1)
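Nicolas,
You did not say how you want the data laid out, so the code posted below converts it to a tabular format with one column each for the machine (not sure that is the right term), ts, and value.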
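import json

import pandas as pd

with open('read.json') as f_input:
    data = json.load(f_input)

df = pd.DataFrame.from_dict(data, orient='columns')

# Collect one record per measurement: which machine, when, and what value
records = []
for col in df.columns:
    # Series.iteritems() was removed in pandas 2.0; items() is the replacement
    for index, row in df[col].items():
        records.append({'machine': col, 'ts': row['ts'], 'value': row['value']})

# DataFrame.append() was removed in pandas 2.0, so build the frame in one step
df_new = pd.DataFrame(records, columns=['machine', 'ts', 'value'])
df_new.to_csv('out.csv', encoding='utf-8', index=False)

If you would rather have the timestamps as the index and the machines as columns, change the last two lines to this:

df_new = pd.DataFrame(records).pivot(index='ts', columns='machine', values='value')
df_new.to_csv('out.csv', encoding='utf-8')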
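For comparison, the same timestamp-aligned table can be built by hand without pandas, which is the manual route the question mentions. The following is only a minimal sketch under some assumptions: the sample data is inlined and shortened instead of being read from read.json, the function name nested_to_wide_rows is hypothetical, and the output goes to an in-memory buffer rather than a file.

```python
import csv
import io

# Inlined sample with the same shape as read.json (shortened for illustration)
data = {
    "caudal": [{"ts": 1612746051248, "value": "0.0"},
               {"ts": 1612745450856, "value": "0.0"}],
    "FreeHeap": [{"ts": 1612746051248, "value": "247564"},
                 {"ts": 1612745450856, "value": "247564"}],
}


def nested_to_wide_rows(nested):
    """Group measurements by timestamp: one row per ts, one key per machine."""
    by_ts = {}
    for machine, measurements in nested.items():
        for m in measurements:
            # Create the row for this timestamp on first sight, then fill it in
            by_ts.setdefault(m["ts"], {"ts": m["ts"]})[machine] = m["value"]
    # Sort by timestamp so the output order is deterministic
    return [by_ts[ts] for ts in sorted(by_ts)]


rows = nested_to_wide_rows(data)

# Write to an in-memory buffer; use open('out.csv', 'w', newline='') for a real file
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["ts", *data.keys()], restval="")
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Keying a dict on ts is what associates the measurements that share a timestamp. A machine with no reading at a given timestamp simply gets an empty cell through restval.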
Answer 1 (score: 1)
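pd.DataFrame(df[col].values.tolist()) is the fastest way to normalize a single-level dict from a column, but this answer shows how to handle problematic columns (e.g. ones that raise an error when .values.tolist() is tried).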
import pandas as pd

# read the json file
with open('read.json') as f_input:
    df = pd.read_json(f_input)

# iterate through each column, normalize it, and collect the pieces
normed_parts = []
for col in df.columns:
    normed = pd.DataFrame(df[col].values.tolist())  # normalize the column from df
    normed['type'] = col  # add the original column name as a new column so the associated values can be identified
    normed_parts.append(normed)

# DataFrame.append() was removed in pandas 2.0; concatenate instead,
# with ignore_index=True taking the place of the later reset_index
normed_df = pd.concat(normed_parts, ignore_index=True)

# convert ts to a datetime dtype
normed_df.ts = pd.to_datetime(normed_df.ts, unit='ms')

# save this long form to a csv
normed_df.to_csv('long.csv', index=False)

# display(normed_df)
ts value type
0 2021-02-08 01:00:51.248 0.0 caudal
1 2021-02-08 00:50:50.856 0.0 caudal
2 2021-02-08 00:30:50.898 0.0 caudal
3 2021-02-08 00:20:50.861 0.0 caudal
4 2021-02-08 00:10:50.821 0.0 caudal
5 2021-02-08 01:00:51.248 247564 FreeHeap
6 2021-02-08 00:50:50.856 247564 FreeHeap
7 2021-02-08 00:30:50.898 247564 FreeHeap
8 2021-02-08 00:20:50.861 247564 FreeHeap
9 2021-02-08 00:10:50.821 247564 FreeHeap
10 2021-02-08 01:00:51.248 237440 MinimoFreeHeap
11 2021-02-08 00:50:50.856 237440 MinimoFreeHeap
12 2021-02-08 00:30:50.898 237440 MinimoFreeHeap
13 2021-02-08 00:20:50.861 237440 MinimoFreeHeap
14 2021-02-08 00:10:50.821 237440 MinimoFreeHeap
Use .pivot to align the data, with ts as the index.
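The pivot step could look roughly like the sketch below. It assumes a long-form frame shaped like normed_df (columns ts, value, type); the frame here is rebuilt inline from a shortened sample so the snippet runs on its own, and the name wide_df is only illustrative.

```python
import pandas as pd

# Inlined long-form sample shaped like normed_df (shortened for illustration)
normed_df = pd.DataFrame({
    "ts": pd.to_datetime([1612746051248, 1612745450856] * 3, unit="ms"),
    "value": ["0.0", "0.0", "247564", "247564", "237440", "237440"],
    "type": ["caudal"] * 2 + ["FreeHeap"] * 2 + ["MinimoFreeHeap"] * 2,
})

# One row per timestamp, one column per measurement type
wide_df = normed_df.pivot(index="ts", columns="type", values="value")

# ts is now the index, so it must be kept when exporting (no index=False here);
# pass a path such as 'wide.csv' to write a file instead of returning a string
csv_text = wide_df.to_csv()
print(csv_text)
```

Note that pivot raises a ValueError if the same (ts, type) pair occurs more than once, so duplicated readings would need to be dropped or aggregated first.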
Answer 2 (score: 0)
I finally found a solution... There is a very interesting library called "cherrypicker". Using its pandas examples and a DataFrame, I figured out how to make it work.
I hope this is useful to someone in the future. I am not sure it is the simplest way, but it works for me! Regards