I need to build a simple flat table with Pandas from a MongoDB document that contains nested documents.
This is my JSON:
{
    "CNPJ" : "65206503000163",
    "CNAE" : [
        {
            "codigoCNAE" : 7911200,
            "dataInicioCNAE" : 20000101
        },
        {
            "codigoCNAE" : 9999999,
            "dataInicioCNAE" : 2018101
        }
    ]
}
Thanks.
Answer 0 (score: 1)
Assuming you have only one document like this, you can use the following code.
import pandas as pd

dict1 = {"CNPJ": "65206503000163",
         "CNAE": [{"codigoCNAE": 7911200, "dataInicioCNAE": 20000101},
                  {"codigoCNAE": 9999999, "dataInicioCNAE": 2018101}]}
# One row per nested CNAE entry
df = pd.DataFrame(dict1['CNAE'])
# Copy the parent CNPJ onto every row
df['CNPJ'] = dict1['CNPJ']
print(df)
Output:
   codigoCNAE  dataInicioCNAE            CNPJ
0     7911200        20000101  65206503000163
1     9999999         2018101  65206503000163
For multiple documents, you can loop over the documents, build one DataFrame per document, and combine them with pd.concat.
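The loop described above could look roughly like the sketch below. The list `docs` and its second document are made-up sample data, not from the question; in practice the documents would come from your MongoDB query.

```python
import pandas as pd

# Hypothetical sample data: two documents shaped like the one in the question.
# The second document is invented purely for illustration.
docs = [
    {"CNPJ": "65206503000163",
     "CNAE": [{"codigoCNAE": 7911200, "dataInicioCNAE": 20000101},
              {"codigoCNAE": 9999999, "dataInicioCNAE": 2018101}]},
    {"CNPJ": "00000000000191",
     "CNAE": [{"codigoCNAE": 6422100, "dataInicioCNAE": 19660801}]},
]

frames = []
for doc in docs:
    part = pd.DataFrame(doc["CNAE"])  # one row per nested CNAE entry
    part["CNPJ"] = doc["CNPJ"]        # repeat the parent key on each row
    frames.append(part)

# Stack the per-document frames and renumber the rows from 0.
df_all = pd.concat(frames, ignore_index=True)
print(df_all)
```

Collecting the frames in a list and calling pd.concat once at the end is usually cheaper than concatenating inside the loop.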
Answer 1 (score: 1)
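import pandas as pd

dict1 = {"CNPJ": "65206503000163",
         "CNAE": [{"codigoCNAE": 7911200,
                   "dataInicioCNAE": 20000101},
                  {"codigoCNAE": 9999999,
                   "dataInicioCNAE": 2018101}]}
# pd.json_normalize is the top-level name in pandas 1.0 and later;
# older versions import json_normalize from pandas.io.json instead.
# record_path is the nested list to expand, meta the parent field to carry along.
df = pd.json_normalize(dict1, record_path=['CNAE'], meta='CNPJ')
print(df)
Output:
   codigoCNAE  dataInicioCNAE            CNPJ
0     7911200        20000101  65206503000163
1     9999999         2018101  65206503000163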
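json_normalize also accepts a list of documents, so several documents can be flattened in one call without an explicit loop. A minimal sketch, assuming pandas 1.0 or later (where the function is available as pd.json_normalize); the list `docs` and its second document are made-up sample data:

```python
import pandas as pd

# Hypothetical sample data: the question's document plus one invented document.
docs = [
    {"CNPJ": "65206503000163",
     "CNAE": [{"codigoCNAE": 7911200, "dataInicioCNAE": 20000101},
              {"codigoCNAE": 9999999, "dataInicioCNAE": 2018101}]},
    {"CNPJ": "00000000000191",
     "CNAE": [{"codigoCNAE": 6422100, "dataInicioCNAE": 19660801}]},
]

# Expand every entry of the nested CNAE list into its own row and
# attach the parent document's CNPJ to each of those rows.
df_all = pd.json_normalize(docs, record_path="CNAE", meta="CNPJ")
print(df_all)
```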
Answer 2 (score: 0)
You need:
import pandas as pd

x = {
    "CNPJ" : "65206503000163",
    "CNAE" : [
        {
            "codigoCNAE" : 7911200,
            "dataInicioCNAE" : 20000101,
        },
        {
            "codigoCNAE" : 9999999,
            "dataInicioCNAE" : 2018101,
        }
    ]
}
# The scalar CNPJ is broadcast across the two CNAE entries, giving two rows
df = pd.DataFrame.from_dict(x, orient='columns')
# Expand each CNAE dict into columns, then put CNPJ back alongside
df = pd.concat([df['CNAE'].apply(pd.Series), df['CNPJ']], axis=1)
print(df)
Output:
   codigoCNAE  dataInicioCNAE            CNPJ
0     7911200        20000101  65206503000163
1     9999999         2018101  65206503000163
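Calling apply(pd.Series) builds one Series per row, which can become slow on large frames. One possible alternative, shown here only as a sketch and not as part of the original answer, is to build the expanded columns from the list of dicts in a single DataFrame call. The names `expanded` and `result` are illustrative.

```python
import pandas as pd

x = {
    "CNPJ": "65206503000163",
    "CNAE": [
        {"codigoCNAE": 7911200, "dataInicioCNAE": 20000101},
        {"codigoCNAE": 9999999, "dataInicioCNAE": 2018101},
    ],
}

df = pd.DataFrame(x)

# Build the expanded columns in one call from the list of dicts,
# reusing the original index so the rows line up in concat.
expanded = pd.DataFrame(df["CNAE"].tolist(), index=df.index)
result = pd.concat([expanded, df["CNPJ"]], axis=1)
print(result)
```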
Answer 3 (score: 0)
Just build a DataFrame from the dict you have and split it into two parts: expand the CNAE column with apply(pd.Series), then concat the result with the CNPJ column along axis 1.
import pandas as pd

x = {
    "CNPJ" : "65206503000163",
    "CNAE" : [
        {
            "codigoCNAE" : 7911200,
            "dataInicioCNAE" : 20000101,
        },
        {
            "codigoCNAE" : 9999999,
            "dataInicioCNAE" : 2018101,
        }
    ]
}
x_df = pd.DataFrame(x)
# Expand each CNAE dict into its own columns
a_df = x_df['CNAE'].apply(pd.Series)
b_df = x_df['CNPJ']
# Join the two parts side by side
df = pd.concat([b_df, a_df], axis=1)
print(df)
Output:
             CNPJ  codigoCNAE  dataInicioCNAE
0  65206503000163     7911200        20000101
1  65206503000163     9999999         2018101
Answer 4 (score: 0)
Use concat:
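>>> import pandas as pd
>>> # d is the document from the question
>>> d = {"CNPJ": "65206503000163",
...      "CNAE": [{"codigoCNAE": 7911200, "dataInicioCNAE": 20000101},
...               {"codigoCNAE": 9999999, "dataInicioCNAE": 2018101}]}
>>> df = pd.DataFrame(d)
>>> pd.concat([df[['CNPJ']], pd.DataFrame(d['CNAE'])], axis=1)
             CNPJ  codigoCNAE  dataInicioCNAE
0  65206503000163     7911200        20000101
1  65206503000163     9999999         2018101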