Python Pandas DataFrame as input

Question

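I want to convert a DataFrame to JSON, adding the field names shown in the desired output (skill and suggestions), and then save the result in a MongoDB collection.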

Python Pandas DataFrame as input:

        0       1       2       3       4       5       6       7
java    hadoop  java    hdfs    c       c++     php     python  html
c       c       c++     hdfs    python  hadoop  java    php     html
c++     c++     c       python  hdfs    hadoop  java    php     html
hadoop  hadoop  java    hdfs    c       c++     php     python  html
hdfs    hdfs    hadoop  java    c       c++     python  php     html
python  python  c++     html    c       php     hdfs    hadoop  java

Desired output, to be saved in a MongoDB collection in the following form:

{ "_id" : ObjectId("5922a781205a763b55e2e90e"), "skill" : "java", "suggestions" : [ "hadoop", "java", "hdfs", "c", "c++", "php", "python", "html" ] }

{ "_id" : ObjectId("5922a781205a763b55e2e91e"), "skill" : "c", "suggestions" : [ "c", "c++", "hdfs", "python", "hadoop", "java", "php", "html" ] }

{ "_id" : ObjectId("5922a781205a763b55e2e92e"), "skill" : "c++", "suggestions" : [ "c++", "c", "python", "hdfs", "hadoop", "java", "php", "html" ] }

{ "_id" : ObjectId("5922a781205a763b55e2e93e"), "skill" : "hadoop", "suggestions" : [ "hadoop", "java", "hdfs", "c", "c++", "php", "python", "html" ] }

Answer 1

First, you need to convert the data into the corresponding format.

import pandas as pd

strlist = [['java','hadoop','java','hdfs','c','c++','php','python','html'],
      ['c','c','c++','hdfs','python','hadoop','java','php','html'],
      ['c++','c++','c','python','hdfs','hadoop','java','php','html'],
      ['hadoop','hadoop','java','hdfs','c','c++','php','python','html'],
      ['hdfs','hdfs','hadoop','java','c','c++','python','php','html'],
      ['python','python','c++','html','c','php','hdfs','hadoop','java']]

df = pd.DataFrame(strlist)

# Column 0 holds the skill; columns 1 to 8 hold its suggestions.
# Build 'suggestions' first, so the new 'skill' column is not swept into the slice.
df['suggestions'] = df[df.columns[1:]].values.tolist()
df['skill'] = df[0]
df = df[['skill','suggestions']]

print(df)
    skill                                      suggestions
0    java  [hadoop, java, hdfs, c, c++, php, python, html]
1       c  [c, c++, hdfs, python, hadoop, java, php, html]
2     c++  [c++, c, python, hdfs, hadoop, java, php, html]
3  hadoop  [hadoop, java, hdfs, c, c++, php, python, html]
4    hdfs  [hdfs, hadoop, java, c, c++, python, php, html]
5  python  [python, c++, html, c, php, hdfs, hadoop, java]

Then insert the DataFrame into the MongoDB database.

# `collection` is assumed to be an existing pymongo Collection object.
records = df.to_dict('records')  # one dict per row, keyed by column name
collection.insert_many(records)
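
The input in the question appears to carry the skill in the DataFrame index rather than in column 0. For that layout, the conversion might look like the sketch below. The helper names `to_documents` and `save` are made up for illustration, and the two-row DataFrame is only a stand-in for the real data.

```python
import pandas as pd


def to_documents(df):
    """Build one dict per row: the index label is the skill, the row values are the suggestions."""
    return [
        {"skill": skill, "suggestions": list(row)}
        for skill, row in zip(df.index, df.values.tolist())
    ]


# Stand-in for the question's input: skills in the index, suggestions in columns 0..7.
df = pd.DataFrame(
    [["hadoop", "java", "hdfs", "c", "c++", "php", "python", "html"],
     ["c", "c++", "hdfs", "python", "hadoop", "java", "php", "html"]],
    index=["java", "c"],
)

documents = to_documents(df)


def save(collection, documents):
    # `collection` is assumed to be a pymongo Collection, for example
    # MongoClient()["mydb"]["skills"] (database and collection names are hypothetical).
    # MongoDB adds the "_id" ObjectId field automatically on insert.
    return collection.insert_many(documents)
```

Building plain dicts directly skips the round trip through a JSON string, and keeping the insert in its own function means the conversion can be checked without a running MongoDB server.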
