pandas.core.common.PandasError: DataFrame constructor not properly called!

Time: 2017-03-07 04:33:01

Tags: python csv pandas

I am trying to receive data with mosquitto and save it to a CSV file using python pandas. The data keeps arriving until I stop the script.

mqtt_pub.py

import paho.mqtt.client as mqtt
import random
import schedule
import time

mqttc = mqtt.Client("python_pub")
mqttc.connect("localhost", 1883)

def job():
    mqttc.publish("hello/world", random.randint(1, 10))

schedule.every(1).seconds.do(job)

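# Run the network loop in a background thread; a loop() call placed
# after the infinite loop below would never be reached.
mqttc.loop_start()

while True:
    schedule.run_pending()
    time.sleep(1)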

mqtt_sub.py

import paho.mqtt.client as mqtt
import pandas as pd

def on_connect(client, userdata, rc):
    print("Connected with result code "+str(rc))
    client.subscribe("hello/world")

def on_message(client, userdata, msg):
    datas = map(int, msg.payload)
    for num in datas:
        df = pd.DataFrame(data=datas, columns=['the_number'])
        df.to_csv("testing.csv")

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message

client.connect("localhost", 1883, 60)

client.loop_forever()

From the mqtt_sub.py script above, I get a testing.csv that looks like this:

    | the_number
0   | 2

2 is the last number I received before stopping the mqtt_sub.py script:

Connected with result code 0
[3]
[9]
[5]
[3]
[7]
[2]
...
...
KeyboardInterrupt

I would like testing.csv to look like this:

    | the_number
0   | 3
1   | 9
2   | 5
...
...
5   | 2

To achieve this, I tried changing df = pd.DataFrame(data=datas, columns=['the_number']) to df = pd.DataFrame(data=num, columns=['the_number']), and got the following error:

pandas.core.common.PandasError: DataFrame constructor not properly called!

Does anyone know how to fix this error? I also suspect I am not using the for loop correctly here.

Thanks for any advice and help.

[UPDATE]

I added/changed the following lines in the on_message method:

def on_message(client, userdata, msg):
    datas = map(int, msg.payload)
    df = pd.DataFrame(data=datas, columns=['the_number'])

    f = open("test.csv", 'a')
    df.to_csv(f)
    f.close()

With Nulljack's help, I was able to get this result in my CSV file:

   | the_number
0  | 3
   | the_number
0  | 9 
   | the_number
0  | 5
   | the_number
0  | 3
   | the_number
0  | 7

My goal is to get something like this in the CSV file instead:

   | the_number
0  | 3
1  | 9
2  | 5 
3  | 3
4  | 7

2 Answers:

Answer 0 (score: 0)

I have never used mosquitto before, so I apologize if my understanding is wrong.

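It looks to me like the on_message method in your mqtt_sub.py runs every time your mqtt_pub.py publishes a message (i.e. once per second), which means your testing.csv file is overwritten each time a message is published.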
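The overwriting can be reproduced without a broker. Below is a minimal sketch, assuming pandas is installed; the helper names save_overwrite, save_append and demo are hypothetical and not part of the question. As far as I know, to_csv opens a path in write mode by default, so each call replaces the file, while mode="a" appends. Writing the header only for a new file avoids the repeated the_number header shown in the update. The sketch passes index=False because in append mode every one-row frame would otherwise write index 0.

```python
import os
import tempfile

import pandas as pd


def save_overwrite(path, number):
    # Default mode is "w": the file is replaced on every call.
    pd.DataFrame({"the_number": [number]}).to_csv(path, index=False)


def save_append(path, number):
    # Write the header only when the file is missing or empty,
    # then keep appending one row per call.
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    pd.DataFrame({"the_number": [number]}).to_csv(
        path, mode="a", header=is_new, index=False
    )


def demo(numbers):
    # Feed the same numbers through both helpers and read the files back.
    with tempfile.TemporaryDirectory() as tmp:
        over = os.path.join(tmp, "overwrite.csv")
        app = os.path.join(tmp, "append.csv")
        for n in numbers:
            save_overwrite(over, n)
            save_append(app, n)
        return (
            pd.read_csv(over)["the_number"].tolist(),
            pd.read_csv(app)["the_number"].tolist(),
        )
```

With the overwrite helper only the last number survives, which matches the single-row testing.csv in the question; the append helper keeps every number.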

To fix this, I would initialize a DataFrame in your on_connect method and then add each new value to that DataFrame in on_message via df.append.

As for writing the CSV only after you terminate the script, I'm not sure.
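One possible way to do both, sketched under assumptions: collect the numbers in a plain list and build the DataFrame once when the loop ends. DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so this sketch avoids it. The names received, handle_payload, to_frame and run_and_save are hypothetical, and loop stands in for client.loop_forever so the example runs without a broker.

```python
import pandas as pd

received = []  # hypothetical module-level buffer of parsed numbers


def handle_payload(payload):
    # int() accepts bytes such as b"10" as well as str, so the whole
    # payload is parsed as one number instead of digit by digit.
    received.append(int(payload))


def to_frame(numbers):
    # One row per received number, indexed 0..n-1.
    return pd.DataFrame({"the_number": numbers})


def run_and_save(loop, path):
    # `loop` stands in for client.loop_forever; the finally block runs
    # even when the loop is interrupted with Ctrl+C.
    try:
        loop()
    except KeyboardInterrupt:
        pass
    finally:
        to_frame(received).to_csv(path)
```

In the subscriber this would mean calling handle_payload(msg.payload) from on_message and run_and_save(client.loop_forever, "testing.csv") at the bottom of the script. The trade-off is that nothing reaches disk if the process is killed outright.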

Hope this helps.

Answer 1 (score: 0)

The other post was getting crowded, so I moved my reply here.

Try the following code:

import paho.mqtt.client as mqtt
import pandas as pd

# Move df here 
df = pd.DataFrame(columns=['the_number'])

def on_connect(client, userdata, rc):
    print("Connected with result code "+str(rc))
    client.subscribe("hello/world")

def on_message(client, userdata, msg):
    # The payload holds one integer, so parse it as a whole; mapping int
    # over the payload would split "10" into separate digits
    number = int(msg.payload)

    # this adds the data to the dataframe at the next index;
    # .loc can add a new row, whereas .iloc cannot enlarge the dataframe
    df.loc[len(df)] = [number]

    # I reverted this line back to what you originally had
    # This will overwrite the testing.csv file every time your subscriber
    # receives a message, but since the dataframe is formatted like you want
    # it shouldn't matter
    df.to_csv("testing.csv")


client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message

client.connect("localhost", 1883, 60)

client.loop_forever()
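
On the error message itself: as far as I can tell, the constructor raises it when data is a bare scalar such as num and no index is given, because pandas cannot tell how many rows to build. A minimal sketch of the difference follows; frame_from_number and scalar_fails are hypothetical helper names, and the exception is caught broadly because its class differs between pandas versions (PandasError in the question, ValueError in recent releases).

```python
import pandas as pd


def frame_from_number(num):
    # Wrapping the scalar in a list gives pandas exactly one row of data.
    return pd.DataFrame(data=[num], columns=["the_number"])


def scalar_fails(num):
    # Pass the bare scalar, as in the question, and report the name of
    # the exception class that pandas raises (None if nothing is raised).
    try:
        pd.DataFrame(data=num, columns=["the_number"])
    except Exception as exc:
        return type(exc).__name__
    return None
```

So pd.DataFrame(data=[num], columns=['the_number']) avoids the error, although it still produces a one-row frame per message and does not fix the overwriting on its own.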