Question

Thanks for looking.

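I am trying to modify a Python script that downloads a large amount of data from a website. Given how much data is involved, I have decided to convert the script to Pandas. This is the code I have so far.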

snames = ['Index','Node_ID','Node','Id','Name','Tag','Datatype','Engine']
sensorinfo = pd.read_csv(sensorpath, header = None, names = snames, index_col=['Node', 'Index'])
for j in sensorinfo['Node']:
    for z in sensorinfo['Index']:
        # create a string for the url of the data
        data_url = "http://www.mywebsite.com/emoncms/feed/data.json?id=" + sensorinfo['Id'] + "&apikey1f8&start=&end=&dp=600"
        print data_url
        # read in the data from emoncms
        sock = urllib.urlopen(data_url)
        data_str = sock.read()
        sock.close()

        # data is output as a string so we convert it to a list of lists
        data_list = eval(data_str)
        myfile = open(feed_list['Name'[k]] + ".csv",'wb')

        wr = csv.writer(myfile, quoting=csv.QUOTE_ALL)

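The first part of the code gives me a very nice table, so I know my CSV data file is being opened and the information imported. My problem is this: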

So, in pseudocode, this is what I am trying to do:

for node in nodes (4 nodes so far)
    for index in indexes
        data_url = websiteinfo + Id + sampleinformation
        smalldata.read.csv(data_url)
        merge(bigdata, smalldata.no_time_column)

This is my first post here. I tried to keep it short while still providing the relevant data. Please let me know if I need to clarify anything.

Answer 1

Following your pseudocode, you can do this:

dfs = []
for node in nodes (4 nodes so far)
   for index in indexes
      data_url = websiteinfo + Id + sampleinformation
      df = smalldata.read.csv(data_url)
      dfs.append(df)

df = pd.concat(dfs)
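
As a rough, runnable sketch of the same idea: collect one DataFrame per sensor in a list, then call pd.concat once after the loop. The helper names build_url and collect_feeds, the apikey= query layout, and the load callable are all assumptions made for illustration; only the base URL and the Id and Name columns come from the question. It also assumes Id and Name are ordinary columns of sensorinfo, not part of its index.

```python
import pandas as pd

# Base URL taken from the question; the query layout below is an assumption.
BASE_URL = "http://www.mywebsite.com/emoncms/feed/data.json"


def build_url(feed_id, apikey, dp=600):
    """Build the data URL for one feed (hypothetical helper)."""
    return "{0}?id={1}&apikey={2}&start=&end=&dp={3}".format(
        BASE_URL, feed_id, apikey, dp
    )


def collect_feeds(sensorinfo, load, apikey):
    """Load one DataFrame per sensor row and stack them into one."""
    dfs = []
    for row in sensorinfo.itertuples(index=False):
        # `load` is any callable that turns a URL into a DataFrame.
        df = load(build_url(row.Id, apikey))
        df["Name"] = row.Name  # remember which sensor each row came from
        dfs.append(df)
    # One concat at the end is cheaper than growing a frame inside the loop.
    return pd.concat(dfs, ignore_index=True)
```

In real use, load could be something like pd.read_json, which accepts a URL. If you would rather have one column per sensor on a shared time index (closer to the "merge without the time column" step in the question), set the time column as the index of each piece and pass axis=1 to pd.concat instead.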

Using Pandas DataFrames with FOR loops
