graphlab create SFrame: combining two columns

Time: 2016-08-23 02:35:50

Tags: python pandas graphlab

I have two columns of strings, say col1 and col2. How can I combine the contents of col1 and col2 into a new column col3 in a graphlab SFrame?

col1 col2
23    33
42    11
........

into

col3
23,33
42,11
....

unstack can only produce an SArray or a dict, but I only have a single word.

I tried:

user_info['X5']=user_info['X3'].apply(lambda x:x+','+user_info['X4'].apply(lambda y:y))

That does not seem right.

Any ideas?

1 answer:

Answer 0: (score: 3)

Using a graphlab SFrame (the printed output below is SFrame output, not a pandas DataFrame):

In [17]: sf
Out[17]: 
Columns:
    col1    int
    col2    int

Rows: 2

Data:
+------+------+
| col1 | col2 |
+------+------+
|  23  |  33  |
|  42  |  11  |
+------+------+
[2 rows x 2 columns]

In [18]: sf['col3'] = sf['col1'].apply(str) + ',' + sf['col2'].apply(str)

In [19]: sf
Out[19]: 
Columns:
    col1    int
    col2    int
    col3    str

Rows: 2

Data:
+------+------+-------+
| col1 | col2 |  col3 |
+------+------+-------+
|  23  |  33  | 23,33 |
|  42  |  11  | 42,11 |
+------+------+-------+
[2 rows x 3 columns]
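
The expression in In [18] works element by element: each value is converted to a string first, and the two string columns are then joined with a comma. The attempt in the question fails because its inner lambda refers to the whole X4 column rather than to the value in the current row. As a minimal sketch of the same row-by-row logic in plain Python (no graphlab needed; the `rows` list and the `combine` helper are hypothetical names used only for illustration):

```python
# Plain-Python sketch of the element-wise combination; no graphlab required.
# `rows` stands in for the SFrame, with each row modelled as a dict.
rows = [
    {"col1": 23, "col2": 33},
    {"col1": 42, "col2": 11},
]


def combine(row, sep=","):
    """Join the col1 and col2 values of a single row into one string."""
    return str(row["col1"]) + sep + str(row["col2"])


# Build the new column by applying the helper to every row.
col3 = [combine(row) for row in rows]
print(col3)
```

Since SFrame.apply passes each row to its function as a dict, a helper of this shape could presumably also be used as sf['col3'] = sf.apply(combine), though the column-wise expression in In [18] is shorter.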
