How do I join the following data into a single string?
# Convert Spark DataFrame to Pandas
pandas_df = df.toPandas()
print pandas_df
   age     name
0  NaN  Michael
1   30     Andy
2   19   Justin
My current attempt:
persons = ""
for index, row in pandas_df.iterrows():
    persons += str(row['name']) + ", " + str(row['age']) + "/ "
    print row['name'], row['age']
print persons
Result:
Michael, nan/ Andy, 30.0/ Justin, 19.0/
But what I am after is this (no slash at the end):
Michael, nan/ Andy, 30.0/ Justin, 19.0
Answer 0 (score: 3)
If you want to keep the loop approach, you can remove the trailing / by stripping it from the right side with rstrip(). Example -
persons = ""
for index, row in pandas_df.iterrows():
    persons += str(row['name']) + ", " + str(row['age']) + "/ "
    print row['name'], row['age']
# Strip the trailing "/ " once, after the loop has finished
persons = persons.rstrip("/ ")
print persons
Example/Demo -
>>> person = "Michael, nan/ Andy, 30.0/ Justin, 19.0/ "
>>> person = person.rstrip('/ ')
>>> person
'Michael, nan/ Andy, 30.0/ Justin, 19.0'
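One detail worth keeping in mind with this approach: str.rstrip() treats its argument as a set of characters rather than as a literal suffix, so it keeps removing trailing '/' and ' ' characters until it reaches something else. A minimal sketch of that behaviour (the second string is made up purely for illustration):

```python
# str.rstrip(chars) removes any trailing characters found in `chars`;
# it does not match "/ " as a fixed suffix.
persons = "Michael, nan/ Andy, 30.0/ Justin, 19.0/ "
trimmed = persons.rstrip("/ ")
print(trimmed)

# Illustrative input: every trailing '/' and ' ' is removed, not just one pair.
over_trimmed = "a/ b/ / /".rstrip("/ ")
print(over_trimmed)
```

For the data in the question this is harmless, but it would matter if the last record could itself end in a slash or a space.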
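But if you do not actually need the print row['name'], row['age'] inside the loop, you can turn the loop into a generator expression and let str.join() do the work, since it only puts the separator between items. Example -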
# "/ " and ", " are used as separators so the result matches the desired output
persons = "/ ".join(", ".join([str(row['name']), str(row['age'])]) for _, row in pandas_df.iterrows())
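For anyone who wants to try the join-based version without a Spark session, here is a self-contained sketch. It assumes pandas is installed and rebuilds the question's table by hand as a stand-in for df.toPandas(); the separators "/ " and ", " are chosen to match the output the question asks for.

```python
import pandas as pd

# Hand-built stand-in for the DataFrame returned by df.toPandas() in the question.
pandas_df = pd.DataFrame(
    {"age": [float("nan"), 30, 19], "name": ["Michael", "Andy", "Justin"]}
)

# Build one "name, age" item per row and let join() place "/ " between items only.
persons = "/ ".join(
    ", ".join([str(row["name"]), str(row["age"])])
    for _, row in pandas_df.iterrows()
)
print(persons)
```

The ages print as 30.0 and 19.0 because the missing value forces the column to a floating-point dtype.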
Answer 1 (score: 2)
I think this does it:
persons = []
str_persons = ""
for index, row in pandas_df.iterrows():
    persons.append(str(row['name']) + ", " + str(row['age']))
# Join once after the loop; the separator only appears between items
str_persons = "/ ".join(persons)
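The same collect-then-join idea can be wrapped in a small function. This is only a sketch: join_people is a hypothetical helper name, and it takes plain (name, age) pairs instead of DataFrame rows so that it runs without pandas.

```python
def join_people(rows, sep="/ "):
    """Format (name, age) pairs as 'name, age' items joined by `sep`.

    Hypothetical helper for illustration; not part of pandas or Spark.
    """
    return sep.join("{}, {}".format(name, age) for name, age in rows)


# Plain tuples standing in for the rows of pandas_df.
rows = [("Michael", float("nan")), ("Andy", 30.0), ("Justin", 19.0)]
result = join_people(rows)
print(result)
```

With a DataFrame, the pairs could be supplied from the name and age columns, for example via zip(pandas_df['name'], pandas_df['age']).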
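Answer 2 (score: 1)
You can achieve this easily with a one-liner that is vectorized (here df is assumed to be the pandas DataFrame, i.e. pandas_df in the question, not the Spark one):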
In [10]:
'/ '.join(df['name'] + ', ' + df['age'].astype(str))
Out[10]:
'Michael, nan/ Andy, 30.0/ Justin, 19.0'