How do I perform a full outer join / cross join of two DataFrames with no columns in common using pandas?
In MySQL, you can simply do:
SELECT *
FROM table_1
[CROSS] JOIN table_2;
But in pandas, doing:
df_1.merge(df_2, how='outer')
gives the error:
MergeError: No common columns to perform merge on
The best solution I have come up with so far is to use sqlalchemy:
import pandas as pd
import sqlalchemy as sa
# Round-trip both frames through a temporary SQLite database
engine = sa.create_engine('sqlite:///tmp.db')
df_1.to_sql('df_1', engine)
df_2.to_sql('df_2', engine)
# Let SQLite perform the join without a join condition
df = pd.read_sql_query('SELECT * FROM df_1 JOIN df_2', engine)
Answer 0 (score: 9)
IIUC, you need to merge on a temporary column tmp added to both DataFrames.
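A minimal sketch of the temporary-column approach, assuming two small example frames (the names df1, df2 and the columns a and b are made up for illustration and are not from the original post):

```python
import pandas as pd

# Hypothetical example frames with no columns in common.
df1 = pd.DataFrame({"a": [1, 2]})
df2 = pd.DataFrame({"b": ["x", "y", "z"]})

# Add the same constant key to both frames, merge on it, then drop it.
# assign() returns copies, so df1 and df2 themselves are not modified.
cross = (
    df1.assign(tmp=1)
    .merge(df2.assign(tmp=1), on="tmp")
    .drop(columns="tmp")
)

print(cross)
```

Every row of df1 is paired with every row of df2, so the result has len(df1) * len(df2) rows. In pandas 1.2 and later, df1.merge(df2, how="cross") should give the same pairing without the helper column.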
Answer 1 (score: 1)
Even in MySQL, you have to specify the fields to join on.
http://dev.mysql.com/doc/refman/5.7/en/join.html
Example:
SELECT * FROM t1 LEFT JOIN t2 ON (t1.a = t2.a);
The same concept applies in pandas:
Parameters:
right : DataFrame
how : {‘left’, ‘right’, ‘outer’, ‘inner’}, default ‘inner’
left: use only keys from left frame (SQL: left outer join)
right: use only keys from right frame (SQL: right outer join)
outer: use union of keys from both frames (SQL: full outer join)
inner: use intersection of keys from both frames (SQL: inner join)
on : label or list
Field names to join on. Must be found in both DataFrames. If on is None and not merging on indexes, then it merges on the intersection of the columns by default.
left_on : label or list, or array-like
Field names to join on in left DataFrame. Can be a vector or list of vectors of the length of the DataFrame to use a particular vector as the join key instead of columns
right_on : label or list, or array-like
Field names to join on in right DataFrame or vector/list of vectors per left_on docs
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.merge.html
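To make the quoted parameters concrete, here is a small sketch of how, left_on and right_on in use. The frames and column names (left, right, id_left, id_right) are hypothetical and chosen only for illustration:

```python
import pandas as pd

# Hypothetical frames whose key columns have different names.
left = pd.DataFrame({"id_left": [1, 2, 3], "x": ["a", "b", "c"]})
right = pd.DataFrame({"id_right": [2, 3, 4], "y": [20, 30, 40]})

# how="outer" keeps the union of keys from both frames;
# left_on / right_on name the key column on each side.
outer = left.merge(right, how="outer", left_on="id_left", right_on="id_right")

# how="inner" (the default) keeps only keys present in both frames.
inner = left.merge(right, left_on="id_left", right_on="id_right")

print(outer)
print(inner)
```

Keys that appear on only one side show up in the outer result with NaN in the other frame's columns, and are dropped from the inner result.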