Pandas merge not working correctly on my DataFrames

Asked: 2019-12-15 12:06:14

Tags: pandas merge

I have two DataFrames:

  1. data: len(data) = 50000
  2. data_ta: len(data_ta) = 47504

RangeIndex: 47504 entries, 0 to 47503
Data columns (total 9 columns):
ID_TA                47504 non-null object
special_city_rank    35554 non-null object
city_rank            42614 non-null object
TA_num_rew           47366 non-null object
TA_price_range       43202 non-null object
TA_cusine_style      43202 non-null object
services             29250 non-null object
category_marks       47504 non-null object
en_rew_marks         47504 non-null object
dtypes: object(9)
memory usage: 3.3+ MB



<class 'pandas.core.frame.DataFrame'>
RangeIndex: 50000 entries, 0 to 49999
Data columns (total 11 columns):
Restaurant_id        50000 non-null object
City                 50000 non-null object
Cuisine Style        38410 non-null object
Ranking              50000 non-null float64
Price Range          32639 non-null object
Number of Reviews    46800 non-null float64
Reviews              49998 non-null object
URL_TA               50000 non-null object
ID_TA                50000 non-null object
sample               50000 non-null int64
Rating               50000 non-null float64
dtypes: float64(3), int64(1), object(7)
memory usage: 4.2+ MB


Both have the same column, ID_TA.

I use the command:

data.merge(data_TA, how="left", on="ID_TA")

and the resulting DataFrame has len = 73780:

<class 'pandas.core.frame.DataFrame'>
Int64Index: 73780 entries, 0 to 73779
Data columns (total 19 columns):
Restaurant_id        73780 non-null object
City                 73780 non-null object
Cuisine Style        56721 non-null object
Ranking              73780 non-null float64
Price Range          48250 non-null object
Number of Reviews    69082 non-null float64
Reviews              73778 non-null object
URL_TA               73780 non-null object
ID_TA                73780 non-null object
sample               73780 non-null int64
Rating               73780 non-null float64
special_city_rank    35604 non-null object
city_rank            42668 non-null object
TA_num_rew           47422 non-null object
TA_price_range       43256 non-null object
TA_cusine_style      43256 non-null object
services             29296 non-null object
category_marks       47560 non-null object
en_rew_marks         47560 non-null object
dtypes: float64(3), int64(1), object(15)
memory usage: 11.3+ MB

I don't understand what is going on. What should I do?

It should be 50000.
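
For reference, a left merge returns one row per matching pair of keys, so the result can be longer than the left frame when ID_TA values repeat in the right frame. The sketch below illustrates that behaviour on made-up toy data and shows one possible way to check for it. Whether duplicate ID_TA values are actually the cause here is an assumption, not something the post confirms. The toy values and the names n_dup, naive, right_unique and merged are illustrative only.

```python
import pandas as pd

# Toy stand-ins for the two frames in the question (all values are made up).
data = pd.DataFrame({"ID_TA": ["d1", "d2", "d3"], "Rating": [4.0, 3.5, 5.0]})
data_ta = pd.DataFrame({
    "ID_TA": ["d1", "d1", "d2"],   # "d1" appears twice on the right side
    "city_rank": ["10", "10", "7"],
})

# Count right-hand rows whose ID_TA repeats an earlier row.
n_dup = int(data_ta["ID_TA"].duplicated().sum())

# A left merge emits one row per matching pair, so the duplicate multiplies rows.
naive = data.merge(data_ta, how="left", on="ID_TA")

# Assumption: keeping the first row per key is acceptable for this data.
right_unique = data_ta.drop_duplicates(subset="ID_TA", keep="first")

# validate="many_to_one" makes pandas raise MergeError if right keys are not unique.
merged = data.merge(right_unique, how="left", on="ID_TA", validate="many_to_one")

print(n_dup, len(naive), len(merged))  # prints: 1 4 3
```

With the right-hand keys made unique, the left merge keeps exactly len(data) rows. Which duplicate row to keep (here keep="first") is a choice that depends on the data and is not implied by the post.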

0 answers