Question

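Hello, I want to merge two DataFrames that I loaded from Excel. I converted the columns that should be merged to "str". Unfortunately, the code merges the first row but then returns NaN values. The code I use is: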

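import pandas as pd

# sheet_name replaces the removed sheetname keyword; plain str replaces the removed np.str alias
ListA=pd.read_excel(inpath,sheet_name="Tabelle2")
ListA["Stücklistenkomponente"]=ListA["Material"].astype(str)
ListB=pd.read_excel(inpath,sheet_name="Tabelle1")
ListB["Stücklistenkomponente"]=ListB["Material"].astype(str)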
print(ListA.dtypes)
print(ListB.dtypes)

Material    object

Material    object

The two DataFrames look like this:

ListA

Material
R 22B 2.0 7.72 11.0 Lo
X 127 1.5x4.64x4[G16.05.01] CL
L 431 2x6,96x5.5 Y
9999
L 431 2x5,96x5.5 p
F 631 2x6,96x5.5 a
N 431 2x6,96x5.5 v
J 431 2x6,96x5.5 
O 431 2x6,96x5.5 
VM 431 2x6,96x5.5 L

ListB

   Material                          InnerDiameter  OuterDiameter   Length  
    R 22B 2.0 7.72 11.0 Lo           2              6               8
    X 127 1.5x4.64x4[G16.05.01] CL   2              7               12
    L 431 2x6,96x5.5 Y               5              8               13
    9999                             0              0               0
    L 431 2x5,96x5.5 p               6              9               15
    F 631 2x6,96x5.5 a               8              5               26
    N 431 2x6,96x5.5 v               9              1               3    
    J 431 2x6,96x5.5                 12             6               89 
    O 431 2x6,96x5.5                 5              4               12  
    VM 431 2x6,96x5.5 L              4             12               7

The merge returns:

           Material       InnerDiameter    OuterDiameter  Lenth  
           R 22B 2.0 7.72 11.0 Lo    2                 6      8   
                   NaN              NaN               NaN    NaN   
                   NaN              NaN               NaN    NaN   
                   NaN              NaN               NaN    NaN   
                   NaN              NaN               NaN    NaN   
                   NaN              NaN               NaN    NaN   
                   NaN              NaN               NaN    NaN   
                   NaN              NaN               NaN    NaN   
                   NaN              NaN               NaN    NaN   
                   NaN              NaN               NaN    NaN   
                   NaN              NaN               NaN    NaN

So what am I doing wrong? I thought the solution was to convert both columns to string dtype, but that does not work.

Thanks for any help!

Answer 1

I think the data must differ in some way, possibly in leading or trailing whitespace, because .astype(str) does convert the values to strings correctly.

If the values are strings, dicts, sets, or lists, the column dtype is object.

The type of each individual value, however, is still str, dict, and so on.

You can check this with:

print(ListA["Stücklistenkomponente"].apply(type))
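To illustrate the difference between the column dtype and the type of each value, here is a minimal sketch on a made-up Series (the values are illustrative and not taken from the workbook in the question):

```python
import pandas as pd

# Hypothetical column mixing a string, an integer and a float
s = pd.Series(["L 431 2x6,96x5.5 Y", 9999, 1.5], dtype=object)

print(s.dtype)  # column-level dtype: object

# Per-value Python types, which the column dtype does not reveal
element_types = s.apply(type).tolist()
print(element_types)

# After astype(str) every value is a Python str
converted = s.astype(str)
converted_types = converted.apply(type).tolist()
print(converted_types)
```

If every value already shows up as str here, the dtype conversion is not the problem and the mismatch has to be in the string contents themselves.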

To inspect the data, it sometimes helps to convert the columns to lists:

print(ListA["Stücklistenkomponente"].tolist())
print(ListB["Stücklistenkomponente"].tolist())
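If the lists reveal stray whitespace, stripping the keys on both sides before merging is one possible fix. The sketch below uses two small hypothetical frames (not the asker's data) in which one key has a trailing space, and uses indicator=True to show which rows found a match:

```python
import pandas as pd

# Hypothetical frames: the second key in `left` has a trailing space
left = pd.DataFrame({"Material": ["R 22B 2.0", "L 431 2x6 "]})
right = pd.DataFrame({"Material": ["R 22B 2.0", "L 431 2x6"],
                      "Length": [8, 13]})

# repr() makes otherwise invisible characters visible
print([repr(v) for v in left["Material"]])

# indicator=True adds a `_merge` column saying where each row came from
before = pd.merge(left, right, on="Material", how="left", indicator=True)
print(before)

# Normalize the keys on both sides, then merge again
left["Material"] = left["Material"].astype(str).str.strip()
right["Material"] = right["Material"].astype(str).str.strip()
after = pd.merge(left, right, on="Material", how="left", indicator=True)
print(after)
```

Rows marked left_only in the `_merge` column are the ones that end up with NaN in the columns coming from the right frame.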

EDIT:

I tested your data, and the result is very interesting:

df1 = pd.read_excel('Mappe3.xlsx',sheet_name="Tabelle2")
df2 = pd.read_excel('Mappe3.xlsx',sheet_name="Tabelle1")

#default inner join - rows are duplicated because the key column contains duplicate values
#the on parameter can be omitted when the frames share only one column name
df = pd.merge(df1, df2)
print (df.head(10))
                   Stücklistenkomponente Ritzel_Materialnummer  \
0  RITZEL 22F 2.0 7.72 11.0 Z17 SCHWEISS           401.4425.13   
1  RITZEL 22F 2.0 7.72 11.0 Z17 SCHWEISS           401.4425.13   
2  RITZEL 22F 2.0 7.72 11.0 Z17 SCHWEISS           401.4425.13   
3  RITZEL 22F 2.0 7.72 11.0 Z17 SCHWEISS           401.4425.13   
4  RITZEL 22F 2.0 7.72 11.0 Z17 SCHWEISS           401.4425.13   
5  RITZEL 22F 2.0 7.72 11.0 Z17 SCHWEISS           401.4425.13   
6  RITZEL 22F 2.0 7.72 11.0 Z17 SCHWEISS           401.4425.13   
7  RITZEL 22F 2.0 7.72 11.0 Z17 SCHWEISS           401.4425.13   
8  RITZEL 22F 2.0 7.72 11.0 Z17 SCHWEISS           401.4425.13   
9  RITZEL 22F 2.0 7.72 11.0 Z17 SCHWEISS           401.4425.13   
...
...

#remove duplicates in both df
df1 = df1.drop_duplicates('Stücklistenkomponente')
df2 = df2.drop_duplicates('Stücklistenkomponente')

#default inner join - only 5 same categories
df = pd.merge(df1, df2)
print (df)
                   Stücklistenkomponente Ritzel_Materialnummer  \
0  RITZEL 22F 2.0 7.72 11.0 Z17 SCHWEISS           401.4425.13   
1  RITZEL 22F 3.0 7.72 11.0 Z17 SCHWEISS           401.4425.15   
2       RITZEL 22F 3.0 7.9 6.0 Z17 PRESS           401.4425.11   
3       RITZEL 22F 3.0 6.0 15.0 PRESS Z8           401.4487.01   
4       RITZEL 22F 4.0 7.9 6.0 Z17 PRESS           401.4425.14   

  Innendurchmesser  Außendurchmesser  Länge         Material1 Material2  \
0                2              7.72   11.0           X46Cr13         -   
1                3              7.72   11.0           X46Cr13         -   
2                4              7.90    6.0  42CrMo4 vergütet         -   
3                3              6.00   15.0  42CrMo4 vergütet         -   
4                2              7.90    6.0  42CrMo4 vergütet         -   

  Material3  
0         -  
1         -  
2         -  
3         -  
4         -
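The row multiplication seen in the first merge can be reproduced on a small scale. This is only a sketch with hypothetical frames and column names (`key`, `x`, `y`); the validate argument is an extra safeguard that is not part of the code above:

```python
import pandas as pd

# Hypothetical frames with a repeated key on both sides
df1 = pd.DataFrame({"key": ["a", "a", "b"], "x": [1, 2, 3]})
df2 = pd.DataFrame({"key": ["a", "a", "b"], "y": [10, 20, 30]})

# Each "a" on the left pairs with each "a" on the right: 2 * 2 + 1 = 5 rows
many = pd.merge(df1, df2, on="key")
print(many)

# Keep the first row per key in both frames
d1 = df1.drop_duplicates("key")
d2 = df2.drop_duplicates("key")

# validate raises pandas.errors.MergeError if a key repeats on either side
unique = pd.merge(d1, d2, on="key", validate="one_to_one")
print(unique)
```

Note that drop_duplicates keeps only the first row for each key, so it is worth checking that the discarded rows really are redundant.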

Merge returns NaN except for the first row
