I have a table:
Name1 Name2 Name3
0 ABC FGD NNY
1 111S PC 1T Trees are always yellow NaN NaN
2 P FGD NNY
3 JJJ FGD NNY
4 111S PC 1T Trees are always yellow NaN NaN
5 ABC FGD NNY
6 UIK GJ DE
I want to get this:
Name1 Name2 Name3 Name4
0 ABC FGD NNY NaN
1 111S PC 1T Trees are always yellow
2 P FGD NNY NaN
3 JJJ FGD NNY NaN
4 111S PC 1T Trees are always yellow
5 ABC FGD NNY NaN
6 UIK GJ DE NaN
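I only need to split some of the rows; the others should stay unchanged. I was able to identify the rows whose data needs to be split:
# isnull must be called; the bare method object is always truthy
if df[colname1].isnull().any():
    df_index = df[df[colname1].isnull()].index
    print(df_index)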
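Now I need to separate the values in the string. I have something like this:
if df[colname1].isnull().any():
    df_index = df[df[colname1].isnull()].index
    print(df_index)
    for i in df_index:
        print(i)
        # split the combined string of row i into its pieces
        df1 = df.loc[i, colname].split(' ')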
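df1 holds the pieces of information I need (as a list of strings), but I don't know how to put them back into the DataFrame df at the right index. Can you help me?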
Answer 0 (score: 1)
Use str.split with the n parameter, which limits the number of splits so the sentence stays in one piece:
s=df.fillna('').apply(' '.join,1)
s.str.split(' ',n=3)
Out[189]:
0 [ABC, FGD, NNY]
1 [111S, PC, 1T, Trees are always yellow ]
2 [P, FGD, NNY]
3 [JJJ, FGD, NNY]
4 [111S, PC, 1T, Trees are always yellow ]
5 [ABC, FGD, NNY]
6 [UIK, GJ, DE]
dtype: object
pd.DataFrame(s.str.split(' ',n=3).tolist())
Out[190]:
0 1 2 3
0 ABC FGD NNY None
1 111S PC 1T Trees are always yellow
2 P FGD NNY None
3 JJJ FGD NNY None
4 111S PC 1T Trees are always yellow
5 ABC FGD NNY None
6 UIK GJ DE None
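The result above has integer column labels and a trailing space in the sentence (left over from joining the empty strings). A minimal sketch of the same idea that restores the column names might look like the following. The sample frame is a shortened stand-in for the question's data, and it assumes the values in Name1 are separated by single spaces; `joined`, `parts`, and `result` are names chosen here for illustration.

```python
import numpy as np
import pandas as pd

# Shortened sample mirroring the question (assumption: single-space separators).
df = pd.DataFrame({
    "Name1": ["ABC", "111S PC 1T Trees are always yellow", "P"],
    "Name2": ["FGD", np.nan, "FGD"],
    "Name3": ["NNY", np.nan, "NNY"],
})

# Join each row into one string, dropping NaN so no trailing spaces remain.
joined = df.apply(lambda row: " ".join(row.dropna().astype(str)), axis=1)

# Split on at most 3 separators so the sentence ends up in a single column.
parts = joined.str.split(" ", n=3, expand=True)

# Guarantee four columns even if no row happens to contain a sentence.
parts = parts.reindex(columns=range(4))
parts.columns = ["Name1", "Name2", "Name3", "Name4"]
result = parts
```

Rows that only had three values end up with a missing value in Name4, which matches the desired output in the question.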
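Answer 1 (score: 0)
IIUC, the columns are separated by a double space, while the words inside the sentence are separated by a single space. You can use that to perform the split.
idx = df.loc[df.Name2.isnull()].index
df['Name4'] = np.nan
# split on the double space that separates the columns
df.loc[idx] = df.loc[idx].Name1.str.split('  ', expand=True).values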
Name1 Name2 Name3 Name4
0 ABC FGD NNY NaN
1 111S PC 1T Trees are always yellow
2 P FGD NNY NaN
3 JJJ FGD NNY NaN
4 111S PC 1T Trees are always yellow
5 ABC FGD NNY NaN
6 UIK GJ DE NaN
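For reference, here is a self-contained sketch of this approach. It assumes that the combined rows really do use two (or more) spaces between columns, which cannot be verified from the table as displayed, and that every combined row splits into exactly four parts. The sample frame and the names `mask`, `cols`, and `parts` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; the two-space separator inside Name1 is an assumption.
df = pd.DataFrame({
    "Name1": ["ABC", "111S  PC  1T  Trees are always yellow", "UIK"],
    "Name2": ["FGD", np.nan, "GJ"],
    "Name3": ["NNY", np.nan, "DE"],
})

# Rows where everything ended up in Name1.
mask = df["Name2"].isna()

# Split only those rows on runs of two or more whitespace characters.
parts = df.loc[mask, "Name1"].str.split(r"\s{2,}", expand=True)

# Add the new column as object dtype, then overwrite only the affected rows.
cols = ["Name1", "Name2", "Name3", "Name4"]
df["Name4"] = None
df.loc[mask, cols] = parts.to_numpy()
```

Creating Name4 with None rather than np.nan keeps the column as object dtype, so writing strings into it does not trigger a dtype change.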