Split a column without changing row positions

Date: 2019-11-06 18:08:31

Tags: python pandas split

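I am trying to split a column, but I noticed that the split changes other values. For example, some values in row 10 are swapped with those of row 8. Why?

Actual data for ID 10: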

| vat_number | email                                            | foi_mail       | website 
|     10     | abc@test.com;example@test.com;example@test.com   | xyz@test.com   | example.com

After running the following line of code:

base_data[['email','email_1','email_2']] = pd.DataFrame(base_data.email.str.split(';').tolist(),
                                                        columns = ['email','email_1','email_2'])

base_data becomes:

| vat_number | email                  | foi_mail               | website     | email_1 | email_2
|     10     | some other row value   | some other row value   | example.com | ------  | -----

Before:

[Screenshot: data before executing the code that splits the column]

After:

[Screenshot: data after executing the code that splits the column]

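The data contains thousands of rows, but I have shown only one.

3 Answers:

Answer 0 (score: 0)

Try building a table within a table (a list of lists):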

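def test():
    base_data = []  # a list of lists acts as a simple table
    base_data.append(['12', '32'])
    base_data.append(['352', '335'])
    base_data.append(['232', '32'])

    print(base_data)
    a = base_data[0]  # first row
    print(a)
    print(a[0])  # first cell of the first row
    print(a[1])  # second cell of the first row

    input("Press Enter to continue...")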

Then add the rows using a loop.
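The loop this answer mentions could be sketched roughly as follows. The `raw_emails` list is a hypothetical stand-in for the email column, not data from the question; each string is split on `;` and appended in its original order, so row positions do not move.

```python
# Hypothetical input: one semicolon-separated string per row.
raw_emails = [
    "abc@test.com;example@test.com;example@test.com",
    "xyz@test.com",
]

base_data = []
for value in raw_emails:
    # Each row becomes a list of its individual addresses.
    base_data.append(value.split(";"))

for row in base_data:
    print(row)
```

Rows may end up with different lengths, which a plain list of lists tolerates but a DataFrame would pad with missing values.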

Answer 1 (score: 0)

If I understand the situation correctly, I believe you need something like this:

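base_data = base_data.merge(
    base_data['email'].str.split(';', expand=True)
                      .rename(columns={0: 'email', 1: 'email_1', 2: 'email_2'}),
    left_index=True, right_index=True)
# Note: base_data already has an 'email' column, so merge keeps both
# and labels them 'email_x' (original) and 'email_y' (first split part).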

Here is an explanation of the logic:

import pandas as pd

a1 = list('abcdef')
b1 = list('fedcba')
c1 = [f'{x[0]};{x[1]}' for x in zip(a1, b1)]
df1 = pd.DataFrame({'c1':c1})
df1

Out[1]:

    c1
0   a;f
1   b;e
2   c;d
3   d;c
4   e;b
5   f;a

df1 = df1.merge(df1['c1'].str.split(';', expand = True).rename(columns = {0:'c2',1:'c3'}), left_index = True, right_index = True)
df1

Out[2]:

    c1  c2  c3
0   a;f a   f
1   b;e b   e
2   c;d c   d
3   d;c d   c
4   e;b e   b
5   f;a f   a
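One likely reason the original code appears to swap rows is index alignment. This is an assumption, since the question does not show the index of `base_data`. `pd.DataFrame(...tolist())` builds a new frame labelled 0..n-1, and pandas matches rows by index label on assignment, not by position. The sketch below uses a hypothetical three-row frame whose index is deliberately out of order; the column name `first` is also made up for illustration.

```python
import pandas as pd

# Hypothetical frame whose index is NOT 0, 1, 2 in order
# (as can happen after sorting or filtering).
base_data = pd.DataFrame(
    {"email": ["a1@test.com;a2@test.com",
               "b1@test.com;b2@test.com",
               "c1@test.com;c2@test.com"]},
    index=[2, 0, 1],
)

# Rebuilding from a plain list discards the labels: rows are relabelled 0, 1, 2.
rebuilt = pd.DataFrame(base_data["email"].str.split(";").tolist(),
                       columns=["first", "second"])

# Assignment aligns on labels, so each row receives another row's value.
wrong = base_data.assign(first=rebuilt["first"])

# expand=True keeps the original labels, so values stay in their own rows.
expanded = base_data["email"].str.split(";", expand=True)
right = base_data.assign(first=expanded[0])

print(wrong["first"].tolist())  # ['c1@test.com', 'a1@test.com', 'b1@test.com']
print(right["first"].tolist())  # ['a1@test.com', 'b1@test.com', 'c1@test.com']
```

Merging on the index, as in the answer above, works for the same reason: the split result carries the labels of the rows it came from.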

Answer 2 (score: 0)

Use the expand parameter of .str.split:

import pandas as pd

# your dataframe
 vat_number                                           email      foi_mail      website
        NaN  abc@test.com;example@test.com;example@test.com  xyz@test.com  example.com

# split and expand
df[['email_1', 'email_2', 'email_3']] = df['email'].str.split(';', expand=True)

# drop `email` col
df.drop(columns='email', inplace=True)

# result
 vat_number      foi_mail      website       email_1           email_2           email_3
        NaN  xyz@test.com  example.com  abc@test.com  example@test.com  example@test.com
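With thousands of rows, not every row may contain exactly three addresses, and assigning to a fixed list of three column names fails when the split yields a different number of columns. A minimal sketch that names the columns from whatever the split produces is shown below. The second row and the `parts` and `result` names are hypothetical additions for illustration.

```python
import pandas as pd

# Hypothetical data: the second row has only one address.
df = pd.DataFrame({
    "vat_number": [10, 11],
    "email": ["abc@test.com;example@test.com;example@test.com",
              "solo@test.com"],
    "foi_mail": ["xyz@test.com", "other@test.com"],
})

# Split once; the result keeps df's index, so rows stay aligned.
parts = df["email"].str.split(";", expand=True)

# Name the columns email_1 .. email_N, where N is the widest row.
parts.columns = [f"email_{i + 1}" for i in range(parts.shape[1])]

# join() matches on the index, then the original column is dropped.
result = df.drop(columns="email").join(parts)
print(result)
```

Rows with fewer addresses get missing values in the trailing columns instead of raising an error.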