Question

My pandas DataFrame consists of a list of texts, and I want to split that text:

df['c1']=['this is text one','this is text two','this is text three']

I tried:

new = df["c1"].str.split(",", n = 1, expand = True)

but the result in the new variable is not what I want.

Expected output:

c1='this is text one'
c1='this is text two'
c1='this is text three'

Any other output is fine too, as long as it splits the texts in the list. Thanks for your help. Full code:

import pandas as pd
data={"C1":[["this is text one","this is text two","this is text three"]]}
df=pd.DataFrame(data)
df.head()

Answer 1

Use np.concatenate() and call the DataFrame constructor (since you already have a list of strings):

import numpy as np
df_new=pd.DataFrame(np.concatenate(df.C1),columns=['C1'])
#or pd.DataFrame(df.C1.values.tolist()).T  (the column is then named 0, not C1)

                   C1
0    this is text one
1    this is text two
2  this is text three
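
For reference, here is a self-contained sketch of the approach above, using the data from the question. The `explode` line is an alternative that this answer does not mention; it assumes pandas 0.25 or newer, and the name `df_exploded` is chosen here only for illustration.

```python
import numpy as np
import pandas as pd

# Data from the question: a single cell holding a list of three strings.
df = pd.DataFrame(
    {"C1": [["this is text one", "this is text two", "this is text three"]]}
)

# Flatten the list(s) stored in C1 into a 1-D array, then build a new frame.
df_new = pd.DataFrame(np.concatenate(df["C1"].to_numpy()), columns=["C1"])

# Alternative (an assumption, not part of the answer): Series.explode
# turns each list element into its own row.
df_exploded = df["C1"].explode().reset_index(drop=True).to_frame()

print(df_new)
```

Both frames should end up with one string per row in a column named C1.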

Answer 2

You don't need pandas to split the array; a for loop will do.

This is all you need:


for i in df['C1']:
    for each in i:
        print(each) #outputs each element in the array
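
If the goal is to keep the strings rather than print them, the same nested loop can collect them into a flat list. This is a minimal sketch, assuming the data from the question; `flat` is a name chosen here for illustration.

```python
import pandas as pd

# Data from the question: a single cell holding a list of three strings.
df = pd.DataFrame(
    {"C1": [["this is text one", "this is text two", "this is text three"]]}
)

# Same two nested loops as above, written as a list comprehension:
# the outer loop walks the cells, the inner loop walks each cell's list.
flat = [each for cell in df["C1"] for each in cell]

print(flat)
```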

Answer 3

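Your question is a bit confusing: you say you already have a list of texts, so why do you want to split it? If you mean that you have a DataFrame whose strings need to be split on commas, you can do the following.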

import pandas as pd

df = pd.DataFrame()

df['c1']=['this is the first text, which has some commas, in it',
          'this is text two, which also has commas']

df['lists'] = df['c1'].apply(lambda txt: txt.split(','))

df.head()

Running df['lists'][0] then gives ['this is the first text', ' which has some commas', ' in it']
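
The same split can also be written with pandas' vectorized string accessor, which is what the question attempted. This is a minimal sketch, assuming the example frame from this answer; the `lists_clean` column and the whitespace stripping are additions for illustration, not part of the answer.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "c1": [
            "this is the first text, which has some commas, in it",
            "this is text two, which also has commas",
        ]
    }
)

# Vectorized equivalent of apply(lambda txt: txt.split(',')).
df["lists"] = df["c1"].str.split(",")

# Optional: strip the leading space that remains after each comma.
df["lists_clean"] = df["lists"].apply(lambda parts: [p.strip() for p in parts])

print(df["lists_clean"][0])
```

Passing `expand=True` to `str.split`, as in the question, would instead place each piece in its own column.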

Splitting a list of texts in a pandas DataFrame

3 answers: