Splitting a list of text in a pandas DataFrame

Time: 2019-05-12 09:49:47

Tags: python string pandas list python-3.6

My pandas DataFrame consists of a list of texts, and I want to split the text at the commas.

df['c1']=['this is text one','this is text two','this is text three']

I tried

new = df["c1"].str.split(",", n = 1, expand = True)

but I am not satisfied with the new variable.

Expected output

c1='this is text one'
c1='this is text two'
c1='this is text three'

Any other output is fine too, as long as it splits the text in the list. Thanks for your help. Full code:

import pandas as pd
data={"C1":[["this is text one","this is text two","this is text three"]]}
df=pd.DataFrame(data)
df.head()

3 answers:

Answer 0 (score: 2)

Use np.concatenate() and call the DataFrame constructor (since you already have a list of strings):

import numpy as np
df_new=pd.DataFrame(np.concatenate(df.C1),columns=['C1'])
#or pd.DataFrame(df.C1.values.tolist()).T

                   C1
0    this is text one
1    this is text two
2  this is text three
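
As a possible alternative to the answer above (not part of the original answer), here is a minimal sketch using DataFrame.explode, assuming pandas 0.25 or later. The name df_exploded is only illustrative.

```python
import pandas as pd

# Same data as in the question: one row whose cell holds a list of strings.
data = {"C1": [["this is text one", "this is text two", "this is text three"]]}
df = pd.DataFrame(data)

# explode() gives every list element its own row. The original index label (0)
# is repeated for each element, so reset_index(drop=True) renumbers the rows.
df_exploded = df.explode("C1").reset_index(drop=True)

print(df_exploded)
```

Unlike np.concatenate(), this keeps any other columns of the DataFrame aligned with the exploded values, which may matter if the real data has more than one column.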

Answer 1 (score: 1)

You don't need pandas to split the array; you can use a for loop.

This is what you need:


for i in df['C1']:
    for each in i:
        print(each) #outputs each element in the array
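
If the strings should be collected rather than printed, the nested loop can be flattened into a single list. This is only a sketch under the assumption that every cell of C1 holds a list; the name texts is illustrative.

```python
from itertools import chain

import pandas as pd

# Same data as in the question: one row whose cell holds a list of strings.
df = pd.DataFrame({"C1": [["this is text one", "this is text two", "this is text three"]]})

# chain.from_iterable walks each list in the column in turn,
# which is equivalent to the nested for loop above.
texts = list(chain.from_iterable(df["C1"]))

for text in texts:
    print(text)
```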

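Answer 2 (score: 0)

Your question is a bit confusing: you say you already have a list of texts, so why split it? If you mean that you have a DataFrame whose strings need to be split on commas, you can do the following.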

import pandas as pd

df = pd.DataFrame()

df['c1']=['this is the first text, which has some commas, in it',
          'this is text two, which also has commas']

df['lists'] = df['c1'].apply(lambda txt: txt.split(','))

df.head()

Running df['lists'][0] then gives ['this is the first text', ' which has some commas', ' in it']
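
Note that the pieces after the first keep their leading space. A small variation, offered as a sketch rather than as part of the original answer, uses the vectorized Series.str.split and then strips each piece; the column name lists_clean is hypothetical.

```python
import pandas as pd

df = pd.DataFrame()

df['c1'] = ['this is the first text, which has some commas, in it',
            'this is text two, which also has commas']

# Series.str.split(',') splits every string in the column on commas.
# The list comprehension then removes the whitespace left around each piece.
df['lists_clean'] = df['c1'].str.split(',').apply(
    lambda parts: [part.strip() for part in parts]
)

print(df['lists_clean'][0])
```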