My pandas DataFrame contains a list of texts, and I would like to split that text up:
df['c1']=['this is text one','this is text two','this is text three']
I tried:
new = df["c1"].str.split(",", n = 1, expand = True)
but I am not happy with the resulting variable.
Expected output:
c1='this is text one'
c1='this is text two'
c1='this is text three'
Any other output is fine too, as long as it splits up the texts in the list. Thanks for your help. Full code:
import pandas as pd
data={"C1":[["this is text one","this is text two","this is text three"]]}
df=pd.DataFrame(data)
df.head()
Answer 0 (score: 2)
Use np.concatenate() and call the DataFrame constructor (since you already have a list of strings):
import numpy as np
df_new = pd.DataFrame(np.concatenate(df.C1), columns=['C1'])
# or: pd.DataFrame(df.C1.values.tolist()).T  (the column is then named 0, not 'C1')
C1
0 this is text one
1 this is text two
2 this is text three
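As an alternative that the answer does not mention, pandas 0.25 and later also provide DataFrame.explode(), which turns each element of a list-valued cell into its own row. A minimal sketch, assuming the same one-row frame as in the question:

```python
import pandas as pd

# Same shape as the question's data: one row whose cell holds a list of strings.
df = pd.DataFrame(
    {"C1": [["this is text one", "this is text two", "this is text three"]]}
)

# explode() gives every list element its own row. The original index label (0)
# is repeated for each new row, so reset_index(drop=True) renumbers them.
df_new = df.explode("C1").reset_index(drop=True)
print(df_new)
```

Unlike np.concatenate(), this keeps any other columns of the frame and repeats their values for each exploded row.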
Answer 1 (score: 1)
You don't need pandas to split up the list; a for loop is enough.
This is what you need:
for i in df['C1']:
    for each in i:
        print(each)  # outputs each element of the list
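If the elements should be collected rather than printed, the same nested loop can be written with itertools.chain from the standard library. This is only a sketch under the assumption that every cell of C1 holds a list; the name flat is made up for illustration:

```python
from itertools import chain

import pandas as pd

df = pd.DataFrame(
    {"C1": [["this is text one", "this is text two", "this is text three"]]}
)

# chain.from_iterable walks every list stored in the column, in row order,
# and yields the elements one by one, like the nested for loop does.
flat = list(chain.from_iterable(df["C1"]))
print(flat)
```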
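Answer 2 (score: 0)
Your question is a bit confusing: you say you already have a list of texts, so why do you need to split it? If you mean that you have a DataFrame whose strings need to be split on commas, you can do the following.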
import pandas as pd
df = pd.DataFrame()
df['c1']=['this is the first text, which has some commas, in it',
'this is text two, which also has commas']
df['lists'] = df['c1'].apply(lambda txt: txt.split(','))
df.head()
Running df['lists'][0] then gives:
['this is the first text', ' which has some commas', ' in it']
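Note that the pieces after the first keep their leading space. If that is unwanted, one possible variation (a sketch, not part of the original answer) uses the vectorised Series.str.split() and then strips each piece:

```python
import pandas as pd

df = pd.DataFrame({
    "c1": [
        "this is the first text, which has some commas, in it",
        "this is text two, which also has commas",
    ]
})

# str.split(",") splits every string in the column on commas;
# the list comprehension then removes surrounding whitespace from each piece.
df["lists"] = df["c1"].str.split(",").apply(
    lambda parts: [part.strip() for part in parts]
)
print(df["lists"][0])
```

From here, df.explode("lists") would put each piece on its own row if a long format is preferred.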