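I have a DataFrame with a single column, "data", that contains space-separated words. I want to split the data into multiple rows, one word per row, splitting on whitespace. I tried the following code, but it does not work: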
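from itertools import chain
import numpy as np
import pandas as pd

def chainer(s):
    # flatten the per-row word lists into one list
    return list(chain.from_iterable(s.str.split(r'\s+')))

lengths = df['data'].str.split(r'\s+').map(len)
# repeats each whole string instead of the individual words
df_m = pd.DataFrame({"data": np.repeat(df["data"], lengths)})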
Example DataFrame:
words = ["a b c d e","b m g f e","c" ,"w"]
dff = pd.DataFrame({"data" :words })
        data
0  a b c d e
1  b m g f e
2          c
3          w
Answer 0 (score: 2)
Are you looking for something like this?
Input:
df = pd.DataFrame()
df['text'] = ['word1 word2 word3', 'hey there hello word', 'stackoverflow is amazing']
The DataFrame looks like this:
                       text
0         word1 word2 word3
1      hey there hello word
2  stackoverflow is amazing
Code to produce the output:
# the column in this example is named 'text', so select it by that name
x = df['text'].str.split(expand=True).stack().values
new_df = pd.DataFrame()
new_df['words'] = x.tolist()
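For reference, here is a self-contained sketch of the same split-and-stack idea applied to the sample frame from the question. The names `wide`, `long` and `new_df` are only illustrative, and the explicit `dropna()` is an added precaution in case the padding cells created by `expand=True` are not dropped by `stack()` in your pandas version.

```python
import pandas as pd

# sample frame from the question
dff = pd.DataFrame({"data": ["a b c d e", "b m g f e", "c", "w"]})

# split on runs of whitespace; expand=True yields one column per word,
# padding shorter rows with missing values
wide = dff["data"].str.split(expand=True)

# stack the columns into rows, then drop the padding cells explicitly
long = wide.stack().dropna()

# one word per row, with a fresh 0..n-1 index
new_df = pd.DataFrame({"words": long.tolist()})
print(new_df)
```

The words come out in row order, so the original sentence order is preserved.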
Answer 1 (score: 1)
Here is my attempt.
import pandas as pd

words = ['oneword', 'word1 word2 word3', 'hey there hello word', 'stackoverflow is amazing']
# split each string into words and flatten the result into a single list
flat_list = [item for sublist in words for item in sublist.split(' ')]
# put flat_list into a DataFrame, one word per row
df = pd.DataFrame({"data": flat_list})
print(df)
             data
0         oneword
1           word1
2           word2
3           word3
4             hey
5           there
6           hello
7            word
8   stackoverflow
9              is
10        amazing
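If the words already live in a DataFrame column rather than a plain list, a possible alternative is `Series.explode`, assuming pandas 0.25 or newer. This is only a sketch on the sample frame from the question; the variable name `exploded` is illustrative.

```python
import pandas as pd

# sample frame from the question
dff = pd.DataFrame({"data": ["a b c d e", "b m g f e", "c", "w"]})

exploded = (
    dff["data"]
    .str.split()             # each cell becomes a list of words
    .explode()               # one row per list element
    .reset_index(drop=True)  # replace the repeated source index with 0..n-1
    .to_frame()              # back to a single-column DataFrame named "data"
)
print(exploded)
```

Calling `str.split()` without a pattern splits on any run of whitespace, so repeated spaces do not produce empty strings the way `split(' ')` can.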