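I am trying to reassign the values of the lists in this Series with new stemmed strings, but I can't figure out how. Currently I create a new list and append each new stem to it, but that just puts every word from every list into the single list ns. What I want is to update each existing list in place with its new stems.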
words = data.String.apply(lambda x: word_tokenize(x))
ns =[]
#print(words)
for i in words:
    for j in i:
        ns.append(ps.stem(j))
For example, words =
0 [I, loved, dogs, because, they, are, cute, and...
1 [my, dog, is, looking, at, me, weird, maybe, c...
2 [I, think, I, look, like, a, cupacake, one, wi...
3 [do, you, want, to, be, a, snowman, no, thanks...
4 [hey, do, you, know, what, time, it, is, cooking,...
5 [dogs, are, so, awesome, dogs, are, so, awesome]
After stemming the words in the for loop, words should look like this:
0 [I, love, dog, becaus, they, are, cute, and...
1 [my, dog, is, look, at, me, weird, maybe, c...
2 [I, think, I, look, like, a, cupacake, one, wi...
3 [do, you, want, to, be, a, snowman, no, thank...
4 [hey, do, you, know, what, time, it, is, cook,...
5 [dog, are, so, awesome, dog, are, so, awesome]
In:
print(type(words))
print(type(words[1]))
print(type(words[1][1]))
Out:
<class 'pandas.core.series.Series'>
<class 'list'>
<class 'str'>
Any ideas?
Thanks!
Answer 0 (score: 1)
Use a list comprehension with the ps.stem function:
print (data)
String
0 I loved dogs because they are cute and
1 my dog is looking at me weird maybe
2 I think I look like a cupacake one
3 do you want to be a snowman no thanks
4 hey do you know what time it is cooking
5 dogs are so awesome dogs are so awesome
from nltk.stem.snowball import SnowballStemmer
from nltk import word_tokenize
ps = SnowballStemmer("english")
words = data.String.apply(lambda x: [ps.stem(y) for y in word_tokenize(x)])
print (words)
0 [i, love, dog, becaus, they, are, cute, and]
1 [my, dog, is, look, at, me, weird, mayb]
2 [i, think, i, look, like, a, cupacak, one]
3 [do, you, want, to, be, a, snowman, no, thank]
4 [hey, do, you, know, what, time, it, is, cook]
5 [dog, are, so, awesom, dog, are, so, awesom]
Name: String, dtype: object
If you need to assign the result back to the same column:
data.String = data.String.apply(lambda x: [ps.stem(y) for y in word_tokenize(x)])
print (data)
String
0 [i, love, dog, becaus, they, are, cute, and]
1 [my, dog, is, look, at, me, weird, mayb]
2 [i, think, i, look, like, a, cupacak, one]
3 [do, you, want, to, be, a, snowman, no, thank]
4 [hey, do, you, know, what, time, it, is, cook]
5 [dog, are, so, awesom, dog, are, so, awesom]
Or to a new column:
data['Stem'] = data.String.apply(lambda x: [ps.stem(y) for y in word_tokenize(x)])
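To see why the comprehension keeps one list per row while the loop in the question collapses everything into ns, here is a minimal sketch that needs neither pandas nor NLTK. The toy_stem function and the rows data are hypothetical stand-ins for ps.stem and the tokenized Series; the toy stems will not match real Snowball output.

```python
def toy_stem(word):
    # Hypothetical stand-in for ps.stem: lowercase, then strip a few suffixes.
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

# Hypothetical sample data: one token list per row.
rows = [["I", "loved", "dogs"], ["my", "dog", "is", "looking"]]

# Flat: every stem from every row lands in one list,
# which is what appending to ns inside both loops does.
flat = [toy_stem(w) for row in rows for w in row]

# Nested: a new list is built for each row, so the row structure survives.
# This mirrors the per-row comprehension inside apply.
nested = [[toy_stem(w) for w in row] for row in rows]

print(flat)
print(nested)
```

The inner comprehension plays the role of the lambda body passed to apply: it runs once per row and returns a whole list for that row.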
Answer 1 (score: 0)
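If you want to keep the two for loops, you need to iterate by index. In Python, when you write for i in lst, the name i is bound to each element of the list in turn. Assigning to i does not change the list itself, so to change a value in the list you need its index.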
Change the loops to use indices instead:
for i in range(len(words)):
    for j in range(len(words[i])):
        words[i][j] = "something new"
This lets you change the value at position [i][j].
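As a rough illustration of the difference, the sketch below uses plain nested lists and enumerate rather than range(len(...)). The toy_stem function is a hypothetical stand-in for ps.stem, and the sample data is made up. The same idea should carry over to a Series of lists, since each element is an ordinary Python list.

```python
def toy_stem(word):
    # Hypothetical stand-in for ps.stem: strips a trailing "s" from longer words.
    return word[:-1] if word.endswith("s") and len(word) > 3 else word

# Hypothetical sample data: one token list per row.
words = [["dogs", "are", "cute"], ["cats", "is", "thanks"]]

# Rebinding the loop variable only changes the local name w,
# so the lists are left exactly as they were.
for row in words:
    for w in row:
        w = toy_stem(w)

unchanged = [list(row) for row in words]  # snapshot after the first loop

# Assigning through the index writes the new value into the list.
for row in words:
    for j, w in enumerate(row):
        row[j] = toy_stem(w)

print(unchanged)
print(words)
```

enumerate yields the index and the element together, which avoids the double lookup words[i][j] while still updating each list in place.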