The nltk module is running with other libraries in the corpus folder.
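I have already tried putting 'import nltk' first, but the result is the same, and I also tried 'from nltk.tokenize import PunktSentenceTokenizer'. I don't know why the Python shell cannot find the definition of nltk. How can I fix this? I am still learning how to write Python code.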
Answer 0 (score: 0)
You need to
import nltk after installing the package.
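Before importing, it can help to confirm that the interpreter you are running actually sees the package. Below is a minimal sketch of such a check; the helper name `is_installed` is made up for illustration and is not part of nltk. If it reports False, the package is probably not installed for that interpreter (installation is typically done with `pip install nltk`).

```python
import importlib.util
import sys


def is_installed(module_name: str) -> bool:
    """Return True if the current interpreter can locate module_name.

    Hypothetical helper for illustration; it only looks the module up
    and does not import it.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Show which interpreter is running, since packages are installed per interpreter.
    print("Interpreter:", sys.executable)
    print("nltk installed:", is_installed("nltk"))
```

If the interpreter path printed here differs from the one pip installed into, that mismatch is a likely cause of the import failing.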
Answer 1 (score: 0)
You misspelled the package name in your file, using
ntlk instead of nltk.
Change
tagged = ntlk.pos_tag(words)
to
tagged = nltk.pos_tag(words)
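To see why a one-letter typo produces "name is not defined" even though the import succeeded, here is a small sketch that reproduces the same situation with the standard library. It assumes `math` as a stand-in for nltk so it runs without any extra packages; `mtah` is a deliberate typo, and the function names are made up for illustration.

```python
import math  # stand-in for `import nltk`; any imported module behaves the same way


def call_with_typo():
    # `mtah` is a deliberate typo for `math`, mirroring `ntlk` vs `nltk`.
    # The name is only looked up when this line runs, so the import above does not help.
    return mtah.sqrt(16)


def call_with_correct_name():
    # Same call with the module name spelled as it was imported.
    return math.sqrt(16)


def describe_failure(func):
    """Run func and return the NameError message, or None if it succeeded."""
    try:
        func()
    except NameError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print("typo:", describe_failure(call_with_typo))
    print("correct:", describe_failure(call_with_correct_name))
```

The import binds only the exact name `math` (or `nltk` in your file), so any other spelling is an undefined name as far as Python is concerned.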