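I have the code below and am trying to use pandasql to run a SQL query through sqldf. The query does some division and aggregation. The same query runs fine in R with sqldf. I'm new to pandasql and I get the error below. Can anyone see my problem and suggest how to fix it? I've also provided some sample data.
Code: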
import pandasql
from pandasql import sqldf
pysqldf = lambda q: sqldf(q, globals())
ExampleDf=pysqldf("select sum(lastSaleAmount-priorSaleAmount)/sum(squareFootage) as AvgPric
,zipcode
from data
where priorSaleDate between '2010-01-01' and '2011-01-01'
group by zipcode
order by
sum(lastSaleAmount-priorSaleAmount)/sum(squareFootage) desc")
Error:
File "<ipython-input-100-679165684772>", line 1
ExampleDf=pysqldf("select sum(lastSaleAmount-priorSaleAmount)/sum(squareFootage) as AvgPric
^
SyntaxError: EOL while scanning string literal
Sample data:
print(data.iloc[:50])
id address city state zipcode latitude \
0 39525749 8171 E 84th Ave Denver CO 80022 39.849160
1 184578398 10556 Wheeling St Denver CO 80022 39.888020
2 184430015 3190 Wadsworth Blvd Denver CO 80033 39.761710
3 155129946 3040 Wadsworth Blvd Denver CO 80033 39.760780
4 245107 5615 S Eaton St Denver CO 80123 39.616181
5 3523925 6535 W Sumac Ave Denver CO 80123 39.615136
6 30560679 6673 W Berry Ave Denver CO 80123 39.616350
7 39623928 5640 S Otis St Denver CO 80123 39.615213
8 148975825 5342 S Gray St Denver CO 80123 39.620158
9 184623176 4967 S Wadsworth Blvd Denver CO 80123 39.626770
10 39811456 6700 W Dorado Dr # 11 Denver CO 80123 39.614540
11 39591617 4956 S Perry St Denver CO 80123 39.628740
12 39577604 4776 S Gar Way Denver CO 80123 39.630547
13 153665665 8890 W Tanforan Dr Denver CO 80123 39.630738
14 39868673 5538 W Prentice Cir Denver CO 80123 39.620625
15 184328555 4254 W Monmouth Ave Denver CO 80123 39.629000
16 30554949 6600 W Berry Ave Denver CO 80123 39.616165
17 24157982 6560 W Sumac Ave Denver CO 80123 39.614712
18 51335315 5655 S Fenton St Denver CO 80123 39.615488
19 152799217 5626 S Fenton St Denver CO 80123 39.616153
20 51330641 5599 S Fenton St Denver CO 80123 39.616514
21 15598828 6595 W Sumac Ave Denver CO 80123 39.615144
22 49360310 6420 W Sumac Ave Denver CO 80123 39.614531
23 39777745 4962 S Field Ct Denver CO 80123 39.625819
24 18021201 9664 W Grand Ave Denver CO 80123 39.625826
25 39776096 4881 S Jellison St Denver CO 80123 39.628401
26 29850085 5012 S Field Ct Denver CO 80123 39.625537
27 51597934 4982 S Field Ct Denver CO 80123 39.625757
28 39563379 4643 S Hoyt St Denver CO 80123 39.632457
29 18922140 5965 W Sumac Ave Denver CO 80123 39.615199
30 39914328 9740 W Chenango Ave Denver CO 80123 39.627226
31 51323181 5520 W Prentice Cir Denver CO 80123 39.620548
32 3493378 4665 S Garland Way Denver CO 80123 39.632063
33 4115341 5466 W Prentice Cir Denver CO 80123 39.619027
34 39639069 5735 W Berry Ave Denver CO 80123 39.617727
35 184333944 9015 W Tanforan Dr Denver CO 80123 39.631178
36 18197471 4977 S Garland St Denver CO 80123 39.626080
37 49430482 9540 W Bellwood Pl Denver CO 80123 39.624558
38 39868648 5535 S Fenton St Denver CO 80123 39.617145
39 143684222 3761 W Wagon Trail Dr Denver CO 80123 39.631251
40 152898579 4850 S Yukon St Denver CO 80123 39.629025
41 43174426 4951 S Ammons St Denver CO 80123 39.626582
42 39615194 7400 W Grant Ranch Blvd # 31 Denver CO 80123 39.618440
43 184340029 7400 W Grant Ranch Blvd # 7 Denver CO 80123 39.618440
44 3523919 5425 S Gray St Denver CO 80123 39.618265
45 151444231 6610 W Berry Ave Denver CO 80123 39.616148
46 19150871 4756 S Perry St Denver CO 80123 39.630389
47 39545155 4328 W Bellewood Dr Denver CO 80123 39.627883
48 3523923 6585 W Sumac Ave Denver CO 80123 39.615145
49 51337334 5737 W Alamo Dr Denver CO 80123 39.615881
longitude bedrooms bathrooms rooms squareFootage lotSize yearBuilt \
0 -104.893468 3 2.0 6 1378 9968 2003.0
1 -104.830930 2 2.0 6 1653 6970 2004.0
2 -105.081070 3 1.0 0 1882 23875 1917.0
3 -105.081060 4 3.0 0 2400 11500 1956.0
4 -105.058812 3 4.0 8 2305 5600 1998.0
5 -105.069018 3 5.0 7 2051 6045 1996.0
6 -105.070760 4 4.0 8 2051 6315 1997.0
7 -105.070617 3 3.0 7 2051 8133 1997.0
8 -105.063094 3 3.0 7 1796 5038 1999.0
9 -105.081990 3 3.0 0 2054 4050 2007.0
10 -105.071350 3 4.0 7 2568 6397 2000.0
11 -105.040126 3 2.0 6 1290 9000 1962.0
12 -105.100242 3 4.0 6 1804 6952 1983.0
13 -105.097718 3 3.0 6 1804 7439 1983.0
14 -105.059503 4 5.0 8 3855 9656 1998.0
15 -105.042330 2 2.0 4 1297 16600 1962.0
16 -105.069424 4 4.0 9 2321 5961 1996.0
17 -105.069264 4 4.0 8 2321 6337 1997.0
18 -105.060173 3 3.0 7 2321 6151 1998.0
19 -105.059696 3 3.0 7 2071 6831 1999.0
20 -105.060193 3 3.0 7 2071 6050 1998.0
21 -105.069803 3 3.0 7 2074 6022 1996.0
22 -105.067815 4 4.0 9 2588 6432 1996.0
23 -105.099825 3 2.0 7 1567 6914 1980.0
24 -105.106423 3 2.0 5 1317 9580 1983.0
25 -105.108440 3 3.0 5 1317 6718 1982.0
26 -105.099012 2 2.0 6 808 8568 1980.0
27 -105.099484 2 1.0 6 808 6858 1980.0
28 -105.104752 3 2.0 6 1321 6000 1978.0
29 -105.062378 3 4.0 8 2350 6839 1997.0
30 -105.107806 2 2.0 5 1586 6510 1982.0
31 -105.058600 2 4.0 6 2613 8250 1998.0
32 -105.101493 3 2.0 8 1590 7044 1977.0
33 -105.057427 3 5.0 7 2614 9350 1999.0
34 -105.059123 3 4.0 7 2107 6491 1998.0
35 -105.099179 2 1.0 5 1340 6741 1982.0
36 -105.103470 3 2.0 6 1085 6120 1985.0
37 -105.104316 3 1.0 6 1085 13500 1981.0
38 -105.060195 4 3.0 8 2365 6050 1998.0
39 -105.036567 3 2.0 5 1344 9240 1959.0
40 -105.081998 2 3.0 5 1601 6660 1986.0
41 -105.087250 3 2.0 8 1858 6890 1986.0
42 -105.079900 2 2.0 5 1603 5742 1997.0
43 -105.079900 2 2.0 5 1603 6168 1997.0
44 -105.061397 3 3.0 7 1860 6838 1998.0
45 -105.069618 3 4.0 8 2376 5760 1996.0
46 -105.038707 3 2.0 5 1355 9600 1960.0
47 -105.042611 2 2.0 6 1867 11000 1973.0
48 -105.069604 3 3.0 7 2382 5830 1996.0
49 -105.059085 3 3.0 6 1872 5500 1999.0
lastSaleDate lastSaleAmount priorSaleDate priorSaleAmount \
0 2009-12-17 75000 2004-05-13 165700.0
1 2004-09-23 216935 NaN NaN
2 2008-04-03 330000 NaN NaN
3 2008-12-02 185000 2008-06-27 0.0
4 2012-07-18 308000 2011-12-29 0.0
5 2006-09-12 363500 2005-05-16 339000.0
6 2014-12-15 420000 2006-07-07 345000.0
7 2004-03-15 328700 1998-04-09 225200.0
8 2011-08-16 274900 2011-01-10 0.0
9 2015-12-01 407000 2012-10-30 312000.0
10 2014-11-12 638000 2005-03-22 530000.0
11 2004-02-02 235000 2000-10-12 171000.0
12 2004-07-19 247000 1999-06-07 187900.0
13 2013-08-14 249700 2000-09-07 217900.0
14 2004-08-17 580000 1999-01-11 574000.0
15 2011-11-07 150000 NaN NaN
16 2006-01-18 402800 2004-08-16 335000.0
17 2013-12-31 422000 2012-11-05 399000.0
18 1999-12-02 277900 NaN NaN
19 2000-02-04 271800 NaN NaN
20 1999-10-20 274400 NaN NaN
21 2007-11-30 314500 NaN NaN
22 2001-12-31 342500 NaN NaN
23 2016-12-02 328000 2016-08-02 231200.0
24 2017-06-21 376000 2008-02-29 244000.0
25 2004-08-31 225000 NaN NaN
26 2016-09-06 310000 2015-09-15 258900.0
27 1999-12-06 128000 NaN NaN
28 2004-04-28 197000 NaN NaN
29 2011-08-11 365000 2004-08-04 365000.0
30 2015-07-08 302000 2004-07-15 210000.0
31 2000-02-10 425000 1999-04-08 396500.0
32 2016-02-26 275000 2004-12-03 204000.0
33 2005-08-29 580000 1999-09-10 398200.0
34 2004-06-30 355000 2001-02-22 320000.0
35 2015-05-26 90000 1983-06-01 80000.0
36 2017-06-08 312500 2017-05-12 258000.0
37 2001-04-27 184000 1999-11-10 164900.0
38 2004-02-08 335000 2001-05-08 339950.0
39 2016-10-17 290000 NaN 70200.0
40 2010-09-02 260000 1998-04-14 189900.0
41 2012-07-30 231600 2012-03-30 0.0
42 2013-10-24 400000 2004-08-04 388400.0
43 2004-11-19 350000 1998-10-05 292400.0
44 2005-06-23 295000 2004-07-26 300000.0
45 2009-06-24 404500 2000-05-04 304900.0
46 1999-12-14 153500 1999-12-14 153500.0
47 2004-05-25 208000 NaN NaN
48 2016-10-20 502000 2005-05-31 357000.0
49 2013-04-05 369000 2000-08-07 253000.0
estimated_value
0 239753
1 343963
2 488840
3 494073
4 513676
5 496062
6 514953
7 494321
8 496079
9 424514
10 721350
11 331915
12 389415
13 386694
14 784587
15 354031
16 515537
17 544960
18 504791
19 495121
20 495894
21 496281
22 528343
23 349041
24 367754
25 356934
26 346001
27 342927
28 337969
29 500105
30 353827
31 693035
32 350857
33 716655
34 493156
35 349355
36 348079
37 343957
38 504705
39 311996
40 391469
41 418814
42 502894
43 478049
44 475615
45 521467
46 366187
47 386913
48 527104
49 497239
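Answer 0 (score: 2)
A Python string delimited by a single pair of double quotes must end on the line where it starts, which is why the interpreter reports "EOL while scanning string literal" at the end of the first line. Switch to triple quotes so the query is read as a multi-line string: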
ExampleDf=pysqldf("""select sum(lastSaleAmount-priorSaleAmount)/sum(squareFootage) as AvgPric
,zipcode
from data
where priorSaleDate between '2010-01-01' and '2011-01-01'
group by zipcode
order by
sum(lastSaleAmount-priorSaleAmount)/sum(squareFootage) desc""")
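
As a self-contained check of the fix, here is a minimal sketch that builds the same query two ways: as a triple-quoted string, and as adjacent string literals joined inside parentheses. pandasql runs its queries through SQLite by default, so the sketch uses the standard-library sqlite3 module in place of pandasql to avoid extra dependencies. The rows in TOY_ROWS and the helper run_query are made up for illustration and are not taken from the sample data above.

```python
import sqlite3

# Option 1: triple quotes let the literal span several lines.
QUERY_TRIPLE = """select sum(lastSaleAmount-priorSaleAmount)/sum(squareFootage) as AvgPric
,zipcode
from data
where priorSaleDate between '2010-01-01' and '2011-01-01'
group by zipcode
order by
sum(lastSaleAmount-priorSaleAmount)/sum(squareFootage) desc"""

# Option 2: adjacent literals inside parentheses are concatenated by Python.
# Each piece ends with a space so the SQL keywords do not run together.
QUERY_JOINED = (
    "select sum(lastSaleAmount-priorSaleAmount)/sum(squareFootage) as AvgPric, zipcode "
    "from data "
    "where priorSaleDate between '2010-01-01' and '2011-01-01' "
    "group by zipcode "
    "order by sum(lastSaleAmount-priorSaleAmount)/sum(squareFootage) desc"
)

# Hypothetical rows (NOT from the question's data):
# (zipcode, squareFootage, lastSaleAmount, priorSaleAmount, priorSaleDate)
TOY_ROWS = [
    (80123, 2000, 300000, 200000.0, "2010-03-01"),
    (80123, 2000, 350000, 250000.0, "2010-06-15"),
    (80022, 1000, 150000, 130000.0, "2010-09-30"),
    (80033, 1500, 400000, 100000.0, "2012-01-01"),  # outside the date range
]


def run_query(query):
    """Load the toy rows into an in-memory SQLite table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table data (zipcode integer, squareFootage integer, "
        "lastSaleAmount integer, priorSaleAmount real, priorSaleDate text)"
    )
    conn.executemany("insert into data values (?, ?, ?, ?, ?)", TOY_ROWS)
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


print(run_query(QUERY_TRIPLE))  # [(50.0, 80123), (20.0, 80022)]
print(run_query(QUERY_JOINED))  # same rows as above
```

With pandasql itself, either string can be passed straight to pysqldf. The triple-quoted form is usually easier to read and edit for longer queries, while the parenthesised form avoids embedding newlines in the SQL text.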