pyspark: sort error in reduceByKey: TypeError in <lambda>: 'int' object is not callable

Asked: 2016-09-27 16:09:54

Tags: python-2.7 apache-spark pyspark spark-dataframe

I have the following code, in which, for each my_id, I try to sort the amount entries by the timestamp field:

    output_rdd = my_df.rdd.map(lambda r: (r['my_id'], [r['timestamp'], [r['amount']]]))\
                            .reduceByKey(lambda a, b: sorted(a+b, key=(a+b)[0]))\
                            .map(lambda r: r[1])

However, I receive the following error:

    TypeError: 'int' object is not callable

Any idea what I am missing? Thank you very much!

2 Answers:

Answer 0 (score: 1)

key should be a function. Try:

...     .reduceByKey(lambda a, b: sorted(a + b, key=lambda x: x[0])) \
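The fix above can be seen without Spark at all: the key argument to sorted() must be a callable, not a value. A minimal plain-Python sketch (the sample data here is hypothetical, not from the question):

```python
pairs = [(3, 'c'), (1, 'a'), (2, 'b')]

# key=pairs[0] passes the tuple (3, 'c') as the "key function";
# sorted() then tries to call it and raises TypeError, just like
# key=(a+b)[0] passes an int (the timestamp) in the question.
try:
    sorted(pairs, key=pairs[0])
except TypeError as exc:
    print(exc)  # 'tuple' object is not callable

# Correct: pass a function that extracts the sort key from each element.
print(sorted(pairs, key=lambda x: x[0]))  # [(1, 'a'), (2, 'b'), (3, 'c')]
```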

Answer 1 (score: 1)

Note the following from the Python documentation:

    The value of the key parameter should be a function that takes a single argument and returns a key to use for sorting purposes. This technique is fast because the key function is called exactly once for each input record.

Convert the argument you pass to key into a Python function or a lambda, then try again.
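Putting both answers together, the whole pipeline can be sketched locally without a Spark cluster by simulating reduceByKey with functools.reduce over per-key groups. The rows and values below are made up for illustration, assuming columns (my_id, timestamp, amount):

```python
from functools import reduce
from itertools import groupby

# Hypothetical rows standing in for my_df: (my_id, timestamp, amount).
rows = [('u1', 3, 30.0), ('u1', 1, 10.0), ('u2', 2, 20.0), ('u1', 2, 15.0)]

# Emit (key, [(timestamp, amount)]) pairs, mirroring the question's map().
pairs = [(my_id, [(ts, amt)]) for my_id, ts, amt in rows]

# Simulate reduceByKey: merge the lists for each key, keeping them sorted
# by timestamp with a *callable* key, as the answers suggest.
pairs.sort(key=lambda kv: kv[0])
merged = {
    k: reduce(lambda a, b: sorted(a + b, key=lambda x: x[0]),
              (v for _, v in group))
    for k, group in groupby(pairs, key=lambda kv: kv[0])
}
print(merged['u1'])  # [(1, 10.0), (2, 15.0), (3, 30.0)]
```

The same lambda a, b: sorted(a + b, key=lambda x: x[0]) drops straight into the real .reduceByKey(...) call.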