Dataflow job seems to be stuck

Time: 2018-12-28 07:51:25

Tags: python google-cloud-dataflow apache-beam

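The question has changed somewhat since I first posted it. The main problem is that my code needs the Oracle libraries, but I cannot get the setup file to run the custom commands that set up the Oracle client on the workers.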

from __future__ import absolute_import
from __future__ import print_function

import subprocess
from distutils.command.build import build as _build

import setuptools

# This class handles the pip install mechanism.
class build(_build):  # pylint: disable=invalid-name
    sub_commands = _build.sub_commands + [('CustomCommands', None)]


CUSTOM_COMMANDS = [
    ['sudo', 'apt-get', 'update'],
    ['sudo', 'apt-get', '--assume-yes', 'install', 'unzip'],
    ['wget', 'https://storage.googleapis.com/facbeambucketv1/files/instantclient-basic-linux.x64-18.3.0.0.0dbru.zip'],
    ['sudo', 'unzip', '-o', 'instantclient-basic-linux.x64-18.3.0.0.0dbru.zip', '-d', 'orclbm'],
    ['sudo', 'apt-get', '--assume-yes', 'install', 'libaio1'],
    ['sudo', 'apt-get', '--assume-yes', 'install', 'tree'],
]


class CustomCommands(setuptools.Command):
    """A setuptools Command class able to run arbitrary commands."""

    def initialize_options(self):
        pass

    def finalize_options(self):
        pass

    def RunCustomCommand(self, command_list):
        print('Running command: %s' % command_list)
        p = subprocess.Popen(
            command_list,
            stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
        # Can use communicate(input='y\n'.encode()) if the command run requires
        # some confirmation.
        stdout_data, _ = p.communicate()
        print('Command output: %s' % stdout_data)
        if p.returncode != 0:
            raise RuntimeError('Command %s failed: exit code: %s' % (command_list, p.returncode))

    def run(self):
        for command in CUSTOM_COMMANDS:
            self.RunCustomCommand(command)


# Configure the required packages and scripts to install.
# Note that the Python Dataflow containers come with numpy already installed
# so this dependency will not trigger anything to be installed unless a version
# restriction is specified.
REQUIRED_PACKAGES = ['numpy','apache_beam','apache_beam[gcp]','cx_Oracle','datetime','google-cloud-bigquery']


setuptools.setup(
    name='orclbm',
    version='0.0.1',
    description='Oraclebm workflow package.',
    install_requires=REQUIRED_PACKAGES,
    packages=setuptools.find_packages(),
    include_package_data=True,
    cmdclass={
        # Command class instantiated and run during pip install scenarios.
        'build': build,
        'CustomCommands': CustomCommands,
        }
    )

However, the custom commands are not run, even though the required packages do get installed. I am not sure what the problem is.
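One way to narrow this down is to make each command leave a clearer trace. The sketch below is a standalone variant of RunCustomCommand that reports through the standard logging module instead of print and returns the decoded output. The function name run_custom_command is hypothetical, and whether these messages show up in the Dataflow worker logs is an assumption rather than something confirmed here. It is a debugging aid, not a fix for the commands being skipped.

```python
import logging
import subprocess
import sys


def run_custom_command(command_list):
    """Run one command, log its combined output, and raise on failure.

    Hypothetical helper for debugging: same behaviour as RunCustomCommand
    in the setup file, but it logs instead of printing and returns the
    decoded output so the caller can inspect it.
    """
    logging.info('Running command: %s', command_list)
    p = subprocess.Popen(
        command_list,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT)
    # stderr is merged into stdout above, so one stream carries everything.
    stdout_data, _ = p.communicate()
    output = stdout_data.decode('utf-8', 'replace')
    logging.info('Command output: %s', output)
    if p.returncode != 0:
        raise RuntimeError(
            'Command %s failed: exit code: %s' % (command_list, p.returncode))
    return output
```

It can be tried locally with a harmless command before it goes anywhere near a worker, for example run_custom_command([sys.executable, '-c', 'print("ok")']). If the log lines never appear for the real job, the build step is probably not being invoked at all, as opposed to one of the commands failing.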

0 answers:

No answers