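I want to watch a specific directory in HDFS using Python (everything I have found so far is in Java). When a file is uploaded or moved into the watched directory, I want my Python script to perform various actions on it and then delete all the files in that directory. The problem is that I can't find a way to monitor a directory in HDFS from Python.
I tried using watchdog, as you can see in the code below, but it cannot resolve the path. The path exists and the directory exists too, but I get the error: path is not a directory.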
import time

from hdfs import InsecureClient
from watchdog.observers import Observer
# from watchdog.observers.polling import PollingObserver
from watchdog.events import PatternMatchingEventHandler

# Connect to HDFS and resolve the watched directory to an absolute HDFS path.
client_hdfs = InsecureClient('', user='')
TestWatcher = client_hdfs.resolve("")
print(TestWatcher)

if __name__ == "__main__":
    # patterns = ["*"]
    patterns = ["*.pickle"]  # a list of glob patterns, not a bare string
    ignore_patterns = None
    ignore_directories = False
    case_sensitive = True
    # case_sensitive = False
    my_event_handler = PatternMatchingEventHandler(
        patterns=patterns,
        ignore_patterns=ignore_patterns,
        ignore_directories=ignore_directories,
        case_sensitive=case_sensitive,
    )

    def on_created(event):
        print(f"hey, {event.src_path} has been created!")

    def on_deleted(event):
        print(f"what the f**k! Someone deleted {event.src_path}!")

    def on_modified(event):
        print(f"hey buddy, {event.src_path} has been modified")

    def on_moved(event):
        print(f"ok ok ok, someone moved {event.src_path} to {event.dest_path}")

    # Attach the callbacks to the handler.
    my_event_handler.on_created = on_created
    my_event_handler.on_deleted = on_deleted
    my_event_handler.on_modified = on_modified
    my_event_handler.on_moved = on_moved

    path = TestWatcher
    go_recursively = False
    my_observer = Observer()
    # my_observer = PollingObserver()
    my_observer.schedule(my_event_handler, path, recursive=go_recursively)
    my_observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        my_observer.stop()
    my_observer.join()
Error: OSError: path is not a directory
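
A likely cause of the error: watchdog watches paths on the local filesystem through OS-level notifications, while `client_hdfs.resolve` returns a path string inside HDFS, which does not exist as a local directory. One possible workaround, offered only as a rough sketch, is to poll the HDFS directory listing and compare it with the previous listing. The names `diff_listing`, `poll_directory`, `interval`, and `max_polls` below are hypothetical helpers introduced for illustration. The sketch assumes only that `client` offers `list(path)` and `delete(path)`, as the HdfsCLI client used above does.

```python
import time


def diff_listing(previous, current):
    """Return (added, removed) file names between two directory listings."""
    previous, current = set(previous), set(current)
    return sorted(current - previous), sorted(previous - current)


def poll_directory(client, hdfs_dir, on_created, interval=5.0, max_polls=None):
    """Poll hdfs_dir and call on_created(path) for each newly listed file.

    `client` is assumed to expose list(path) and delete(path).
    `max_polls` is a hypothetical knob so the loop can stop (handy for
    testing); None means poll forever.
    """
    seen = set(client.list(hdfs_dir))  # files already present are ignored
    polls = 0
    while max_polls is None or polls < max_polls:
        current = client.list(hdfs_dir)
        added, _removed = diff_listing(seen, current)
        for name in added:
            path = f"{hdfs_dir.rstrip('/')}/{name}"
            on_created(path)     # run the custom action on the new file
            client.delete(path)  # then clean it up
        # Processed files were deleted, so they are not remembered.
        seen = set(current) - set(added)
        polls += 1
        time.sleep(interval)
```

To try this against a real cluster, the `client_hdfs` object from the code above could be passed as `client`, with the watched HDFS directory as `hdfs_dir`. Polling only notices changes once per interval, and a file may show up in the listing before its upload has finished, so a real script may need to wait until the file size stops changing before processing it.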