Is there a tool in the Hadoop ecosystem that can detect when new data has been added to HDFS?
Specifically, I want to remotely execute a Sqoop import job from an external database (no merge, only a new table). Then, once that data has been written to HDFS, a Spark script would be triggered to process the newly added data and do some further work.
Is there any feature in the Hadoop ecosystem that handles this kind of job?
I could simply run the Spark script right after the Sqoop import job finishes, but I would like to know whether such a feature exists; I haven't found one yet.
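For reference, this is roughly what I mean by running the Spark script right after the Sqoop import (a minimal sketch; the connection string, table name, HDFS directory and Spark script name are made-up placeholders, and it assumes sqoop and spark-submit are on the PATH):

```python
# Minimal sketch of chaining the two steps: run the Sqoop import, and only if
# it succeeds, submit the Spark job on the imported data.
import subprocess
import sys

SQOOP_CMD = [
    "sqoop", "import",
    "--connect", "jdbc:mysql://dbhost/mydb",   # hypothetical source database
    "--table", "new_table",                    # hypothetical table to import
    "--target-dir", "/data/new_table",         # HDFS directory for the import
]

SPARK_CMD = [
    "spark-submit",
    "process_new_data.py",                     # hypothetical processing script
    "/data/new_table",                         # tell Spark where the new data is
]

# Run the import; stop if it fails so Spark never sees partial data.
if subprocess.run(SQOOP_CMD).returncode != 0:
    sys.exit("sqoop import failed, skipping the Spark job")

# The import finished, so the new data is in HDFS; process it now.
subprocess.run(SPARK_CMD, check=True)
```

This works, but it is just sequential chaining from the client side, which is why I am asking whether something in the ecosystem can react to the new data landing in HDFS instead.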
Thanks in advance.