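As I understand it, every file in HDFS is replicated. Our job writes some log files, and we don't want those files replicated, since keeping the extra copies would be unnecessary. Is that possible, i.e. to skip replication for the log files only?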
Answer 0 (score: 2)
You can set the replication factor with the -setrep option of the hadoop fs shell command.
Usage: hadoop fs -setrep [-R] [-w] <numReplicas> <path>
Changes the replication factor of a file. If path is a directory then the command recursively changes the replication factor of all files under the directory tree rooted at path.
Options:
The -w flag requests that the command wait for the replication to complete. This can potentially take a very long time.
The -R flag is accepted for backwards compatibility. It has no effect.
Example:
hadoop fs -setrep -w 3 /user/hadoop/dir1
To avoid replication, set numReplicas to 1, so that HDFS keeps only a single copy of each file.
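For the log-file case in the question, a minimal sketch could look like the following. The function name `set_log_replication`, the `HADOOP_CMD` override, and the path `/user/hadoop/logs` are assumptions made for illustration; only the `hadoop fs -setrep -w <numReplicas> <path>` form comes from the usage text above.

```shell
# Sketch: drop the replication factor to 1 for everything under a log directory.
# HADOOP_CMD is a hypothetical override (not a Hadoop setting) that allows a
# dry run, e.g. HADOOP_CMD=echo prints the arguments instead of calling hadoop.
set_log_replication() {
    log_dir="$1"
    # -w waits for the change to complete; 1 means a single copy per file
    ${HADOOP_CMD:-hadoop} fs -setrep -w 1 "$log_dir"
}

# Example call (hypothetical path):
#   set_log_replication /user/hadoop/logs
```

Since -setrep applies to files that already exist under the path, something like this would probably need to be rerun after the job writes new log files.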