Hadoop - manually split files in HDFS
I have uploaded a 1 GB file and I want to split it into files of 100 MB each. How can I do that from the command line? I'm looking for a command like:
hadoop fs -split --bytes=100m /user/foo/one_gb_file.csv /user/foo/100_mb_file_1-11.csv
Is there a way to do that in HDFS?
In HDFS, we cannot expect every feature that is available in Unix. The current version of the hadoop fs utility doesn't provide this functionality. Perhaps it will be added in the future; you could file an improvement request in the Apache JIRA to have this feature included in HDFS.
For now, you would have to write your own implementation, for example in Java.
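That said, for a one-off split you can avoid writing Java by streaming the file out of HDFS, splitting it locally with the standard `split` utility, and uploading the pieces back. A sketch, reusing the paths from the question (the `hdfs dfs` lines are shown commented out since they require a running cluster; the local part demonstrates the split step on a small dummy file):

```shell
# On a real cluster, the workaround would look like:
#   hdfs dfs -cat /user/foo/one_gb_file.csv | split -b 100m - part_
#   hdfs dfs -put part_* /user/foo/

# Local demonstration of the split step, scaled down:
head -c 1000000 /dev/zero > one_mb_file.csv    # 1 MB dummy file
split -b 100000 one_mb_file.csv piece_         # split into 100 KB pieces
ls -l piece_*                                  # piece_aa .. piece_aj, 10 files
```

Note that `split -b` cuts at byte boundaries, so for a CSV it will break a row at each boundary; `split -l <lines>` splits on line counts instead, which keeps rows intact at the cost of slightly uneven file sizes.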