Hadoop - manually split files in HDFS

I have a 1 GB file in HDFS and I want to split it into files of 100 MB each. How can I do that from the command line? I'm looking for a command like:

hadoop fs -split --bytes=100m /user/foo/one_gb_file.csv /user/foo/100_mb_file_1-11.csv

Is there a way to do that in HDFS?

Answers


HDFS doesn't offer every feature that is available in Unix. The current version of the hadoop fs utility doesn't provide this functionality; maybe it will in the future. You could file an improvement request in the Apache JIRA to have this feature added to HDFS.

For now, you'll have to write your own implementation in Java.
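
Below is a minimal sketch of such an implementation using Hadoop's FileSystem API. The paths and the HdfsSplit class name are placeholders taken from the question, so adjust them for your cluster. Like split --bytes, it cuts at raw byte offsets, which means CSV rows can end up split across two part files; if the parts must break on line boundaries, you would need to read the input line by line instead.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsSplit {
    public static void main(String[] args) throws IOException {
        // Placeholder paths taken from the question -- adjust as needed.
        Path source = new Path("/user/foo/one_gb_file.csv");
        String targetPrefix = "/user/foo/100_mb_file_";

        long chunkSize = 100L * 1024 * 1024;   // 100 MB per output part
        byte[] buffer = new byte[64 * 1024];   // copy buffer

        FileSystem fs = FileSystem.get(new Configuration());

        try (FSDataInputStream in = fs.open(source)) {
            boolean eof = false;
            for (int part = 1; !eof; part++) {
                Path target = new Path(targetPrefix + part + ".csv");
                long written = 0;
                try (FSDataOutputStream out = fs.create(target)) {
                    // Copy until this part reaches chunkSize or the input ends.
                    while (written < chunkSize) {
                        int toRead = (int) Math.min(buffer.length, chunkSize - written);
                        int read = in.read(buffer, 0, toRead);
                        if (read == -1) {
                            eof = true;
                            break;
                        }
                        out.write(buffer, 0, read);
                        written += read;
                    }
                }
                // Drop an empty trailing part when the file size is an exact
                // multiple of chunkSize.
                if (eof && written == 0) {
                    fs.delete(target, false);
                }
            }
        }
    }
}

You would compile this against the Hadoop client libraries and run it with something like hadoop jar split.jar HdfsSplit. A simpler, if slower, workaround is to copy the file to the local filesystem, split it there with the Unix split command, and put the pieces back into HDFS with hadoop fs -put.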

