HDFS API - count the number of directories, files and bytes

How do you get the DIR_COUNT, FILE_COUNT, CONTENT_SIZE and FILE_NAME for an HDFS path programmatically in Scala/Java (not through the shell)?

val fileStatus = fileSystem.getFileStatus(new Path(path))
val fileByteSize = fileStatus.getLen

The FileSystem API doesn't seem to expose that information. I can only get the size of a single file (code above), not the file count and byte size per directory.

I'm looking for behavior similar to:

hdfs dfs -count [-q] <paths>

which counts the number of directories, files and bytes under the given path.

Answers


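You can use the FileSystem.listStatus method to get information about the files and directories in a given HDFS directory.

It returns an array of FileStatus objects, one per immediate child of the directory. From these you can calculate the total size, the file count, and so on. Because listStatus is not recursive, you need to descend into each subdirectory yourself to get totals for the whole tree.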
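The approach above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the `Counts` case class and the `HdfsCount` object are hypothetical names introduced for illustration, and the directory count is assumed to include the starting directory itself. For comparison, the sketch also reads the same three totals from `FileSystem.getContentSummary`, which returns a `ContentSummary` holding directory count, file count and length for a path.

```scala
import org.apache.hadoop.fs.{FileStatus, FileSystem, Path}

// Hypothetical container for the three totals; not part of the Hadoop API.
final case class Counts(dirs: Long, files: Long, bytes: Long) {
  def +(other: Counts): Counts =
    Counts(dirs + other.dirs, files + other.files, bytes + other.bytes)
}

object HdfsCount {
  // Walks the tree manually: one listStatus call per directory.
  def countByListing(fs: FileSystem, path: Path): Counts =
    walk(fs, fs.getFileStatus(path))

  private def walk(fs: FileSystem, status: FileStatus): Counts =
    if (status.isDirectory) {
      // Start at one directory (this one), then add every child's totals.
      fs.listStatus(status.getPath)
        .foldLeft(Counts(1L, 0L, 0L))((acc, child) => acc + walk(fs, child))
    } else {
      // Anything that is not a directory is counted as a single file.
      Counts(0L, 1L, status.getLen)
    }

  // Reads the same totals from the summary the FileSystem API provides.
  def countBySummary(fs: FileSystem, path: Path): Counts = {
    val summary = fs.getContentSummary(path)
    Counts(summary.getDirectoryCount, summary.getFileCount, summary.getLength)
  }
}
```

With the `fileSystem` and `path` values from the question, `HdfsCount.countBySummary(fileSystem, new Path(path))` would give the totals in a single call, while `countByListing` is the place to add per-directory breakdowns or filtering if you need them.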

