Pointing a file to the hadoop cluster

I have a file stored in a server. I want the file to be pointed on the Hadoop cluster upon running spark. What I have is that I can point the spark context to the hadoop cluster but the data cannot be accessed in Spark now that it is pointing to the cluster. I have the data stored locally so in order for me to access the data, I have to point it locally. However, this causes a lot of memory error. What I hope to do is to point Spark on the cluster but at the same time accessed my data stored locally. Please provide me some ways how I can do this.

Answers


Spark (on Hadoop) cannot read a file stored locally. Remember spark is a distributed system running on multiple machines, thus it cannot read data on one of the nodes (other than localhost) directly.

You should put the file on HDFS and have spark read it from there.

To access it locally you should use hadoop fs -get <hdfs filepath> or hadoop fs -cat <hdfs filepath> command.


Need Your Help

Alphabetically Sort Dropdown values in php/html forms?

javascript php jquery html

Please Tell me how to Alphabetically Sort view of Dropdown Values in html/php forms.