Hadoop Hive: How to allow a regular user to continuously write data and create tables in the warehouse directory?
I am running Hadoop 188.8.131.52.0.6.0-101 on a single node. I am trying to run a Java MRD program from Eclipse, as a regular user, that writes data to an existing Hive table. I get this exception:
org.apache.hadoop.security.AccessControlException: Permission denied: user=dev, access=WRITE, inode="/apps/hive/warehouse/testids":hdfs:hdfs:drwxr-xr-x
This happens because a regular user has no write permission on the warehouse directory; only the hdfs user does:
drwxr-xr-x - hdfs hdfs 0 2014-03-06 16:08 /apps/hive/warehouse/testids
drwxr-xr-x - hdfs hdfs 0 2014-03-05 12:07 /apps/hive/warehouse/test
To work around this, I changed the permissions on the warehouse directory so that everybody now has write permission:
[hdfs@localhost wks]$ hadoop fs -chmod -R a+w /apps/hive/warehouse
[hdfs@localhost wks]$ hadoop fs -ls /apps/hive/warehouse
drwxrwxrwx - hdfs hdfs 0 2014-03-06 16:08 /apps/hive/warehouse/testids
drwxrwxrwx - hdfs hdfs 0 2014-03-05 12:07 /apps/hive/warehouse/test
This helps to some extent: the MRD program can now write to the warehouse directory as a regular user, but only once. When it tries to write data into the same table a second time, I get:
ERROR security.UserGroupInformation: PriviledgedActionException as:dev (auth:SIMPLE) cause:org.apache.hcatalog.common.HCatException : 2003 : Non-partitioned table already contains data : default.testids
Now, if I delete the output table and create it anew in the hive shell, it again gets the default permissions, which do not allow a regular user to write data into it:
[hdfs@localhost wks]$ hadoop fs -ls /apps/hive/warehouse
drwxr-xr-x - hdfs hdfs 0 2014-03-11 12:19 /apps/hive/warehouse/testids
drwxrwxrwx - hdfs hdfs 0 2014-03-05 12:07 /apps/hive/warehouse/test
Please advise on the correct Hive configuration steps that will allow a program running as a regular user to do the following in the Hive warehouse:
- Programmatically create / delete / rename Hive tables
- Programmatically read data from and write data to Hive tables
If you maintain the table from outside Hive, then declare the table as external:
An EXTERNAL table points to any HDFS location for its storage, rather than being stored in a folder specified by the configuration property hive.metastore.warehouse.dir.
A Hive administrator can create the table and point it at an HDFS storage location owned by your own user; you then grant Hive permission to read from there.
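To make that concrete, here is a rough sketch of the two steps. The single id column, the tab-delimited row format, the /user/dev/testids directory and the DRY_RUN switch are all assumptions of mine for illustration; the question does not show the real table schema. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: the schema, TABLE_DIR and DRY_RUN are illustrative assumptions.
DRY_RUN="${DRY_RUN:-1}"          # 1 = print commands, 0 = actually run them
TABLE_DIR="/user/dev/testids"    # hypothetical directory owned by user dev

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

# Emit the DDL for an external table stored under TABLE_DIR.
external_table_ddl() {
    cat <<EOF
CREATE EXTERNAL TABLE IF NOT EXISTS testids (id STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '${TABLE_DIR}';
EOF
}

# As user dev: create the directory and make it readable by others.
run hadoop fs -mkdir -p "$TABLE_DIR"
run hadoop fs -chmod -R a+rX "$TABLE_DIR"

# As the Hive administrator: register the table over that directory.
run hive -e "$(external_table_ddl)"
```

Because the data lives in a directory that dev owns, dropping and recreating the table in Hive does not reset the permissions on it, and dropping an external table leaves the files in place.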
As a general comment, there is no legitimate way for an unprivileged user to perform a privileged action they are not authorized for. Any such way is technically an exploit and you should never rely on it: even if it is possible today, it will likely be closed soon. Hive authorization (and HCatalog authorization) is orthogonal to HDFS authorization.
Your application is also incorrect, regardless of the authorization issues. You are trying to write twice into the same non-partitioned table, which HCatalog refuses to do (that is the "Non-partitioned table already contains data" error); it means your application does not handle partitions correctly. Start from An Introduction to Hive’s Partitioning.
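As a rough illustration of what handling partitions could look like, the sketch below declares the table as partitioned and picks a fresh partition value for each run. The id column, the run_id partition key, the timestamp format and the DRY_RUN switch are assumptions for the example, not something taken from the question. By default it only prints the hive command.

```shell
#!/bin/sh
# Sketch only: the schema, the run_id partition key and DRY_RUN are assumptions.
DRY_RUN="${DRY_RUN:-1}"                      # 1 = print commands, 0 = run them
RUN_ID="${1:-$(date +%Y%m%d%H%M%S)}"         # one new partition value per run

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

# Emit the DDL for a table partitioned by run_id.
partitioned_table_ddl() {
    cat <<EOF
CREATE TABLE IF NOT EXISTS testids (id STRING)
PARTITIONED BY (run_id STRING);
EOF
}

# Create the partitioned table once; later runs leave it untouched.
run hive -e "$(partitioned_table_ddl)"

# Hand this value to the job so that it writes into its own partition.
printf '%s\n' "partition for this run: run_id=${RUN_ID}"
```

The job would then name that partition when it configures its HCatalog output, so each run writes into a new partition instead of writing again into a table that already holds data.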
You can set the following in hdfs-site.xml:
<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>
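This setting disables permission checking on HDFS for every user, so a regular user can then perform these operations. The NameNode has to be restarted for it to take effect. On Hadoop 2 the property is named dfs.permissions.enabled; dfs.permissions is the older, deprecated name.
I hope this helps.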