Accumulo-Pig error - Connector info for AccumuloInputFormat can only be set once per job
Versions: Accumulo 1.5, Pig 0.10
Attempted: Reading data from and writing data to Accumulo from Pig, using accumulo-pig. I hit the error below, and any insight into getting past it is greatly appreciated. Switching to Accumulo 1.4 is not an option, as we are using the Accumulo Thrift Proxy in our C# codebase.
Impact: This is currently a roadblock in our project.
Source reference: https://git-wip-us.apache.org/repos/asf/accumulo-pig.git
Error: When attempting to read a dataset in Accumulo from Pig, I get the following error:
org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Connector info for AccumuloInputFormat can only be set once per job
DATA = LOAD 'accumulo://departments?instance=indra&user=root&password=xxxxxxx&zookeepers=cdh-dn01:2181' using org.apache.accumulo.pig.AccumuloStorage() AS (row, cf, cq, cv, ts, val);
dump DATA;
Try using the ACCUMULO-1783-1.5 branch from the same repository. The way that Pig sets up the InputFormat doesn't play nicely with how Accumulo sets up InputFormats (notably, Accumulo asserts that you never call the same static setter method more than once for a given Configuration).
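To make the clash concrete, here is a minimal, self-contained Java sketch of a "set once per Configuration" check and of a guard that tolerates repeated setup calls. It is only an illustration: the class name, the key names, and the use of a plain Map in place of a Hadoop Configuration are all assumptions, and it is not the Accumulo code, nor necessarily how the branch works around the problem.

```java
import java.util.HashMap;
import java.util.Map;

public class SetOnceSketch {
    // Hypothetical key recording that connector info was already stored.
    static final String FLAG = "sketch.connectorInfo.isSet";

    // Mimics a setter that refuses to run twice against the same configuration.
    static void setConnectorInfo(Map<String, String> conf, String user, String password) {
        if (Boolean.parseBoolean(conf.get(FLAG))) {
            throw new IllegalStateException("Connector info can only be set once per job");
        }
        conf.put(FLAG, "true");
        conf.put("sketch.user", user);         // hypothetical key
        conf.put("sketch.password", password); // hypothetical key
    }

    // Guard for callers whose setup hook may run more than once:
    // only calls the strict setter if the flag is not present yet.
    static boolean setConnectorInfoIfAbsent(Map<String, String> conf, String user, String password) {
        if (Boolean.parseBoolean(conf.get(FLAG))) {
            return false; // already configured, nothing to do
        }
        setConnectorInfo(conf, user, password);
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();

        // The first setup call stores the connector info.
        System.out.println(setConnectorInfoIfAbsent(conf, "root", "secret"));

        // A repeated setup call is skipped instead of failing.
        System.out.println(setConnectorInfoIfAbsent(conf, "root", "secret"));

        // Calling the strict setter directly a second time fails.
        try {
            setConnectorInfo(conf, "root", "secret");
        } catch (IllegalStateException e) {
            System.out.println("second direct call rejected: " + e.getMessage());
        }
    }
}
```

If Pig invokes the loader's setup hook several times against the same job configuration, the unguarded setter would fail on the second invocation, which would match the error in the question.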
I have been using Pig 0.12. I doubt there's a difference in how 0.10 sets up the InputFormats compared with 0.12, but I'm not positive, so YMMV.
I just pushed a fix to the above branch that gets rid of the previously mentioned limitation on Hadoop version.