Wednesday, September 14, 2011

Cassandra / Hadoop : No local connection available

If you are trying to run the word_count example in Cassandra Hadoop, and you encounter the following exception:

java.lang.RuntimeException: java.lang.UnsupportedOperationException: no local connection available
at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.initialize(ColumnFamilyRecordReader.java:132)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:418)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:620)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Caused by: java.lang.UnsupportedOperationException: no local connection available
at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.getLocation(ColumnFamilyRecordReader.java:176)
at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.initialize(ColumnFamilyRecordReader.java:113)
... 4 more


Then you have hit a problem with local IP resolution in Java.

Cassandra currently uses the following line to resolve IP addresses;
localAddresses = InetAddress.getAllByName(InetAddress.getLocalHost().getHostAddress());

There are better ways to do this using the NetworkInterface,

But until Cassandra uses that you'll need to make sure that the bit of code above resolves properly by manipulating your /etc/hosts to resolve localhost to match the configuration in Cassandra, which by default is looking for localhost bound to 127.0.0.1.

I submitted a patch for this. If you are having issues, please vote to get the patch accepted.
https://issues.apache.org/jira/browse/CASSANDRA-3211