Brian ONeill's Random Thoughts: October 2011

Monday, October 24, 2011

Virgil gets a command-line interface (virgil-cli)

Tonight I bundled the Cassandra command-line interface (CLI) into Virgil. Since the CLI uses the Thrift-based CassandraDaemon, the main method now starts a Thrift server alongside the REST server.

Now, when you (or your application) issue commands through the REST interface, you can verify that they worked through the command-line interface. For more information, check out the wiki.

Specifically, if you run the curl commands in the Getting Started section, you should see the following in the command-line interface:


bone@zen:~/dev/code.google.com/virgil/trunk-> bin/virgil-cli -h localhost
Connected to: "Test Cluster" on localhost/9160
Welcome to the Cassandra CLI.

Type 'help;' or '?' for help.
Type 'quit;' or 'exit;' to quit.

[default@unknown] use playground;
Authenticated to keyspace: playground
[default@playground] list toys;
Using default limit of 100
-------------------
RowKey: swingset
=> (column=bar, value=33, timestamp=1319508065134)
=> (column=foo, value=1, timestamp=1319508065126)

1 Row Returned.
[default@playground] quit

Thursday, October 20, 2011

Virgil: a GUI and REST layer for Cassandra

Love Cassandra? Love REST?
Wish you could have both at the same time?
Now you can.

After much discussion, I'm happy to announce the birth of a new project, Virgil. The project will provide a GUI and a services layer on top of Cassandra, exposing data and services via REST.

Virgil already has a REST layer for CRUD operations against keyspaces, column families, and data. We hope to quickly add Pig/Hadoop support via REST as well as a thin, javascript-based GUI that uses the REST services.

How can you help nurture the baby?
Head over to Apache Extras,
http://code.google.com/a/apache-extras.org/p/virgil/

Star the project, and then get involved.
Grab the source code and give it a try.

Wednesday, October 5, 2011

Pig / Cassandra: binary operator expected

If you are trying to run Pig on Cassandra and you encounter: "binary operator expected"

You are most likely running pig_cassandra against the latest release of Pig, which ships with two jar files: one with Hadoop and one without. Your PIG_HOME is set to the root directory of your Pig installation, which contains both jar files, and the existence of TWO jar files breaks the pig_cassandra shell script.
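To see why two matches trip things up, here is a minimal sketch. It assumes the script checks for the jar with an unquoted glob inside a `[ ... ]` test; this is not the actual line from pig_cassandra, and the scratch directory and variable names are made up for the demo.

```shell
#!/bin/sh
# Hypothetical reproduction: a scratch directory stands in for PIG_HOME.
demo_home=$(mktemp -d)
touch "$demo_home/pig-0.9.1.jar" "$demo_home/pig-0.9.1-withouthadoop.jar"

# The unquoted glob expands to two words, so `[` receives "-e file1 file2"
# and cannot parse it as a single unary test. The test fails with an error.
two_jar_status=0
[ -e $demo_home/pig*.jar ] 2>/dev/null || two_jar_status=$?

# With only one jar left, the same test is well-formed and succeeds.
rm -f "$demo_home/pig-0.9.1-withouthadoop.jar"
one_jar_status=0
[ -e $demo_home/pig*.jar ] || one_jar_status=$?

echo "two jars: exit status $two_jar_status"
echo "one jar:  exit status $one_jar_status"

# Clean up the scratch directory.
rm -rf "$demo_home"
```

Drop the `2>/dev/null` and run it under bash to see the "binary operator expected" message itself.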

I've submitted a patch for this:
https://issues.apache.org/jira/browse/CASSANDRA-3320
(Please vote to get it included)

Until that is committed, you can simply remove the jar file you don't want to use:


rm -fr $PIG_HOME/pig-0.9.1-withouthadoop.jar

That should fix you.
Happy pigging.

Monday, October 3, 2011

Cassandra / Hadoop : Getting the row key (when iterating over all rows)

I thought I would save some people some time...

The word count example is fantastic, and is enough to get you going. But it may leave you wondering how to get at the row key, since the "key" passed into the map is the name of the column and not the key of the row. Instead, the row key is in the context. Take a look at the snippet below.

public void map(ByteBuffer key, SortedMap<ByteBuffer, IColumn> columns, Context context) 
   throws IOException, InterruptedException {
  for (ByteBuffer columnKey : columns.keySet()){
     String name = ByteBufferUtil.string(columns.get(columnKey).name());
     String value = ByteBufferUtil.string(columns.get(columnKey).value());            
     logger.info("[" + ByteBufferUtil.string(columnKey) + "]->[" + name + "]:[" + value + "]");
     logger.info("Context [" + ByteBufferUtil.string(context.getCurrentKey()) + "]");
  }