Cassandra

From GeilThings

Cassandra
General
  Version: 1.2.3
  Stable: 1.2.7
  Function: Database
  SQL: NoSQL
  Port: 9160
Files
  Config File: /opt/cassandra/conf/cassandra.yaml, /opt/cassandra/conf/cassandra-env.sh
  Data File: /var/lib/cassandra/data/*
  Log File: /var/log/cassandra/system.log, /var/lib/cassandra/commitlog
Scripts
  Version Script: /opt/cassandra/bin/nodetool -h localhost version
  Start: service cassandra start
  Stop: service cassandra stop
  Cli: /opt/cassandra/bin/cassandra-cli

General

Cassandra Wiki: http://wiki.apache.org/cassandra/FrontPage
Cassandra Data Model: http://wiki.apache.org/cassandra/DataModel
From Arin Sarkissian, Digg's engineering team, about Cassandra's data model: http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model
Cassandra Ports: http://wiki.apache.org/cassandra/FAQ#ports
CQL: Cassandra Query Language. CQL3 is not backwards compatible with CQL2 (Cassandra 1.1.0 versus 1.0.9).
Determine version: /opt/cassandra/bin/nodetool -h localhost version

Installation

For 0.8.1, 0.8.6, 0.8.7, 1.0.0 and beyond:

# Requires Java 1.6.
# Cassandra on CentOS: 
# http://blog.milford.io/2010/06/installing-apache-cassandra-on-centos/
 
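cd /opt
wget http://apache.easy-webs.de//cassandra/0.8.1/apache-cassandra-0.8.1-bin.tar.gz
# Unpack the archive; this creates /opt/apache-cassandra-0.8.1.
tar -zxvf apache-cassandra-0.8.1-bin.tar.gz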
 
# Edit conf/cassandra.yaml if necessary, particularly 
# data_file_directories:
#     - /var/lib/cassandra/data
# commitlog_directory: /var/lib/cassandra/commitlog
# saved_caches_directory: /var/lib/cassandra/saved_caches
 
# Create the directories above
mkdir -p /var/lib/cassandra/data
mkdir -p /var/lib/cassandra/commitlog
mkdir -p /var/lib/cassandra/saved_caches
 
# Edit conf/log4j-server.properties if necessary, particularly
# log4j.appender.R.File=/var/log/cassandra/system.log
 
# Create the directory above
mkdir -p /var/log/cassandra
 
# Create softlink to facilitate administration.
cd /opt
ln -s apache-cassandra-0.8.1 /opt/cassandra
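
The directories created above can be checked before the first start. The following is a minimal sketch, assuming the default paths from this section; the function name check_cassandra_dirs is hypothetical and not part of Cassandra.

```shell
#!/bin/bash
# check_cassandra_dirs DIR...
# Prints one line for every directory that is missing or not writable.
# Returns 0 if all directories are usable, 1 otherwise.
# (Hypothetical helper, not shipped with Cassandra.)
check_cassandra_dirs() {
  local dir status=0
  for dir in "$@"; do
    if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
      echo "missing or not writable: $dir"
      status=1
    fi
  done
  return $status
}
```

It could be called as: check_cassandra_dirs /var/lib/cassandra/data /var/lib/cassandra/commitlog /var/lib/cassandra/saved_caches /var/log/cassandra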

Running

The start/stop script below is a slightly modified version of Nathan Milford's script, http://blog.milford.io/2010/06/installing-apache-cassandra-on-centos/

#!/bin/bash
# init script for Cassandra.
# chkconfig: 2345 90 10
# description: Cassandra
# script slightly modified from 
# http://blog.milford.io/2010/06/installing-apache-cassandra-on-centos/
 
. /etc/rc.d/init.d/functions
 
# Name used in the status messages below.
prog="cassandra"
CASS_HOME=/opt/cassandra
CASS_BIN=$CASS_HOME/bin/cassandra
CASS_LOG=/var/log/cassandra/system.log
CASS_USER="root"
CASS_PID=/var/run/cassandra.pid
 
if [ ! -f $CASS_BIN ]; then
  echo "File not found: $CASS_BIN"
  exit 1
fi
 
RETVAL=0
 
start() {
  if [ -f $CASS_PID ] && checkpid `cat $CASS_PID`; then
    echo "Cassandra is already running."
    exit 0
  fi
  echo -n $"Starting $prog: "
  daemon --user $CASS_USER $CASS_BIN -p $CASS_PID >> $CASS_LOG 2>&1
  # Save the exit status of daemon before usleep overwrites $?.
  RETVAL=$?
  usleep 500000
  if [ "$RETVAL" = "0" ]; then
    echo_success
  else
    echo_failure
  fi
  echo
  return $RETVAL
}
 
stop() {
  # check if the process is already stopped by seeing if the pid file exists.
  if [ ! -f $CASS_PID ]; then
    echo "Cassandra is already stopped."
    exit 0
  fi
  echo -n $"Stopping $prog: "
  if kill `cat $CASS_PID`; then
    RETVAL=0
    echo_success
  else
    RETVAL=1
    echo_failure
  fi
  echo
  [ $RETVAL = 0 ]
}
 
status_fn() {
  if [ -f $CASS_PID ] && checkpid `cat $CASS_PID`; then
    echo "Cassandra is running."
    exit 0
  else
    echo "Cassandra is stopped."
    exit 1
  fi
}
 
case "$1" in
  start)
    start
    ;;
  stop)
    stop
    ;;
  status)
    status_fn
    ;;
  restart)
    stop
    usleep 500000
    start
    ;;
  *)
    echo $"Usage: $prog {start|stop|restart|status}"
    RETVAL=3
esac
 
exit $RETVAL
  • End of script.
# After the script is created as /etc/init.d/cassandra, and
# made executable with
chmod +x /etc/init.d/cassandra
# start and stop Cassandra with
 
service cassandra start
service cassandra stop
 
# To start Cassandra automatically after a reboot:
 
chmod +x /etc/init.d/cassandra
chkconfig --add cassandra
chkconfig cassandra on
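
After service cassandra start, Cassandra needs a few seconds before it answers requests. The following is a minimal sketch of polling until a command succeeds, assuming that nodetool (as used elsewhere on this page) only succeeds once the node is up; the helper name wait_until is hypothetical.

```shell
#!/bin/bash
# wait_until TRIES DELAY COMMAND [ARGS...]
# Runs COMMAND up to TRIES times, sleeping DELAY seconds after each failure.
# Returns 0 as soon as COMMAND succeeds, 1 if it never does.
# (Hypothetical helper, not part of Cassandra or the init script.)
wait_until() {
  local tries="$1" delay="$2" i=0
  shift 2
  while [ "$i" -lt "$tries" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, wait_until 30 1 /opt/cassandra/bin/nodetool -h localhost version would retry for about 30 seconds instead of relying on a fixed sleep.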

Cassandra Cli

/opt/cassandra/bin/cassandra-cli
 
connect localhost/9160;
 
show keyspaces;
 
create keyspace test;
 
drop keyspace <keyspace_name>;
 
use test;
 
create column family HelloWorld with comparator = UTF8Type;
# Update: wrong:
update column family HelloWorld with key_validation_class=UTF8Type;
# The column_metadata in an update has to contain all parameters; it is not an incremental update.
# Update: right:
update column family HelloWorld with
...     column_metadata = [{column_name: language, validation_class: UTF8Type, index_type: KEYS},
...     {column_name: message_text, validation_class: UTF8Type},];
 
set HelloWorld['php']['language'] = 'php';
set HelloWorld['php']['message_text'] = 'Hello World from PHP and Cassandra';
set HelloWorld['perl']['language'] = 'perl';
set HelloWorld['perl']['message_text'] = 'Hello World from Perl, mod_perl and Cassandra';
set HelloWorld['node.js']['language'] = 'node.js';
set HelloWorld['node.js']['message_text'] = 'Hello World from node.js and Cassandra';
 
quit;

Configuration

Cassandra configuration: Some tips for using Cassandra: http://blog.mikiobraun.de/2010/08/-cassandra-tips.html
Cassandra Performance Tuning: http://jonathanhui.com/cassandra-performance-tuning-and-monitoring
Modify cassandra-env.sh so that Cassandra uses less memory:
Instead of: MAX_HEAP_SIZE = (system_memory_in_mb / 2)
Set: MAX_HEAP_SIZE="512M"
and HEAP_NEWSIZE="100M"
Using Cassandra 1.1.0 or 1.1.1 with Java 1.7.4:
(Starting with Cassandra 1.1.2, the start script detects whether Java 1.7 is installed, so this error should no longer occur.)
Error:
Cassandra 1.1.0 could not be started with Java 1.7.4 and the following error appeared in the log file:
The stack size specified is too small, Specify at least 160k
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
Solution:
In the file /opt/cassandra/conf/cassandra-env.sh, modify the line that sets the stack size (..."128k"...) so that the stack size is at least 160k:
JVM_OPTS="$JVM_OPTS -Xss160k"
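
The stack size change can also be scripted. The following is a minimal sketch, assuming the option appears in cassandra-env.sh in the form -Xss<number>k; the function name set_stack_size is hypothetical. Back up the file before running it.

```shell
#!/bin/bash
# set_stack_size FILE SIZE
# Replaces the first -Xss<number><unit> option on each line of FILE with -Xss<SIZE>.
# Writes to a temporary file first so that a failed sed does not truncate FILE.
# (Hypothetical helper; the exact line in cassandra-env.sh is an assumption.)
set_stack_size() {
  local file="$1" size="$2"
  sed "s/-Xss[0-9][0-9]*[kKmM]/-Xss${size}/" "$file" > "${file}.tmp" &&
    mv "${file}.tmp" "$file"
}
```

For example: set_stack_size /opt/cassandra/conf/cassandra-env.sh 160k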

Backup

Cassandra Operations: Backing up data
# Backup configuration and data files.
# Create the backup directories first; cp does not create them.
mkdir -p /home/username/backup/cassandra/cassandra_081_YYMMDD/data
cp /opt/cassandra/conf/cassandra.yaml /home/username/backup/cassandra/cassandra_081_YYMMDD/cassandra.yaml
cp /opt/cassandra/conf/cassandra-env.sh /home/username/backup/cassandra/cassandra_081_YYMMDD/cassandra-env.sh
cp -pr /var/lib/cassandra/data/* /home/username/backup/cassandra/cassandra_081_YYMMDD/data/
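
The YYMMDD part of the backup path can be filled in automatically. The following is a minimal sketch, assuming the directory layout used above; the function name backup_cassandra and its arguments are hypothetical.

```shell
#!/bin/bash
# backup_cassandra CONF_DIR DATA_DIR BACKUP_ROOT LABEL
# Copies the contents of CONF_DIR and DATA_DIR into
# BACKUP_ROOT/cassandra_<LABEL>_<YYMMDD>/conf and .../data,
# then prints the path of the backup directory.
# (Hypothetical helper; stop Cassandra first if the data files must be consistent.)
backup_cassandra() {
  local conf_dir="$1" data_dir="$2" backup_root="$3" label="$4"
  local target="${backup_root}/cassandra_${label}_$(date +%y%m%d)"
  mkdir -p "${target}/conf" "${target}/data" || return 1
  cp -pr "${conf_dir}/." "${target}/conf/" || return 1
  cp -pr "${data_dir}/." "${target}/data/" || return 1
  echo "${target}"
}
```

For example: backup_cassandra /opt/cassandra/conf /var/lib/cassandra/data /home/username/backup/cassandra 081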

Upgrade Cassandra

Recommended: Dominic Williams' blog, 10 steps to upgrade a Cassandra node
Upgrading from 0.8.x to 1.0: DataStax
  • Started with 0.8.1.
######### START OF EDIT #########
 
APPNAME=cassandra
# The directory where I want to download the files to be installed.
DOWNLOADDIRECTORY=/opt
# If I want to backup, set it to TRUE.
ISBACKUP=TRUE
# BACKUPROOTDIR is only relevant if ISBACKUP=TRUE
BACKUPROOTDIR=/home/backup
DATE=`date +%Y-%m-%d`
# The script breaks if variable names are not enclosed with {}
BACKUPDIR=${BACKUPROOTDIR}/${APPNAME}/${APPNAME}_${OLDVERSION}_${DATE}
 
######### END OF EDIT #########
 
# As of Cassandra 0.8.x, 1.0.x, 1.1.x, 1.2.x, Cassandra's version can be read using
# /opt/cassandra/bin/nodetool -h localhost version
# e.g. Output is: ReleaseVersion: 0.8.6
# This works only if Cassandra is running.
# TODO: Ugly workaround: restart the service so that it is running when nodetool is called.
service cassandra stop
sleep 5
service cassandra start
# Service needs to be up to continue. TODO: Check that service is running.
sleep 5
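RAWOLDVERSION=`/opt/cassandra/bin/nodetool -h localhost version`
OLDVERSION=${RAWOLDVERSION:16}
echo ${OLDVERSION}
# BACKUPDIR contains OLDVERSION, so set it again now that OLDVERSION is known.
BACKUPDIR=${BACKUPROOTDIR}/${APPNAME}/${APPNAME}_${OLDVERSION}_${DATE}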
 
RAWNEWVERSION=`curl -sd "action=ask&query=[[Cassandra]]|%3FStable&format=json" http://www.geilthings.com/api.php` 
# Get the version as a substring.
# Use grep to extract the string between the first occurrence of MATCH1 and the next occurrence of MATCH2:
# http://stackoverflow.com/questions/4392106/sed-extract-string-between-first-occurrence-of-match1-and-next-occurrence-of-m
NEWVERSION=`echo "$RAWNEWVERSION" | grep -Po '^.*?\K(?<=Stable\":\[\").*?(?=\")'`  
echo $NEWVERSION
 
if [[ ${ISBACKUP} == "TRUE" ]]; then
  # Create backup directory if it does not exist.
  if [ ! -d "${BACKUPDIR}" ]; then
    mkdir -m 755 -p ${BACKUPDIR}
  fi
  # Backup configuration files.
  # Create backup directory if it does not exist.
  if [ ! -d "${BACKUPDIR}/conf" ]; then 
    mkdir -m 755 -p ${BACKUPDIR}/conf
  fi
  cp -pr /opt/cassandra/conf/* $BACKUPDIR/conf/
  # Stop the server before copying the data files.
  service cassandra stop
  # Backup the data files.
  # Create backup directory if it does not exist.
  if [ ! -d "${BACKUPDIR}/data" ]; then
    mkdir -m 755 -p ${BACKUPDIR}/data
  fi
  cp -pr /var/lib/cassandra/data/* $BACKUPDIR/data/
fi
 
cd ${DOWNLOADDIRECTORY}
# Get the file
wget http://mirror.netcologne.de/apache.org/cassandra/${NEWVERSION}/apache-cassandra-${NEWVERSION}-bin.tar.gz
 
tar -zxvf apache-cassandra-${NEWVERSION}-bin.tar.gz
cd apache-cassandra-${NEWVERSION}
 
# Read the /opt/apache-cassandra-$NEWVERSION/NEWS.txt file; 
# it contains upgrade relevant notes and details about new features.
# /opt/apache-cassandra-$NEWVERSION/README.txt contains 
# Getting Started information.
vi /opt/apache-cassandra-$NEWVERSION/NEWS.txt
# :q to leave vi.
 
# Copy any customization in the configuration files from 
# /opt/apache-cassandra-${OLDVERSION}/conf/cassandra.yaml 
# /opt/apache-cassandra-${OLDVERSION}/conf/log4j-server.properties 
# /opt/apache-cassandra-${OLDVERSION}/conf/cassandra-env.sh
# to
# /opt/apache-cassandra-${NEWVERSION}/conf/cassandra.yaml 
# /opt/apache-cassandra-${NEWVERSION}/conf/log4j-server.properties 
# /opt/apache-cassandra-${NEWVERSION}/conf/cassandra-env.sh
# Small RAM: In conf/cassandra-env.sh, set MAX_HEAP_SIZE="512M" and HEAP_NEWSIZE="100M".
 
# Copy all Thrift libraries (or better, generate them again) before deleting the old Cassandra directory.
# e.g.
# cp -pr /opt/apache-cassandra-${OLDVERSION}/interface/gen-perl/ /opt/apache-cassandra-${NEWVERSION}/interface/
# cp -pr /opt/apache-cassandra-${OLDVERSION}/interface/gen-java/ /opt/apache-cassandra-${NEWVERSION}/interface/
# cp -pr /opt/apache-cassandra-${OLDVERSION}/interface/gen-nodejs/ /opt/apache-cassandra-${NEWVERSION}/interface/
 
# Better: Generate the thrift files again:
cd /opt/apache-cassandra-${NEWVERSION}/interface/
thrift --gen perl cassandra.thrift
thrift --gen php cassandra.thrift
thrift --gen js:node cassandra.thrift
thrift --gen py cassandra.thrift
thrift --gen rb cassandra.thrift
thrift --gen java cassandra.thrift
thrift --gen erl cassandra.thrift
 
# Modify the cassandra soft link.
rm -f /opt/cassandra
ln -s /opt/apache-cassandra-${NEWVERSION} /opt/cassandra
 
# Edit the start/stop script /etc/init.d/cassandra if it does not use the soft link
# or if the soft link above does not exist.
 
# The old Cassandra 0.8.1 data files can be used with 0.8.6, no need to upgrade anything.
# 0.8.7 data files can be used if upgrading to 1.0.0.
 
# See issue below:
# Upgrading to Cassandra 1.2.0: Change the partitioner in conf/cassandra.yaml from
#   partitioner: org.apache.cassandra.dht.Murmur3Partitioner
# to
#   partitioner: org.apache.cassandra.dht.RandomPartitioner
 
# Start.
service cassandra start
 
# Check that the service is running.
ps aux | grep cassandra
# A very long command line appears if Cassandra is running.
 
# Rename the old version directory and delete it some time later.
mv /opt/apache-cassandra-${OLDVERSION} /opt/old_apache-cassandra-${OLDVERSION}
 
# Check version.
# ++++ Cassandra 1.1.3: ERROR. THE SCRIPT BELOW DOES NOT WORK:
# nodetool in Cassandra 1.1.3 does not work on CentOS, see also https://issues.apache.org/jira/browse/CASSANDRA-4494
RAWVERSIONSCRIPT=`curl -sd "action=ask&query=[[Cassandra]]|%3FVersion_Script&format=json" http://www.geilthings.com/api.php` 
DIRTYVERSIONSCRIPT=`echo "$RAWVERSIONSCRIPT" | grep -Po '^.*?\K(?<=Version Script\":\[\").*?(?=\")'`
# Forward slashes are preceded by backslashes in the output; remove the backslashes.
CLEANVERSIONSCRIPT=$(echo "$DIRTYVERSIONSCRIPT" |  sed -e 's/\\//g')
RAWINSTALLEDNEWVERSION=`$CLEANVERSIONSCRIPT`
INSTALLEDNEWVERSION=${RAWINSTALLEDNEWVERSION:16}
echo $INSTALLEDNEWVERSION
 
# Test applications.
# ...
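
The script above extracts the version with ${RAWOLDVERSION:16}, which depends on the prefix being exactly 16 characters long. The following is a minimal sketch of a variant that strips the prefix by name, assuming the nodetool output has the form "ReleaseVersion: 0.8.6" as noted in the script; the function name parse_release_version is hypothetical.

```shell
#!/bin/bash
# parse_release_version RAW
# Removes a leading "ReleaseVersion: " from RAW and prints the rest.
# If the prefix is absent, RAW is printed unchanged.
# (Hypothetical helper, not part of nodetool.)
parse_release_version() {
  local raw="$1"
  echo "${raw#ReleaseVersion: }"
}
```

For example: OLDVERSION=$(parse_release_version "$(/opt/cassandra/bin/nodetool -h localhost version)")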

Programming


Issues

Upgrade to Cassandra 1.2.0, Cassandra does not start

Cassandra does not start and the error messages below appear in /var/log/cassandra/system.log:

ERROR [SSTableBatchOpen:1] SSTableReader.java (line 176) Cannot open /var/lib/cassandra/data/system/schema_keyspaces/system-schema_keyspaces-hf-2; partitioner org.apache.cassandra.dht.RandomPartitioner does not match system partitioner org.apache.cassandra.dht.Murmur3Partitioner. Note that the default partitioner starting with Cassandra 1.2 is Murmur3Partitioner, so you will need to edit that to match your old partitioner if upgrading.

Solution: When upgrading to Cassandra 1.2.0, change the partitioner in conf/cassandra.yaml so that it matches the partitioner of the existing data:

from

partitioner: org.apache.cassandra.dht.Murmur3Partitioner

to

partitioner: org.apache.cassandra.dht.RandomPartitioner
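
Before starting the upgraded node, the configured partitioner can be read back from the configuration file. The following is a minimal sketch, assuming the setting is on a single uncommented line that starts with "partitioner:"; the function name get_partitioner is hypothetical.

```shell
#!/bin/bash
# get_partitioner FILE
# Prints the value of the first uncommented "partitioner:" line in FILE.
# Commented lines ("# partitioner: ...") are ignored because they do not
# start with the key.
# (Hypothetical helper; assumes a one-line "partitioner: <class>" setting.)
get_partitioner() {
  sed -n 's/^partitioner:[[:space:]]*//p' "$1" | head -n 1
}
```

For example: get_partitioner /opt/cassandra/conf/cassandra.yaml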

The stack size specified is too small, Specify at least 160k

# This Java 1.7 error was fixed in Cassandra 1.1.2 and higher.
# Java 1.7 needs a stack size of at least 160k: JVM_OPTS="$JVM_OPTS -Xss160k"
# Otherwise Cassandra cannot be started and the following error appears in the log file:
# The stack size specified is too small, Specify at least 160k
# Error: Could not create the Java Virtual Machine.
# Error: A fatal exception has occurred. Program will exit.

Cassandra 1.1.3 nodetool does not work in CentOS 6.3

See http://www.mail-archive.com/commits@cassandra.apache.org/msg46663.html

Versions

Software name   Version number   Version date
Cassandra       1.1.7            1 December 2012
                1.1.8            26 December 2012
                1.2.0            8 January 2013
                0.8.6            15 October 2011
                0.8.7            22 October 2011
                1.0.0            25 November 2011
                1.0.3            2 December 2011
                1.0.5            14 December 2011
                1.0.6            17 January 2012
                1.2.1            28 January 2013
                1.2.2            25 February 2013
                1.2.3            18 March 2013
                0.8.1            12 July 2011
