Installation using PIP
1 2 3 4 5 |
# install python-pip sudo apt-get install -y python-pip # install cql driver sudo pip install cql |
Manual Installation from the Sources
1 2 3 4 5 6 7 8 |
# download path mkdir -p ~/build/python-cql cd ~/build/python-cql # download git clone https://github.com/pcmanus/python-cql cd python-cql sudo python setup.py install |
Query Example
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 |
# create and move to diretory mkdir -p ~/build/python-cql cd ~/build/python-cql # create test file cat > test.py <<-"_EOF_" import cql host, port = "127.0.0.1", 9160 keyspace = "mytest" con = cql.connect(host, port, cql_version='3.0.0') print (con) cursor = con.cursor() query="DROP KEYSPACE %s;"%keyspace print(query) r= cursor.execute(query) print(r) query=""" CREATE KEYSPACE %s WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 1}; """%keyspace print(query) r= cursor.execute(query) print(r) query="USE %s;"%keyspace print(query) r= cursor.execute(query) print(r) query=""" CREATE TABLE users ( nickname text, password text, gender text, email text, active int, country text, date_of_birth timestamp, PRIMARY KEY (nickname) ); """ print(query) r= cursor.execute(query) print(r) query="CREATE INDEX country_index ON users (country);" print(query) r= cursor.execute(query) print(r) query="CREATE INDEX active_index ON users (active);" print(query) r= cursor.execute(query) print(r) query=""" INSERT INTO users(nickname, email, country, active) VALUES ('first', 'first@first123.com', 'UK', 1) ; """ print(query) r= cursor.execute(query) print(r) query=""" BEGIN BATCH INSERT INTO users(nickname, email, country, active) VALUES ('asmith', 'a@aaa.com', 'US', 1) ; INSERT INTO users(nickname, email, country, active) VALUES ('bsmith', 'b@bbb.com', 'US', 1) ; APPLY BATCH; """ print(query) r= cursor.execute(query) print(r) query="SELECT COUNT(*) FROM users;" print(query) r= cursor.execute(query) print(r) for row in cursor: # Iteration is equivalent to lots of fetchone() calls print row query="SELECT * FROM users;" print(query) r= cursor.execute(query) print(r) for row in cursor: # Iteration is equivalent to lots of fetchone() calls print row query="SELECT * FROM users where country > 'A' and active=1 ALLOW FILTERING;" print(query) r= cursor.execute(query) print(r) for row in cursor: # Iteration is equivalent to lots of fetchone() calls print row query=""" CREATE TABLE timeline ( nickname text, message text, created timeuuid, ctime timestamp, PRIMARY KEY (nickname, created) ) WITH CLUSTERING ORDER BY (created DESC); """ print(query) r= cursor.execute(query) print(r) import uuid key = str(uuid.uuid4()) import datetime from datetime import datetime dt=datetime.now().strftime("%Y-%m-%d %H:%M:%S") query=""" BEGIN BATCH INSERT INTO timeline(nickname, message, created, ctime) VALUES ('asmith','this message one', now(), '"""+str(dt)+"""' ) ; INSERT INTO timeline(nickname, message, created, ctime) VALUES ('asmith','this message two', """+str(uuid.uuid1())+""" , '"""+str(dt)+"""' ) ; APPLY BATCH; """ print(query) r= cursor.execute(query) print(r) query="SELECT * FROM timeline;" print(query) r= cursor.execute(query) print(r) for row in cursor: # Iteration is equivalent to lots of fetchone() calls print row print datetime.fromtimestamp(long(row[2])) # FINISH cursor.close() con.close() """ import uuid random_uuid = str(uuid.uuid4()) # anonymous global_uuid = str(uuid.uuid1()) # machine id coded inside cursor.execute("CQL QUERY", dict(kw='Foo', kw2='Bar, etc...)) - cursor.description # None initially, list of N tuples that represent the N columns in a row after an execute. Only contains type and name info, not values. - cursor.rowcount # -1 initially, N after an execute - cursor.arraysize # variable size of a fetchmany call - cursor.fetchone() # returns a single row - cursor.fetchmany() # returns self.arraysize # of rows - cursor.fetchall() # returns all rows, don't do this. Query substitution: - Use named parameters and a dictionary of names and values. e.g. execute("SELECT * FROM CF WHERE name=:name", {"name": "Foo"}) """ _EOF_ |
1 |
python test.py |
Tips
uuid1 vs uuid4
uuid1 vs uuid4
uuid1()
Generates a UUID from a host ID, sequence number, and the current time – guaranteed collision free
uuid4()
Generates a random UUID, can collide, chance extermely small
See Stackoverflow Reference: http://stackoverflow.com/questions/1785503/when-should-i-use-uuid-uuid1-vs-uuid-uuid4-in-python
Troubleshooting
1. InvalidRequestException – Invalid version value: 3.0
Fix: Change CQL Version from “3.0” to “3.0.0” in order to fix the following exception
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
a@a:~/build/python-cql$ python test.py Traceback (most recent call last): File "test.py", line 7, in <module> con = cql.connect(host, port, cql_version='3.0') File "/usr/local/lib/python2.7/dist-packages/cql/connection.py", line 143, in connect consistency_level=consistency_level, transport=transport) File "/usr/local/lib/python2.7/dist-packages/cql/connection.py", line 59, in __init__ self.establish_connection() File "/usr/local/lib/python2.7/dist-packages/cql/thrifteries.py", line 162, in establish_connection self.set_cql_version(self.cql_version) File "/usr/local/lib/python2.7/dist-packages/cql/thrifteries.py", line 165, in set_cql_version self.client.set_cql_version(cql_version) File "/usr/local/lib/python2.7/dist-packages/cql/cassandra/Cassandra.py", line 1983, in set_cql_version self.recv_set_cql_version() File "/usr/local/lib/python2.7/dist-packages/cql/cassandra/Cassandra.py", line 2004, in recv_set_cql_version raise result.ire cql.cassandra.ttypes.InvalidRequestException: InvalidRequestException(why='Invalid version value: 3.0 (see http://semver.org/ for details)') |
Source: could-not-connect-cql-30
2. Bad Request: line 4:29 mismatched input ‘:’
Fix: Change
1 |
WITH strategy_class = 'SimpleStrategy' AND strategy_options:replication_factor='1'; |
TO
1 |
WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3} |
in order to fix the following exception
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
a@a:~/build/python-cql$ python test.py ThriftConnection(host='127.0.0.1', port=9160, keyspace=None, conn open) CREATE KEYSPACE mytest WITH strategy_class = 'SimpleStrategy' AND strategy_options:replication_factor='1'; Traceback (most recent call last): File "test.py", line 18, in <module> r= cursor.execute(query) File "/usr/local/lib/python2.7/dist-packages/cql/cursor.py", line 80, in execute response = self.get_response(prepared_q, cl) File "/usr/local/lib/python2.7/dist-packages/cql/thrifteries.py", line 77, in get_response return self.handle_cql_execution_errors(doquery, compressed_q, compress, cl) File "/usr/local/lib/python2.7/dist-packages/cql/thrifteries.py", line 98, in handle_cql_execution_errors raise cql.ProgrammingError("Bad Request: %s" % ire.why) cql.apivalues.ProgrammingError: Bad Request: line 4:29 mismatched input ':' expecting '=' |
3. error: cannot assign result of function now
Fix: Upgrade Cassandra from Version 1.2.3 to 1.2.4 to fix error
1 2 3 4 5 6 7 8 9 10 11 12 |
INSERT INTO timeline(nickname, message, created, ctime) VALUES ('asmith','this message two', now(), 7ed26c4e-a452-11e2-92b6-0022fa7a3a68 ) ; Traceback (most recent call last): File "test.py", line 161, in <module> r= cursor.execute(query) File "/usr/local/lib/python2.7/dist-packages/cql/cursor.py", line 80, in execute response = self.get_response(prepared_q, cl) File "/usr/local/lib/python2.7/dist-packages/cql/thrifteries.py", line 77, in get_response return self.handle_cql_execution_errors(doquery, compressed_q, compress, cl) File "/usr/local/lib/python2.7/dist-packages/cql/thrifteries.py", line 98, in handle_cql_execution_errors raise cql.ProgrammingError("Bad Request: %s" % ire.why) cql.apivalues.ProgrammingError: Bad Request: Type error: cannot assign result of function now (type timeuuid) to created (type 'org.apache.cassandra.db.marshal.ReversedType(org.apache.cassandra.db.marshal.TimeUUIDType)') |
Source: https://issues.apache.org/jira/browse/CASSANDRA-5386
Check Cassandra Version using any of the following commands:
1 2 |
nodetool -h localhost version #ReleaseVersion: 1.2.4 |
Manually Updgrade Cassandra to 1.2.3
Cassandra version 1.2.4 is currently not yet available in the ubuntu/datax repositories.
A partial upgrade (not safe for production)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
#create directories mkdir -p ~/build/cassandra cd ~/build/cassandra # download version="1.2.4" wget http://apache.openmirror.de/cassandra/$version/apache-cassandra-$version-bin.tar.gz tar -xvf apache-cassandra-$version-bin.tar.gz cd apache-cassandra-$version # move old jar file to old directory sudo mkdir -p /usr/share/cassandra/old/ sudo mv /usr/share/cassandra/lib /usr/share/cassandra/old/ sudo mv /usr/share/cassandra/*1.2.3.jar /usr/share/cassandra/old/ # copy new libraries sudo cp -r lib /usr/share/cassandra/ sudo mv /usr/share/cassandra/lib/apache-cassandra-* /usr/share/cassandra/ # change apache-cassandra link sudo rm /usr/share/cassandra/apache-cassandra.jar sudo ln -s /usr/share/cassandra/apache-cassandra-1.2.4.jar /usr/share/cassandra/apache-cassandra.jar # cassandra sudo /etc/init.d/cassandra restart |
Cannot execute this query as it might involve data filtering
Fix: Change query from
1 |
SELECT * FROM users where country > 'A' and active=1; |
TO
1 |
SELECT * FROM users where country > 'A' and active=1 ALLOW FILTERING; |
Source: http://www.datastax.com/docs/1.2/cql_cli/cql/SELECT
Error:
1 2 3 4 5 6 7 8 9 10 11 |
SELECT * FROM users where country > 'A' and active=1; Traceback (most recent call last): File "test.py", line 111, in <module> r= cursor.execute(query) File "/usr/local/lib/python2.7/dist-packages/cql/cursor.py", line 80, in execute response = self.get_response(prepared_q, cl) File "/usr/local/lib/python2.7/dist-packages/cql/thrifteries.py", line 77, in get_response return self.handle_cql_execution_errors(doquery, compressed_q, compress, cl) File "/usr/local/lib/python2.7/dist-packages/cql/thrifteries.py", line 98, in handle_cql_execution_errors raise cql.ProgrammingError("Bad Request: %s" % ire.why) cql.apivalues.ProgrammingError: Bad Request: Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING |
References
1. Apache Cassandra Home: http://cassandra.apache.org/
2. python-cql Github Home: https://github.com/pcmanus/python-cql
3. python-cql PIP Home: https://pypi.python.org/pypi/cql
4. CQL Tutorial: http://techdocs.acunu.com/v1.1/html/tutorial.html#using-cql
5. Apache Cassandra 1.2 Documentation(PDF): http://www.datastax.com/doc-source/pdf/cassandra12.pdf
6. Cassandra Query Language Manual: http://cassandra.apache.org/doc/cql3/CQL.html
7. Cassandra, CQL3, and Time Series Data With Timeuuid: http://www.rustyrazorblade.com/2012/10/cassandra-cql3-and-time-series-data-with-timeuuid/
8. CQL3 for Cassandra experts:
9. A thrift to CQL3 upgrade guide: http://www.datastax.com/dev/blog/thrift-to-cql3
10. Understanding the Cassandra data model: http://www.datastax.com/docs/1.2/ddl/index
11. Cassandra Data Modeling Best Practices, Part 1: http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/
12. PRIMARY KEY’s in CQL: http://planetcassandra.org/blog/post/primary-keys-in-cql
13. CQL3 In Depth: http://www.slideshare.net/yukim/cql3-in-depth
14. CQL and Timeseries Schema: http://stackoverflow.com/questions/13500216/cql-and-timeseries-schema
15. Advanced Time Series with Cassandra: http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra
16. Secondary Index ought to have low cardinality: http://en.wikipedia.org/wiki/Cardinality_(SQL_statements)