How do I connect Cassandra to Spark?
Cassandra > Spark > R
I’ve already been able to connect R to Spark. Now I need to bring the data stored in Cassandra into Spark and then analyze it in R. Can someone help me? Thanks in advance.
Spark doesn’t know how to talk to Cassandra out of the box, but its functionality can be extended through connectors. DataStax has produced a connector for Spark, written in Scala (a language that runs on the JVM), which is available on GitHub:
https://github.com/datastax/spark-cassandra-connector
After building the repository on your computer, there will be two jar files in a directory called "target", one for Scala and one for Java. It’s a good idea to copy the jar to a path that’s easy to remember.
Start Spark again (from the Spark directory), but this time load the jar (remember to use the path where you put the jar):
bin/spark-shell --jars ~/spark-cassandra-connector-assembly-1.4.0-SNAPSHOT.jar
Now type the following at the Scala prompt:
sc.stop
import com.datastax.spark.connector._, org.apache.spark.SparkContext, org.apache.spark.SparkContext._, org.apache.spark.SparkConf
val conf = new SparkConf(true).set("spark.cassandra.connection.host", "localhost")
val sc = new SparkContext(conf)
This stops the Spark context that the shell created and replaces it with one that is connected to your local Cassandra database.
Then type the following in the Scala shell:
val test_spark_rdd = sc.cassandraTable("NOME_KEYSPACE", "SUA_TABELA")
test_spark_rdd.first
(Replace NOME_KEYSPACE and SUA_TABELA with the name of your keyspace and the name of a table in that keyspace.)
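Once the first row comes back, you will probably want to pull only some of the columns. The following is a rough sketch of that, meant to be pasted into the same spark-shell session, so it assumes the `sc` created above is still in scope. The column name "name" is purely hypothetical; use a text column that actually exists in your table.

```scala
// Sketch only: run inside the spark-shell session started above.
// Assumes `sc` is the SparkContext configured with spark.cassandra.connection.host.
import com.datastax.spark.connector._

// "name" is a hypothetical text column; replace it with one of your own.
val names = sc.cassandraTable("NOME_KEYSPACE", "SUA_TABELA").select("name")

// Count the rows, then print the selected column for a handful of them.
println(names.count)
names.take(5).foreach(row => println(row.getString("name")))
```

Selecting only the columns you need keeps the amount of data read from Cassandra small, which helps before handing the result on to R.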
I hope this helps in some way. Regards.