Using Apache Spark from Clojure
Here's a small example of processing big data (well, a small sample of it) from Clojure using Apache Spark and the Sparkling library:
```clojure
(do
  (require '[sparkling.conf :as conf])
  (require '[sparkling.core :as spark])
  (spark/with-context           ; creates a SparkContext from the given config
    sc
    (-> (conf/spark-conf)
        (conf/app-name "sparkling-test")
        (conf/master "local"))
    (let [lines-rdd
          ;; Here we provide data from a Clojure collection.
          ;; You could also read from a text file or an Avro file,
          ;; or even pull from a JDBC datasource.
          (spark/into-rdd sc ["This is a first line"
                              "Testing spark"
                              "and sparkling"
                              "Happy hacking!"])]
      (spark/collect            ; get every element from the filtered RDD
       (spark/filter            ; keep elements of lines-rdd matching the predicate
        #(.contains % "spark")  ; a plain Clojure function as the filter predicate
        lines-rdd)))))
```
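Run in a REPL (assuming Sparkling and a compatible Spark version are on the classpath), the `collect` call returns the two lines that contain "spark". Swapping the in-memory collection for a real data source is a one-line change; here is a sketch using `spark/text-file`, where `"data/lines.txt"` is a hypothetical path:

```clojure
;; Same pipeline, but reading lines from a text file instead of a
;; Clojure collection. "data/lines.txt" is a hypothetical example path.
(spark/with-context
  sc
  (-> (conf/spark-conf)
      (conf/app-name "sparkling-file-test")
      (conf/master "local"))
  (let [lines-rdd (spark/text-file sc "data/lines.txt")]
    (spark/collect
     (spark/filter #(.contains % "spark") lines-rdd))))
```

Because `text-file` also yields an RDD of strings, the rest of the pipeline is unchanged.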
Written by Chris