rest,apache-spark,spring-batch,job-scheduling,spring-data-hadoop , Triggering spark jobs with REST


Triggering spark jobs with REST

Question:

Tag: rest,apache-spark,spring-batch,job-scheduling,spring-data-hadoop

I have been of late trying out apache spark. My question is more specific to trigger spark jobs. Here I had posted question on understanding spark jobs. After getting dirty on jobs I moved on to my requirement.

I have a REST end point where I expose API to trigger Jobs, I have used Spring4.0 for Rest Implementation. Now going ahead I thought of implementing Jobs as Service in Spring where I would submit Job programmatically, meaning when the endpoint is triggered, with given parameters I would trigger the job. I have now few design options.

Firstly, I would like to know what is the best solution in this case, execution wise and also scaling wise.

Note : I am using a standalone cluster from spark. kindly help.


Answer:

Just use the Spark JobServer https://github.com/spark-jobserver/spark-jobserver

There are a lot of things to consider with making a service, and the Spark JobServer has most of them covered already. If you find things that aren't good enough, it should be easy to make a request and add code to their system rather than reinventing it from scratch


Related:


How to expose existing REST API through Azure Service Bus (or through something else)


rest,azure,azureservicebus
I have an existing on-premise REST API from an external vendor. I'd like to expose this API unmodified to the outside world through an Azure website. So I have customers that run this API on-premise and I'm developing a PaaS/SaaS app that should access these on-premise API's. I also have...

Trying to write a unit test for file upload to a django Restless API


python,django,rest,file-upload,request
I'm writing a fairly small lightweight REST api so I chose restless as the quickest/easiest support for that. I didn't seem to need all the complexity and support of the django-REST module. My service will only received and send json but users need to upload files to one single endpoint....

remote data fetching inside model object in objective c using AFNetworking


ios,objective-c,rest,model-view-controller,afnetworking-2
In all of my iOS application I use this approach to respect MCV, I want to be sure that my implementation is correct and respects the best practices and the MVC design pattern : Singleton of AFNetworking acting as API for network calls: MyAPI.h : #import "AFHTTPSessionManager.h" #import "AFNetworking.h" @interface...

REST Jersey server JAX-RS 500 Internal Server Error


java,rest,jersey,jax-rs
I'm calling this method and getting a 500 back from it. In the debugger I'm able to step though it all the way to the return statement at the end. No problem, r is populated as expected after Response.build() is called, the status says 200 OK. But that's not what...

Install SparkR that comes with Spark 1.4


r,apache-spark,sparkr
The newest version of Spark (1.4) now comes with SparkR. Does anyone know how to go about installing the SparkR implementation on Windows? The sparkR.R script is currently located in C:/spark-1.4.0/R/pkgs/R/ This appears to be a step in the right direction, but the instructions don't work for Windows as there...

Sencha/Extjs rest call with all parameters


json,rest,extjs,sencha-touch
I'm using ExtJs 5.1.1 and I've written a simple view with a grid, and selecting one row the corresponding model property are editable in some text fields. When editing is completed the button 'save' call Model.save() method, which use the rest proxy configured to write the changes on the server....

How to un-nest a spark rdd that has the following type ((String, scala.collection.immutable.Map[String,scala.collection.immutable.Map[String,Int]]))


scala,cassandra,apache-spark
Its a nested map with contents like this when i print it onto screen (5, Map ( "ABCD" -> Map("3200" -> 3, "3350.800" -> 4, "200.300" -> 3) (1, Map ( "DEF" -> Map("1200" -> 32, "1320.800" -> 4, "2100" -> 3) I need to get something like this Case...

How to manipulate local files with webdav


javascript,jquery,rest,file-upload,webdav
Hi so I just found out that webdav protocol allows for manipulations of local files through a browser. I have it already set up in the back end. What I would like to know is how to make it work on front end. I am using javascript with jQuery. For...

Access key from mapValues or flatMapValues?


scala,apache-spark
In Spark 1.3, is there a way to access the key from mapValues? Specifically, if I have val y = x.groupBy(someKey) val z = y.mapValues(someFun) can someFun know which key of y it is currently operating on? Or do I have to do val y = x.map(r => (someKey(r), r)).groupBy(_._1)...

Intercepting login calls with Spring-Security-Rest plugin in Grails


rest,grails,spring-security
I am using the spring security rest plugin for Grails to provide a login mechanism for an AngularJS app. Login works fine, but I can't figure out how to intercept login calls, in order to store additional statistics on (invalid/valid) login attempts. As I am quite new to Spring Security...

Which spark MLIB algorithm to use?


machine-learning,apache-spark
I'm newbie to machine learning and would like to understand what algorithm (Classification algorithm or co-relation algorithm?) to use in order to understand what is the relationship between one or more attributes. for example consider I have following set of attributes, Bill No, Bill Amount, Tip amount, Waiter Name and...

How can I get json objects without the object number?


javascript,jquery,json,rest
I have a simple json object that spits out 4 items that have completely different properties inside each one. I have got the json being displayed with the 4 objects that are called meta.work_content like so: [Object, Object, Object, Object] I can open these in console and see the objects...

Passing a function foreach key of an Array


scala,apache-spark,scala-collections,spark-graphx
I have an array like that : val pairs: Array[(Int, ((VertexId, Seq[Int]), Int))] which generates this output : (11,((11,ArraySeq(2, 5, 4, 5)),1)) (11,((12,ArraySeq(7, 7, 8, 2)),1)) (11,((13,ArraySeq(5, 9, 8, 7)),1)) (1,((1,ArraySeq(1, 2, 3, 4)),1)) (1,((4,ArraySeq(1, 5, 1, 1)),1)) I want to build a Graph for each pairs._1. That means for...

What's the best way to map objects into ember model from REST Web API?


json,rest,ember.js,asp.net-web-api,ember-data
The topic of this post is: my solution is too slow for a large query return. I have a Web Api serving REST results like below from a call to localhost:9090/api/invetories?id=1: [ { "inventory_id": "1", "film_id": "1", "store_id": "1", "last_update": "2/15/2006 5:09:17 AM" }, { "inventory_id": "2", "film_id": "1", "store_id":...

Stuck with nested serializer using Django Rest Framework and default user


django,api,rest,django-rest-framework,serializer
The models and serializers are described in the pastebin: http://pastebin.com/ZxzxWY7V In my database I have a user which also has a member profile and a set of credentials attached to it. Now... when I run this as is and try to pull a user using the AuthUserModelSerializer I get the...

Pyspark: using filter for feature selection


python,apache-spark,pyspark
I have an array of dimensions 500 x 26. Using the filter operation in pyspark, I'd like to pick out the columns which are listed in another array at row i. Ex: if a[i]= [1 2 3] Then pick out columns 1, 2 and 3 and all rows. Can this...

Join files using Apache Spark / Spark SQL


java,apache-spark,apache-spark-sql
I am trying to use Apache Spark for comparing two different files based on some common field, and get the values from both files and write it as output file. I am using Spark SQL for joining both files (after storing the RDD as table). Is this the correct approach?...

Default/Constant values for POST/PUT arguments with Retrofit


java,rest,retrofit
Using the Retrofit REST Client library from Square, is there anyway of providing default/constant values for POST/PUT fields in a call. I know about including constant query parameters by simply including them in the path, but this work for Body parameters. I have an API that looks similar to: POST...

Unable to select values from the select list


javascript,jquery,rest
my select list is getting populated via a service call but I cannot select any of the values from the select list. AJS.$("#select2-actor").auiSelect2( { placeholderOption: 'first', formatResult: function(actor) { return '<b>' + actor.text ; }, data: function () { var data = []; AJS.$.ajax({ dataType: 'json', type: 'GET', url: AJS.params.baseURL+"/rest/leangearsrestresource/1.0/message/list/{actor}",...

In sbt, how can we specify the version of hadoop on which spark depends?


apache-spark,sbt
Well I have a sbt project which uses spark and spark sql, but my cluster uses hadoop 1.0.4 and spark 1.2 with spark-sql 1.2, currently my build.sbt looks like this: libraryDependencies ++= Seq( "com.datastax.cassandra" % "cassandra-driver-core" % "2.1.5", "com.datastax.cassandra" % "cassandra-driver-mapping" % "2.1.5", "com.datastax.spark" % "spark-cassandra-connector_2.10" % "1.2.1", "org.apache.spark" %...

How to avoid abusive use of REST endpoint [closed]


java,javascript,rest
how can I avoid abusive use of my REST API? For example, I have a website where certain actions earn a bunch of points which are stored within a user account. So technically, when ever this action is performed, I call my REST endpoint to add the points to the...

Include package in Spark local mode


python,apache-spark,py.test,pyspark
I'm writing some unit tests for my Spark code in python. My code depends on spark-csv. In production I use spark-submit --packages com.databricks:spark-csv_2.10:1.0.3 to submit my python script. I'm using pytest to run my tests with Spark in local mode: conf = SparkConf().setAppName('myapp').setMaster('local[1]') sc = SparkContext(conf=conf) My question is, since...

Ruby on Rails - Help Adding Badges to Application


ruby-on-rails,ruby,rest,activerecord,one-to-many
I'm creating a rails application that is a backend for a mobile application. The backend is implemented with a RESTful web API. Currently I am trying to add gamification to the platform through the use of badges that can be earned by the user. Right now the badges are tied...

In simple RESTful design, does PATCH imply mapping to CRUD's (ORM's) “update” and PUT to “destroy”+“create” (to replace a resource)?


database,rest,http,orm,crud
I'm trying to create a simple REST API and map it to CRUD. I have an ORM (DataMapper) which has methods like create, update and destroy. If I get it right, given a resource {a:'foo',b:'bar',c:'baz'}, performing a PUT {b:'qux'} is supposed to replace the resource and result in the same...

Retrieving TriangleCount


scala,apache-spark,spark-graphx
I'm trying to retrieve the amount of triangles from a graph using graphX. As I'm new to both Scala and graphX, I'm currently quite stuck. I'm creating a graph from an edgefile: 1 2 1 3 2 3 This should be 1 triangle. Next I'm using the build in function...

How to extract application ID from the PySpark context


apache-spark,yarn,pyspark
A previous question recommends sc.applicationId, but it is not present in PySpark, only in scala. So, how do I figure out the application id (for yarn) of my PySpark process?...

Removing duplicates from Spark RDDPair values


python,apache-spark,pyspark
I am new to Python and also Spark. I've an pair RDD containing (key, List) but some of the values are duplicate. RDD is of the form (zipCode,streets) I want a pair RDD which does not contain duplicates. I am trying to achieve it using python. Can anyone please help...

REST API with token based authentication


angularjs,codeigniter,api,rest,token
I want to develop a web site with AngularJS. On the backend side I will use Codeigniter REST framework. I have some security issues and I don't want to start developing without fixing them on my mind. I don't want to use something like api key because it will be...

Spring Data Rest executes query but returns 500 internal Server Error


java,spring,rest,spring-boot,spring-data-rest
I am using spring boot and spring data rest and I am facing a 500 internal server error, but no messages are displayed in console. I have the following: ProdutoVendaRepository.java public interface ProdutoVendaRepository extends PagingAndSortingRepository<ProdutoVenda, Integer> { @Query("SELECT new br.com.contoso.model.VendaPorFamilia(b.nome, SUM(i.valorMultiplicado)) FROM ProdutoVenda i JOIN i.produto o JOIN o.familia b...

Convert RDD[Map[String,Double]] to RDD[(String,Double)]


scala,apache-spark,rdd
I did some calculation and returned my values in a RDD containing scala map and now I want to remove this map and want to collect all keys values in a RDD. Any help will be appreciated....

Springboot REST application should accept and produce both XML and JSON


java,xml,rest,jackson,spring-boot
I am working on Springboot REST API. My application should consume and produce both XML and JSON. I came across the Jackson json Xml dependency. <groupId>com.fasterxml.jackson.dataformat</groupId> <artifactId>jackson-dataformat-xml</artifactId> <version>2.5.4</version> </dependency> I added this in my pom.xml. Now I am able to accept xml input but the values are null when mapped...

Can't save json data to variable (or cache) with angularjs $http.get


json,angularjs,web-services,rest
I have weird angularjs problem. I'm trying to fetch data from Rest Webservice. It works fine, but I can't save json data to object. My code looks like: services.service('customerService', [ '$http', '$cacheFactory', function($http, $cacheFactory) { var cache = $cacheFactory('dataCache'); var result = cache.get('user'); this.getById = function(id){ $http.get(urlList.getCustomer + id).success(function(data, status,...

Adding authorization to routes


ruby-on-rails,rest,routes,authorization
I cannot seem to find a good example for this. I have for example, a TicketController I define a ticket resource in my routes.rb. You only need to be logged in as a customer to GET a ticket, but you must be logged in as an administrator to PUT a...

Count of second dimension in two dimension data in Spark


apache-spark
I have the data in this format (apple, laptop) (apple, laptop) (apple, ipad) (dell, laptop) I want to output to be (apple, laptop, 2) (apple, ipad, 1) (dell, laptop, 1) I wanted to do this using groupby and then count but groupby is not allowing grouping based on two columns....

@RestController throws HTTP Status 406


java,spring,rest,maven
I am working on a basic Hello World program using Spring and Restful webservices. But when I try to call my service I am getting below error message: HTTP Status 406 - description - The resource identified by this request is only capable of generating responses with characteristics not acceptable...

AngularJS $resource Custom Action for Requesting a Password Reset


angularjs,rest,ngresource,angularjs-1.3
I'm just starting to use ngResource in a project to consume my RESTful endpoints. Is this how you would implement a user password reset using $resource? Looks weird passing the email address as a URL parameter. .factory('User', ['$resource', function ($resource) { var paramDefaults = {id: '@id'} var actions = {...

How to respond in Middleware Slim PHP Framework


php,rest,authentication,middleware,slim
I am creating middleware for auth into REST API. My API is created using Slim PHP Framework ,which in case provide great features to build APIs. One of this feature is Middleware. I need to check credentials in Middleware and respond with an error (HTTP code with JSON descriptions) to...

Consuming and exposing webservices in one project (.NET)


.net,web-services,rest,soap
What is best practice concerning consuming and exposing webservices in one project? (.net) I need to create a rest webservice to expose data. The rest webservice would need to consume this data from another (SOAP) webservice from a third party. (The data needs to be merged with data present in...

Unable to upload file to Sharepoint @ Office 365 via REST


javascript,ajax,rest,sharepoint,office365
I'm having trouble creating/uploading files via Microsoft's REST API (or at least that's what they call it) for Sharepoint running on Office 365. It looks like I'm able to authenticate all right, but I'm getting 403 Forbidden when I try to create a file. The same user can upload a...

spark-mllib: Error “reassignment to val” in source code


apache-spark,mllib
I'm using IDEA SBT project to test spark-mllib code. Here is build.sbt: name := "SparkTest" version := "1.0" scalaVersion := "2.11.6" libraryDependencies ++= Seq( "org.apache.spark" %% "spark-core" % "1.2.0", "org.apache.spark" %% "spark-mllib" % "1.2.0" ) After all the import and compile work has done, I found some errors in lib...

How do I flatMap a row of arrays into multiple rows?


apache-spark,apache-spark-sql
After parsing some jsons I have a one-column DataFrame of arrays scala> val jj =sqlContext.jsonFile("/home/aahu/jj2.json") res68: org.apache.spark.sql.DataFrame = [r: array<bigint>] scala> jj.first() res69: org.apache.spark.sql.Row = [List(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)] I'd like to explode each row out into several rows. How? edit: Original json file:...

Mailchimp Ecommerce360 Javascript Implementation


javascript,rest,e-commerce,mailchimp
Wondering if anyone can provide an example of how to pass a request to the /ecomm/order-add function of the Mailchimp API using javascript. This is critical for making use of Mailchimp's Ecommerce360 tracking. Here is documentation from Mailchimps API: https://apidocs.mailchimp.com/api/2.0/ecomm/order-add.php...

Is the DStream return by updateStateByKey function only contains one RDD?


apache-spark,spark-streaming,apache-spark-sql,pyspark
Is the DStream return by updateStateByKey function only contains one RDD? If not,Under what circumstances will the DStream contains more than one RDD?

REST api : correctly ask for an action


api,rest,endpoint
I'm currently working on a REST api. I've read a few times how to handle endpoints the right way, using the protocol (post, put, ...) to define which action should be made. Let's say I have a list of quotes. I have : a GET endpoint /quotes that let me...

Link to another resource in a REST API: by its ID, or by its URL?


json,api,rest,api-design,hateoas
I am creating some APIs using apiary, so the language used is JSON. Let's assume I need to represent this resource: { "id" : 9, "name" : "test", "customer_id" : 12, "user_id" : 1, "store_id" : 3, "notes" : "Lorem ipsum example long text" } Is it correct to refer...

Remove resource wrapper from CakePHP REST API JSON


rest,cakephp,cakephp-2.2
My question is similar to this one. I understand the answer given there. The OP of that question doesn't seem to have my issue. I am using CakePHP 2.2.3. I am fetching a resource like this: http://cakephpsite/lead_posts.json and it returns results like this: [ { "LeadPost": { "id": "1", "fieldA":...

Spark streaming transform function


java,apache-spark,spark-streaming
I am having compilation errors in the transform function for spark streaming. Specifically seem to be missing finalizing the DStream variable or something similar. I have copied from the amplab tutorials so slightly confused... Here is the code, the problem is in the transform function towards the end. Here is...

Error when running job that queries against Cassandra via Spark SQL through Spark Jobserver


cassandra,apache-spark,apache-spark-sql,spark-jobserver,spark-cassandra-connector
So I'm trying to run job that simply runs a query against cassandra using spark-sql, the job is submitted fine and the job starts fine. This code works when it is not being run through spark jobserver (when simply using spark submit). Could someone tell my what is wrong with...

RESTful routing best practice when referencing current_user from route?


ruby-on-rails,rest
I have typical RESTful routes for a user: /user/:id /user/:id/edit /user/:id/newsfeed However the /user/:id/edit route can only be accessed when the id equals the current_user's id. As I only want the current_user to have access to edit its profile. I don't want other users able to edit profiles that don't...

Using .update with nested Serializer to post Image


django,rest,django-models,django-rest-framework,imagefield
I have an ImageField. When I update it with the .update command, it does not properly save. It validates, returns a successful save, and says it is good. However, the image is never saved (I don't see it in my /media like I do my other pictures), and when it...