SPARK-1565, update examples to be used with spark-submit script. #552
Conversation
Merged build triggered.
Merged build started.
Merged build finished.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14484/
```
@@ -74,8 +74,8 @@ class SparkContext(config: SparkConf) extends Logging {
   * be generated using [[org.apache.spark.scheduler.InputFormatInfo.computePreferredLocations]]
   * from a list of input files or InputFormats for the application.
   */
  @DeveloperApi
  def this(config: SparkConf, preferredNodeLocationData: Map[String, Set[SplitInfo]]) = {
    @DeveloperApi
```
This indentation change seems wrong
I am not sure, but other methods are at this corrected indentation level. Is there some other reason it would be wrong? Like this: https://github.com/apache/spark/pull/552/files#diff-364713d7776956cb8b0a771e9b62f82dL90
Sorry, I meant that the body of the function should be indented only 2 spaces from the signature, not 4.
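For reference, a minimal sketch of the suggested layout (the constructor body shown here is assumed from context, not quoted from the diff):

```scala
@DeveloperApi
def this(config: SparkConf, preferredNodeLocationData: Map[String, Set[SplitInfo]]) = {
  this(config)  // body indented 2 spaces from the signature
  this.preferredNodeLocationData = preferredNodeLocationData
}
```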
This seems like a good start. Hey @ScrapCodes, we changed the format of spark-submit a bit so that it no longer uses `--arg`; application arguments now follow the application jar directly. I don't think it's necessary to prompt the users for arguments; I think just removing all the cases where there was a master argument is sufficient for now.
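As a sketch, an invocation in the newer format might look like this (class and arguments borrowed from the commit message below; the exact flags available at the time may have differed):

```
bin/spark-submit \
  --class org.apache.spark.examples.SparkALS \
  examples/target/scala-2.10/spark-examples-1.0.0-SNAPSHOT-hadoop1.0.4.jar \
  100 500 10 5 2
```

Note that the application arguments follow the jar directly instead of being passed via `--arg`.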
Hm, I will do so, but in many cases it can be really hard to guess the parameters without reading the code of the examples.
Merged build triggered.
Merged build started.
Merged build finished. All automated tests passed.
All automated tests passed.
```
@@ -30,22 +30,15 @@ import org.apache.spark.rdd.RDD

object WikipediaPageRankStandalone {
  def main(args: Array[String]) {
    if (args.length < 5) {
```
Don't we still want to have a usage here?
I suppose we should, and I think the same goes for all the examples that take mandatory arguments. Should I just go ahead and fix that?
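A sketch of what such a usage check might look like (the argument names are illustrative, not taken from the actual example):

```scala
def main(args: Array[String]) {
  if (args.length < 4) {
    System.err.println(
      "Usage: WikipediaPageRankStandalone <inputFile> <threshold> <numIterations> <usePartitioner>")
    System.exit(-1)
  }
  // ... rest of the example
}
```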
Build triggered.
Build started.
Do you want this change to go in for the streaming examples as well? For some of them it may not make sense. And then how do people stop them?
Build finished. All automated tests passed.
All automated tests passed.
@ScrapCodes yes, let's update the streaming examples too. If people run them in driver mode, this will be exactly the same as the current examples.
Build triggered.
Build started.
Build finished.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14622/
Build triggered.
Build started.
Build finished. All automated tests passed.
All automated tests passed.
```
-    Integer slices = (args.length > 1) ? Integer.parseInt(args[1]): 2;
+    SparkConf sparkConf = new org.apache.spark.SparkConf().setAppName("JavaHdfsLR");
+    JavaSparkContext sc = new JavaSparkContext(sparkConf);
+    Integer slices = (args.length > 0) ? Integer.parseInt(args[0]): 2;
```
nit: space before colon
Do we put a space before the colon? I thought the convention was no space before the colon and a single space after it.
Also, don't put the full package name (org.apache.spark.SparkConf) here since you imported it above
Yeah, in Scala we have no space before a colon, but this is a common pattern in Java (e.g. `bool ? 1 : 2` is shorthand for `if (bool) { 1 } else { 2 }`). We actually do this elsewhere in other examples.
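For comparison, the Scala examples would express the same default with an `if` expression rather than a ternary (a sketch assuming the same `args` convention):

```scala
val slices = if (args.length > 0) args(0).toInt else 2
```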
Build started.
Build finished. All automated tests passed.
All automated tests passed.
@ScrapCodes you'll need to merge this with master - unfortunately there was another patch that renamed/moved some of the example files. |
Merged build triggered.
Merged build started.
Merged build finished. All automated tests passed.
All automated tests passed.
@pwendell Done!
This LGTM. Thanks @ScrapCodes for all the effort! |
Thanks @ScrapCodes - sorry you had to up-merge this... good stuff :) |
Commit for initial feedback. Basically, I am curious whether we should prompt the user to provide args, especially when they are mandatory, and whether we can skip the prompt when they are not. Also, a few other things did not work, like:

`bin/spark-submit examples/target/scala-2.10/spark-examples-1.0.0-SNAPSHOT-hadoop1.0.4.jar --class org.apache.spark.examples.SparkALS --arg 100 500 10 5 2`

Not all the args get passed properly; maybe I have messed something up and will try to sort it out.

Author: Prashant Sharma <[email protected]>

Closes #552 from ScrapCodes/SPARK-1565/update-examples and squashes the following commits:

669dd23 [Prashant Sharma] Review comments
2727e70 [Prashant Sharma] SPARK-1565, update examples to be used with spark-submit script.

(cherry picked from commit 44dd57f)
Signed-off-by: Patrick Wendell <[email protected]>
A recent PR (#552) fixed this for all Scala / Java examples. We need to do it for Python too. Note that this blocks on #799, which makes `bin/pyspark` go through Spark submit. With only the changes in this PR, the only way to run these examples is through Spark submit. Once #799 goes in, you can use `bin/pyspark` to run them too. For example:

```
bin/pyspark examples/src/main/python/pi.py 100 --master local-cluster[4,1,512]
```

Author: Andrew Or <[email protected]>

Closes #802 from andrewor14/python-examples and squashes the following commits:

cf50b9f [Andrew Or] De-indent python comments (minor)
50f80b1 [Andrew Or] Remove pyFiles from SparkContext construction
c362f69 [Andrew Or] Update docs to use spark-submit for python applications
7072c6a [Andrew Or] Merge branch 'master' of github.com:apache/spark into python-examples
427a5f0 [Andrew Or] Update docs
d32072c [Andrew Or] Remove <master> from examples + update usages