Skip to content

SPARK-1565, update examples to be used with spark-submit script. #552

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

ScrapCodes
Copy link
Member

Commit for initial feedback, basically I am curious if we should prompt user for providing args esp. when its mandatory. And can we skip if they are not ?

Also few other things that did not work like
bin/spark-submit examples/target/scala-2.10/spark-examples-1.0.0-SNAPSHOT-hadoop1.0.4.jar --class org.apache.spark.examples.SparkALS --arg 100 500 10 5 2

Not all the args get passed properly, may be I have messed up something will try to sort it out hopefully.

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@AmplabJenkins
Copy link

Merged build finished.

@AmplabJenkins
Copy link

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14484/

@@ -74,8 +74,8 @@ class SparkContext(config: SparkConf) extends Logging {
* be generated using [[org.apache.spark.scheduler.InputFormatInfo.computePreferredLocations]]
* from a list of input files or InputFormats for the application.
*/
@DeveloperApi
def this(config: SparkConf, preferredNodeLocationData: Map[String, Set[SplitInfo]]) = {
@DeveloperApi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This indentation change seems wrong

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure but other methods are at this corrected indentation level. Is there some other reason for it being wrong?

Like this https://github.com/apache/spark/pull/552/files#diff-364713d7776956cb8b0a771e9b62f82dL90

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I meant the body of the function should be indented only 2 spaces from the signature, not 4.

@pwendell
Copy link
Contributor

This seems like a good start. Hey @ScrapCodes we changed the format of spark-submit a bit to no longer use --arg, so you should try that.

I don't think it's necessary to prompt the users for arguments, I think just removing all the cases where there was a master argument is sufficient for now.

@ScrapCodes
Copy link
Member Author

I don't think it's necessary to prompt the users for arguments, I think just removing all the cases where there was a master argument is sufficient for now.

Hm.. I will do so !, but in many cases it can be really hard to guess the parameters without reading the code of examples.

@ScrapCodes ScrapCodes changed the title [WIP] SPARK-1565, update examples to be used with spark-submit script. SPARK-1565, update examples to be used with spark-submit script. Apr 28, 2014
@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@AmplabJenkins
Copy link

Merged build finished. All automated tests passed.

@AmplabJenkins
Copy link

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14529/

@@ -30,22 +30,15 @@ import org.apache.spark.rdd.RDD

object WikipediaPageRankStandalone {
def main(args: Array[String]) {
if (args.length < 5) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we still want to have a usage here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we should and I think same for all those who accept compulsory arguments. Should I just go ahead and fix that ?

@AmplabJenkins
Copy link

Build triggered.

@AmplabJenkins
Copy link

Build started.

@ScrapCodes
Copy link
Member Author

Do you want this change to go for streaming as well !, because for some things it may not make sense.

And then how do people stop it ?

@AmplabJenkins
Copy link

Build finished. All automated tests passed.

@AmplabJenkins
Copy link

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14565/

@pwendell
Copy link
Contributor

pwendell commented May 1, 2014

@ScrapCodes yes let's update the streaming examples too. If people run it in driver mode, this will be exactly the same as the current examples.

@AmplabJenkins
Copy link

Build triggered.

@AmplabJenkins
Copy link

Build started.

@AmplabJenkins
Copy link

Build finished.

@AmplabJenkins
Copy link

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14622/

@AmplabJenkins
Copy link

Build triggered.

@AmplabJenkins
Copy link

Build started.

@AmplabJenkins
Copy link

Build finished. All automated tests passed.

@AmplabJenkins
Copy link

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14623/

Integer slices = (args.length > 1) ? Integer.parseInt(args[1]): 2;
SparkConf sparkConf = new org.apache.spark.SparkConf().setAppName("JavaHdfsLR");
JavaSparkContext sc = new JavaSparkContext(sparkConf);
Integer slices = (args.length > 0) ? Integer.parseInt(args[0]): 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: space before colon

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we leave space before colon ? I think the convention was nospace before colon and single space after it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, don't put the full package name (org.apache.spark.SparkConf) here since you imported it above

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, in scala we have no space before colon, but this is a common pattern in Java (e.g. bool ? 1 : 2 is short-hand for if (bool) { 1 } else { 2 }). We do this elsewhere in other examples actually.

@AmplabJenkins
Copy link

Build started.

@AmplabJenkins
Copy link

Build finished. All automated tests passed.

@AmplabJenkins
Copy link

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14772/

@pwendell
Copy link
Contributor

pwendell commented May 7, 2014

@ScrapCodes you'll need to merge this with master - unfortunately there was another patch that renamed/moved some of the example files.

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@AmplabJenkins
Copy link

Merged build finished. All automated tests passed.

1 similar comment
@AmplabJenkins
Copy link

Merged build finished. All automated tests passed.

@AmplabJenkins
Copy link

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14808/

@AmplabJenkins
Copy link

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14809/

@ScrapCodes
Copy link
Member Author

@pwendell Done !

@andrewor14
Copy link
Contributor

This LGTM. Thanks @ScrapCodes for all the effort!

@pwendell
Copy link
Contributor

pwendell commented May 8, 2014

Thanks @ScrapCodes - sorry you had to up-merge this... good stuff :)

@asfgit asfgit closed this in 44dd57f May 8, 2014
asfgit pushed a commit that referenced this pull request May 8, 2014
Commit for initial feedback, basically I am curious if we should prompt user for providing args esp. when its mandatory. And can we skip if they are not ?

Also few other things that did not work like
`bin/spark-submit examples/target/scala-2.10/spark-examples-1.0.0-SNAPSHOT-hadoop1.0.4.jar --class org.apache.spark.examples.SparkALS --arg 100 500 10 5 2`

Not all the args get passed properly, may be I have messed up something will try to sort it out hopefully.

Author: Prashant Sharma <[email protected]>

Closes #552 from ScrapCodes/SPARK-1565/update-examples and squashes the following commits:

669dd23 [Prashant Sharma] Review comments
2727e70 [Prashant Sharma] SPARK-1565, update examples to be used with spark-submit script.
(cherry picked from commit 44dd57f)

Signed-off-by: Patrick Wendell <[email protected]>
pwendell pushed a commit to pwendell/spark that referenced this pull request May 12, 2014
.

tex formulas in the documentation

using mathjax.
and spliting the MLlib documentation by techniques

see jira
https://spark-project.atlassian.net/browse/MLLIB-19
and
https://github.com/shivaram/spark/compare/mathjax

Author: Martin Jaggi <[email protected]>

== Merge branch commits ==

commit 0364bfabbfc347f917216057a20c39b631842481
Author: Martin Jaggi <[email protected]>
Date:   Fri Feb 7 03:19:38 2014 +0100

    minor polishing, as suggested by @pwendell

commit dcd2142c164b2f602bf472bb152ad55bae82d31a
Author: Martin Jaggi <[email protected]>
Date:   Thu Feb 6 18:04:26 2014 +0100

    enabling inline latex formulas with $.$

    same mathjax configuration as used in math.stackexchange.com

    sample usage in the linear algebra (SVD) documentation

commit bbafafd2b497a5acaa03a140bb9de1fbb7d67ffa
Author: Martin Jaggi <[email protected]>
Date:   Thu Feb 6 17:31:29 2014 +0100

    split MLlib documentation by techniques

    and linked from the main mllib-guide.md site

commit d1c5212b93c67436543c2d8ddbbf610fdf0a26eb
Author: Martin Jaggi <[email protected]>
Date:   Thu Feb 6 16:59:43 2014 +0100

    enable mathjax formula in the .md documentation files

    code by @shivaram

commit d73948db0d9bc36296054e79fec5b1a657b4eab4
Author: Martin Jaggi <[email protected]>
Date:   Thu Feb 6 16:57:23 2014 +0100

    minor update on how to compile the documentation
asfgit pushed a commit that referenced this pull request May 17, 2014
A recent PR (#552) fixed this for all Scala / Java examples. We need to do it for python too.

Note that this blocks on #799, which makes `bin/pyspark` go through Spark submit. With only the changes in this PR, the only way to run these examples is through Spark submit. Once #799 goes in, you can use `bin/pyspark` to run them too. For example,

```
bin/pyspark examples/src/main/python/pi.py 100 --master local-cluster[4,1,512]
```

Author: Andrew Or <[email protected]>

Closes #802 from andrewor14/python-examples and squashes the following commits:

cf50b9f [Andrew Or] De-indent python comments (minor)
50f80b1 [Andrew Or] Remove pyFiles from SparkContext construction
c362f69 [Andrew Or] Update docs to use spark-submit for python applications
7072c6a [Andrew Or] Merge branch 'master' of github.com:apache/spark into python-examples
427a5f0 [Andrew Or] Update docs
d32072c [Andrew Or] Remove <master> from examples + update usages
asfgit pushed a commit that referenced this pull request May 17, 2014
A recent PR (#552) fixed this for all Scala / Java examples. We need to do it for python too.

Note that this blocks on #799, which makes `bin/pyspark` go through Spark submit. With only the changes in this PR, the only way to run these examples is through Spark submit. Once #799 goes in, you can use `bin/pyspark` to run them too. For example,

```
bin/pyspark examples/src/main/python/pi.py 100 --master local-cluster[4,1,512]
```

Author: Andrew Or <[email protected]>

Closes #802 from andrewor14/python-examples and squashes the following commits:

cf50b9f [Andrew Or] De-indent python comments (minor)
50f80b1 [Andrew Or] Remove pyFiles from SparkContext construction
c362f69 [Andrew Or] Update docs to use spark-submit for python applications
7072c6a [Andrew Or] Merge branch 'master' of github.com:apache/spark into python-examples
427a5f0 [Andrew Or] Update docs
d32072c [Andrew Or] Remove <master> from examples + update usages
(cherry picked from commit cf6cbe9)

Signed-off-by: Patrick Wendell <[email protected]>
pdeyhim pushed a commit to pdeyhim/spark-1 that referenced this pull request Jun 25, 2014
Commit for initial feedback, basically I am curious if we should prompt user for providing args esp. when its mandatory. And can we skip if they are not ?

Also few other things that did not work like
`bin/spark-submit examples/target/scala-2.10/spark-examples-1.0.0-SNAPSHOT-hadoop1.0.4.jar --class org.apache.spark.examples.SparkALS --arg 100 500 10 5 2`

Not all the args get passed properly, may be I have messed up something will try to sort it out hopefully.

Author: Prashant Sharma <[email protected]>

Closes apache#552 from ScrapCodes/SPARK-1565/update-examples and squashes the following commits:

669dd23 [Prashant Sharma] Review comments
2727e70 [Prashant Sharma] SPARK-1565, update examples to be used with spark-submit script.
pdeyhim pushed a commit to pdeyhim/spark-1 that referenced this pull request Jun 25, 2014
A recent PR (apache#552) fixed this for all Scala / Java examples. We need to do it for python too.

Note that this blocks on apache#799, which makes `bin/pyspark` go through Spark submit. With only the changes in this PR, the only way to run these examples is through Spark submit. Once apache#799 goes in, you can use `bin/pyspark` to run them too. For example,

```
bin/pyspark examples/src/main/python/pi.py 100 --master local-cluster[4,1,512]
```

Author: Andrew Or <[email protected]>

Closes apache#802 from andrewor14/python-examples and squashes the following commits:

cf50b9f [Andrew Or] De-indent python comments (minor)
50f80b1 [Andrew Or] Remove pyFiles from SparkContext construction
c362f69 [Andrew Or] Update docs to use spark-submit for python applications
7072c6a [Andrew Or] Merge branch 'master' of github.com:apache/spark into python-examples
427a5f0 [Andrew Or] Update docs
d32072c [Andrew Or] Remove <master> from examples + update usages
gzm55 pushed a commit to MediaV/spark that referenced this pull request Jul 17, 2014
.

tex formulas in the documentation

using mathjax.
and spliting the MLlib documentation by techniques

see jira
https://spark-project.atlassian.net/browse/MLLIB-19
and
https://github.com/shivaram/spark/compare/mathjax

Author: Martin Jaggi <[email protected]>

== Merge branch commits ==

commit 0364bfabbfc347f917216057a20c39b631842481
Author: Martin Jaggi <[email protected]>
Date:   Fri Feb 7 03:19:38 2014 +0100

    minor polishing, as suggested by @pwendell

commit dcd2142c164b2f602bf472bb152ad55bae82d31a
Author: Martin Jaggi <[email protected]>
Date:   Thu Feb 6 18:04:26 2014 +0100

    enabling inline latex formulas with $.$

    same mathjax configuration as used in math.stackexchange.com

    sample usage in the linear algebra (SVD) documentation

commit bbafafd2b497a5acaa03a140bb9de1fbb7d67ffa
Author: Martin Jaggi <[email protected]>
Date:   Thu Feb 6 17:31:29 2014 +0100

    split MLlib documentation by techniques

    and linked from the main mllib-guide.md site

commit d1c5212b93c67436543c2d8ddbbf610fdf0a26eb
Author: Martin Jaggi <[email protected]>
Date:   Thu Feb 6 16:59:43 2014 +0100

    enable mathjax formula in the .md documentation files

    code by @shivaram

commit d73948db0d9bc36296054e79fec5b1a657b4eab4
Author: Martin Jaggi <[email protected]>
Date:   Thu Feb 6 16:57:23 2014 +0100

    minor update on how to compile the documentation

Conflicts:
	docs/mllib-guide.md
@ScrapCodes ScrapCodes deleted the SPARK-1565/update-examples branch June 3, 2015 06:01
helenyugithub pushed a commit to helenyugithub/spark that referenced this pull request Aug 20, 2019
Implement AQE again, with union fix.
bzhaoopenstack pushed a commit to bzhaoopenstack/spark that referenced this pull request Sep 11, 2019
There is some unknown error that causes
spark job exit when installing openjdk8,
extract it to be a role and run it at
the begaining of the job to debug the
root causes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants