DOC: added string processing comparison with R #16502
added string comparison functions section in documentation comparison_with_r.rst
@@ -530,6 +530,103 @@ For more details and examples see :ref:`categorical introduction <categorical>`
:ref:`differences to R's factor <categorical.rfactor>`.
Add a section label here, like: _compare_with_r.string
(actually, if you can add labels to some of the sub-sections too, that would be great). The label goes directly before the sub-section heading.
If you aren't familiar with Sphinx, you need to start the line with `..`
http://www.sphinx-doc.org/en/stable/markup/inline.html#cross-referencing-arbitrary-locations
``nchar`` includes leading and trailing blanks. Use ``nchar`` and ``trimws``
to exclude leading and trailing blanks.

.. code-block:: none
Is there an R highlighter?
http://pygments.org/docs/lexers/#lexers-for-the-r-s-languages
`r` or `rconsole` should work. Probably `rconsole` if you're showing output.
yeah I realize that these produce the same output, so we don't actually show the output (I think) elsewhere, so maybe that is ok (though obviously the code formatting would be nice)
``len`` includes leading and trailing blanks. Use ``len`` and ``strip``
to exclude leading and trailing blanks.

.. code-block:: none
-> ipython:: python (and for all running pandas code)
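For reference, a minimal sketch of the behavior this hunk describes, using plain Python `len` and `str.strip` on the same four values as the R example in this PR. The variable names are illustrative and do not come from the diff.

```python
# Same four values as the R data.frame in this PR.
colors = ['red', ' blue', 'green ', ' yellow ']

# len counts leading and trailing blanks.
raw_lengths = [len(c) for c in colors]

# strip removes surrounding blanks before the length is measured.
stripped_lengths = [len(c.strip()) for c in colors]

print(raw_lengths)       # [3, 5, 6, 8]
print(stripped_lengths)  # [3, 4, 5, 6]
```

In the docs themselves this would presumably go through the `.str` accessor (`Series.str.len`, `Series.str.strip`) inside an `ipython` directive, as requested above.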
df <- data.frame(color = c('red', ' blue', 'green ', ' yellow '))
nchar(as.character(df$color))
nchar(trimws(as.character(df$color)))
would be nice to show output here (IOW run the R code and show the output too if you can)
Codecov Report
@@ Coverage Diff @@
## master #16502 +/- ##
=======================================
Coverage 90.43% 90.43%
=======================================
Files 161 161
Lines 51045 51045
=======================================
Hits 46161 46161
Misses 4884 4884
Continue to review full report at Codecov.
@gfyoung can you review
``find`` function. ``find`` searches for the first position of the
substring. If the substring is found, the function returns its
position. Keep in mind that Python indexes are zero based whereas
R indexes are 1 based.
Simpler: "Keep in mind that Python 0-indexes, whereas R 1-indexes"
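To make the indexing difference concrete, a small sketch with plain `str.find`. The sample string and variable names are made up for illustration.

```python
s = 'pandas'

# find returns the zero-based position of the first match.
pos = s.find('das')

# Add 1 to get the position a 1-indexed language such as R would report.
r_style_pos = pos + 1

# When the substring is absent, find returns -1 instead of raising.
missing = s.find('xyz')

print(pos, r_style_pos, missing)  # 3 4 -1
```

The `-1` return for a missing substring is worth a sentence in the docs, since it is easy to mistake for a valid position.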
In Python, you can use ``[]`` notation to extract a substring
from a string by position locations. Keep in mind that Python
indexes are zero-based.
Slightly simpler: "Keep in mind that Python 0-indexes"
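A short sketch of the `[]` slicing this paragraph refers to. The sample string is an assumption for illustration. Note that the end of a Python slice is exclusive, which is a second difference from R's `substr` besides the starting index.

```python
s = 'abcdefg'

# Slices are zero-based and exclude the end position.
first_three = s[0:3]

# Characters 3 through 5 in 1-based terms, i.e. what R's substr(s, 3, 5) returns.
middle = s[2:5]

print(first_three)  # abc
print(middle)       # cde
```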
In addition, Python's ``title`` function changes the string to
proper case.

.. code-block:: none
Same comment as above
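A sketch of the case conversions, assuming the surrounding hunk also covers `upper` and `lower` (the "In addition" suggests so, but those lines are not shown here). The sample string is illustrative.

```python
s = 'hello wORLD'

# title capitalizes the first letter of each word and lowercases the rest.
proper = s.title()

# upper and lower convert the whole string.
upper = s.upper()
lower = s.lower()

print(proper)  # Hello World
print(upper)   # HELLO WORLD
print(lower)   # hello world
```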
can you rebase / update
can you update according to comments
closing as stale. if you'd like to continue working, pls ping.
#13229
Added string processing section to comparison_with_r DOC. Used info from http://blog.dominodatalab.com/using-r-and-python-for-common-sas-functions.
This completes the issue.