add codepoint-based string functions as Data.String.CodePoints #79
Merged
48 commits
428b995 WIP code point based string functions (michaelficarra)
dc0577c more progress (michaelficarra)
25572de minor stuff (michaelficarra)
8279da8 count (michaelficarra)
292e0de drop and take (michaelficarra)
fd91b0b length (michaelficarra)
8387641 singleton (michaelficarra)
fb47387 splitAt (michaelficarra)
5a6cfd0 use String.fromCodePoint in singleton implementation when available (michaelficarra)
3003c09 re-export Data.String (michaelficarra)
75117d2 uncons (michaelficarra)
ecfbf0b re-arrange imports (michaelficarra)
d5b6d92 re-arrange JS exports (michaelficarra)
8860295 fix count; implement dropWhile and takeWhile (michaelficarra)
a6855b4 indexOf and lastIndexOf (michaelficarra)
8c55257 add some initial tests and fix some bugs (michaelficarra)
a26afdf trailing whitespace (michaelficarra)
c1ff8c5 finished the tests (michaelficarra)
04154a5 fix linting errors (michaelficarra)
c798dfe change re-export of Data.String (michaelficarra)
71cdcf2 bugfixes (michaelficarra)
2c2418a move fromCodePoint from JS to purs (michaelficarra)
46e9545 move codePointAt0 from JS to purs (michaelficarra)
c59f340 remove TODOs (michaelficarra)
71c5156 use charCodeAt from Data.String.Unsafe (michaelficarra)
e8ca6f3 open imports for Prelude (michaelficarra)
8d6d263 add some comments (michaelficarra)
8e99c39 remove unused parameters (michaelficarra)
557186c remove some redundant JS implementations (michaelficarra)
4ec116b remove unnecessary qualification in import (michaelficarra)
5490d46 prefer 10e3 over 1024e1 (michaelficarra)
af2db11 prefer string iteration over Array.from in _codePointAt FFI function (michaelficarra)
205838c remove Newtype instance for CodePoint (michaelficarra)
3b57fd4 remove duplication (michaelficarra)
7eac69e remove unused function (michaelficarra)
cde0d26 bug fix for unsafeCodePointAt0Fallback (michaelficarra)
4292a8b consistent code unit variable names (michaelficarra)
0d81e0b bug fix lastIndexOf' (michaelficarra)
370af7c add comments and complexity notes (michaelficarra)
cef521a update Data.String import warning comment (michaelficarra)
b38eb80 refactor to avoid lists dep; better complexity adherence in fallbacks (michaelficarra)
4f3d71d remove fallback to Array.from in codePointAt JS implementation for now (michaelficarra)
e3cea19 prefer let over where (michaelficarra)
db3eba3 change JS implementation of count to use string iterator if possible (michaelficarra)
3a24c8d update comments (michaelficarra)
82a502f pull functions out of where clauses (michaelficarra)
085022e change complexity documentation for drop{,While} and take{,While} (michaelficarra)
6edb70f forgot about a prime (michaelficarra)
@@ -0,0 +1,108 @@
"use strict";
/* global Symbol */

// Feature detection for the ES2015 APIs used below.
var hasArrayFrom = typeof Array.from === "function";
var hasStringIterator =
  typeof Symbol !== "undefined" &&
  Symbol != null &&
  typeof Symbol.iterator !== "undefined" &&
  typeof String.prototype[Symbol.iterator] === "function";
// fromCodePoint is a static method on String, not on String.prototype.
var hasFromCodePoint = typeof String.fromCodePoint === "function";
var hasCodePointAt = typeof String.prototype.codePointAt === "function";

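// Code point at the start of a non-empty string; uses the supplied
// fallback when String.prototype.codePointAt is unavailable.
exports._unsafeCodePointAt0 = function (fallback) {
  return hasCodePointAt
    ? function (str) { return str.codePointAt(0); }
    : fallback;
};
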
exports._codePointAt = function (fallback) {
  return function (Just) {
    return function (Nothing) {
      return function (unsafeCodePointAt0) {
        return function (index) {
          return function (str) {
            // str.length counts code units, which is an upper bound on the
            // number of code points, so this rejects obviously invalid indices.
            var length = str.length;
            if (index < 0 || index >= length) return Nothing;
            if (hasStringIterator) {
              // The string iterator yields one code point per step, so we can
              // stop as soon as the requested index is reached.
              var iter = str[Symbol.iterator]();
              for (var i = index;; --i) {
                var o = iter.next();
                if (o.done) return Nothing;
                if (i === 0) return Just(unsafeCodePointAt0(o.value));
              }
            }
            return fallback(index)(str);
          };
        };
      };
    };
  };
};

// Counts the leading code points that satisfy pred.
exports._count = function (fallback) {
  return function (unsafeCodePointAt0) {
    if (hasStringIterator) {
      return function (pred) {
        return function (str) {
          var iter = str[Symbol.iterator]();
          for (var cpCount = 0; ; ++cpCount) {
            var o = iter.next();
            if (o.done) return cpCount;
            var cp = unsafeCodePointAt0(o.value);
            if (!pred(cp)) return cpCount;
          }
        };
      };
    }
    return fallback;
  };
};

exports._fromCodePointArray = function (singleton) {
  return hasFromCodePoint
    ? function (cps) {
      // Function.prototype.apply will fail for very large second parameters,
      // so we don't use it for arrays with 10,000 or more entries.
      if (cps.length < 10e3) {
        return String.fromCodePoint.apply(String, cps);
      }
      return cps.map(singleton).join("");
    }
    : function (cps) {
      return cps.map(singleton).join("");
    };
};

exports._singleton = function (fallback) {
  return hasFromCodePoint ? String.fromCodePoint : fallback;
};

// Takes the first n code points of str.
exports._take = function (fallback) {
  return function (n) {
    if (hasStringIterator) {
      return function (str) {
        var accum = "";
        var iter = str[Symbol.iterator]();
        for (var i = 0; i < n; ++i) {
          var o = iter.next();
          if (o.done) return accum;
          // o.value is a one-code-point string (one or two code units).
          accum += o.value;
        }
        return accum;
      };
    }
    return fallback(n);
  };
};

exports._toCodePointArray = function (fallback) {
  return function (unsafeCodePointAt0) {
    if (hasArrayFrom) {
      return function (str) {
        return Array.from(str, unsafeCodePointAt0);
      };
    }
    return fallback;
  };
};
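
The fallback arguments in the file above are supplied from the PureScript side and do not appear in this diff. As a rough illustration of what a fallback for _unsafeCodePointAt0 has to do on engines without codePointAt, here is a minimal JavaScript sketch that decodes a leading surrogate pair by hand. The name codePointAt0Fallback is hypothetical; the PR's real fallback is written in PureScript and may differ in detail.

```javascript
// Hypothetical helper, not part of the PR: returns the code point that
// starts at index 0 of a non-empty string, using only charCodeAt.
function codePointAt0Fallback(str) {
  var cu0 = str.charCodeAt(0);
  // A lead (high) surrogate followed by a trail (low) surrogate encodes
  // one code point outside the Basic Multilingual Plane.
  if (cu0 >= 0xD800 && cu0 <= 0xDBFF && str.length > 1) {
    var cu1 = str.charCodeAt(1);
    if (cu1 >= 0xDC00 && cu1 <= 0xDFFF) {
      return ((cu0 - 0xD800) * 0x400) + (cu1 - 0xDC00) + 0x10000;
    }
  }
  // Otherwise the first code unit is itself the code point
  // (this includes lone surrogates).
  return cu0;
}
```

In this sketch a lone surrogate is returned unchanged, which matches how String.prototype.codePointAt treats unpaired surrogates.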
Why have three separate code paths here? Is Array.from likely to be faster than using an iterator, which in turn is likely to be faster than the PureScript fallback? Are there many platforms which provide String iteration but not Array.from?
@hdgarrood Array.from is likely faster except in cases where we are more likely to only look at earlier code points, since it requires a scan of the entire string. Of course, implementations may be doing really clever things that make this naive reasoning worthless without real world benchmarks.

I've removed some paths that, after review, I felt were unlikely to be any better supported or faster than the alternative paths.
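
To make the trade-off in this reply concrete, here is a small sketch of the two strategies for looking up a single code point by index. The function names are illustrative and not taken from the PR, and the code assumes an ES2015+ engine with Array.from, string iteration and codePointAt. It makes no performance claim; it only shows where the full scan and the early exit happen.

```javascript
// Illustrative only; not the PR's implementation.

// Converts the whole string to an array of code points before indexing,
// so the entire string is scanned even for index 0.
function codePointAtViaArrayFrom(index, str) {
  var cps = Array.from(str, function (ch) { return ch.codePointAt(0); });
  return index >= 0 && index < cps.length ? cps[index] : null;
}

// Walks the string iterator and stops as soon as the requested
// code point has been reached.
function codePointAtViaIterator(index, str) {
  if (index < 0) return null;
  var iter = str[Symbol.iterator]();
  for (var i = index; ; --i) {
    var o = iter.next();
    if (o.done) return null;
    if (i === 0) return o.value.codePointAt(0);
  }
}
```

Both functions return the same results. Which one is faster for a given workload would need to be measured on the target engines.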