Refactor/cleanup of String/ByteString usage #4666
This is the proper type, as 'getFileContents' is supposed to return contents read in binary file mode, and prior to this patch, `[Char]` was abused to return binary data.
The API was asymmetric, as there was `fromUTF8(L)BS` but not the dual operations. The plan is to refactor all occurrences of

- `fromUTF8 :: String -> String`
- `toUTF8 :: String -> String`

until `fromUTF8`/`toUTF8` are unused, at which point we can officially deprecate or remove them.
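To illustrate what a typed pair of dual conversions looks like, here is a minimal sketch. It is not Cabal's implementation: the primed names `fromUTF8BS'` and `toUTF8BS'` are hypothetical stand-ins, and building them on the `text` package is an assumption made for brevity. The point is only that the types now say which side is bytes and which side is text, which `String -> String` cannot express.

```haskell
import qualified Data.ByteString          as BS
import qualified Data.Text                as T
import qualified Data.Text.Encoding       as TE
import qualified Data.Text.Encoding.Error as TEE

-- | Decode UTF-8 encoded bytes into text.
-- Invalid byte sequences become U+FFFD instead of raising an exception.
-- Hypothetical stand-in, not Cabal's own 'fromUTF8BS'.
fromUTF8BS' :: BS.ByteString -> String
fromUTF8BS' = T.unpack . TE.decodeUtf8With TEE.lenientDecode

-- | Encode text as UTF-8 bytes; the dual of 'fromUTF8BS''.
-- Hypothetical stand-in, not Cabal's own 'toUTF8BS'.
toUTF8BS' :: String -> BS.ByteString
toUTF8BS' = TE.encodeUtf8 . T.pack
```

With these types, passing already-decoded text to the decoder a second time is a compile-time error rather than a silent mis-decoding.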
This was introduced in 1821d80 but it would imply that stdout was set to binary mode, which it isn't.
Force-pushed from 8912f68 to 3856f76.
This new type will be used to disentangle conflated uses of `String` and clearly distinguish between binary and textual data.
This removes the remaining occurrences of the weakly typed `{to,from}UTF8` conversion.
Since we don't use those functions anymore, we can finally `DEPRECATE` them.
LGTM modulo minor comments.
@@ -1777,12 +1776,16 @@ checkCabalFileBOM ops = do
-- --cabal-file is specified. So if you can't find the file,
-- just don't bother with this check.
Left _ -> return $ Nothing
Right pdfile -> (flip check pc . startsWithBOM . fromUTF8)
Can `startsWithBOM` be removed now?
yes
Cabal/Distribution/Utils/Generic.hs
Outdated
@@ -246,8 +246,7 @@ startsWithBOM _ = False

 -- | Check whether a file has Unicode byte order mark (BOM).
 fileHasBOM :: FilePath -> NoCallStackIO Bool
-fileHasBOM f = fmap (startsWithBOM . fromUTF8)
-. hGetContents =<< openBinaryFile f ReadMode
+fileHasBOM f = (startsWithBOM . fromUTF8LBS) <$> BS.readFile f
Can't you use `BS.isPrefixOf` here?
yes, but we can also just remove `fileHasBOM`, as nobody uses it anymore iirc
+1.
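For reference, the `BS.isPrefixOf` suggestion above could look roughly like the sketch below. The names `utf8BOM`, `hasUtf8BOM` and `fileHasUtf8BOM` are hypothetical and not part of Cabal. The sketch assumes the check should work directly on the raw bytes, so no UTF-8 decoding is needed at all.

```haskell
import qualified Data.ByteString as BS
import           System.IO       (IOMode (ReadMode), withBinaryFile)

-- | The UTF-8 encoding of U+FEFF, i.e. the byte order mark.
utf8BOM :: BS.ByteString
utf8BOM = BS.pack [0xEF, 0xBB, 0xBF]

-- | Does the given binary content start with a UTF-8 BOM?
hasUtf8BOM :: BS.ByteString -> Bool
hasUtf8BOM = BS.isPrefixOf utf8BOM

-- | Check a file by reading at most the first three bytes in binary mode.
fileHasUtf8BOM :: FilePath -> IO Bool
fileHasUtf8BOM path =
  withBinaryFile path ReadMode $ \h ->
    hasUtf8BOM <$> BS.hGet h (BS.length utf8BOM)
```

Reading only the prefix avoids loading the whole file just to inspect three bytes.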
Cabal/Distribution/Simple/Utils.hs
Outdated
-- closed.
--
-- @since 2.2.0
ioDataHGetContents :: Handle -> IODataMode -> IO IOData
Maybe move `IOData` & friends to a separate module and import that qualified instead of using a prefix? Like the existing `BS.`/`LBS.` precedent.
@23Skidoo what do you suggest? Something like
module Distribution.Simple.IOData
( IOData(..)
, hGetContents
, hPutContents
-- ...
) where
-- ...
?
Right, that's what I had in mind.
I'll go for `D.Utils.IOData`, as that's where e.g. `ShortText` lives too
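A rough sketch of what the contents of such a module might look like follows. Only the names `IOData`, `IODataMode`, `hGetContents` and `hPutContents` come from this thread; the constructor names and the function bodies are assumptions for illustration and may differ from what was actually merged. The module header is left out so the snippet stands alone.

```haskell
import           Control.Exception    (evaluate)
import qualified Data.ByteString.Lazy as LBS
import           System.IO            (Handle)
import qualified System.IO            as IO

-- | Either textual or binary data. Constructor names are assumed.
data IOData
  = IODataText String
  | IODataBinary LBS.ByteString

-- | Which kind of data to read from a handle. Constructor names are assumed.
data IODataMode = IODataModeText | IODataModeBinary

-- | Read everything from the handle in the requested mode.
-- The contents are forced, so the handle has reached EOF and is
-- closed by the time this returns.
hGetContents :: Handle -> IODataMode -> IO IOData
hGetContents h IODataModeText = do
  IO.hSetBinaryMode h False
  s <- IO.hGetContents h
  _ <- evaluate (length s)
  return (IODataText s)
hGetContents h IODataModeBinary = do
  IO.hSetBinaryMode h True
  bs <- LBS.hGetContents h
  _ <- evaluate (LBS.length bs)
  return (IODataBinary bs)

-- | Write the data using the mode matching its constructor, then close.
hPutContents :: Handle -> IOData -> IO ()
hPutContents h (IODataText s) = do
  IO.hSetBinaryMode h False
  IO.hPutStr h s
  IO.hClose h
hPutContents h (IODataBinary bs) = do
  IO.hSetBinaryMode h True
  LBS.hPut h bs
  IO.hClose h
```

Callers would then import the module qualified, for example under a hypothetical alias such as `IOData`, and write `IOData.hGetContents` in the same way they already write `BS.readFile`.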
Cabal/Distribution/Utils/Generic.hs
Outdated
-| c <= '\xFB' = moreBytes 5 0x200000 cs (ord c .&. 0x3)
-| c <= '\xFD' = moreBytes 6 0x4000000 cs (ord c .&. 0x1)
-| otherwise = replacementChar : fromUTF8 cs
+fromUTF8 = decodeStringUtf8 . map c2w
Very nice!
fwiw, we could also remove `to/fromUTF8` w/o any deprecation. Nothing in the code-base needs it anymore
Yeah, since this will be in Cabal 2.2, setup scripts shouldn't be affected. May be worth adding a note to the changelog about the stuff that is gone.
for changelog see b1bbf2e
@23Skidoo I've implemented your suggestion; can you look over it? I'll write the changelog as soon as you confirm this is what you intended :-)
@hvr Looks good, though you forgot to |
This reduces the surface area of lib:Cabal by removing entry points that Setup.hs scripts are very unlikely to use.
@23Skidoo Doh! ... fixed...
[skip ci]
This refactoring is intended to clean up various instances where `[Char]` is used to convey something other than Unicode text. This picks up unfinished yak-shaving that was put aside when working on #3913 ... ;-)
Note: this PR has been carefully split into self-contained commits to ease reviewing