-
Notifications
You must be signed in to change notification settings - Fork 411
writers/readers #62 based on io::Write/Read #119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK. Seems OK, but found some minor nits.
Greak work!
src/util/ser.rs
Outdated
pub enum Error { | ||
CorruptedData, | ||
BadPublicKey, | ||
IoError(io::Error), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not simply it's named Io
like most crates do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced to Io
now.
src/util/ser.rs
Outdated
} | ||
} | ||
|
||
impl From<io::Error> for Error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a #[doc(hidden)]
(hide unneeded implementation details)
pub struct Writer<W> { writer: W } | ||
pub struct Reader<R> { reader: R } | ||
|
||
pub trait Writeable<W: Write> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably bikeshedding, but why don't give it a name like Encodable
/Decodable
just like how MsgEncodable
/MsgDecodable
are named?
src/util/ser.rs
Outdated
fn write(&self, writer: &mut Writer<W>) -> Result<(), Error>; | ||
} | ||
|
||
pub trait Readable<R> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above
src/util/ser.rs
Outdated
pub fn new(reader: R) -> Reader<R> { | ||
return Reader { reader } | ||
} | ||
pub fn into_inner(self) -> R { self.reader } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Identation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation all fixed now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good at a high level. There are a few indentation issues in the existing patch, but no big deal, and this seems more targeted at current ChannelMonitor serialization, which is fine, but ideally we'd be able to use it for that and network messages as well.
src/util/ser.rs
Outdated
fn write(&self, w: &mut Writer<W>) -> Result<(), Error> { | ||
match self { | ||
Some(data) => { | ||
true.write(w)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This appears to be what we do in ChannelMonitor, but note we don't do this in network messages. Not sure if its bad to have it, just noting it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed Option<?>
altogether as it seems difficult to me implementing generic serialization varying depending on the type of the payload of Option
. But I left something equivalent to Option<Script>
which found usage in two messages.
src/util/ser.rs
Outdated
|
||
pub const MAX_VEC_SIZE: usize = 32 * 1024 * 1024; | ||
|
||
pub enum Error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these different from msgs::DecodeError? Seems to me they should be the same with the addition of IoError, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed to use msgs::DecodeError
now
src/util/ser.rs
Outdated
} | ||
Ok(buf[0] == 1) | ||
} | ||
fn read_fixed_size(&mut self, sz: usize) -> Result<Vec<u8>, Error> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not have the same API as reader.read_exact? Seems annyoing to force a vec allocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Looking forward to seeing more things migrate to using this.
src/util/ser.rs
Outdated
if byte_size > MAX_VEC_SIZE { | ||
return Err(DecodeError::InvalidLength); | ||
} | ||
let mut ret = Vec::with_capacity(len as usize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need something to ensure that there is at least len capacity left in the buffer to be read, as allocating full-usize-memory will crash us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure what you meant. len*sizeof(T)
is bounded and each read()?
will raise 'ShortRead' if there is not enough data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I guess I'm blind. That said, MAX_VEC_SIZE is much too large for network messages (which can never be logner than 16K), even if its within reason for ChannelMonitors :/
For |
Oops, yea, looks like a bug on my_current_per_commitment_point. As for decode with data_loss_protect, I dunno, it kinda sucks to pass around LocalFeatures into message decoding, and for now relying on EOF is fine, but it may not be in the future. For simplicity maybe keep using EOF for now and if the spec ever changes to have different decoding based on LocalFeatures we can pass around a ref in the decoder stuff in the future? |
Added more messages and separated fuzz targets for Writeable. |
02f04ee
to
f660829
Compare
travis timed out |
Hmm, I kicked it, but it's probably just that you added more fuzz targets so the fuzzer is taking way more time. |
I assume we'll convert all the deserializers to just use the new framework before this is through so it wont be a problem, but until its done I assume travis will keep timing out. Not a huge deal. |
@yuntai Tell me when you want a more thourough review of this. I was kinda waiting to see all the messages get churned through but don't want to leave you waiting on review if you feel you're blocked on it. |
I think it's worth a review now. All messages are covered. I was able to run fuzz locally. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking pretty good. I had a few comments, but mostly nits and notes that we should probably skip including chunks of code that are unused. I'd kinda prefer to see all the message encoding/decoding stuff switched over to the new framework in peer_handler and the old crap dropped before merge, but if you'd prefer they be separate, I'm game to do so if you've got a commit that does the transition.
src/util/ser.rs
Outdated
pub fn into_inner(self) -> W { self.writer } | ||
#[cfg(feature = "fuzztarget")] | ||
pub fn get_ref(&self) -> &W { &self.writer } | ||
writer_fn!(write_u64, u64, be64_to_array); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe its just personal preference but I tend to find this kind of thing unreadable. All said, you've added 4 lines of macro_rules! definitions to save 6 lines of write_X { }s, and made anyone reading the code have to jump to read whats going on.
Its a bit different when the macro is defining more than one line, but for one-line functions it just feels like overkill.
src/util/ser.rs
Outdated
writer_fn!(write_u32, u32, be32_to_array); | ||
writer_fn!(write_u16, u16, be16_to_array); | ||
fn write_u8(&mut self, v: u8) -> Result<(), DecodeError> { | ||
Ok(self.writer.write_all(&[v])?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
readability nit: You can skip the Ok(...?) and just let it return the original Result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wirte_all() raise io::Error
src/util/ser.rs
Outdated
return Reader { reader } | ||
} | ||
#[cfg(feature = "fuzztarget")] | ||
pub fn into_inner(self) -> R { self.reader } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: indentation issue.
src/util/ser.rs
Outdated
} | ||
} | ||
|
||
macro_rules! reader_fn { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move this definition down to impl Reader where its used?
src/util/ser.rs
Outdated
impl_array!(64); // for Signature | ||
|
||
// Tuples | ||
macro_rules! tuple_encode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tuple encode/decode looks unused (and I dont see if it would be used in ChannelMonitor serialization, either). Maybe just remove for now and we can re-add it if its needed? Generally this applies to several things here - we've got most of the stuff we're gonna want to serialize in the short term already written so having serializers for things we dont yet need is just more code review.
src/util/ser.rs
Outdated
|
||
impl<W, T> Writeable<W> for Option<T> | ||
where W: Write, | ||
T: ::std::ops::Index<::std::ops::RangeFull, Output=[u8]> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think your tabstop is maybe confused? Either way, when aligning things best to use spaces to align and tab to indent. If you set your editor to highlight tabs differently it should be pretty clearly visible.
} | ||
|
||
#[macro_export] | ||
macro_rules! test_msg_writeable_simple { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is unused, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is used for Ping/Pong message. Wanted to remove ping/pong message test in the top fuzz directory.
let buf = w.into_inner().into_inner(); | ||
assert_eq!(&r.into_inner().into_inner()[..p], &buf[..p]); | ||
|
||
let encoded = <$MsgType as MsgDecodable>::decode(&buf[..]).unwrap().encode(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like overkill? If the message deserialized fine the first time and then encoded to the same value, shouldn't it definitely deserialize again this time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was a way to test against the existing code. We can drop it and probably Travis will be happy again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think travis will only be happy when you remove the existing message fuzz targets to reduce the total runtime.
let mut w = Writer::new(::std::io::Cursor::new(vec![])); | ||
msg.write(&mut w).unwrap(); | ||
|
||
let buf = w.into_inner().into_inner(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, how do most other serialization frameworks work? This looks kinda awkward but maybe this is the normal way to do it? Either way, these functions will need to be non-fuzztarget-only, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems they tend to use into_inner()
and get_ref()
when accessing the underlying Read
like instd::io::BufReader
. They may be not required in the normal situation, but I'm not totally sure. I agree it shouldn't be non-fuzztarget-only.
I can work on migrating whole stuff before merging. My mental model is;
Does this make sense to you? |
|
Hmm, then there is no saving in buffer copies. Actually, |
70bc53b
to
9a15634
Compare
Rebased & fixed. Fuzz test went thru but got an error in
|
Identified a problem (ShortRead). working on it and trying to figure why fuzz message test couldn't catch it. |
Yea, for perf you may want to have a capacity_hint() in Writer, or so. Though I'm not sure why I indicated we cant do no-copy in the encryption side, thats obviously bogus - right now we copy twice, once to write into a buffer, then encrypt the buffer, but we could easily just write into an encryptor and then write that out into a buffer to pass to the user's networking stack. Same on the decryption end, obviously. |
PR lightningdevkit#119 fails on no_existing_test_breakage() which uses hex from this buggy encoding. Perhaps use this to regenerate hex?
9a15634
to
b82f62b
Compare
fixed a bug in new serde and squashed the fouling commit. It all looks good now and Travis is so happy! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good. There's a few indentation and other nits, and a few simplifications in the message stuff that could be done, but I'm pretty much happy to rebase and merge and then clean those up later. Will do tomorrow morning if you haven't gotten to it.
msg.write(&mut w).unwrap(); | ||
|
||
let buf = w.into_inner().into_inner(); | ||
assert_eq!(&r.into_inner().into_inner()[..p], &buf[..p]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also assert that buf.len() == p?
let p = w.get_ref().position() as usize; | ||
|
||
let buf = w.into_inner().into_inner(); | ||
eprintln!("buf({})", buf.len()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rebase error?
src/ln/peer_handler.rs
Outdated
0u16.write(&mut w).unwrap(); | ||
$msg.write(&mut w).unwrap(); | ||
let mut msg = w.into_inner().into_inner(); | ||
byte_utils::be16_to_array_inplace(msg.len() as u16-2, &mut msg[..2]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could just msg[..2].copy_from_slice(byte_utils::be16_to_array(msg.len() - 2)) to avoid adding another function. It should optimize away pretty easily.
src/util/ser.rs
Outdated
|
||
pub trait Readable<R> | ||
where Self: Sized, | ||
R: Read |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: indentation here?
src/ln/msgs.rs
Outdated
@@ -43,6 +44,12 @@ pub enum DecodeError { | |||
/// A length descriptor in the packet didn't describe the later data correctly | |||
/// (currently only generated in node_announcement) | |||
BadLengthDescriptor, | |||
/// Error from std::io | |||
Io(::std::io::Error), | |||
/// Invalid value found when decoding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you be more specific here? AFAICT this is only used for bool decoding (making it unused currently, presumably until we move the ChannelMonitor deserializer to the new framework) when the value is not 1 or 0.
src/util/ser.rs
Outdated
( $size:expr ) => ( | ||
impl<W, T> Writeable<W> for [T; $size] | ||
where W: Write, | ||
T: Writeable<W>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: also strange tabstop here and below.
src/ln/msgs.rs
Outdated
Io(::std::io::Error), | ||
/// Invalid value found when decoding | ||
InvalidValue, | ||
/// Data too big |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, can we just merge this with BadLengthDescriptor?
src/util/ser.rs
Outdated
|
||
impl<W: Write> Writeable<W> for Option<Script> { | ||
fn write(&self, w: &mut Writer<W>) -> Result<(), DecodeError> { | ||
Ok(if let &Some(ref script) = self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why put the ()-returning if inside the Ok()?
src/ln/msgs.rs
Outdated
let mut buf = [0; 32+33]; | ||
match r.read_exact(&mut buf) { | ||
Ok(()) => Some(DataLossProtect { | ||
your_last_per_commitment_secret: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because you aren't allows to only fill in part of data_loss_protect you should be able to simplify this a bit by swapping buf for your_last_per_commitment_secret directly, then just calling Readable::read(r)? for my_current_per_commitment_point.
src/ln/msgs.rs
Outdated
}, | ||
hop_data: { | ||
let mut buf = [0u8;20*65]; | ||
r.read_exact(&mut buf)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just impl_array!(1300)?
with additional variants in DecodeError
un-fuzz-target only for get_ref() and inner_into() move up use indentation fixes remvoe unused implementation
added inplace byte_utils
For [T] iteration is used for each T, read position is not predictable when ShortRead is raised. [T] is not used in p2p messages and, if needed, Vec[T] can be used for other structures. Also a performance gain.
2cdd763
to
c7fedf9
Compare
and be more specific about DecodeError::InvalidValue
Will take as #170 (its no different, just squashed some commits). |
Implement issue #62. I started writing some generic routines for #55 and then found lots of them are already done in rust-bitcoin. I was thinking of importing them directly but I saw some conversation about restructuring serialization routines in the rust-bitcoin repo so I just adopted them here.
If this approach looks good I can go ahead putting a fuzz testing.