
Start main binary at 0x10001000 to allow for standalone second stage loader? #84

Open
@swetland

Description

@swetland
Contributor

Since the (Q)SPI flash bootloader is possibly part specific, it would be nice if (for simple cases) firmware could "just work" with a pre-flashed second stage, rather than having to be compiled for a specific flash part.

The first step to enable this would be to align the main image to start at the next erase unit (0x1000 offset) so it can be reflashed without disrupting the second stage.

For more complex projects that need to access the flash other than just reading from it via the XIP window, a function table could be provided for optimized, part-specific low level flash access (replacements for the ROM flash functions).

Activity

lurch

lurch commented on Feb 4, 2021

@lurch
Contributor

it can be reflashed without disrupting the second stage.

I assume you're aware that the BOOTSEL mode (which provides the UF2 flashing over MSD) is built into the ROM of the RP2040, and as such can't be changed? 😕 (Unless I've misunderstood what you're asking for?)

Wren6991

Wren6991 commented on Feb 4, 2021

@Wren6991
Contributor

The first step to enable this would be to align the main image to start at the next erase unit (0x1000 offset) so it can be reflashed without disrupting the second stage

So in its simplest form, you would just want a chainloader at +0x100 that immediately vectors through a table at +0x1000, and have the current image start at +0x100 moved up to there?

That would be a fairly modest linker script addition, we are looking into templating our linker scripts at some point (as they are a bit copy/pastey) and that would make it pretty easy to add something like this.

Until we have something useful to go in that alignment hole, the default build will probably stay as it is -- people would miss the ~4k of flash they would immediately lose.

optimized, part-specific low level flash access (replacements for the ROM flash functions).

This is a little dicey because you can't do XIP execution whilst programming is in progress. This would lead to you copying 4k of code into RAM.

swetland

swetland commented on Feb 4, 2021

@swetland
Contributor, Author

No changes to the ROM loader are needed, just to the second stage (boot2) which is built by the SDK and lives at offset 0 in the SPI flash.

The SPI flash's erase unit size is 0x1000, so with the main binary at offset 0x100, it's not possible to update the "app" without erasing and replacing boot2 at the same time (one could adjust tooling to save and restore boot2 but that gets kinda fiddly).
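The overlap described above can be checked with a little address arithmetic. This is a minimal host-side sketch, not SDK code: the 0x1000 erase unit and the 0x100 boot2 size come from this thread, while the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FLASH_ERASE_SIZE 0x1000u /* erase unit quoted in the thread */
#define BOOT2_SIZE       0x100u  /* boot2 occupies the first 256 bytes of flash */

/* Index of the erase sector that contains a given flash offset. */
uint32_t sector_of(uint32_t offset) {
    return offset / FLASH_ERASE_SIZE;
}

/* True if erasing an app image that starts at app_offset also erases boot2,
 * i.e. the first sector of the app is a sector that boot2 lives in. */
bool app_erase_clobbers_boot2(uint32_t app_offset) {
    assert(app_offset >= BOOT2_SIZE); /* the app must start after boot2 */
    uint32_t first_app_sector = sector_of(app_offset);
    uint32_t last_boot2_sector = sector_of(BOOT2_SIZE - 1u);
    return first_app_sector <= last_boot2_sector;
}
```

With the app at offset 0x100 both images share sector 0, so the check returns true; at offset 0x1000 the app starts in sector 1 and boot2 is left alone.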

What I'm suggesting is that boot2 instead of transferring control to 0x10000100, transfer control to 0x10001000 once it has configured XIP mode. The extra space could also be used to provide a function table of optimized flash io functions to the app, similar to how the boot rom provides generic flash io functions.
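One way such a function table could be shaped is sketched below. Everything here is an assumption for illustration: the struct layout, the `FLASH_OPS_MAGIC` tag, and the `demo_*` routines are hypothetical, and the "flash" is a RAM buffer so the sketch can run on a host. On real hardware the routines would have to execute from SRAM while XIP is unavailable, as pointed out earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FLASH_OPS_MAGIC 0x53504F46u /* hypothetical tag: "FOPS", little-endian */
#define SECTOR_SIZE     0x1000u

/* Hypothetical table a pre-flashed second stage might export to the app. */
typedef struct {
    uint32_t magic;
    uint32_t version;
    void (*range_erase)(uint32_t offset, size_t len);
    void (*range_program)(uint32_t offset, const uint8_t *data, size_t len);
} flash_ops_t;

/* Host-side stand-in for the flash array (two erase sectors). */
uint8_t fake_flash[2 * SECTOR_SIZE];

/* Erase sets every bit in whole sectors, as NOR flash does. */
void demo_range_erase(uint32_t offset, size_t len) {
    assert(offset % SECTOR_SIZE == 0 && len % SECTOR_SIZE == 0);
    assert(offset + len <= sizeof fake_flash);
    memset(&fake_flash[offset], 0xFF, len);
}

/* Programming can only clear bits, so AND the new data into place. */
void demo_range_program(uint32_t offset, const uint8_t *data, size_t len) {
    assert(offset + len <= sizeof fake_flash);
    for (size_t i = 0; i < len; i++) {
        fake_flash[offset + i] &= data[i];
    }
}

const flash_ops_t demo_flash_ops = {
    FLASH_OPS_MAGIC, 1u, demo_range_erase, demo_range_program
};

/* Returns the table if it looks usable, or NULL so the caller can fall
 * back to the generic ROM routines. */
const flash_ops_t *flash_ops_probe(const void *candidate) {
    const flash_ops_t *ops = candidate;
    if (ops == NULL || ops->magic != FLASH_OPS_MAGIC || ops->version == 0) {
        return NULL;
    }
    if (ops->range_erase == NULL || ops->range_program == NULL) {
        return NULL;
    }
    return ops;
}
```

The magic and version fields let an app detect a missing or older table and fall back, which keeps apps with an embedded board-specific driver working unchanged.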

The combination of the above then simplifies development for arbitrary rp2040 based boards by no longer requiring the app to have a board-specific flash "driver" compiled in. Of course it doesn't prevent that, either, if that's desirable in a particular instance.

One could also imagine providing some table of board hardware info, though before long that spirals out into some madness like devicetree or ACPI, so maybe simpler is better.

Wren6991

Wren6991 commented on Feb 4, 2021

@Wren6991
Contributor

Think we got our messages crossed!

swetland

swetland commented on Feb 4, 2021

@swetland
Contributor, Author

Yeah! Saw your reply moments after I clicked "comment".

swetland

swetland commented on Feb 4, 2021

@swetland
Contributor, Author

Good point about the XIP helpers needing to be SRAM loaded, so not quite as trivial, though still doable.

4K out of a typical several MB flash didn't strike me as a huge cost against the possibility of making arbitrary dev boards "just work" if they have a compatible boot2 installed from the factory. Obviously, since the end user has full control of what they're flashing (which is fantastic), a separately updatable boot2 + app layout could be discarded if space is at a premium, etc.

And I'm half-joking, half-not about some kind of HW descriptor table. There's already that firmware info table telling users what GPIO assignments are what. With sufficient cleverness one could allow for the two to be resolved against each other with a little helper routine to run at startup and then you start getting self-configuring systems.

lurch

lurch commented on Feb 4, 2021

@lurch
Contributor

No changes to the ROM loader are needed, just to the second stage (boot2) which is built by the SDK and lives at offset 0 in the SPI flash.
The SPI flash's erase unit size is 0x1000, so with the main binary at offset 0x100, it's not possible to update the "app" without erasing and replacing boot2 at the same time (one could adjust tooling to save and restore boot2 but that gets kinda fiddly).

I'm obviously not as familiar with the low-level details as you and Luke, but I guess my concern is that (if I'm understanding this correctly) there'd then be some apps that do have an embedded boot2, and some apps that don't have an embedded boot2 (because they're relying on there already being a suitable boot2 in flash), and how much confusion this could cause users? 🤷‍♂️

4K out of a typical several MB flash didn't strike me as a huge cost

Me neither, but we've already had users asking for 48 bytes back! #78

swetland

swetland commented on Feb 4, 2021

@swetland
Contributor, Author

I'm obviously not as familiar with the low-level details as you and Luke, but I guess my concern is that (if I'm understanding this correctly) there'd then be some apps that do have an embedded boot2, and some apps that don't have an embedded boot2 (because they're relying on there already being a suitable boot2 in flash), and how much confusion this could cause users? 🤷‍♂️

That is a point to consider. It may be that, having launched as it is, it's too late to explore such a proposal. On the other hand, if the no-onboard-flash variant of the part is the most common one (the datasheet's part-numbering scheme indicates onboard flash is at least a possibility), and a diverse ecosystem of devboards explodes (yay, success!), dealing with "what flash do I need to compile support for" becomes more and more of a headache for developers and/or SDK maintainers.

Having been through a few OS/platform launches, what I do know is the longer you wait, the more difficult it becomes to make a change like this, and sometimes taking a hit early on can save on pain down the road.

4K out of a typical several MB flash didn't strike me as a huge cost

Me neither, but we've already had users asking for 48 bytes back! #78

Well, I do have to applaud frugality. The way people burn through memory nowadays blows my mind.

lurch

lurch commented on Feb 4, 2021

@lurch
Contributor

dealing with "what flash do I need to compile support for" becomes more and more of a headache for developers and/or SDK maintainers.

I've never written any low-level flash code, but how "incompatible" are different flash chips? Or looking at it from the other angle, how likely is it that 3rd-party RP2040 devboards (intended for general public use) would choose a flash-chip which isn't already supported by the current SDK?

swetland

swetland commented on Feb 4, 2021

@swetland
Contributor, Author

The SDK currently includes 4 different boot2 flash XIP implementations (following info from the header comments in the assembly source files):

  • generic -- should work with just about anything, but performance is roughly 3x worse than with QSPI support
  • is25lp80 -- supports ISSI IS25LP080D
  • w25q080 -- supports Winbond W25Q080 and W25Q16JV, AT25SF081, S25FL132K0
  • w25x10cl -- supports Winbond W25X10CL

I don't know how exhaustively that covers popular, active parts.

Even if the SDK supports a part, figuring out which part is on your board is another step, and not immediately obvious. Presumably one could install a helper using the generic driver or just copy-to-ram boot2 and attempt to read the part number from the SPI flash.
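Reading the part number usually means issuing the JEDEC "Read ID" command (9Fh) and interpreting the three bytes that come back. The sketch below covers only the decoding step, on the host. It assumes the common convention that the third byte is log2 of the capacity in bytes, which not every vendor follows; the struct and function names are hypothetical, and the sample bytes in the note after the block are assumed rather than taken from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded view of a 3-byte JEDEC ID response. */
typedef struct {
    uint8_t manufacturer;    /* JEDEC manufacturer ID byte */
    uint8_t memory_type;     /* vendor-specific device/type byte */
    uint32_t capacity_bytes; /* 0 if the capacity byte looks implausible */
} flash_id_t;

/* Decode the bytes returned by a 9Fh Read ID command.
 * Assumption: byte 2 encodes capacity as a power of two (common, not universal). */
flash_id_t decode_jedec_id(const uint8_t id[3]) {
    assert(id != NULL);
    flash_id_t out = { id[0], id[1], 0u };
    if (id[2] >= 0x10 && id[2] <= 0x1F) { /* 64 KiB .. 2 GiB */
        out.capacity_bytes = (uint32_t)1u << id[2];
    }
    return out;
}
```

For example, the bytes EF 40 15 (assumed here to be what a Winbond W25Q16JV reports; check the part's datasheet) decode to manufacturer EFh and a capacity of 2 MiB, while an all-FFh response from an absent or unresponsive device yields a capacity of 0.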

I haven't yet stumbled over a document that told me exactly what flash part was on my Pico board(s) -- I'm guessing one of those supported by boot2_w25q080.S based on that being the default boot2 version selected by CMakeLists.txt. The Pico Data Sheet and all the marketing literature I've seen simply mentions 2MB of QSPI flash and I assume that the exact part may change from batch to batch based on availability, pricing, etc.

Wren6991

Wren6991 commented on Feb 4, 2021

@Wren6991
Contributor

I haven't yet stumbled over a document that told me exactly what flash part was on my Pico board(s) -- I'm guessing one of those supported by boot2_w25q080.S based on that being the default boot2 version selected by CMakeLists.txt. The Pico Data Sheet and all the marketing literature I've seen simply mentions 2MB of QSPI flash and I assume that the exact part may change from batch to batch based on availability, pricing, etc.

Good point. It's a W25Q16JV (if you scroll down in the Pico datasheet you will see the schematic I clipped here). I'll make sure the part number is mentioned higher up in the datasheet too.

[image: schematic clipped from the Pico datasheet]

Having been through a few OS/platform launches, what I do know is the longer you wait, the more difficult it becomes to make a change like this, and sometimes taking a hit early on can save on pain down the road.

Yes, appreciate this, we jumped on #10 for similar reasons.

I don't know how exhaustively that covers popular, active parts.

You can include boot2 files in your project, I guess an example of this would be helpful, and yes there needs to be better tooling for discovering what is on your board.

Will wait for @kilograham to get back before making any changes here, I think one of the major challenges is how this fits into programming tools and how we get boot-from-0x100 binaries to play nicely with boot-from-0x1000 binaries (because people will be upset about that 4k) and he is the right person to weigh in on that aspect of it. I think he's just popped off for a few days' break as we've all been quite hard pressed around launch.

Wren6991

Wren6991 commented on Feb 4, 2021

@Wren6991
Contributor

I don't know how exhaustively that covers popular, active parts.

It gives examples of the most common QSPI and DSPI continuous read formats (EBh/BBh), the remaining wrinkles are mostly around things like status register layout.

I would be interested in developing a generic e.g. SFDP extended boot2 that occupies the first 4k of flash, but my brief experience with SFDP (by buying a bunch of random devices off DigiKey to test their SFDP support) is that support is incredibly patchy, with a lot of broken implementations. Then again, 4k gives you a lot of space to work around the quirks.
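As a rough illustration of the first thing an SFDP-aware boot2 would have to do, the sketch below validates the 8-byte SFDP header, which is normally read with the 5Ah command at address 0. The header layout follows JESD216 as I understand it (signature "SFDP", minor and major revision, then the number of parameter headers stored as count minus one); the names are hypothetical, and a real implementation would need the quirk handling mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fields taken from the first 8 bytes of the SFDP table. */
typedef struct {
    uint8_t minor_rev;
    uint8_t major_rev;
    uint16_t num_param_headers; /* actual count; the raw field is count - 1 */
} sfdp_header_t;

/* Returns true and fills *out if raw starts with a plausible SFDP header.
 * Devices without SFDP typically return all 0xFF or all 0x00 here. */
bool sfdp_parse_header(const uint8_t raw[8], sfdp_header_t *out) {
    assert(raw != NULL && out != NULL);
    if (memcmp(raw, "SFDP", 4) != 0) {
        return false;
    }
    out->minor_rev = raw[4];
    out->major_rev = raw[5];
    out->num_param_headers = (uint16_t)(raw[6] + 1u);
    return true;
}
```

A signature check like this only tells you that a table exists; it says nothing about whether the parameter tables behind it are correct, which is where the patchy support would show up.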

lurch

lurch commented on Feb 4, 2021

@lurch
Contributor

buying a bunch of random devices .... support is incredibly patchy, with a lot of broken implementations.

Sounds very similar to the situation with SD cards 😀

swetland

swetland commented on Feb 4, 2021

@swetland
Contributor, Author

buying a bunch of random devices .... support is incredibly patchy, with a lot of broken implementations.

Sounds very similar to the situation with SD cards

Same as it ever was... last time around for me was a big pile of NVME M.2 SSDs, crosschecking spec-vs-reality while bringing up a host driver.

Regarding "wasting" 4K... looking at some of the existing boot2 implementations which only have 2-3 unused words of their 256 byte allotment, I'd be nervous about having limited space to deal with some more complex boot situation down the road. Sure boot2 could read a larger boot3 that really knows how to turn on XIP, but there's no space for such a critter between the end of boot2 and the start of the app image.

kilograham

kilograham commented on Feb 6, 2021

@kilograham
Contributor

So the idea of an in flash stub has always been in the back of my mind which is why the ROM UF2 bootloader accepts flash binaries that don't start at 0x10000000 (even though ELF2UF2 doesn't for now). I certainly didn't want to require one, and there are a number of issues to work thru (especially how to not get users in a hole (and open up a support can of worms) where they don't have the stub). So I decided to not make too many set in stone/hasty-given-our-workload decisions until we saw how people started to use the device.

Some random thoughts in no particular order:

  1. There is the general question of whether you rely on there being a stub on the device. Obviously if you only support picotool loads you could take care of this, but one option to consider is just allowing plugging on the "stub" by picotool (i.e. switch out the stub on a UF2)

  2. The idea of self-configuring binaries already occurred to me as I was writing picotool/binary info stuff (which was actually a very very late addition). Sometimes we use #define-ed configuration values, but the application can certainly choose not to do this and so be (re-)configurable. I had potentially envisaged some of this via picotool (i.e. just modify the binary - as part of this I had even considered argc,argv :-) ).

  3. Additional use cases for a stub include perhaps:

    1. a grub-like thing for multiple binaries
    2. debugging firmware (e.g. a core 1 debugger impl for a core 0 only binary)
    3. as you say additional board "driver" firmware
  4. As @Wren6991 says, actually changing the binary layout is easy enough

  5. Would this all confuse picotool/binary_info?

    No, because the stub could just leave a "forwarding/chain" binary_info reference to the next executable header (we would need to decide how to display - in picotool - the relationship between information from the stub binary and the application binary)

  6. Not having to prefix a second stage is helpful though (given the copy-able .S files) not critical for other languages.

  7. The 4K (at least with smaller flash) was very important to me for squeezing stuff in, so we certainly want to leave this as an option even if it is not the default (reminds me that I am meaning to make an example INTERFACE library that pares out as much runtime functionality/space as possible to show how to get a small binary if you really want)

  8. Sometimes the application has a requirement on a particular second stage (mostly today when hugely overclocking when you need a large SSI clock divider). This can be solved in other ways though, including the configuration mentioned above, or of course such a specific binary should just include the boot_stage2

After saying all this, getting back to the issue at hand, in order of things we could do:

  1. Let the user/downstream do this in a bespoke fashion; this obviously hurts interoperability.

  2. We can add support for inserting space into the binary (using new templated linker scripts to make this easier). We would have to support including a forwarding trampoline VTABLE at +0x100. We would make the default be a space up until 0x10001000, but configurable by build defines.

  3. Start building ELFs/UF2s that don't include 0x10000000-0x10001000. This is where it gets a bit tricky and we need more discussion about how this should really work from a logistical point of view. Two areas of concern:

    1. Downloading a partial binary over a "full" binary if these still exist (which I think they will). Well, full binaries might be able to (with an updated SDK) check offset 0x10001000 and recognize a valid vector table there. If found, it could assume it has been overwritten and forward.
    2. The case where there is nothing on the device at all (or invalid boot stage2)
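The "recognize a valid vector table" check could be as simple as a plausibility test on the first two words, which on a Cortex-M0+ hold the initial stack pointer and the reset vector. This is a host-side sketch under stated assumptions: the address ranges are the RP2040 XIP window and main SRAM as I understand them, the function name is hypothetical, and the heuristic would reject images whose reset vector points outside flash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define XIP_BASE  0x10000000u
#define XIP_END   0x11000000u /* assumed 16 MiB XIP window */
#define SRAM_BASE 0x20000000u
#define SRAM_END  0x20042000u /* assumed end of RP2040 main SRAM */

/* vt points at the first two words of a candidate vector table:
 * vt[0] = initial stack pointer, vt[1] = reset vector. */
bool vector_table_looks_valid(const uint32_t vt[2]) {
    assert(vt != NULL);
    uint32_t sp = vt[0];
    uint32_t reset = vt[1];
    /* The stack pointer must be word aligned and land inside SRAM
     * (pointing one past the end is fine for a full-descending stack). */
    bool sp_ok = sp > SRAM_BASE && sp <= SRAM_END && (sp & 3u) == 0u;
    /* The reset vector must have the Thumb bit set and point into flash. */
    bool pc_ok = (reset & 1u) != 0u && reset >= XIP_BASE && reset < XIP_END;
    return sp_ok && pc_ok;
}
```

Erased flash reads back as 0xFFFFFFFF, which fails the stack pointer test, so an empty region at 0x10001000 would not be mistaken for an application. It does not cover the second concern above, where boot2 itself is missing or invalid.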

Obviously the board "firmware" needs more discussion, but perhaps we should split out some new issues once we have discussed the basics here more.


      Participants

      @tannewt@swetland@lurch@kilograham@Wren6991

        Start main binary at 0x10001000 to allow for standalone second stage loader? · Issue #84 · raspberrypi/pico-sdk