| 1 | +# Selectors have sufficiently predictable resource budgets to be used in low-trust environments
| 2 | + |
| 3 | +Authors: @warpfork |
| 4 | + |
| 5 | +Initial PR: https://github.com/protocol/web3-dev-team/pull/27 |
| 6 | + |
| 7 | + |
| 8 | +Purpose & impact |
| 9 | +---------------- |
| 10 | + |
| 11 | +#### Background & intent |
| 12 | +_Describe the desired state of the world after this project? Why does that matter?_ |
| 13 | + |
| 14 | +The status quo is: we have Selectors, and they can be used to describe walks of graphs of data. |
| 15 | +(They're sorta like regexps for DAGs, if that's a useful comparison for you.) |
| 16 | +We want to expose these in service APIs, so that users can specify their own.
| 17 | + |
| 18 | +The problem is: if a service wants to accept Selectors which are user-specified, |
| 19 | +then the user can ask the service to do arbitrarily expensive work. |
| 20 | +This would create a way for users to take the service down (a DoS). |
| 21 | + |
| 22 | +The intent is: we should create a resource budgeting system for Selectors. |
| 23 | +The system should be declarative and comprehensible, |
| 24 | +and must be something that administrators of services built with Selectors can configure in order to limit their exposure to DoS. |
| 25 | + |
| 26 | +#### Assumptions & hypotheses |
| 27 | +_What must be true for this project to matter?_ |
| 28 | + |
| 29 | +- Selectors are something that either our projects or our community's projects expose as an API;
| 30 | +- and that API is expected to be able to accept user-specified Selectors; |
| 31 | +- and the Selector would be evaluated by a different resource owner than the author; |
| 32 | +- and denial-of-service via maliciously crafted Selectors would be problematic. |
| 33 | + |
| 34 | +(That sounds like a lot of conditions, but from what I can tell, |
| 35 | +users often want to treat Selectors like they're "free" to evaluate, |
| 36 | +and that results in folks building APIs with exactly these expectations.) |
| 37 | + |
| 38 | +Another way to address the underlying issue is to connect Selector evaluation to a billing system,
| 39 | +but the budgeting work proposed here would still be required to make that kind of connection possible.
| 40 | +(A billing system does no good if one can submit a task that bankrupts you before the bill can be settled.)
| 41 | + |
| 42 | +#### User workflow example |
| 43 | +_How would a developer or user use this new capability?_ |
| 44 | + |
| 45 | +When users ask for data from a service like IPFS, |
| 46 | +they submit a Selector, and expect to receive a series of blocks in response |
| 47 | +(typically in the form of a "car" or "dar" or other such format). |
| 48 | + |
| 49 | +This workflow from the user's perspective shouldn't change significantly. |
| 50 | + |
| 51 | +From the service host's perspective, |
| 52 | +they should probably have some configuration file which lets them set limits
| 53 | +for how much data is matched by a single selector before the service cuts off that request. |
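
As a rough illustration of what that host-side configuration could look like, here is a minimal sketch in Go. Everything in it is an assumption made for the example: the JSON shape, the `maxBudgetPerRequest` field, and the `SelectorLimits`, `ParseLimits`, and `EffectiveBudget` names are hypothetical, and nothing here is an existing go-ipld-prime or IPFS API. It assumes the limit is a single number, and that a request may ask for less than the operator's ceiling but never more.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SelectorLimits is a hypothetical configuration shape for a service host.
// The field name is illustrative only; no existing config format is implied.
type SelectorLimits struct {
	// MaxBudgetPerRequest is the ceiling on the budget any one Selector may spend.
	MaxBudgetPerRequest int64 `json:"maxBudgetPerRequest"`
}

// ParseLimits decodes the configuration and rejects a non-positive ceiling,
// since such a ceiling would mean no request could ever be served.
func ParseLimits(raw []byte) (SelectorLimits, error) {
	var l SelectorLimits
	if err := json.Unmarshal(raw, &l); err != nil {
		return SelectorLimits{}, fmt.Errorf("parsing selector limits: %w", err)
	}
	if l.MaxBudgetPerRequest <= 0 {
		return SelectorLimits{}, fmt.Errorf("maxBudgetPerRequest must be positive, got %d", l.MaxBudgetPerRequest)
	}
	return l, nil
}

// EffectiveBudget clamps a user-requested budget to the operator's ceiling.
// A request of zero or less is treated as "no preference" and gets the ceiling.
func (l SelectorLimits) EffectiveBudget(requested int64) int64 {
	if requested <= 0 || requested > l.MaxBudgetPerRequest {
		return l.MaxBudgetPerRequest
	}
	return requested
}

func main() {
	limits, err := ParseLimits([]byte(`{"maxBudgetPerRequest": 1000}`))
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	// A modest request is honored as-is; an oversized one is cut down to the ceiling.
	fmt.Println(limits.EffectiveBudget(50), limits.EffectiveBudget(5000))
}
```

Clamping (rather than rejecting) an oversized request is just one possible policy; a real service might prefer to refuse such requests up front so the user finds out before any work is done.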
| 54 | + |
| 55 | +Ideally, the limit system is comprehensible enough that users can estimate the costs of a query before submitting it, |
| 56 | +because it's typically not pleasant to get a failure after some effort has already been expended. |
| 57 | +(It's not clear how possible this is, but if possible, it's desirable.) |
| 58 | + |
| 59 | +#### Impact |
| 60 | +_How directly important is the outcome to web3 dev stack product-market fit?_ |
| 61 | + |
| 62 | +However important Selectors are to web3 dev stack PMF, this is that times about 0.95. |
| 63 | + |
| 64 | +Within the relevance of Selectors: this budgeting requirement is not critical right up until it's critical. |
| 65 | + |
| 66 | +Until this is addressed, it is unwise for anyone to build publicly exposed services which accept and evaluate user-specified Selectors.
| 67 | + |
| 68 | +#### Leverage |
| 69 | +_How much would nailing this project improve our knowledge and ability to execute future projects?_ |
| 70 | + |
| 71 | +Leverage of this is probably low. |
| 72 | +We can already design systems using Selectors. |
| 73 | + |
| 74 | +Assuming it's reasonable to bet that adding resource budgets to Selectors will not drastically change the way they fit together into systems overall, |
| 75 | +this work is fairly deferrable without causing pipeline stalls in other work.
| 76 | + |
| 77 | +#### Confidence |
| 78 | +_How sure are we that this impact would be realized? Label from [this scale](https://medium.com/@nimay/inside-product-introduction-to-feature-priority-using-ice-impact-confidence-ease-and-gist-5180434e5b15)_. |
| 79 | + |
| 80 | +(Not really sure how to apply the numeric scale to this, sorry.) |
| 81 | + |
| 82 | +3? 10? I think we're extremely sure that this will be a problem for certain user stories. |
| 83 | +We can consult folks working on Filecoin features which block on this for more information. |
| 84 | + |
| 85 | + |
| 86 | +Project definition |
| 87 | +------------------ |
| 88 | + |
| 89 | +#### Brief plan of attack |
| 90 | + |
| 91 | +1. Design work: Figure out what good budgeting means. |
| 92 | + - @warpfork's initial bet is: just having a single global counter which monotonically decreases during evaluation is the right direction. |
| 93 | +2. Design work: Figure out how the limits should be expressed in the Selector format. |
| 94 | + - Should a limit value be always required at the root? |
| 95 | + - Should other sub-limits (i.e. can only further drop the limit, not start a new budget) be allowed throughout the query? |
| 96 | + - What unit is the limit? Blocks or nodes? Or binary size (e.g. does selecting a large string count harder against the budget than a small one)? |
| 97 | + - Consider: that walks with selectors are currently defined as yielding `(path,node)` pairs -- which means reaching the same data by a different path is considered distinct, and causes time to be expended on a visitation that's arguably a repeat. Do we want to revisit this? It has unfortunate performance implications on some densely linked graph structures. |
| 98 | + - Figure out exactly what behavior we expect from APIs when they encounter a limit -- simply halting addresses the DoS concern, but what will a user's action options be when they receive a halt due to budget exceeded? Will there be any option for resumability? Etc. |
| 99 | +3. Design work: Work through how service operators will be able to look at a Selector and decide if they want to evaluate it or not.
| 100 | + - This is a sanity-checking process for the earlier design phases.
| 101 | +4. Implement: in the [go-ipld-prime/traversal/selector](https://github.com/ipld/go-ipld-prime/tree/master/traversal/selector) package.
| 102 | +5. Test: make sure we have examples of datasets and selectors to run on them which we expect to be halted by budget limits. |
| 103 | + - Ideally this should be in language-agnostic test fixture files, so we can reuse them in other selectors implementations. |
| 104 | +6. Documentation: update it. |
| 105 | +7. Synchronize: other implementations! |
| 106 | + - The [ChainSafe forest](https://github.com/ChainSafe/forest/) project contains a Selectors implementation -- communicate with them about these changes! |
| 107 | +8. Propagation to downstream, possible small migrations? |
| 108 | + - If we make the budget system non-optional, then existing Selector documents may not work. |
| 109 | + - Or, there might be no special work needed here, if the budget system is entirely optional. |
| 110 | + |
| 111 | +#### What does done look like? |
| 112 | +_What specific deliverables should be completed to consider this project done?_
| 113 | + |
| 114 | +- Selectors in the go-ipld-prime implementation should have resource budgeting. |
| 115 | +- Test fixtures should demonstrate how a selection behaves when it halts due to resource budget exhaustion.
| 116 | +- The system for declaring resource budgets should be reasonably comprehensible, and look like something we can tell administrators of hypothetical services using this system how to configure.
| 117 | + - Probably: it should be as simple as _one number_. |
| 118 | + |
| 119 | +#### What does success look like? |
| 120 | +_Success means impact. How will we know we did the right thing?_ |
| 121 | + |
| 122 | +When developers such as the Filecoin team feel comfortable exposing features using Selectors to users, then this project is a success. |
| 123 | + |
| 124 | +#### Counterpoints & pre-mortem |
| 125 | +_Why might this project be lower impact than expected? How could this project fail to complete, or fail to be successful?_ |
| 126 | + |
| 127 | +- Overcomplicating the budget system could result in usability failure. |
| 128 | + (Arguably, the current limit systems are this, because they're too granular, which is no substitute for a holistic system.) |
| 129 | + |
| 130 | +- Technical consideration: Beware the "[Billion Laughs](https://en.wikipedia.org/wiki/Billion_laughs)" problem. |
| 131 | + (This is why this document keeps emphasizing a budget that is holistic and monotonically decreasing.) |
| 132 | + |
| 133 | +- A system that halts but returns insufficient information about why could be frustrating to users, |
| 134 | + even if it successfully addresses the DoS problem. |
| 135 | + |
| 136 | +- Keep in mind: this proposal only describes implementing this in golang. |
| 137 | + We do not currently have a javascript Selectors implementation (and creating one is a larger task). |
| 138 | + This is not a problem per se; it's just something to remember when considering what can be immediately built upon this work. |
| 139 | + |
| 140 | +#### Alternatives |
| 141 | +_How might this project’s intent be realized in other ways (other than this project proposal)? What other potential solutions can address the same need?_ |
| 142 | + |
| 143 | +One: See remarks about billing in the Assumptions & hypotheses section.
| 144 | +Some system of resource currency could be associated with this problem as part of the solution. |
| 145 | +(This doesn't necessarily remove the need for engineering work on the Selector system to support it, though, |
| 146 | +which means this should probably be considered a stretch goal or future work rather than an alternative.) |
| 147 | + |
| 148 | +Two: It's possible to work around this in some cases by building APIs around selectors, |
| 149 | +but then only accept a known, pre-specified set of selectors. |
| 150 | +(If I understand correctly, this is how several pieces of Filecoin currently work around this issue.)
| 151 | +This is not a general workaround, though, and ruins most of the point of Selectors -- they're *supposed* to be user-specifiable. |
| 152 | + |
| 153 | +Three: a totally distinct graph query mechanism could be proposed. |
| 154 | +However, whatever that system is: it would have the same need for a budget mechanism. |
| 155 | + |
| 156 | +#### Dependencies/prerequisites |
| 157 | + |
| 158 | +- No strict dependencies known. |
| 159 | +- Bonus/Accelerant: if the Selector implementation in go-ipld-prime was refactored to be built off a Schema and use codegen, it would probably be easier to update. |
| 160 | +- Bonus/Stabilizer: if the documentation site which covers Selectors was connected with an automated test suite which checks that examples in the documentation actually match behavior of the libraries, |
| 161 | +it would be much easier to be confident in the correctness and completeness of our documentation. |
| 162 | + |
| 163 | +#### Future opportunities |
| 164 | + |
| 165 | +Resource budgets make Selectors safe to use in services which accept user-defined Selectors.
| 166 | + |
| 167 | + |
| 168 | +Required resources |
| 169 | +------------------ |
| 170 | + |
| 171 | +#### Effort estimate |
| 172 | +<!--T-shirt size rating of the size of the project. (If the project might require external collaborators/teams, please note in the roles/skills section below).
| 173 | +For a team of 3-5 people with the appropriate skills: |
| 174 | +- Small, 1-2 weeks |
| 175 | +- Medium, 3-5 weeks |
| 176 | +- Large, 6-10 weeks |
| 177 | +- XLarge, >10 weeks |
| 178 | +Describe any choices and uncertainty in this scope estimate. (E.g. Uncertainty in the scope until design work is complete, low uncertainty in execution thereafter.) |
| 179 | +--> |
| 180 | + |
| 181 | +Probably "Medium, 3-5 weeks". |
| 182 | + |
| 183 | +It's not Small, because the design phases shouldn't be skimped on. |
| 184 | +(It will be easy to implement something that compiles, but doesn't solve the problem correctly; |
| 185 | +therefore it seems unwise to try to cram this into a small 1-2 weeks timeline.) |
| 186 | +(_Maybe_ it will turn out to be small, but I'd rather greet that as a pleasant surprise.) |
| 187 | + |
| 188 | +It's not likely to be Large (6-10 weeks) because there's just not that much work to do here if tackled by a team. |
| 189 | +(It's renovation work and a new feature within an existing system, not a whole new system.) |
| 190 | + |
| 191 | +The "resumability" consideration should probably be considered out of scope, |
| 192 | +or the effort estimate increases significantly and the confidence decreases significantly. |
| 193 | + |
| 194 | +Other Selectors implementations will not necessarily be updated during this work period; |
| 195 | +however, these are maintained by teams outside of PL, so this is natural: |
| 196 | +we should just aim to leave them set up and aware of what they would need to do. |
| 197 | + |
| 198 | +#### Roles / skills needed |
| 199 | + |
| 200 | +- Golang developers (work is required in go-ipld-prime) |
| 201 | + - Bonus if they're already familiar with Selectors |
| 202 | + |
| 203 | +I probably wouldn't recommend trying to spin this out to a community or external team. |
| 204 | +The task size isn't big enough to be worth the overhead, |
| 205 | +the amount of separability of the task is low and would result in friction, |
| 206 | +and the amount of trust we need to have in the result is high, |
| 207 | +so we'd spend as much time reviewing the result as we would just doing the design work ourselves. |