Commit 0b8dbad

Selector resource budgets work proposal

1 parent c263998 commit 0b8dbad

1 file changed: +207 -0 lines changed

proposals/SelectorResourceBudgets.md

# Selectors have sufficiently predictable resource budgets to be used in low-trust environments

Authors: @warpfork

Initial PR: https://github.com/protocol/web3-dev-team/pull/27


Purpose & impact
----------------

#### Background & intent
_Describe the desired state of the world after this project? Why does that matter?_

The status quo is: we have Selectors, and they can be used to describe walks of graphs of data.
(They're sorta like regexps for DAGs, if that's a useful comparison for you.)
We want to expose these to users through APIs.

The problem is: if a service wants to accept Selectors which are user-specified,
then the user can ask the service to do arbitrarily expensive work.
This would create a way for users to take the service down (a DoS).

The intent is: we should create a resource budgeting system for Selectors.
The system should be declarative and comprehensible,
and must be something that administrators of services built with Selectors can configure in order to limit their exposure to DoS.

#### Assumptions & hypotheses
_What must be true for this project to matter?_

- Selectors are something that either our projects or our community's projects expose as an API;
- and that API is expected to be able to accept user-specified Selectors;
- and the Selector would be evaluated by a different resource owner than the author;
- and denial-of-service via maliciously crafted Selectors would be problematic.

(That sounds like a lot of conditions, but from what I can tell,
users often want to treat Selectors like they're "free" to evaluate,
and that results in folks building APIs with exactly these expectations.)

Another way to address the underlying issue is to connect Selector evaluation to a billing system,
but the work proposed here would still be required to make that kind of connection possible.
(A billing system does no good if one can submit a task that bankrupts you before the bill can be settled.)

#### User workflow example
_How would a developer or user use this new capability?_

When users ask for data from a service like IPFS,
they submit a Selector, and expect to receive a series of blocks in response
(typically in the form of a "car" or "dar" or other such format).

This workflow from the user's perspective shouldn't change significantly.

From the service host's perspective,
they should probably have some configuration file which lets them set limits
for how much data is matched by a single Selector before the service cuts off that request.

Ideally, the limit system is comprehensible enough that users can estimate the costs of a query before submitting it,
because it's typically not pleasant to get a failure after some effort has already been expended.
(It's not clear how possible this is, but if possible, it's desirable.)

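To make the service host's side a little more concrete, here is a minimal sketch in Go of what such a configuration could look like if the limit really were a single number. Everything named here is an assumption for illustration: the `SelectorLimits` type, the `maxVisitsPerRequest` field, and the choice of JSON are hypothetical, and this proposal has not yet decided the unit of the limit or the file format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SelectorLimits is a hypothetical operator-facing config for this sketch.
// The unit ("visits") is an assumption; blocks, nodes, or bytes are all
// still open design questions.
type SelectorLimits struct {
	MaxVisitsPerRequest int64 `json:"maxVisitsPerRequest"`
}

// parseLimits decodes the config and rejects limits that would either
// disable the budget or make every request fail.
func parseLimits(raw []byte) (SelectorLimits, error) {
	var l SelectorLimits
	if err := json.Unmarshal(raw, &l); err != nil {
		return l, err
	}
	if l.MaxVisitsPerRequest <= 0 {
		return l, fmt.Errorf("maxVisitsPerRequest must be positive, got %d", l.MaxVisitsPerRequest)
	}
	return l, nil
}

func main() {
	l, err := parseLimits([]byte(`{"maxVisitsPerRequest": 10000}`))
	fmt.Println(l.MaxVisitsPerRequest, err) // prints "10000 <nil>"
}
```

A single number like this is also something a service could publish, which is one possible route towards letting users estimate whether a query will fit before they submit it.
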
#### Impact
_How directly important is the outcome to web3 dev stack product-market fit?_

However important Selectors are to web3 dev stack PMF, this is that times about 0.95.

Within the relevance of Selectors: this budgeting requirement is not critical right up until it's critical.

Building publicly exposed services which accept and evaluate user-specified Selectors is an unwise thing for someone to do until this is addressed.

#### Leverage
_How much would nailing this project improve our knowledge and ability to execute future projects?_

Leverage of this is probably low.
We can already design systems using Selectors.

Assuming it's reasonable to bet that adding resource budgets to Selectors will not drastically change the way they fit together into systems overall,
this work overall is fairly deferrable without causing pipeline stalls in other work.

#### Confidence
_How sure are we that this impact would be realized? Label from [this scale](https://medium.com/@nimay/inside-product-introduction-to-feature-priority-using-ice-impact-confidence-ease-and-gist-5180434e5b15)_.

(Not really sure how to apply the numeric scale to this, sorry.)

3? 10? I think we're extremely sure that this will be a problem for certain user stories.
We can consult folks working on Filecoin features which block on this for more information.


Project definition
------------------

#### Brief plan of attack

1. Design work: Figure out what good budgeting means.
    - @warpfork's initial bet is: just having a single global counter which monotonically decreases during evaluation is the right direction.
2. Design work: Figure out how the limits should be expressed in the Selector format.
    - Should a limit value always be required at the root?
    - Should other sub-limits (i.e. ones that can only further drop the limit, not start a new budget) be allowed throughout the query?
    - What unit is the limit? Blocks or nodes? Or binary size (e.g. does selecting a large string count harder against the budget than a small one)?
    - Consider that walks with selectors are currently defined as yielding `(path,node)` pairs -- which means reaching the same data by a different path is considered distinct, and causes time to be expended on a visitation that's arguably a repeat. Do we want to revisit this? It has unfortunate performance implications on some densely linked graph structures.
    - Figure out exactly what behavior we expect from APIs when they encounter a limit -- simply halting addresses the DoS concern, but what will a user's action options be when they receive a halt due to budget exceeded? Will there be any option for resumability? Etc.
3. Design work: Work through how service operators will be able to look at a Selector and decide if they want to evaluate it or not.
    - This is a sanity-checking process for the earlier design phases.
4. Implement: in the [go-ipld-prime/traversal/selector](https://github.com/ipld/go-ipld-prime/tree/master/traversal/selector) package.
5. Test: make sure we have examples of datasets and selectors to run on them which we expect to be halted by budget limits.
    - Ideally this should be in language-agnostic test fixture files, so we can reuse them in other selectors implementations.
6. Documentation: update it.
7. Synchronize: other implementations!
    - The [ChainSafe forest](https://github.com/ChainSafe/forest/) project contains a Selectors implementation -- communicate with them about these changes!
8. Propagation to downstream, possible small migrations?
    - If we make the budget system non-optional, then existing Selector documents may not work.
    - Or, there might be no special work needed here, if the budget system is entirely optional.

#### What does done look like?
_What specific deliverables should be completed to consider this project done?_

- Selectors in the go-ipld-prime implementation should have resource budgeting.
- Test fixtures should demonstrate how a selection which halts due to resource budget exhaustion behaves.
- The resource budget declaration system should be reasonably comprehensible and look like something we can tell administrators of hypothetical services using this system how to configure.
    - Probably: it should be as simple as _one number_.

#### What does success look like?
_Success means impact. How will we know we did the right thing?_

When developers such as the Filecoin team feel comfortable exposing features using Selectors to users, then this project is a success.

#### Counterpoints & pre-mortem
_Why might this project be lower impact than expected? How could this project fail to complete, or fail to be successful?_

- Overcomplicating the budget system could result in usability failure.
  (Arguably, the current limit systems are an example of this: they're too granular, which is no substitute for a holistic system.)

- Technical consideration: Beware the "[Billion Laughs](https://en.wikipedia.org/wiki/Billion_laughs)" problem.
  (This is why this document keeps emphasizing a budget that is holistic and monotonically decreasing.)

- A system that halts but returns insufficient information about why could be frustrating to users,
  even if it successfully addresses the DoS problem.

- Keep in mind: this proposal only describes implementing this in Go.
  We do not currently have a JavaScript Selectors implementation (and creating one is a larger task).
  This is not a problem per se; it's just something to remember when considering what can be immediately built upon this work.

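To see why the "Billion Laughs" concern above applies to path-based walks, consider the toy sketch below. It builds a small DAG in which every node links twice to the same next node, so the number of `(path,node)` visits doubles at each level even though the data itself is tiny. The `Node` type and both functions are hypothetical helpers for this example only, not part of any Selectors implementation.

```go
package main

import "fmt"

// Node is a toy DAG node for this sketch (not the go-ipld-prime Node interface).
type Node struct {
	Links []*Node
}

// diamondChain builds a DAG with depth+1 distinct nodes, where each node
// links twice to the same next node.
func diamondChain(depth int) *Node {
	n := &Node{}
	for i := 0; i < depth; i++ {
		n = &Node{Links: []*Node{n, n}}
	}
	return n
}

// countVisits walks every path, decrementing the shared budget once per
// visit, and reports how many visits happened before the budget ran out.
func countVisits(n *Node, budget *int) int {
	if *budget <= 0 {
		return 0
	}
	*budget--
	visits := 1
	for _, l := range n.Links {
		visits += countVisits(l, budget)
	}
	return visits
}

func main() {
	root := diamondChain(20) // only 21 distinct nodes

	effectivelyUnlimited := 1 << 30
	fmt.Println(countVisits(root, &effectivelyUnlimited)) // prints 2097151 (2^21 - 1 path visits)

	capped := 1000
	fmt.Println(countVisits(root, &capped)) // prints 1000
}
```

Because the counter is shared across the whole walk and never resets, the cost of the request is bounded by the one number the operator chose, no matter how the graph fans out.
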
#### Alternatives
_How might this project’s intent be realized in other ways (other than this project proposal)? What other potential solutions can address the same need?_

One: See the remarks about billing in the Assumptions & hypotheses section.
Some system of resource currency could be associated with this problem as part of the solution.
(This doesn't necessarily remove the need for engineering work on the Selector system to support it, though,
which means this should probably be considered a stretch goal or future work rather than an alternative.)

Two: It's possible to work around this in some cases by building APIs around selectors,
but then only accepting a known, pre-specified set of selectors.
(If I understand correctly, this is how several pieces of Filecoin currently work around this issue.)
This is not a general workaround, though, and ruins most of the point of Selectors -- they're *supposed* to be user-specifiable.

Three: a totally distinct graph query mechanism could be proposed.
However, whatever that system is: it would have the same need for a budget mechanism.

#### Dependencies/prerequisites

- No strict dependencies known.
- Bonus/Accelerant: if the Selector implementation in go-ipld-prime were refactored to be built off a Schema and use codegen, it would probably be easier to update.
- Bonus/Stabilizer: if the documentation site which covers Selectors were connected with an automated test suite which checks that examples in the documentation actually match behavior of the libraries,
  it would be much easier to be confident in the correctness and completeness of our documentation.

#### Future opportunities

Adding resource budgets to Selectors makes them safe to use in services which accept user-defined Selectors.


Required resources
------------------

#### Effort estimate
<!--T-shirt size rating of the size of the project. If the project might require external collaborators/teams, please note in the roles/skills section below.
For a team of 3-5 people with the appropriate skills:
- Small, 1-2 weeks
- Medium, 3-5 weeks
- Large, 6-10 weeks
- XLarge, >10 weeks
Describe any choices and uncertainty in this scope estimate. (E.g. Uncertainty in the scope until design work is complete, low uncertainty in execution thereafter.)
-->

Probably "Medium, 3-5 weeks".

It's not Small, because the design phases shouldn't be skimped on.
(It will be easy to implement something that compiles, but doesn't solve the problem correctly;
therefore it seems unwise to try to cram this into a small 1-2 week timeline.)
(_Maybe_ it will turn out to be small, but I'd rather greet that as a pleasant surprise.)

It's not likely to be Large (6-10 weeks) because there's just not that much work to do here if tackled by a team.
(It's renovation work and a new feature within an existing system, not a whole new system.)

The "resumability" consideration should probably be considered out of scope;
otherwise the effort estimate increases significantly and the confidence decreases significantly.

Other Selectors implementations will not necessarily be updated during this work period;
however, these are maintained by teams outside of PL, so this is natural:
we should just aim to leave them set up and aware of what they would need to do.

#### Roles / skills needed

- Golang developers (work is required in go-ipld-prime)
    - Bonus if they're already familiar with Selectors

I probably wouldn't recommend trying to spin this out to a community or external team.
The task size isn't big enough to be worth the overhead,
the amount of separability of the task is low and would result in friction,
and the amount of trust we need to have in the result is high,
so we'd spend as much time reviewing the result as we would just doing the design work ourselves.
