Skip to content

Problems stemming from late loopback.context initialization #1651

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
doublemarked opened this issue Sep 7, 2015 · 17 comments
Closed

Problems stemming from late loopback.context initialization #1651

doublemarked opened this issue Sep 7, 2015 · 17 comments

Comments

@doublemarked
Copy link

The loopback.context middleware is enabled by default via the loopback.rest middleware. With this comes some rudimentary configuration, including enableHttpContext. The default initialization is great, but the delayed mechanism by which its initialized can produce unexpected results.

Specifically - functionality outlined in the sample code of the context documentation (https://docs.strongloop.com/display/LB/Using+current+context) produces a scenario where it's necessary to initialize loopback.context earlier. An excerpt:

app.use(loopback.context());
app.use(loopback.token());
app.use(function setCurrentUser(req, res, next) {
[...]

This explicit initialization of loopback.context and loopback.token is necessary, otherwise the middleware (which necessarily runs before loopback.rest) will not have access to the context. Afterwards, because the context was initialized early, we lose support for the configuration loading present in the loopback.rest init process. To more accurately load the context we would need to either hard-code the enableHttpContext value here, or perform our own config loading. The same holds true for token.

All of this is relatively easy to resolve, provided the developer is willing to spend some hours digging through loopback code trying to figure out why enableHttpContext doesn't work. Anyhow.

Related Questions:

  1. Why is loopback.context initialized so late? It is quite useful for middleware to have access to it in order to pass things on to the API. Should it perhaps be initialized very early?
  2. What's the point of enableHttpContext being configurable, anyways? Isn't this a reasonable default to include in the context?

A low-impact recommendation: Provided I have interpreted the above correctly, should the sample code in the documentation perhaps be changed to explicitly include a value for enableHttpContext? Like this,

app.use(loopback.context({ enableHttpContext: true }));
app.use(loopback.token());
app.use(function setCurrentUser(req, res, next) {
[...]
@doublemarked
Copy link
Author

Adding something small to this - I've taken to using middleware.json to early load loopback.context and loopback.token so that it's available to our other middleware,

   "initial:before": {
      "loopback#context": {
         "params": { "enableHttpContext": true }
      },
      "loopback#token": {}
   }

I'm happy to update the "Using current context" documentation to recommend this, but it's not clear to me whether this is inline with the intentions for when context and token should be used/available.

@superkhau
Copy link
Contributor

@bajtos @ritch @raymondfeng ^

@bajtos
Copy link
Member

bajtos commented Sep 11, 2015

Hey @doublemarked, thank you for writing down this detailed analysis!

What's the point of enableHttpContext being configurable, anyways? Isn't this a reasonable default to include in the context?

The way I see "current context", it should contain transport-agnostic data only, so that your clients can easily switch between transports when needed (REST, JSON-RPC, WebSocket, etc.) By including HTTP req & res objects in the context, it becomes very easy to couple model methods with REST transports and thus make them unusable when a different transport is used.

OTOH, I do realise that multi-transport remains just an theoretical concept not used in practice (so far).

Why is loopback.context initialized so late? It is quite useful for middleware to have access to it in order to pass things on to the API. Should it perhaps be initialized very early?

The current implementation is optimised to make it easy to setup a full app with as few lines of code as possible:

var app = loopback();
app.use(loopback.rest());
// bang, everything is included

Of course, if your middleware setup is more complex than the above, then yes, you should initialise the context middleware very early.

I am proposing the following change: modify the project template in loopback workspace (source) and add explicit configuration of context & token middleware.

{
  "initial": {
    "compression": {},
    "cors": { /* ... */ },
    "loopback#context": {
      "params": { "enableHttpContext": false }
    }
  },
  // ...
  "auth": {
    "loopback#token": {},
  },
  // ...
}

Thoughts?

@doublemarked
Copy link
Author

Thank you @bajtos for the detailed response! It helps a lot to hear the rationale for some of the code design. I understand much more clearly now the motivations.

I find your proposed solution perfect. A small connected question - should the default configuration for enableHttpContext in config.json then also be removed?

@pulkitsinghal
Copy link

Based on this conversation and where it all began (#337), here's what I understand:

  1. Context is a container that carries stuff and should be parallel safe.
  2. The most common use-case is for it to carry currentUser which is transport-agnostic.
  3. A debatable use-case is for it to carry req and res which are transport-specific. And the current default for that in loopback-workspace (templates) is enableHttpContext: false
  4. In order to setup a full app with as few lines of code as possibletheloopback#restscaffoldscontextandtokenby default. But it is smart enough such that if someone sets them up earlier in the pipeline then the subsequent runs insideloopback#rest` are no-ops

A small connected question - should the default configuration for enableHttpContext in config.json then also be removed?

@doublemarked - If i have understood it all correctly then I'm inclined to agree with you that its duplicate work and can probably can be removed.

But even if it was left there it wouldn't do any immediate harm ... although the next time changes need to happen, and its not someone in the "know" then that developer will tear their hair out trying to figure out the significance of the same config in two places and probably wish for better cleanup :)

Disclaimer: @bajtos could still come in from an omniscient angle and inform us that there is an arcane code workflow that is not covered when middleware is moved up the phase chain so take my +1 with a grain of salt.

@bajtos
Copy link
Member

bajtos commented Sep 15, 2015

I find your proposed solution perfect. A small connected question - should the default configuration for enableHttpContext in config.json then also be removed?

I am proposing to go even one step further and prevent the REST adapter from adding any context middleware at all by setting context: false. Here is a link to loopback code handling this case: server/middleware/rest.js.

{
  "remoting": {
    "context": false,
    "rest": {
      "normalizeHttpPath": false,
      "xml": false
    },
    // etc.
  }
}

Here is a use case (user story?) to show the benefit of such approach.

As a LoopBack user, I start by scaffolding a new loopback application via slc loopback. Then I implement the models, etc. and eventually run performance tests. At which point I discover that the context middleware is causing some slowdown and since I am not using the current context at all, I'd like to get rid of this feature.

A quick search shows that server/middleware.json defines loopback#context middleware, so I simply remove this entry from the JSON file.

And here is the catch: if server/config.json did not disable auto-added context middleware, then the change above would be effectively a no-op, the context middleware would be still loaded by loopback.rest. The poor user will experience a WTF moment.

Thoughts?

@doublemarked
Copy link
Author

I agree with your use-case, but it highlights what I feel is more directly the problem - there are two locations for configuring the same module, and now both would be present in the default configuration files. And what's more, one of them is additive (loopback#context is added to middleware.json to add features) and the other is subtractive (context: false is added to remove features). It may be a bit confusing. For instance, for somebody new to the project it may not be clear whether the setting in config.json setting is capable of overriding the middleware.json value.

I'm having trouble coming up with an easy solution to suggest that fits our stated objectives. However - we might consider having a set of "default middleware configuration" that is enabled in absence of explicit configuration. Currently we do have middleware defaults, but these are handled module by module, like the default stuff enabled in loopback.rest. This is a dirty way of dealing with defaults.

Instead, there should simply be an internalized version of middleware.json that, by default, includes the same features present in the default file. The absence of middleware.json (or any of its friends) enables the default configuration. The mere presence of explicit configuration overrides this default configuration. This way, a user can simply remove loopback#context from middleware.json and have expected behavior. There would be no default context configuration in config.json, and eventually any context configuration in that file would be deprecated.

This may produce some backwards compatibility issues that would prevent this from being viable in the near term. I have not yet thought through that aspect.

Thoughts?

@bajtos
Copy link
Member

bajtos commented Sep 15, 2015

There are two locations for configuring the same module, and now both would be present in the default configuration files.

That's a good point, I understand how confusing this is for new LoopBack users.

I'm having trouble coming up with an easy solution to suggest that fits our stated objectives. However - we might consider having a set of "default middleware configuration" that is enabled in absence of explicit configuration. Currently we do have middleware defaults, but these are handled module by module, like the default stuff enabled in loopback.rest. This is a dirty way of dealing with defaults.

Instead, there should simply be an internalized version of middleware.json that, by default, includes the same features present in the default file. The absence of middleware.json (or any of its friends) enables the default configuration. The mere presence of explicit configuration overrides this default configuration. This way, a user can simply remove loopback#context from middleware.json and have expected behavior. There would be no default context configuration in config.json, and eventually any context configuration in that file would be deprecated.

I see two possibly independent changes in your proposal.

  1. No default context configuration in server/config.json
  2. Replace per-module middleware defaults with global per-app middleware defaults.

1) No default context configuration in server/config.json

This makes a lot of sense. I would go even further and propose to move all transport-related configuration (json, urlencoded, cors) from config.json to server/middleware.json.

Before:

// server/config.json
  "remoting": {
    "context": {
      "enableHttpContext": false
    },
    "rest": {
      "normalizeHttpPath": false,
      "xml": false
    },
    "json": {
      "strict": false,
      "limit": "100kb"
    },
    "urlencoded": {
      "extended": true,
      "limit": "100kb"
    },
    // ...
  }
}

Step 1: move remoting configuration from app.remotes() to loopback.rest(options). We need this API to support code-first app.use(loopback.rest(options)).

// server/middleware.json
{
  "routes": {
    "loopback#rest": {
      "paths": ["${restApiRoot}"],
      "params": {
        "normalizeHttpPath": false,
        "xml": false
        "context": {
          "enableHttpContext": false
        },
       "json": {
         "strict": false,
         "limit": "100kb"
       },
      "urlencoded": {
        "extended": true,
        "limit": "100kb"
      },
    }
}

Step 2: rework the template to disable all built-in (auto-added) middleware and configure everything explicitly.

// server/middleware.json
{
  "initial": {
    "loopback#context": {
      "params": { "enableHttpContext": false }
    }
  },
  "auth": {
    "loopback#token": {},
  },
  "parse": {
    "body-parser#json": {
      "params": {
         "strict": false,
         "limit": "100kb"
       },
    }
   },
   "body-parser#urlencoded": {
     "params": {
        "extended": true,
        "limit": "100kb"
      },
   }
  }
  "routes": {
    "loopback#rest": {
      "paths": ["${restApiRoot}"],
      "params": {
        "normalizeHttpPath": false,
        "xml": false,
        "autoloadDependencies": false,
      }
    }
}

2) Replace per-module middleware defaults with global per-app middleware defaults.

I am not sure if this is a good idea. I am concerned that this may add unnecessary dependencies to applications that use a different transport than REST.

In other words, the per-app default middleware configuration depends on which transport middleware (REST, JSON-RPC, WebSocket, etc.) the application wants to use. Applications using WebSockets probably don't need JSON body parser by default.

That's why I like more the current approach, where each module (transport middleware) adds just the middleware it requires, and nothing more.

Thoughts?

This may produce some backwards compatibility issues that would prevent this from being viable in the near term. I have not yet thought through that aspect.

Yes, backwards (in)compatibility will be important here. Perhaps we can split the work in two parts and start with implementing additive changes that are backwards compatible?

@doublemarked
Copy link
Author

Hi there - sorry for the delayed response here.

  1. No default context configuration in server/config.json

This makes a lot of sense. I would go even further and propose to move all transport-related configuration (json, urlencoded, cors) from config.json to server/middleware.json.
[examples snipped]

Great! This will be a big improvement. I love it. It also removes the boiler plate code from server/boot.

Q: should loopback.rest have a short-circuit included to prevent it from double initialization/execution?

  1. Replace per-module middleware defaults with global per-app middleware defaults.

I am not sure if this is a good idea. I am concerned that this may add unnecessary dependencies to applications that use a different transport than REST.

I'm with you. Thanks for this reminder of the design objectives.

Yes, backwards (in)compatibility will be important here. Perhaps we can split the work in two parts and start with implementing additive changes that are backwards compatible?

Yep, sounds good. Am I correct in seeing that the additive changes will be primarily to the template, without substantial code changes?

Tangential question: Your config sample has autoloadDependencies - what is this?

@bajtos
Copy link
Member

bajtos commented Sep 21, 2015

Q: should loopback.rest have a short-circuit included to prevent it from double initialization/execution?

I think it's not necessary, as loopback.rest always handles the request (e.g. by sending 404 for unknown paths). At least in the default configuration.

Am I correct in seeing that the additive changes will be primarily to the template, without substantial code changes?

I hope so :)

Tangential question: Your config sample has autoloadDependencies - what is this?

I'm glad you have asked. My original idea was that this new option can serve as a shortcut for context:false,json:false,urlencoded:false. However, on the second thought, I think it's better to explicitly list the dependent middleware which we want to disable.

This becomes important when loopback.rest adds a new dependent middleware in the future, as it will simplify upgrade of existing applications.

Because existing apps will be unaware of that new dependency (it won't be disabled in rest config and configured in middleware config), the auto-loading mechanism will kick in and ensure that the rest transport has all dependencies loaded.

If we were using "autoloadDependencies" to disable all dependent middleware, then we would end up with an application that tells rest middleware to not load the new dependency, but at the same time the app won't configure the new dependency explicitly.

@doublemarked
Copy link
Author

So, I spent some time tonight poking around at this topic. I've now done my homework on how loopback.rest interacts with strong-remoting and start to better understand your outline from two messages ago.

As I understand it - we extract all middlewares that have been implicitly loaded via loopback.rest, including those implicitly loaded by strong-remoting's RestAdapter and explicitly configure them via middleware.json. The behavior of these, as configured by default in middleware.json, must match the behavior when the configuration is absent from middleware.json (which I hope matches current default config).

Specifically, the following middlewares will now be configured by default in middleware.json: loopback.context, loopback.token, loopback.rest, body-parser.json, body-parser.urlencoded, cors.

Question/Concern: The strong-remoting RestAdapter is providing special behavior related to body-parser.json and cors which will not be possible if we are early initializing these modules:

Question/Concern: The RestAdapter.errorHandler is a final piece of middleware loaded by RestAdapter to process errors in a consistent manner. Unfortunately, this handler is configured explicitly via the top-level remotes config instead of using the configuration passed to RestAdapter. This will preclude us moving errorHandler: { disableStackTrace: false } into middleware.json. Ideas?

Question/Concern: While the body-parser module is smart enough to short-circuit itself if it's being reinitialized, it appears that cors is not. This means that, without explicitly disabling cors in strong-remoting, it will execute twice.

Tangential Question: Why are some 3rd party middlewares (favicon, responseTime, timeout, etc.) incorporated into loopback despite no internal usage, while cors is simply made a project dep? Unclear to me under what conditions a middleware should be added to loopback/lib/express-middleware.js.

Setting aside the above questions, here is a summary of the changes as I currently see them:

  1. loopback.rest: Include support for passing options, and pass these options along to the strong-remoting rest handler when we initialize it. Options passed to loopback.rest will override any provided by remoting.rest config.
  2. loopback-workspace template: Removal of remoting from default config
  3. loopback-workspace template: Addition of explicit configuration for all middlewares in middleware.json.

Looking forward to hearing back from you.

@bajtos
Copy link
Member

bajtos commented Oct 12, 2015

Hi @doublemarked, sorry for a late reply, your thorough analysis requires more time to digest it.

Question/Concern: The strong-remoting RestAdapter is providing special behavior related to body-parser.json and cors which will not be possible if we are early initializing these modules:

In general, I think we should upstream any customizations of 3rd party middleware.

https://github.com/strongloop/strong-remoting/blob/master/lib/rest-adapter.js#L217-L227

"tolerate empty json" - this may be rather controversial. We are returning an empty object when the body is empty, however I can imagine that null or even an empty array may be valid alternatives too. Perhaps this should be added as configuration option of body-parser.json and be disabled by default?

https://github.com/strongloop/strong-remoting/blob/master/lib/rest-adapter.js#L239-L247

This is a performance optimisation, I hope it should be accepted to CORS without much resistance.

Question/Concern: The RestAdapter.errorHandler is a final piece of middleware loaded by RestAdapter to process errors in a consistent manner. Unfortunately, this handler is configured explicitly via the top-level remotes config instead of using the configuration passed to RestAdapter. This will preclude us moving errorHandler: { disableStackTrace: false } into middleware.json. Ideas?

Can we rework RestAdapter to load the configuration both from remotes config and RestAdapter options? IMO, RestAdapter options specified in middleware.json should override remotes config.

Question/Concern: While the body-parser module is smart enough to short-circuit itself if it's being reinitialized, it appears that cors is not. This means that, without explicitly disabling cors in strong-remoting, it will execute twice.

I am not an expert on CORS, but I would expect that multiple invocations of CORS middleware would result in a more strict policy being applied. In which case multiple invocations remain a performance problem only. Which is a minor issue IMO, because the solution is quick & easy.

Thoughts?

Why are some 3rd party middlewares (favicon, responseTime, timeout, etc.) incorporated into loopback despite no internal usage, while cors is simply made a project dep?

favicon is incorporated because we provide a default icon image. static is included because that's what express includes too.

Other middleware (responseTime, timeout, etc.) is not included, we only provide getter methods to simplify migration from LoopBack 1.x (express 3.x), which included much more middleware.

Unclear to me under what conditions a middleware should be added to loopback/lib/express-middleware.js.

Only middleware that is provided via express.{name} by [email protected] should go to that file.

However, note that we actually do include few more middleware in LoopBack, see loopback's server/middleware.

Setting aside the above questions, here is a summary of the changes as I currently see them:

  1. loopback.rest: Include support for passing options, and pass these options along to the strong-remoting rest handler when we initialize it. Options passed to loopback.rest will override any provided by remoting.rest config.
  2. loopback-workspace template: Removal of remoting from default config
  3. loopback-workspace template: Addition of explicit configuration for all middlewares in middleware.json.

Sounds good 👍

I think RestAdapter's customization of JSON & CORS need to be upstreamed as step 0.

@doublemarked
Copy link
Author

Hey @bajtos - my turn to apologize. I'm sorry, it's been quite a long delay here on my side.

However - as is probably apparent, the size of this has ballooned a good bit (now stretching into three projects), and has started to exceed the bandwidth I have available to get involved, particularly as we shift more into less familiar territory (strong-remoting, etc.). I've been holding out, hoping to dig deeper here, but it's been six weeks ;)

That said, I want to continue the discussion, which I hope may be valuable. Comments below.

https://github.com/strongloop/strong-remoting/blob/master/lib/rest-adapter.js#L217-L227

"tolerate empty json" - this may be rather controversial. We are returning an empty object when the body is empty, however I can imagine that null or even an empty array may be valid alternatives too. Perhaps this should be added as configuration option of body-parser.json and be disabled by default?

Sounds like a reasonable solution (config driven), though it may cause unexpected changes in behavior for people depending on the current functionality. However I agree that burying controversial exceptions like this is undesirable.

Question/Concern: The RestAdapter.errorHandler is a final piece of middleware loaded by RestAdapter to process errors in a consistent manner. Unfortunately, this handler is configured explicitly via the top-level remotes config instead of using the configuration passed to RestAdapter. This will preclude us moving errorHandler: { disableStackTrace: false } into middleware.json. Ideas?

Can we rework RestAdapter to load the configuration both from remotes config and RestAdapter options? IMO, RestAdapter options specified in middleware.json should override remotes config.

That's my preference as well. It looks like @digitalsadhu committed something to fix this in October, immediately after you wrote your message. May not be a coincidence ;)

Question/Concern: While the body-parser module is smart enough to short-circuit itself if it's being reinitialized, it appears that cors is not. This means that, without explicitly disabling cors in strong-remoting, it will execute twice.
I am not an expert on CORS, but I would expect that multiple invocations of CORS middleware would result in a more strict policy being applied. In which case multiple invocations remain a performance problem only. Which is a minor issue IMO, because the solution is quick & easy.
Thoughts?

You may be right, though I am wary of placing so much hope/expectation in 3rd party modules the that they behave properly/efficiently when added twice. IIRC the objective of enabling things here (in RestAdapter) is to allow for configuration-less setup, and there may eventually be a scenario where there's a piece of middleware that can be meaningfully added twice in other use cases but not in ours, preventing us from depending on the vendor to short-circuit. But you're right, this is not the case with the ones I cited.

I think RestAdapter's customization of JSON & CORS need to be upstreamed as step 0.

Agreed!

@bajtos
Copy link
Member

bajtos commented Nov 27, 2015

favicon is incorporated because we provide a default icon image.

FWIW, we are discussing that favicon should be removed: #1827

@bajtos
Copy link
Member

bajtos commented Nov 27, 2015

Hey @doublemarked, thank you for the update. I agree this has grown much more that we anticipated in the beginning. I'll try to revisit this issue and our discussion in order to come up with some smallish incremental steps that are beginner-friendly. However, I can't promise any timeframe, it may take few weeks :(

@bajtos
Copy link
Member

bajtos commented Jan 9, 2017

Ouch, those few weeks to come back to this issue grew into a year :(

Some parts of this discussion is no longer relevant:

  • remoting.context is disabled in project template (source) now
  • remoting.cors is disabled too (source)

The remaining parts (moving strong-remoting configuration from server/config.json to server/middleware.json, etc.) are still relevant, but they don't seem to be highly important considering how long this issue was sitting here waiting.

@doublemarked what's your opinion? Are you still keen to contribute some of the changes described above? Or should we closing this issue? Any other idea?

@doublemarked
Copy link
Author

@bajtos I agree that this has grown stale and that much of it is no longer relevant. I'm in favor of closing it out. Thanks for following up on it, regardless of the delay :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants