Skip to content

Tracing does not work with express #302

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
prashanth007 opened this issue Oct 4, 2018 · 29 comments
Closed

Tracing does not work with express #302

prashanth007 opened this issue Oct 4, 2018 · 29 comments
Labels
community integrations question Further information is requested

Comments

@prashanth007
Copy link

prashanth007 commented Oct 4, 2018

Distributed Tracing does not seem to work with Express and Redis. I also use UWS for websocket support.

The following are my dependencies.

"dependencies": {
"body-parser": "^1.18.2",
"express": "^4.16.2",
"log4js": "^2.3.7",
"nodemailer": "^4.3.0",
"redis": "^2.8.0",
"request": "^2.83.0",
"request-promise": "^4.2.2",
"retry": "^0.10.1",
"uws": "^9.14.0",
"valid-url": "^1.0.9"
}

Our app runs in AWS Elastic Beanstalk

Nodejs version : 8.11.3

dd-trace-js version : 0.60.0

Instructions that i followed to install Datadog in Elasticbeanstalk

https://www.datadoghq.com/blog/deploy-datadog-aws-elastic-beanstalk/

@rochdev
Copy link
Member

rochdev commented Oct 4, 2018

Could you describe the behavior you are experiencing? What do you mean by "Distributed Tracing does not seem to work"? Do you mean it doesn't work at all? Or does it work only partially?

Also, what is the language of each service? (client + server)

@rochdev rochdev added question Further information is requested community integrations labels Oct 4, 2018
@prashanth007
Copy link
Author

prashanth007 commented Oct 4, 2018

The tracing appears like below
screenshot 2018-10-04 at 3 46 18 pm

My question is why is it not showing up the spans of redis

@rochdev
Copy link
Member

rochdev commented Oct 4, 2018

Could you share a snippet of how the tracer is instantiated, as well as express and redis?

@prashanth007
Copy link
Author

const tracer = require('dd-trace').init( { env : process.env.NODE_ENV , service : 'RTMServer' } )

const redis    = require( 'redis' );
this.redisClient = this.createRedisClient();
const server   = require( 'http' ).createServer();
const app      = require( './app.js' );
const wss      = require( './connection.js' )( server );

class SocketServer {
	constructor()
	{
		server.on( 'request' , app );
		server.listen( port , () => {
			logger.info( 'Listening on ' + server.address().port );
		} );
		process.on( 'SIGINT' , this.shutdown );
		process.on( 'SIGTERM' , this.shutdown );
	}
}

@rochdev
Copy link
Member

rochdev commented Oct 4, 2018

Is this tracer initialization done before redis is required?

Also, are you getting individual redis traces in another service in the UI or you are not getting these spans at all?

@prashanth007
Copy link
Author

prashanth007 commented Oct 5, 2018

Yeah i am initializing the tracer before redis. It appears as a seperate service also showing a span for each redis command call.

@prashanth007
Copy link
Author

Could anyone help us on this? What are we doing wrong here?

@rochdev
Copy link
Member

rochdev commented Oct 10, 2018

I see that one of the dependencies is request-promise which uses Bluebird under the hood. Is this used between the time the request is received and the call to Redis?

The reason I'm asking is because Bluebird uses an internal queue that doesn't play well with context propagation. If this is the case, I can help to work around the issue until we improve support for Bluebird.

@prashanth007
Copy link
Author

prashanth007 commented Oct 11, 2018

No the request promise module is used to just make only one call to an external api. But the tracing info is not available for any requests that comes into server.

I also suspect the redis module. I saw in some other forum that redis module doesnot play well with async context propagation

@prashanth007
Copy link
Author

Please do also let me know the work around i will try and see if that fixes the issue.

@rochdev
Copy link
Member

rochdev commented Oct 11, 2018

The workaround would basically be to reactivate the correct scope before the Redis call.

For example it's possible to grab the current span at the beginning of the route handler:

const span = tracer.scopeManager().active()

And then reactivate it just before sending a Redis command

tracer.scopeManager().activate(span)

Of course this is not ideal, and you would have to propagate that span manually. What you are experiencing however should never happen, and if you can pinpoint where the context is lost, I'll take a look as to how we can prevent that. One way to find the source of the problem is to debug and put a watch on the above snippet to check that the scope is still active. When the watcher returns null, then you know the scope was lost.

@prashanth007
Copy link
Author

prashanth007 commented Oct 16, 2018

I just tried the above solution and found out that the span context is being lost during express body parser. I also tried getting the context and activating the span later but i am getting the following error

Error: Expected [object Object] to be an instance of SpanContext.

So I took the span out and set it to the request object as well as the response.locals object and tried access it post the json parser middleware. The span context is still an object, but when i set it back as activate it throws the above during the subsequent call to redis.

Edit : const span = tracer.scopeManager().active().span();
tracer.scopeManager().activate(span)

let getContext = ( req , res , next ) => {
req.span = global.tracer.scopeManager().active()
.span();
next();
};
let attachContext = ( req , res , next ) => {
global.tracer.scopeManager().activate( req.span );
next();
};

so i added getcontext and attachcontext before and after json parser which fixed the problem now and i can see the traces.

Thank you for your time.

It would be really good if there is a guide on how to run the agent v6 in elasticbeanstalk. Instructions here https://www.datadoghq.com/blog/deploy-datadog-aws-elastic-beanstalk/ work only for v5.

@rochdev
Copy link
Member

rochdev commented Oct 16, 2018

I just tried the above solution and found out that the span context is being lost during express body parser.

This was a problem in the past, but should no longer happen in newer versions since we reactivate the span in every middleware similar to what you did:

web.reactivate(req)

// Reactivate the request scope in case it was changed by a middleware.
reactivate (req) {
req._datadog.scope && req._datadog.scope.close()
req._datadog.scope = req._datadog.tracer.scopeManager().activate(req._datadog.span)
},

Can you make sure that the version actually installed is really 0.6.0? If you can confirm this is the case, then can you provide a minimal snippet to reproduce the problem? (i.e. just a small express app with a health check that logs the current span to show it's not available). Then I'll use this to reproduce on my end since for me it works with body-parser. The helpers you had to add should definitely not be necessary.

It would be really good if there is a guide on how to run the agent v6 in elasticbeanstalk. Instructions here https://www.datadoghq.com/blog/deploy-datadog-aws-elastic-beanstalk/ work only for v5.

If you are using Docker, take a look at https://docs.datadoghq.com/integrations/amazon_elasticbeanstalk/

If you are deploying directly to EC2s, take a look at https://wimdeblauwe.wordpress.com/2018/07/06/datadog-integration-with-aws-elastic-beanstalk-for-spring-boot-application/

@prashanth007
Copy link
Author

prashanth007 commented Oct 18, 2018

I just made sure that the version installed is 0.6.0 and it still doesn't work without the fix. I will also make a sample app as you requested to demonstrate the issue.

The instructions provided for agent installation has a big problem in it.
If the user is installing datadog agent for the first time then it would work fine, but if he upgrading from v5, the instructions does not work. so i had to manually update the agent in every host and then use the instructions above.

Now when i pushing a guage metric to datadog. it does not sum up the values from the processes, resulting in a mismatch.

For Example :
I have four processes running in two hosts( 2 processes per host).

I have around 1.8k websockets connected in average to each process( 7.2k connections in total ).

screenshot 2018-10-18 at 1 38 33 pm

But as the graph shows its just 3.6k in total. My guess is that it misses to aggregate the values submitted on the process level.

@rochdev
Copy link
Member

rochdev commented Oct 19, 2018

Are these custom metrics or metrics from one of our integrations? In any case it would probably be best to either open an issue in our agent repo or open a ticket with support.

@ignaciocavina
Copy link

Hi! I'm using superagent (https://www.npmjs.com/package/superagent) to perform a http call, and I'm having the same issue as described here. Any clues on how to fix it?

@rochdev
Copy link
Member

rochdev commented Aug 5, 2019

@ignaciocavina Could you clarify which of the issues mentioned above? What is the expected behavior in your case, and what are you seeing instead?

@ignaciocavina
Copy link

ignaciocavina commented Aug 5, 2019 via email

@rochdev
Copy link
Member

rochdev commented Aug 6, 2019

I turned on debug and got an error saying that a span was expected but got
null.

This error has been fixed in 0.14.0

I did debug the trace and could verify that there is an active span until
the very first inner parts of superagent, but then it is lost.

It's possible there is some incompatibility with superagent. I'll have to look into it as it may require a new plugin to be added to explicitly support it. This has happened for several other libraries that were not correctly propagating their internal scope using AsyncResource.

I could give you more pointers to where that happens, if needed.

If you can pinpoint exactly where the scope is lost within the library, that would be very helpful.

@ignaciocavina
Copy link

ignaciocavina commented Aug 6, 2019 via email

@rochdev
Copy link
Member

rochdev commented Aug 7, 2019

Thanks for the additional information, I'll look into it.

About the error, this is very odd as there is now a check on the presence of the span context, which null and undefined won't pass. I wonder if something is causing the span context to actually be the null string and not actually null? See https://github.com/DataDog/dd-trace-js/pull/639/files

@ignaciocavina
Copy link

ignaciocavina commented Aug 7, 2019 via email

@rochdev
Copy link
Member

rochdev commented Aug 9, 2019

@ignaciocavina I am unable to reproduce locally. Can you provide a minimal snippet that I could use to replicate the behavior?

@ignaciocavina
Copy link

ignaciocavina commented Aug 9, 2019 via email

@rochdev
Copy link
Member

rochdev commented Aug 14, 2019

@ignaciocavina The issue is that superagent uses a thenable object that is not a promise, which is a pattern that Node doest not support.

A possible workaround is to call then before calling await, for example await request.get(url).then(x => x). In general, libraries that do this should be avoided when context propagation is needed.

I'm working on an experimental workaround directly from within the tracer, but it does come with a performance cost, so there is a tradeoff to use it. I'll update the thread when it's released.

@ignaciocavina
Copy link

Thanks a lot @rochdev !

@rochdev
Copy link
Member

rochdev commented Aug 20, 2019

@ignaciocavina Let's continue the discussion in #654 since it's unrelated to the original issue of this thread.

@andrewsouthard1
Copy link

@prashanth007 - are you still having this issue? Closing for now but please feel free to open a support ticket and we can investigate further.

@rochdev
Copy link
Member

rochdev commented Sep 24, 2020

Thenables are now supported properly out of the box in Node >=12.18.3 or >=14.5.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community integrations question Further information is requested
Projects
None yet
Development

No branches or pull requests

4 participants