-
Notifications
You must be signed in to change notification settings - Fork 28
add migration 0490_job_st_fin_allrows_backpop
#4367
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
add migration 0490_job_st_fin_allrows_backpop
#4367
Conversation
Is the squawk failure a matter of it being confused by the |
There’s already a 487 now so you’ll need to update the version number in this PR |
3ffb771
to
417eefb
Compare
0487_job_st_fin_allrows_backpop
0490_job_st_fin_allrows_backpop
jobs.update() | ||
.where( | ||
jobs.c.job_status == "finished", | ||
jobs.c.scheduled_for < datetime.datetime.utcnow() - datetime.timedelta(days=1), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most jobs don’t have a scheduled_for
date (only ones where the user choose a later time to send them). The processing_started
column is a more reliable indicator of when they were sent.
jobs.c.scheduled_for < datetime.datetime.utcnow() - datetime.timedelta(days=1), | |
jobs.c.processing_started < datetime.datetime.utcnow() - datetime.timedelta(days=1), |
Some example jobs from the Notify service in production here
job_status | scheduled_for | processing_started | processing_finished |
---|---|---|---|
finished all notifications created | 2025-02-13 15:06:00.585552 | 2025-02-13 15:06:00.890576 | |
finished all notifications created | 2025-02-13 14:34:31.734979 | 2025-02-13 14:34:32.065425 | |
finished all notifications created | 2025-02-13 14:04:45.689966 | 2025-02-13 14:04:46.028345 | |
finished all notifications created | 2025-02-12 13:15:40.374586 | 2025-02-12 13:15:40.552505 | |
finished all notifications created | 2025-02-12 13:13:25.01338 | 2025-02-12 13:13:25.207884 | |
finished all notifications created | 2025-02-12 13:10:19.987049 | 2025-02-12 13:10:20.18051 | |
finished all notifications created | 2025-02-12 13:08:14.185305 | 2025-02-12 13:08:14.375147 | |
finished all notifications created | 2025-02-12 13:05:08.103867 | 2025-02-12 13:05:08.251264 | |
finished all notifications created | 2025-02-12 12:59:40.527564 | 2025-02-12 12:59:40.693566 | |
finished all notifications created | 2025-02-12 12:53:26.41888 | 2025-02-12 12:53:26.601478 | |
finished all notifications created | 2025-02-12 12:48:01.024616 | 2025-02-12 12:48:01.197442 | |
finished all notifications created | 2025-02-04 10:44:56.358388 | 2025-02-04 10:44:56.570331 | |
finished all notifications created | 2025-02-04 10:42:47.799703 | 2025-02-04 10:42:48.065459 | |
finished all notifications created | 2025-01-21 13:39:11.814203 | 2025-01-21 13:39:17.399959 | |
finished all notifications created | 2025-01-17 09:40:43.544483 | 2025-01-17 09:41:00.544936 | |
finished | 2024-12-19 16:56:52.850092 | 2024-12-19 16:57:07.990041 | |
finished | 2024-12-11 16:27:15.68192 | 2024-12-11 16:27:16.155975 | |
finished | 2024-12-11 16:10:47.650919 | 2024-12-11 16:10:48.851847 | |
finished | 2024-12-04 16:22:31.478881 | 2024-12-04 16:25:53.121038 | |
finished | 2024-11-28 13:55:19.573986 | 2024-11-28 13:55:25.801151 | |
finished | 2024-11-21 16:47:27.065497 | 2024-11-21 16:47:31.796688 | |
finished | 2024-11-19 15:36:51.598946 | 2024-11-19 15:36:57.850349 | |
finished | 2024-11-14 11:04:40.956831 | 2024-11-14 11:04:47.690653 | |
finished | 2024-11-05 15:00:52.806905 | 2024-11-05 15:04:49.979648 | |
finished | 2024-10-18 11:58:45.601174 | 2024-10-18 11:58:45.905006 | |
finished | 2024-10-17 15:49:58.747644 | 2024-10-17 15:50:01.29511 | |
finished | 2024-10-15 08:00:00 | 2024-10-15 08:00:04.948226 | 2024-10-15 08:00:37.527912 |
finished | 2024-10-15 08:00:00 | 2024-10-15 08:00:02.910461 | 2024-10-15 08:01:38.208428 |
finished | 2024-10-15 08:00:00 | 2024-10-15 08:00:02.486629 | 2024-10-15 08:00:09.805449 |
finished | 2024-10-15 08:00:00 | 2024-10-15 08:00:02.384441 | 2024-10-15 08:02:17.828845 |
finished | 2024-08-06 08:00:00 | 2024-08-06 08:00:02.735349 | 2024-08-06 08:00:04.670112 |
finished | 2024-08-01 14:31:09.607592 | 2024-08-01 14:31:20.69365 | |
finished | 2024-07-17 08:00:00 | 2024-07-17 08:00:01.380056 | 2024-07-17 08:00:05.30163 |
finished | 2024-07-09 10:31:34.372538 | 2024-07-09 10:31:43.973822 | |
finished | 2024-06-25 11:09:05.333171 | 2024-06-25 11:11:45.1229 |
select job_status, scheduled_for, processing_started, processing_finished from jobs where service_id = 'd6aa2c68-a2d9-4437-ab19-3ae8eb202553
' and job_status != 'cancelled' order by 3 desc limit 35
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmmmmmmmmmmm the problem is there isn't an index on processing_started
, so there isn't a way of picking them out without a table scan which might take ages 😢
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
select count(*) from jobs;
count |
---|
2255339 |
Regardless of how you select
them, is it even safe to update that many in one go?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to pg_stats
this column's null_frac
(the proportion of values that are null) is 0.004, so there are really very few and I could probably add a second statement catching those that are null and make the timestamp condition based on another column.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(perhaps this is all overthinking anyway because this clause has an extremely low selectivity meaning a table scan isn't much worse)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rows where processing_started
is null
will probably have a job_status
of cancelled
so will be caught by your job_status == "finished"
clause anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bit more analysis because I’m not sure you’re getting the right thing from pg_stats
, or you’re looking at the wrong column
Column | null |
not null |
---|---|---|
scheduled_for |
91% | 9% |
processing_started |
0.5% | 99.5% |
select job_status, count(*) from jobs
where processing_started is null
group by 1 order by 2 desc;
job_status | count |
---|---|
cancelled | 9149 |
scheduled | 267 |
sent to dvla | 103 |
pending | 27 |
sending limits exceeded | 9 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think I was talking about processing_started
having a null_frac
of 0.004
which is ~ 0.5%. Not totally sure what my point was thereon though..
this updates all "finished" jobs more than 24h since scheduled_for to the new "finished all notifications created" status. why 24h? 24h is how far back the check-for-missing-rows-in-completed-jobs celery task will look and so anything older than that won't get converted to "finished all notifications created" automatically. there *shouldn't* be any jobs that make it to 24h without all rows or being moved to a failed state by then anyway...
417eefb
to
07aee6e
Compare
A follow-up to #4323
This updates all
finished
jobs more than 24h sincescheduled_for
to the newfinished all notifications created
status. why 24h? 24h is how far back thecheck-for-missing-rows-in-completed-jobs
celery task will look and so anything older than that won't get converted tofinished all notifications created
automatically.There shouldn't be any jobs that make it to 24h without all rows or being moved to a failed state by then anyway...