Description
When I updated from 1.16.5 to 1.17.0, the main page of Gitea (the user dashboard, which hosts the contribution activity graph, some recent events and the repositories list) started to take a very long time to load.
It only takes a long time (80-100 s) if I haven't loaded it recently. If I wait for it to load and then reload it, it loads quickly.
While I'm waiting for the dashboard to load, I see both increased database process CPU load and increased iowait.
I did not update the database server at the time of updating Gitea.
The logs were made shortly after restarting Gitea. I waited to open the dashboard until repository indexing had finished.
Gitea Version
v1.17.3
Can you reproduce the bug on the Gitea demo site?
No
Log Gist
https://gist.github.com/mpeter50/1fa49a4e9c9f05536baedef0668e8c92
Screenshots
No response
Git Version
2.36.2
Operating System
Raspbian
How are you running Gitea?
In self-built Docker container, built according to the Docker.rootless Dockerfile.
Database
MySQL
Activity
silverwind commented on Oct 26, 2022
#21045 may be related, but maybe not, as I didn't see such massive load times, maybe an extra 200-400 ms for the heatmap data.
Do you also see it on https://yourhost/youruser?tab=activity?
mpeter50 commented on Oct 26, 2022
No, this page loaded pretty quickly, along with the heatmap.
wxiaoguang commented on Oct 27, 2022
Check whether your `action` table is large; it could be emptied if you do not need the activity logs.
Maybe related to: `action` table may cause 500 error for home page #18666, and more
mpeter50 commented on Oct 27, 2022
Do you mean the mentioned database table in the database server, or somewhere else?
Also, this is about the section marked on the following picture, right?
Oh I wanted to ask about it for some time, but always forgot it. I would really like if this could be filtered, for example so that commit sync events are not shown, because they happen so frequently (this is not a problem) that they hide everything else (this is a problem).
But that is out of scope for this issue; I'll look up whether there's a similar feature request.
So well, yeah, I don't need most of these.
But then also, on 1.16.5 it did not take much time to load this page.
I'm not familiar with the codebase, but could you please take a quick look if you have modified the action query in a way that is now less efficient?
lunny commented on Oct 29, 2022
It's very strange there is no router log in your file.
mpeter50 commented on Oct 29, 2022
That is because the log was produced by a non-default logger, developer_debug, here:
Would having router logs help? If so, let me know and I'll make a new log file that also includes the router log.
lunny commented on Oct 30, 2022
Yes, router log will be helpful
mpeter50 commented on Oct 30, 2022
I'm having a hard time figuring out how to set up the router logger to log into the developer_debug file logger.
Could you please help with it?
I'll try to describe what I tried. After every change, I used the "Log Configuration" section of the https://myinstance.lan/admin/config page to check whether Gitea had started using my logger, and I also checked the log files to see whether they included router logs after loading the mentioned page.
I used these resources to read about how to configure logging:
This is what I had when creating the log gist linked in the issue:
The following is how I tried to change the above.
I tried to configure the router logger of the developer_debug sublogger, so I added this new section, and restarted Gitea:
According to Gitea (https://myinstance.lan/admin/config, Log Configuration) this did not change or add a new router logger.
Then I thought maybe the problem was that I hadn't defined the target (where this should be output to), so I added `ROUTER = log.developer_debug` to the above section in the hope that it would be combined with that, which now looked like this:
Later I started to suspect this is the same ROUTER as what can be found directly in [log].
Still no change, so I thought maybe this logger isn't detected by Gitea at all, and remembered that I had to add even my developer_debug sublogger to [log].MODE.
So I tried configuring [log].MODE so that the router logger also logs to this new sublogger:
With this, now I see that a new Router logger is visible in the log configuration (https://myinstance.lan/admin/config, Log Configuration), but I still don't see router logs in this log file.
However I think I see now that this variable ([log].ROUTER) is key to my sublogger being detected, and that it is similar to [log].MODE, and probably [log].ACCESS and possibly others work this way too.
Then I went back to this: https://docs.gitea.io/en-us/config-cheat-sheet/#router-log-log
I noticed I also need to set `DISABLE_ROUTER_LOG = false` in `[log]` for the [log].ROUTER to be taken into account, so I did that. I restarted Gitea, and there was still no difference.
Later I reverted this by commenting out the line, as I no longer know how to interpret it. I don't know anymore whether this will disable router logging totally, or whether it will make the [log].ROUTER be taken into account.
I had a hard time in setting up the developer_debug logger too, but I did manage that.
However, I don't know what I am doing wrong. The docs on logging are confusing me. In desperation I redeclared the MODE and FILE_PATH for this sub-sublogger, but this just doesn't seem to work for some reason.
This is all of the configuration that I now have related to logging:
mpeter50 commented on Oct 30, 2022
I think I got it.
Some important things I missed:
- `[log.name.xy]`, where xy is the name of a logger, is how loggers are actually configured for a specific output named 'name'
- `[log].ROUTER` needs to have a list of all subloggers that want to receive router events, if it differs from `console` only
- `[log.name.router]` always needs to have MODE specified, as it will not inherit it from `[log.name]`
- `FILE_PATH`-s can be shared between loggers of outputs, but this won't be inherited either, as it has a default value

@lunny here is a new log gist with router logs. It includes logs from a recent startup of Gitea. The last event was made after the user dashboard finished loading.
lunny commented on Oct 31, 2022
lunny commented on Jan 18, 2023
Could you test whether v1.18.1 resolved the problem?
mpeter50 commented on Jan 19, 2023
It doesn't yet. Page load time of the main page was 157949ms according to the footer.
lunny commented on Jan 19, 2023
How many records are in your `action` table? Could you give more context?
Improve indices for `action` table (#23532)
mpeter50 commented on Apr 11, 2023
I'm sorry for the late reply!
Today I updated my instance to 1.19.0, and the latency is still very similar to before:
According to the database server, I had the following indexes on the action table at this time:
So then I created the suggested index:
And the list of indexes now looks like this:
However, the dashboard takes around the same amount of time to load as before:
rsq424 commented on Apr 13, 2023
I think you can look at gitea.log, find the SQL statement that takes the longest, and EXPLAIN that SQL.
mpeter50 commented on Apr 16, 2023
I have re-enabled sql logging, and uploaded the logs here that were gathered the next time I tried loading the dashboard.
The longest SQL statement was this one, it is on the 10th line in the gist:
I'm sorry but I did not understand the second part of your comment.
What should I explain?
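For readers with the same question: the suggestion presumably refers to the database's `EXPLAIN` statement, which reports how the server plans to execute a query (for example, whether it scans the whole table or uses an index). On MySQL this is done by prefixing the slow statement with `EXPLAIN`. The sketch below only illustrates the idea in Python with an in-memory SQLite database as a stand-in, where the equivalent is `EXPLAIN QUERY PLAN`. The table layout, the query, and the index name are all hypothetical and are not Gitea's real schema or the statement from the gist.

```python
import sqlite3

# Hypothetical, simplified stand-in for the `action` table (not Gitea's real schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE action ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, op_type INTEGER, created_unix INTEGER)"
)

# Placeholder for whichever statement is slowest in your own SQL log.
SLOW_QUERY = "SELECT * FROM action WHERE user_id = ? ORDER BY created_unix DESC LIMIT 20"


def query_plan(connection, sql, params=()):
    """Return the planner's description of how it would run `sql`."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable detail text.
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(conn, SLOW_QUERY, (1,))

# Illustrative index; the name and columns are chosen for this sketch only.
conn.execute("CREATE INDEX idx_action_user_created ON action (user_id, created_unix)")
plan_after = query_plan(conn, SLOW_QUERY, (1,))

print("before:", plan_before)
print("after: ", plan_after)
```

If the plan for the slow statement shows a full table scan, an index problem is likely; if it already uses an index, slow storage is the more probable cause.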
lunny commented on Apr 16, 2023
What's your Gitea instance version, and how many records are in your Gitea database table `action`?
wxiaoguang commented on Apr 16, 2023
I guess this information is in #21611 (comment)
If it's a MySQL index problem, it's possible to try `analyze table action` to reset the table statistics data, or to modify the code to "FORCE INDEX".
Otherwise, if the MySQL server is slow (e.g. hardware limitation), at the moment it's difficult to optimize this SQL (the "activity" mechanism would be totally rewritten). I used to truncate my `action` table regularly ......
mpeter50 commented on Apr 16, 2023
That's right, Gitea 1.19.0 and ~459340 action records according to the admin dashboard.
Running the `analyze table action` command while connected to the database did not help with the problem.
This is probably the case: while waiting for the dashboard to load, I can see in a resource monitor that the mysqld process is waiting for I/O most of the time.
I'll look into truncating the action table. I guess that normally just means dropping and recreating it, but I would prefer to only delete the `<org> synced commits to <branch> at <org>/<repo> from mirror` type of actions.
Also, a related question: is there currently a mechanism for ignoring (as in not saving to the db) certain types of actions?
For example, those I referred to above.
My action table mostly consists of these, but I don't need them, and they actually bothered me a bit on the user dashboard, but only because they squeeze out more interesting events.
mpeter50 commented on Apr 16, 2023
Update: the admin dashboard now counts ~547767 actions, probably the analyze table command recounted them.
wxiaoguang commented on Apr 16, 2023
I guess you are using HDD/slow storage and/or MySQL's memory is pretty low.
That's `ActionMirrorSyncPush // 18`, `op_type=18`
IIRC, no ...... unfortunately
lunny commented on Apr 16, 2023
Or you can enable `[cron.delete_old_actions]` and change `OLDER_THAN` to a short value.
mpeter50 commented on Apr 16, 2023
Yes, it is a portable HDD that is shared with other services and even the OS.
But for the ~200 second delay of the action query, no other measurable load is needed.
Thanks a lot, this will be helpful!
I'll consider that too, thanks!
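As a rough illustration of what an age-based cleanup like the one suggested above amounts to, here is a minimal Python sketch that deletes rows older than a cutoff. It is not Gitea's cron implementation. It assumes the table has a `created_unix` column holding Unix timestamps in seconds (an assumption about the schema), uses an in-memory SQLite table as a stand-in for MySQL, and the helper name `delete_older_than` and the sample rows are made up.

```python
import sqlite3
import time

SECONDS_PER_DAY = 24 * 60 * 60


def delete_older_than(connection, max_age_days, now=None):
    """Delete action rows older than `max_age_days`; return how many were removed."""
    now = int(time.time()) if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    cursor = connection.execute("DELETE FROM action WHERE created_unix < ?", (cutoff,))
    connection.commit()
    return cursor.rowcount


# Hypothetical, simplified table; `created_unix` is assumed to hold Unix seconds.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE action (id INTEGER PRIMARY KEY, op_type INTEGER, created_unix INTEGER)"
)

now = 1_700_000_000  # fixed timestamp so the example is reproducible
ages_in_days = [1, 10, 40, 400]  # made-up sample rows
conn.executemany(
    "INSERT INTO action (op_type, created_unix) VALUES (?, ?)",
    [(18, now - age * SECONDS_PER_DAY) for age in ages_in_days],
)

removed = delete_older_than(conn, max_age_days=30, now=now)
remaining = conn.execute("SELECT COUNT(id) FROM action").fetchone()[0]
print(f"removed {removed} rows, {remaining} remain")
```

A shorter retention window keeps the table smaller, at the cost of losing older entries from the activity feed.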
mpeter50 commented on Apr 16, 2023
For anyone (and future me) reading this in the future:
BEFORE EXECUTING THESE COMMANDS you probably want to shut down Gitea and make a backup of the database.
Deleting rows with op_type=18 ("synced from" actions, as seen on the user dashboard) only helps if you are a data hoarder and you mirror dozens of active repositories.
According to `SELECT COUNT(id) FROM action WHERE op_type = 18;`, I had 352867 "synced from" records.
After deleting those with the `DELETE FROM action WHERE op_type=18;` command, 227443 rows remained in the table, and the dashboard now loads in under 10 seconds: at first 9300 ms, but most times it's lower (e.g. 3400 ms), and if you load it once it will be faster for a few minutes (maybe until new actions are created?).
If this delay is still too much, you can check what other types of actions have amassed in the table with `SELECT COUNT(id),op_type FROM action GROUP BY op_type;`. The query returns op_type IDs and their counts; you can look the IDs up here.
In my case the majority of the remaining actions are of type `ActionMirrorSyncCreate` and `ActionMirrorSyncDelete`, which are probably "synced new reference" and "synced and deleted reference" actions. It's up to you whether you keep them; AFAIK no rows of this table are important for Gitea to work correctly.
Be aware though that you will need to repeat this regularly, as "synced from" actions are still created at every automatic mirror sync when changes are pulled.
How often? You'll notice when you need to do it, but you probably won't need to do it more than once every few months or once a year.
If you change the mirrors to sync less often (default is 8 hours), these will pile up slower.
You can also set up automatic deletion of old actions as Lunny said above, but this will probably also mess with your and other users' activity graph (on the profile page, as seen below) if you set it to a value too low.
This applies especially if you delete all actions.
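The count-then-delete workflow described in this comment can be sketched as follows. This is only an illustration in Python against an in-memory SQLite table, not something to run against a live Gitea database. The table is reduced to two columns, the sample rows are made up, and the op_type values other than 18 are arbitrary placeholders. The SQL statements themselves mirror the ones quoted above.

```python
import sqlite3

ACTION_MIRROR_SYNC_PUSH = 18  # the op_type named earlier in the thread

# Hypothetical, reduced table with made-up rows; not Gitea's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE action (id INTEGER PRIMARY KEY, op_type INTEGER)")
sample_op_types = [18] * 6 + [5] * 3 + [1] * 2  # 5 and 1 are arbitrary placeholders
conn.executemany(
    "INSERT INTO action (op_type) VALUES (?)", [(t,) for t in sample_op_types]
)


def counts_by_type(connection):
    """Map each op_type to the number of rows that carry it."""
    rows = connection.execute(
        "SELECT op_type, COUNT(id) FROM action GROUP BY op_type"
    ).fetchall()
    return dict(rows)


before = counts_by_type(conn)

# `with conn` commits on success and rolls back if the statement raises.
with conn:
    deleted = conn.execute(
        "DELETE FROM action WHERE op_type = ?", (ACTION_MIRROR_SYNC_PUSH,)
    ).rowcount

after = counts_by_type(conn)
print("before:", before)
print("deleted:", deleted)
print("after:", after)
```

Checking the per-type counts before and after makes it easy to confirm that only the intended action type was removed.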