From 8e3691b4d743d0eb6f0a8e94a5fcf096ebe5b447 Mon Sep 17 00:00:00 2001 From: Gergely Nagy Date: Thu, 28 Dec 2023 10:11:53 +0100 Subject: [PATCH] [DOCS]: Suggest using `utf8mb4_bin` for collation For MySQL, suggest using `utf8mb4_bin` for collation, rather than `utf8mb4_general_ci`, because the latter is case insensitive, and can break assumptions in various parts of Gitea. Such as branch names: branch names are case sensitive in git, but with a case insensitive collation, and the branch names stored in the database, repositories with branch names that differ in case only can lead to internal errors. This little change updates the documentation only, as a first step of addressing the problem. Signed-off-by: Gergely Nagy --- docs/content/help/faq.en-us.md | 4 ++-- docs/content/help/faq.zh-cn.md | 4 ++-- docs/content/installation/database-preparation.en-us.md | 4 ++-- docs/content/installation/database-preparation.zh-cn.md | 4 ++-- 4 files changed, 8 insertions(+), 8 deletions(-) diff --git a/docs/content/help/faq.en-us.md b/docs/content/help/faq.en-us.md index e6350936ef816..39e461c7278ce 100644 --- a/docs/content/help/faq.en-us.md +++ b/docs/content/help/faq.en-us.md @@ -385,8 +385,8 @@ Unfortunately MySQL's `utf8` charset does not completely allow all possible UTF- They created a new charset and collation called `utf8mb4` that allows for emoji to be stored but tables which use the `utf8` charset, and connections which use the `utf8` charset will not use this. -Please run `gitea doctor convert`, or run `ALTER DATABASE database_name CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;` -for the database_name and run `ALTER TABLE table_name CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;` +Please run `gitea doctor convert`, or run `ALTER DATABASE database_name CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;` +for the database_name and run `ALTER TABLE table_name CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;` for each table in the database. ## Why are Emoji displaying only as placeholders or in monochrome diff --git a/docs/content/help/faq.zh-cn.md b/docs/content/help/faq.zh-cn.md index 909ca7e5e2138..f24789e9c040d 100644 --- a/docs/content/help/faq.zh-cn.md +++ b/docs/content/help/faq.zh-cn.md @@ -389,8 +389,8 @@ SET GLOBAL innodb_large_prefix=1; 他们创建了一个名为 `utf8mb4`的字符集和校对规则,允许存储 Emoji,但使用 utf8 字符集的表和连接将不会使用它。 -请运行 `gitea doctor convert` 或对数据库运行 `ALTER DATABASE database_name CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;` -并对每个表运行 `ALTER TABLE table_name CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;`。 +请运行 `gitea doctor convert` 或对数据库运行 `ALTER DATABASE database_name CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;` +并对每个表运行 `ALTER TABLE table_name CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;`。 您还需要将`app.ini`文件中的数据库字符集设置为`CHARSET=utf8mb4`。 diff --git a/docs/content/installation/database-preparation.en-us.md b/docs/content/installation/database-preparation.en-us.md index 25b163793eeac..c4f848fcd3aa8 100644 --- a/docs/content/installation/database-preparation.en-us.md +++ b/docs/content/installation/database-preparation.en-us.md @@ -61,10 +61,10 @@ Note: All steps below requires that the database engine of your choice is instal Replace username and password above as appropriate. -4. Create database with UTF-8 charset and collation. Make sure to use `utf8mb4` charset instead of `utf8` as the former supports all Unicode characters (including emojis) beyond _Basic Multilingual Plane_. Also, collation chosen depending on your expected content. When in doubt, use either `unicode_ci` or `general_ci`. +4. Create database with UTF-8 charset and collation. Make sure to use `utf8mb4` charset instead of `utf8` as the former supports all Unicode characters (including emojis) beyond _Basic Multilingual Plane_. Also, collation chosen depending on your expected content. When in doubt, use `utf8mb4_bin`. ```sql - CREATE DATABASE giteadb CHARACTER SET 'utf8mb4' COLLATE 'utf8mb4_unicode_ci'; + CREATE DATABASE giteadb CHARACTER SET 'utf8mb4' COLLATE 'utf8mb4_bin'; ``` Replace database name as appropriate. diff --git a/docs/content/installation/database-preparation.zh-cn.md b/docs/content/installation/database-preparation.zh-cn.md index b58344133ee70..0c322a39a7088 100644 --- a/docs/content/installation/database-preparation.zh-cn.md +++ b/docs/content/installation/database-preparation.zh-cn.md @@ -59,10 +59,10 @@ menu: 根据需要替换上述用户名和密码。 -4. 使用 UTF-8 字符集和排序规则创建数据库。确保使用 `**utf8mb4**` 字符集,而不是 `utf8`,因为前者支持 _Basic Multilingual Plane_ 之外的所有 Unicode 字符(包括表情符号)。排序规则根据您预期的内容选择。如果不确定,可以使用 `unicode_ci` 或 `general_ci`。 +4. 使用 UTF-8 字符集和排序规则创建数据库。确保使用 `**utf8mb4**` 字符集,而不是 `utf8`,因为前者支持 _Basic Multilingual Plane_ 之外的所有 Unicode 字符(包括表情符号)。排序规则根据您预期的内容选择。如果不确定,可以使用 `utf8mb4_bin`。 ```sql - CREATE DATABASE giteadb CHARACTER SET 'utf8mb4' COLLATE 'utf8mb4_unicode_ci'; + CREATE DATABASE giteadb CHARACTER SET 'utf8mb4' COLLATE 'utf8mb4_bin'; ``` 根据需要替换数据库名称。