Commit c71aed4

author
zjb
committed
feat: submit comic data crawler
1 parent 49fd197 commit c71aed4

10 files changed

+2199
-2
lines changed

.gitignore

Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
1+
# Logs
2+
logs
3+
*.log
4+
npm-debug.log*
5+
yarn-debug.log*
6+
yarn-error.log*
7+
lerna-debug.log*
8+
9+
# Diagnostic reports (https://nodejs.org/api/report.html)
10+
report.[0-9]*.[0-9]*.[0-9]*.[0-9]*.json
11+
12+
# Runtime data
13+
pids
14+
*.pid
15+
*.seed
16+
*.pid.lock
17+
.DS_Store
18+
.vscode/**
19+
20+
# Dependency directories
21+
node_modules/
22+
dist
23+
src/pages/test-page
24+
25+
# TypeScript cache
26+
*.tsbuildinfo
27+
28+
# Optional npm cache directory
29+
.npm
30+
31+
# Optional eslint cache
32+
.eslintcache
33+
34+
35+
# Output of 'npm pack'
36+
*.tgz
37+
dist.zip
38+
39+
y
40+
41+
# Vite
42+
.vite/
43+
44+
# Electron-Forge
45+
out/
46+
47+
# xlsx
48+
.~lang.xlsx
49+
Makefile
50+
51+
# User-Define
52+
package-lock.json
53+
!.vscode/settings.json
54+
55+
# scripts
56+
scripts
57+
backup-repo.git

README.md

Lines changed: 43 additions & 2 deletions
@@ -1,2 +1,43 @@
1-
# nodejs-spider
2-
A collection of everyday web-scraping tasks
1+
# Comic website crawling
2+
3+
- https://m.kuaikanmanhua.com/tag/0?region=0&sort=1
4+
5+
- Tech stack: puppeteer, with nodemon monitoring the node service
6+
7+
## Install dependencies
8+
9+
### npm install
10+
11+
## Run the project
12+
13+
### npm run dev
14+
15+
## Notes
16+
17+
Do not exit midway; just wait for the browser operations to finish.
18+
19+
## Mobile web page data scraping
20+
21+
### npm run dev:mobile
22+
23+
### Clear progressFile first
24+
25+
### 10.27 chore
26+
27+
1. Each chapter should contain 10-20 images
28+
2. There should be no fewer than 10 chapters
29+
30+
#### fix:
31+
32+
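- Check the number of files in the previous directory
33+
- If the previous directory has fewer than 20 files, fill it first with images taken from the current group
34+
- If the current group still has images left, create a new directory and save them
35+
- Store the remaining images in the new directory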
36+
37+
## Comic data crawling
38+
39+
### code
40+
41+
```
42+
pnpm dev:comic
43+
```
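
The `fix:` steps listed in the README (top up the previous directory to 20 files, then put whatever is left in a new directory) can be sketched as a pure function. This is only an illustration, not the code from this commit: `distributeImages`, `MAX_PER_CHAPTER`, and the in-memory array-of-arrays representation of chapter directories are all hypothetical, and the sketch assumes each incoming group holds at most 20 images.

```javascript
// Hypothetical sketch of the README's "fix" steps; none of these names come from the commit.
// Each chapter directory is modeled as an array of image file names.
const MAX_PER_CHAPTER = 20; // upper bound taken from the README's 10-20 rule

function distributeImages(chapters, group, maxPerChapter = MAX_PER_CHAPTER) {
  // Copy the inputs so the caller's arrays are not mutated.
  const result = chapters.map((files) => [...files]);
  const remaining = [...group];

  // Step 1: check the file count of the previous (last) directory.
  const last = result[result.length - 1];

  // Step 2: if it holds fewer than the maximum, fill it from the current group first.
  if (last && last.length < maxPerChapter) {
    const free = maxPerChapter - last.length;
    last.push(...remaining.splice(0, free));
  }

  // Step 3: if the group still has images left, they go into a new directory.
  if (remaining.length > 0) {
    result.push(remaining);
  }

  return result;
}

// Usage: a previous chapter with 18 images and a new group of 5 images.
const previous = Array.from({ length: 18 }, (_, i) => `old-${i}.jpg`);
const incoming = Array.from({ length: 5 }, (_, i) => `new-${i}.jpg`);
const chapterSizes = distributeImages([previous], incoming).map((c) => c.length);
console.log(chapterSizes); // [ 20, 3 ]
```

Keeping the function free of file-system calls makes the rule easy to check in isolation; a real crawler would then map each resulting array onto a directory on disk.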
