Otakuverse is an app that uses AI to detect and translate text in manga panels automatically. It's built on top of Spheron's decentralized GPU network (both for model training and inference), which makes it fast, reliable, and accurate. Plus, it keeps the original art style intact!
- AI-Powered Text Detection: Automatically detects text bubbles in manga panels using YOLOv8, translates the text to English, and in-paints the translation back onto the image.
- Multiple Translation Options: Supports various translation methods including Google, HuggingFace, Baidu, and Bing
- Font Customization: Multiple manga-style fonts for natural-looking translations
- Responsive UI: Beautiful, modern interface built with Next.js and Tailwind CSS
- assets/: Contains project-related static assets
- inference-server/: Python-based backend for image processing and inference; includes bubble detection, text addition, and translation modules
- training-pipeline/: Machine learning model training resources
- web-app/: Next.js frontend application, configured with Tailwind CSS and TypeScript
- Node.js (v18 or higher)
- Python 3.10+
- Docker (optional)
- Clone the repository
git clone https://github.com/yourusername/otakuverse.git
cd otakuverse
- Install frontend dependencies
cd web-app
pnpm install
- Install backend dependencies
cd ../inference-server
pip install -r requirements.txt
NOTE: You might want to do this in a virtual environment (e.g. `python -m venv .venv && source .venv/bin/activate`)
- Configure environment variables
cp .env.example .env
# Edit .env with your configuration
- Start the frontend
cd web-app
pnpm dev
- Start the inference server
cd inference-server
python app.py
The web app will be available at http://localhost:3000 and the inference server at http://localhost:5000, or at the value of the INFERENCE_SERVER_URL environment variable.
Otakuverse is built with a modern tech stack:
- Frontend: Next.js, TailwindCSS, Framer Motion
- Backend: Flask, OpenCV, PyTorch
- AI Models: YOLOv8 for text detection, plus various translation APIs
- Infrastructure: Spheron Network for decentralized training and inference
The model used to identify speech bubbles is YOLOv8. Using the stock weights directly doesn't give good results, since they are trained on a general-purpose dataset. So we built our own dataset, a collection of manga images with speech bubbles annotated along with the text inside them, and fine-tuned the model on it for manga text detection.
Here is the data:
- Dataset: Manga Speech Bubbles Dataset
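For a concrete picture of what this fine-tuning step looks like, here is a minimal sketch using the Ultralytics YOLOv8 API. The dataset config name (manga-bubbles.yaml) and the hyperparameters are illustrative assumptions, not the exact values from our training pipeline.

```python
# Minimal fine-tuning sketch using the Ultralytics YOLOv8 API.
# The dataset config name and hyperparameters below are illustrative
# placeholders, not the project's exact training settings.
from ultralytics import YOLO

# Start from pretrained COCO weights and fine-tune on the manga dataset
model = YOLO("yolov8n.pt")
model.train(
    data="manga-bubbles.yaml",  # hypothetical dataset config
    epochs=50,
    imgsz=640,
    batch=16,
)

# Run detection on a sample panel to sanity-check the fine-tuned weights
detections = model("sample_panel.jpg")
detections[0].show()
```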
The web app is built using Next.js and TailwindCSS. It is a simple website that handles the manga file/image upload and sends it to the inference server for processing.
This is the core component of the application. It is a Flask server that receives the image from the web app and processes it using OpenCV, PyTorch, and various translation APIs.
The pipeline is set up as follows (a minimal sketch follows the list):
- Receive image from web app
- Detect text bubbles using YOLOv8
- Translate text using various translation APIs
- In-paint translated text on the image
- Send processed image back to web app
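As a rough illustration of how these stages fit together, here is a minimal Flask sketch. The route name, weights path, and the OCR/translation stub are assumptions for clarity, not the actual module interfaces in inference-server/.

```python
# Minimal sketch of the inference pipeline as a Flask endpoint.
# The route name, weights path, and the OCR/translation stub are
# hypothetical stand-ins for the real modules in inference-server/.
import io

import numpy as np
from flask import Flask, request, send_file
from PIL import Image, ImageDraw
from ultralytics import YOLO

app = Flask(__name__)
model = YOLO("manga-bubbles.pt")  # assumed path to the fine-tuned weights


def read_and_translate(crop: np.ndarray) -> str:
    """Placeholder for OCR + translation; the real server dispatches to
    the Google/HuggingFace/Baidu/Bing translation backends."""
    return "translated text"


@app.route("/process", methods=["POST"])  # assumed endpoint name
def process_panel():
    # 1. Receive the image from the web app
    image = Image.open(request.files["image"].stream).convert("RGB")
    frame = np.array(image)

    # 2. Detect text bubbles with the fine-tuned YOLOv8 model
    detections = model(frame)[0]

    # 3-4. Translate each bubble and in-paint the result. A white fill
    # plus default-font text stands in for the real in-painting and
    # manga-style font rendering.
    draw = ImageDraw.Draw(image)
    for box in detections.boxes.xyxy.tolist():
        x1, y1, x2, y2 = map(int, box)
        text = read_and_translate(frame[y1:y2, x1:x2])
        draw.rectangle([x1, y1, x2, y2], fill="white")
        draw.text((x1 + 4, y1 + 4), text, fill="black")

    # 5. Send the processed image back to the web app
    buffer = io.BytesIO()
    image.save(buffer, format="PNG")
    buffer.seek(0)
    return send_file(buffer, mimetype="image/png")


if __name__ == "__main__":
    app.run(port=5000)
```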
For the local deployment, you can refer to the instructions given in the previous sections.
The following steps will guide you through the process of deploying the inference server on the Spheron Network.
First, install the Spheron CLI (sphnctl):
curl -sL1 https://sphnctl.sh | bash
After installation, verify it by checking the Spheron CLI version:
sphnctl version
Next, create a wallet. This wallet will be used to pay for the usage of compute on the Spheron Network.
sphnctl wallet create --name <your-wallet-name>
Replace `<your-wallet-name>` with your desired wallet name. Here is an example of how the result will look:
path: /path/to/spheron/primary.json
address: 0x3837215Cc8701C99C1A496B6fB9a715BFAd65262
secret: xxxxxxxx
mnemonic: water vicious naive nurse sample armed exit crazy game eagle blood woman
Make sure to securely save the mnemonic phrase and key secret provided.
You will need some tokens to deploy on Spheron. Visit the Spheron Faucet to obtain test tokens for deployment. After receiving the tokens, you can check your wallet balance with:
sphnctl wallet balance --token USDT
Here is an example of how the result will look:
Current ETH balance: 0.02993669528
Total USDT balance: 35
Deposited USDT balance
unlocked: 14.3307
locked: 1e-06
Note: You might have locked tokens; these can be unlocked using the CLI.
Deposit USDT to your escrow wallet for deployment:
sphnctl payment deposit --amount 15 --token USDT
A deployment is the concept in the Spheron Network by which you request compute resources from the network and use them, in our case for the inference server. A deployment can be created using the dashboard, or via the Protocol CLI and the Infrastructure Composition Language (ICL).
The deployment configuration can be found here.
The deployment file refers to a public image, ghcr.io/shubham-rasal/inference-server:latest, which has already been set up with the fine-tuned YOLOv8 model trained earlier.
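For reference, an ICL deployment file for an image like this is shaped roughly like the sketch below. The service name and port are chosen to match the example output later in this section; the resource sizes and pricing are illustrative placeholders, and the actual deploy.yml in the repository is authoritative.

```yaml
# Illustrative ICL sketch only -- the repository's deploy.yml is the
# authoritative version. Resource sizes and pricing are placeholders.
version: "1.0"

services:
  py-cuda:
    image: ghcr.io/shubham-rasal/inference-server:latest
    expose:
      - port: 8888
        as: 8888
        to:
          - global: true

profiles:
  name: py-cuda
  mode: provider
  compute:
    py-cuda:
      resources:
        cpu:
          units: 4
        memory:
          size: 16Gi
        storage:
          - size: 50Gi
        gpu:
          units: 1
  placement:
    westcoast:
      pricing:
        py-cuda:
          token: USDT
          amount: 1

deployment:
  py-cuda:
    westcoast:
      profile: py-cuda
      count: 1
```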
To create the deployment, follow these steps:
- Go to the inference-server directory
cd inference-server
- Run the following command to create the deployment
sphnctl deployment create deploy.yml
Here is an example of how the result will look:
Validating SDL configuration.
SDL validated.
Sending configuration for provider matching.
Deployment order created: 0x1ae69a3f63cf241495c3b91db620b72625bffd8b08afd0691309ca63a4773368
Waiting for providers to bid on the deployment order...
Bid found.
Order matched successfully.
Deployment created using wallet 0x3837215Cc8701C99C1A496B6fB9a715BFAd65262
lid: 2866
provider: 0x5Ed271e74ff9b6aB90A7D18B7f4103D6ad361D2b
agreed price per hour: 0.3027243318506784
Sending manifest to provider...
Deployment manifest sent, waiting for acknowledgment.
Deployment is finished.
Note: Sometimes the deployment might fail because the requested configuration doesn't match any provider's offering. In that case, try again with a different configuration; just make sure to include at least one GPU and one CPU unit.
- Fetch Deployment Details
To fetch your deployment/lease details, run:
sphnctl deployment get --lid [your-lid]
Here is an example of how the result will look:
Status of the deployment ID: 2866
Deployment on-chain details:
Status: Matched
Provider: 0x5Ed271e74ff9b6aB90A7D18B7f4103D6ad361D2b
Price per hour: 0.3027243318506784
Start time: 2024-12-12T06:18:38Z
Remaining time: 55min, 25sec
Services running:
py-cuda
URL: []
Ports:
- provider.gpu.gpufarm.xyz:32674 -> 8888 (TCP)
Replicas: 1/1 available, 1 ready
Host URI: provider.gpu.gpufarm.xyz
Region: us-central
IPs:
The output contains the URL of the deployment server, all the assigned ports, and the host URI. With this you can check your deployment status.
Add the URL to the NEXT_PUBLIC_INFERENCE_SERVER_URL environment variable in the .env file of the web-app directory. For example, here is how the .env file should look:
NEXT_PUBLIC_INFERENCE_SERVER_URL=http://provider.gpu.gpufarm.xyz:31707
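To sanity-check the deployment before wiring up the frontend, you can hit the server directly. The snippet below assumes a /process endpoint that accepts a multipart image upload, mirroring the hypothetical pipeline sketch above; adjust the path, field name, and port to what your deployment actually exposes.

```python
# Quick smoke test against the deployed inference server.
# The /process endpoint and "image" field name are assumptions
# (mirroring the earlier pipeline sketch); substitute the real ones.
import requests

SERVER = "http://provider.gpu.gpufarm.xyz:31707"  # from your deployment output

with open("sample_panel.jpg", "rb") as f:
    resp = requests.post(f"{SERVER}/process", files={"image": f}, timeout=120)

resp.raise_for_status()
# Save the translated panel returned by the server
with open("translated_panel.png", "wb") as out:
    out.write(resp.content)
print("Saved translated_panel.png")
```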
Congratulations! You have successfully deployed the inference server on the Spheron Network.
If you want to train the model on your own dataset, you can follow the steps below.
The training pipeline is relatively straightforward, as it is a standard YOLOv8 fine-tune. You can find the training pipeline notebook in the training-pipeline directory.
The pipeline is based on the YOLOv8 Quick Start by Ultralytics.
To train this model, we will need a GPU, which can be obtained from the Spheron Network.
To train the model, the first few steps are the same as for deploying the inference server: install the Spheron CLI, create and fund a wallet, and deposit tokens for deployment (refer to the sections above).
The deployment file we will base our training on can be found here. It is a simple Jupyter notebook environment that uses PyTorch to train a YOLOv8 model.
To create the deployment, follow these steps:
- Go to the training-pipeline/model directory
cd training-pipeline/model
- Run the following command to create the deployment
sphnctl deployment create train.yml
Note: Sometimes the deployment might fail because the requested configuration doesn't match any provider's offering. In that case, try again with a different configuration; just make sure to include at least one GPU and one CPU unit.
- Fetch Deployment Details
To fetch your deployment/lease details, run:
sphnctl deployment get --lid [your-lid]
The output contains the URL of the deployment server, all the assigned ports, and the host URI. With this you can check your deployment status.
- Setup the Notebook Environment
You can access your notebook environment at the URL returned by the previous command. You can set up the environment by following the instructions in the training-pipeline directory.
We welcome contributions! Please follow these steps:
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Built with ❤️ using Spheron Network
- YOLOv8 for object detection
Note: This is a submission for the Spheron Network Bounty Program. The project demonstrates the capabilities of Spheron's decentralized infrastructure for training and hosting AI-powered applications.