Commit debef24

Merge branch 'main' into lstein/feat/multi-gpu

2 parents e57809e + 6b98dba

32 files changed: +529, -423 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@
 
 Invoke is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies. Invoke offers an industry leading web-based UI, and serves as the foundation for multiple commercial products.
 
-[Installation][installation docs] - [Documentation and Tutorials][docs home] - [Bug Reports][github issues] - [Contributing][contributing docs]
+[Installation and Updates][installation docs] - [Documentation and Tutorials][docs home] - [Bug Reports][github issues] - [Contributing][contributing docs]
 
 <div align="center">
 

docs/features/TRAINING.md

Lines changed: 2 additions & 274 deletions
@@ -4,278 +4,6 @@ title: Training
 
 # :material-file-document: Training
 
-# Textual Inversion Training
-## **Personalizing Text-to-Image Generation**
+Invoke Training has moved to its own repository, with a dedicated UI for accessing common scripts like Textual Inversion and LoRA training.
 
-You may personalize the generated images to provide your own styles or objects
-by training a new LDM checkpoint and introducing a new vocabulary to the fixed
-model as a (.pt) embeddings file. Alternatively, you may use or train
-HuggingFace Concepts embeddings files (.bin) from
-<https://huggingface.co/sd-concepts-library> and its associated
-notebooks.
-
-## **Hardware and Software Requirements**
-
-You will need a GPU to perform training in a reasonable length of
-time, and at least 12 GB of VRAM. We recommend using the [`xformers`
-library](../installation/070_INSTALL_XFORMERS.md) to accelerate the
-training process further. During training, about ~8 GB is temporarily
-needed in order to store intermediate models, checkpoints and logs.
-
-## **Preparing for Training**
-
-To train, prepare a folder that contains 3-5 images that illustrate
-the object or concept. It is good to provide a variety of examples or
-poses to avoid overtraining the system. Format these images as PNG
-(preferred) or JPG. You do not need to resize or crop the images in
-advance, but for more control you may wish to do so.
-
-Place the training images in a directory on the machine InvokeAI runs
-on. We recommend placing them in a subdirectory of the
-`text-inversion-training-data` folder located in the InvokeAI root
-directory, ordinarily `~/invokeai` (Linux/Mac), or
-`C:\Users\your_name\invokeai` (Windows). For example, to create an
-embedding for the "psychedelic" style, you'd place the training images
-into the directory
-`~/invokeai/text-inversion-training-data/psychedelic`.
-
-## **Launching Training Using the Console Front End**
-
-InvokeAI 2.3 and higher comes with a text console-based training front
-end. From within the `invoke.sh`/`invoke.bat` Invoke launcher script,
-start the training tool by selecting choice (3):
-
-```sh
-1 "Generate images with a browser-based interface"
-2 "Explore InvokeAI nodes using a command-line interface"
-3 "Textual inversion training"
-4 "Merge models (diffusers type only)"
-5 "Download and install models"
-6 "Change InvokeAI startup options"
-7 "Re-run the configure script to fix a broken install or to complete a major upgrade"
-8 "Open the developer console"
-9 "Update InvokeAI"
-```
-
-Alternatively, you can select option (8) to open the developer console; or, from the command line with the InvokeAI virtual environment active,
-you can then launch the front end with the command `invokeai-ti --gui`.
-
-This will launch a text-based front end that will look like this:
-
-<figure markdown>
-![ti-frontend](../assets/textual-inversion/ti-frontend.png)
-</figure>
-
-The interface is keyboard-based. Move from field to field using
-control-N (^N) to move to the next field and control-P (^P) to the
-previous one. <Tab> and <shift-TAB> work as well. Once a field is
-active, use the cursor keys. In a checkbox group, use the up and down
-cursor keys to move from choice to choice, and <space> to select a
-choice. In a scrollbar, use the left and right cursor keys to increase
-and decrease the value of the scroll. In textfields, type the desired
-values.
-
-The number of parameters may look intimidating, but in most cases the
-predefined defaults work fine. The red circled fields in the above
-illustration are the ones you will adjust most frequently.
-
-### Model Name
-
-This will list all the diffusers models that are currently
-installed. Select the one you wish to use as the basis for your
-embedding. Be aware that if you use an SD-1.X-based model for your
-training, you will only be able to use this embedding with other
-SD-1.X-based models. Similarly, if you train on SD-2.X, you will only
-be able to use the embeddings with models based on SD-2.X.
-
-### Trigger Term
-
-This is the prompt term you will use to trigger the embedding. Type a
-single word or phrase you wish to use as the trigger, for example
-"psychedelic" (without angle brackets). Within InvokeAI, you will then
-be able to activate the trigger using the syntax `<psychedelic>`.
-
-### Initializer
-
-This is a single character that is used internally during the training
-process as a placeholder for the trigger term. It defaults to "*" and
-can usually be left alone.
-
-### Resume from last saved checkpoint
-
-As training proceeds, textual inversion will write a series of
-intermediate files that can be used to resume training from where it
-was left off in the case of an interruption. This checkbox will be
-automatically selected if you provide a previously used trigger term
-and at least one checkpoint file is found on disk.
-
-Note that as of 20 January 2023, resume does not seem to be working
-properly due to an issue with the upstream code.
-
-### Data Training Directory
-
-This is the location of the images to be used for training. When you
-select a trigger term like "my-trigger", the frontend will prepopulate
-this field with `~/invokeai/text-inversion-training-data/my-trigger`,
-but you can change the path to wherever you want.
-
-### Output Destination Directory
-
-This is the location of the logs, checkpoint files, and embedding
-files created during training. When you select a trigger term like
-"my-trigger", the frontend will prepopulate this field with
-`~/invokeai/text-inversion-output/my-trigger`, but you can change the
-path to wherever you want.
-
-### Image resolution
-
-The images in the training directory will be automatically scaled to
-the value you use here. For best results, you will want to use the
-same default resolution as the underlying model (512 pixels for
-SD-1.5, 768 for the larger version of SD-2.1).
-
-### Center crop images
-
-If this is selected, your images will be center cropped to make them
-square before resizing them to the desired resolution. Center cropping
-can indiscriminately cut off the top of subjects' heads for portrait
-aspect images, so if you have images like this, you may wish to use a
-photo editor to manually crop them to a square aspect ratio.
-
-### Mixed precision
-
-Select the floating point precision for the embedding. "no" will
-result in full 32-bit precision, "fp16" will provide 16-bit
-precision, and "bf16" will provide mixed precision (only available
-when XFormers is used).
-
-### Max training steps
-
-How many steps the training will take before the model converges. Most
-training sets will converge with 2000-3000 steps.
-
-### Batch size
-
-This adjusts how many training images are processed simultaneously in
-each step. Higher values will cause the training process to run more
-quickly, but use more memory. The default size will run on GPUs with
-as little as 12 GB.
-
-### Learning rate
-
-The rate at which the system adjusts its internal weights during
-training. Higher values risk overtraining (getting the same image each
-time), and lower values will take more steps to train a good
-model. The default of 0.0005 is conservative; you may wish to increase
-it to 0.005 to speed up training.
-
-### Scale learning rate by number of GPUs, steps and batch size
-
-If this is selected (the default), the system will adjust the provided
-learning rate to improve performance.
-
-### Use xformers acceleration
-
-This will activate XFormers memory-efficient attention. You need to
-have XFormers installed for this to have an effect.
-
-### Learning rate scheduler
-
-This adjusts how the learning rate changes over the course of
-training. The default "constant" means to use a constant learning rate
-for the entire training session. The other values scale the learning
-rate according to various formulas.
-
-Only "constant" is supported by the XFormers library.
-
-### Gradient accumulation steps
-
-This is a parameter that allows you to use bigger batch sizes than
-your GPU's VRAM would ordinarily accommodate, at the cost of some
-performance.
-
-### Warmup steps
-
-If "constant_with_warmup" is selected in the learning rate scheduler,
-then this provides the number of warmup steps. Warmup steps have a
-very low learning rate, and are one way of preventing early
-overtraining.
-
-## The training run
-
-Start the training run by advancing to the OK button (bottom right)
-and pressing <enter>. A series of progress messages will be displayed
-as the training process proceeds. This may take an hour or two,
-depending on settings and the speed of your system. Various log and
-checkpoint files will be written into the output directory (ordinarily
-`~/invokeai/text-inversion-output/my-model/`).
-
-At the end of successful training, the system will copy the file
-`learned_embeds.bin` into the InvokeAI root directory's `embeddings`
-directory, using a subdirectory named after the trigger token. For
-example, if the trigger token was `psychedelic`, then look for the
-embeddings file in
-`~/invokeai/embeddings/psychedelic/learned_embeds.bin`.
-
-You may now launch InvokeAI and try out a prompt that uses the trigger
-term. For example `a plate of banana sushi in <psychedelic> style`.
-
-## **Training with the Command-Line Script**
-
-Training can also be done using a traditional command-line script. It
-can be launched from within the "developer's console", or from the
-command line after activating InvokeAI's virtual environment.
-
-It accepts a large number of arguments, which can be summarized by
-passing the `--help` argument:
-
-```sh
-invokeai-ti --help
-```
-
-Typical usage is shown here:
-```sh
-invokeai-ti \
-  --model=stable-diffusion-1.5 \
-  --resolution=512 \
-  --learnable_property=style \
-  --initializer_token='*' \
-  --placeholder_token='<psychedelic>' \
-  --train_data_dir=/home/lstein/invokeai/training-data/psychedelic \
-  --output_dir=/home/lstein/invokeai/text-inversion-training/psychedelic \
-  --scale_lr \
-  --train_batch_size=8 \
-  --gradient_accumulation_steps=4 \
-  --max_train_steps=3000 \
-  --learning_rate=0.0005 \
-  --resume_from_checkpoint=latest \
-  --lr_scheduler=constant \
-  --mixed_precision=fp16 \
-  --only_save_embeds
-```
-
-## Troubleshooting
-
-### `Cannot load embedding for <trigger>. It was trained on a model with token dimension 1024, but the current model has token dimension 768`
-
-Messages like this indicate you trained the embedding on a different base model than the currently selected one.
-
-For example, in the error above, the training was done on SD2.1 (768x768) but it was used on SD1.5 (512x512).
-
-## Reading
-
-For more information on textual inversion, please see the following
-resources:
-
-* The [textual inversion repository](https://github.com/rinongal/textual_inversion) and
-  associated paper for details and limitations.
-* [HuggingFace's textual inversion training
-  page](https://huggingface.co/docs/diffusers/training/text_inversion)
-* [HuggingFace example script
-  documentation](https://github.com/huggingface/diffusers/tree/main/examples/textual_inversion)
-  (Note that this script is similar to, but not identical to,
-  `textual_inversion`, and produces embed files that are completely compatible.)
-
----
-
-copyright (c) 2023, Lincoln Stein and the InvokeAI Development Team
+You can find more by visiting the repo at https://github.com/invoke-ai/invoke-training
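
For readers migrating from the workflow documented in the deleted text above, a minimal sketch of picking up the new invoke-training project, assuming only standard git and pip steps (hypothetical; the repository's README is the authoritative install guide):

```sh
# Sketch: fetch and install invoke-training. These are generic git/pip steps,
# not commands documented by this commit; see the repo README for the real ones.
git clone https://github.com/invoke-ai/invoke-training.git
cd invoke-training

# Install into an activated Python virtual environment.
pip install -e .
```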

docs/installation/010_INSTALL_AUTOMATED.md

Lines changed: 5 additions & 3 deletions
@@ -1,8 +1,10 @@
-# Automatic Install
+# Automatic Install & Updates
 
-The installer is used for both new installs and updates.
+**The same packaged installer file can be used for both new installs and updates.**
+Using the installer for updates will leave everything you've added since installation in place, and just update the core libraries used to run Invoke.
+Simply use the same path you installed to originally.
 
-Both release and pre-release versions can be installed using it. It also supports install a wheel if needed.
+Both release and pre-release versions can be installed using the installer. It also supports installing from a wheel if needed.
 
 Be sure to review the [installation requirements] and ensure your system has everything it needs to install Invoke.
 
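
As a footnote to the wheel support mentioned in the added lines above, a minimal sketch of what a wheel install looks like with plain pip (the filename pattern is hypothetical, and the packaged installer normally performs the equivalent steps for you):

```sh
# Sketch only: installing a downloaded wheel into a fresh virtual environment.
# The packaged installer automates this; the .whl filename below is hypothetical.
python -m venv .venv
source .venv/bin/activate
pip install InvokeAI-*.whl
```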

docs/installation/INSTALLATION.md

Lines changed: 9 additions & 2 deletions
@@ -1,19 +1,26 @@
-# Installation Overview
+# Installation and Updating Overview
 
 Before installing, review the [installation requirements] to ensure your system is set up properly.
 
 See the [FAQ] for frequently-encountered installation issues.
 
 If you need more help, join our [discord] or [create an issue].
 
-<h2>Automatic Install</h2>
+<h2>Automatic Install & Updates</h2>
 
 ✅ The automatic install is the best way to run InvokeAI. Check out the [installation guide] to get started.
 
+⬆️ The same installer is also the best way to update InvokeAI: simply rerun it for the same folder you installed to.
+
+The installation process only manages the core libraries & application dependencies that run Invoke.
+Any models, images, or other assets in the Invoke root folder won't be affected by the installation process.
+
 <h2>Manual Install</h2>
 
 If you are familiar with python and want more control over the packages that are installed, you can [install InvokeAI manually via PyPI].
 
+Updates are managed by reinstalling the latest version through PyPI.
+
 <h2>Developer Install</h2>
 
 If you want to contribute to InvokeAI, consult the [developer install guide].
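
A minimal sketch of the manual PyPI flow the added lines describe, assuming the standard `InvokeAI` package name and an activated virtual environment (the manual install guide linked above is the authoritative reference):

```sh
# Manual install from PyPI (sketch; see the manual install guide for full steps).
pip install InvokeAI

# Per the added line above, updating is just reinstalling the latest release.
pip install --upgrade InvokeAI
```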

invokeai/frontend/web/package.json

Lines changed: 1 addition & 0 deletions
@@ -89,6 +89,7 @@
   "react-konva": "^18.2.10",
   "react-redux": "9.1.0",
   "react-resizable-panels": "^2.0.16",
+  "react-rnd": "^10.4.10",
   "react-select": "5.8.0",
   "react-use": "^17.5.0",
   "react-virtuoso": "^4.7.5",
