config through env vars #350

phemmer · 2014-05-09T13:04:58Z

This is a PR for #347

This adds a default env.yml file which supports all options through environment variables.
You can use this to launch a container and set any parameter without having to inject your own config file.

Example usage:

docker run \
  -e SETTINGS_FLAVOR=s3 \
  -e AWS_REGION=us-west-1 \
  -e AWS_BUCKET=my-docker-bucket \
  -e AWS_KEY='AKIASDHAF48HSKDAF' \
  -e AWS_SECRET='SAGDKFJSHDKFJHAVUSKDJFSHKLDFLSDF' \
  -e SECRET_KEY='foo'  \
  -e SEARCH_BACKEND=sqlalchemy \
  registry

You can also launch without any settings to get the dev 'flavor' which will launch local storage with sqlalchemy enabled.

I tried to get every single config parameter possible, but they are scattered around all over the place, so it is very likely I missed a few.

I also wasn't sure how to go on defaults. This bit should probably be determined and cleaned up. For example in the new env.yml, s3_encrypt defaults to true if not specified. I did this as this is the value set in config_s3.yml. But I'm not sure if we want to leave this unset and let the code handle it.
But then on some things like config_path, I think a default makes sense. So which parameters get defaults and which don't?

The other thing is the env var names. I tried to preserve the existing variable names if they were documented or used already. However the names are rather inconsistent. For example, the s3 bucket var in config_s3.yml is $AWS_BUCKET, but in the setup-configs.sh it's $S3_BUCKET. Then swift_container is $SWIFT_CONTAINER, but the rest of the vars are $OS_....

I personally think it would be best if all the vars for a storage engine were prefixed with that engine name. So engine: swift would be $SWIFT_AUTH_URL, not $OS_AUTH_URL.

Lastly I have only tested the s3, local, and dev 'flavors'. I don't have the other services to be able to test them.

phemmer · 2014-05-09T13:06:41Z

docker_registry/lib/config.py

@@ -40,7 +45,7 @@ def _replace_env(s):
        if isinstance(s, basestring) and s.startswith('_env:'):
            parts = s.split(':', 2)
            varname = parts[1]
-            vardefault = '!ENV_NOT_FOUND' if len(parts) < 3 else parts[2]
+            vardefault = None if len(parts) < 3 else parts[2]


This, and the above changes, are to treat defined but unset config parameters as undefined. This is so that if the config references an environment variable which doesn't exist, we act as if that config param weren't defined.

phemmer · 2014-05-09T13:12:40Z

I'd also be up for adding another commit to this PR to update the README. A lot of variables are missing (or are scattered around), and the supported storage engines aren't listed very well. I just want to make sure the code is proper before continuing to documentation.

wking · 2014-05-09T15:48:13Z

On Fri, May 09, 2014 at 06:04:59AM -0700, Patrick Hemmer wrote:

Clean up configuration management.

Why did you drop the 'exec' in your Dockerfile CMD? I don't see a
reason to keep the shell process around.

This is a PR for #347

DON'T MERGE YET

This adds a default env.yml file which supports all options through environment variables.
You can use this to launch a container and set any parameter without having to inject your own config file.

Example usage:
docker run \
  -e SETTINGS_FLAVOR=s3 \
  -e AWS_REGION=us-west-1 \
  -e AWS_BUCKET=my-docker-bucket \
  -e AWS_KEY='AKIASDHAF48HSKDAF' \
  -e AWS_SECRET='SAGDKFJSHDKFJHAVUSKDJFSHKLDFLSDF' \
  -e SECRET_KEY='foo'  \
  -e SEARCH_BACKEND=sqlalchemy
  registry
You can also launch without any settings to get the dev 'flavor'
which will launch local storage with sqlalchemy enabled.

I still think it would be better to just have a STORAGE variable
instead of SETTINGS_FLAVOR blocks 1. If STORAGE=s3, it doesn't
matter if there are also a bunch of swift or glance configs set,
you're not going to use them. Having a single block would also avoid
fragmentation like AWS_BUCKET and GCS_BUCKET variables for the same
boto_bucket setting. It also avoids the whole glance-swift business,
just set:

STORAGE=glance GLANCE_STORAGE_ALTERNATE=swift

I also wasn't sure how to go on defaults. This bit should probably
be determined and cleaned up. For example in the new env.yml,
s3_encrypt defaults to true if not specified. I did this as this
is the value set in config_s3.yml. But I'm not sure if we want to
leave this unset and let the code handle it. But then on some
things like config_path, I think a default makes sense. So which
parameters get defaults and which don't?

I'd add defaults everywhere we think we have a reasonable suggestion
for development setups. Then I'd add notes to the README explaining
how folks might want to tune for higher loads.

I personally think it would be best if all the vars for a storage
engine were prefixed with that engine name. So engine: swift would
be $SWIFT_AUTH_URL, not $OS_AUTH_URL.

+1. I expect this will shake up existing configs a bit, so it's
better to do a full cleanup all at once.

phemmer · 2014-05-09T16:23:13Z

Why did you drop the 'exec' in your Dockerfile CMD? I don't see a
reason to keep the shell process around.

Just an omission when cleaning it up. I can put it back, or we can go with the CMD ["/usr/local/bin/docker-registry"] syntax and avoid the shell. I would lean a bit towards exec though, as it's more flexible and resistant to breakage.

I still think it would be better to just have a STORAGE variable
instead of SETTINGS_FLAVOR blocks [1]. If STORAGE=s3, it doesn't
matter if there are also a bunch of swift or glance configs set,
you're not going to use them. Having a single block would also avoid
fragmentation like AWS_BUCKET and GCS_BUCKET variables for the same
boto_bucket setting. It also avoids the whole glance-swift business,
just set:
STORAGE=glance GLANCE_STORAGE_ALTERNATE=swift

I think I see what you're getting at.
I think that would work as long as the app doesn't do anything if a config param is set and not used. Like if storage: s3 and elliptics_addr_family is set.
I wasn't sure if this is the case, so I kept everything separate.

The other argument for keeping them separate is to have different defaults for the different storage engines. Like for local storage, storage_path defaults to /tmp/registry. But for S3, it defaults to /registry.

wking · 2014-05-09T16:32:27Z

On Fri, May 09, 2014 at 09:23:15AM -0700, Patrick Hemmer wrote:

I think that would work as long as the app doesn't do anything if a
config param is set and not used. Like if storage: s3 and
elliptics_addr_family is set. I wasn't sure if this is the case,
so I kept everything separate.

$ git grep -c elliptics_
README.md:20
docker_registry/storage/ellipticsbackend.py:1

so at least for the elliptics backend, you don't have to worry about
accidentally using those settings unless you're using that backend.

The other argument for keeping them separate is to have different
defaults for the different storage engines. Like for local storage,
storage_path defaults to /tmp/registry. But for S3, it defaults
to /registry.

I think that small gain is not worth the added layer of config-syntax
complexity. I'd be happy to make the default STORAGE_PATH
/docker-registry in all cases, and leave it up to folks who don't like
that to override as they see fit. In cases where a globally-workable
default can not be found (maybe an empty set?), a note in the README
saying “If you're using storage backend $a, make sure to override
environment variable $b with a suiltable value (e.g. $c). The default
$d doesn't work because…” should be sufficient.

dmp42 · 2014-05-09T17:00:35Z

@wking I'm concerned about the extent of this PR.
Getting rid of setup-configs.sh and harmonizing the use of env vars is one thing - introducing breaking changes is another.

I certainly agree both are needed, just probably not in the same timeframe.
I would definitely be glad to merge the first ASAP, but killing SETTINGS_FLAVOR and changing the way storages are selected will probably have to wait.

That being said, this is certainly good work, I just want to make sure that we are not breaking things unexpectedly for people using this, and that breaking changes are introduced timely.

What do you think?

Also pinging @shin- and @samalba for their opinions

Best regards

wking · 2014-05-09T17:19:33Z

On Fri, May 09, 2014 at 10:00:35AM -0700, Mangled Deutz wrote:

Getting rid of setup-configs.sh

You can probably do this without much trouble, and you can certainly
land that now and put off a more thorough cleanup until later.

and harmonizing the use of env vars is one thing - introducing
breaking changes is another.

But I expect you won't be able to “harmonize the use of env vars”
without requiring users to adapt to the new variables. I think the
blind-user count is still low enough that cutting 0.7.0 with release
notes that describe suggested environment variable mappings:

AWS_ACCESS_KEY_ID → S3_ACCESS_KEY
AWS_SECRET_KEY → S3_SECRET_KEY
…

should cover almost all users, without needing code in the registry to
support both config flavors while emitting deprecation warnings.

I would definitely be glad to merge the first ASAP, but killing
SETTINGS_FLAVOR and changing the way storages are selected will
probably have to wait.

I don't see a particular rush in either case, but breaking user-facing
settings is best done before the user base grows too large, and
includes too many people who will blindly upgrade without reading
release notes. I'd be happy to write up a patch killing
SETTINGS_FLAVOR (based on this PR) if you'd like to kick it around. I
don't expect it to be too intrusive.

dmp42 · 2014-05-09T17:46:56Z

@wking that certainly makes a lot of sense, and we definitely agree here.

Now, I think that what we don't want for the future is (I'm sure you will agree as well with that):

tons of pending PR (either because they were too ambitious, or not timely)
PR that bit-rot (trying to do too much)
low-quality, broken code lying around (cleaning-up is not a sexy task...)
breaking changes being slipped through for lack of planification

Unfortunately, this is (partly) where we are right now.
I'm not blaming the contributors there. Actually, I'm not blaming anyone at all :-)
I'm just saying there is room for improvement for us here.

One way to improve might be to ensure that:

low hanging fruits and easy changes with clear benefits gets driven and merged fast (eg: getting rid of setup-configs.sh for example, which is responsible for two open issues and a lot of confusion)
more ambitious (API breaking changes) are concerted, clearly identified as such and part of a milestone plan
we have people willing to test ambitious PR exhaustively (ain't sexy either...) - the bigger the changes, the more complex they are to review
and/or we break things in smaller, more digest-able chunks

Please note I certainly don't want to kill the enthusiasm, or impose my ways on people :-)
This is an open discussion.

Best.

wking · 2014-05-09T17:55:52Z

On Fri, May 09, 2014 at 10:46:57AM -0700, Mangled Deutz wrote:

@wking that certainly makes a lot of sense, and we definitely agree here.
[snip good stuff]

That all sounds good to me. I just think it helps to have explicit
timelines etc. for the non-sexy bits before kicking them down the
road. Then feelings of guilt will help them actually get done ;).

I'm fine if this PR explicitly punts on environment variable renaming
and SETTINGS_FLAVOR removal. In that case, it looks pretty solid as
is ;).

phemmer · 2014-05-09T20:51:17Z

Ok, i've updated for the exec in the Dockerfile since that seems to be agreed upon.

Are we settled upon leaving env.yml as is? If so I'll update the README.

wking · 2014-05-09T21:08:29Z

On Fri, May 09, 2014 at 01:51:18PM -0700, Patrick Hemmer wrote:

Are we settled upon leaving env.yml as is? If so I'll update the README.

I see no reason to add a new config when we can just replace any
existing REPLACEME markers with the analagous _env:…. That should
make it easy to ensure we're not breaking existing users.

After a commit like that, I think your current env.yml is mostly a
streamlined config_sample.yml, so it should replace the existing
config_sample.yml (but probably keep the comments), with a commit
message that explains the improvements (e.g. “pulls in glance-swift,
etc. for a single, unified config script”). Then you can remove the
storage-specific config files as you integrate the in
config_sample.yml, and leave symlinks while we wait for folks to
migrate away from the old filenames.

phemmer · 2014-05-12T14:37:02Z

Ok, I've merged most of into config_sample.yml.
I did not symlink config_mirror.yml to it though. This is because config_mirror.yml defaults to enabling mirroring, while the config_sample.yml did not. Thus joining them together would be a significant breaking change.

It's a separate commit for now, but if this is the route we go, I'll squash them.

shin- · 2014-05-12T15:04:14Z

config/config_test.yml

@@ -1,20 +0,0 @@
-test:


Assuming this is a symlink, seems like it would be cleaner to just remove those files (especially since they were just here for reference in the first place)

I'm just going off what wking said. If we feel they should be just removed, I can go with that.

I think it's safe to remove them.

On Mon, May 12, 2014 at 08:04:15AM -0700, Joffrey F wrote:

@@ -1,20 +0,0 @@
-test:

Assuming this is a symlink, seems like it would be cleaner to just
remove those files (especially since they were just here for
reference in the first place)

Maybe you think they were only there for reference, but I don't see
why someone couldn't have been using:

DOCKER_REGISTRY_CONFIG=config_test.yml

or whatever when starting their registry. I don't mind removing the
files, since I still think we're in the “users can be expected to read
release notes” stage of development. However, @dmp42 at least seems
to prefer more backward compatibility 1. Maybe the risk of breakage
for folks using DOCKER_REGISTRY_CONFIG to point at example configs is
acceptable, and we don't need that much backward compatibility?

Literally the first section of the README advises to do cp config/config_sample.yml config/config.yml, so I think people using DOCKER_REGISTRY_CONFIG to point to a pre-defined sample config will be a very small minority.

shin- · 2014-05-12T15:44:11Z

Ok, LGTM! =)

shin- · 2014-05-12T16:58:26Z

Just one last thing, now that config_test.yml has been removed you need to update the tox.ini file to use config_sample.yml instead.

phemmer · 2014-05-12T19:18:01Z

I've fixed tox.ini and rebased the branch. Not sure where the travis build went.

Anyway, I'll work on updating the README to reflect the changes.

This adds a default `config.yml` file which supports all options through environment variables. You can use this to launch a container and set any parameter without having to inject your own config file. ---- This also involved a few code changes. One is to treat defined but unset config parameters as undefined. This is so that if the config references an environment variable which doesn't exist, we act as if that config param weren't set. The other is to allow passing the elliptics node list through a string (which is what env vars are). Previously `elliptics_nodes` only accepted a hash/dictionary. This change makes it so that it also accepts a list/array or a string. For example: elliptics: elliptics_nodes: - 1.1.1.1:2222 - 1.1.1.2:2222 or elliptics: elliptics_nodes: "1.1.1.1:2222 1.1.1.2:2222"

config_sample.yml is now the comprehensive config file

phemmer · 2014-05-14T23:11:29Z

Ok, I've gone through and updated the documentation.

I moved a few things around so that it's not so scattered. I moved all the config stuff that is independent of the storage engine to the top of the 'config options' section. All the storage engines and the parameters they support are then at the bottom.

I also added a quick start to the very top.

I also added several parameters I ran across during my work that weren't documented. However there are several which I didn't add an explanation on as I'm not sure how to best describe them. These are: the mirroring options, the swift options, the glance options, and cache/cache_lru (whats the difference).

Quick link to view README: https://github.com/phemmer/docker-registry/tree/config_cleanup#docker-registry

Additional note: I also wanted to remove the numbered lists and use bullet lists instead (but didn't). I personally think they look cleaner, especially once you start nesting. However I didn't want to change formatting since numbered lists were originally used.

wking · 2014-05-14T23:27:50Z

On Wed, May 14, 2014 at 04:11:31PM -0700, Patrick Hemmer wrote:

cache/cache_lru (whats the difference).

It looks like cache_lru is used to cache files for the storage drivers
(see local.LocalStorage.get_content), while cache (without _lru) is
used as a job-queue for image diffs. The code overlaps almost
completely though, so I'm in favor of consolidating these (but
probably not in this PR).

dmp42 · 2014-05-16T08:10:59Z

Thanks a lot - this is great work.

I just have a couple of things blocking this, and would like to merge ASAP:

@phemmer your changes on the elliptics driver (docker_registry/storage/ellipticsbackend.py) will get discarded as soon as we land the "plugins" branch - the reason for that being it is now standalone. If you want these changes, you should PR there: https://github.com/noxiouz/docker-registry-driver-elliptics
@shin- are we confident these changes will work for docker prod? how can we test that as fast and painless as possible?

dmp42 · 2014-05-16T08:15:31Z

@wking thanks a lot for driving this through!

@samalba a quick validation on this before we merge would be swell

shin- · 2014-05-16T12:26:57Z

are we confident these changes will work for docker prod? how can we test that as fast and painless as possible?

I don't expect this to break anything, and it will get through staging first anyway.

phemmer · 2014-05-16T13:52:49Z

elliptics_nodes change has been opened as noxiouz/docker-registry-driver-elliptics#4

dmp42 · 2014-05-16T13:53:31Z

@phemmer great!

dmp42 · 2014-05-16T15:09:30Z

LGTM - let's merge

config through env vars

phemmer reviewed May 9, 2014
View reviewed changes

shin- reviewed May 12, 2014
View reviewed changes

samalba mentioned this pull request May 13, 2014

Spurious files / spring cleanup #358

Closed

phemmer added 3 commits May 14, 2014 18:52

remove old sample config files

53e7978

config_sample.yml is now the comprehensive config file

update documentation for config_sample.yml changes

bb69c43

dmp42 mentioned this pull request May 15, 2014

Pluggable architecture #353

Merged

phemmer mentioned this pull request May 16, 2014

allow config.elliptics_nodes to be string or array noxiouz/docker-registry-driver-elliptics#4

Closed

dmp42 added a commit that referenced this pull request May 16, 2014

Merge pull request #350 from phemmer/config_cleanup

3964f29

config through env vars

dmp42 merged commit 3964f29 into docker-archive:master May 16, 2014

phemmer mentioned this pull request May 16, 2014

Inconsistent config handling #347

Closed

dmp42 mentioned this pull request May 16, 2014

Issue with DOCKER_REGISTRY_CONFIG var #332

Closed

config through env vars #350

config through env vars #350

Uh oh!

Conversation

phemmer commented May 9, 2014

Uh oh!

phemmer May 9, 2014

Choose a reason for hiding this comment

Uh oh!

phemmer commented May 9, 2014

Uh oh!

wking commented May 9, 2014

DON'T MERGE YET

Uh oh!

phemmer commented May 9, 2014

Uh oh!

wking commented May 9, 2014

Uh oh!

dmp42 commented May 9, 2014

Uh oh!

wking commented May 9, 2014

Uh oh!

dmp42 commented May 9, 2014

Uh oh!

wking commented May 9, 2014

Uh oh!

phemmer commented May 9, 2014

Uh oh!

wking commented May 9, 2014

Uh oh!

phemmer commented May 12, 2014

Uh oh!

shin- May 12, 2014

Choose a reason for hiding this comment

Uh oh!

phemmer May 12, 2014

Choose a reason for hiding this comment

Uh oh!

shin- May 12, 2014

Choose a reason for hiding this comment

Uh oh!

phemmer May 12, 2014

Choose a reason for hiding this comment

Uh oh!

wking May 12, 2014

Choose a reason for hiding this comment

Uh oh!

shin- May 12, 2014

Choose a reason for hiding this comment

Uh oh!

shin- commented May 12, 2014

Uh oh!

shin- commented May 12, 2014

Uh oh!

phemmer commented May 12, 2014

Uh oh!

phemmer commented May 14, 2014

Uh oh!

wking commented May 14, 2014

Uh oh!

dmp42 commented May 16, 2014

Uh oh!

dmp42 commented May 16, 2014

Uh oh!

shin- commented May 16, 2014

Uh oh!

phemmer commented May 16, 2014

Uh oh!

dmp42 commented May 16, 2014

Uh oh!

dmp42 commented May 16, 2014

Uh oh!

Uh oh!