Skip to content
This repository was archived by the owner on Sep 12, 2018. It is now read-only.

config through env vars #350

Merged
merged 3 commits into from
May 16, 2014
Merged

config through env vars #350

merged 3 commits into from
May 16, 2014

Conversation

phemmer
Copy link
Contributor

@phemmer phemmer commented May 9, 2014

This is a PR for #347

This adds a default env.yml file which supports all options through environment variables.
You can use this to launch a container and set any parameter without having to inject your own config file.

Example usage:

docker run \
  -e SETTINGS_FLAVOR=s3 \
  -e AWS_REGION=us-west-1 \
  -e AWS_BUCKET=my-docker-bucket \
  -e AWS_KEY='AKIASDHAF48HSKDAF' \
  -e AWS_SECRET='SAGDKFJSHDKFJHAVUSKDJFSHKLDFLSDF' \
  -e SECRET_KEY='foo'  \
  -e SEARCH_BACKEND=sqlalchemy \
  registry

You can also launch without any settings to get the dev 'flavor' which will launch local storage with sqlalchemy enabled.
 

I tried to get every single config parameter possible, but they are scattered around all over the place, so it is very likely I missed a few.

I also wasn't sure how to go on defaults. This bit should probably be determined and cleaned up. For example in the new env.yml, s3_encrypt defaults to true if not specified. I did this as this is the value set in config_s3.yml. But I'm not sure if we want to leave this unset and let the code handle it.
But then on some things like config_path, I think a default makes sense. So which parameters get defaults and which don't?

 

The other thing is the env var names. I tried to preserve the existing variable names if they were documented or used already. However the names are rather inconsistent. For example, the s3 bucket var in config_s3.yml is $AWS_BUCKET, but in the setup-configs.sh it's $S3_BUCKET. Then swift_container is $SWIFT_CONTAINER, but the rest of the vars are $OS_....

I personally think it would be best if all the vars for a storage engine were prefixed with that engine name. So engine: swift would be $SWIFT_AUTH_URL, not $OS_AUTH_URL.

 

Lastly I have only tested the s3, local, and dev 'flavors'. I don't have the other services to be able to test them.

@@ -40,7 +45,7 @@ def _replace_env(s):
if isinstance(s, basestring) and s.startswith('_env:'):
parts = s.split(':', 2)
varname = parts[1]
vardefault = '!ENV_NOT_FOUND' if len(parts) < 3 else parts[2]
vardefault = None if len(parts) < 3 else parts[2]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This, and the above changes, are to treat defined but unset config parameters as undefined. This is so that if the config references an environment variable which doesn't exist, we act as if that config param weren't defined.

@phemmer
Copy link
Contributor Author

phemmer commented May 9, 2014

I'd also be up for adding another commit to this PR to update the README. A lot of variables are missing (or are scattered around), and the supported storage engines aren't listed very well. I just want to make sure the code is proper before continuing to documentation.

@wking
Copy link
Contributor

wking commented May 9, 2014

On Fri, May 09, 2014 at 06:04:59AM -0700, Patrick Hemmer wrote:

  • Clean up configuration management.

Why did you drop the 'exec' in your Dockerfile CMD? I don't see a
reason to keep the shell process around.

This is a PR for #347

DON'T MERGE YET

This adds a default env.yml file which supports all options through environment variables.
You can use this to launch a container and set any parameter without having to inject your own config file.

Example usage:

docker run \
  -e SETTINGS_FLAVOR=s3 \
  -e AWS_REGION=us-west-1 \
  -e AWS_BUCKET=my-docker-bucket \
  -e AWS_KEY='AKIASDHAF48HSKDAF' \
  -e AWS_SECRET='SAGDKFJSHDKFJHAVUSKDJFSHKLDFLSDF' \
  -e SECRET_KEY='foo'  \
  -e SEARCH_BACKEND=sqlalchemy
  registry

You can also launch without any settings to get the dev 'flavor'
which will launch local storage with sqlalchemy enabled.

I still think it would be better to just have a STORAGE variable
instead of SETTINGS_FLAVOR blocks 1. If STORAGE=s3, it doesn't
matter if there are also a bunch of swift or glance configs set,
you're not going to use them. Having a single block would also avoid
fragmentation like AWS_BUCKET and GCS_BUCKET variables for the same
boto_bucket setting. It also avoids the whole glance-swift business,
just set:

STORAGE=glance GLANCE_STORAGE_ALTERNATE=swift

I also wasn't sure how to go on defaults. This bit should probably
be determined and cleaned up. For example in the new env.yml,
s3_encrypt defaults to true if not specified. I did this as this
is the value set in config_s3.yml. But I'm not sure if we want to
leave this unset and let the code handle it. But then on some
things like config_path, I think a default makes sense. So which
parameters get defaults and which don't?

I'd add defaults everywhere we think we have a reasonable suggestion
for development setups. Then I'd add notes to the README explaining
how folks might want to tune for higher loads.

I personally think it would be best if all the vars for a storage
engine were prefixed with that engine name. So engine: swift would
be $SWIFT_AUTH_URL, not $OS_AUTH_URL.

+1. I expect this will shake up existing configs a bit, so it's
better to do a full cleanup all at once.

@phemmer
Copy link
Contributor Author

phemmer commented May 9, 2014

Why did you drop the 'exec' in your Dockerfile CMD? I don't see a
reason to keep the shell process around.

Just an omission when cleaning it up. I can put it back, or we can go with the CMD ["/usr/local/bin/docker-registry"] syntax and avoid the shell. I would lean a bit towards exec though, as it's more flexible and resistant to breakage.

I still think it would be better to just have a STORAGE variable
instead of SETTINGS_FLAVOR blocks [1]. If STORAGE=s3, it doesn't
matter if there are also a bunch of swift or glance configs set,
you're not going to use them. Having a single block would also avoid
fragmentation like AWS_BUCKET and GCS_BUCKET variables for the same
boto_bucket setting. It also avoids the whole glance-swift business,
just set:
STORAGE=glance GLANCE_STORAGE_ALTERNATE=swift

I think I see what you're getting at.
I think that would work as long as the app doesn't do anything if a config param is set and not used. Like if storage: s3 and elliptics_addr_family is set.
I wasn't sure if this is the case, so I kept everything separate.

The other argument for keeping them separate is to have different defaults for the different storage engines. Like for local storage, storage_path defaults to /tmp/registry. But for S3, it defaults to /registry.

@wking
Copy link
Contributor

wking commented May 9, 2014

On Fri, May 09, 2014 at 09:23:15AM -0700, Patrick Hemmer wrote:

I think that would work as long as the app doesn't do anything if a
config param is set and not used. Like if storage: s3 and
elliptics_addr_family is set. I wasn't sure if this is the case,
so I kept everything separate.

$ git grep -c elliptics_
README.md:20
docker_registry/storage/ellipticsbackend.py:1

so at least for the elliptics backend, you don't have to worry about
accidentally using those settings unless you're using that backend.

The other argument for keeping them separate is to have different
defaults for the different storage engines. Like for local storage,
storage_path defaults to /tmp/registry. But for S3, it defaults
to /registry.

I think that small gain is not worth the added layer of config-syntax
complexity. I'd be happy to make the default STORAGE_PATH
/docker-registry in all cases, and leave it up to folks who don't like
that to override as they see fit. In cases where a globally-workable
default can not be found (maybe an empty set?), a note in the README
saying “If you're using storage backend $a, make sure to override
environment variable $b with a suiltable value (e.g. $c). The default
$d doesn't work because…” should be sufficient.

@dmp42
Copy link
Contributor

dmp42 commented May 9, 2014

@wking I'm concerned about the extent of this PR.
Getting rid of setup-configs.sh and harmonizing the use of env vars is one thing - introducing breaking changes is another.

I certainly agree both are needed, just probably not in the same timeframe.
I would definitely be glad to merge the first ASAP, but killing SETTINGS_FLAVOR and changing the way storages are selected will probably have to wait.

That being said, this is certainly good work, I just want to make sure that we are not breaking things unexpectedly for people using this, and that breaking changes are introduced timely.

What do you think?

Also pinging @shin- and @samalba for their opinions

Best regards

@wking
Copy link
Contributor

wking commented May 9, 2014

On Fri, May 09, 2014 at 10:00:35AM -0700, Mangled Deutz wrote:

Getting rid of setup-configs.sh

You can probably do this without much trouble, and you can certainly
land that now and put off a more thorough cleanup until later.

and harmonizing the use of env vars is one thing - introducing
breaking changes is another.

But I expect you won't be able to “harmonize the use of env vars”
without requiring users to adapt to the new variables. I think the
blind-user count is still low enough that cutting 0.7.0 with release
notes that describe suggested environment variable mappings:

AWS_ACCESS_KEY_ID → S3_ACCESS_KEY
AWS_SECRET_KEY → S3_SECRET_KEY

should cover almost all users, without needing code in the registry to
support both config flavors while emitting deprecation warnings.

I would definitely be glad to merge the first ASAP, but killing
SETTINGS_FLAVOR and changing the way storages are selected will
probably have to wait.

I don't see a particular rush in either case, but breaking user-facing
settings is best done before the user base grows too large, and
includes too many people who will blindly upgrade without reading
release notes. I'd be happy to write up a patch killing
SETTINGS_FLAVOR (based on this PR) if you'd like to kick it around. I
don't expect it to be too intrusive.

@dmp42
Copy link
Contributor

dmp42 commented May 9, 2014

@wking that certainly makes a lot of sense, and we definitely agree here.

Now, I think that what we don't want for the future is (I'm sure you will agree as well with that):

  • tons of pending PR (either because they were too ambitious, or not timely)
  • PR that bit-rot (trying to do too much)
  • low-quality, broken code lying around (cleaning-up is not a sexy task...)
  • breaking changes being slipped through for lack of planification

Unfortunately, this is (partly) where we are right now.
I'm not blaming the contributors there. Actually, I'm not blaming anyone at all :-)
I'm just saying there is room for improvement for us here.

One way to improve might be to ensure that:

  • low hanging fruits and easy changes with clear benefits gets driven and merged fast (eg: getting rid of setup-configs.sh for example, which is responsible for two open issues and a lot of confusion)
  • more ambitious (API breaking changes) are concerted, clearly identified as such and part of a milestone plan
  • we have people willing to test ambitious PR exhaustively (ain't sexy either...) - the bigger the changes, the more complex they are to review
  • and/or we break things in smaller, more digest-able chunks

Please note I certainly don't want to kill the enthusiasm, or impose my ways on people :-)
This is an open discussion.

Best.

@wking
Copy link
Contributor

wking commented May 9, 2014

On Fri, May 09, 2014 at 10:46:57AM -0700, Mangled Deutz wrote:

@wking that certainly makes a lot of sense, and we definitely agree here.
[snip good stuff]

That all sounds good to me. I just think it helps to have explicit
timelines etc. for the non-sexy bits before kicking them down the
road. Then feelings of guilt will help them actually get done ;).

I'm fine if this PR explicitly punts on environment variable renaming
and SETTINGS_FLAVOR removal. In that case, it looks pretty solid as
is ;).

@phemmer
Copy link
Contributor Author

phemmer commented May 9, 2014

Ok, i've updated for the exec in the Dockerfile since that seems to be agreed upon.

Are we settled upon leaving env.yml as is? If so I'll update the README.

@wking
Copy link
Contributor

wking commented May 9, 2014

On Fri, May 09, 2014 at 01:51:18PM -0700, Patrick Hemmer wrote:

Are we settled upon leaving env.yml as is? If so I'll update the README.

I see no reason to add a new config when we can just replace any
existing REPLACEME markers with the analagous _env:…. That should
make it easy to ensure we're not breaking existing users.

After a commit like that, I think your current env.yml is mostly a
streamlined config_sample.yml, so it should replace the existing
config_sample.yml (but probably keep the comments), with a commit
message that explains the improvements (e.g. “pulls in glance-swift,
etc. for a single, unified config script”). Then you can remove the
storage-specific config files as you integrate the in
config_sample.yml, and leave symlinks while we wait for folks to
migrate away from the old filenames.

@phemmer
Copy link
Contributor Author

phemmer commented May 12, 2014

Ok, I've merged most of into config_sample.yml.
I did not symlink config_mirror.yml to it though. This is because config_mirror.yml defaults to enabling mirroring, while the config_sample.yml did not. Thus joining them together would be a significant breaking change.

It's a separate commit for now, but if this is the route we go, I'll squash them.

@@ -1,20 +0,0 @@
test:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming this is a symlink, seems like it would be cleaner to just remove those files (especially since they were just here for reference in the first place)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just going off what wking said. If we feel they should be just removed, I can go with that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's safe to remove them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Mon, May 12, 2014 at 08:04:15AM -0700, Joffrey F wrote:

@@ -1,20 +0,0 @@
-test:

Assuming this is a symlink, seems like it would be cleaner to just
remove those files (especially since they were just here for
reference in the first place)

Maybe you think they were only there for reference, but I don't see
why someone couldn't have been using:

DOCKER_REGISTRY_CONFIG=config_test.yml

or whatever when starting their registry. I don't mind removing the
files, since I still think we're in the “users can be expected to read
release notes” stage of development. However, @dmp42 at least seems
to prefer more backward compatibility 1. Maybe the risk of breakage
for folks using DOCKER_REGISTRY_CONFIG to point at example configs is
acceptable, and we don't need that much backward compatibility?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Literally the first section of the README advises to do cp config/config_sample.yml config/config.yml, so I think people using DOCKER_REGISTRY_CONFIG to point to a pre-defined sample config will be a very small minority.

@shin-
Copy link
Contributor

shin- commented May 12, 2014

Ok, LGTM! =)

@shin-
Copy link
Contributor

shin- commented May 12, 2014

Just one last thing, now that config_test.yml has been removed you need to update the tox.ini file to use config_sample.yml instead.

@phemmer
Copy link
Contributor Author

phemmer commented May 12, 2014

I've fixed tox.ini and rebased the branch. Not sure where the travis build went.

Anyway, I'll work on updating the README to reflect the changes.

phemmer added 3 commits May 14, 2014 18:52
This adds a default `config.yml` file which supports all options through environment variables.
You can use this to launch a container and set any parameter without having to inject your own config file.

----

This also involved a few code changes.

One is to treat defined but unset config parameters as undefined. This is so that if the config references an environment variable which doesn't exist, we act as if that config param weren't set.

The other is to allow passing the elliptics node list through a string (which is what env vars are). Previously `elliptics_nodes` only accepted a hash/dictionary. This change makes it so that it also accepts a list/array or a string.
For example:

    elliptics:
      elliptics_nodes:
        - 1.1.1.1:2222
        - 1.1.1.2:2222

or

    elliptics:
      elliptics_nodes: "1.1.1.1:2222 1.1.1.2:2222"
config_sample.yml is now the comprehensive config file
@phemmer
Copy link
Contributor Author

phemmer commented May 14, 2014

Ok, I've gone through and updated the documentation.

I moved a few things around so that it's not so scattered. I moved all the config stuff that is independent of the storage engine to the top of the 'config options' section. All the storage engines and the parameters they support are then at the bottom.

I also added a quick start to the very top.

I also added several parameters I ran across during my work that weren't documented. However there are several which I didn't add an explanation on as I'm not sure how to best describe them. These are: the mirroring options, the swift options, the glance options, and cache/cache_lru (whats the difference).

Quick link to view README: https://github.com/phemmer/docker-registry/tree/config_cleanup#docker-registry

Additional note: I also wanted to remove the numbered lists and use bullet lists instead (but didn't). I personally think they look cleaner, especially once you start nesting. However I didn't want to change formatting since numbered lists were originally used.

@wking
Copy link
Contributor

wking commented May 14, 2014

On Wed, May 14, 2014 at 04:11:31PM -0700, Patrick Hemmer wrote:

cache/cache_lru (whats the difference).

It looks like cache_lru is used to cache files for the storage drivers
(see local.LocalStorage.get_content), while cache (without _lru) is
used as a job-queue for image diffs. The code overlaps almost
completely though, so I'm in favor of consolidating these (but
probably not in this PR).

@dmp42 dmp42 mentioned this pull request May 15, 2014
@dmp42
Copy link
Contributor

dmp42 commented May 16, 2014

Thanks a lot - this is great work.

I just have a couple of things blocking this, and would like to merge ASAP:

  • @phemmer your changes on the elliptics driver (docker_registry/storage/ellipticsbackend.py) will get discarded as soon as we land the "plugins" branch - the reason for that being it is now standalone. If you want these changes, you should PR there: https://github.com/noxiouz/docker-registry-driver-elliptics
  • @shin- are we confident these changes will work for docker prod? how can we test that as fast and painless as possible?

@dmp42
Copy link
Contributor

dmp42 commented May 16, 2014

@wking thanks a lot for driving this through!

@samalba a quick validation on this before we merge would be swell

@shin-
Copy link
Contributor

shin- commented May 16, 2014

are we confident these changes will work for docker prod? how can we test that as fast and painless as possible?

I don't expect this to break anything, and it will get through staging first anyway.

@phemmer
Copy link
Contributor Author

phemmer commented May 16, 2014

elliptics_nodes change has been opened as noxiouz/docker-registry-driver-elliptics#4

@dmp42
Copy link
Contributor

dmp42 commented May 16, 2014

@phemmer great!

@dmp42
Copy link
Contributor

dmp42 commented May 16, 2014

LGTM - let's merge

dmp42 added a commit that referenced this pull request May 16, 2014
@dmp42 dmp42 merged commit 3964f29 into docker-archive:master May 16, 2014
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants