-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Convert ci-kubernetes-e2e-cri-gce-flaky to start with #1581
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
1a7f970
to
e7b8d33
Compare
@spxtr I got
|
Yeah, I'm looking at that now. |
Wow and here it passes now... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great start! Excited to see this get in.
|
||
# Boilerplate envs | ||
# Assume we're upping, testing, and downing a cluster | ||
os.environ['E2E_UP'] = os.environ.get('E2E_UP') or 'true' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider dict.setdefault() for this
Also, I would be much happier if we simplify this to set the variables explicitly rather than relying on magic branching behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved them to args.
print >>sys.stderr, 'envfile %s does not exist' % envpath | ||
return | ||
with open(envpath, 'r') as envfile: | ||
for line in continuation_lines(envfile): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems pretty fragile. I'd consider disallowing this pattern for the initial version. Also how are you handling quoting? Does it work with docker run --env-file
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
check(['docker', 'stop', CONTAINER]) | ||
|
||
|
||
def sigint_handler(_signo, _frame): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you have two functions that do exactly the same thing? Especially since whether it is handling a SIGTERM or SIGINT signal is captured in _signo.
Also prefixing method arguments with a underscore looks odd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
'-v', '/etc/localtime:/etc/localtime:ro' | ||
] | ||
|
||
if os.environ.get('JENKINS_GCE_SSH_PRIVATE_KEY_FILE'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me this seems like something I would stick in a flag like we do elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flags now
PARSER = argparse.ArgumentParser( | ||
'Runs e2e jobs on the kubernetes repo') | ||
PARSER.add_argument( | ||
'--branch', default='master', help='Upstream target repo') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this exist?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deleted.
except subprocess.CalledProcessError as exc: | ||
print 'Exiting with code: %d' % exc.returncode # fine? | ||
|
||
signal.signal(signal.SIGTERM, sigterm_handler) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This waits until everything is complete before registering the signal handler, which means sigterm_handler will not catch the signal if one occurs curing check(*cmd)
above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved up.
with open(e2e_image_tag_override, 'r') as tag: | ||
e2e_image_tag = tag.read() | ||
|
||
docker_env_ignore = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would much rather we switch to a whitelist rather than a blacklist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah agree. I'll make it a separate PR, I'd rather make a job works as previous first, and dump down what do we really need here.
yield line | ||
|
||
|
||
def load_env(envpath): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider skipping this step entirely and pass --env-file directly to the docker run command.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed all not-yet-needed functions so it's cleaner now.
artifacts = '%s/_artifacts' % workspace | ||
os.makedirs(artifacts) | ||
|
||
# TODO(ixdy): remove when all jobs are setting these vars using Jenkins credentials |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to keep this todo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
buried
PARSER.add_argument( | ||
'--branch', default='master', help='Upstream target repo') | ||
PARSER.add_argument( | ||
'--platform', help='Platform test runs on') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Especially given that --platform is just a wrapper around loading a gce.env, to me the following seems a lot easier to understand:
kubernetes_e2e.py --env_file=gce.env --env_file=kubernetes-e2e-gce.env
In other words I would pretty much eliminate all of these flags. Especially things like --project and --zone which are specific to GCP
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deleted.
@fejta updated, PTAL |
2358abf
to
07bb4b6
Compare
@k8s-bot bazel test this |
'--env-file', | ||
action="append", | ||
help='Job specific environment file') | ||
PARSER.add_argument( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will you match these flags to the kubernetes_kubelet.py scenario? (Update one or the other please)
https://github.com/kubernetes/test-infra/blob/master/scenarios/kubernetes_kubelet.py#L103
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I edited the other one so I don't need include var function here.
|
||
for key, value in os.environ.items(): | ||
if key not in docker_env_ignore: | ||
cmd.extend(['-e', '%s=%s' % (key, value)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure this ordering regresses how things work: specifically if my environment has export INSTANCE_PREFIX=foo
it ill now do the following:
docker run -e INSTANCE_PREFIX=bootstrap-e2e -e INSTANCE_PREFIX=foo
Previously the foo.sh would override INSTANCE_PREFIX=foo with INSTANCE_PREFXI=bootstrap-e2e and then we would only do the following:
env > env.list
docker run --env-file=env.list # aka only one -e INSTANCE_PREFIX=bootstrape2e
Is this change in behavior intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, bah... I thought nobody is going to touch these boilerplates :-(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added a comment. this should ensure local var always overwrites anything else
] | ||
|
||
docker_extra_args = [] | ||
if (os.environ.get('JENKINS_ENABLE_DOCKER_IN_DOCKER') and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't looks like the flaky job uses any of this functionality yet (overriding the runner, docker-in-docker, local binaries). Can we remove it from this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deleted for now
|
||
if args.env_file: | ||
for env in args.env_file: | ||
cmd.extend(['--env-file', env]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suppose foo.env defines INSTANCE_PREFIX and/or KUBE_GKE_NETWORK, which is also set by --e2e-name. Which do you want to take precedence and why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then isn't it better to pre-process the envs like before then?
Since I can have FOO in .env file, and also set FOO locally.
]) | ||
|
||
try: | ||
signal.signal(signal.SIGTERM, sig_handler) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minimize the amount of code inside a try block. Move the signal registration (which will never raise a CalledProcessError) outside of it:
https://google.github.io/styleguide/pyguide.html?showone=Exceptions#Exceptions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
PARSER.add_argument( | ||
'--e2e-name', | ||
default='bootstrap-e2e', | ||
help='If we need to set --down in e2e.go') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copy-paste error
os.environ['PATH'] += ':/usr/local/go/bin' | ||
|
||
# dockerized-e2e-runner goodies setup | ||
workspace = os.environ.get('WORKSPACE') or os.getcwd() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
setdefault?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
something like this?
def main(args): | ||
"""Set up env, trigger runner, handle termination. """ | ||
# pylint: disable=too-many-locals | ||
# pylint: disable=too-many-branches |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please enable too-many-branches (the other two are fine). I would not expect this method to have an excessive number of branches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
enabled
gce_pub = '%s.pub' % gce_ssh | ||
service = '/service-account.json' | ||
|
||
cmd.extend(['-v', '%s:%s:ro' % (args.gce_ssh, gce_ssh)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to combine all these extends into into the cmd = [] above?
I suspect that will enable you to enable the too-many-statements check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep.
f6d7197
to
322e6e3
Compare
|
0afe01c
to
2194bb7
Compare
@fejta PTAL again |
help='Job specific environment file') | ||
|
||
PARSER.add_argument( | ||
'--gce-ssh', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean is that this flag is called --ssh
in the kubelet and --gce-ssh
in the e2e scenarios. I would like the flag names to be the same. Please call both of them the same thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated kubelet, since here later we will need aws-ssh
and aws-pub
gce_pub = '%s.pub' % gce_ssh | ||
service = '/service-account.json' | ||
|
||
cmd.extend(['-v', '%s:%s:ro' % (args.gce_ssh, gce_ssh), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It still seems like these 7 cmd.extend() commands can be combined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
if key not in docker_env_ignore: | ||
cmd.extend(['-e', '%s=%s' % (key, value)]) | ||
|
||
cmd.extend([ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason these two need to go at the end instead of with the other -e stanzas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you can now switch this to cmd.append()
😁
check(*cmd) | ||
print 'Finished Successfully' | ||
except subprocess.CalledProcessError as exc: | ||
print 'Exiting with code: %d' % exc.returncode # fine? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does the fine? comment mean?
Also, consider writing errors to stderr instead of stdout
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done. fine is my best wish to the failing test...
|
||
lookup = {'workspace':os.environ.get('WORKSPACE')} | ||
|
||
# Get golang into our PATH so we can run e2e.go |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not understand this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes but why is it there? What is it trying to do? If you are changing os.environ but not using it directly it is a pretty strong signal to think about what it is trying to do and if there is a better way to do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does not seem to do anything since PATH is specifically not mapped into docker? From the comment I would assume it was some legacy at some point?
lookup = {'workspace':os.environ.get('WORKSPACE')} | ||
|
||
# Get golang into our PATH so we can run e2e.go | ||
os.environ['PATH'] += ':/usr/local/go/bin' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: consider using % instead of +=
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deleted.
# dockerized-e2e-runner goodies setup | ||
workspace = lookup.setdefault('workspace', os.getcwd()) | ||
artifacts = '%s/_artifacts' % workspace | ||
os.makedirs(artifacts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will fail if $WORKSPACE/_artifacts
already exists, unlike mkdir -p
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added check.
e2e_image_tag = 'v20170104-9031f1d' | ||
e2e_image_tag_override = '%s/hack/jenkins/.kubekins_e2e_image_tag' % workspace | ||
if os.path.isfile(e2e_image_tag_override): | ||
with open(e2e_image_tag_override, 'r') as tag: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: 'r'
is unnecessary as this is the default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
# Use default component update behavior | ||
cmd.extend(['-e', 'CLOUDSDK_EXPERIMENTAL_FAST_COMPONENT_UPDATE=\'false\'']) | ||
|
||
cmd.extend(['-e', 'E2E_UP=%s' % args.e2e_up, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: if you're using multiple lines please do a line break beforehand so that we do not need to adjust things if you change the name of cmd to commmand.
Aka if you do this:
change_cmd.extend(['if I change this line',
'Note how this line is now misaligned'])
change_cmd.extend([
'However with this strategy',
'the padding does not change when I change the first line'])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
try: | ||
check(*cmd) | ||
print 'Finished Successfully' | ||
except subprocess.CalledProcessError as exc: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file needs to raise and exception and/or exit 1 whenever tests fail. Right now main() will always exit 0 regardless of whether tests fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was once intended to trap return signal from exc.returncode
.
Maybe I can get rid of this try block all together just call check(*cmd)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
raises the exception and let upper layer deals with it.
Bazel test failed for commit 7a7bbf2. Full PR test history. cc @krzyzacy The magic incantation to run this job again is Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
5d838f3
to
6eb2916
Compare
another iteration, comments addressed. |
print 'Finished Successfully' | ||
except subprocess.CalledProcessError as exc: | ||
print >>sys.stderr, 'Exiting with code: %d' % exc.returncode | ||
raise exc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please just use raise
to preserve the stack trace
Compare:
def Raises():
raise ValueError('hello there')
def CatchAndRaise():
try:
Raises()
except ValueError as ve:
raise ve
def OnlyRaise():
try:
Raises()
except ValueError as ve:
raise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got rid of this try-catch
"""Set up env, trigger runner, handle termination. """ | ||
# pylint: disable=too-many-locals | ||
|
||
lookup = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, why do you need this variable? It doesn't seem to serve a useful purpose
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ehhhh isn't it I need to declare a dict first?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am pretty sure these two things do the same thing:
lookup = 5
lookup += alpha
workspace = lookup + beta # lookup is pointless, get rid of it
workspace = 5 + alpha + beta # same result
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ehhh then how do I call setdefault?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need to call that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
|
||
# Assume we're upping, testing, and downing a cluster by default | ||
PARSER.add_argument( | ||
'--e2e-up', default='true', help='If we need to set --up in e2e.go') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that the scenario is kubernetes_e2e.py it seems redundant to prefix flags with --e2e, especially since not all the flags have this prefix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
PARSER.add_argument( | ||
'--e2e-down', default='true', help='If we need to set --down in e2e.go') | ||
PARSER.add_argument( | ||
'--e2e-name', default='bootstrap-e2e', help='Name of the e2e test') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about --cluster since the primary purpose of this is to name the cluster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
realjob = self.GetRealBootstrapJob('job-configs/kubernetes-jenkins/bootstrap-ci.yaml', job) | ||
self.assertTrue(realjob) | ||
docker_timeout = realjob['timeout'] | ||
print docker_timeout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this left over debug code? Please assert something here, rather than print.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
@@ -51,413 +53,605 @@ | |||
- kubernetes-e2e-aws: | |||
job-name: ci-kubernetes-e2e-aws | |||
frequency: 'H/5 * * * *' # At least every 30m | |||
timeout: 140 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For jobs you aren't changing please leave the timeout at 0.
Also given the enormous number of jobs you're changing, will you send me a separate PR that adds --json='{json}' --timeout='{timeout}'
to the template, and then sets these values to zero to all jobs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
self.assertTrue('scenario' in config[job], job) | ||
self.assertTrue(os.path.isfile(bootstrap.test_infra('scenarios/%s.py' % config[job]['scenario'])), job) | ||
if 'args' in config[job]: | ||
for arg in config[job].get('args', []): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The if statement makes the get('args', []) redundant. Please only keep one of these
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
self.assertTrue(os.path.isfile(bootstrap.test_infra('scenarios/%s.py' % config[job]['scenario'])), job) | ||
if 'args' in config[job]: | ||
for arg in config[job].get('args', []): | ||
m = re.match(r'--env-file=([^\"]+)', arg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the scenario is kubernetes-e2e I would validate that there is at least one --env-file flag
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
@@ -1527,6 +1581,19 @@ def testNoBadVarsInJobs(self): | |||
if bad_vars: | |||
self.fail('Job %s contains bad bash variables: %s' % (job, ' '.join(bad_vars))) | |||
|
|||
def testValidJobEnvs(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fantastic unit test!
@@ -1403,10 +1409,33 @@ def CheckBootstrapYaml(self, path, check, suffix='suffix', use_json=False): | |||
self.assertTrue(job_name) | |||
self.assertEquals(job_name, real_job.get('job-name')) | |||
|
|||
def GetRealBootstrapJob(self, path, job): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should CheckBootstrapYaml use this function? It seems to duplicate a lot of functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO This is a dump down version and 'assumes' everything would work without error after passes CheckBootstrapYaml (which has all the assertions)..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems slow and silly to load the yaml file twice
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh indeed it will load the yaml n times :-(... I'll do something here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about something like this
8593c52
to
35e3f9a
Compare
@fejta updated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking really good. This is really close. Just a couple nits and then would like to do a little bit more on the realjobs situation.
if key not in docker_env_ignore: | ||
cmd.extend(['-e', '%s=%s' % (key, value)]) | ||
|
||
cmd.extend([ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you can now switch this to cmd.append()
😁
|
||
|
||
def sig_handler(_signo, _frame): | ||
"""Handle signal.SIGTERM and signal.SIGINT.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noting that this stops the container would be more useful documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
|
||
|
||
def main(args): | ||
"""Set up env, trigger runner, handle termination. """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: starts kubekins-e2e may be a better description than trigger runner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
@@ -0,0 +1,8 @@ | |||
# GKE E2E ENVS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed by this pr, please remove
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
mat = re.match(r'export KUBEKINS_TIMEOUT="(\d+)m".*', line) | ||
else: | ||
mat = re.match(r'KUBEKINS_TIMEOUT="(\d+)m".*', line) | ||
realjob = self.GetRealBootstrapJob(job) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem like a super great strategy either. Specifically you're now depending on the fact that testValidTimeout runs after testBootstrapCIYaml based on where they are defined in this file, which does not seem like a great practice.
@@ -1416,10 +1426,16 @@ def CheckBootstrapYaml(self, path, check, suffix='suffix', use_json=False): | |||
self.assertTrue(job_name) | |||
self.assertEquals(job_name, real_job.get('job-name')) | |||
|
|||
def GetRealBootstrapJob(self, job): | |||
key = os.path.splitext(job.strip())[0][3:] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider refactoring this to check and see whether key is in self.realjobs, returning the cached version if it is, otherwise loading the yaml. Then update CheckBootstrapYaml() to call this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohhhh... the issue here is that from the job I would not know which yaml file to load
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pass in job_path as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
job_path
is going to be path/to/jobs/FOO.sh
rather than bootstrap-foo.yaml
, and jobs are gathered from config.json
and iter all files from jobs/
dir. I can list out all the existing yaml files and load them, if a job is not cached yet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds like it will be easier to put it in the yaml test then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like this? It works if I move testValidTimeout to the front.
LGTM aside from the GetRealBootstrapJob fix |
/lgtm |
let the fire burn! |
…r images as necessary. (kubernetes#1581)
split up #1569 #1525
ref #1225 #1226 #1230
This includes all the unit-test I'd like to add.
Pick an always-failing job to muck with :-)