openstackgerrit | James E. Blair proposed zuul/zuul-jobs master: Add upload-logs-google role https://review.opendev.org/703711 | 00:07 |
*** armstrongs has joined #zuul | 00:09 | |
*** mattw4 has quit IRC | 00:10 | |
*** jamesmcarthur has joined #zuul | 01:26 | |
*** jamesmcarthur has quit IRC | 01:38 | |
*** jamesmcarthur has joined #zuul | 01:47 | |
*** armstrongs has quit IRC | 02:01 | |
*** saneax has joined #zuul | 02:35 | |
*** jamesmcarthur has quit IRC | 02:50 | |
*** jamesmcarthur has joined #zuul | 03:13 | |
*** bhavikdbavishi has joined #zuul | 03:26 | |
*** bhavikdbavishi1 has joined #zuul | 03:29 | |
*** saneax has quit IRC | 03:29 | |
*** bhavikdbavishi has quit IRC | 03:31 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 03:31 | |
*** jamesmcarthur has quit IRC | 04:14 | |
*** jamesmcarthur has joined #zuul | 04:15 | |
*** jamesmcarthur has quit IRC | 04:21 | |
*** jamesmcarthur has joined #zuul | 04:45 | |
*** rlandy has quit IRC | 04:47 | |
*** jamesmcarthur has quit IRC | 04:49 | |
*** raukadah is now known as chandankumar | 04:50 | |
tobiash | clarkb: regarding test runtime, it's true that a large part of the total test run is git operations; a year ago I experimented a bit with replacing gitpython with pygit2, which improved performance quite a bit | 05:32 |
*** evrardjp has quit IRC | 05:34 | |
*** evrardjp has joined #zuul | 05:34 | |
*** sgw1 has joined #zuul | 05:39 | |
*** sgw has quit IRC | 05:41 | |
*** saneax has joined #zuul | 07:24 | |
tobiash | frickler: regarding your question about recheck: in the zuul tenant we dropped the clean check requirement to enable us to quickly take critical zuul changes into gate | 07:57 |
*** themroc has joined #zuul | 08:10 | |
*** tosky has joined #zuul | 08:27 | |
*** jpena|off is now known as jpena | 08:43 | |
reiterative | fungi Thanks! I was using the git connector - changing it to gerrit has done the trick! | 08:48 |
openstackgerrit | Simon Westphahl proposed zuul/nodepool master: Centralize logging adapters https://review.opendev.org/703407 | 08:50 |
openstackgerrit | Simon Westphahl proposed zuul/nodepool master: Pass node request handler to launcher base class https://review.opendev.org/703549 | 08:50 |
openstackgerrit | Simon Westphahl proposed zuul/nodepool master: Annotate logs in launcher https://review.opendev.org/703558 | 08:50 |
openstackgerrit | Simon Westphahl proposed zuul/nodepool master: Annotate logs in node request handler https://review.opendev.org/703559 | 08:50 |
openstackgerrit | Simon Westphahl proposed zuul/nodepool master: Include event id in node request listings https://review.opendev.org/703560 | 08:50 |
openstackgerrit | Simon Westphahl proposed zuul/nodepool master: Annotate logs in zk module https://review.opendev.org/703561 | 08:50 |
*** hashar has joined #zuul | 08:53 | |
*** yolanda has quit IRC | 08:54 | |
*** yolanda has joined #zuul | 09:00 | |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: ensure-tox: use pip3 in preference to pip https://review.opendev.org/703694 | 09:22 |
openstackgerrit | Benjamin Schanzel proposed zuul/zuul master: Fix Test Case "TestScheduler.test_timer_with_jitter" https://review.opendev.org/703749 | 09:22 |
openstackgerrit | Benjamin Schanzel proposed zuul/zuul master: Fix Test Case "TestScheduler.test_timer_with_jitter" https://review.opendev.org/703749 | 09:25 |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: ensure-tox: use pip3 in preference to pip https://review.opendev.org/703694 | 09:39 |
openstackgerrit | Merged zuul/zuul master: tox: reduce deps used for pep8 env https://review.opendev.org/703634 | 09:47 |
openstackgerrit | Benjamin Schanzel proposed zuul/zuul master: Fix Test Case "TestScheduler.test_timer_with_jitter" https://review.opendev.org/703749 | 09:53 |
*** bhavikdbavishi has quit IRC | 10:12 | |
openstackgerrit | Merged zuul/zuul-jobs master: Make ara-report role to zuul_return an artifact https://review.opendev.org/697681 | 10:33 |
openstackgerrit | Clément Mondion proposed zuul/nodepool master: add tags support for aws provider https://review.opendev.org/703651 | 10:37 |
openstackgerrit | Clément Mondion proposed zuul/nodepool master: add tags support for aws provider https://review.opendev.org/703651 | 10:47 |
openstackgerrit | Antoine Musso proposed zuul/zuul master: test: prevent ResourceWarning in test_bubblewrap https://review.opendev.org/703767 | 10:51 |
openstackgerrit | Clément Mondion proposed zuul/nodepool master: add tags support for aws provider https://review.opendev.org/703651 | 10:55 |
*** hashar has quit IRC | 10:56 | |
*** pcaruana has joined #zuul | 11:02 | |
*** themroc has quit IRC | 11:09 | |
*** xeivieni has joined #zuul | 11:13 | |
*** bhavikdbavishi has joined #zuul | 11:17 | |
*** bhavikdbavishi1 has joined #zuul | 11:20 | |
*** bhavikdbavishi has quit IRC | 11:21 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 11:21 | |
*** xeivieni has quit IRC | 11:23 | |
*** hashar has joined #zuul | 11:46 | |
*** zxiiro has quit IRC | 11:56 | |
openstackgerrit | Antoine Musso proposed zuul/zuul master: test: prevent ResourceWarning in test_client https://review.opendev.org/703782 | 11:58 |
*** avass has joined #zuul | 12:04 | |
*** jpena is now known as jpena|lunch | 12:16 | |
*** dmellado has quit IRC | 12:24 | |
*** dmellado has joined #zuul | 12:26 | |
*** hashar has quit IRC | 12:27 | |
*** zbr has quit IRC | 12:34 | |
*** zbr has joined #zuul | 12:35 | |
*** zbr_ has joined #zuul | 12:43 | |
*** zbr has quit IRC | 12:46 | |
*** zbr_ has quit IRC | 12:46 | |
*** zbr has joined #zuul | 12:48 | |
*** rlandy has joined #zuul | 12:59 | |
*** avass has quit IRC | 13:05 | |
*** jamesmcarthur has joined #zuul | 13:18 | |
*** jpena|lunch is now known as jpena | 13:21 | |
*** zbr has quit IRC | 13:26 | |
*** avass has joined #zuul | 13:28 | |
*** zbr has joined #zuul | 13:29 | |
openstackgerrit | Clément Mondion proposed zuul/nodepool master: add tags support for aws provider https://review.opendev.org/703651 | 13:33 |
*** jamesmcarthur has quit IRC | 13:34 | |
openstackgerrit | Clément Mondion proposed zuul/nodepool master: add tags support for aws provider https://review.opendev.org/703651 | 13:35 |
*** jmaselbas has joined #zuul | 13:39 | |
*** jamesmcarthur has joined #zuul | 13:49 | |
*** sshnaidm is now known as sshnaidm|mtg | 13:55 | |
tobiash | mordred: are you here by chance? | 14:03 |
tobiash | the nodepool image build job has a problem because gcc is not available (in the nodepool-base target, not the builder stage) | 14:03 |
tobiash | looks like pip doesn't install the netifaces wheel but rebuilds it again | 14:04 |
tobiash | despite having the wheel cache here: https://opendev.org/opendev/system-config/src/branch/master/docker/python-builder/scripts/install-from-bindep#L26 | 14:04 |
openstackgerrit | Clint 'SpamapS' Byrum proposed zuul/zuul master: Add irrelevant-branches negative matcher https://review.opendev.org/552809 | 14:05 |
openstackgerrit | Clint 'SpamapS' Byrum proposed zuul/zuul master: Use re2 for change_matcher https://review.opendev.org/536389 | 14:05 |
fungi | tobiash: have a link to an example run where it's rebuilding from sdist? | 14:13 |
*** hashar has joined #zuul | 14:13 | |
*** jamesmcarthur has quit IRC | 14:14 | |
tobiash | fungi: https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_899/703407/3/check/nodepool-build-image/899314b/job-output.txt | 14:14 |
tobiash | essentially it just ignores the prebuilt wheels from the step before | 14:14 |
openstackgerrit | Clint 'SpamapS' Byrum proposed zuul/zuul-jobs master: Add a markdownlint job and role https://review.opendev.org/607691 | 14:20 |
tobiash | sounds a bit like https://github.com/pypa/pip/issues/6852 | 14:23 |
tobiash | but this is marked as fixed | 14:23 |
tobiash | we might need to update pip in the python-builder and python-base images due to https://github.com/pypa/pip/issues/6852; maybe the fix is not in the pip version that is used | 14:27 |
tobiash | I'll try that locally | 14:27 |
*** hashar has quit IRC | 14:28 | |
*** jmaselbas has left #zuul | 14:29 | |
fungi | tobiash: one thing i notice is that the reusable wheel cache in our ci system only carries python 3.6 wheels, not 3.7: http://files.openstack.org/mirror/wheel/ubuntu-18.04-x86_64/n/netifaces/ | 14:30 |
fungi | and that job is using 3.7 | 14:31 |
tobiash | fungi: the build works like this: builder image builds wheels, then the nodepool-stage image gets them from the builder image and uses the cache pip built there | 14:31 |
tobiash | so this shouldn't use the wheel mirror but the local wheel cache | 14:31 |
fungi | i see, and i agree the log does show it building twice | 14:32 |
fungi | it reuses the cached download of the sdist, not the cached wheel it built | 14:32 |
tobiash | yes | 14:34 |
openstackgerrit | Clint 'SpamapS' Byrum proposed zuul/zuul master: Use re2 for change_matcher https://review.opendev.org/536389 | 14:35 |
tobiash | fungi: confirmed, upgrading pip in builder and base image prior to building the wheels fixes it locally, I'll upload a fix | 14:38 |
fungi | tobiash: oh, awesome--thanks! | 14:44 |
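The two-stage image flow tobiash describes can be sketched as a hypothetical Dockerfile fragment (image names and paths here are assumptions, not the actual opendev/system-config setup); the fix discussed above is the pip upgrade before any wheels are built or installed:

```dockerfile
# builder stage: has a compiler, builds wheels for every requirement
FROM docker.io/library/python:3.7-slim AS builder
RUN pip install -U pip   # older pip ignored its own built wheels (pypa/pip#6852)
RUN apt-get update && apt-get install -y gcc
COPY requirements.txt .
RUN pip wheel -r requirements.txt -w /output/wheels

# base stage: no gcc, so it must install only prebuilt wheels
FROM docker.io/library/python:3.7-slim AS nodepool-base
RUN pip install -U pip
COPY --from=builder /output/wheels /output/wheels
COPY requirements.txt .
RUN pip install --no-index --find-links=/output/wheels -r requirements.txt
```

With a buggy pip, the second stage would fall back to rebuilding netifaces from sdist and fail for lack of gcc.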
*** jamesmcarthur has joined #zuul | 14:45 | |
fungi | i wonder if we shouldn't also try to use opendev's wheel cache, though that wouldn't have helped in this particular case at the moment | 14:45 |
*** jamesmcarthur has quit IRC | 14:46 | |
*** jamesmcarthur has joined #zuul | 14:46 | |
tobiash | remote: https://review.opendev.org/703807 Upgrade pip in python-builder and base | 14:48 |
tobiash | this should unbreak nodepool jobs ^ | 14:48 |
*** sgw1 is now known as sgw | 14:48 | |
tobiash | in case it's hard to land this quickly I could also do the same workaround in nodepool in the meantime | 14:49 |
openstackgerrit | Tobias Henkel proposed zuul/nodepool master: Temporarily fix image build in nodepool https://review.opendev.org/703811 | 14:52 |
tobiash | this fix works locally in nodepool ^ | 14:52 |
*** saneax has quit IRC | 14:57 | |
openstackgerrit | Clément Mondion proposed zuul/nodepool master: add tags support for aws provider https://review.opendev.org/703651 | 15:06 |
*** chandankumar is now known as chkumar|ruck | 15:10 | |
*** swest has quit IRC | 15:12 | |
*** zxiiro has joined #zuul | 15:12 | |
Shrews | tobiash: thanks for looking into that. is there a particular version of pip we should avoid? | 15:12 |
Shrews | or is just the upgrade enough? | 15:13 |
tobiash | Shrews: upgrade is enough afaik | 15:13 |
Shrews | k | 15:13 |
fungi | 20.0.0 had a nasty bug but as long as you use 20.0.1 it should be fine | 15:14 |
Shrews | i've approved it | 15:14 |
tobiash | it looks like that caching was broken in all versions until last november | 15:14 |
fungi | yes, that's how i read it as well | 15:14 |
tobiash | I'm just not sure why we're hitting this just now | 15:14 |
fungi | did we recently switch the job to python 3.7? | 15:14 |
tobiash | maybe we got upstream built versions of netifaces before | 15:15 |
tobiash | or that | 15:15 |
fungi | netifaces publishes manylinux1 wheels for 3.4 through 3.6 but not (yet) 3.7 | 15:15 |
fungi | i expect they'll add 3.7 wheels the next time they make a new release | 15:15 |
Shrews | were the zuul upload image jobs not hitting this? | 15:19 |
tobiash | Shrews: zuul didn't hit this because it probably doesn't pull in netifaces | 15:20 |
*** chkumar|ruck is now known as chkumar|rover | 15:23 | |
openstackgerrit | Clément Mondion proposed zuul/nodepool master: add tags support for aws provider https://review.opendev.org/703651 | 15:31 |
*** chkumar|rover is now known as raukadah | 15:35 | |
*** sshnaidm|mtg is now known as sshnaidm | 15:37 | |
corvus | looks like the python-builder change is going through, so maybe we don't need to merge the nodepool change? | 15:50 |
Shrews | corvus: that's correct | 15:53 |
openstackgerrit | Merged zuul/zuul master: test: prevent ResourceWarning in test_bubblewrap https://review.opendev.org/703767 | 16:01 |
openstackgerrit | Clément Mondion proposed zuul/nodepool master: add tags support for aws provider https://review.opendev.org/703651 | 16:07 |
tristanC | would it be possible to make zuul operate on all the projects hosted on a platform? | 16:12 |
tristanC | for example, src.fedoraproject.org hosts 30k projects and it seems like adding them all to the tenant configuration is not going to be sustainable. | 16:12 |
tristanC | also i noticed gerrit sends a 'project-created' event; would it then be possible to tell zuul to add such a new project to the untrusted list of a tenant? | 16:13 |
corvus | tristanC: i think the first would be possible (but not trivial). i think the second idea would depend on the first. | 16:17 |
corvus | tristanC: to do the first, i think the configloader would need to ask the source-connection for the list of projects. the second would just have to trigger a full-reconfigure when that event arrives. | 16:18 |
clarkb | corvus: in the case of gerrit at least, doesn't the project list essentially act as filter of gerrit events? | 16:18 |
corvus | tristanC: keep in mind that order matters with projects listed in the tenant config, so you'd want to be able to support listing some projects explicitly, then having the system collect the rest. | 16:18 |
clarkb | if that is how it works wouldn't the easiest thing be to not filter those events and handle all of them? | 16:19 |
clarkb | ah its the config ordering that matters | 16:19 |
corvus | clarkb: zuul expects to know all the projects it manages | 16:19 |
corvus | (i think there are many places that depend on verifying that a project is known; i think changing that would be a big job) | 16:20 |
corvus | tristanC: obviously for the moment, you could write a quick python script to write out the tenant yaml for fedora and see how zuul handles 30k projects | 16:23 |
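Such a quick script might look like the following sketch (the connection name and project list are made up, and a real script would page through the forge's API rather than hard-code names):

```python
# Hypothetical generator for a Zuul tenant config that lists every
# project on a forge as an untrusted-project.

def render_tenant(tenant_name, connection, projects):
    """Render a minimal Zuul tenant-config YAML document as a string."""
    lines = [
        "- tenant:",
        f"    name: {tenant_name}",
        "    source:",
        f"      {connection}:",
        "        untrusted-projects:",
    ]
    lines += [f"          - {p}" for p in sorted(projects)]
    return "\n".join(lines) + "\n"

# In a real script this list would come from the forge's API;
# here it is hard-coded for illustration.
projects = ["rpms/kernel", "rpms/bash", "tests/zuul-demo"]
print(render_tenant("fedora", "pagure", projects))
```

Sorting keeps the output stable across runs, which matters because (as noted above) project order in the tenant config is significant.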
*** mattw4 has joined #zuul | 16:24 | |
Shrews | tobiash: fyi, i rechecked 703407 to see if nodepool is unbroken | 16:25 |
pabelanger | corvus: tristanC: Couldn't https://zuul-ci.org/docs/zuul/discussion/components.html#attr-scheduler.tenant_config_script be used? Or | 16:27 |
pabelanger | did I not understand the comments | 16:27 |
corvus | pabelanger: sure. whether a temporary script is run by zuul or manually, it's the same. but i think tristanC is suggesting that "all the projects in the system" is a reasonably-common enough case that maybe it should just be supported by the drivers. | 16:28 |
tristanC | the issue with the current tenant configuration is that on restart or reconfiguration, the scheduler serially lists all the branches of all the projects, which i think would just take too long for that many projects | 16:29 |
corvus | that sounds like a very different question | 16:30 |
*** tosky has quit IRC | 16:31 | |
tristanC | yes, i was wondering whether, in the event of such 'operate on all the projects' support, zuul could implement lazy loading of the projects | 16:31 |
corvus | that's more like clarkb's suggestion. i think it would require major changes to zuul. | 16:31 |
corvus | an alternative would be a new kind of reconfiguration that keeps the branch cache of existing projects and just does lookups on new ones. | 16:33 |
tristanC | or perhaps with the scheduler ha, then that won't be necessary as i guess the project status (e.g. branch list and conf) would be stored in zookeeper | 16:33 |
corvus | yes, i anticipate that would be the case | 16:33 |
corvus | implementing the "new kind of reconfiguration that keeps the branch cache of existing projects and just does lookups on new ones" now should be compatible with the future ha scheduler work | 16:34 |
Shrews | that would be interesting to see the impact on ZK that would have for many projects :) | 16:34 |
corvus | Shrews: yeah, it's unclear how much data we'll be able to store there. maybe we can put all of the cached config data in zk, or maybe it's too much and we'll have to put checksums in there and have the schedulers each get a copy of the data and store it locally | 16:38 |
tristanC | corvus: with that caching feature, how would zuul knows if the cache is consistent? | 16:38 |
corvus | tristanC: which cache? | 16:38 |
tristanC | "the branch cache of existing projects" | 16:39 |
corvus | tristanC: that exists now -- the start of the configuration process is to create an empty cache for every project-branch, then load the raw text of the config into that cache. | 16:39 |
tristanC | oh, i thought it would be persisted on disk | 16:40 |
corvus | ah, no that's a ram cache | 16:40 |
corvus | when we move to ha scheduler, we will need to persist it and deal with coherency | 16:40 |
corvus | my suggestion of how to deal with the fact that creating and repopulating the current cache on full-reconfigure is too slow with 30k projects is to add a new kind of reconfigure (delta-reconfigure?) that just initializes the branch cache for each new project (and removes it for each old project) | 16:41 |
tristanC | iiuc, delta-reconfigure could be implemented without changing the scheduler internals | 16:42 |
corvus | tristanC: correct, only scheduler change is the event handling around it. it would mostly be a change to configloader to do less than it does on full-reconfigure. | 16:43 |
corvus | tristanC, clarkb: the main reason that zuul needs to know about all the project branches in the system is that's how it knows what configuration to read. imagine a system where that wasn't the case, and a change to dib arrived and tried to run one of the functest jobs and failed because zuul had not loaded the nodepool project config yet. | 16:43 |
tristanC | and then, with further work on the ha scheduler, we would be able to cold restart the scheduler without looking up every project | 16:44 |
corvus | tristanC: yes, so that most reconfigurations are just deltas | 16:44 |
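The delta-reconfigure idea can be sketched in a few lines of Python (the cache shape and lookup function are invented for illustration, not zuul's actual internals):

```python
# Toy sketch: keep the branch cache for projects still in the tenant
# config, look up branches only for added projects, drop removed ones.

def delta_reconfigure(branch_cache, new_projects, lookup_branches):
    current = set(branch_cache)
    wanted = set(new_projects)
    for project in current - wanted:   # removed projects: drop their cache
        del branch_cache[project]
    for project in wanted - current:   # new projects: one lookup each
        branch_cache[project] = lookup_branches(project)
    return branch_cache                # unchanged projects are untouched

cache = {"zuul/zuul": ["master"], "zuul/old": ["master"]}
delta_reconfigure(cache, ["zuul/zuul", "zuul/nodepool"],
                  lambda p: ["master"])
# cache now holds zuul/zuul (kept) and zuul/nodepool (looked up once)
```

The point of the sketch is that the cost is proportional to the delta, not to the 30k projects already cached.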
tristanC | alright, thank you for the prompt feedback and suggestions | 16:44 |
fungi | one thing i worry about with relying on project-created events from gerrit is what happens when you miss one of those | 16:45 |
corvus | (i'm sure we could engineer some compromises to support lazy-loading, but it would be a significant design/engineering effort) | 16:45 |
corvus | fungi: yep; one of the ideas of full-reconfigure is it is always an option to fix things if they somehow get out of sync | 16:45 |
tristanC | oh, and the other thing we would need is a 'get-all-projects' source driver function, and a way to tell a tenant to add all the missing projects from a source to the untrusted list | 16:45 |
fungi | if zuul misses a patchset-created or comment-added event or whatever then it's more easily addressed | 16:45 |
corvus | i have some errands to run this morning, my availability may be limited. | 16:46 |
corvus | tristanC: correct | 16:46 |
fungi | i'm finishing up a zuul section for an upcoming osf newsletter... aside from the zuul and nodepool releases in december/january (and the features and changes they brought), is there anything else worth mentioning since november? | 16:46 |
fungi | oh, i guess the renewal of the pl position | 16:46 |
fungi | documentation overhaul | 16:46 |
clarkb | documentation overhaul is probably worthy of the focus there given the audience? | 16:47 |
fungi | yep | 16:47 |
pabelanger | speaking of specs for zuul. What are folks thoughts on maybe starting up again the discussion on circular dependencies: https://review.opendev.org/643309/ ? | 16:47 |
pabelanger | The topic recently came up on the ansible side, with the new collection split that is happening | 16:48 |
openstackgerrit | Clément Mondion proposed zuul/nodepool master: add tags support for aws provider https://review.opendev.org/703651 | 16:48 |
pabelanger | tobiash: ^ I am not sure of simon's IRC handle | 16:48 |
tobiash | pabelanger: his irc handle is swest | 16:52 |
pabelanger | thanks! | 16:54 |
Shrews | tobiash: nodepool-build-image passed. \o/ | 17:07 |
Shrews | thx again | 17:07 |
fungi | anyone recall where we discussed formalizing our ansible support lifecycle? i found pabelanger's e-mail from october: http://lists.zuul-ci.org/pipermail/zuul-discuss/2019-October/001043.html | 17:09 |
tobiash | \o/ | 17:10 |
fungi | i'm not seeing it in any of the review comments for removing 2.5, adding 2.9 or switching the default to 2.8 | 17:10 |
Shrews | fungi: https://zuul-ci.org/docs/zuul/reference/developer/specs/multiple-ansible-versions.html?highlight=ansible#deprecation-policy | 17:10 |
fungi | haha, thanks Shrews!!! | 17:10 |
fungi | looks like we should add it to https://zuul-ci.org/docs/zuul-jobs/policy.html#deprecation-policy | 17:12 |
*** mattw4 has quit IRC | 17:12 | |
fungi | or...somewhere | 17:14 |
fungi | though i recall discussion getting more detailed about how to decide which versions to deprecate, which to select as default | 17:14 |
Shrews | hrm, don't recall | 17:16 |
*** mattw4 has joined #zuul | 17:19 | |
pabelanger | fungi: http://eavesdrop.openstack.org/irclogs/%23zuul/%23zuul.2019-12-09.log.html#t2019-12-09T16:22:34 is last time I remember it coming up. I did sign up for doc on ansible removal, but haven't started it TBH | 17:21 |
*** sshnaidm is now known as sshnaidm|afk | 17:27 | |
openstackgerrit | Clément Mondion proposed zuul/nodepool master: add tags support for aws provider https://review.opendev.org/703651 | 17:27 |
fungi | pabelanger: no worries, i was going to try to link to some record of the plan, but maybe i'll just say it's in the process of being formalized instead | 17:27 |
*** evrardjp has quit IRC | 17:34 | |
*** evrardjp has joined #zuul | 17:34 | |
*** jpena is now known as jpena|off | 17:48 | |
openstackgerrit | Merged zuul/zuul master: tests: remove test_repo_repr https://review.opendev.org/703698 | 17:49 |
openstackgerrit | Clint 'SpamapS' Byrum proposed zuul/zuul-jobs master: Add a markdownlint job and role https://review.opendev.org/607691 | 17:50 |
*** jamesmcarthur has quit IRC | 18:05 | |
*** jamesmcarthur has joined #zuul | 18:06 | |
*** jamesmcarthur has quit IRC | 18:07 | |
*** bhavikdbavishi has quit IRC | 18:09 | |
*** jamesmcarthur has joined #zuul | 18:38 | |
*** hashar has joined #zuul | 18:56 | |
*** jamesmcarthur has quit IRC | 19:00 | |
*** paladox is now known as paladox_UK_IN_EU | 19:09 | |
*** paladox_UK_IN_EU is now known as paladox | 19:09 | |
*** jamesmcarthur has joined #zuul | 19:13 | |
openstackgerrit | Merged zuul/nodepool master: Make flake8 config compatible with latest version https://review.opendev.org/703410 | 19:14 |
openstackgerrit | Merged zuul/nodepool master: Handle event id in node requests https://review.opendev.org/703406 | 19:14 |
*** paladox is now known as paladox_UK_IN_EU | 19:15 | |
*** hashar has quit IRC | 19:30 | |
*** hashar has joined #zuul | 19:30 | |
openstackgerrit | Merged zuul/nodepool master: Centralize logging adapters https://review.opendev.org/703407 | 19:34 |
*** gmann is now known as gmann_afk | 19:38 | |
*** paladox_UK_IN_EU is now known as paladox | 19:51 | |
*** patrick34 has joined #zuul | 20:02 | |
patrick34 | Hi | 20:04 |
patrick34 | I am trying to use kubernetes as build nodes for zuul. I was able to successfully configure my cluster in nodepool and I get pods ready in my node list. | 20:05 |
patrick34 | However I can't seem to build anything. I wonder if I am not missing something in the required configs for kubernetes | 20:05 |
patrick34 | My job starts but returns Gathering Facts: MODULE FAILURE: error: You must be logged in to the server (Unauthorized) | 20:06 |
patrick34 | I feel like I might need to export KUBECONFIG somewhere in the zuul configurations.. ? | 20:12 |
*** gmann_afk is now known as gmann | 20:18 | |
pabelanger | what does your nodepool.yaml file look like? | 20:19 |
pabelanger | I don't think kubectl can do facts | 20:21 |
tristanC | pabelanger: nodepool creates a service account for zuul to use, perhaps your kubernetes provider doesn't let service account run exec on pod? | 20:21 |
tristanC | oops, that was meant for patrick34 ^ | 20:21 |
patrick34 | hum I use a vanilla rancher 'k3s' deployment, pretty sure the default provider gives me all permissions | 20:22 |
patrick34 | I use the same kube config file for myself as a test and I can run exec | 20:22 |
patrick34 | here's my nodepool file pabelanger https://gist.github.com/plaurin84/93f01dad7c5f91548b1b9e1279aba04e | 20:25 |
tristanC | patrick34: if you run 'zuul-executor keep', then you'll find the kubeconfig file used by zuul in /var/lib/zuul/builds/{uid}/.kube/config | 20:27 |
pabelanger | thanks, are you able to share job log too? | 20:28 |
clarkb | just remember to turn that off once you've figured it out or your disks can fill up, but that is a good debugging tip /me needs to remember it exist more often | 20:28 |
pabelanger | Gathering Facts, to me is ansible 2.9 | 20:28 |
pabelanger | if that comes from a task | 20:28 |
pabelanger | before 2.9, that wasn't logged | 20:28 |
tristanC | pabelanger: fwiw, Software Factory CI posted a build running with kubectl on https://review.opendev.org/#/c/682049/ | 20:29 |
patrick34 | zuul-executor keep doesn't seem to work Exception: Unable to locate config file in ['/etc/zuul/zuul.conf', '~/zuul.conf'] | 20:30 |
patrick34 | however that file exists | 20:30 |
patrick34 | oh nvm | 20:30 |
pabelanger | ack | 20:30 |
clarkb | permissions issue probably | 20:30 |
patrick34 | was using my reg user, used sudo it works | 20:30 |
patrick34 | sooo what does it do now | 20:31 |
patrick34 | zuul-executor keep | 20:31 |
pabelanger | as tristanC said, you'll start to see build artifacts on your executor now | 20:31 |
pabelanger | so you can inspect content and validate kube.conf is correct | 20:32 |
patrick34 | oh I see | 20:32 |
patrick34 | in the logs if I run a job? | 20:32 |
tristanC | patrick34: you need to recheck a failed job, then look for a .kube/config file in the zuul home dir | 20:33 |
patrick34 | ok | 20:33 |
tristanC | patrick34: arg, you would also need to use autohold to keep the namespace | 20:33 |
patrick34 | okay will do | 20:34 |
patrick34 | pods are quick gotta be fast | 20:34 |
corvus | the .kube/config should show up in the build dir, right? | 20:35 |
tristanC | corvus: yes, it's setup here: https://opendev.org/zuul/zuul/src/branch/master/zuul/executor/server.py#L1722 | 20:36 |
corvus | so once you recheck a job, it should be in a directory like /var/lib/zuul/builds/$UUID/work/.kube/config where $UUID is the unique id of the build (you can see it in the logs) | 20:36 |
pabelanger | what does work/.kube/config contain? | 20:37 |
tristanC | pabelanger: the service account token to access the namespace created by nodepool | 20:38 |
pabelanger | tristanC: is that onetime use? | 20:38 |
tristanC | pabelanger: yes, it's auto created by the k8s service for service account | 20:39 |
pabelanger | ack, mostly wondering if that is something we could collect to aid debugging for jobs | 20:40 |
corvus | so if the namespace is held too, you can run "kubectl --kubeconfig /var/lib/zuul/builds/$UUID/work/.kube/config version" and verify things are working. | 20:40 |
corvus | and kubectl exec, etc | 20:41 |
*** wxy-xiyuan has quit IRC | 20:41 | |
patrick34 | checking | 20:41 |
tristanC | patrick34: are you running nodepool-3.11.0 ? | 20:43 |
patrick34 | 3.9.1.dev6 | 20:44 |
patrick34 | nodepool seems to be working fine with kubernetes | 20:44 |
patrick34 | okay I got my debug node | 20:44 |
tristanC | patrick34: 3.11.0 includes a fix regarding how the token it creates for zuul is encoded, you might need to upgrade to get zuul able to use the pod | 20:45 |
tristanC | patrick34: ftr it's https://review.opendev.org/687435 | 20:45 |
patrick34 | I can't find any file or zuul config file in the pod | 20:46 |
patrick34 | btw here's the log https://gist.github.com/plaurin84/c49d973e4bdfa7903d545facb2f8c4c4 | 20:46 |
patrick34 | also I don't get the 'namespace' part of the nodepool config. It creates the namespace for the pod to run, but what's the other namespace for? | 20:47 |
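For reference, nodepool's kubernetes driver supports two label types: a `namespace` label hands the job an empty namespace to create resources in, while a `pod` label hands it a ready-to-use pod. A minimal provider sketch (provider, pool, and image names are placeholders, not taken from patrick34's config):

```yaml
providers:
  - name: my-k8s
    driver: kubernetes
    context: default
    pools:
      - name: main
        labels:
          - name: kubernetes-namespace
            type: namespace
          - name: pod-fedora
            type: pod
            image: docker.io/library/fedora:31
```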
pabelanger | and if you look in executor logs, you see the traceback from ansible right? | 20:47 |
patrick34 | yes | 20:48 |
openstackgerrit | James E. Blair proposed zuul/zuul-jobs master: Add upload-logs-google role https://review.opendev.org/703711 | 20:48 |
corvus | patrick34: the config file would be on the executor in a path like /var/lib/zuul/builds/$UUID/work/.kube/config | 20:48 |
*** panda has quit IRC | 20:49 | |
patrick34 | okay yes I see this | 20:51 |
*** panda has joined #zuul | 20:51 | |
patrick34 | oh it's a bit different than my config file | 20:51 |
patrick34 | the context uses a user that is not 'default' | 20:52 |
patrick34 | shouldn't nodepool or the executor 'create' this user in the cluster? | 20:52 |
tristanC | patrick34: nodepool creates a namespace and service account for each build | 20:53 |
patrick34 | I see | 20:54 |
patrick34 | I only see the default service account in my cluster | 20:54 |
tristanC | patrick34: i suspect nodepool-3.9.1.dev6 has an issue where it stores the token base64-encoded twice; it's fixed by https://review.opendev.org/687435 and you need to upgrade to nodepool-3.11.0 | 20:54 |
patrick34 | okay I don't feel safe upgrading this critical cluster .. | 20:55 |
patrick34 | I just tested something, I don't see any serviceaccount being created when the job is running | 20:56 |
patrick34 | using watch kubectl get serviceaccounts | 20:57 |
pabelanger | is self-provisioner setup on default? | 20:58 |
pabelanger | I am juat reading https://zuul-ci.org/docs/nodepool/configuration.html#attr-providers.[openshift].context | 20:58 |
pabelanger | just* | 20:58 |
pabelanger | actually | 20:58 |
pabelanger | https://zuul-ci.org/docs/nodepool/configuration.html#attr-providers.[kubernetes].context | 20:58 |
pabelanger | is k8s | 20:58 |
patrick34 | yes providers.[kubernetes].context is set to default | 20:59 |
pabelanger | which doesn't reference self-provisioner | 20:59 |
patrick34 | but this thing I'm not sure it works | 20:59 |
patrick34 | - name: kubernetes-namespace type: namespace | 20:59 |
tristanC | pabelanger: self-provisioner is specific to openshift, and it's not setup by default, you have to ask an admin to set it for you | 21:00 |
patrick34 | yeah I don't use openshift | 21:00 |
patrick34 | I'm pretty much the sole admin of all this :) | 21:00 |
tristanC | patrick34: you might need to use `kubectl get --all-namespaces=true serviceaccounts` to see what nodepool creates | 21:02 |
patrick34 | oh yeah I forgot that service accounts are namespaced | 21:02 |
patrick34 | okay for each node I see a default and a zuul-worker | 21:03 |
patrick34 | for each namespace associated with a node | 21:03 |
patrick34 | sooo, if everything seems in order, I guess it might be the issue you posted earlier maybe | 21:05 |
patrick34 | not sure how I can see if the bug is relevant | 21:05 |
tristanC | patrick34: you can try to b64decode twice the token in $UUID/work/.kube/config | 21:06 |
patrick34 | the user token okay trying | 21:06 |
*** pcaruana has quit IRC | 21:07 | |
tristanC | that's the bug, the user token should only be encoded once | 21:08 |
pabelanger | tristanC: maybe we should add update note for nodepool release notes too, if this turns out to be the issue | 21:08 |
tristanC | pabelanger: yes | 21:09 |
patrick34 | I was able to decode it once, not twice | 21:10 |
patrick34 | I decoded once and put it back as user token, now I have different error I guess that's a good sign | 21:10 |
patrick34 | Error from server (Forbidden): pods is forbidden: User "system:serviceaccount:sqa-tests-ipmi-community-k3s-0000350177:zuul-worker" cannot list resource "pods" in API group "" at the cluster scope | 21:10 |
patrick34 | (manually doing the kube commands by hand with KUBECONFIG var | 21:11 |
patrick34 | oh wait | 21:11 |
patrick34 | no it seems to work now I can get pods and see the pod running | 21:11 |
pabelanger | \o/ | 21:11 |
patrick34 | =D | 21:12 |
patrick34 | sooooooo I need that patch :P | 21:12 |
tristanC | patrick34: oh right, the bug is that it was encoded once while it should not | 21:12 |
pabelanger | yah, so we should update our docs to include that note, as k8s driver seems to not work. | 21:13 |
pabelanger | then, patrick34 will have to schedule upgrade | 21:13 |
patrick34 | ya | 21:13 |
pabelanger | FWIW: we've upgraded each nodepool release in ansible, and things work | 21:13 |
pabelanger | pip install, stop / start service | 21:13 |
patrick34 | I have zuul 3.11.2.dev26 will upgrading nodepool to the version you mentioned earlier cause any problem? | 21:13 |
patrick34 | or 'potential' problems :P | 21:14 |
pabelanger | I don't think so, we usually do a good job saying when both have to be in lockstep | 21:14 |
corvus | tristanC: was there an associated zuul change? or was the fix entirely in nodepool? | 21:16 |
patrick34 | I'm pretty grateful guys for your help this is amazing. I'll be updating nodepool and giving you some updates | 21:16 |
tristanC | corvus: iirc only nodepool needed a fix | 21:16 |
corvus | patrick34: thanks! let us know how it goes :) | 21:17 |
openstackgerrit | Tristan Cacqueray proposed zuul/nodepool master: kubernetes: add release note about service account token issue https://review.opendev.org/703869 | 21:21 |
*** patrick34 has quit IRC | 21:28 | |
openstackgerrit | Merged zuul/zuul master: Fix Test Case "TestScheduler.test_timer_with_jitter" https://review.opendev.org/703749 | 21:40 |
hashar | oh | 21:46 |
hashar | that timer_with_jitter has hit me several times. Nice to see it fixed ;) | 21:46 |
*** hashar has quit IRC | 21:51 | |
*** hashar has joined #zuul | 21:52 | |
*** coldtom has quit IRC | 22:22 | |
pabelanger | tristanC: can you help me add a tooltip to web UI, that shows estimated time remaining? I've struggled for a while to get this working, but can't seem to figure it out | 22:26 |
pabelanger | time remaining for a job run | 22:26 |
tristanC | pabelanger: iirc tooltips are managed through the `title` attribute of a dom element, where do you want it to appear? | 22:29 |
tristanC | pabelanger: e.g. here: https://opendev.org/zuul/zuul/src/branch/master/web/src/containers/status/ChangePanel.jsx#L201 see the line below for how to inject code in dom | 22:30 |
pabelanger | tristanC: basically, when I hover over the blue progress bar, the old UI would show estimated time remaining | 22:30 |
tristanC | do we have estimated time per job run? | 22:31 |
clarkb | yes I think zuul still tracks that | 22:31 |
clarkb | tristanC: that is how it knows the estimate for the buildset | 22:31 |
clarkb | it takes the max of builds estimated time list iirc | 22:32 |
pabelanger | I think it is last 10 runs | 22:32 |
pabelanger | looking | 22:32 |
tristanC | clarkb: oh right | 22:33 |
pabelanger | https://opendev.org/zuul/zuul/src/branch/master/zuul/model.py#L4611 | 22:34 |
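(Editor's aside: a hedged sketch of the estimation logic being discussed, not the actual model.py code. Based on the conversation above, a per-job estimate averages the durations of the last N runs, and the buildset estimate takes the max over its jobs.)

```javascript
// Per-job estimate: average of the last n run durations (n ~ 10 per
// the discussion above; names here are illustrative).
function estimateJob(recentDurations, n = 10) {
  const sample = recentDurations.slice(-n);
  return sample.reduce((a, b) => a + b, 0) / sample.length;
}

// Buildset estimate: the max of its jobs' estimates, as clarkb notes.
function estimateBuildset(jobsRecentDurations) {
  return Math.max(...jobsRecentDurations.map(d => estimateJob(d)));
}

console.log(estimateBuildset([[100, 120], [300, 320]])); // 310
```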
tristanC | pabelanger: then adding `<div className='progress zuul-job-result' title={ "estimated time remaining" + remainingTime }>` to https://opendev.org/zuul/zuul/src/branch/master/web/src/containers/status/ChangePanel.jsx#L201 should do the trick | 22:34 |
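(Editor's aside: a small illustrative helper for building the `title` string used in the snippet above; the name `tooltipTitle` and the exact wording are assumptions, not the actual ChangePanel.jsx code.)

```javascript
// Build the tooltip text for the progress bar's `title` attribute
// from a remaining-time value in milliseconds (hypothetical helper).
function tooltipTitle(remainingMs) {
  const minutes = Math.ceil(remainingMs / 60000);
  return `estimated time remaining: ${minutes} min`;
}

console.log(tooltipTitle(150000)); // "estimated time remaining: 3 min"
```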
pabelanger | k, let me try that | 22:35 |
*** hashar has quit IRC | 22:37 | |
openstackgerrit | Paul Belanger proposed zuul/zuul master: WIP: Add estimated time remaining tooltip to UI https://review.opendev.org/703892 | 22:39 |
*** jamesmcarthur has quit IRC | 22:44 | |
pabelanger | tristanC: OMG, it works. I just need to convert to 00:00:00 format | 22:50 |
*** armstrongs has joined #zuul | 22:54 | |
tristanC | pabelanger: hehe nice :) | 22:57 |
tristanC | pabelanger: you could use moment.js like so https://opendev.org/zuul/zuul/src/branch/master/web/src/containers/build/Buildset.jsx#L64 | 22:58 |
pabelanger | kk | 23:00 |
*** armstrongs has quit IRC | 23:03 | |
openstackgerrit | Paul Belanger proposed zuul/zuul master: WIP: Add estimated time remaining tooltip to UI https://review.opendev.org/703892 | 23:07 |
*** rlandy is now known as rlandy|bbl | 23:22 | |
*** avass has quit IRC | 23:23 | |
openstackgerrit | Paul Belanger proposed zuul/zuul master: WIP: Add estimated time remaining tooltip to UI https://review.opendev.org/703892 | 23:26 |
*** mattw4 has quit IRC | 23:51 | |
openstackgerrit | Paul Belanger proposed zuul/zuul master: Add estimated time remaining tooltip to UI https://review.opendev.org/703892 | 23:55 |
openstackgerrit | Paul Belanger proposed zuul/zuul master: Add estimated time remaining tooltip to UI https://review.opendev.org/703892 | 23:56 |
pabelanger | tristanC: okay, ^ worked | 23:56 |
pabelanger | however, not sure humanize() is the right way. that will only say 2 hours, and not include minutes | 23:57 |
pabelanger | or 13 minutes (without seconds) | 23:57 |
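(Editor's aside: a plain-JS alternative to moment's humanize() that keeps hours, minutes, and seconds, matching the 00:00:00 format mentioned above; the function name is hypothetical.)

```javascript
// Format a remaining time in milliseconds as HH:mm:ss, so minutes and
// seconds survive, unlike humanize()'s coarse "2 hours" output.
function formatRemaining(ms) {
  const total = Math.max(0, Math.floor(ms / 1000));
  const h = Math.floor(total / 3600);
  const m = Math.floor((total % 3600) / 60);
  const s = total % 60;
  const pad = n => String(n).padStart(2, '0');
  return `${pad(h)}:${pad(m)}:${pad(s)}`;
}

console.log(formatRemaining(7380000)); // "02:03:00"
```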
pabelanger | however, need to #dadops now | 23:57 |
pabelanger | will look more tomorrow | 23:57 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!