*** baoli has quit IRC | 00:00 | |
*** Swami has quit IRC | 00:01 | |
mnaser | jeblair do we want to recheck 508336 ? | 00:01 |
---|---|---|
*** mat128 has joined #openstack-infra | 00:01 | |
*** baoli has joined #openstack-infra | 00:01 | |
mordred | mnaser: just hit the recheck button | 00:02 |
*** lukebrowning has quit IRC | 00:02 | |
mnaser | okay so once that merges, ill restart my test case | 00:03 |
mordred | yah | 00:03 |
mordred | and then if that passes on centos nodes we'll know we're good to land the real one | 00:03 |
fungi | clarkb: i have some inline comments on 508348, one cosmetic and one about the regex | 00:04 |
mordred | fungi: oops. that first one was my bad :) | 00:04 |
clarkb | fungi: re regex its from mordred and so we don't have to escape a $ | 00:04 |
clarkb | we could write it the other way but meh | 00:04 |
* mordred can do a follow up to fix the name | 00:05 | |
openstackgerrit | Monty Taylor proposed openstack-infra/openstack-zuul-jobs master: Fix name of zuul sudo script task https://review.openstack.org/508361 | 00:07 |
mordred | fungi: ^^ | 00:07 |
mordred | clarkb: you too | 00:07 |
fungi | okay, was mostly worried that it was matching on the variable's name rather than its value. i guess my ansible-fu is weak and i really have no idea why that regex is there | 00:07 |
fungi | oh! now i get it | 00:07 |
*** ekcs has quit IRC | 00:07 | |
clarkb | fungi: the regex is matching contents of a file. If a line has that content it replaces it with the line we provide | 00:07 |
fungi | this is the ansible equivalent of sed s/ | 00:07 |
clarkb | yes | 00:07 |
*** lukebrowning has joined #openstack-infra | 00:08 | |
*** gildub has joined #openstack-infra | 00:08 | |
fungi | so lineinfile is for inline editing? | 00:08 |
clarkb | ya | 00:09 |
clarkb | its also super clunky and now I don't like it | 00:09 |
fungi | i can certainly see why | 00:09 |
clarkb | would rather just command: sed -e"s/foo/bar/" | 00:09 |
fungi | -i | 00:10 |
fungi | but yeah | 00:10 |
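The "match a line by regex, then swap in the line we provide" behaviour discussed above can be sketched in a few lines of Python. This is a simplified illustration and not Ansible's implementation: the function name `replace_line` and the sample file content are hypothetical, and the "replace the last match, append when nothing matches" rule is an assumption about how `lineinfile` behaves by default.

```python
import re


def replace_line(text, pattern, new_line):
    """Replace the last line matching `pattern` with `new_line`.

    If no line matches, `new_line` is appended instead (assumed behaviour).
    """
    lines = text.splitlines()
    regex = re.compile(pattern)
    match_index = None
    for index, line in enumerate(lines):
        if regex.search(line):
            match_index = index  # keep going so the last match wins
    if match_index is None:
        lines.append(new_line)
    else:
        lines[match_index] = new_line
    return "\n".join(lines) + "\n"


# Hypothetical file content, for illustration only.
original = "keep=1\nfoo=old\n"
print(replace_line(original, r"^foo=", "foo=bar"), end="")
```

The `sed -i -e "s/foo/bar/"` alternative mentioned in the channel substitutes within a line rather than replacing the whole line, which is part of why the two approaches read so differently.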
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Fix sql reporting start/end times https://review.openstack.org/508362 | 00:11 |
jlk | I don't like touching files like that at all | 00:11 |
jlk | but I can see the allure | 00:11 |
jeblair | clarkb, mordred: real noop fix ^ | 00:11 |
clarkb | jlk: its far easier to understand imo and more powerful | 00:12 |
clarkb | jlk: expressing that in lineinfile is much more complicated and error prone | 00:12 |
*** lukebrowning has quit IRC | 00:12 | |
openstackgerrit | Merged openstack-infra/project-config master: Update fetch-zuul-cloner in base-test https://review.openstack.org/508336 | 00:12 |
jlk | no no, I get sed vs lineinfile, I just don't like editing a file with a SCM | 00:12 |
jlk | I'd rather replace the file with a template or something like that | 00:12 |
mordred | jlk: totally agree | 00:12 |
mnaser | mordred jeblair i'll recheck in a few minutes, im at stage where tempest is running in current puppet job so i wanna see if it completes | 00:12 |
mordred | mnaser: \o/ | 00:13 |
mnaser | and if it does then woo, if it doesn't it'll save me waiting another 20-30 minutes till it fails again | 00:13 |
openstackgerrit | Merged openstack-infra/openstack-zuul-jobs master: Added legacy-puppet-openstack-integration templates https://review.openstack.org/508332 | 00:13 |
jeblair | mnaser: wfm | 00:13 |
*** lukebrowning has joined #openstack-infra | 00:14 | |
clarkb | jeblair: mnaser https://review.openstack.org/#/c/508334/ the depends on for that merged, so the error there is weird looking | 00:14 |
mnaser | clarkb i was gonna rebase and repush... even tho im not sure why | 00:15 |
mnaser | but i can leave it as is if you want to check | 00:15 |
clarkb | mnaser: I don't think you need a rebase, I think you can likely just recheck it | 00:15 |
clarkb | mnaser: gerrit at least doesn't seem to think it is a merge failure | 00:15 |
clarkb | which makes me think some interaction with depends on maybe? | 00:15 |
*** thorst has quit IRC | 00:15 | |
mordred | clarkb: gerrit shows merge conflict to me | 00:16 |
mnaser | big red "Patch in Merge Conflict" i see in the UI isn't gerrit? | 00:16 |
clarkb | mnaser: no that is from the ci results | 00:16 |
mnaser | TIL, so let me throw a recheck i guess | 00:16 |
clarkb | the box with owner it in will say merge conflict if gerrit itself says its a conflict | 00:16 |
*** lukebrowning has quit IRC | 00:18 | |
clarkb | 508302 is showing that some of the less trivial d-g jobs are passing | 00:19 |
*** lukebrowning has joined #openstack-infra | 00:20 | |
mnaser | if my legacy jobs are failing and im ignoring them and writing new jobs, how will i get the .zuul.yaml file merged then (without leaving the project with no ci for that duration?) | 00:21 |
clarkb | jlk: mordred I think its important to treat ansible as a remote execution language here rather than configuration management. You can definitely clean things up but right now especially for legacy jobs its literally there for "run these scripts" | 00:21 |
mnaser | (this could be really obvious) | 00:21 |
clarkb | mnaser: you'd update your .zuul.yaml to run the new jobs and remove the old jobs from the project-config list I think | 00:22 |
jlk | clarkb: agreed | 00:22 |
mnaser | clarkb but i cant get my .zuul.yaml merged if my legacy jobs are failing? unless there's some depends-on black magic that would go on | 00:22 |
clarkb | mnaser: you remove the legacy jobs so they won't run | 00:22 |
jlk | clarkb: I'm also kinda over touching config on systems after deploy as a whole. But that's just the container koolaid talking | 00:22 |
jeblair | clarkb: that merge failure may be the "someone pushed up a new patchset of the dependency" error | 00:23 |
clarkb | jeblair: ya except in this case the dependency merged I think | 00:23 |
mnaser | remove jobs in project-config, change that adds .zuul.yaml depends-on that one .. maybe that might workaround it? | 00:23 |
clarkb | jeblair: so no new patchsets | 00:23 |
clarkb | mnaser: ya | 00:23 |
jeblair | clarkb: oh | 00:23 |
*** lukebrowning has quit IRC | 00:24 | |
clarkb | mriedem_dinner: is nova net expected to work right now http://logs.openstack.org/02/508302/3/check/legacy-tempest-dsvm-nnet/ac16495/logs/devstacklog.txt.gz#_2017-09-29_00_22_37_388 ? | 00:25 |
*** alex_xu has joined #openstack-infra | 00:25 | |
*** alex_xu has quit IRC | 00:25 | |
*** mrunge_ has joined #openstack-infra | 00:25 | |
*** alex_xu has joined #openstack-infra | 00:25 | |
mnaser | https://review.openstack.org/#/c/508296/ | 00:26 |
mnaser | woo my first success | 00:26 |
mnaser | jeblair im rechecking to test base-test | 00:26 |
Jeffrey4l | hi, where is the /etc/nodepool/provider file? it is disappeared. | 00:26 |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder master: [DNM] gate testing https://review.openstack.org/508367 | 00:26 |
*** mrunge has quit IRC | 00:26 | |
*** lukebrowning has joined #openstack-infra | 00:26 | |
*** thorst has joined #openstack-infra | 00:26 | |
clarkb | Jeffrey4l: the information that was in those files is available in ansible but not necessarily in /etc/nodepool in all cases any longer. What are you trying to accomplish with it? | 00:27 |
Jeffrey4l | clarkb, need get the variable and set into kolla's configuration file to build image. | 00:27 |
clarkb | Jeffrey4l: right but what is it being used for? | 00:27 |
clarkb | Jeffrey4l: there may be an existing role we can add to your jobs to accomplish the same task | 00:27 |
clarkb | or otherwise advice on the best way to get that info | 00:28 |
Jeffrey4l | this should be added into one jinja2 template. not /etc/yum.repos.d/ files. | 00:28 |
Jeffrey4l | how can i see these variables? | 00:28 |
jeblair | clarkb, Jeffrey4l: if this is for a legacy job, the conversion script may have set the wrong parent for the job | 00:29 |
*** rcernin has quit IRC | 00:29 | |
Jeffrey4l | clarkb, it is used for build docker images. | 00:29 |
Jeffrey4l | jeblair, yes. all jobs in kolla is red now | 00:29 |
clarkb | Jeffrey4l: http://logs.openstack.org/02/508302/3/check/legacy-devstack-gate-tox-run-tests/8971b5f/zuul-info/inventory.yaml it is part of the node inventory now | 00:29 |
jlk | oh hey, ansible-lint change was merged upstream, and will be released tomorrow. | 00:30 |
clarkb | Jeffrey4l: are you using it to build the mirror info for builds? | 00:30 |
clarkb | because that info is more directly consumable elsewhere | 00:30 |
Jeffrey4l | use the mirror repo to build docker images. | 00:30 |
*** lukebrowning has quit IRC | 00:31 | |
Jeffrey4l | since there are in ansible variable. how can not access it in my plain bash script? | 00:31 |
Jeffrey4l | they are in * | 00:31 |
clarkb | /etc/ci/mirror_info.sh iirc | 00:31 |
clarkb | Jeffrey4l: ^ is probably the best way to consume the particular mirror info in a bash script | 00:31 |
clarkb | you source it then you get things like NODEPOOL_FEDORA_MIRROR and NODEPOOL_EPEL_MIRROR and so on | 00:32 |
Jeffrey4l | ok. let me check. thanks. | 00:32 |
*** lukebrowning has joined #openstack-infra | 00:32 | |
clarkb | 508348 is waiting on a trusty node, its child change in gate got one before it :/ | 00:33 |
*** mat128 has quit IRC | 00:34 | |
mnaser | ya i noticed that too :X | 00:34 |
*** lukebrowning has quit IRC | 00:37 | |
Jeffrey4l | btw, how the /etc/ci/mirror_info.sh file is added into the image? | 00:37 |
mnaser | Jeffrey4l there is a role that zuul runs before your job starts | 00:37 |
*** zhurong has joined #openstack-infra | 00:37 | |
clarkb | Jeffrey4l: with zuulv2 it is added by nodepool, with zuulv3 its part of base job setup | 00:37 |
Jeffrey4l | mind give me a code link? | 00:38 |
jeblair | clarkb: a different project-config change merged around the same time that the dependent change in openstack-zuul-jobs merged. it triggered merge-check events for all project-config changes, including 508334. since the ozj change had landed, it did not include it in the speculative merge anymore. however, it had not yet performed the reconfiguration needed by the ozj change, so it didn't have the new configuration cached. so the ... | 00:38 |
jeblair | ... configuration syntax check failed. it was actually the merge-check pipeline that reported that error. | 00:38 |
jeblair | i think that's two strikes against merge-check | 00:39 |
openstackgerrit | Merged openstack-infra/project-config master: Switch puppet jobs to legacy template https://review.openstack.org/508334 | 00:39 |
mnaser | jeblair it failed even with depends-on and that change merged - http://logs.openstack.org/96/508296/9/check/puppet-openstack-integration-4-scenario001-tempest-centos-7/a614a5e/job-output.txt.gz (change in question - https://review.openstack.org/#/c/508296/) | 00:39 |
clarkb | Jeffrey4l: https://git.openstack.org/cgit/openstack-infra/openstack-zuul-jobs/tree/roles/mirror-info | 00:39 |
*** thorst has quit IRC | 00:40 | |
mnaser | jeblair sorry, here's a more easily accessible link - http://logs.openstack.org/96/508296/9/check/puppet-openstack-integration-4-scenario001-tempest-centos-7/a614a5e/job-output.txt.gz#_2017-09-29_00_37_40_082572 | 00:40 |
clarkb | jeblair: I'm thinking maybe we should rely on gerrit for those checks? | 00:40 |
Jeffrey4l | got, thanks a lot. | 00:40 |
mnaser | jeblair oh shoot, you added base-test to the legacy base which im not using here | 00:41 |
mnaser | let me manually add it to my new base job | 00:41 |
openstackgerrit | James E. Blair proposed openstack-infra/project-config master: Disable merge-check pipeline https://review.openstack.org/508371 | 00:41 |
jeblair | clarkb: ^ | 00:41 |
jeblair | that's a quick disable which we can roll forward or backwards later as needed. | 00:41 |
*** lukebrowning has joined #openstack-infra | 00:42 | |
jeblair | mnaser: ah yeah, the reproducer you linked earlier was a legacy- job right? you move fast :) | 00:42 |
mnaser | jeblair figured it was easier than spamming project-config changes here all day :D | 00:42 |
ianw | jeblair / mordred : https://git.openstack.org/cgit/openstack-infra/openstack-zuul-jobs/tree/playbooks/legacy/dib-dsvm-functests-python2-centos-7/run.yaml#n37 <- did that happen for a reason, is it safe to convert back to "|" ? | 00:43 |
jeblair | ianw: i have no idea! | 00:44 |
clarkb | jeblair: and re 508348 should I just be patient for it to find a trusty node? | 00:45 |
ianw | ok, i just remembered something flying by about | format | 00:45 |
jlk | zuul didn't like that change | 00:45 |
fungi | jeblair: wow, the error on 508371 is amazing! | 00:46 |
jlk | "Unknown configuration error" | 00:46 |
jlk | do we error if a pipeline doesn't have a trigger? | 00:46 |
clarkb | might have to delete it entirely or make the trigger unresolvable? | 00:46 |
jeblair | whew, i was worried it was going to print out the whole config or something :) | 00:46 |
*** lukebrowning has quit IRC | 00:46 | |
jeblair | clarkb: well, there are lots of templates | 00:46 |
jeblair | i probably need some [] or {} or something | 00:46 |
*** thorst has joined #openstack-infra | 00:47 | |
openstackgerrit | James E. Blair proposed openstack-infra/project-config master: Disable merge-check pipeline https://review.openstack.org/508371 | 00:47 |
jlk | trigger: {} | 00:47 |
jeblair | jlk: agreed ^ :) | 00:47 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Fix sql reporting start/end times https://review.openstack.org/508362 | 00:48 |
jlk | lolz | 00:48 |
*** lukebrowning has joined #openstack-infra | 00:48 | |
jeblair | 362 will need a scheduler restart | 00:48 |
clarkb | lets make sure 508348 doesn't get lost with ^, that fixes a whole ton of jobs | 00:49 |
jeblair | yeah, wasn't planning on doing it right now | 00:49 |
jeblair | clarkb: your trusty node is request 100-0000049673 | 00:49 |
ianw | Jeffrey4l: you might also be interested in https://git.openstack.org/cgit/openstack/diskimage-builder/tree/contrib/setup-gate-mirrors.sh and related in dib, where we use the mirrors | 00:50 |
jeblair | clarkb: | 0000044323 | rax-dfw | None | ubuntu-trusty | 0e5b9e3d-cbab-40ef-986a-9231c9ea15b2 | building | 00:00:40:54 | locked | ubuntu-trusty-rax-dfw-0000044323 | None | None | None | 22 | nl01.openstack.org-30932-PoolWorker.rax-dfw-main | 100-0000049673 | None ... | 00:50 |
jeblair | ... | None | | 00:50 |
clarkb | jeblair: how do I get to ^ command on zuulv3.o.o ? | 00:51 |
jeblair | we're going to be a little more exposed to node build times in v3; if that exceeds our tolerance, we may want to decrease our build timeout in nodepool. | 00:51 |
Jeffrey4l | yes. it is useful. thanks ianw | 00:51 |
*** thorst has quit IRC | 00:51 | |
jeblair | clarkb: http://paste.openstack.org/show/622240/ | 00:52 |
mordred | ianw: definitely not on purpose | 00:52 |
clarkb | jeblair: thanks! | 00:52 |
*** threestrands has joined #openstack-infra | 00:52 | |
*** threestrands has quit IRC | 00:52 | |
*** threestrands has joined #openstack-infra | 00:52 | |
*** lukebrowning has quit IRC | 00:52 | |
openstackgerrit | Mohammed Naser proposed openstack-infra/project-config master: Switch to legacy puppet check jobs https://review.openstack.org/508373 | 00:53 |
*** erlon has quit IRC | 00:53 | |
openstackgerrit | Mohammed Naser proposed openstack-infra/openstack-zuul-jobs master: Add legacy puppet check jobs https://review.openstack.org/508374 | 00:53 |
*** cuongnv has joined #openstack-infra | 00:53 | |
*** lukebrowning has joined #openstack-infra | 00:54 | |
openstackgerrit | Mohammed Naser proposed openstack-infra/openstack-zuul-jobs master: Remove non-legacy puppet check template https://review.openstack.org/508375 | 00:54 |
clarkb | jeblair: this is work so definitely not expected today or any time soon, but couldn't we have node requests pull nodes off a queue in a fifo manner. So you don't actually assign a node until one is ready? | 00:54 |
clarkb | zuul -> request -> nodepool -> nodepool gives request queue entry -> nodepool appends build to queue -> as builds complete oldest request in queue gets that node | 00:55 |
jeblair | clarkb: yeah, we'd need another queue in nodepool (a build queue) distinct from the request queue | 00:56 |
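The FIFO idea floated above (requests wait in order, and whichever node finishes building first goes to the oldest waiting request) can be sketched as follows. This is a toy model under stated assumptions, not nodepool code: the class `FifoNodeAllocator`, its method names, and the request and node identifiers are all hypothetical, and node labels are ignored for brevity.

```python
from collections import deque


class FifoNodeAllocator:
    """Hand each newly ready node to the oldest waiting request."""

    def __init__(self):
        self._waiting = deque()  # request ids, oldest on the left

    def submit(self, request_id):
        """Queue a request without binding it to any particular build."""
        self._waiting.append(request_id)

    def node_ready(self, node_id):
        """Assign a finished node; return (request_id, node_id) or None."""
        if not self._waiting:
            return None
        return (self._waiting.popleft(), node_id)


allocator = FifoNodeAllocator()
allocator.submit("req-1")
allocator.submit("req-2")
# A slow build no longer blocks req-1: it gets whichever node is ready first.
print(allocator.node_ready("node-b"))
print(allocator.node_ready("node-a"))
```

In this model a request is never tied to one specific in-progress build, which is the property clarkb was asking for.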
*** LindaWang has joined #openstack-infra | 00:56 | |
mnaser | jeblair - http://zuulv3.openstack.org/static/stream.html?uuid=92d92683f8eb46b8bf4ddf1f4339aec8&logfile=console.log - this fix works | 00:56 |
mnaser | ctrl+f => Creating /etc/puppetlabs/code/modules/aodh | 00:56 |
mnaser | it grabs it from /home/zuul | 00:56 |
mnaser | and xenial behaviour still works as well | 00:57 |
mnaser | http://zuulv3.openstack.org/static/stream.html?uuid=62758ac8d4d94628a1c763ee9ddee2fb&logfile=console.log | 00:57 |
jeblair | mnaser: cool, i +3d 508337; the promote base-test to base change | 00:57 |
jeblair | mnaser: thanks! | 00:57 |
mnaser | no problem, thank you! | 00:57 |
*** thorst has joined #openstack-infra | 00:58 | |
jeblair | clarkb: in the interim, maybe we should look at our median build time for our clouds and set the timeout to some nth percentile over that | 00:58 |
clarkb | jeblair: ya | 00:59 |
*** lukebrowning has quit IRC | 00:59 | |
openstackgerrit | melanie witt proposed openstack-infra/devstack-gate master: WIP Add mysqladmin -v extended-status processlist https://review.openstack.org/507626 | 00:59 |
*** thorst has quit IRC | 01:00 | |
jeblair | clarkb: it looks like 10m would be pretty good for rax actually. maybe 15. | 01:00 |
*** thorst has joined #openstack-infra | 01:00 | |
jeblair | http://grafana.openstack.org/dashboard/db/nodepool-rackspace?from=1506042021678&to=1506646821678 | 01:00 |
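Picking a launch timeout from observed build times, as suggested above, could look roughly like this. It is a minimal sketch only: the function name `suggest_launch_timeout`, the 95th-percentile default, and the sample durations are assumptions made for illustration and are not taken from the Grafana data linked in the channel.

```python
import statistics


def suggest_launch_timeout(build_seconds, percentile=95):
    """Return the given percentile of observed node build times (seconds).

    Needs at least two samples; `percentile` must be between 1 and 99.
    """
    cut_points = statistics.quantiles(build_seconds, n=100, method="inclusive")
    return cut_points[percentile - 1]


# Made-up build durations in seconds, for illustration only.
samples = [120, 150, 180, 200, 240, 260, 300, 420, 600, 2400]
print("median:", statistics.median(samples))
print("suggested timeout:", suggest_launch_timeout(samples))
```

A single very slow outlier pulls a high percentile upwards, so the choice of percentile is a trade-off between abandoning slow builds early and tolerating occasional long waits.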
*** lukebrowning has joined #openstack-infra | 01:00 | |
*** thorst_ has joined #openstack-infra | 01:01 | |
*** thorst has quit IRC | 01:04 | |
openstackgerrit | James E. Blair proposed openstack-infra/project-config master: Set rackspace launch timeout to 10m https://review.openstack.org/508378 | 01:04 |
jeblair | clarkb: ^ | 01:04 |
*** thorst_ has quit IRC | 01:05 | |
*** lukebrowning has quit IRC | 01:05 | |
jeblair | i'm wilting, i think i need to eod. anything urgent before i do? | 01:06 |
clarkb | I too am fading fast. I don't think there is anything super urgent other than continuing to go through failures and fix them | 01:06 |
*** lukebrowning has joined #openstack-infra | 01:06 | |
clarkb | grenade is still unhappy, assuming sudo fixes finally get in I'll likely focus on grenade in the morning | 01:06 |
clarkb | maybe we should send an update? | 01:07 |
jeblair | clarkb: probably a good idea | 01:08 |
*** wolverineav has quit IRC | 01:08 | |
*** wolverineav has joined #openstack-infra | 01:09 | |
fungi | we should prepare for it to be the top message in the "my job is broken... here's what i did" thread which is certain to follow | 01:10 |
clarkb | single node tempest and grenade look happy now though | 01:10 |
ianw | clarkb: your sudo fixes will fix everything that's failing 508344 right? | 01:11 |
*** lukebrowning has quit IRC | 01:11 | |
mriedem_dinner | clarkb: i don't know what that job is | 01:11 |
*** mriedem_dinner is now known as mriedem | 01:11 | |
mriedem | nova-net will only run in a cellsv1 setup | 01:12 |
clarkb | mriedem_dinner: its the tempest nova net job | 01:12 |
mriedem | there is no tempest nova-net job | 01:12 |
mriedem | there is the cells job | 01:12 |
clarkb | mriedem: its possible that it is a bug | 01:12 |
mriedem | which is cells v1 and runs nova-net | 01:12 |
clarkb | and the job should just go away | 01:12 |
mriedem | it's likely something changed with branch restrictions or something in project-config | 01:12 |
* fungi hopes mriedem_dinner brought enough for the whole class | 01:12 | |
mriedem | dinner did not go well | 01:12 |
mriedem | and thus, | 01:12 |
mriedem | my daughter will be having hers for breakfast | 01:13 |
clarkb | ianw: if you look at 508302 it dep'd on the sudo fix. It fixes a good chunk of stuff | 01:13 |
*** lukebrowning has joined #openstack-infra | 01:13 | |
mriedem | because zucchini is terrifying | 01:13 |
clarkb | ianw: but multinode things are not happy if you want to dig into that | 01:13 |
*** bobh has joined #openstack-infra | 01:13 | |
*** wolverineav has quit IRC | 01:13 | |
ianw | clarkb: ok, i'll see :) i'd like to get the dib gate unblocked, just in case we need to push something out quickish | 01:13 |
openstackgerrit | Sam Yaple proposed openstack-infra/bindep master: Remove grammar duplication https://review.openstack.org/506803 | 01:15 |
clarkb | woo sudo fix has trusty node finally | 01:16 |
*** lukebrowning has quit IRC | 01:17 | |
openstackgerrit | Merged openstack-infra/openstack-zuul-jobs master: Swap order of sudoers manipulation https://review.openstack.org/508348 | 01:18 |
openstackgerrit | Merged openstack-infra/openstack-zuul-jobs master: Fix name of zuul sudo script task https://review.openstack.org/508361 | 01:18 |
*** lukebrowning has joined #openstack-infra | 01:19 | |
clarkb | with that done, is anyone working on an update email yet? | 01:20 |
openstackgerrit | Mohammed Naser proposed openstack-infra/project-config master: Switch to legacy puppet check jobs https://review.openstack.org/508373 | 01:21 |
*** ggillies has quit IRC | 01:21 | |
fungi | i've already cracked open that bottle of saké, it's getting pretty late over here | 01:21 |
clarkb | jeblair: ^ if you've not started I'll work on something | 01:22 |
*** andreww has joined #openstack-infra | 01:23 | |
*** ggillies has joined #openstack-infra | 01:23 | |
openstackgerrit | Sam Yaple proposed openstack-infra/bindep master: Simplify grammar https://review.openstack.org/506803 | 01:24 |
clarkb | https://etherpad.openstack.org/p/cvedg2Y74g | 01:24 |
SamYaple | gotta say, don't totally hate Parsley now that ive been working with it. its kinda nice | 01:24 |
*** lukebrowning has quit IRC | 01:24 | |
*** Apoorva has quit IRC | 01:25 | |
*** lukebrowning has joined #openstack-infra | 01:25 | |
*** xarses_ has quit IRC | 01:26 | |
*** stakeda has joined #openstack-infra | 01:27 | |
clarkb | how does ^ etherpad look? | 01:28 |
portdirect | looks good | 01:28 |
portdirect | though I'm seeing some issues with zuul cloner as well i think? http://logs.openstack.org/54/457754/67/check/legacy-openstack-helm-aio-basic-ovs-radosgw/cd21d38/job-output.txt.gz#_2017-09-29_01_23_05_351583 | 01:29 |
clarkb | ya I think z-c is still having some corner case issues | 01:29 |
jeblair | clarkb: ++ | 01:29 |
clarkb | I was focused on other stuff so don't really have z-c details but if someone does feel free to add to etherpad | 01:29 |
jeblair | clarkb: etherpad ++ i mean :) | 01:30 |
*** lukebrowning has quit IRC | 01:30 | |
jeblair | i'm eoding | 01:30 |
ianw | clarkb: small grammar update by me | 01:30 |
clarkb | what about a note to avoid approving changes until check is shown to work for $project? | 01:31 |
clarkb | I see a lot of stuff going into the gate that has no hope of passing | 01:31 |
*** sbezverk has joined #openstack-infra | 01:31 | |
ianw | clarkb: you might like to say to fix things in openstack-zuul-jobs legacy to start with | 01:31 |
ianw | and then migrate working jobs into your tree? | 01:31 |
ianw | if people are wondering what to do | 01:31 |
*** Sukhdev has quit IRC | 01:31 | |
*** lukebrowning has joined #openstack-infra | 01:32 | |
mnaser | if anyone is around that can give the +1'd checks a push to help pave the way to clean up the puppet jobs and move them out of ozj | 01:33 |
mnaser | https://review.openstack.org/#/q/owner:mnaser%2540vexxhost.com+status:open+(project:openstack-infra/project-config+OR+project:openstack-infra/openstack-zuul-jobs) | 01:33 |
clarkb | anyone have that migration docs link handy? | 01:33 |
portdirect | for things that have stalled on checks done this morning, what should we do to get them pushed through? | 01:34 |
SamYaple | ianw: ive jumped straight to migrating, but its fairly low entropy project | 01:34 |
mnaser | clarkb one thing to note, migration docs are out of date because of failing publish jobs :( | 01:34 |
ianw | SamYaple: it's a valid path forward. just thought people should know changes to the legacy jobs are accepted | 01:34 |
clarkb | mnaser: arg | 01:34 |
SamYaple | ack | 01:35 |
SpamapS | doh! | 01:35 |
clarkb | https://docs.openstack.org/infra/manual/zuulv3.html that look right? | 01:35 |
clarkb | looks right to me | 01:35 |
clarkb | portdirect: ^ should hopefully tell you | 01:35 |
mnaser | http://git.openstack.org/cgit/openstack-infra/infra-manual/tree/doc/source/zuulv3.rst is more up to date, tho less formatted | 01:35 |
clarkb | portdirect: but basically you'll want to identify the failure then likely work to fix it in openstack-infra/openstack-zuul-jobs/playbooks/legacy | 01:35 |
*** yamamoto has joined #openstack-infra | 01:36 | |
*** bobh has quit IRC | 01:36 | |
*** mat128 has joined #openstack-infra | 01:36 | |
portdirect | gotcha - reading up now, thanks clarkb | 01:36 |
openstackgerrit | Merged openstack-infra/project-config master: Promote base-test to base https://review.openstack.org/508337 | 01:36 |
clarkb | ok I'm gonna send that out now so I can eod too | 01:36 |
*** dprince has quit IRC | 01:37 | |
portdirect | also thx SamYaple for pointing me to what I suspect is exactly what i need to change :) | 01:37 |
SamYaple | portdirect: i will pass that thanks onto mnaser (it was the required-projects things) | 01:37 |
SpamapS | clarkb: is another option porting that to not-legacy and putting the playbook in your own project's repo? | 01:38 |
SamYaple | btw to the whole crew here, i know its still pretty bumpy, but im very happy to have zuulv3 and i like it alot from using it so far | 01:38 |
clarkb | SpamapS: yes, but I suspect for most getting legacy working is quicker and simpler | 01:38 |
* SpamapS has not kept up on the state of devstack-gate-not-legacy | 01:38 | |
*** markvoelker has joined #openstack-infra | 01:38 | |
clarkb | SpamapS: since people want to merge code and stuff | 01:38 |
* mnaser is at patchset #13 | 01:40 | |
fungi | goal with the legacy jobs was for them to be working enough for projects to get by until they have time to replace them with in-repo versions | 01:40 |
SpamapS | clarkb: yeah, just wondering about the longer term "free them from project-config" effort. | 01:40 |
mnaser | already have all integration jobs passing, working on unit/etc | 01:40 |
*** lukebrowning has quit IRC | 01:41 | |
SpamapS | fungi: I figured as much. | 01:41 |
clarkb | email sent | 01:41 |
fungi | so making them work in-place is a reasonable next step for the broken ones | 01:41 |
openstackgerrit | Ian Wienand proposed openstack-infra/openstack-zuul-jobs master: Fix dib functional tests https://review.openstack.org/508383 | 01:42 |
fungi | _however_ the self-testing nature of in-repo replacements does make this a relatively quick thing to iterate on, as mnaser is demonstrating ;) | 01:42 |
mnaser | come watch/join the fun - https://review.openstack.org/#/c/508296/ :p | 01:42 |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder master: [DNM] gate testing https://review.openstack.org/508367 | 01:42 |
clarkb | thanks everyone! /me finds a beer and dinner now | 01:42 |
*** lukebrowning has joined #openstack-infra | 01:43 | |
fungi | thanks for sending that out, clarkb! | 01:43 |
fungi | i also like your priority ordering there (beer and dinner, not t'other way 'round) | 01:43 |
* ianw is now hungry and goes to find lunch | 01:44 | |
*** yamamoto has quit IRC | 01:44 | |
ianw | i'll keep an eye on openstack-zuul-jobs for fixes | 01:44 |
openstackgerrit | Logan V proposed openstack-infra/openstack-zuul-jobs master: Add openstack-ansible required-projects parent job https://review.openstack.org/508281 | 01:45 |
SpamapS | mnaser: it's kind of magical isn't it? | 01:45 |
SpamapS | I'm doing the same with my internal zuulv3 here at GoDaddy :) | 01:45 |
logan- | do child jobs required-projects get merged with the parent jobs? or is it a straight override? | 01:45 |
SpamapS | iterating on a giant string of patches in a PR by just pushing and watching it run the new playbook. | 01:45 |
mnaser | merged afaik logan- | 01:45 |
logan- | i was hoping for that answer | 01:46 |
logan- | thanks | 01:46 |
*** stakeda has quit IRC | 01:46 | |
SpamapS | logan-: you're thinking dependencies not child/parent | 01:46 |
SpamapS | logan-: and AFAIK no, required-projects just affects what gets checked out. Each change is still its own entity, and they go in one by one whether they're in one project or another. | 01:46 |
melwitt | as part of the switchover to zuul v3, is http://status.openstack.org/zuul/ expected not to work anymore? | 01:47 |
*** lukebrowning has quit IRC | 01:47 | |
SamYaple | melwitt: its borked right now | 01:47 |
SamYaple | zuul.openstack.org | 01:47 |
fungi | melwitt: try reloading? it should be a redirect now | 01:47 |
SamYaple | or rather what it redirects to, zuulv3.openstack.org | 01:48 |
fungi | unless that hasn't merged/applied yet | 01:48 |
logan- | SpamapS: so if in https://review.openstack.org/#/c/508281/3/zuul.d/zuul-legacy-jobs.yaml, let's use the job legacy-openstack-ansible-os_keystone-ansible-uw_apache as an example | 01:48 |
melwitt | fungi: yeah not redirecting yet | 01:48 |
melwitt | SamYaple: cool, thanks | 01:48 |
logan- | the parent job has a list of required-projects, and this job has a required-project, what is cloned? both this job's required-projects AND legacy-openstack-ansible-base? | 01:48 |
fungi | melwitt: thanks for the reminder... i | 01:48 |
fungi | i'll track down what happened to the redirect patch | 01:49 |
*** camunoz has quit IRC | 01:50 | |
*** zhurong has quit IRC | 01:50 | |
melwitt | I had been using that one bc it was a lot faster than zuul.openstack.org in the past. but zuul.openstack.org seems to be working fast now | 01:50 |
SpamapS | logan-: ahhhhh I see what you're asking | 01:50 |
mnaser | anyone know how i can pull anymore info when i get a "MODULE FAILURE" :( http://logs.openstack.org/96/508296/13/check/puppet-openstack-module-build/efae1ea/job-output.txt.gz#_2017-09-29_01_48_19_841079 | 01:50 |
mnaser | https://review.openstack.org/#/c/508296/13/playbooks/prepare-node-unit.yaml | 01:51 |
mnaser | this is where it fails | 01:51 |
SpamapS | logan-: I believe the list becomes just the one (so openstack/swift) | 01:51 |
fungi | melwitt: https://review.openstack.org/507244 should take care of it... i just approved | 01:51 |
SpamapS | logan-: but I'll double check the code/api/etc. | 01:51 |
clarkb | mnaser: look at ara | 01:51 |
clarkb | mnaser: it tends to be better at ansible related fails | 01:51 |
melwitt | fungi: cool, thanks | 01:51 |
fungi | thanks for reminding me it wasn't merged! | 01:51 |
melwitt | :) | 01:52 |
melwitt | accidental reminder | 01:52 |
logan- | thanks SpamapS | 01:53 |
*** kjackal_ has joined #openstack-infra | 01:53 | |
mnaser | clarkb perfect! i should use it more often :p | 01:53 |
SpamapS | logan-: I was wrong. It does add them. | 01:54 |
SpamapS | like magic | 01:54 |
logan- | awesome | 01:54 |
logan- | thanks a lot for checking | 01:54 |
SpamapS | https://github.com/openstack-infra/zuul/blob/feature/zuulv3/zuul/model.py#L906-L909 | 01:55 |
*** lukebrowning has joined #openstack-infra | 01:55 | |
SpamapS | logan-: as zuul walks the tree it keeps loading the job from the tree and calling that method on it with newly found required-projects stanzas | 01:55 |
fungi | magic! | 01:55 |
SpamapS | so that also, I believe, will let you have a child of a parent that changes the branch of a project | 01:56 |
* fungi throws fistfuls of glitterfetti into the air | 01:56 | |
SpamapS | which is really sweet | 01:56 |
logan- | interesting | 01:56 |
logan- | yeah that is cool | 01:56 |
SpamapS | so you can have gate-mything-others-master and gate-mything-others-newfeaturebranch | 01:56 |
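The inheritance behaviour described above (a child job's `required-projects` are added to the parent's, and a child can re-declare a project to change its branch) can be modelled with a short sketch. This is not Zuul's code: the function `merged_required_projects`, the dictionary layout, the job names, and the `override-branch` key are hypothetical stand-ins, and the branch-override case reflects what SpamapS said he believes rather than confirmed behaviour.

```python
def merged_required_projects(job_name, jobs):
    """Collect required-projects from the root parent down to `job_name`.

    Later (more specific) jobs add projects, and replace an entry
    when they name the same project again.
    """
    chain = []
    name = job_name
    while name is not None:
        job = jobs[name]
        chain.append(job)
        name = job.get("parent")
    merged = {}
    for job in reversed(chain):  # start at the root parent
        for project in job.get("required-projects", []):
            merged[project["name"]] = project
    return list(merged.values())


# Hypothetical job definitions, for illustration only.
jobs = {
    "parent-job": {
        "parent": None,
        "required-projects": [{"name": "openstack/requirements"}],
    },
    "child-job": {
        "parent": "parent-job",
        "required-projects": [
            {"name": "openstack/swift", "override-branch": "newfeaturebranch"}
        ],
    },
}
print([p["name"] for p in merged_required_projects("child-job", jobs)])
```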
*** zhurong has joined #openstack-infra | 01:59 | |
*** lukebrowning has quit IRC | 02:00 | |
*** lukebrowning has joined #openstack-infra | 02:01 | |
*** ihrachys has quit IRC | 02:02 | |
*** thorst has joined #openstack-infra | 02:02 | |
*** ihrachys has joined #openstack-infra | 02:02 | |
*** liujiong has joined #openstack-infra | 02:04 | |
*** lukebrowning has quit IRC | 02:06 | |
*** ihrachys has quit IRC | 02:07 | |
*** ihrachys has joined #openstack-infra | 02:07 | |
*** lukebrowning has joined #openstack-infra | 02:08 | |
*** mat128 has quit IRC | 02:09 | |
openstackgerrit | Pete Birley proposed openstack-infra/openstack-zuul-jobs master: OpenStack-Helm: Update legacy jobs https://review.openstack.org/508387 | 02:11 |
Jeffrey4l | can i get the information about /etc/nodepool/node_private and /etc/nodepool/sub_node_private ? | 02:11 |
Jeffrey4l | how can i get* | 02:11 |
*** lukebrowning has quit IRC | 02:12 | |
openstackgerrit | Mohammed Naser proposed openstack-infra/project-config master: Remove legacy-puppet-openstack-integration jobs https://review.openstack.org/508388 | 02:12 |
*** hongbin has joined #openstack-infra | 02:13 | |
*** markvoelker has quit IRC | 02:13 | |
*** dave-mcc_ has quit IRC | 02:13 | |
*** lukebrowning has joined #openstack-infra | 02:14 | |
openstackgerrit | Pete Birley proposed openstack-infra/openstack-zuul-jobs master: OpenStack-Helm: Update legacy jobs https://review.openstack.org/508387 | 02:14 |
*** ijw has quit IRC | 02:16 | |
openstackgerrit | Mohammed Naser proposed openstack-infra/openstack-zuul-jobs master: Remove all legacy puppet openstack integration jobs https://review.openstack.org/508390 | 02:16 |
*** ijw has joined #openstack-infra | 02:16 | |
*** lukebrowning has quit IRC | 02:18 | |
*** hongbin_ has joined #openstack-infra | 02:19 | |
fungi | another taste of our own medicine... legacy puppet apply, beaker and logstash filter jobs are broken on system-config so 507244 can't merge | 02:20 |
*** lukebrowning has joined #openstack-infra | 02:20 | |
*** rcernin has joined #openstack-infra | 02:20 | |
*** hongbin has quit IRC | 02:21 | |
openstackgerrit | Merged openstack-infra/irc-meetings master: Earlier meeting time for international attendees https://review.openstack.org/508202 | 02:22 |
openstackgerrit | Pete Birley proposed openstack-infra/openstack-zuul-jobs master: OpenStack-Helm: Update legacy jobs https://review.openstack.org/508387 | 02:23 |
mnaser | if anyone's around - https://review.openstack.org/#/c/508333/1/zuul.d/zuul-legacy-project-templates.yaml - before last step which allows us to bring jobs in-repo | 02:24 |
*** lukebrowning has quit IRC | 02:25 | |
mnaser | and maybe this one too, first step to removing unit/check jobs - https://review.openstack.org/#/c/508374/ | 02:25 |
*** spotz has quit IRC | 02:29 | |
*** mudpuppy has quit IRC | 02:29 | |
*** lukebrowning has joined #openstack-infra | 02:31 | |
*** lbragstad has joined #openstack-infra | 02:31 | |
*** ramishra has joined #openstack-infra | 02:33 | |
*** lukebrowning has quit IRC | 02:35 | |
*** yamamoto has joined #openstack-infra | 02:36 | |
*** spotz has joined #openstack-infra | 02:37 | |
SpamapS | mnaser: I think they all passed out ;) | 02:37 |
mnaser | SpamapS the fun part is i can just chain things with depends-on | 02:38 |
mnaser | and get on my merry way | 02:38 |
mnaser | ok just threw up a whole chain which will add new jobs, point projects to new jobs and remove old jobs from ozj | 02:39 |
fungi | i haven't passed out _quite_ yet | 02:41 |
fungi | will take a look in a sec | 02:42 |
*** lukebrowning has joined #openstack-infra | 02:42 | |
mnaser | anyways im gonna go to bed and hopefully everything is +1'd by Zuul tomorrow | 02:45 |
* mnaser & | 02:45 | |
*** lukebrowning has quit IRC | 02:47 | |
openstackgerrit | Merged openstack-infra/openstack-zuul-jobs master: OpenStack-Helm: Update legacy jobs https://review.openstack.org/508387 | 02:48 |
*** esberglu has joined #openstack-infra | 02:49 | |
*** lukebrowning has joined #openstack-infra | 02:49 | |
openstackgerrit | Ian Wienand proposed openstack-infra/openstack-zuul-jobs master: Add diskimage-builder requirements for heat in updown jobs https://review.openstack.org/508396 | 02:50 |
SpamapS | mnaser: I can tell, zuulv3 and you are going to be BFF's | 02:50 |
*** lukebrowning has quit IRC | 02:53 | |
*** esberglu has quit IRC | 02:53 | |
*** lukebrowning has joined #openstack-infra | 02:55 | |
jianghuaw | From the page of http://zuulv3.openstack.org/; it shows "Zuul version: 2.5.3.dev1373". I'm a little confused. Which version of zuul is used for the upstream CI? | 02:58 |
clarkb | jianghuaw: we havent tagged it as version 3 yet | 03:00 |
clarkb | but it's running the code that will be tagged version 3 | 03:00 |
*** lukebrowning has quit IRC | 03:00 | |
SpamapS | jianghuaw: if you want to see the code, checkout feature/zuulv3 | 03:01 |
*** lukebrowning has joined #openstack-infra | 03:01 | |
*** armax has quit IRC | 03:04 | |
ianw | i think with 508396 the devstack gate (i.e. the devstack project) might be ok modulo multinode | 03:07 |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder master: [DNM] gate testing https://review.openstack.org/508367 | 03:09 |
*** markvoelker has joined #openstack-infra | 03:10 | |
prometheanfire | ianw: is dib ready for new changes? | 03:11 |
ianw | prometheanfire: that's a joke right? :) | 03:12 |
*** zhurong has quit IRC | 03:12 | |
prometheanfire | ianw: ya, seems to have been a rough week | 03:13 |
ianw | keep an eye on the chain of 508367 | 03:13 |
prometheanfire | I am happy that infra got me v3 as a present today of all days though | 03:13 |
*** lukebrowning has quit IRC | 03:14 | |
fungi | prometheanfire: your birthiversary? | 03:14 |
prometheanfire | fungi: something like that | 03:15 |
fungi | as my friends like to say, congrats on making it another year without dying | 03:16 |
prometheanfire | another year closer to my inevitable demise | 03:16 |
prometheanfire | :D | 03:16 |
openstackgerrit | Merged openstack-infra/openstack-zuul-jobs master: Add requirements to all pylint jobs https://review.openstack.org/508303 | 03:16 |
ianw | ^ ok, that's one less problem | 03:16 |
*** lukebrowning has joined #openstack-infra | 03:17 | |
prometheanfire | ya, that was surprising to me | 03:17 |
fungi | or one more, depending on how you classify pylint | 03:17 |
prometheanfire | lol, status.openstack.org/zuul is dead | 03:18 |
fungi | prometheanfire: we need system-config's jobs working enough to be able to merge https://review.openstack.org/507244 | 03:19 |
fungi | to solve that | 03:19 |
fungi | but i've run out of steam for looking into it | 03:20 |
*** lukebrowning has quit IRC | 03:21 | |
*** lukebrowning has joined #openstack-infra | 03:23 | |
ianw | i've got so much in flight i'm losing track. afk for a bit to let jobs process so i can see where i'm at | 03:23 |
prometheanfire | fungi: ah, known issue, k | 03:24 |
*** baoli has quit IRC | 03:26 | |
*** lukebrowning has quit IRC | 03:28 | |
*** hongbin_ has quit IRC | 03:28 | |
jianghuaw | SpamapS, thanks for the response. Actually zuul v.3 is used although it shows 2.5.3.dev1373 in the bottom of http://zuulv3.openstack.org/? | 03:29 |
*** lukebrowning has joined #openstack-infra | 03:29 | |
clarkb | jianghuaw: yes, because we haven't tagged a zuulv3 release yet | 03:29 |
clarkb | so the version reported by git is 2.5.3.dev1373 | 03:30 |
clarkb | jianghuaw: once things settle in we'll tag a 3.0 release and that will update | 03:30 |
jianghuaw | clarkb, got it. thanks for the clarification. | 03:30 |
jianghuaw | That's cool. I will look at zuul v3 and plan to use it for XenServer CI. | 03:31 |
SpamapS | well technically git isn't reporting a version | 03:32 |
SpamapS | pbr is making 2.5.3.dev1373 | 03:32 |
SpamapS | jianghuaw: If you need help deploying, let me know. #zuul is also a zuul-specific channel (though it is mostly dev centric) | 03:33 |
SpamapS | jianghuaw: I use this for deploying: https://github.com/BonnyCI/hoist | 03:33 |
jianghuaw | SpamapS, Thanks very much. | 03:34 |
*** lukebrowning has quit IRC | 03:34 | |
* SpamapS disappears to find some sushi | 03:34 | |
*** lukebrowning has joined #openstack-infra | 03:36 | |
*** ekcs has joined #openstack-infra | 03:36 | |
*** rlandy has quit IRC | 03:37 | |
clarkb | so many things need requirements http://logs.openstack.org/48/507148/1/gate/legacy-bifrost-integration-tinyipa-opensuse-423/789afce/job-output.txt.gz#_2017-09-29_03_32_20_670439 | 03:37 |
clarkb | might be worth a special email just for this particular fail case | 03:38 |
*** lukebrowning has quit IRC | 03:40 | |
*** markvoelker has quit IRC | 03:42 | |
*** lukebrowning has joined #openstack-infra | 03:43 | |
*** lukebrowning has quit IRC | 03:48 | |
*** udesale has joined #openstack-infra | 03:48 | |
*** lukebrowning has joined #openstack-infra | 03:49 | |
*** ykarel has joined #openstack-infra | 03:50 | |
*** lukebrowning has quit IRC | 03:54 | |
*** lukebrowning has joined #openstack-infra | 03:56 | |
*** lbragstad has quit IRC | 04:00 | |
*** lukebrowning has quit IRC | 04:00 | |
*** lukebrowning has joined #openstack-infra | 04:02 | |
*** links has joined #openstack-infra | 04:02 | |
*** kjackal_ has quit IRC | 04:03 | |
*** lukebrowning has quit IRC | 04:07 | |
*** mat128 has joined #openstack-infra | 04:07 | |
*** cuongnv has quit IRC | 04:12 | |
*** cuongnv has joined #openstack-infra | 04:12 | |
*** lukebrowning has joined #openstack-infra | 04:13 | |
*** lukebrowning has quit IRC | 04:18 | |
*** lukebrowning has joined #openstack-infra | 04:19 | |
*** zhurong has joined #openstack-infra | 04:19 | |
*** ekcs has quit IRC | 04:22 | |
*** SumitNaiksatam has joined #openstack-infra | 04:22 | |
*** lukebrowning has quit IRC | 04:24 | |
openstackgerrit | Ian Wienand proposed openstack-infra/openstack-zuul-jobs master: Add diskimage-builder/sahara requirements in updown jobs https://review.openstack.org/508396 | 04:24 |
*** lukebrowning has joined #openstack-infra | 04:26 | |
*** namnh has quit IRC | 04:27 | |
*** namnh has joined #openstack-infra | 04:28 | |
*** lukebrowning has quit IRC | 04:30 | |
*** lukebrowning has joined #openstack-infra | 04:32 | |
ramishra | hi guys, any idea why this job is failing after zuul3 migration? http://logs.openstack.org/12/508112/1/check/legacy-heat-dsvm-functional-orig-mysql-lbaasv2/0f6d861/logs/devstacklog.txt.gz#_2017-09-29_02_11_02_716 | 04:32 |
ramishra | It seems to be installing it from git though http://logs.openstack.org/12/508112/1/check/legacy-heat-dsvm-functional-orig-mysql-lbaasv2/0f6d861/logs/devstacklog.txt.gz#_2017-09-29_01_44_08_254 | 04:33 |
ramishra | ianw: Hi, any idea? ^^^ | 04:34 |
*** lukebrowning has quit IRC | 04:36 | |
*** esberglu has joined #openstack-infra | 04:37 | |
*** lukebrowning has joined #openstack-infra | 04:38 | |
*** markvoelker has joined #openstack-infra | 04:39 | |
*** mat128 has quit IRC | 04:40 | |
*** stakeda has joined #openstack-infra | 04:40 | |
*** coolsvap has joined #openstack-infra | 04:42 | |
*** esberglu has quit IRC | 04:42 | |
*** lukebrowning has quit IRC | 04:43 | |
*** Sukhdev has joined #openstack-infra | 04:43 | |
*** lukebrowning has joined #openstack-infra | 04:44 | |
*** Guest50285 has quit IRC | 04:46 | |
*** lukebrowning has quit IRC | 04:49 | |
*** lukebrowning has joined #openstack-infra | 04:50 | |
*** Hal has joined #openstack-infra | 04:51 | |
*** Hal is now known as Guest11750 | 04:51 | |
*** lukebrowning has quit IRC | 04:55 | |
*** psachin has joined #openstack-infra | 04:56 | |
*** lukebrowning has joined #openstack-infra | 04:57 | |
*** dhajare has joined #openstack-infra | 04:58 | |
*** lukebrowning has quit IRC | 05:01 | |
ianw | ramishra: looking | 05:01 |
ianw | ahh, yeah | 05:02 |
ianw | will require something like https://review.openstack.org/#/c/508344/ | 05:02 |
ramishra | ianw: Ah, thanks! | 05:03 |
ianw | the issue is there's some more devstack issues in the gate too, including multinode. honestly, your best bet might be just to wait a while at this point as we sort them out | 05:03 |
ramishra | ianw: np, we can wait:) | 05:03 |
frickler | can someone check why zuul didn't produce any gate result here? https://review.openstack.org/507798 should I do a recheck to get normal check results for comparison? | 05:06 |
*** lukebrowning has joined #openstack-infra | 05:09 | |
openstackgerrit | Andreas Jaeger proposed openstack-infra/infra-manual master: Add howto section on migrating legacy jobs to v3 https://review.openstack.org/508295 | 05:10 |
*** ianychoi has quit IRC | 05:11 | |
*** ianychoi has joined #openstack-infra | 05:12 | |
*** markvoelker has quit IRC | 05:12 | |
*** lukebrowning has quit IRC | 05:14 | |
openstackgerrit | Andreas Jaeger proposed openstack-infra/infra-manual master: Add docs about tox jobs and sibling installation https://review.openstack.org/508327 | 05:14 |
*** rcernin has quit IRC | 05:15 | |
*** lukebrowning has joined #openstack-infra | 05:16 | |
AJaeger_ | mordred, fixed some trivial problems on your change ^ | 05:17 |
ianw | all indications are that legacy-tempest-dsvm-nnet should not be running on devstack master, but it is. hmm :/ | 05:18 |
*** mrunge_ is now known as mrunge | 05:18 | |
AJaeger_ | anybody want to +2a zuul v3 doc improvements? ^ | 05:18 |
AJaeger_ | Hi ianw ! | 05:18 |
ianw | both read fine to me, better to iterate on them | 05:19 |
AJaeger_ | yep - thanks | 05:20 |
*** lukebrowning has quit IRC | 05:20 | |
ianw | AJaeger_: you seen anything funny with branch regex matches? | 05:22 |
*** armax has joined #openstack-infra | 05:23 | |
*** gongysh has joined #openstack-infra | 05:23 | |
AJaeger_ | ianw: no time to dig into anything ;( I'm still travelling and looked for a 5 minute help ;) | 05:23 |
ianw | AJaeger_: np | 05:23 |
*** pcaruana has joined #openstack-infra | 05:24 | |
openstackgerrit | Ian Wienand proposed openstack-infra/openstack-zuul-jobs master: Wrap legacy-tempest-dsvm-nnet branch match regex https://review.openstack.org/508405 | 05:25 |
*** iyamahat has joined #openstack-infra | 05:25 | |
*** lukebrowning has joined #openstack-infra | 05:25 | |
*** pgadiya has joined #openstack-infra | 05:27 | |
*** pcaruana has quit IRC | 05:29 | |
*** iyamahat has quit IRC | 05:30 | |
*** lukebrowning has quit IRC | 05:30 | |
*** lukebrowning has joined #openstack-infra | 05:31 | |
AJaeger_ | team, should we remove the old project-config/jenkins/jobs directory and layout/zuul.yaml files - so that nobody can submit changes anymore and they get conflicts on existing changes? | 05:35 |
* AJaeger_ just added comments and -1 to some existing changes that need to be adapted for zuul v3 and noticed changes submitted this week still touching the old files | 05:36 | |
*** lukebrowning has quit IRC | 05:36 | |
*** Sukhdev has quit IRC | 05:37 | |
openstackgerrit | Merged openstack-infra/infra-manual master: Add howto section on migrating legacy jobs to v3 https://review.openstack.org/508295 | 05:38 |
openstackgerrit | Merged openstack-infra/infra-manual master: Add docs about tox jobs and sibling installation https://review.openstack.org/508327 | 05:38 |
ianw | AJaeger_: for the immediate time i've found it useful to cross-reference against them | 05:43 |
ianw | but, in a few days when this has settled down, sure | 05:43 |
*** lukebrowning has joined #openstack-infra | 05:44 | |
AJaeger_ | ianw: ok, then I'll add some more -1 when new changes come in | 05:45 |
*** lukebrowning has quit IRC | 05:48 | |
*** rcernin has joined #openstack-infra | 05:52 | |
*** yamamoto_ has joined #openstack-infra | 05:53 | |
*** lukebrowning has joined #openstack-infra | 05:55 | |
*** yamamoto has quit IRC | 05:57 | |
*** lukebrowning has quit IRC | 05:59 | |
*** lukebrowning has joined #openstack-infra | 06:01 | |
*** stakeda has quit IRC | 06:01 | |
*** lukebrowning has quit IRC | 06:06 | |
*** lukebrowning has joined #openstack-infra | 06:07 | |
*** markvoelker has joined #openstack-infra | 06:09 | |
*** lukebrowning has quit IRC | 06:12 | |
*** hashar has joined #openstack-infra | 06:12 | |
*** lukebrowning has joined #openstack-infra | 06:13 | |
*** masber has joined #openstack-infra | 06:15 | |
*** lukebrowning has quit IRC | 06:18 | |
*** lukebrowning has joined #openstack-infra | 06:20 | |
*** kiennt26 has joined #openstack-infra | 06:22 | |
yolanda | hi AJaeger_ , ianw , what's the status with infra? jobs broken? i'm seeing errors on my bifrost jobs | 06:24 |
*** lukebrowning has quit IRC | 06:24 | |
*** esberglu has joined #openstack-infra | 06:25 | |
*** lukebrowning has joined #openstack-infra | 06:26 | |
*** andreas_s has joined #openstack-infra | 06:29 | |
*** esberglu has quit IRC | 06:29 | |
*** lukebrowning has quit IRC | 06:31 | |
*** lukebrowning has joined #openstack-infra | 06:32 | |
*** lukebrowning has quit IRC | 06:37 | |
*** mat128 has joined #openstack-infra | 06:39 | |
*** markvoelker has quit IRC | 06:43 | |
*** lukebrowning has joined #openstack-infra | 06:44 | |
SamYaple | can someone clarify for me how to remove the legacy jobs? They are currently b0rked and we are just going to setup zuulv3 gates from scratch for LOCI | 06:45 |
SamYaple | for now we just want to purge the legacy jobs and noop zuulv3 | 06:45 |
*** tmorin has joined #openstack-infra | 06:47 | |
*** lukebrowning has quit IRC | 06:49 | |
*** gongysh has quit IRC | 06:49 | |
*** lukebrowning has joined #openstack-infra | 06:50 | |
*** jtomasek has joined #openstack-infra | 06:53 | |
*** wewe0901 has joined #openstack-infra | 06:54 | |
*** lukebrowning has quit IRC | 06:55 | |
*** lukebrowning has joined #openstack-infra | 06:56 | |
openstackgerrit | Tony Breeds proposed openstack-infra/openstack-zuul-jobs master: Pin legacy-requirements-python34 to a trusty node https://review.openstack.org/508421 | 06:59 |
*** shardy_afk is now known as shardy | 07:00 | |
* tonyb has no idea if ^^ is right | 07:00 | |
yamamoto_ | is the RETRY_LIMIT thing recheck'able? | 07:01 |
yamamoto_ | eg. https://review.openstack.org/#/c/507037/ | 07:01 |
*** lukebrowning has quit IRC | 07:01 | |
*** lukebrowning has joined #openstack-infra | 07:03 | |
*** gildub has quit IRC | 07:03 | |
*** pgadiya has quit IRC | 07:04 | |
*** pcaruana has joined #openstack-infra | 07:04 | |
*** gongysh has joined #openstack-infra | 07:04 | |
*** lukebrowning has quit IRC | 07:07 | |
openstackgerrit | Merged openstack-infra/openstack-zuul-jobs master: Fix dib functional tests https://review.openstack.org/508383 | 07:09 |
*** lukebrowning has joined #openstack-infra | 07:09 | |
*** mat128 has quit IRC | 07:10 | |
*** jpich has joined #openstack-infra | 07:12 | |
*** ihrachys has quit IRC | 07:12 | |
*** ihrachys has joined #openstack-infra | 07:12 | |
*** lukebrowning has quit IRC | 07:13 | |
*** gk_ has joined #openstack-infra | 07:13 | |
*** gk_ has quit IRC | 07:14 | |
*** Guest11750 has quit IRC | 07:15 | |
*** lukebrowning has joined #openstack-infra | 07:15 | |
*** gk_ has joined #openstack-infra | 07:16 | |
*** florianf has joined #openstack-infra | 07:16 | |
*** gk_ has quit IRC | 07:17 | |
AJaeger_ | SamYaple: docs just merged on what to do - see https://review.openstack.org/508327 . Unfortunately those are not published yet | 07:18 |
AJaeger_ | you can read draft which is up | 07:18 |
AJaeger_ | infra-root, infra-manual publishing did fail somehow for 508327 | 07:18 |
AJaeger_ | yolanda: I'm not up to speed, see http://lists.openstack.org/pipermail/openstack-dev/2017-September/122834.html for last email on it | 07:19 |
*** lukebrowning has quit IRC | 07:19 | |
yolanda | yep, seems we hit "Missing inclusion of the requirements repo" | 07:20 |
*** kiennt26 has quit IRC | 07:20 | |
*** lukebrowning has joined #openstack-infra | 07:21 | |
SamYaple | AJaeger_: thanks. ill check it out | 07:22 |
SamYaple | im having a bit of trouble with openstack/loci right now. it only had a noop job.... but nothing seems to be working with that repo | 07:23 |
SamYaple | i see it in zuulv3.o.o , but it never reports | 07:23 |
*** lukebrowning has quit IRC | 07:26 | |
*** lukebrowning has joined #openstack-infra | 07:28 | |
*** shardy is now known as shardy_afk | 07:30 | |
*** namnh has quit IRC | 07:31 | |
*** masber has quit IRC | 07:31 | |
*** namnh has joined #openstack-infra | 07:32 | |
*** lukebrowning has quit IRC | 07:32 | |
*** lukebrowning has joined #openstack-infra | 07:34 | |
SamYaple | https://review.openstack.org/#/c/508425/ like this, zuul doesnt seem to kick anything off. i can't figure out where to begin looking | 07:36 |
*** ccamacho has joined #openstack-infra | 07:37 | |
*** lukebrowning has quit IRC | 07:38 | |
AJaeger_ | SamYaple: sorry, currently travelling and not up to speed - best come back when the US wakes up. Or reply to the email... | 07:39 |
*** rpittau has joined #openstack-infra | 07:39 | |
matbu_ | SamYaple: yep looks like zuul is not awake : http://status.openstack.org/zuul/ | 07:39 |
*** jpena|off is now known as jpena | 07:39 | |
SamYaple | matbu_: that page is b0rked still. go to zuul.openstack.org | 07:40 |
AJaeger_ | matbu_: that's zuul v2 - SamYaple had the right URL which is zuulv3.openstack.org | 07:40 |
*** markvoelker has joined #openstack-infra | 07:40 | |
AJaeger_ | SamYaple: zuul*v3* | 07:40 |
SamYaple | AJaeger_: it redirects | 07:40 |
AJaeger_ | SamYaple: Ah! | 07:40 |
openstackgerrit | Tony Breeds proposed openstack-infra/openstack-zuul-jobs master: Pin legacy-requirements-python34 to a trusty node https://review.openstack.org/508421 | 07:41 |
SamYaple | k well i should sleep anyway. im just going to hope its all fixed when i wake up :) | 07:42 |
openstackgerrit | Pavlo Shchelokovskyy proposed openstack-infra/project-config master: Add separate coverage job for ironic-inspector https://review.openstack.org/508129 | 07:42 |
matbu_ | ha thx better now | 07:42 |
*** egonzalez has joined #openstack-infra | 07:44 | |
*** lukebrowning has joined #openstack-infra | 07:45 | |
matbu_ | im wondering why zuul is not kicked btw here https://review.openstack.org/#/c/487496/ | 07:48 |
chandankumar | AJaeger_: regarding this review https://review.openstack.org/#/c/507038/ do i need to submit changes against something else by following zuulv3 docs? | 07:48 |
matbu_ | with the A+ | 07:48 |
*** lukebrowning has quit IRC | 07:49 | |
*** lukebrowning has joined #openstack-infra | 07:51 | |
*** rossella_s has joined #openstack-infra | 07:53 | |
*** threestrands has quit IRC | 07:53 | |
tonyb | I'm seeing a few jobs fail with something like: http://logs.openstack.org/49/508249/1/check/legacy-cross-nova-func/ad36a73/job-output.txt.gz#_2017-09-29_02_05_54_784418 any ideas? | 07:54 |
*** verdurin has quit IRC | 07:54 | |
*** lukebrowning has quit IRC | 07:55 | |
*** lukebrowning has joined #openstack-infra | 07:57 | |
*** armax has quit IRC | 07:57 | |
*** shardy_afk is now known as shardy | 07:58 | |
*** chem has joined #openstack-infra | 08:00 | |
*** chenying_ has quit IRC | 08:00 | |
*** chenying_ has joined #openstack-infra | 08:01 | |
*** lukebrowning has quit IRC | 08:02 | |
*** namnh has quit IRC | 08:03 | |
*** lukebrowning has joined #openstack-infra | 08:03 | |
*** namnh has joined #openstack-infra | 08:04 | |
chem | hi, I have a job (https://review.openstack.org/#/c/474967/) that isn't picked up by zuul and http://status.openstack.org/zuul/ is not loading. | 08:04 |
chem | is that because of the workflow -1 or am i missing something | 08:04 |
chem | ? | 08:04 |
*** rossella_s has quit IRC | 08:05 | |
chem | oki, I've checked http://zuulv3.openstack.org/ and cannot find 474967 | 08:06 |
*** tushar has joined #openstack-infra | 08:07 | |
*** lukebrowning has quit IRC | 08:08 | |
tushar | Hi All, a few hours ago the third party CI stopped listening to the gerrit patches | 08:09 |
*** lukebrowning has joined #openstack-infra | 08:09 | |
tushar | I think this might be because of changes related to zuul v3 | 08:10 |
*** gongysh has quit IRC | 08:10 | |
tushar | Does anybody know what changes are required in the third party CI setup? | 08:11 |
*** rossella_s has joined #openstack-infra | 08:11 | |
*** markvoelker has quit IRC | 08:12 | |
*** esberglu has joined #openstack-infra | 08:13 | |
*** lukebrowning has quit IRC | 08:14 | |
*** ykarel is now known as ykarel|lunch | 08:14 | |
*** esberglu has quit IRC | 08:18 | |
*** lukebrowning has joined #openstack-infra | 08:22 | |
evrardjp | this may look like a dumb question but, when, in the retirement of a repo process, does the repo disappear from cgit? | 08:23 |
*** e0ne has joined #openstack-infra | 08:24 | |
*** ralonsoh has joined #openstack-infra | 08:25 | |
*** lukebrowning has quit IRC | 08:26 | |
*** gongysh has joined #openstack-infra | 08:26 | |
frickler | evrardjp: it doesn't, there will only be an empty repo pushed as the last commit, but the history will be kept forever (whatever that may be in terms of os-infra) | 08:27 |
evrardjp | mmmm | 08:28 |
evrardjp | where is the procedure for renaming a repo then? | 08:28 |
*** gongysh has quit IRC | 08:28 | |
*** lukebrowning has joined #openstack-infra | 08:28 | |
evrardjp | I think I see the end of the tunnel | 08:28 |
frickler | evrardjp: https://docs.openstack.org/infra/manual/creators.html#project-renames | 08:29 |
evrardjp | frickler: thanks! | 08:29 |
evrardjp | frickler: ok let me explain the problem | 08:30 |
*** lukebrowning has quit IRC | 08:33 | |
*** lukebrowning has joined #openstack-infra | 08:34 | |
evrardjp | I don't see any topic project-rename , but I still don't see openstack-ansible-security in cgit | 08:35 |
evrardjp | (this is part of a bigger issue, but let's say we tackle that one) | 08:35 |
*** rossella_s has quit IRC | 08:36 | |
evrardjp | openstack-ansible-security was retired in favor of ansible-hardening, but I still need old references for old stable branches into this repo | 08:36 |
evrardjp | (the openstack-ansible-security one) | 08:36 |
*** caphrim007 has quit IRC | 08:37 | |
*** caphrim007_ has joined #openstack-infra | 08:37 | |
*** jaosorior has joined #openstack-infra | 08:37 | |
*** alexchadin has joined #openstack-infra | 08:37 | |
*** sbezverk has quit IRC | 08:38 | |
*** lukebrowning has quit IRC | 08:39 | |
openstackgerrit | Andrea Frittoli proposed openstack-infra/devstack-gate master: Throwaway patch to check subunit file processing https://review.openstack.org/508171 | 08:39 |
*** tosky has joined #openstack-infra | 08:44 | |
evrardjp | at least https://git.openstack.org/cgit/openstack/openstack-ansible-security doesn't seem to exist anymore | 08:45 |
evrardjp | that blocks me from releasing anything... | 08:47 |
frickler | evrardjp: hmm, seems cgit is a different thing indeed, need to wait for some infra-root with more knowledge, then | 08:49 |
evrardjp | thanks for the effort and for the help already! | 08:50 |
openstackgerrit | Krzysztof Klimonda proposed openstack-infra/zuul feature/zuulv3: Add zuul supplementary groups before setgid/setuid https://review.openstack.org/508444 | 08:51 |
openstackgerrit | Mehdi Abaakouk (sileht) proposed openstack-infra/openstack-zuul-jobs master: Add missing projects to telemetry jobs https://review.openstack.org/508448 | 08:56 |
*** bhavik1 has joined #openstack-infra | 08:57 | |
odyssey4me | evrardjp the openstack-ansible-security git mirror is still in github, so it's likely still in git.o.o too? | 08:57 |
evrardjp | nope that's what I thought | 08:57 |
evrardjp | that's why I lost time :p | 08:57 |
evrardjp | have a look at my link | 08:58 |
evrardjp | I was always checking github insted of cgit | 08:58 |
evrardjp | instead* | 08:58 |
*** ykarel|lunch is now known as ykarel | 08:59 | |
odyssey4me | gerrit still knows about it | 09:00 |
*** pas-ha has joined #openstack-infra | 09:00 | |
odyssey4me | git remote set-url origin https://git.openstack.org/openstack/openstack-ansible-security | 09:00 |
odyssey4me | that works | 09:00 |
*** mriedem has quit IRC | 09:01 | |
evrardjp | yes, and git also | 09:01 |
odyssey4me | cgit might just be set to not show repositories which are read-only | 09:01 |
evrardjp | mmm | 09:01 |
evrardjp | so the ACL on gerrit could have an impact? | 09:01 |
odyssey4me | why is cgit important in this equation? | 09:01 |
evrardjp | it's what's used in releases | 09:01 |
odyssey4me | oh really? that's a problem | 09:01 |
*** yamamoto_ has quit IRC | 09:01 | |
*** andreas_s_ has joined #openstack-infra | 09:01 | |
evrardjp | in the validation tooling we are checking the URL on cgit | 09:01 |
evrardjp | odyssey4me: indeed :p | 09:02 |
evrardjp | all our releases are broken right now. | 09:02 |
odyssey4me | ah, we'll have to wait for an infra-root to help then | 09:02 |
evrardjp | if that's ACL I can fix that | 09:02 |
evrardjp | but then it will impact other things | 09:02 |
evrardjp | so I'd rather wait for more experience | 09:03 |
AJaeger_ | evrardjp: repo is retired - that means it's frozen for all branches. | 09:03 |
evrardjp | the alternative would be to clone in the release tooling but a comment in code made me think it was tried and not a good idea | 09:03 |
AJaeger_ | evrardjp: if you want to do release on old branches, you shouldn't have retired it... | 09:03 |
AJaeger_ | and that's why it's hidden in cgit | 09:03 |
evrardjp | first, I haven't retired it :p | 09:03 |
evrardjp | second we still want to release, but not on this one | 09:04 |
evrardjp | so our OLD deliverables for Ocata for example, still contain this retired repo | 09:04 |
openstackgerrit | yolanda.robla proposed openstack-infra/openstack-zuul-jobs master: Add requirements to bifrost jobs https://review.openstack.org/508452 | 09:05 |
AJaeger_ | evrardjp: sorry, need to go offline again and can't help further for now | 09:05 |
*** andreas_s has quit IRC | 09:05 | |
evrardjp | AJaeger_: are you suggesting I need to change the release tooling for that case? | 09:05 |
evrardjp | AJaeger_: no worries :) | 09:05 |
evrardjp | I will talk in release then | 09:06 |
tonyb | evrardjp: It's a somewhat known issue. I expect dhellmann and ttx will fix it ASAP | 09:08 |
*** mat128 has joined #openstack-infra | 09:08 | |
*** markvoelker has joined #openstack-infra | 09:09 | |
*** bhavik1 has quit IRC | 09:10 | |
evrardjp | I will fix it | 09:10 |
evrardjp | it doesn't look hard | 09:10 |
evrardjp | I will ping dhellmann and ttx | 09:10 |
evrardjp | for reviews | 09:10 |
ttx | hmmm some jobs look stuck in the queue | 09:11 |
ttx | see 504940 for example | 09:12 |
*** filler has quit IRC | 09:14 | |
openstackgerrit | Pavlo Shchelokovskyy proposed openstack-infra/openstack-zuul-jobs master: Require requirements prj for legacy-requirements https://review.openstack.org/508460 | 09:16 |
*** filler has joined #openstack-infra | 09:16 | |
*** iyamahat has joined #openstack-infra | 09:23 | |
frickler | infra-root: it seems that at least neutron gate jobs are being merged without unit tests (or not merged due to multinode failure), but that does seem a critical bug to me | 09:23 |
*** iyamahat_ has joined #openstack-infra | 09:24 | |
frickler | also swift is running openstack-tox-py27 on trusty instead of openstack-tox-py27-xenial https://review.openstack.org/474801 | 09:25 |
mikal | Is zuul broken or is it just me? status.openstack.org/zuul never finishes loading. | 09:26 |
odyssey4me | mikal it's between things | 09:27 |
odyssey4me | try http://zuulv3.openstack.org/ | 09:27 |
mikal | Oh fancy curved corners! | 09:27 |
odyssey4me | it looks like the migration to zuul v3 is still somewhat in progress | 09:27 |
dmellado | what about the status page | 09:27 |
dmellado | is it broken too? | 09:27 |
dmellado | did it get migrated to some another url? | 09:27 |
dmellado | I wanted to check the status of an ongoing patch and can't see anything :\ | 09:28 |
odyssey4me | no, that status page is likely broken because the back-end it relied on is not yet migrated | 09:28 |
*** iyamahat has quit IRC | 09:28 | |
odyssey4me | for an interim status check http://zuulv3.openstack.org/ I think - once they rename zuulv3 to zuul then the status page will get back online again | 09:29 |
*** yamamoto has joined #openstack-infra | 09:29 | |
odyssey4me | well, that's my understanding from some chat I saw yesterday | 09:29 |
*** iyamahat_ has quit IRC | 09:29 | |
*** iyamahat has joined #openstack-infra | 09:29 | |
dmellado | odyssey4me: thanks! | 09:30 |
dmellado | I'm trying to check the current status and sadly fix broken things | 09:30 |
*** owalsh has joined #openstack-infra | 09:31 | |
*** panda|off is now known as panda | 09:31 | |
*** yamamoto has quit IRC | 09:32 | |
*** lukebrowning has joined #openstack-infra | 09:36 | |
*** yamamoto has joined #openstack-infra | 09:40 | |
*** mat128 has quit IRC | 09:41 | |
*** udesale has quit IRC | 09:42 | |
*** iyamahat_ has joined #openstack-infra | 09:42 | |
*** sambetts|afk is now known as sambetts | 09:42 | |
*** iyamahat has quit IRC | 09:42 | |
*** markvoelker has quit IRC | 09:43 | |
openstackgerrit | Jens Harbott (frickler) proposed openstack-infra/openstack-zuul-jobs master: Fix grenade multinode job https://review.openstack.org/508473 | 09:46 |
frickler | infra-root: ^^ I think I've located the cause for the multinode post_failures | 09:46 |
*** alexchadin has quit IRC | 09:47 | |
*** lukebrowning has quit IRC | 09:48 | |
*** alexchadin has joined #openstack-infra | 09:48 | |
dmellado | anyone also having 'end of stream' errors? | 09:49 |
*** lukebrowning has joined #openstack-infra | 09:50 | |
*** yamamoto has quit IRC | 09:54 | |
*** iyamahat_ has quit IRC | 09:54 | |
*** lukebrowning has quit IRC | 09:55 | |
*** lukebrowning has joined #openstack-infra | 09:57 | |
*** lukebrowning has quit IRC | 10:01 | |
*** esberglu has joined #openstack-infra | 10:01 | |
frickler | so all the openstack-tox-py27 I checked ran on trusty, could this have the same root-cause as nova-net running on master? | 10:01 |
*** yamamoto has joined #openstack-infra | 10:01 | |
*** egonzalez has quit IRC | 10:05 | |
*** esberglu has quit IRC | 10:06 | |
*** adriant has quit IRC | 10:06 | |
*** stevemar has quit IRC | 10:06 | |
*** yuval has quit IRC | 10:06 | |
*** stevemar has joined #openstack-infra | 10:07 | |
*** jgriffith has quit IRC | 10:08 | |
*** lukebrowning has joined #openstack-infra | 10:08 | |
*** numans has quit IRC | 10:08 | |
*** ari[m] has quit IRC | 10:08 | |
*** ari[m] has joined #openstack-infra | 10:08 | |
*** yuval has joined #openstack-infra | 10:09 | |
*** numans has joined #openstack-infra | 10:10 | |
*** lukebrowning has quit IRC | 10:13 | |
*** dhajare has quit IRC | 10:13 | |
*** lukebrowning has joined #openstack-infra | 10:14 | |
*** jgriffith has joined #openstack-infra | 10:14 | |
*** LindaWang has quit IRC | 10:14 | |
*** tmorin has quit IRC | 10:16 | |
*** lukebrowning has quit IRC | 10:19 | |
*** egonzalez has joined #openstack-infra | 10:20 | |
*** lukebrowning has joined #openstack-infra | 10:20 | |
*** derekh has joined #openstack-infra | 10:22 | |
*** adriant has joined #openstack-infra | 10:22 | |
*** lukebrowning has quit IRC | 10:24 | |
ianw | frickler: that one's got me beat for now | 10:26 |
ianw | nova-net on master ... what are you seeing run incorrectly? | 10:26 |
*** lukebrowning has joined #openstack-infra | 10:26 | |
*** liujiong has quit IRC | 10:27 | |
openstackgerrit | Tom Barron proposed openstack-infra/project-config master: Update manila tempest job skip conditions https://review.openstack.org/508485 | 10:27 |
*** masber has joined #openstack-infra | 10:28 | |
tosky | tbarron: ^^ I suspect it's going to be rejected as it is (Zuul v3 migration means new places for job definition) | 10:30 |
frickler | ianw: openstack-tox-py27 is running on trusty nodes instead of xenial. neutron gate jobs are missing py27 jobs completely | 10:30 |
tbarron | tosky: ack, guess I need to learn the new places :) | 10:31 |
*** masber has quit IRC | 10:32 | |
*** lukebrowning has quit IRC | 10:32 | |
*** lukebrowning has joined #openstack-infra | 10:33 | |
*** shardy has quit IRC | 10:33 | |
*** LindaWang has joined #openstack-infra | 10:36 | |
*** zhurong has quit IRC | 10:36 | |
*** lukebrowning has quit IRC | 10:37 | |
*** lukebrowning has joined #openstack-infra | 10:39 | |
*** markvoelker has joined #openstack-infra | 10:40 | |
*** alexchadin has quit IRC | 10:41 | |
*** seanhandley has joined #openstack-infra | 10:42 | |
seanhandley | I'm waiting on `Needs Verified Label` for https://review.openstack.org/#/c/508445/ | 10:42 |
seanhandley | but I don't see where that approval is coming from | 10:42 |
seanhandley | Zuul seems to have finished running | 10:42 |
seanhandley | am I waiting for Jenkins to get involved ? | 10:43 |
tosky | seanhandley: no "Jenkins" anymore (it was zuul v2.x) | 10:43 |
*** lukebrowning has quit IRC | 10:43 | |
seanhandley | Right | 10:44 |
seanhandley | So Zuul will come back and +2 verify at some point | 10:44 |
seanhandley | ? | 10:44 |
*** askb has quit IRC | 10:44 | |
tosky | that's the idea, but there may be still bugs, as -infra people are fixing the last issues of the migrations | 10:45 |
*** pbourke has joined #openstack-infra | 10:45 | |
seanhandley | ok, thanks :) | 10:45 |
*** lukebrowning has joined #openstack-infra | 10:45 | |
seanhandley | I'll give it a couple of hours | 10:45 |
*** namnh has quit IRC | 10:47 | |
*** lukebrowning has quit IRC | 10:50 | |
*** lukebrowning has joined #openstack-infra | 10:52 | |
*** askb has joined #openstack-infra | 10:52 | |
*** lukebrowning has quit IRC | 10:56 | |
*** lukebrowning has joined #openstack-infra | 10:58 | |
*** jesusaur has quit IRC | 10:59 | |
*** lukebrowning has quit IRC | 11:02 | |
*** jesusaur has joined #openstack-infra | 11:03 | |
ianw | jeblair (fyi): yolanda has a stuck job on 508452 base-integration-centos-7 : i can see it was assigned node 0000054363 (198.72.124.183); host up for ~2 hours and i can also see "zuul" has never tried to log in. http://paste.openstack.org/show/622302/ | 11:04 |
*** rhallisey has joined #openstack-infra | 11:05 | |
*** lukebrowning has joined #openstack-infra | 11:09 | |
ianw | (of course, where this rates in order of current issues i don't know ;) | 11:10 |
*** alexchadin has joined #openstack-infra | 11:10 | |
*** markvoelker has quit IRC | 11:12 | |
*** andreas_s_ has quit IRC | 11:12 | |
openstackgerrit | Dirk Mueller proposed openstack-infra/project-config master: Remove legacy-requirements-python34 job https://review.openstack.org/508489 | 11:12 |
ianw | frickler: $ git grep '/ on node' | grep multinode | wc -l | 11:13 |
ianw | 160 | 11:13 |
ianw | i think they probably all want that copy | 11:13 |
*** lukebrowning has quit IRC | 11:13 | |
Shrews | yep, zuul seems wedged. i see lots of locked nodes, but it isn't doing anything with them. there also appear to be zookeeper connection issues again. we'll have to wait for jeblair, i believe | 11:14 |
ianw | yeah, i don't want to touch anything and destroy anything helpful at this point | 11:15 |
Shrews | hrm, not wedged, just... ineffective? lots of exceptions about nodes not being locked, which we saw yesterday when the requests got lost | 11:15 |
*** lukebrowning has joined #openstack-infra | 11:15 | |
Shrews | i'm beginning to suspect our load is too much for a single zookeeper node | 11:15 |
ianw | :/ | 11:16 |
Shrews | woah, 8G zuul debug log file | 11:17 |
ianw | i thought the whole idea was it didn't lose stuff | 11:17 |
ianw | yeah, there seems to be some tight looping | 11:17 |
Shrews | ianw: the requests for nodes sent through zk are ephemeral. if the zk connection goes away, so does the request | 11:18 |
Shrews | perhaps zuul isn't handling that very well? not sure | 11:18 |
Shrews | oy, must grab coffee | 11:19 |
ianw | I'm EOD here in .au ... good luck everyone! | 11:19 |
tosky | it's also EOW there, I guess :) | 11:19 |
*** lukebrowning has quit IRC | 11:20 | |
ianw | tosky: yes, and a holiday monday too! and daylight savings so i don't have to get up so early for the infra meeting, it's all good :) | 11:20 |
tosky | uh, daylight saving so early? Interesting | 11:21 |
*** edmondsw has quit IRC | 11:21 | |
*** lukebrowning has joined #openstack-infra | 11:22 | |
*** adisky has quit IRC | 11:23 | |
*** alexchadin has quit IRC | 11:24 | |
*** lukebrowning has quit IRC | 11:26 | |
*** lukebrowning has joined #openstack-infra | 11:28 | |
*** tpsilva has joined #openstack-infra | 11:28 | |
*** alexchadin has joined #openstack-infra | 11:30 | |
*** lukebrowning has quit IRC | 11:32 | |
*** lukebrowning has joined #openstack-infra | 11:34 | |
*** lukebrowning has quit IRC | 11:39 | |
*** mat128 has joined #openstack-infra | 11:40 | |
*** alexchadin has quit IRC | 11:42 | |
*** alexchadin has joined #openstack-infra | 11:43 | |
*** jpena is now known as jpena|lunch | 11:43 | |
frickler | I think we might need a status notice that jobs in the integrated-gate are bound to fail currently due to the multinode issues | 11:50 |
*** alexchadin has quit IRC | 11:52 | |
*** baoli has joined #openstack-infra | 11:53 | |
*** armax has joined #openstack-infra | 11:53 | |
*** dprince has joined #openstack-infra | 11:57 | |
*** thorst has quit IRC | 12:00 | |
*** thorst has joined #openstack-infra | 12:00 | |
openstackgerrit | Chandan Kumar proposed openstack-infra/project-config master: Add python-tempestconf project https://review.openstack.org/508502 | 12:01 |
*** kjackal_ has joined #openstack-infra | 12:04 | |
ttx | FWIW I also have a stuck job on 504940 | 12:08 |
*** esberglu has joined #openstack-infra | 12:09 | |
*** markvoelker has joined #openstack-infra | 12:09 | |
*** baoli has quit IRC | 12:10 | |
*** alexchadin has joined #openstack-infra | 12:10 | |
*** cuongnv has quit IRC | 12:10 | |
*** mat128 has quit IRC | 12:12 | |
*** edmondsw has joined #openstack-infra | 12:13 | |
*** mat128 has joined #openstack-infra | 12:18 | |
*** trown|outtypewww is now known as trown | 12:20 | |
*** LindaWang has quit IRC | 12:21 | |
*** LindaWang has joined #openstack-infra | 12:21 | |
yamamoto | should tox_install.sh style dependencies be in required-projects as well? | 12:27 |
*** rlandy has joined #openstack-infra | 12:28 | |
*** markvoelker has quit IRC | 12:29 | |
*** markvoelker has joined #openstack-infra | 12:29 | |
*** wolverineav has joined #openstack-infra | 12:35 | |
*** hemna_ has joined #openstack-infra | 12:35 | |
*** lukebrowning has joined #openstack-infra | 12:35 | |
*** shardy has joined #openstack-infra | 12:39 | |
*** kiennt26 has joined #openstack-infra | 12:43 | |
*** camunoz has joined #openstack-infra | 12:46 | |
*** jpena|lunch is now known as jpena | 12:47 | |
*** jaypipes has joined #openstack-infra | 12:48 | |
*** lukebrowning has quit IRC | 12:48 | |
*** bnemec has joined #openstack-infra | 12:50 | |
*** lukebrowning has joined #openstack-infra | 12:51 | |
*** armax has quit IRC | 12:54 | |
*** baoli has joined #openstack-infra | 12:56 | |
*** lukebrowning has quit IRC | 12:56 | |
*** mriedem has joined #openstack-infra | 12:57 | |
dhellmann | I'm looking into a failed job and having some trouble figuring out where the log files are. http://logs.openstack.org/82/508482/1/check/legacy-releases-python35/c868102/ Is there still WIP for the zuul update related to logs? | 12:57 |
*** lukebrowning has joined #openstack-infra | 12:58 | |
*** kiennt26 has quit IRC | 12:58 | |
dmsimard | infra-root: lots of different problems in the zuul queue. It might all tie back to the same issue but I've had a job queued for 15 minutes without jobs (mergers not processing ?), example: 507889 -- ttx also mentioned 504940 that is indeed stuck | 12:58 |
frickler | dhellmann: seems like it failed here: http://logs.openstack.org/82/508482/1/check/legacy-releases-python35/c868102/job-output.txt.gz#_2017-09-29_10_50_11_618056 | 12:58 |
*** kiennt26 has joined #openstack-infra | 12:59 | |
dmsimard | dhellmann: if you open up the 'ara' folder, the error should be highlighted -- look for red things | 12:59 |
dhellmann | dmsimard : I've clicked all around and not found anything that looked like job output logging | 13:00 |
dmsimard | i.e, http://logs.openstack.org/82/508482/1/check/legacy-releases-python35/c868102/ara/ has a red icon and if you expand the tasks panel you'll see the failed | 13:00 |
dhellmann | ara seems to be helpfully showing me the job definition though | 13:00 |
dmsimard | dhellmann: click on the 'failed' status | 13:00 |
dmsimard | (the permalink version is http://logs.openstack.org/82/508482/1/check/legacy-releases-python35/c868102/ara/result/b44ba413-2dce-4b7c-afec-fe02dbec8e33/ ) | 13:00 |
dhellmann | oh, wow | 13:00 |
dhellmann | so, how do I set up my jobs so I don't need so many clicks to get to the failure logs? | 13:01 |
dmsimard | the raw output is what frickler linked, it's the equivalent of the console log | 13:01 |
dhellmann | how do I find that starting from a link on a gerrit patch? | 13:01 |
*** ralonsoh_ has joined #openstack-infra | 13:02 | |
dmsimard | the link on the gerrit patch should send you straight to the log root which contains the job-output.txt.gz file | 13:02 |
dhellmann | oh, nm | 13:02 |
dhellmann | I found it | 13:02 |
dhellmann | I just scrolled past the error | 13:02 |
*** lukebrowning has quit IRC | 13:02 | |
dhellmann | how hard is it to add new log files to a job? if I wanted a script to log its output separately for example? | 13:02 |
dhellmann | the release review process relies on us reading a report that the job generates when it passes | 13:03 |
dhellmann | now it looks like that output is likely to be all mixed in with other data | 13:03 |
openstackgerrit | Chandan Kumar proposed openstack-infra/project-config master: Add python-tempestconf project https://review.openstack.org/508502 | 13:03 |
*** jaypipes is now known as leakypipes | 13:04 | |
*** lukebrowning has joined #openstack-infra | 13:04 | |
dmsimard | dhellmann: as far as I know, you have control over what logs are sent through a task that looks like this: https://review.openstack.org/#/c/508296/17/playbooks/upload-logs.yaml | 13:04 |
*** jcoufal has joined #openstack-infra | 13:04 | |
Shrews | dmsimard: yep. we are definitely having issues with zookeeper | 13:04 |
*** ralonsoh has quit IRC | 13:05 | |
dmsimard | dhellmann: the 'src' would be the location where you would put your log files in | 13:05 |
Shrews | jeblair: mordred: pabelanger: i can't even do a "nodepool list" now. getting zk connection errors. going to poke around zk logs | 13:05 |
dhellmann | dmsimard : ok, cool, so if I created a separate file in the right place it would be copied up. I just need to figure out how to create the separate file | 13:05 |
Shrews | jeblair: mordred: pabelanger: ah ha. disk full on nodepool.o.o | 13:06 |
Shrews | wheeeee | 13:06 |
dhellmann | dmsimard, frickler : thanks for your help! | 13:06 |
dmsimard | dhellmann: np, happy to help | 13:06 |
dmsimard | Shrews: oh noes | 13:06 |
dmsimard | Shrews: I wish I had at least read only access to the servers :( | 13:07 |
*** ralonsoh has joined #openstack-infra | 13:07 | |
*** kiennt26 has quit IRC | 13:08 | |
*** ralonsoh_ has quit IRC | 13:08 | |
*** kgiusti has joined #openstack-infra | 13:08 | |
*** lukebrowning has quit IRC | 13:08 | |
*** Goneri has joined #openstack-infra | 13:09 | |
mordred | Shrews: there was 4.6G in old puppet reports in /var - I cleared them out to give some more headroom | 13:12 |
Shrews | so, /var/lib/zookeeper seems to be taking quite a lot | 13:12 |
mordred | Shrews: yah | 13:13 |
dmsimard | is debug logging enabled or something ? | 13:13 |
dmsimard | the debug logging in the zuul unit tests is.... intense | 13:13 |
mordred | yah it is | 13:14 |
dmsimard | probably want to toggle that off unless it's necessary | 13:14 |
mriedem | i'm not sure where this legacy-tempest-dsvm-nnet job came from but it's 100% fail on master since you can't run nova-network outside of a cellsv1 config, and for that we have the cellsv1 job - should i do something to filter out legacy-tempest-dsvm-nnet or is someone already doing that? | 13:14 |
*** lbragstad has joined #openstack-infra | 13:15 | |
*** lukebrowning has joined #openstack-infra | 13:15 | |
mriedem | "f off, that's the lowest priority thing right now" is an acceptable answer | 13:15 |
dmsimard | mriedem: if there's no patch opened to remove it from openstack-infra/openstack-zuul-jobs then no one is already doing that | 13:15 |
mriedem | ok | 13:15 |
mriedem | will look | 13:15 |
dmsimard | mriedem: to remove it, you'll want to remove the job component and then that job from the project | 13:15 |
mriedem | https://review.openstack.org/#/c/508405/ | 13:16 |
Shrews | mordred: zuul is outputting stuff like crazy now. maybe it's unwedged? | 13:16 |
Shrews | or made worse. i dunno | 13:17 |
mriedem | dmsimard: i think we just want to restrict it to <ocata | 13:17 |
dmsimard | mriedem: you can test that patch with a depends-on, which is what ianw did here https://review.openstack.org/#/c/508409/ and it didn't work.. let me check | 13:17 |
*** jdandrea_ has joined #openstack-infra | 13:18 | |
mordred | yah - I'm confused as to why that's not restricted to newton - looking now | 13:19 |
*** lukebrowning has quit IRC | 13:19 | |
frickler | mordred: I'm guessing something is broken with branch filtering and many of the issues in http://lists.openstack.org/pipermail/openstack-dev/2017-September/122861.html are related to that | 13:20 |
mordred | frickler: thanks for that list! | 13:20 |
*** efried is now known as fried_rice | 13:21 | |
*** lukebrowning has joined #openstack-infra | 13:21 | |
dmsimard | mordred: the template is defined in openstack-zuul-jobs but is used in project-config, would that prevent speculative testing ? | 13:21 |
*** mat128 has quit IRC | 13:22 | |
mordred | nope - speculative testing of the template should work fine | 13:22 |
mordred | I'm VERY confused about why neutron doesn't have unittests in its gate jobs | 13:22 |
*** eharney has joined #openstack-infra | 13:22 | |
*** mat128 has joined #openstack-infra | 13:22 | |
mordred | so - for a more general solution based on ianw's patch here: https://review.openstack.org/#/c/508473 | 13:24 |
*** hashar is now known as hasharAway | 13:25 | |
*** lukebrowning has quit IRC | 13:25 | |
*** jamesdenton has quit IRC | 13:25 | |
Shrews | frickler: hmm, for the "openstack-tox-py27 is being run on trusty nodes instead of xenial" issue, do you have an example handy? | 13:26 |
esberglu | Is there an alternative to zuul.openstack.org now? Or is that dashboard just not up? | 13:26 |
mordred | esberglu: http://zuulv3.openstack.org/ | 13:27 |
esberglu | mordred: Ah tx | 13:27 |
*** lukebrowning has joined #openstack-infra | 13:27 | |
tushar | Hi All ... the third party CI is not listening to the gerrit patches after migration to zuul3 | 13:29 |
tushar | and result page is also not updating - http://ci-watch.tintri.com/project?project=cinder&time=7+days | 13:29 |
tushar | Any changes required in third party CI setup? | 13:30 |
*** slaweq has quit IRC | 13:30 | |
frickler | Shrews: http://logs.openstack.org/38/508438/2/check/openstack-tox-py27/3a6fdfe/ | 13:30 |
*** stephenfin is now known as finucannot | 13:30 | |
logan- | tushar: fwiw my jenkins is still running 3rd party tests as of this morning with no changes made | 13:31 |
*** lukebrowning has quit IRC | 13:31 | |
Shrews | mordred: i might need to restart the np launchers. their zk sessions seem to be permanently suspended, so nothing is happening | 13:32 |
mordred | Shrews: nod | 13:32 |
mordred | tushar: yah - nothing should have changed re: third party CI | 13:32 |
Shrews | nl02 restarted | 13:33 |
*** lukebrowning has joined #openstack-infra | 13:33 | |
frickler | tushar: well, if you are using sos-ci, you may want to change your trigger from "Jenkins +1" to "Zuul +1" or similar | 13:34 |
Shrews | nl01 restarted. i think that was the source of the wedge | 13:34 |
tushar | logan ,mordred : In /etc/zuul/layout/layout.yaml, there is one block approval | 13:34 |
tushar | approval: | 13:34 |
tushar | - verified: [1, 2] | 13:34 |
tushar | username: jenkins | 13:34 |
fungi | tushar: is it possible some third-party ci systems are configured to only run jobs after the upstream ci reports on those changes? if so, the name of the account reporting changed | 13:34 |
tushar | after commenting this its working fine | 13:34 |
fungi | tushar: yeah, that looks like what you're doing | 13:34 |
*** dansmith is now known as superdan | 13:34 | |
mordred | ah. yes. | 13:34 |
fungi | it's the "zuul" account now | 13:34 |
tushar | fungi : correct | 13:35 |
fungi | not "jenkins" any longer | 13:35 |
mordred | fungi: that's worthy of broader communication | 13:35 |
mordred | btw - I'm looking at a wider solution to the multinode logs problem based on ianw's patch | 13:35 |
*** psachin has quit IRC | 13:36 | |
tushar | fungi : replaced jenkins with zuul , still its trigger CI for any patch | 13:37 |
fungi | i'm almost to the point where i have caffeine and can start digging in. i'm caught up on scrollback but holy moley there's so much it's hard to decide what's top priority. probably the cleanup post nodepool.o.o filling up its filesystem, followed by the branch exclusion misbehavior (or in particular whatever is causing neutron not to run unit tests and swift to run theirs on trusty) | 13:37 |
tushar | also I modified verified: [-1 2], as all the patches having zuul -1 | 13:37 |
*** ijw has joined #openstack-infra | 13:37 | |
*** lukebrowning has quit IRC | 13:38 | |
tushar | fungi : not sure we need to replace "jenkins" with "zuul" or some other key word | 13:38 |
mordred | fungi: yah - the branch exclusion misbehavior is the thing that worries me the most - but I don't see anything wrong in the config | 13:39 |
*** links has quit IRC | 13:39 | |
*** lukebrowning has joined #openstack-infra | 13:40 | |
mriedem | does openstack-zuul-jobs replace project-config now? | 13:40 |
frickler | mriedem: iiuc it only replaces project-config/jenkins/jobs | 13:41 |
Shrews | heh, citycloud-kna1 has two AZs... nova and nova-local | 13:42 |
Shrews | *sigh* | 13:42 |
mordred | mriedem: it's a little more complex - there are three main locations for shared jobs | 13:42 |
mordred | mriedem, frickler: https://docs.openstack.org/infra/manual/zuulv3.html#where-jobs-are-defined-in-zuul-v3 | 13:42 |
mordred | there are currently WAY more things in openstack-zuul-jobs than there eventually will be - because at the moment there are WAY more things defined centrally than there eventually will be | 13:43 |
mriedem | ok so eventually push project-specific jobs to the repos that run them | 13:44 |
mordred | yes | 13:44 |
mriedem | well that seems neataroo | 13:44 |
*** lukebrowning has quit IRC | 13:44 | |
*** gouthamr has joined #openstack-infra | 13:44 | |
mordred | yah - also - those jobs get tested with the patch they're proposed in (as do patches to zuul-jobs or openstack-zuul-jobs but not project-config fwiw) ... | 13:44 |
mriedem | i have been able to blissfully ignore this zuulv3 business until this week | 13:44 |
mordred | so iterating on a job is WAY easier - once we get past these initial pain moments | 13:45 |
mriedem | yeah non-self testing patches to project-config was always annoying, but workaroundable | 13:45 |
openstackgerrit | Monty Taylor proposed openstack-infra/openstack-zuul-jobs master: Create legacy nodesets and switch all legacy jobs https://review.openstack.org/508510 | 13:45 |
openstackgerrit | Monty Taylor proposed openstack-infra/openstack-zuul-jobs master: Update legacy post playbooks to pull from primary https://review.openstack.org/508511 | 13:45 |
mordred | infra-root: ^^ based on what ianw found about multinode log uploading - I believe that ^^ should fix it systemically | 13:46 |
*** lukebrowning has joined #openstack-infra | 13:46 | |
mordred | infra-root: it's a little bit more of a sledgehammer than is strictly necessary - but I can't think of any specific downside | 13:46 |
*** ykarel has quit IRC | 13:47 | |
dmsimard | FYI I'm writing a FAQ-ish email to openstack-dev to give a few pointers on how to troubleshoot the legacy and new jobs | 13:47 |
*** Dinesh_Bhor has quit IRC | 13:48 | |
dmsimard | leifmadsen: trying to find the quickstart guide, is it just melded in https://docs.openstack.org/infra/zuul/user/index.html ? | 13:48 |
chandankumar | AJaeger_: for new repo creation, we do not need to add zuul layout? | 13:48 |
frickler | mordred: woa, how did you get that into merge conflict so fast? | 13:48 |
mordred | frickler: I'm very talented | 13:48 |
seanhandley | Is one of the current Zuul issues regarding +2 Verify and Merge step? I've had a sphinx project stuck for a few hours with +2 code review, +1 workflow and a +1 verified from Zuul. | 13:49 |
AJaeger_ | chandankumar: don't know yet ;( Best ask here and send a patch for the infra-manual, please | 13:49 |
seanhandley | I'm not sure if I've done something wrong, or I'm caught up in the wider ongoing issues | 13:49 |
mordred | seanhandley: yah - we're having some issues with the zuul scheduler and nodepool nodes that jeblair and Shrews are investigating | 13:49 |
mordred | seanhandley: you have almost certainly not done anything wrong | 13:49 |
mordred | chandankumar: zuul/layout.yaml is no longer a thing - but we have not updated the project creator's guide yet (thanks for the reminder) | 13:50 |
seanhandley | this is the first Gerrit change I've raised on this repo you see :) | 13:50 |
mordred | seanhandley: oh no! | 13:50 |
seanhandley | I'm wondering if I messed up the ACL in project infra perhaps | 13:50 |
*** lukebrowning has quit IRC | 13:50 | |
seanhandley | Either way, it sounds like it's worth discussing more next week when the wider issues have hopefully been fixed | 13:51 |
mordred | seanhandley: what's the project? | 13:51 |
seanhandley | I'm just being impatient :D | 13:51 |
chandankumar | mordred: https://review.openstack.org/#/c/508502/ is it right? | 13:51 |
seanhandley | mordred: It's the public cloud WG's doc repo | 13:51 |
chandankumar | for new project creation | 13:51 |
seanhandley | I'm trying to draft a spec for the Passport Program | 13:51 |
openstackgerrit | Monty Taylor proposed openstack-infra/infra-manual master: Add mention of legacy nodesets to migration instructions https://review.openstack.org/508512 | 13:52 |
*** lukebrowning has joined #openstack-infra | 13:52 | |
AJaeger_ | mordred, time to merge https://review.openstack.org/508313 - that's the new sudo change - it has three +2s but wasn't approved yet | 13:52 |
Shrews | mordred: i'm seeing np handle requests now like crazy, but i don't really see any progress on zuulv3.o.o. maybe the scheduler needs to be kicked? or should we wait for jeblair? | 13:53 |
AJaeger_ | seanhandley: what's the change? Let's look at it in detail, please | 13:53 |
seanhandley | Sure AJaeger_ - thanks. https://review.openstack.org/#/c/508445/ | 13:53 |
mordred | Shrews: I think jeblair should be up soon, so let's wait for him | 13:53 |
AJaeger_ | seanhandley: currently on a bus with bad internet - will report back as soon as I can ;) | 13:54 |
seanhandley | heh | 13:54 |
seanhandley | Been there before :D | 13:54 |
seanhandley | Been SSH'd into prod boxes there before | 13:54 |
mordred | fungi, AJaeger_: I'm thinking after I take care of a few of these morning patches I might draft an email to the list letting folks know where we're at ... and we might want to set up a specific place for people to register migration issues | 13:54 |
frickler | AJaeger_: seanhandley: that patch is waiting in the gate queue, which is currently backed up 8 hours and counting | 13:55 |
AJaeger_ | mordred: good idea - setting up an etherpad or something like that | 13:55 |
mordred | yah | 13:55 |
AJaeger_ | frickler: thanks! | 13:55 |
seanhandley | frickler: Ouch! Thanks for checking | 13:55 |
mordred | or even a storyboard story that people can just add tasks to | 13:55 |
AJaeger_ | seanhandley: so, everything fine, drink a coffee, bake some cookies and ship them to frickler ;) | 13:55 |
seanhandley | Yup. I will patiently sip coffee and find other things to do while I wait ;) | 13:55 |
seanhandley | He doesn't want to eat cookies I've baked. | 13:56 |
seanhandley | Nobody ever does :D | 13:56 |
mtreinish | fungi, mordred, clarkb: if you get a sec can you take a look at: https://review.openstack.org/#/q/topic:restore-name-sanity to try and get openstack-health useable again | 13:56 |
mordred | AJaeger_, fungi: I'm considering force-merging the sudo fix, since the gate queue is backed up but it affects a large swath of things | 13:56 |
*** lukebrowning has quit IRC | 13:57 | |
AJaeger_ | mordred: don't ask me for advice, I couldn't follow this week and thus are backed up a bit as well ;) You still have my blessing ;) | 13:57 |
openstackgerrit | Monty Taylor proposed openstack-infra/openstack-zuul-jobs master: Create legacy nodesets and switch all legacy jobs https://review.openstack.org/508510 | 13:57 |
openstackgerrit | Monty Taylor proposed openstack-infra/openstack-zuul-jobs master: Update legacy post playbooks to pull from primary https://review.openstack.org/508511 | 13:57 |
*** hongbin has joined #openstack-infra | 13:57 | |
AJaeger_ | mordred: I'll fix 508512 - the infra-manual change - for you now... | 13:58 |
mordred | AJaeger_: well - we went live! :) | 13:58 |
*** gongysh has joined #openstack-infra | 13:58 | |
*** jcoufal_ has joined #openstack-infra | 13:58 | |
AJaeger_ | mordred: I noticed ;) | 13:58 |
* AJaeger_ is happy about that! | 13:58 | |
fungi | mordred: missing a legacy-ubuntu-xenial-2-node nodeset in 508510 | 13:58 |
fungi | i think | 13:58 |
*** lukebrowning has joined #openstack-infra | 13:58 | |
*** sbezverk has joined #openstack-infra | 13:59 | |
openstackgerrit | Matt Riedemann proposed openstack-infra/project-config master: Remove legacy-tempest-dsvm-nnet-ocata https://review.openstack.org/508513 | 13:59 |
mordred | fungi: oh - yes, you're right - those weren't strictly needed but made global search and replace easier - one sec | 14:00 |
*** jcoufal has quit IRC | 14:00 | |
*** kiennt26 has joined #openstack-infra | 14:00 | |
fungi | mordred: which sudo fix are we still missing? i'll check my zuulv3 review dashboard | 14:01 |
openstackgerrit | Andreas Jaeger proposed openstack-infra/infra-manual master: Add mention of legacy nodesets to migration instructions https://review.openstack.org/508512 | 14:01 |
mordred | fungi: https://review.openstack.org/#/c/508313/ | 14:01 |
*** srobert has joined #openstack-infra | 14:02 | |
fungi | aha, the topic isn't zuulv3, that explains why i wasn't finding it | 14:02 |
openstackgerrit | Andreas Jaeger proposed openstack-infra/infra-manual master: Add mention of legacy nodesets to migration instructions https://review.openstack.org/508512 | 14:02 |
*** ralonsoh_ has joined #openstack-infra | 14:02 | |
openstackgerrit | Monty Taylor proposed openstack-infra/openstack-zuul-jobs master: Create legacy nodesets and switch all legacy jobs https://review.openstack.org/508510 | 14:03 |
openstackgerrit | Monty Taylor proposed openstack-infra/openstack-zuul-jobs master: Update legacy post playbooks to pull from primary https://review.openstack.org/508511 | 14:03 |
*** lukebrowning has quit IRC | 14:03 | |
openstackgerrit | Matt Riedemann proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-tempest-dsvm-nnet-ocata https://review.openstack.org/508515 | 14:03 |
fungi | mordred: so, 508313 won't take effect until after we rebuild nodepool images anyway, right? | 14:03 |
*** dhajare has joined #openstack-infra | 14:04 | |
*** camunoz has quit IRC | 14:05 | |
*** lukebrowning has joined #openstack-infra | 14:05 | |
ttx | fungi: I have a stuck governance change at https://review.openstack.org/#/c/504940/ which prevents my sending of the weekly TC status... Should I somehow cancel and retry it ? Or just let it be ? | 14:05 |
mordred | fungi: nope - it was done that way to avoid needing to rebuild images | 14:05 |
mordred | fungi: oh - wait - bother | 14:05 |
mordred | fungi: the openstack-zuul-jobs version of that is the one that's important | 14:05 |
mordred | ttx: we have a bunch of stuck changes right now pending some issues being investigated | 14:06 |
fungi | k | 14:06 |
ttx | ok, standing by | 14:06 |
*** ralonsoh has quit IRC | 14:06 | |
*** amoralej is now known as amoralej|off | 14:06 | |
*** amoralej|off is now known as amoralej|lunch | 14:06 | |
frickler | mordred: so with 508511 you would not collect any logs from subnodes, is that correct? should the primary collect from subnodes? see https://review.openstack.org/508473 too | 14:07 |
odyssey4me | Hi everyone - we've had two specs patches stalled in the queue for nearly 5 hours now. https://review.openstack.org/499882 & https://review.openstack.org/499886 - any thoughts on what's going on there? | 14:07 |
mordred | fungi: so - what do you think - etherpad for reporting migration issues? Or storyboard story and have people add tasks? | 14:07 |
mordred | odyssey4me: yup. we have a stall issue ongoing | 14:07 |
fungi | mordred: we could go old school and ask them to follow up to the -dev ml thread | 14:07 |
odyssey4me | ok, will hang on and check back later then - thanks | 14:08 |
fungi | i worry that with an etherpad approach we'll just end up with a mess and not enough detail | 14:08 |
openstackgerrit | Andreas Jaeger proposed openstack-infra/infra-manual master: Ectomy Jenkins from the Infra Manual narrative https://review.openstack.org/436455 | 14:08 |
openstackgerrit | Andreas Jaeger proposed openstack-infra/infra-manual master: Add warning about Zuul v2 examples https://review.openstack.org/508518 | 14:08 |
mordred | fungi: I think that might be the right choice - especially as I cannot log in to #storyboard right now | 14:08 |
leifmadsen | dmsimard: no quickstart guide yet | 14:08 |
leifmadsen | still in progress | 14:08 |
leifmadsen | looks like the openstack etherpad is down? | 14:08 |
leifmadsen | dmsimard: no quickstart guide yet | 14:08 |
leifmadsen | still in progress, but it's available here: https://etherpad.openstack.org/p/zuulv3-quickstart | 14:08 |
leifmadsen | working notes | 14:08 |
fungi | leifmadsen: etherpad seems to be running to me | 14:09 |
leifmadsen | fungi: yea sorry, it was a local issue | 14:09 |
*** lukebrowning has quit IRC | 14:09 | |
*** alexchadin has quit IRC | 14:09 | |
leifmadsen | I didn't think my msg even went through | 14:09 |
mordred | infra-root: not that it's the most important thing on our plate, but logging in to storyboard gives me: | 14:09 |
fungi | mordred: i _do_ think an etherpad for us to coordinate what we're working on fixing makes sense, but replying to the ml seems like a better way to solicit detailed feedback | 14:10 |
mordred | Error Code: | 14:10 |
mordred | invalid_grant | 14:10 |
*** coolsvap has quit IRC | 14:10 | |
mordred | Error Description: | 14:10 |
mordred | No description received from server. | 14:10 |
openstackgerrit | Matt Riedemann proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-tempest-dsvm-nnet https://review.openstack.org/508519 | 14:10 |
mordred | fungi: kk. I mostly want to make sure that people can report specific issues and that we can keep track of duplication, whether they're being worked, and status | 14:10 |
mordred | fungi: this is one of those times where I think we may be served well by a more traditional formal process :) | 14:11 |
AJaeger_ | mordred, fungi, https://docs.openstack.org/infra/manual is not getting updated by the post job - change merged this morning but http://logs.openstack.org/95/95c4d1433c74ad23894f7296be51a3a23b3c6e56 is empty . That's sad since those merged changes updated content for zuul v3 ;/ | 14:11 |
mordred | mtreinish: awesome ^^ - before you do that patch you should probably remove the mention of that job from zuul.d/projects.yaml in project-config | 14:12 |
mordred | gah | 14:12 |
mordred | mriedem: ^^ | 14:12 |
openstackgerrit | Matt Riedemann proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-tempest-dsvm-nnet https://review.openstack.org/508519 | 14:12 |
openstackgerrit | Matt Riedemann proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-tempest-dsvm-nnet-ocata https://review.openstack.org/508520 | 14:12 |
mriedem | mordred: yup already did, just needed the depends-on | 14:12 |
mordred | mriedem: cool | 14:13 |
fungi | mordred: i just tested storyboard.o.o and i can login, fwiw, so i don't know that we've got broken things to look into there | 14:13 |
mordred | mriedem: if you feel like it, while you're at it - you could delete ALL jobs from the nova pipeline definition in project-config that are not standard central jobs and add them to .zuul.yaml in the nova repo (don't know how much job jockeying you feel like doing this morning) | 14:13 |
fungi | just as well ;) | 14:13 |
*** bobh has joined #openstack-infra | 14:13 | |
*** ramishra has quit IRC | 14:13 | |
jeblair | mriedem: or you could leave the old ugly jobs there and add nice new ones to nova | 14:14 |
mriedem | mordred: i don't have the proper jockey attire on for that | 14:14 |
mordred | yah | 14:14 |
mriedem | priority #1 for me is just getting nova unblocked atm | 14:15 |
*** hemna_ has quit IRC | 14:15 | |
*** lukebrowning has joined #openstack-infra | 14:16 | |
*** ihrachys_ has joined #openstack-infra | 14:16 | |
*** ihrachys_ has quit IRC | 14:16 | |
mordred | mriedem: totally agree. mostly mentioning it because it MIGHT be worthwhile to do a large 3-patch copy-rename-move sequence and then be able to iterate on nova issues in nova yourself - but it also might not be depending on how many you've got | 14:17 |
mriedem | i think it's just this nnet job | 14:17 |
mriedem | btw, is zuulv3 smart enough to ignore abandoned patches which are dependencies via depends-on? | 14:17 |
dmsimard | infra-root: FYI I started an etherpad for FAQs and tips on using and troubleshooting v3 https://etherpad.openstack.org/p/zuulv3-migration-faq | 14:17 |
mriedem | b/c if not, i'll have to fix a change id | 14:17 |
*** jcoufal_ has quit IRC | 14:17 | |
dmsimard | posted to openstack-dev via http://lists.openstack.org/pipermail/openstack-dev/2017-September/122880.html | 14:17 |
*** jcoufal has joined #openstack-infra | 14:18 | |
jeblair | dmsimard: please include the infra-manual zuul v3 migration document | 14:19 |
dmsimard | jeblair: that's what I was looking for, I thought that was leifmadsen's doc | 14:19 |
dmsimard | jeblair: where is it ? | 14:19 |
jeblair | dmsimard: especially since starting on line 22 you're starting to rewrite it. :) | 14:19 |
jeblair | dmsimard: https://docs.openstack.org/infra/manual/zuulv3.html | 14:19 |
AJaeger_ | jeblair: see my comment above - last publish of infra-manual was 25th September, we need it publishing again... | 14:20 |
dmsimard | jeblair: argh, I was looking in the zuul docs | 14:20 |
*** rbrndt has joined #openstack-infra | 14:20 | |
jeblair | dmsimard: the link has been included in every communication about the zuulv3 migration. it would be great to stay on-message. :) | 14:20 |
*** camunoz has joined #openstack-infra | 14:20 | |
jeblair | AJaeger_: i agree. mordred, were you looking into that yesterday? | 14:20 |
*** mriedem1 has joined #openstack-infra | 14:20 | |
dmsimard | jeblair: added, thanks | 14:21 |
*** lukebrowning has quit IRC | 14:21 | |
andreaf | dmsimard: shall we have a link to devstack and tempest roles in devstack-gate as well? | 14:21 |
mtreinish | mordred: I'm not fluent in ansible what does this mean: http://logs.openstack.org/72/508272/1/check/legacy-infra-puppet-apply-3-centos-7/d312021/job-output.txt.gz#_2017-09-29_14_06_23_469959 | 14:21 |
dmsimard | andreaf: sure | 14:21 |
jeblair | dmsimard: *please* read the infra-manual doc and help improve it before you start over again. we spent a lot of time on it. | 14:22 |
*** lukebrowning has joined #openstack-infra | 14:22 | |
dmsimard | jeblair: my intention is not to restart it, people have been asking the same questions over and over and I'm specifically targeting those questions | 14:22 |
openstackgerrit | Matt Riedemann proposed openstack-infra/project-config master: Remove legacy-tempest-dsvm-nnet-ocata https://review.openstack.org/508524 | 14:22 |
*** kfarr has joined #openstack-infra | 14:23 | |
openstackgerrit | Matt Riedemann proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-tempest-dsvm-nnet-ocata https://review.openstack.org/508520 | 14:23 |
openstackgerrit | Matt Riedemann proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-tempest-dsvm-nnet https://review.openstack.org/508519 | 14:23 |
jeblair | dmsimard: yep. we need actual documentation for all of those answers. the best answer to a question is a doc link. | 14:23 |
mordred | dmsimard, jeblair: maybe it's worth adding a FAQ section to the end as we get FAQs? sometimes a short bullet-point summary can be helpful, with an internal link to the longer section? | 14:23 |
jeblair | dmsimard: i think the etherpad can be a great stop-gap, especially when we get a new question. but it should be a staging area for getting info into docs. | 14:23 |
*** mriedem has quit IRC | 14:23 | |
mordred | jeblair: and yes - I was looking into infra-manual publication issues - will pick that up in just a bit | 14:24 |
jeblair | mordred: ya | 14:24 |
frickler | mtreinish: I think http://logs.openstack.org/72/508272/1/check/legacy-infra-puppet-apply-3-centos-7/d312021/job-output.txt.gz#_2017-09-29_14_06_23_469145 is the error message for that. you might need similar additions like in https://review.openstack.org/508448 | 14:24 |
andreaf | mtreinish: when you have a failure in a role I think you'll have better luck looking at it in ARA http://logs.openstack.org/72/508272/1/check/legacy-infra-puppet-apply-3-centos-7/d312021/ara/ | 14:24 |
AJaeger_ | mordred: you'll find a few changes to review and merge for infra-manual once you want to test ;) | 14:25 |
jeblair | fungi, dmsimard: i note that the status page is a faq on dmsimard's list. that's because https://review.openstack.org/507244 hasn't merged. is someone looking into those? | 14:25 |
mtreinish | andreaf: that doesn't help me it's too much clicking I don't know where to look | 14:26 |
mordred | jeblair, dmsimard, fungi, AJaeger_: I'm also working on an email status update - which will include the suggestion from fungi earlier that we collect specific job migration issues people are having as replies to the thread | 14:26 |
andreaf | mtreinish: heh just look for the red task http://logs.openstack.org/72/508272/1/check/legacy-infra-puppet-apply-3-centos-7/d312021/ara/result/7f8034c2-94e0-4a71-ba9f-51ee2a67c4d0/ | 14:26 |
mordred | infra-root, dmsimard, AJaeger_: unless we want to suggest a different approach for collecting those | 14:26 |
mtreinish | frickler: thanks, ok so now I have to figure out where that job is defined and update it | 14:26 |
*** lukebrowning has quit IRC | 14:27 | |
mtreinish | andreaf: right which just gives me the log output | 14:27 |
openstackgerrit | Andreas Jaeger proposed openstack-infra/infra-manual master: Add warning about Zuul v2 examples https://review.openstack.org/508518 | 14:27 |
jeblair | mordred: i think that sounds fine | 14:27 |
mtreinish | andreaf: I'd rather just look at the log... | 14:27 |
odyssey4me | we'd appreciate some review for https://review.openstack.org/508281 to fix up the required repositories for our jobs if anyone has a moment | 14:28 |
*** lukebrowning has joined #openstack-infra | 14:28 | |
fungi | jeblair: other than noticing around midnight that we don't have working puppet apply and beaker jobs on system-config, i haven't looked into them yet | 14:29 |
jeblair | fungi: maybe we should force-merge that change? | 14:30 |
fungi | i'm good with that. it can't really break anything; very limited in scope. i'll do that now | 14:30 |
AJaeger_ | the status patch fails with a cp error, see http://logs.openstack.org/44/507244/1/gate/legacy-infra-puppet-apply-3-centos-7/8156163/job-output.txt.gz#_2017-09-29_01_58_19_303876 | 14:31 |
jeblair | odyssey4me, logan-: the legacy-openstack-ansible-base part of that change looks good, but i'm not sure the linters part is correct. | 14:31 |
openstackgerrit | Merged openstack-infra/system-config master: Add redirect from status.o.o/zuul to zuulv3.openstack.org https://review.openstack.org/507244 | 14:33 |
*** lukebrowning has quit IRC | 14:33 | |
*** jaosorior has quit IRC | 14:33 | |
*** wewe0901 has quit IRC | 14:34 | |
odyssey4me | jeblair is that just a name issue, or are we misunderstanding how we're supposed to use the template model? | 14:34 |
jeblair | odyssey4me: maybe both? -- what's the problem you're trying solve there? | 14:34 |
*** lukebrowning has joined #openstack-infra | 14:34 | |
openstackgerrit | Matthew Treinish proposed openstack-infra/openstack-zuul-jobs master: Add missing projects for infra puppet jobs https://review.openstack.org/508526 | 14:35 |
odyssey4me | jeblair our linters test does ansible syntax and lint checking, so it needs all the roles in place, so we need all the repositories there | 14:35 |
mtreinish | AJaeger_: ^^^ I think that will fix it | 14:35 |
jeblair | mtreinish: awesome, thanks -- i was just noticing those jobs were failing | 14:35 |
mtreinish | AJaeger_: I'm hitting the same failure on my puppet-subunit2sql patches to fix things in subunit2sql/openstack-health post migration | 14:35 |
mtreinish | jeblair: I don't know if those are the only missing repos though, that's just where things were complaining on the failures | 14:36 |
jeblair | mtreinish: you can have the change you're trying to get through Depends-On that change, and it will test it | 14:36 |
openstackgerrit | Matthew Treinish proposed openstack-infra/puppet-subunit2sql master: Ensure that build_names are unique per project https://review.openstack.org/508258 | 14:37 |
jeblair | odyssey4me: okay, i think i understand; i'll write a suggestion in review comments | 14:37 |
mtreinish | jeblair: ^^^ ok that'll test it then | 14:37 |
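
The cross-repo dependency jeblair describes is declared with a `Depends-On:` footer in the commit message. As a minimal sketch (not Zuul's own parser), the footers could be extracted like this, assuming the footer value is a Gerrit Change-Id; the regular expression, the function name and the sample Change-Id are all hypothetical.

```python
import re

# Assumption: each dependency is a footer line "Depends-On: <Change-Id>",
# where a Change-Id is "I" followed by 40 hex digits.
DEPENDS_ON_RE = re.compile(
    r"^Depends-On:\s*(I[0-9a-f]{40})\s*$",
    re.MULTILINE | re.IGNORECASE,
)


def find_depends_on(commit_message):
    """Return every Change-Id named in a Depends-On footer, in order."""
    return DEPENDS_ON_RE.findall(commit_message)


# Hypothetical commit message; the Change-Id below is made up.
EXAMPLE_MESSAGE = """Ensure that build_names are unique per project

Depends-On: I0123456789abcdef0123456789abcdef01234567
"""

if __name__ == "__main__":
    print(find_depends_on(EXAMPLE_MESSAGE))
```

A change carrying such a footer is tested with the named change applied, which is why mtreinish can verify the openstack-zuul-jobs fix from the puppet-subunit2sql patch.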
odyssey4me | thanks jeblair | 14:38 |
*** lukebrowning has quit IRC | 14:39 | |
*** jcoufal_ has joined #openstack-infra | 14:40 | |
jeblair | odyssey4me: done. and it was mostly a naming issue i'd say. :) | 14:40 |
jeblair | okay, i need to dig into a zuul issue; i think it's stuck. | 14:41 |
*** dizquierdo has joined #openstack-infra | 14:42 | |
fungi | i was just about to ask. so i guess approving more job configuration fixes is futile at the moment | 14:42 |
*** jcoufal has quit IRC | 14:42 | |
odyssey4me | thanks jeblair | 14:43 |
*** mriedem1 is now known as mriedem | 14:43 | |
*** apuimedo has quit IRC | 14:43 | |
mordred | infra-root, dmsimard, AJaeger_: https://etherpad.openstack.org/p/MnG27fsAhC draft email to the mailing list - I started a list of common/known job issues and what to do about them at the bottom - although I'm thinking that perhaps I should just point to the dmsimard FAQ etherpad - and we should then start cycling those etherpad FAQ entries into a FAQ section on infra-manual once we get infra-manual | 14:43 |
mordred | publication working again | 14:44 |
fungi | sounds like a fine plan | 14:45 |
beisner | hi all, how's landing bot? | 14:45 |
openstackgerrit | Jesse Pretorius (odyssey4me) proposed openstack-infra/openstack-zuul-jobs master: Add openstack-ansible required-projects parent job https://review.openstack.org/508281 | 14:45 |
beisner | seems like we have a few things lost with the socks | 14:45 |
dmsimard | mordred: we can cross link the etherpads or something, or fold them together, I don't have a strong opinion | 14:46 |
mriedem | mordred: so my openstack-zuul-jobs change keeps failing on a project-config thing even though i have a depends-on the project-config change https://review.openstack.org/#/c/508520/ - does the project-config change need to merge first? assume so | 14:46 |
*** lukebrowning has joined #openstack-infra | 14:46 | |
fungi | mriedem: yes, project-config additions aren't safe to test directly since they can be abused to expose secrets | 14:46 |
mriedem | ok | 14:47 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Handle errors returning nodesets on canceled jobs https://review.openstack.org/508532 | 14:47 |
AJaeger_ | mordred: LGTM, send it out | 14:48 |
jeblair | mordred: etherpad and plan lgtm. | 14:49 |
*** shardy is now known as shardy_mtg | 14:49 | |
*** amoralej|lunch is now known as amoralej | 14:49 | |
Shrews | mordred: ++ | 14:49 |
jeblair | fungi: once that change lands, i'm going to want to restart zuul; do you think i should try to save queues? | 14:50 |
jeblair | we've never tried that with zuulv3 | 14:50 |
*** lukebrowning has quit IRC | 14:50 | |
fungi | jeblair: we have quite a few people who have reported they're waiting on queued stuff to land, so maybe? | 14:50 |
* clarkb is catching up. | 14:50 | |
clarkb | mordred: do we know yet why trusty is used on openstack-tox jobs? or why some branch exclusions seem to be ignored? | 14:51 |
jeblair | also, when i restart zuul, i will remove the 19G debug log :( | 14:51 |
*** e0ne has quit IRC | 14:51 | |
fungi | probably just as well | 14:51 |
jeblair | clarkb: according to the etherpad mordred just wrote, we don't know that yet | 14:51 |
fungi | clarkb: we don't know yet | 14:51 |
fungi | clarkb: i've been mulling over the configs there and haven't spotted anything obviously wrong, but more eyes may help | 14:52 |
*** icey has joined #openstack-infra | 14:52 | |
clarkb | in zuul-jobs looks like unittests <- tox <- various openstack tox jobs. Unittests doesn't have a parent specified | 14:53 |
clarkb | is perhaps some implied parenting breaking us? | 14:53 |
*** xarses has joined #openstack-infra | 14:55 | |
mordred | fungi, clarkb: no parent = parent: base ... and base should have nodeset: ubuntu-xenial ... but yah - we need to track down what's up with that | 14:56 |
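
The inheritance rule mordred states (a job with no parent implicitly inherits from `base`, and `base` supplies `nodeset: ubuntu-xenial`) can be illustrated with a toy model. This is only a sketch of the lookup as described in the channel, not Zuul's implementation; the dictionary layout and the function name are hypothetical, and the job names mirror the chain clarkb listed.

```python
# Toy model of job inheritance; NOT Zuul's code.
JOBS = {
    "base": {"parent": None, "nodeset": "ubuntu-xenial"},
    "unittests": {},                          # no parent given, so implicitly "base"
    "tox": {"parent": "unittests"},
    "openstack-tox-py27": {"parent": "tox"},
}


def resolve_nodeset(name, jobs=JOBS):
    """Walk up the parent chain and return the first nodeset found."""
    seen = set()
    while name is not None:
        if name in seen:
            raise ValueError(f"inheritance loop at {name!r}")
        seen.add(name)
        job = jobs[name]
        if "nodeset" in job:
            return job["nodeset"]
        # A job without an explicit parent falls back to "base";
        # "base" itself has no parent.
        name = job.get("parent", None if name == "base" else "base")
    return None


if __name__ == "__main__":
    print(resolve_nodeset("openstack-tox-py27"))
```

Under this model every job in the chain resolves to `ubuntu-xenial` unless something closer to the job overrides the nodeset, which is the kind of override the channel is hunting for.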
*** wolverineav has quit IRC | 14:56 | |
*** lukebrowning has joined #openstack-infra | 14:57 | |
*** ykarel has joined #openstack-infra | 14:57 | |
openstackgerrit | Matthew Treinish proposed openstack-infra/openstack-zuul-jobs master: Add missing projects for infra puppet jobs https://review.openstack.org/508526 | 14:57 |
fungi | do we have a good example of a neutron job which skipped unit tests? i see 508438,2 in check right now ran (and passed) openstack-tox-py27 and openstack-tox-py35 | 14:58 |
fungi | Status: Pass 12888 Skip 1137 http://logs.openstack.org/38/508438/2/check/openstack-tox-py27/3a6fdfe/testr_results.html.gz | 14:58 |
clarkb | 498013 | 14:58 |
*** wolverineav has joined #openstack-infra | 14:58 | |
fungi | thanks | 14:58 |
clarkb | also currently in the gate | 14:58 |
clarkb | it is a change to ocata | 14:59 |
fungi | aha, so not master | 14:59 |
fungi | maybe that's the common thread | 14:59 |
*** rcernin has quit IRC | 15:01 | |
mgagne | minor issue with grafana and nodepool, some metrics aren't showing since zuulv3, anything I can do? http://grafana.openstack.org/dashboard/db/nodepool-inap | 15:01 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Always try to unlock nodes when returning https://review.openstack.org/508532 | 15:01 |
SpamapS | rsync: rename failed for "/var/lib/zuul/builds/f2853eee2bae4e19b8b2a474de36d78b/work/logs/logs/deprecations.txt.gz" (from logs/.~tmp~/deprecations.txt.gz): No such file or directory (2) | 15:01 |
*** lukebrowning has quit IRC | 15:01 | |
*** baoli has quit IRC | 15:01 | |
AJaeger_ | logs/logs? Is that the problem? | 15:02 |
SpamapS | That's from 488013's post ara | 15:02 |
*** ykarel has quit IRC | 15:02 | |
SpamapS | 498013 | 15:02 |
SpamapS | that's the reason for the POST_FAILURE | 15:02 |
jeblair | mgagne: dmsimard was looking into that | 15:02 |
*** egonzalez has quit IRC | 15:02 | |
*** baoli has joined #openstack-infra | 15:02 | |
*** baoli has quit IRC | 15:02 | |
dmsimard | yeah there's a patch, I believe it's failing the grafyaml job but haven't had the chance to look yet | 15:02 |
dmsimard | https://review.openstack.org/#/c/508349/ | 15:03 |
fungi | SpamapS: was that a multinode job? | 15:03 |
*** lukebrowning has joined #openstack-infra | 15:03 | |
dmsimard | Error executing: cp -dRl /home/zuul/src/git.openstack.org/openstack-infra/grafyaml/. /home/zuul/src/git.openstack.org/openstack-infra/project-config/.tox/grafyaml/openstack-infra/grafyaml | 15:03 |
dmsimard | that doesn't look like a legit error ? | 15:03 |
SpamapS | fungi: it was | 15:03 |
jeblair | AJaeger_: logs/logs is probably okay (devstack jobs put logs inside of a directory called logs -- so the first logs us zuul machinery, it's the root of the upload. the second is devstack machinery, it shows up in the final location) | 15:03 |
jeblair | s/us/is/ | 15:04 |
SpamapS | dmsimard: rsync will exit non-0 when that happens, but yeah looks like maybe files disappeared while it was running. | 15:04 |
SpamapS | oh n/m that's your cp error | 15:04 |
jeblair | dmsimard: needs required-projects grafyaml | 15:04 |
dmsimard | jeblair: ok, I'll send a patch | 15:05 |
dmsimard | thanks. | 15:05 |
jeblair | np | 15:05 |
dmsimard | that should be in the FAQ :D | 15:05 |
dmsimard | I'll send the patch first though | 15:05 |
mordred | just added a specific mention | 15:05 |
SpamapS | fungi: is that rsync problem a known issue w/ multinode jobs? | 15:05 |
SpamapS | It came from http://logs.openstack.org/13/498013/1/gate/legacy-grenade-dsvm-neutron-multinode/f2853ee/ | 15:05 |
openstackgerrit | Matt Riedemann proposed openstack-infra/project-config master: Remove nova-net jobs that are >newton https://review.openstack.org/508524 | 15:06 |
mriedem | mtreinish: ^ since that changes tempest | 15:06 |
fungi | SpamapS: yeah, mordred has a proposed fix stack | 15:06 |
fungi | basically right now we're trying to collect logs from every node in the nodeset rather than just the primary node | 15:06 |
clarkb | I'm noticing that project-config consumes resources out of ozj like openstack-python-jobs template, but ozj is listed after project-config in the project list. Is that a problem? Comments say order matters | 15:07 |
mordred | jeblair: I know you're looking at zuul deep issue- but could you look at https://review.openstack.org/#/c/508511/ and https://review.openstack.org/#/c/508510/ real quick- just want to make sure you're not opposed to that approach | 15:07 |
mordred | clarkb: order matters for job definitions ... | 15:07 |
*** lukebrowning has quit IRC | 15:07 | |
mordred | clarkb: so, specifically, a job can't have a parent that was defined after it | 15:08 |
*** baoli has joined #openstack-infra | 15:08 | |
SpamapS | fungi: k | 15:08 |
mordred | SpamapS: https://review.openstack.org/#/c/508511/ and https://review.openstack.org/#/c/508510 | 15:08 |
*** hemna_ has joined #openstack-infra | 15:08 | |
* SpamapS looks | 15:08 | |
mordred | clarkb: but that's within a class of config - the various classes are loaded into zuul's config in an order that should make sense - so all the jobs and project-templates are loaded before project definitions are loaded | 15:09 |
openstackgerrit | David Moreau Simard proposed openstack-infra/openstack-zuul-jobs master: Add openstack-infra/grafyaml to the project-config grafyaml job https://review.openstack.org/508537 | 15:09 |
dmsimard | jeblair: ^ | 15:09 |
*** lukebrowning has joined #openstack-infra | 15:09 | |
*** ramishra has joined #openstack-infra | 15:10 | |
*** apuimedo has joined #openstack-infra | 15:10 | |
jeblair | mordred: oh! because only the primary node was in the inventory in v2.5, right? | 15:10 |
openstackgerrit | David Moreau Simard proposed openstack-infra/project-config master: Update Nodepool graphite metric names https://review.openstack.org/508349 | 15:11 |
jeblair | mordred: both +2 | 15:11 |
clarkb | it seems like explicit nodesets are working in the cases I have checked | 15:13 |
jeblair | dmsimard: +2 on 508537, comment on 508349 | 15:13 |
clarkb | its just the implicit nodeset that isn't for things like openstack-tox jobs | 15:13 |
*** lukebrowning has quit IRC | 15:14 | |
mordred | jeblair: yes | 15:14 |
mordred | clarkb: WEIRD | 15:14 |
jeblair | okay, i'm going to force-merge the zuul fix then restart zuul now. | 15:14 |
fungi | thanks jeblair! | 15:15 |
mtreinish | jeblair: so I think for the puppet-apply jobs it's going to want all the puppet repos used by system-config. Is there a way to wild card that in the job definition | 15:15 |
mordred | clarkb, jeblair: so - the only difference I can see is that our base job is using an anonymous nodeset rather than referencing the pre-defined ubuntu-xenial nodeset | 15:15 |
*** lukebrowning has joined #openstack-infra | 15:15 | |
mordred | perhaps there is a bug in the anonymous nodeset handling code? | 15:15 |
jeblair | mtreinish: no. but if you need it for more than one job, you can define a parent job which adds all the repos, then have the jobs inherit from that, so the list is only in one place. | 15:15 |
jeblair | mordred: which bug are we talking about? | 15:16 |
mtreinish | jeblair: it'll be needed by all 3 infra puppet apply jobs so sure we can do that | 15:16 |
mordred | jeblair: there is an issue where jobs are running on trusty nodes when they should be running on xenial nodes | 15:16 |
mtreinish | jeblair: but that's going to be one big list, there are a ton of openstack-infra/puppet-* repos | 15:16 |
*** ykarel has joined #openstack-infra | 15:16 | |
jeblair | mtreinish: yah, so maybe 'infra-puppet-apply-base' or something | 15:16 |
mordred | jeblair: we can't find any reason for that - other than that all the instances we've seen of it are only jobs that are getting their nodeset implicitly through the base job | 15:17 |
clarkb | mordred: I'm trying to confirm that the tox-* jobs run on xenial as expected | 15:17 |
clarkb | mordred: as they share the same inheritance even | 15:17 |
mordred | yah | 15:17 |
*** ramishra has quit IRC | 15:17 | |
jeblair | mordred: have any links handy? | 15:17 |
openstackgerrit | Andreas Jaeger proposed openstack-infra/project-config master: Remove zuul v2 and jjb content https://review.openstack.org/507180 | 15:17 |
mordred | jeblair: I do not - clarkb do you? | 15:17 |
clarkb | jeblair: http://logs.openstack.org/35/507235/2/check/openstack-tox-pep8/4962d18/zuul-info/ is an example | 15:17 |
jeblair | mordred: i'm going to hold the zuul restart for this in case it's a zuul bug | 15:17 |
*** yamamoto has quit IRC | 15:18 | |
*** vhosakot has joined #openstack-infra | 15:18 | |
clarkb | that is a change to cinder master and its pep8 job ran on trusty | 15:18 |
SpamapS | jeblair: do you think these nodes that aren't locked are results of more timeouts? | 15:18 |
mnencia | Hi, is there any good reason other than lack of interest that stops https://blueprints.launchpad.net/openstack-ci/+spec/jenkins-job-builder-folders from being advanced? How can I help it get included? | 15:19 |
jeblair | SpamapS: zookeeper filled up its filesystem; i'm assuming anything after that until we restart zuul is because of that | 15:19 |
Shrews | infra-root: fyi, nodepool.o.o disk usage steadily rising. at 90% now. we should keep an eye on it | 15:19 |
*** v1k0d3n has quit IRC | 15:19 | |
clarkb | mnencia: we haven't used launchpad blueprints in years. However, not sure if the JJB team has separately decided to use launchpad for that feature again. You'll want to talk to electrofelix and zaro and zxiiro I think | 15:20 |
SpamapS | mnencia: jjb has fallen off openstack-infra's radar. Note that the current transition going on in here is the ultimate end of jjb use in OpenStack's CI. :-P | 15:20 |
*** v1k0d3n has joined #openstack-infra | 15:20 | |
SpamapS | jeblair: OW | 15:20 |
*** lukebrowning has quit IRC | 15:20 | |
*** trown is now known as trown|brb | 15:20 | |
mordred | clarkb: I see it | 15:21 |
clarkb | jeblair: Shrews /opt has half a terabyte free on nodepool.o.o we can move the zk data root | 15:21 |
fungi | mnencia: also, the jjb team hasn't really used launchpad for some years, so i doubt anyone's tracking that blueprint there anyway | 15:21 |
clarkb | mordred: oh good, /me awaits enlightenment | 15:21 |
Shrews | clarkb: ++ | 15:21 |
zxiiro | mnencia: we discuss jjb stuff in #openstack-jjb now. Looks like the folder plugin has a patch here https://review.openstack.org/#/c/134307/ but it's failing Jenkins so I guess someone needs to at least fix the jenkins error first | 15:22 |
SpamapS | jeblair: remember when I said it's treacherous running a single node zk? One of the things that killed the ZK in the Copenhagen juju debacle of 2012 was the zk disk filling. | 15:22 |
*** lukebrowning has joined #openstack-infra | 15:22 | |
*** dave-mccowan has joined #openstack-infra | 15:22 | |
Shrews | SpamapS: :( | 15:22 |
clarkb | but if you ran 3 you'd just have 3 with full roots | 15:22 |
clarkb | its not sharding the data aiui | 15:22 |
SpamapS | Because then restarting zk required applying every single transaction from the rather large (filled the disk!) transaction log ;) | 15:22 |
SpamapS | clarkb: no it's filling because the log snapshots build up for some reason. | 15:23 |
SpamapS | Now... | 15:23 |
SpamapS | I thought that problem was fixed. | 15:23 |
openstackgerrit | Monty Taylor proposed openstack-infra/project-config master: Remove broken openstack-tox-pep8 variant https://review.openstack.org/508542 | 15:23 |
mordred | clarkb: ^^ | 15:23 |
SpamapS | I remember specifically looking at it and there was a specific feature added to help single node ZK's not do that. | 15:23 |
SpamapS | so this was likely something else. | 15:23 |
fungi | i guess the idea is that when you have 3, they won't fill their disks at exactly the same moment and you can run around cleaning up files and restarting them constantly instead to avoid the outage? ;) | 15:23 |
mordred | clarkb: the pipeline definition has a trusty variant defined but the branch exclusion it has defined is too broad | 15:23 |
clarkb | mordred: its not just pep8 fwiw | 15:23 |
mnencia | zxiiro thanks, it is one of the two poc attached to the blueprint. I'm going to ask on openstack-jjb | 15:24 |
SpamapS | I assume this filled the disk with actual znodes, not logs? | 15:24 |
clarkb | mordred: the unittest jobs are in the same boat too aiui | 15:24 |
mordred | clarkb: right- but we need to keep looking at instances and making sure that they don't have similar issues so we can determine if it's a job config issue or a zuul issue | 15:24 |
mordred | clarkb: on cinder? | 15:24 |
clarkb | mordred: http://logs.openstack.org/09/485209/7/check/openstack-tox-pep8/dfe7ff9/zuul-info/ there is nova | 15:25 |
Shrews | SpamapS: /var/lib/zookeeper/version-2/log.* and snapshot.* files | 15:25 |
SpamapS | Shrews: :( | 15:25 |
clarkb | er thats pep8 too | 15:25 |
SpamapS | that is the exact symptom I saw then. Hrm. | 15:25 |
clarkb | mordred: http://logs.openstack.org/73/502473/6/check/openstack-tox-py27/48058ce/zuul-info/ py27 cinder is trusty too | 15:26 |
clarkb | py35 is not | 15:26 |
mordred | clarkb: ok. cool - thanks | 15:26 |
Shrews | SpamapS: i think the log.* files are the largest | 15:26 |
clarkb | mordred: seems to be fairly global on pep8 and py27 | 15:26 |
SpamapS | "A ZooKeeper server will not remove old snapshots and log files, this is the responsibility of the operator. Every serving environment is different and therefore the requirements of managing these files may differ from install to install (backup for example)." | 15:26 |
*** lukebrowning has quit IRC | 15:26 | |
SpamapS | https://zookeeper.apache.org/doc/r3.1.2/zookeeperAdmin.html | 15:26 |
clarkb | sounds like elasticsearch | 15:27 |
*** trown|brb is now known as trown | 15:27 | |
fungi | frickler: you were the one to first report the missing neutron unit tests in here... do you happen to know if all occurrences were for stable branch changes perhaps, with master branch changes running expected jobs instead? | 15:27 |
clarkb | are we expected to clean them out of the fs directly or using some command against the server? | 15:27 |
SpamapS | clarkb: reading :( | 15:27 |
SpamapS | java -cp zookeeper.jar:log4j.jar:conf org.apache.zookeeper.server.PurgeTxnLog <dataDir> <snapDir> -n <count> | 15:27 |
clarkb | there is stuff from last year in there | 15:28 |
SpamapS | the count is the number of snaps to keep | 15:28 |
*** lukebrowning has joined #openstack-infra | 15:28 | |
clarkb | we can probably go to a month's worth and be happy | 15:28 |
SpamapS | recommendation is 3 snaps | 15:28 |
jeblair | Shrews, clarkb: i'm about to shut down zuulv3; do we want to take the opportunity to move zk to /opt? or just keep it running and run the cleanup command? | 15:28 |
SpamapS | (just in case the most recent logs are corrupted) | 15:28 |
clarkb | jeblair: I think we can likely get a huge win just with the cleanup command | 15:28 |
fungi | i'm in favor of trusting the cleanup command | 15:29 |
clarkb | jeblair: since we have almost a year of stuff in that dir | 15:29 |
fungi | at least in the near term | 15:29 |
fungi | trust but verify ;) | 15:29 |
jeblair | k let's start there then; i won't couple zuul restart to that | 15:29 |
*** jdandrea_ has quit IRC | 15:29 | |
SpamapS | if we care about being able to recover this data upon server loss, we should run the cleanup after we backup the server | 15:29 |
jeblair | anyway, i am really going to restart zuul now. :) | 15:29 |
mordred | jeblair: when you have a sec, I think you might want to look at the xenial/trusty thing- but I think it can wait til post-restart | 15:29 |
mordred | jeblair: ++ | 15:29 |
SpamapS | if we could theoretically just recover by deleting all nodes with a clean ZK, then just run the cleanup every hour. | 15:29 |
*** dhajare has quit IRC | 15:30 | |
clarkb | SpamapS: ya I don't think we care too much about data loss other than for debugging purposes | 15:30 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Always try to unlock nodes when returning https://review.openstack.org/508532 | 15:30 |
SpamapS | yeah, seeing as we only have 1... hourly cleanup | 15:30 |
*** kiennt26 has quit IRC | 15:30 | |
jeblair | mordred: do i need to do a quick change to expose the inheritance path variable? | 15:31 |
mordred | jeblair: make sure 508532 is installed before you restart :) | 15:31 |
mordred | jeblair: maybe so? | 15:31 |
*** d0ugal has joined #openstack-infra | 15:31 | |
openstackgerrit | Matthew Treinish proposed openstack-infra/openstack-zuul-jobs master: Add missing projects for infra puppet jobs https://review.openstack.org/508526 | 15:31 |
mtreinish | jeblair: ^^^ hopefully I did that correctly | 15:31 |
mordred | jeblair: is there anything that will show whether zuul has decided to apply a variant and if so where it got the variant from? | 15:31 |
mordred | mtreinish: looks good- except for a tab in front of required-projects | 15:32 |
*** lukebrowning has quit IRC | 15:33 | |
mordred | mtreinish: you have some things with their own required-projects and some using the base job you defined - was that on purpose? | 15:33 |
electrofelix | mnencia: just stopped using blueprints in launchpad to track stuff, combined with other things being more important, drop into the #openstack-jjb channel can help there | 15:34 |
fungi | zuul does at least merge the lists together, so you can have a main set you inherit and then add others in the ancestor | 15:34 |
lbragstad | SpamapS: when you encountered the rsync issue, did you also see a checksum failure? | 15:34 |
SpamapS | mordred: I think during the reconfig debug output you get some of that. | 15:34 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Add inheritance path to zuul vars https://review.openstack.org/508543 | 15:34 |
mtreinish | mordred: yeah, the beaker jobs are also broken, but don't need all of infra's puppet to work | 15:35 |
SpamapS | lbragstad: no, but the rsync thing is a known problem that is addressed by 508511 | 15:35 |
mordred | jeblair: lgtm | 15:35 |
mtreinish | mordred: I could split it up into 2 patches I guess, but I figured just fix all the infra jobs at once | 15:35 |
SpamapS | jeblair: ooooo I like that | 15:35 |
lbragstad | SpamapS: awesome - reviewing | 15:35 |
mordred | mtreinish: nah- looks great- just making sure | 15:35 |
mordred | mtreinish: fix that tab and I think it's good | 15:35 |
mtreinish | mordred: sigh I was in paste mode and hit tab, respinning one sec | 15:35 |
lbragstad | SpamapS: i noticed the checksum thing right before the rsync issue in this specific case | 15:35 |
lbragstad | http://logs.openstack.org/57/486757/22/check/legacy-tempest-dsvm-neutron-full/cbf0f1c/job-output.txt.gz#_2017-09-28_23_09_33_946084 | 15:35 |
jeblair | SpamapS: yeah, i'll make it a nice list of dicts later. | 15:35 |
*** chlong has quit IRC | 15:35 | |
jeblair | SpamapS: right now it's a string description; so should be enough for us to have a clue what's up. | 15:36 |
openstackgerrit | Matthew Treinish proposed openstack-infra/openstack-zuul-jobs master: Add missing projects for infra puppet jobs https://review.openstack.org/508526 | 15:36 |
jeblair | (rather, it's a list of strings right now) | 15:36 |
clarkb | I suppose I should review 50511 | 15:36 |
clarkb | *508511 | 15:36 |
jeblair | SpamapS, clarkb: you want to +2 508543 and i'll force-merge? | 15:36 |
jeblair | include it in the restart | 15:37 |
mtreinish | mordred: although the beaker jobs still aren't passing, but they run at least: http://logs.openstack.org/58/508258/3/check/legacy-openstackci-beaker-ubuntu-trusty/67395a9/job-output.txt.gz#_2017-09-29_15_19_54_480638 | 15:37 |
mordred | infra-root: https://review.openstack.org/#/c/508524 for nova, companion in https://review.openstack.org/#/c/508519 | 15:37 |
jlvillal | Is there a Zuul v3 status page? For us to find out if things should be or should not be working? | 15:37 |
SpamapS | ok, time to go find breakfast and the office. AFK for a bit | 15:37 |
mordred | jlvillal: I just sent an email to the mailing list with an update and some links to some things | 15:38 |
*** shardy_mtg is now known as shardy | 15:38 | |
clarkb | jlvillal: done | 15:38 |
jlvillal | mordred, Great. Thanks. | 15:38 |
clarkb | er jeblair done, sorry jlvillal | 15:38 |
jlvillal | heh, autocomplete on last used nick | 15:38 |
*** bauzas is now known as bauwser | 15:38 | |
jlvillal | I have noticed this job failing again and again with POST_FAILURE: https://review.openstack.org/#/c/508287/ | 15:39 |
fungi | jlvillal: it should be redirecting on its own any time now as well | 15:39 |
jlvillal | fungi, The POST_FAILURE issue? | 15:40 |
fungi | jlvillal: the status page | 15:40 |
jlvillal | fungi, Ah, thanks | 15:40 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Add inheritance path to zuul vars https://review.openstack.org/508543 | 15:40 |
mordred | jlvillal: oh! that's a failure in a nice new shiny v3 job :( | 15:40 |
mordred | well - this is another instance of openstack-tox-py27 running on trusty - I wonder if that's related | 15:41 |
frickler | fungi: re neutron, yes, there wasn't any change in master since the cutover it seems, all stable/* | 15:41 |
mordred | clarkb: ^^ see http://logs.openstack.org/87/508287/1/check/openstack-tox-py27/4e00cb9/job-output.txt.gz#_2017-09-29_04_10_40_847346 | 15:41 |
clarkb | mordred: re https://review.openstack.org/#/c/508511/3 I don't think that is a noop since primary isn't a thing unless you are a multinode job | 15:41 |
fungi | hrm, the redirect patch merged over an hour ago (14:33z) and doesn't seem to have applied yet. looking into that real quick | 15:41 |
jlvillal | mordred, Thanks. I wasn't sure if that was a known issue with the POST_FAILURE | 15:42 |
fungi | frickler: thanks, that suggests there's something going on with branch exclusions in that case | 15:42 |
mordred | clarkb: see the parent patch | 15:42 |
openstackgerrit | Ben Nemec proposed openstack-infra/tripleo-ci master: Switch cistatus page to zuul v3 https://review.openstack.org/508546 | 15:43 |
clarkb | mordred: derp, clearly too early in the morning | 15:43 |
jeblair | zuul is stopped | 15:44 |
clarkb | mordred: ok +2'd both changes but didn't approve as waiting for zuul things to complete | 15:44 |
jeblair | zuul is starting | 15:44 |
mordred | clarkb: kk | 15:44 |
mordred | clarkb, jlvillal: I think I see the bug in tox log collection | 15:44 |
jlvillal | :) | 15:45 |
fungi | apparently status.o.o isn't updating because we have a system package conflict for npm/nodejs installation which is tanking the whole manifest | 15:45 |
mordred | fungi: AWESOME | 15:45 |
fungi | npm depends on newer versions of a bunch of nodejs stuff which isn't being installed (for unspecified reasons). i'll probably have to try by hand to see why | 15:46 |
fungi | E: Unable to correct problems, you have held broken packages. | 15:47 |
*** lukebrowning has joined #openstack-infra | 15:47 | |
clarkb | fungi: if npm is trying to update itself that is known to cause problems | 15:47 |
SamYaple | can someone help me with https://review.openstack.org/#/c/508425/ ? zuul doesnt seem to be triggering anything at all and it never returns or responds to the ticket | 15:47 |
fungi | clarkb: puppet is attempting to install npm, and nodejs is apparently already installed from nodesource | 15:47 |
mordred | fungi, clarkb: dealing with the javascript stack around zuul is on my todo list for once the dust settles here | 15:47 |
openstackgerrit | David Moreau Simard proposed openstack-infra/openstack-zuul-jobs master: Add missing required-projects for TripleO jobs using devstack-gate https://review.openstack.org/508548 | 15:48 |
jeblair | SamYaple: wait a few minutes then do a recheck -- zuul was stuck for a while | 15:48 |
jeblair | SamYaple: i'm restarting it now | 15:48 |
dmsimard | infra-root: ^ is the next step to fix the broken tripleo gate, it's sitting under 2 patches from mordred which also need to land | 15:48 |
SamYaple | jeblair: its been like this since yesterday and all of last night | 15:48 |
SamYaple | since zuulv3 cutover it has not responded to any patchset in openstack/loci nameset | 15:49 |
jeblair | SamYaple: it may be something else then, but since i just restarted the debug procedure will be the same :| | 15:49 |
mordred | fungi: short-term, http://paste.openstack.org/show/622318/ is for adding nodesource apt repos for node things | 15:49 |
jeblair | zuul is up now | 15:49 |
openstackgerrit | Andreas Jaeger proposed openstack-infra/openstack-zuul-jobs master: Fix project-config-grafyaml repos https://review.openstack.org/508549 | 15:49 |
clarkb | mordred: I think fungi is saying that is already in place on status.o.o | 15:50 |
clarkb | mordred: there is puppet apt resource management of it iirc | 15:50 |
clarkb | we use it in etherpad too | 15:50 |
dmsimard | clarkb: why did https://review.openstack.org/#/c/508548/ come into merge conflict just now? o_O | 15:50 |
fungi | this is what's going on for status.o.o: http://paste.openstack.org/show/622320/ | 15:50 |
dmsimard | it's a clean patch on top of the tree | 15:50 |
jeblair | i'm re-enqueing changes from before i stopped zuul | 15:50 |
clarkb | dmsimard: it depends on something that failed to merge | 15:51 |
clarkb | dmsimard: so one of mordreds patches tickled it | 15:51 |
dmsimard | clarkb: it doesn't depend on anything and the two patches below seem to be fine | 15:51 |
SamYaple | jeblair: openstack/loci was noop gate only before the cutover, is there a problem with zuulv3 and noop? | 15:51 |
openstackgerrit | Andreas Jaeger proposed openstack-infra/project-config master: Remove zuul v2 and jjb content https://review.openstack.org/507180 | 15:51 |
dmsimard | maybe something else merged in the meantime | 15:51 |
*** lukebrowning has quit IRC | 15:51 | |
jeblair | SamYaple: there was yesterday; should be fixed today | 15:51 |
clarkb | dmsimard: also possible that is fallout from the zuul restart | 15:51 |
jeblair | SamYaple: with the current restart | 15:51 |
mordred | clarkb: ok, nod | 15:51 |
fungi | status.o.o is using "deb https://deb.nodesource.com/node_0.12 trusty main" in its sources.list | 15:52 |
mordred | dmsimard: I rechecked it just now | 15:52 |
dmsimard | AJaeger_: commented https://review.openstack.org/#/c/508549/ | 15:52 |
mordred | fungi: nod | 15:52 |
SamYaple | jeblair: ack | 15:52 |
*** derekh has quit IRC | 15:52 | |
*** lukebrowning has joined #openstack-infra | 15:53 | |
mordred | fungi: oh - I wonder if trusty somehow got a backport of npm/node that's newer than what we're getting from nodesource (since 0.12 is rather old) | 15:53 |
AJaeger_ | dmsimard: great, thanks. Will +2A then ;) | 15:53 |
fungi | mordred: possible | 15:53 |
mordred | fungi: apt-cache policy says I'm wrong :) | 15:54 |
openstackgerrit | Matthew Treinish proposed openstack-infra/openstack-zuul-jobs master: Add missing projects for infra puppet jobs https://review.openstack.org/508526 | 15:54 |
openstackgerrit | Daniel Mellado proposed openstack-infra/openstack-zuul-jobs master: Fetch python-octaviaclient from pip https://review.openstack.org/508550 | 15:54 |
mordred | fungi: Installed: 0.12.14-1nodesource1~trusty1 | 15:54 |
mordred | Candidate: 0.12.18-1nodesource1~trusty1 | 15:54 |
fungi | looks like nodejs is pending upgrade (not a security patch so unattended-upgrades doesn't install it automatically) | 15:54 |
mordred | ah | 15:54 |
openstackgerrit | Andreas Jaeger proposed openstack-infra/project-config master: Remove zuul v2 and jjb content https://review.openstack.org/507180 | 15:55 |
fungi | i'm going to `sudo apt-get upgrade` on status.o.o for now and see if that unhinges this | 15:55 |
fungi | nope, doesn't help | 15:56 |
fungi | The following packages have unmet dependencies: nodejs : Conflicts: npm | 15:57 |
*** lukebrowning has quit IRC | 15:57 | |
fungi | aha! | 15:57 |
fungi | the joys of mixing third-party package repositories with upstream | 15:58 |
fungi | er with distro | 15:58 |
*** zzzeek has quit IRC | 15:58 | |
fungi | one has a nodejs package which provides npm, the other has npm broken out as a separate package | 15:58 |
clarkb | inheritance path seems to be working well | 15:58 |
clarkb | http://logs.openstack.org/09/508209/3/check/legacy-cinder-tox-functional/772c444/zuul-info/inventory.yaml | 15:58 |
*** zzzeek has joined #openstack-infra | 15:59 | |
SamYaple | jeblair: ah perfect. thank you. that fixed it | 16:00 |
*** xyang1 has joined #openstack-infra | 16:00 | |
openstackgerrit | Matthew Treinish proposed openstack-infra/openstack-zuul-jobs master: Add missing projects for infra puppet jobs https://review.openstack.org/508526 | 16:00 |
fungi | i've removed the nodesource entry from sources.list temporarily, purged the nodejs package, and am reinstalling npm and nodejs packages from ubuntu | 16:00 |
jeblair | SamYaple: yay! | 16:01 |
jeblair | the process to re-enqueue changes is ongoing; things are moving very slowly, largely since so many of the changes in flight are zuul config changes that require dynamic config updates | 16:02 |
fungi | i've now put the nodesource sources.list entry back, updated and upgraded the nodejs package | 16:02 |
SamYaple | so question.. in project-config, is zuul/layout.yaml still used? or only zuul.d/* | 16:02 |
mordred | infra-root: there is a bug in fetch-tox-output where its combination of find: and synchronize: is leading it to try to fetch a thing that doesn't exist ... I'm working on a fix | 16:02 |
jeblair | SamYaple: only zuul.d/ | 16:02 |
*** yee379 has quit IRC | 16:03 | |
mordred | jeblair: I kinda think we should go ahead and land a patch removing layout.yaml and jenkins/jobs ... I've seen several patches to them come through and them not being there any more is a good way to stop those | 16:03 |
*** yee379 has joined #openstack-infra | 16:03 | |
clarkb | http://logs.openstack.org/01/474801/1/gate/openstack-tox-py27/a55c57a/zuul-info/inventory.yaml inheritance debug info for why trusty is used in pep8/py27 jobs | 16:03 |
*** lukebrowning has joined #openstack-infra | 16:03 | |
mordred | https://review.openstack.org/#/c/507180/ <-- AJaeger_ updated the patch to do that | 16:03 |
fungi | okay, problem recreated. so if we're using the nodesource packages, we should not attempt to install the npm package since their nodejs package provides npm on its own, so attempting to install npm directly at that point results in the observed dependency resolution errors | 16:03 |
*** sbezverk has quit IRC | 16:04 | |
openstackgerrit | Sam Yaple proposed openstack-infra/project-config master: Remove loci-jobs from project-config https://review.openstack.org/508552 | 16:04 |
SamYaple | jeblair: so ^^ is a good patch? | 16:04 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Update zuul-changes script for v3 https://review.openstack.org/508553 | 16:04 |
*** gongysh has quit IRC | 16:04 | |
openstackgerrit | Matthew Treinish proposed openstack-infra/openstack-zuul-jobs master: Add missing projects for infra puppet jobs https://review.openstack.org/508526 | 16:04 |
*** sambetts is now known as sambetts|afk | 16:05 | |
mordred | clarkb: :( -- none of those indicate that it thinks it wants to apply a variant with nodeset ubuntu-trusty | 16:05 |
openstackgerrit | Merged openstack-infra/project-config master: Remove nova-net jobs that are >newton https://review.openstack.org/508524 | 16:05 |
clarkb | mordred: would it though? I'm mostly confused why there are 6 variants at all | 16:06 |
clarkb | the job is defined in one place then used in a single template for swift from what I can tell | 16:06 |
AJaeger_ | mordred: I agree - I gave a -1 on everything proposed the last three days already... | 16:06 |
jeblair | SamYaple: left comment | 16:06 |
AJaeger_ | mordred: I'll rebase 507180 now | 16:07 |
clarkb | I wouldn't expect any variants, that job is basically defined specifically for swift/cinder/nova/neutron/etc | 16:08 |
jeblair | clarkb, mordred: 'inherit from <Job base branches: None source: openstack-infra/project-config/zuul.d/secrets.yaml@master>' *secrets.yaml* ? | 16:08 |
*** lukebrowning has quit IRC | 16:08 | |
clarkb | jeblair: oh huh /me looks | 16:08 |
clarkb | I don't see a job base in secrets.yaml | 16:09 |
jeblair | nor do i. that's funky. | 16:09 |
clarkb | or any job | 16:09 |
*** owalsh_ has joined #openstack-infra | 16:09 | |
SamYaple | jeblair: ive never used Needed-By, thats a thing? I am assuming it is the same as Depends-On, only in the opposite direction? | 16:09 |
jeblair | SamYaple: yeah. it's not recognized by tooling, it's just for humans. | 16:09 |
SamYaple | ah i see. will do. thanks | 16:10 |
*** lukebrowning has joined #openstack-infra | 16:10 | |
clarkb | jeblair: mordred I think secrets.yaml had all its content deleted and was renamed to that path at some point. Is it possible there is some funky git behavior going on around that? | 16:10 |
clarkb | jeblair: mordred maybe zuul isn't loading a clean content of that file | 16:10 |
fungi | SamYaple: basically a bookkeeping convenience to signal to reviewers that there's this other change in a different repo depending on it | 16:10 |
*** camunoz has quit IRC | 16:10 | |
jeblair | clarkb, mordred: i think we need a tool to manually run a cat job. | 16:11 |
openstackgerrit | Andreas Jaeger proposed openstack-infra/project-config master: Remove zuul v2 and jjb content https://review.openstack.org/507180 | 16:11 |
*** owalsh has quit IRC | 16:11 | |
AJaeger_ | mordred: ^ | 16:11 |
openstackgerrit | Sam Yaple proposed openstack-infra/openstack-zuul-jobs master: Remove legacy loci jobs https://review.openstack.org/508556 | 16:12 |
clarkb | AJaeger_: mordred I'd personally like to keep that around a little longer as cross referencing while we unbreak the transition has been useful | 16:12 |
openstackgerrit | Sam Yaple proposed openstack-infra/project-config master: Remove loci-jobs from project-config https://review.openstack.org/508552 | 16:12 |
SamYaple | ok, i think that should do it | 16:12 |
clarkb | (I can always checkout old commit though) | 16:12 |
clarkb | jeblair: that would let us retrieve what zuul is looking at for file contents right? | 16:13 |
SamYaple | so do I even need merge-check in the project-config repo? or can I remove all of that and just do it from the loci repo? | 16:13 |
frickler | jeblair: there are five matches for openstack-python-jobs-trusty in p-c/zuul.d/projects.yaml , that would match the 5 variants in your log | 16:13 |
clarkb | jeblair: if so ++ I think that would be useful | 16:13 |
frickler | jeblair: and openstack-python-jobs-trusty sets node: trusty for openstack-tox-py27 unconditionally | 16:13 |
frickler | jeblair: so maybe that template isn't applied project-specific? | 16:13 |
*** owalsh_ has quit IRC | 16:14 | |
*** lukebrowning has quit IRC | 16:14 | |
jeblair | clarkb: oh, i think i see the problem; it's a bug in the multi-file parsing; all config objects get the source context of the last file parsed from a repo-branch. it should not have an adverse effect on security, but it will make debug messages and zuul config error messages look weird. | 16:14 |
AJaeger_ | clarkb: I'm fine with waiting as well - but let's discuss a date/timeframe | 16:14 |
*** jascott1 has joined #openstack-infra | 16:15 | |
clarkb | frickler: that would certainly explain it if the trusty template is somehow getting applied everywhere | 16:16 |
*** lukebrowning has joined #openstack-infra | 16:16 | |
clarkb | frickler: where we apply the -trusty template we also apply the non trusty template which seems odd to me | 16:17 |
clarkb | perhaps that is causing a collision of some sort that is getting resolved in trusty's favor? | 16:17 |
openstackgerrit | Merged openstack-infra/openstack-zuul-jobs master: Add openstack-infra/grafyaml to the project-config grafyaml job https://review.openstack.org/508537 | 16:17 |
*** yamamoto has joined #openstack-infra | 16:18 | |
jeblair | clarkb, frickler: the bug i'm seeing would only account for the filename being wrong, not the project | 16:19 |
*** owalsh has joined #openstack-infra | 16:19 | |
clarkb | looking at old zuul layout group-based-policy at least wants to run trusty on mitaka branch and xenial on not mitaka | 16:19 |
clarkb | but our config doesn't seem to apply a branch restriction to the openstack-python-jobs-trusty variants | 16:19 |
AJaeger_ | 508396 just failed with an unrelated error - that looks strange. zuul complained about syntax error. Could an expert check this, please? | 16:19 |
clarkb | jeblair: I'm wondering if, since the -trusty variant comes after the non-trusty one and isn't restricted by something like branch, it is just overwriting the base variant for xenial? | 16:20 |
openstackgerrit | David Moreau Simard proposed openstack-infra/project-config master: Update Nodepool graphite metric names https://review.openstack.org/508349 | 16:20 |
openstackgerrit | Matt Riedemann proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-tempest-dsvm-nnet >newton jobs https://review.openstack.org/508520 | 16:20 |
openstackgerrit | Matt Riedemann proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-tempest-dsvm-nnet https://review.openstack.org/508519 | 16:20 |
*** LindaWang has quit IRC | 16:20 | |
clarkb | jeblair: basically what happens if you say job-foo, then later job-foo: nodeset: trusty | 16:20 |
dmsimard | jeblair: ^ with your comment addressed. It's worth exploring a broader rework of the provider-specific dashboards but I'd do that in another patch. | 16:21 |
jeblair | clarkb: trusty | 16:21 |
*** lukebrowning has quit IRC | 16:21 | |
jeblair | clarkb: last wins | 16:21 |
clarkb | jeblair: mordred frickler ok I think that may explain it then | 16:21 |
clarkb | we need to restrict the trusty set to branch ^stable/mitaka$ | 16:21 |
clarkb | or similar | 16:21 |
jeblair | clarkb: does swift have the openstack-python-jobs-trusty template? | 16:22 |
clarkb | AJaeger_: looks like maybe extra whitespace snuck in | 16:22 |
clarkb | jeblair: no | 16:22 |
dmsimard | eh, that's weird.. 508548 has finished all its jobs successfully but it appears it's not ending and reporting status to the review and is still on zuulv3.o.o | 16:22 |
openstackgerrit | Matt Riedemann proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-tempest-dsvm-nnet https://review.openstack.org/508519 | 16:22 |
*** lukebrowning has joined #openstack-infra | 16:22 | |
jeblair | clarkb: where's the trusty nodeset coming from then? | 16:22 |
clarkb | jeblair: I think from the projects ahead in config modifying that job by including the -trusty job | 16:23 |
clarkb | jeblair: I'm making a jump that the nodeset: trusty is side effecting globally after a project loads it | 16:23 |
jeblair | clarkb: oh, i'm not there yet; that shouldn't happen | 16:23 |
AJaeger_ | clarkb: any idea where exactly? | 16:24 |
*** yamamoto has quit IRC | 16:24 | |
jeblair | clarkb: the way it should work is that each variant gets applied to a copy of the job in series. so they shouldn't affect each other in that way | 16:24 |
clarkb | AJaeger_: on the blank line between title and paragraph | 16:24 |
clarkb | AJaeger_: I think | 16:24 |
AJaeger_ | clarkb: will you fix or shall I ? | 16:25 |
AJaeger_ | another thing I don't understand http://logs.openstack.org/39/508539/3/check/legacy-tox-doc-publish-checkbuild/84ac3d4/job-output.txt.gz#_2017-09-29_16_14_48_044446 - why do I get an rsync error here? | 16:25 |
clarkb | AJaeger_: my local checkout says I'm wrong though, no white space there | 16:26 |
*** pcaruana has quit IRC | 16:26 | |
clarkb | oh wait it specifically says job base not defined | 16:26 |
*** lukebrowning has quit IRC | 16:26 | |
clarkb | which is even more confusing | 16:27 |
fungi | cmurphy: clarkb: ianw: okay, i've tracked down the status.o.o updating breakage back to https://review.openstack.org/473136 which seems to be cool for ubuntu system packages on xenial but not for where we're deploying openstack-health on status.o.o running trusty with the nodesource third-party package repository | 16:27 |
dmsimard | Do we have enough zuul mergers running ? Looks like we're lagging behind | 16:27 |
fungi | cmurphy: clarkb: ianw: https://github.com/voxpupuli/puppet-nodejs#npm_package_ensure suggests the npm_package_ensure option is not intended for use with nodesource's packages | 16:28 |
jeblair | dmsimard: no; we'll run more when we decommission zuulv2 | 16:28 |
dmsimard | jeblair: ack | 16:28 |
*** lukebrowning has joined #openstack-infra | 16:28 | |
clarkb | fungi: ah ok sounds like we can just drop that entirely and the nodejs package will give us npm | 16:30 |
fungi | yep | 16:30 |
fungi | repo_url_suffix seems to explicitly refer to nodesource sources.list addition | 16:30 |
*** trown is now known as trown|lunch | 16:30 | |
clarkb | AJaeger_: rsync: change_dir "/home/zuul//publish-docs" failed: No such file or directory (2) now to see what the directory should be | 16:30 |
fungi | in http://git.openstack.org/cgit/openstack-infra/system-config/tree/modules/openstack_project/manifests/release_slave.pp we just use repo_url_suffix without npm_package_ensure | 16:31 |
clarkb | AJaeger_: rsync -a www/static/ publish-docs/www/ that is where it copies to publish-docs, now to see what it is relative to | 16:32 |
openstackgerrit | Sam Yaple proposed openstack-infra/openstack-zuul-jobs master: Remove legacy loci jobs https://review.openstack.org/508556 | 16:32 |
jeblair | infra-root, AJaeger_: can you keep an eye out for changes to project-config/zuul/* (eg, layout.yaml). please don't approve those. they cause full zuul v3 reconfigurations due to puppet even though they don't actually change zuulv3. | 16:32 |
AJaeger_ | clarkb: relative to working dir | 16:33 |
*** lukebrowning has quit IRC | 16:33 | |
AJaeger_ | jeblair: yeah, we should not touch layout.yaml at all anymore - that change by mriedem was too eager | 16:33 |
clarkb | AJaeger_: /home/zuul/workspace looks like | 16:33 |
clarkb | AJaeger_: so we need to update that path | 16:33 |
fungi | jeblair: full ack | 16:33 |
dmsimard | mordred: hmm, unless mistaken our patch stack (starting from 508510) is not getting enqueued to gate | 16:34 |
AJaeger_ | jeblair: I'll -1 everything that touches zuul/layout or jenkins/jobs - infra-root, let's treat these as *frozen* | 16:34 |
dmsimard | mordred: I think 508510 needs a rebase ? says the parent is outdated | 16:35 |
*** lukebrowning has joined #openstack-infra | 16:35 | |
* dmsimard rebases | 16:35 | |
AJaeger_ | clarkb: I'll prepare a change... | 16:35 |
openstackgerrit | David Moreau Simard proposed openstack-infra/openstack-zuul-jobs master: Create legacy nodesets and switch all legacy jobs https://review.openstack.org/508510 | 16:35 |
openstackgerrit | David Moreau Simard proposed openstack-infra/openstack-zuul-jobs master: Update legacy post playbooks to pull from primary https://review.openstack.org/508511 | 16:35 |
clarkb | AJaeger_: ok | 16:35 |
openstackgerrit | David Moreau Simard proposed openstack-infra/openstack-zuul-jobs master: Add missing required-projects for TripleO jobs using devstack-gate https://review.openstack.org/508548 | 16:35 |
dmsimard | ^ above stack will need fresh +W's | 16:36 |
*** panda is now known as panda|bbl | 16:36 | |
clarkb | dmsimard: what precipitated the new patchsets? | 16:37 |
clarkb | was it that merge conflict? | 16:37 |
dmsimard | clarkb: they all passed check queue but were not getting enqueued to gate | 16:37 |
*** edmondsw has quit IRC | 16:37 | |
openstackgerrit | Andreas Jaeger proposed openstack-infra/openstack-zuul-jobs master: Fix some publishing jobs https://review.openstack.org/508562 | 16:37 |
dmsimard | I gave it a good amount of time to allow for merger lag to catch up and they were still not getting enqueued, I figured it was perhaps because the parent commit on 508510 was outdated | 16:38 |
AJaeger_ | clarkb: ^ | 16:38 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Make fetch-tox-output more resilient https://review.openstack.org/508563 | 16:38 |
clarkb | dmsimard: its possible that zuul was just behind too. The queue counts at the top of the status page should give you an idea if it is caught up (queue sizes of 0) | 16:38 |
openstackgerrit | Jeremy Stanley proposed openstack-infra/puppet-openstack_health master: Don't set npm_package_ensure https://review.openstack.org/508564 | 16:38 |
clarkb | dmsimard: but could also be an out of date parent | 16:38 |
jeblair | clarkb, dmsimard: it is more likely zuul backlog | 16:39 |
mordred | clarkb, fungi, AJaeger_, jlk: https://review.openstack.org/508563 should fix the issue with failing to fetch tox logs | 16:39 |
fungi | infra-root: infra-puppet-core: 508564 should hopefully get puppet going on status.openstack.org again | 16:39 |
mordred | jeblair, dmsimard: ^^ you too | 16:39 |
*** lukebrowning has quit IRC | 16:39 | |
*** jcoufal has joined #openstack-infra | 16:40 | |
clarkb | ya looks like the backlog is ~17 minutes at this point | 16:41 |
*** lukebrowning has joined #openstack-infra | 16:41 | |
mordred | dmsimard, clarkb, jeblair: yatin has a comment on 508510 - it seems reasonable to me, but I think maybe a followup | 16:41 |
fungi | infra-root: i'm going to hand patch 507244 (the zuul status redirect) onto status.openstack.org in the interim while we wait for 508564 to merge | 16:41 |
dmsimard | mordred: I thought I fixed that | 16:42 |
*** jcoufal_ has quit IRC | 16:42 | |
dmsimard | mordred: hm, it was in one of my previous multinode patchsets but not in the current ones (which will need more rebases T_T) | 16:43 |
fungi | actually, puppet managed to apply 507244 on its own, but never got far enough into the manifest to reload apache. doing that now | 16:43 |
jeblair | mordred: either way should work i think | 16:44 |
dmsimard | mordred: it's a legit fix but can be follow-up. Naming the node that way makes it so you can't use groups['subnodes'] for example. | 16:44 |
dmsimard | I think devstack-gate uses groups['subnodes'] actually. | 16:44 |
openstackgerrit | Monty Taylor proposed openstack-infra/openstack-zuul-jobs master: Update ubuntu-xenial-2-node to match centos-7-2-node https://review.openstack.org/508568 | 16:44 |
mordred | jeblair, dmsimard: ^^ | 16:45 |
jeblair | mordred, clarkb: i'm going to go spend some significant time on the trusty variant thing. i will be incommunicado for a bit. | 16:45 |
dmsimard | mordred: yeah subnodes usage: http://git.openstack.org/cgit/openstack-infra/devstack-gate/tree/playbooks/devstack-legacy.yaml#n15 and http://git.openstack.org/cgit/openstack-infra/openstack-zuul-jobs/tree/playbooks/legacy/pre.yaml#n15 | 16:45 |
clarkb | jeblair: gl | 16:45 |
mordred | jeblair: ok. cool - and good, because I'm stumped by it | 16:45 |
*** lukebrowning has quit IRC | 16:45 | |
mordred | dmsimard: you said we're going to need another rebase? | 16:46 |
dmsimard | mordred: I already did it | 16:46 |
*** lnxnut_ has joined #openstack-infra | 16:46 | |
dmsimard | mordred: however it's not certain if it was a rebase issue or if the zuul backlog is >10 minutes | 16:46 |
dmsimard | after 10 minutes the change was still not enqueued to gate | 16:46 |
mordred | dmsimard: nod | 16:46 |
*** electrofelix has quit IRC | 16:47 | |
*** lukebrowning has joined #openstack-infra | 16:47 | |
clarkb | mordred: comment on https://review.openstack.org/#/c/508563/1 | 16:47 |
*** rtjure has quit IRC | 16:49 | |
*** jcoufal_ has joined #openstack-infra | 16:50 | |
*** mugsie has quit IRC | 16:51 | |
*** lukebrowning has quit IRC | 16:52 | |
*** jcoufal has quit IRC | 16:53 | |
*** lukebrowning has joined #openstack-infra | 16:53 | |
*** jdandrea_ has joined #openstack-infra | 16:55 | |
*** lukebrowning has quit IRC | 16:58 | |
dmsimard | infra-root: zuul/nodepool are not dequeuing fast enough to cope with the load, we're at ~175 nodes in-use right now | 16:59 |
*** lukebrowning has joined #openstack-infra | 17:00 | |
inc0 | good morning guys, minor thing https://twitter.com/OpenStackStatus <- charts are all flat since zuulv3 | 17:00 |
dmsimard | there is a bunch of nodepool capacity we're not tapping into | 17:00 |
dmsimard | inc0: that's jd_ | 17:00 |
jeblair | dmsimard: zuul is backlogged | 17:01 |
*** ykarel has quit IRC | 17:02 | |
dmsimard | jeblair: queue length has dropped though | 17:02 |
dmsimard | jeblair: unless the backlog would be elsewhere | 17:02 |
inc0 | https://twitter.com/OpenStackStatus/status/913473927996936192 <- I like this one;) | 17:02 |
jeblair | dmsimard: the system isn't stable until it hits zero | 17:02 |
*** Swami has joined #openstack-infra | 17:02 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Make fetch-tox-output more resilient https://review.openstack.org/508563 | 17:02 |
mordred | clarkb: thanks - fixed ^^ | 17:02 |
*** ralonsoh_ has quit IRC | 17:02 | |
dmsimard | jeblair: ok, anything I can do to help ? | 17:03 |
*** jascott1 has quit IRC | 17:03 | |
mordred | clarkb, fungi, dmsimard: has anybody looked in to infra-manual publishing yet? | 17:03 |
dmsimard | mordred: I haven't, what's the problem ? | 17:03 |
mordred | well - it may just be a lost-connection-to-host issue: http://logs.openstack.org/ab/a0e829e5cbd68815cf0b00687a9ac7e5228c56ab/post/publish-openstack-python-docs-infra/9ef3842/job-output.txt.gz | 17:04 |
*** lukebrowning has quit IRC | 17:04 | |
mtreinish | jeblair, mordred: any idea what's going on here: http://logs.openstack.org/58/508258/3/check/legacy-infra-puppet-apply-3/4490a3f/job-output.txt.gz#_2017-09-29_16_46_27_131506 ? | 17:05 |
openstackgerrit | Merged openstack-infra/project-config master: Zuul versions of sudo grep checks https://review.openstack.org/508313 | 17:05 |
dmsimard | mordred: yeah, looks like the host went unreachable midjob -- the SSH key removal task failed as well | 17:05 |
jeblair | dmsimard: it's not going to get better until we optimize the config loading/parsing. we had no idea what the config would look like until a few days before the cutover, so we've never seen something like this. now that we have an at-scale configuration, we can tune for it. that's going to take a few days -- after we fix all the little fires. | 17:05 |
mordred | dmsimard: oh - it also might be that the most recent infra-manual patches didn't manage to run the post job | 17:05 |
mordred | dmsimard: http://logs.openstack.org/95/95c4d1433c74ad23894f7296be51a3a23b3c6e56 doesn't exist and 95c4d1433c74ad23894f7296be51a3a23b3c6e56 is the tip .. | 17:06 |
*** lukebrowning has joined #openstack-infra | 17:06 | |
* AJaeger_ goes offline again - final leg of my journey... | 17:06 | |
mordred | fungi clarkb: feel like +3ing https://review.openstack.org/#/c/436455/2 so that we can see it trigger a post job? | 17:07 |
mordred | mtreinish: looking | 17:07 |
SamYaple | what am I doing wrong with this job removal? https://review.openstack.org/#/c/508556/ | 17:07 |
dmsimard | jeblair: ok, happy to help if it's something I can lend a hand with | 17:07 |
mordred | mtreinish: I mean - it looks like puppet-bugdaystats just isn't in required-projects (and the logging is weird/interleaved) | 17:08 |
dmsimard | SamYaple: I don't believe you can do a depends-on from a patch that is in project-config | 17:08 |
fungi | mordred: done | 17:08 |
SamYaple | dmsimard: yea i was throwing it in there to test. i think youre right | 17:08 |
mtreinish | mordred: it is though, because above it zuul-cloner pulls it: http://logs.openstack.org/58/508258/3/check/legacy-infra-puppet-apply-3/4490a3f/job-output.txt.gz#_2017-09-29_16_46_27_056891 | 17:08 |
dmsimard | SamYaple: the job content in project-config is 'trusted' which means it contains sensitive things that, if altered, could expose secrets and things like that. | 17:08 |
dmsimard | SamYaple: zuul doesn't allow to do speculative (depends-on) testing against trusted repos | 17:09 |
dmsimard | at least that's my understanding | 17:09 |
mtreinish | mordred: I've got the patch up adding it to the job definition https://review.openstack.org/508526 and that run was my patch with a depends-on on it | 17:09 |
mordred | mtreinish: nod - lemme look further then | 17:09 |
SamYaple | ok. that makes sense, but im not sure what the next step is for me and removing the legacy job | 17:10 |
mordred | mtreinish: so - if you look in http://logs.openstack.org/58/508258/3/check/legacy-infra-puppet-apply-3/4490a3f/zuul-info/inventory.yaml | 17:10 |
mordred | mtreinish: you can see the list of projects zuul thinks it should be running with | 17:10 |
*** lukebrowning has quit IRC | 17:10 | |
*** gouthamr has quit IRC | 17:10 | |
mordred | mtreinish: oh- that was a run of legacy-infra-puppet-apply-3 | 17:11 |
mtreinish | mordred: hmm, none of the things I added to required projects is there | 17:11 |
mordred | mtreinish: in https://review.openstack.org/#/c/508526/7/zuul.d/zuul-legacy-jobs.yaml legacy-infra-puppet-apply-3 does not have bugdaystats | 17:11 |
dmsimard | SamYaple: added a comment in https://review.openstack.org/#/c/508552/ | 17:11 |
mordred | mtreinish: so I think legacy-infra-puppet-apply-3 needs legacy-infra-puppet-apply-base in its base | 17:11 |
mordred | mtreinish: you'll need to move it after the legacy-infra-puppet-apply-base definition of course | 17:12 |
*** lukebrowning has joined #openstack-infra | 17:12 | |
SamYaple | ammaaazzinnng. comments from old patchsets now post | 17:12 |
dmsimard | SamYaple: you need to wait for the project-config patch to land before you can land the o-z-j repo | 17:12 |
mtreinish | mordred: oh ffs, I messed up that patch again | 17:12 |
dmsimard | SamYaple: also, you can already start adding jobs to loci, don't need to wait to remove legacy | 17:12 |
SamYaple | dmsimard: got it. i thought jeblair was saying i had to get the o-z-j patch in first | 17:13 |
SamYaple | dmsimard: the legacy job is busted and i cant merge anything. since im rewriting i just want to remove it rather than fix it | 17:13 |
openstackgerrit | Matthew Treinish proposed openstack-infra/openstack-zuul-jobs master: Add missing projects for infra puppet jobs https://review.openstack.org/508526 | 17:13 |
dmsimard | SamYaple: the project-config job definition uses content from o-z-j | 17:13 |
dmsimard | so it needs to go in first iiuc | 17:13 |
SamYaple | makes sense | 17:13 |
mordred | SamYaple: mnaser had an idea the other day that might apply well to you here (although it'll be another couple of steps to do it) | 17:14 |
openstackgerrit | Sam Yaple proposed openstack-infra/project-config master: Remove loci-jobs from project-config https://review.openstack.org/508552 | 17:14 |
SamYaple | mordred: yea i reviewed the patches, thought just purging the job was the easier route | 17:14 |
*** SumitNaiksatam has quit IRC | 17:15 | |
jd_ | dmanchad: inc0: is there a new source for that chart? :) | 17:15 |
SamYaple | because this is low-entropy low priority project thats not stable, i can be fast and loose with gating right now :) | 17:15 |
mordred | SamYaple: which is that you could move the definition of the project-template loci-jobs to one of your loci repos - then revert https://review.openstack.org/#/c/508552 - and then what jobs are in loci-jobs is under your control - but is defined in one place | 17:15 |
openstackgerrit | Merged openstack-infra/project-config master: Set rackspace launch timeout to 10m https://review.openstack.org/508378 | 17:15 |
mordred | SamYaple: cool | 17:15 |
*** jpena is now known as jpena|off | 17:16 | |
inc0 | sooo....I'll wait till SamYaple's work merges before I do the same for Kolla;) | 17:17 |
SamYaple | inc0: you probably dont want to follow my example | 17:17 |
mordred | inc0, SamYaple: :) | 17:17 |
*** lukebrowning has quit IRC | 17:17 | |
SamYaple | im nooping my gates and then redoing all of it from the ground up in loci repo | 17:17 |
mordred | inc0: mnaser did the dance yesterday - might be a good place to cargo-cult from | 17:17 |
inc0 | nah, it's not like you're stuffing yourself with strong edibles;) | 17:17 |
*** ykarel has joined #openstack-infra | 17:17 | |
SamYaple | yea inc0, you want what mnaser did | 17:17 |
SamYaple | inc0: are you calling me fat? | 17:18 |
inc0 | big boned | 17:18 |
dmsimard | mnaser's patch for puppet-openstack zuul v3 things is here: https://review.openstack.org/#/c/508296/ | 17:18 |
*** lukebrowning has joined #openstack-infra | 17:18 | |
inc0 | but I was referring to different kind of edibles | 17:18 |
openstackgerrit | Sam Yaple proposed openstack-infra/openstack-zuul-jobs master: Remove legacy loci jobs https://review.openstack.org/508556 | 17:18 |
SamYaple | inc0: i know ;) | 17:18 |
mordred | inc0: also, I landed a patch for infra-manual on this: https://review.openstack.org/#/c/508295/ - infra-manual publishing is in flux atm | 17:19 |
*** mugsie has joined #openstack-infra | 17:19 | |
inc0 | cool, I'll read through that, thanks | 17:19 |
inc0 | also, did you get secrets sorted out already? | 17:19 |
mordred | inc0: oh yah- secrets totally work and we're using the heck out of them | 17:20 |
*** yamamoto has joined #openstack-infra | 17:20 | |
inc0 | so I'll need help with that and registry deployment | 17:20 |
*** ekcs has joined #openstack-infra | 17:21 | |
mordred | inc0: https://docs.openstack.org/infra/zuul/feature/zuulv3/user/config.html#secret is the section of the zuul docs about them | 17:21 |
inc0 | so....let me know mordred when I'll be able to borrow your brain for few minutes to discuss that:) | 17:21 |
mordred | inc0: and yes - as soon as the current issues have settled down my brain is yours | 17:21 |
inc0 | haha, it's like issues ever settles;) anyway, I'll keep pinging | 17:22 |
SamYaple | dont be greedy! | 17:22 |
SamYaple | cat mordred_brain > paste.openstack.org | 17:22 |
SamYaple | simple | 17:22 |
inc0 | I don't think disks on paste.o.o can handle this amount of stuff | 17:23 |
*** lukebrowning has quit IRC | 17:23 | |
SamYaple | i think is like a 200 line limit. thats plenty | 17:24 |
mordred | inc0: what - you're saying we can't write None to paste.o.o ;) | 17:24 |
SamYaple | haha same wavelength there | 17:24 |
*** lukebrowning has joined #openstack-infra | 17:24 | |
inc0 | anyway, I'll leave you guys to zuulv3 and I'll start bugging you early next week | 17:25 |
inc0 | thank you! | 17:25 |
*** yamamoto has quit IRC | 17:25 | |
SpamapS | inc0: I'm curious what you mean by registry deployment. | 17:26 |
honza | We seem to be having issues with the new legacy-tripleo-ci-* jobs. It's as if the job isn't even run (no console.html | 17:26 |
mordred | neat! I can chromecast a browser tab to my TV | 17:26 |
mordred | SpamapS: docker docker docker | 17:26 |
SpamapS | oh cool, like, as a post job to upload to docker? | 17:26 |
honza | e.g. http://logs.openstack.org/36/508536/1/check/legacy-tripleo-ci-centos-7-undercloud-oooq/dd23de8/ | 17:26 |
inc0 | SpamapS: a bit more to tha | 17:26 |
inc0 | t | 17:27 |
inc0 | semi-official registry ran in infra where our images will be published | 17:27 |
mordred | SpamapS: well - that too - first step is just getting a local registry run in infra that docker build jobs can push stuff to so that we're not copying tarball exports around | 17:27 |
honza | It fails to set up the workspace, I guess | 17:27 |
honza | rsync: change_dir "/home/zuul/src/*/openstack/ceilometer" failed: No such file or directory (2) | 17:27 |
honza | How can I debug this? | 17:27 |
mordred | honza: looking | 17:27 |
inc0 | so kolla deploy gates will have some place to pull stable images | 17:27 |
frickler | honza: job-output.txt.gz is the new console.html | 17:27 |
cloudnull | ^ seeing something similar http://logs.openstack.org/03/508503/2/check/legacy-openstack-ansible-openstack-ansible-aio/cc4d0d6/job-output.txt.gz#_2017-09-29_17_17_14_073816 I think | 17:27 |
honza | ah! | 17:28 |
inc0 | then, afterwards, we'll pull images and push to dockerhub on daily basis | 17:28 |
mordred | honza: 2017-09-29 15:17:28.848212 | centos-7 | cat: /etc/nodepool/primary_node_private: No such file or directory | 17:28 |
mordred | http://logs.openstack.org/36/508536/1/check/legacy-tripleo-ci-centos-7-undercloud-oooq/dd23de8/job-output.txt.gz#_2017-09-29_15_17_28_848212 | 17:28 |
*** tosky has quit IRC | 17:29 | |
mordred | we're not writing out /etc/nodepool/primary_node_private on single-node jobs | 17:29 |
*** rbrndt has quit IRC | 17:29 | |
*** lukebrowning has quit IRC | 17:29 | |
openstackgerrit | Merged openstack-infra/project-config master: Disable merge-check pipeline https://review.openstack.org/508371 | 17:29 |
mordred | \o/ that'll help | 17:29 |
mordred | honza: is the info in there something you need on single-node jobs too? | 17:30 |
honza | mordred: to be honest, i don't even know what that is for | 17:30 |
*** lukebrowning has joined #openstack-infra | 17:31 | |
mordred | honza: well on multi-node jobs /etc/nodepool/primary_node_private is how you find the address you want to use for intra-cloud traffic to talk to the 'primary' node | 17:31 |
mordred | honza: (thus why it's not generally relevant for single-node jobs) ... one sec and I'll take a peek at that job itself | 17:32 |
mordred | cloudnull: your issue is different- you are missing openstack/ansible-hardening from your required-projects list | 17:32 |
cloudnull | ah ! | 17:32 |
jeblair | clarkb, mordred: i've reproduced the trusty issue in local (enormous) test. it's definitely looking like a zuul bug. | 17:33 |
*** gouthamr has joined #openstack-infra | 17:34 | |
honza | mordred: are you sure that's the cause of the error? the run continues with a SUCCESS after that, and then it seems to actually fail on the /home/zuul/workspace/devstack-gate/functions.sh: line 180: declare: gate_hook: not found line | 17:34 |
honza | mordred: or, at least that's the error closest to the FAILURE line | 17:35 |
honza | FAILED* | 17:35 |
*** lukebrowning has quit IRC | 17:35 | |
*** florianf has quit IRC | 17:36 | |
*** lukebrowning has joined #openstack-infra | 17:37 | |
*** edmondsw has joined #openstack-infra | 17:37 | |
*** michaelxin has quit IRC | 17:39 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add helpful error message about required-projects https://review.openstack.org/508576 | 17:40 |
*** esberglu has quit IRC | 17:40 | |
mordred | jeblair: excellent news! | 17:40 |
*** esberglu has joined #openstack-infra | 17:40 | |
mordred | honza: I'm not 100% certain - I shall now look at your job content | 17:40 |
*** dhellmann has quit IRC | 17:41 | |
honza | mordred: thanks! | 17:41 |
*** esberglu has quit IRC | 17:41 | |
mordred | cloudnull, clarkb, fungi: ^^ https://review.openstack.org/508576 - helpful message about required-projects | 17:41 |
*** jascott1 has joined #openstack-infra | 17:41 | |
*** esberglu has joined #openstack-infra | 17:41 | |
*** dave-mccowan has quit IRC | 17:41 | |
*** lukebrowning has quit IRC | 17:41 | |
*** esberglu has quit IRC | 17:42 | |
*** esberglu has joined #openstack-infra | 17:42 | |
*** esberglu has quit IRC | 17:42 | |
*** michaelxin has joined #openstack-infra | 17:42 | |
*** esberglu has joined #openstack-infra | 17:42 | |
openstackgerrit | Julie Pichon proposed openstack-infra/project-config master: Adjust branches for OSC jobs https://review.openstack.org/503500 | 17:43 |
*** ihrachys has quit IRC | 17:43 | |
*** SumitNaiksatam has joined #openstack-infra | 17:43 | |
*** ihrachys has joined #openstack-infra | 17:43 | |
cloudnull | thanks mordred | 17:44 |
*** thorst has quit IRC | 17:45 | |
mordred | honza: aha! SOOOO | 17:45 |
honza | :) | 17:46 |
*** dhellmann has joined #openstack-infra | 17:46 | |
*** lukebrowning has joined #openstack-infra | 17:47 | |
*** esberglu has quit IRC | 17:47 | |
mordred | honza: you are referencing a file: /opt/stack/new/tripleo-ci/toci_gate_test.sh in your gate_hook | 17:47 |
mordred | honza: that is from a repo that isn't in required-projects OR PROJECTS | 17:49 |
*** trown|lunch is now known as trown | 17:49 | |
mordred | now - that may not be the actual issue - still looking ... | 17:50 |
SamYaple | can i get an +2+W on https://review.openstack.org/#/c/508552/ when someone gets a chance to unblock my gates. please and thank you :) | 17:50 |
*** tosky has joined #openstack-infra | 17:51 | |
*** lukebrowning has quit IRC | 17:52 | |
*** ykarel has quit IRC | 17:52 | |
openstackgerrit | Alex Kavanagh proposed openstack-infra/project-config master: Change the docs job to a deploy-publish-job https://review.openstack.org/508298 | 17:53 |
clarkb | mordred: 508563 lgtm now | 17:53 |
*** lukebrowning has joined #openstack-infra | 17:53 | |
*** rtjure has joined #openstack-infra | 17:53 | |
honza | mordred: https://github.com/openstack-infra/devstack-gate/blob/master/devstack-vm-gate-wrap.sh#L93 | 17:54 |
mordred | honza: can you add Depends-On: I9cdc182ac5800e1566c04e6f21e454956d82ad33 to that patch? (there are several things in the stack ending at https://review.openstack.org/#/c/508548 that will, I think help that job - and it'll be good to see how it does once we've applied those fixes) | 17:54 |
openstackgerrit | Michael Johnson proposed openstack-infra/openstack-zuul-jobs master: Add missing horizon project for octavia-dashboard https://review.openstack.org/508579 | 17:54 |
openstackgerrit | Merged openstack-infra/openstack-zuul-jobs master: Create legacy nodesets and switch all legacy jobs https://review.openstack.org/508510 | 17:54 |
openstackgerrit | Merged openstack-infra/openstack-zuul-jobs master: Update legacy post playbooks to pull from primary https://review.openstack.org/508511 | 17:54 |
openstackgerrit | Merged openstack-infra/openstack-zuul-jobs master: Add missing required-projects for TripleO jobs using devstack-gate https://review.openstack.org/508548 | 17:54 |
*** ihrachys has quit IRC | 17:54 | |
*** lnxnut_ has left #openstack-infra | 17:55 | |
*** ihrachys has joined #openstack-infra | 17:55 | |
mordred | honza: nevermind what I said above - you can try re-checking now that that^^ has landed | 17:55 |
honza | mordred: excellent! | 17:55 |
*** david-lyle has quit IRC | 17:56 | |
*** david-lyle has joined #openstack-infra | 17:56 | |
mordred | clarkb: the job for 436455 seems hung | 17:57 |
honza | mordred: thanks for the quick help, much appreciated | 17:57 |
mordred | honza: sure thing! | 17:57 |
*** jpich has quit IRC | 17:58 | |
*** lukebrowning has quit IRC | 17:58 | |
openstackgerrit | Matt Riedemann proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-tempest-dsvm-nnet >newton jobs https://review.openstack.org/508520 | 17:58 |
openstackgerrit | Matt Riedemann proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-tempest-dsvm-nnet https://review.openstack.org/508519 | 17:58 |
clarkb | mordred: weird, maybe its that ssh timeout thing? | 17:58 |
clarkb | mordred: so we are waiting for ssh to fail? | 17:58 |
*** lukebrowning has joined #openstack-infra | 17:59 | |
mordred | I dunno - I actually don't see any mention on the executors that it's doing anything | 18:02 |
*** camunoz has joined #openstack-infra | 18:03 | |
clarkb | mordred: do you want to address fungi's comment at https://review.openstack.org/#/c/508576/1/roles/fetch-zuul-cloner/templates/zuul-cloner-shim.py.j2 ? | 18:03 |
mordred | clarkb: I do! | 18:03 |
*** lukebrowning has quit IRC | 18:04 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add helpful error message about required-projects https://review.openstack.org/508576 | 18:04 |
fungi | it was purely cosmetic, but i expect this will be showing up in a bunch of job logs so better to be crystal clear | 18:04 |
mordred | yah | 18:05 |
fungi | though now you have whitespace characters on otherwise empty lines, it looks like | 18:05 |
*** lukebrowning has joined #openstack-infra | 18:06 | |
fungi | you shame the whitespace gods with your brazen blasphemy | 18:06 |
mordred | fungi: shall I fix the blasphemy? | 18:10 |
*** lukebrowning has quit IRC | 18:10 | |
*** robled has quit IRC | 18:10 | |
mordred | I'll need to for the pep8 gods won't I? | 18:10 |
clarkb | probably | 18:10 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add helpful error message about required-projects https://review.openstack.org/508576 | 18:10 |
mordred | clarkb: ok - so - here's what I've got | 18:10 |
fungi | mordred: if pep7 is checking that script, then yeah | 18:11 |
fungi | er, pep8 | 18:11 |
mordred | clarkb, fungi: I did "grep 436455,2 debug.log | grep Execute" on the scheduler | 18:11 |
mordred | which gave me: | 18:11 |
mordred | 2017-09-29 17:29:54,493 INFO zuul.ExecutorClient: Execute job build-openstack-sphinx-docs (uuid: 752c4e6329c84ad98fd34837f56611d9) on nodes <NodeSet OrderedDict([('ubuntu-xenial', <Node 0000060407 ubuntu-xenial:ubuntu-xenial>)])OrderedDict()> for change <Change 0x7fc64ae694a8 436455,2> with dependent changes [] | 18:11 |
mordred | the uuid 752c4e6329c84ad98fd34837f56611d9 is important | 18:11 |
*** lukebrowning has joined #openstack-infra | 18:12 | |
mordred | I then did: ansible 'ze0*' -m shell -a 'grep 752c4e6329c84ad98fd34837f56611d9 /var/log/zuul/executor-debug.log' | 18:12 |
mordred | and found that job running on ze03 | 18:12 |
clarkb | AJaeger_: are you still around? there is a linting problem with your publish-docs fix | 18:12 |
clarkb | AJaeger_: I'll push the fix if you are already afk for the day | 18:12 |
*** robled has joined #openstack-infra | 18:12 | |
*** robled has quit IRC | 18:12 | |
*** robled has joined #openstack-infra | 18:12 | |
mordred | on ze03, it has done all the cloning | 18:12 |
mordred | and created the workdir and everything | 18:13 |
mordred | but does not seem to be running ansible | 18:13 |
mordred | in fact, ze03 is not running ANY ansible | 18:13 |
fungi | clarkb: it sounded like AJaeger_ was headed into a travel blackhole for a while | 18:14 |
fungi | mordred: ze03 is having a bad problem and will not go to space today | 18:15 |
fungi | ? | 18:15 |
clarkb | ok /me pushes fix | 18:15 |
*** boris_42_ has joined #openstack-infra | 18:15 | |
boris_42_ | Hi there | 18:16 |
openstackgerrit | Clark Boylan proposed openstack-infra/openstack-zuul-jobs master: Fix some publishing jobs https://review.openstack.org/508562 | 18:16 |
*** lukebrowning has quit IRC | 18:16 | |
clarkb | mordred: ^ that should fix an error with docs publishing that AJaeger_ ran into | 18:16 |
* clarkb stops bothering mordred so that ze03 can be debugged | 18:16 | |
boris_42_ | Is there a way I can help with fixing Rally jobs that are failing after upgrading to zuul v3 | 18:16 |
mnaser | if things arent on fire - https://review.openstack.org/#/c/508333/ | 18:16 |
mnaser | just need a +A to remove project templates which aren't prefixed with legacy- so we can move in-repo | 18:17 |
fungi | boris_42_: if you have a link to one of the failing jobs, we can probably tell you whether one of the changes in flight is expected to fix that or maybe help you pinpoint what needs adjusting where | 18:18 |
*** lukebrowning has joined #openstack-infra | 18:18 | |
mordred | clarkb: it's looking like the executor process is stuck in a read() call | 18:18 |
clarkb | mordred: you should be able to use lsof to figure out what the fd is | 18:19 |
openstackgerrit | Tim Burke proposed openstack-infra/project-config master: legacy-swift-dsvm-functional should be voting https://review.openstack.org/508585 | 18:21 |
clarkb | we've also got ssh agents that are 10 hours old according to ps | 18:22 |
clarkb | on ze03 | 18:22 |
*** yamamoto has joined #openstack-infra | 18:22 | |
*** lukebrowning has quit IRC | 18:22 | |
clarkb | I think we might be leaking those ssh-agents | 18:24 |
*** lukebrowning has joined #openstack-infra | 18:24 | |
*** yamamoto has quit IRC | 18:26 | |
*** lukebrowning has quit IRC | 18:29 | |
*** rossella_s has joined #openstack-infra | 18:29 | |
*** rbrndt has joined #openstack-infra | 18:30 | |
mordred | clarkb: ok. I'm coming up stumped | 18:30 |
fungi | the number of ssh-agent processes on ze06 and 07 is similarly large, the rest are around 100-ish | 18:30 |
*** lukebrowning has joined #openstack-infra | 18:30 | |
*** jdandrea_ has quit IRC | 18:31 | |
fungi | and yeah, ssh-agent processes on ze07 also date back to ~10 hours ago | 18:31 |
jeblair | mordred, clarkb: try a thread dump? | 18:31 |
fungi | no ansible processes there either | 18:31 |
fungi | same for 06 | 18:32 |
fungi | so 03, 06 and 07 all seem to be in the same boat | 18:33 |
fungi | stale ssh-agent processes as old as 10 hours, no ansible processes | 18:33 |
mordred | jeblair, clarkb: remind me how to do a thread dump? | 18:35 |
*** lukebrowning has quit IRC | 18:35 | |
jeblair | sigusr2 | 18:35 |
*** lukebrowning has joined #openstack-infra | 18:37 | |
fungi | and then it'll appear in the debug log | 18:37 |
openstackgerrit | Kendall Nelson proposed openstack-infra/storyboard master: Add Test Migration Directions https://review.openstack.org/502509 | 18:38 |
boris_42_ | @fungi so there are in gate pipeline 2 patches 507276 | 18:38 |
boris_42_ | fungi: they have already failed jobs | 18:38 |
jeblair | i'm going to do the stack dump on ze03 | 18:38 |
fungi | boris_42_: thanks, taking a peek now | 18:39 |
*** rossella_s has quit IRC | 18:39 | |
mordred | fungi: ok. I have done poorly at this on 03 and 06 - would you mind doing it for me on 07? I'm having a brain-sad at the moment and don't want to lose things on all three nodes | 18:39 |
mordred | jeblair: do 07 | 18:39 |
boris_42_ | fungi: thank you | 18:39 |
mordred | jeblair: I borked 3 and 6 | 18:39 |
mordred | jeblair: because somehow I can't do basic unix today | 18:39 |
jeblair | okay i'll do 7 | 18:39 |
jeblair | before i do that | 18:40 |
*** nikhil has joined #openstack-infra | 18:40 | |
jeblair | i notice that both of those hosts are running | 18:40 |
jeblair | zuul 28614 0.0 0.4 192192 32908 ? S 08:37 0:00 git-remote-https origin https://git.openstack.org/openstack/glance-specs | 18:40 |
jeblair | seems plausible that's the read they're stuck on | 18:40 |
mordred | yah. I agree | 18:40 |
*** rossella_s has joined #openstack-infra | 18:40 | |
jeblair | until we figure out how to time that out, can probably just kill that | 18:40 |
mordred | jeblair: k. want me to do that then clean up after myself on 03 and 06 while you look at 7? | 18:41 |
jeblair | also, once again, there are 3 zuul-executor processes running on 07 | 18:41 |
jeblair | there should only be 2 | 18:41 |
mordred | excellent | 18:41 |
*** lukebrowning has quit IRC | 18:41 | |
jeblair | it would be nice to know what happened at 03:19 to cause that | 18:42 |
mriedem | andreaf: jeblair: https://review.openstack.org/#/c/508519/ and below are rebased and passed CI | 18:42 |
mriedem | those are blocking nova so would be sweet if we could get them in | 18:43 |
mriedem | AJaeger_: ^ | 18:43 |
mriedem | sorry andreaf | 18:43 |
fungi | we were still merging job configuration changes up to/around 03:19 | 18:43 |
mordred | mriedem: +2 from me | 18:44 |
fungi | but no restarts that i'm aware of | 18:44 |
mriedem | thanks | 18:44 |
jeblair | interesting... it's a child of the main proc | 18:44 |
jeblair | or rather, a child of the child | 18:44 |
jeblair | i wonder if we have a fork sneaking in we don't know about | 18:45 |
mordred | jeblair: you still want 03 and 06 or you want me to go ahead and clean up there? | 18:45 |
jeblair | mordred: go for it | 18:45 |
*** thorst has joined #openstack-infra | 18:45 | |
jeblair | yeah, it's the git command; the merger thread is holding the lock on the git repos while running that | 18:47 |
*** lukebrowning has joined #openstack-infra | 18:47 | |
jeblair | 07 seems unstuck now | 18:48 |
jeblair | after i killed the leaf git process | 18:48 |
mordred | 03 is restarted and running properly now | 18:48 |
openstackgerrit | Tim Burke proposed openstack-infra/project-config master: Make legacy-swift-tox-xfs-tmp-func-ec voting https://review.openstack.org/508589 | 18:49 |
Shrews | jeblair: one child should be the LogStreamer (which then forks children for finger requests) | 18:49 |
fungi | firefox on my workstation can no longer handle the zuulv3 status page. keeps complaining about jquery running too long | 18:49 |
*** thorst has quit IRC | 18:50 | |
Shrews | fungi: fine under chrome. odd | 18:50 |
jeblair | Shrews: yeah, the child i'm expecting is the logstreamer; the one i'm not expecting is one of its children. i don't know why there often seems to be exactly one of those after some random time period. | 18:50 |
jeblair | fungi: yeah, the new status page is way less efficient than the old. | 18:51 |
*** hasharAway has quit IRC | 18:51 | |
*** hashar has joined #openstack-infra | 18:52 | |
*** lukebrowning has quit IRC | 18:52 | |
mordred | jeblair: ok- 06 is restarted - but doesn't seem to be taking on jobs | 18:52 |
SamYaple | fungi: working fine for me. what version of FF? | 18:52 |
jeblair | mordred: give it a minute? zuul is busy reloading its config | 18:53 |
mordred | there it goes | 18:53 |
*** lukebrowning has joined #openstack-infra | 18:54 | |
fungi | SamYaple: 52.3.0 but i expect it's gotten crufty. i haven't cleared my preferences since years and should probably start fresh at some point | 18:54 |
*** jascott1 has quit IRC | 18:55 | |
jeblair | clarkb, mordred: regarding the trusty bug -- i can say it's related to the use of project templates, and there's no reason it should be limited to nodesets. it's looking similar to what clarkb was surmising -- when project templates are used together on a project, they seem to somehow be modified and combined so that they apply to later projects. it's complicated and i still haven't discovered the mechanism, but thought the ... | 18:55 |
jeblair | ... additional potential error symptoms may be useful. | 18:55 |
SamYaple | fungi: ah. i tested on 54 and 55 (55 is my main) and it appears to be working fine | 18:56 |
Shrews | jeblair: yuck | 18:56 |
*** dizquierdo has quit IRC | 18:57 | |
fungi | boris_42_: looks like you probably need to add openstack/dib-utils (and probably others?) to this job: http://logs.openstack.org/76/507276/5/gate/legacy-rally-dsvm-keystone-v2api-rally/044bb96/logs/devstack-gate-setup-workspace-new.txt | 18:57 |
jeblair | i need to grab some lunch; i'll pick this up again afterwards. | 18:58 |
fungi | boris_42_: in its required-projects list | 18:58 |
*** lukebrowning has quit IRC | 18:58 | |
fungi | boris_42_: and this one probably needs openstack/rally in required-projects? hard to tell since the job isn't collecting the setup-workspace log: http://logs.openstack.org/76/507276/5/gate/legacy-rally-dsvm-cli/d2adbe9/job-output.txt.gz#_2017-09-29_17_55_21_873727 | 18:59 |
kfox1111 | all the kolla-kubernetes jobs are broken at the moment. :/ | 19:00 |
kfox1111 | any idea what this means? http://logs.openstack.org/65/508565/1/check/legacy-kolla-kubernetes-deploy-centos-binary-2-external-ovs/70f5b6a/job-output.txt.gz | 19:00 |
kfox1111 | seems like it breaks before the job even starts. | 19:00 |
*** lukebrowning has joined #openstack-infra | 19:00 | |
fungi | boris_42_: and this one looks like maybe we translated some of the shell script fragment incorrectly? /bin/sh is having trouble parsing it: http://logs.openstack.org/76/507276/5/gate/legacy-rally-dsvm-verify-light-discover-resources/f455652/job-output.txt.gz#_2017-09-29_17_55_14_381078 | 19:01 |
*** lbragstad has quit IRC | 19:03 | |
fungi | kfox1111: looks like it's probably missing openstack/requirements in the required-projects list for that job: http://logs.openstack.org/65/508565/1/check/legacy-kolla-kubernetes-deploy-centos-binary-2-external-ovs/70f5b6a/job-output.txt.gz#_2017-09-29_17_18_27_585936 | 19:03 |
fungi | mordred has a patch up to make the error condition there a lot more user-friendly | 19:04 |
kfox1111 | I tweak that in project-config? | 19:04 |
Shrews | kfox1111: see http://lists.openstack.org/pipermail/openstack-dev/2017-September/122880.html and the linked etherpad. i think that covers your exact situation | 19:04 |
Shrews | instructions in that etherpad | 19:04 |
*** lukebrowning has quit IRC | 19:04 | |
kfox1111 | ok. cool. thanks. | 19:05 |
kfox1111 | oh. there is a single ps that will fix it for everyone? | 19:06 |
*** lukebrowning has joined #openstack-infra | 19:06 | |
mnaser | hey folks, is there a known issue with debian-jessie images? | 19:06 |
mnaser | http://logs.openstack.org/33/508333/1/gate/base-integration-debian-jessie/6290c4b/job-output.txt.gz | 19:07 |
mnaser | "No space left on device" | 19:07 |
*** lbragstad has joined #openstack-infra | 19:07 | |
mnaser | (this ran on our cloud, could it be possible resize2fs or whatever didnt do its thing?) | 19:08 |
Shrews | kfox1111: no, no single fix | 19:08 |
kfox1111 | oh. ok. thanks. | 19:09 |
jeblair | clarkb, mordred, Shrews: oh! i found the trusty issue. the problem and fix are both as subtle as you might expect. i need to write some tests and clean some stuff up before pushing up a patch after lunch. | 19:10 |
jeblair | but we should be able to expect that to be in production within a couple of hours | 19:10 |
mtreinish | mordred, jeblair: ok, now it's a new failure mode for the puppet jobs: http://logs.openstack.org/58/508258/3/check/legacy-infra-puppet-apply-3/3b271d2/job-output.txt.gz#_2017-09-29_17_47_58_633109 | 19:11 |
*** lukebrowning has quit IRC | 19:11 | |
kfox1111 | Shrews: https://review.openstack.org/#/c/508460/1/zuul.d/zuul-legacy-jobs.yaml looks like its tweaking it for all legacy projects? | 19:12 |
*** esberglu has joined #openstack-infra | 19:12 | |
*** esberglu has quit IRC | 19:12 | |
*** lukebrowning has joined #openstack-infra | 19:12 | |
*** esberglu has joined #openstack-infra | 19:13 | |
Shrews | kfox1111: that's fixing the legacy-requirements job. the job you mentioned above (legacy-kolla-kubernetes-deploy-centos-binary-2-external-ovs) does not use that as a parent. | 19:15 |
Shrews | kfox1111: that job is defined here: https://git.openstack.org/cgit/openstack-infra/openstack-zuul-jobs/tree/zuul.d/zuul-legacy-jobs.yaml#n4081 | 19:16 |
*** dhajare has joined #openstack-infra | 19:16 | |
*** lukebrowning has quit IRC | 19:17 | |
*** esberglu has quit IRC | 19:17 | |
*** lukebrowning has joined #openstack-infra | 19:19 | |
kfox1111 | ah. ok. so I just need to add the same thing to all of our jobs. | 19:19 |
Shrews | kfox1111: so you would add the required-projects there. Or if you need it on all of the kolla jobs, do something like what https://review.openstack.org/#/c/508281 does and create a new base job for them, and add it there | 19:19 |
kfox1111 | thanks. | 19:19 |
*** jascott1 has joined #openstack-infra | 19:19 | |
fungi | mnaser: our images should be using the growroot element: http://git.openstack.org/cgit/openstack-infra/project-config/tree/nodepool/nodepool.yaml#n1051 | 19:20 |
fungi | mnaser: http://git.openstack.org/cgit/openstack/diskimage-builder/tree/diskimage_builder/elements/growroot | 19:20 |
smcginnis | fungi: Is there a restart still forthcoming? | 19:20 |
clarkb | jeblair: awesome. I'm taking advantage of lunch time to also cook dinner | 19:20 |
mnaser | fungi interesting, not sure why it did that... :x | 19:20 |
fungi | smcginnis: i expect so once jeblair has a patch for the inheritance issue | 19:21 |
fungi | smcginnis: probably in a couple hours | 19:21 |
smcginnis | fungi: ack, thanks | 19:21 |
*** ijw has quit IRC | 19:21 | |
fungi | mnaser: it's software, so... bugs? | 19:21 |
fungi | mnaser: we'd need logs from early boot (which may be in the journal if we're collecting it) | 19:22 |
*** yamamoto has joined #openstack-infra | 19:23 | |
openstackgerrit | Monty Taylor proposed openstack-infra/infra-manual master: Update project creators guide with zuul v3 information https://review.openstack.org/508596 | 19:23 |
*** lukebrowning has quit IRC | 19:23 | |
mordred | fungi, clarkb:^^ I did a followup to AJaeger_'s patch there | 19:23 |
*** lukebrowning has joined #openstack-infra | 19:25 | |
openstackgerrit | Kevin Fox proposed openstack-infra/openstack-zuul-jobs master: Fix Kolla-Kubernetes missing deps. https://review.openstack.org/508597 | 19:28 |
*** yamamoto has quit IRC | 19:28 | |
kfox1111 | Shrews: how does that look? | 19:28 |
Shrews | kfox1111: looks about right :) we'll see what zuul says | 19:29 |
mnaser | is there a big queue or an issue? (taking almost 40-50 minutes for jobs to start?) | 19:30 |
mordred | kfox1111: lgtm | 19:30 |
*** lukebrowning has quit IRC | 19:30 | |
mordred | Shrews: do you know if we're stuck in happy fun land again? | 19:30 |
*** e0ne has joined #openstack-infra | 19:30 | |
kfox1111 | ok. cool. thanks. :) | 19:31 |
Shrews | mordred: i do not know. i can check np again | 19:31 |
*** lukebrowning has joined #openstack-infra | 19:31 | |
fungi | nodepool hasn't run out of space on / yet at least | 19:32 |
Shrews | mordred: fungi: i see nodepool processing requests, but there are A LOT of requests | 19:33 |
Shrews | that list seems to be slowly declining | 19:33 |
mordred | Shrews, fungi, clarkb, jeblair: there are also a set of jobs at the top of the queues that seem to each have one job that seems somewhat stuck or lost | 19:34 |
inc0 | hey, https://review.openstack.org/#/c/508544/ <- patch from 4hrs ago and zuul still queues it :( | 19:34 |
*** thorst has joined #openstack-infra | 19:35 | |
fungi | inc0: after the zuul restart it made it into the check pipeline a little over an hour ago | 19:35 |
fungi | inc0: but yes, we're seeing a significant delay in node assignments to jobs too | 19:35 |
*** lukebrowning has quit IRC | 19:36 | |
inc0 | no worries, just informing you:) | 19:36 |
mordred | jeblair: we have 5 executor processes running on ze08 | 19:36 |
inc0 | let me know if there is anything I can do to help | 19:36 |
*** lukebrowning has joined #openstack-infra | 19:37 | |
Shrews | mordred: if i knew how to map review # to node requests #, i'd be able to look into if they're waiting for nodepool or not. but i do not know how to do that | 19:38 |
Shrews | that might actually be a good enhancement to the zk data model | 19:38 |
*** jcoufal has joined #openstack-infra | 19:39 | |
mordred | Shrews: for now - if you grep for the change in the scheduler debug log and then for NodeRequest ... | 19:39 |
mordred | Shrews: like: "grep 508505,4 /var/log/zuul/debug.log | grep NodeRequest" | 19:39 |
mordred | Shrews: which will return things like: | 19:40 |
mordred | 2017-09-29 19:36:11,316 INFO zuul.IndependentPipelineManager: Completed node request <NodeRequest 100-0000065636 <NodeSet devstack-single-node OrderedDict([('primary', <Node 0000063610 primary:ubuntu-xenial>)])OrderedDict()>> for job legacy-tempest-dsvm-neutron-full of item <QueueItem 0x7fc5583f1240 for <Change 0x7fc71cbb4f98 508505,4> in check> with nodes <NodeSet devstack-single-node | 19:40 |
mordred | OrderedDict([('primary', <Node 0000063610 primary:ubuntu-xenial>)])OrderedDict()> | 19:40 |
*** jcoufal_ has quit IRC | 19:41 | |
Shrews | ah, handy | 19:41 |
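
The two-stage grep above can also be sketched in Python, which makes it easy to pull out just the request ids. This is a minimal sketch, not part of zuul: the function name `node_requests_for_change` is hypothetical, and the pattern only assumes the `<NodeRequest ID ...` shape visible in the sample log line.

```python
import re

# Matches the "<NodeRequest ID" token as it appears in the sample line above.
NODE_REQUEST_RE = re.compile(r"<NodeRequest (\S+)")


def node_requests_for_change(lines, change):
    """Return NodeRequest ids found on log lines that mention `change`.

    `change` is a plain substring such as "508505,4", mirroring
    `grep 508505,4 ... | grep NodeRequest`.
    """
    found = []
    for line in lines:
        if change not in line:
            continue
        for request_id in NODE_REQUEST_RE.findall(line):
            if request_id not in found:  # keep first-seen order, no duplicates
                found.append(request_id)
    return found
```

Like the grep, this is a substring match, so a change number that happens to be a suffix of a longer one would also match.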
*** jtomasek has quit IRC | 19:42 | |
mordred | Shrews, jeblair: also - we have four zuul-executor processes on ze05 | 19:42 |
*** lukebrowning has quit IRC | 19:42 | |
*** camunoz has quit IRC | 19:43 | |
*** lukebrowning has joined #openstack-infra | 19:43 | |
mordred | on both ze05 and ze08 - the extra processes are subprocesses of the one child process we expect - and in both cases they are looped reading the console-log file | 19:44 |
mordred | and the jobs for which they are stuck reading the console-log file are the jobs I'm seeing at the top of the queues that are hung | 19:44 |
*** mat128 has quit IRC | 19:46 | |
Shrews | mordred: do those console logs still exist? | 19:46 |
mordred | Shrews: yes | 19:47 |
mordred | OK ... | 19:48 |
Shrews | mordred: seems to imply zuul hasn't completed doing "something" for those jobs | 19:48 |
mordred | so there are two things running at the same time | 19:48 |
Shrews | which is the edge of my knowledge | 19:48 |
openstackgerrit | Kevin Carter (cloudnull) proposed openstack-infra/openstack-zuul-jobs master: Add openstack-ansible required-projects parent job https://review.openstack.org/508281 | 19:48 |
mordred | like - there are two identical ansible-playbook processes both running the same pre-playbook | 19:49 |
*** rossella_s has quit IRC | 19:49 | |
mordred | Shrews, jeblair, SpamapS: http://paste.openstack.org/show/622341/ | 19:50 |
*** rossella_s has joined #openstack-infra | 19:51 | |
mordred | I know that the multiple of anecdote isn't data - but I believe we've seen a few times now cases where things that seem hung wind up having an extra executor | 19:51 |
Shrews | oh, they really ARE identical | 19:52 |
mordred | yah- and this is a pattern that we've seen before but have never been able to understand | 19:52 |
Shrews | i can't even begin to speculate as to a cause for such a thing | 19:53 |
kfox1111 | thanks for the help. :) | 19:53 |
Shrews | kfox1111: no problem! sorry for the hassles | 19:53 |
clarkb | mordred: Shrews possible two nodes got the job? | 19:54 |
kfox1111 | hmmm. still similar problem: http://logs.openstack.org/43/471843/3/check/legacy-kolla-kubernetes-deploy-centos-binary-2-ceph/df4b6e5/job-output.txt.gz#_2017-09-29_18_46_52_644460 | 19:55 |
kfox1111 | Shrews: no worries. zuul3 was a huge change. thanks for working on it. :) | 19:56 |
*** vhosakot has quit IRC | 19:57 | |
Shrews | kfox1111: that change either needs to wait for your fix to merge, or else it needs to depend on your fix | 19:57 |
kfox1111 | oh. I thought it did merge.... guess I didn't verify it actually made it through though... | 19:58 |
kfox1111 | sorry. probably jumped the gun. | 19:58 |
Shrews | node request list has dropped from ~1300 to ~900 | 20:01 |
kfox1111 | ah. yup. its still in the queue. | 20:01 |
fungi | Shrews: any guess at the spike? still just fallout from the reenqueue after zuul was restarted or something else you think? | 20:02 |
Shrews | fungi: no. there are just too many variables right now | 20:02 |
fungi | i figured | 20:02 |
openstackgerrit | Andreas Jaeger proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-requirements-python34 https://review.openstack.org/508598 | 20:02 |
openstackgerrit | Andreas Jaeger proposed openstack-infra/project-config master: Remove legacy-requirements-python34 job https://review.openstack.org/508489 | 20:03 |
jeblair | Shrews, fungi: i just recalled that node request priority is still a TODO item in zuul | 20:03 |
jeblair | so gate is going to be starved by check | 20:03 |
mordred | jeblair: welcome back! | 20:03 |
mordred | jeblair: dunno if you saw in scrollback - but we've got multi-child-processes again on executors | 20:04 |
jeblair | mordred: thx | 20:04 |
jeblair | mordred: i was just looking at that, but i've never seen that before | 20:04 |
jeblair | mordred: i was only previously concerned with multiple zuul-executor processes | 20:04 |
jeblair | mordred: multiple bwrap processes for the same playbook is new behavior to me | 20:04 |
mordred | jeblair: yah. oh - well - I mean, we have multiple zuul-executor processes | 20:04 |
mordred | jeblair: and the bwrap processes each have a different parent which is one of the z-e processes | 20:05 |
mordred | jeblair: so - god only knows | 20:05 |
mordred | it subverts my understanding of, well, unix | 20:05 |
jeblair | mordred: http://paste.openstack.org/show/622342/ | 20:06 |
mordred | oh - I was reading the number wrong | 20:06 |
jeblair | ya, that looks pretty normal, right? | 20:06 |
*** esberglu has joined #openstack-infra | 20:07 | |
Shrews | what generated that output? | 20:07 |
jeblair | mordred, Shrews: now with full lines: http://paste.openstack.org/show/622343/ | 20:07 |
jeblair | Shrews: pstree -p -l | 20:07 |
mordred | jeblair: oh. I maybe wasn't looking at the right thing first | 20:08 |
fungi | you can get a sort of similar rendering with `ps afuxww` | 20:08 |
fungi | does a parentage tree | 20:08 |
fungi | the f option does i mean | 20:08 |
jeblair | fungi: ya, though that is hard to read with bubblewrap cmdlines which tend to be several kB | 20:09 |
fungi | oh, heh indeed line wrapping makes that painful | 20:09 |
jeblair | i exaggerate. only like 2kB. | 20:09 |
mordred | jeblair, fungi: load avg is quite high - and we're swapping- at least on 08 and 01 | 20:09 |
jeblair | mordred: yeah, i think we need more executors. to be fair, we needed more for zuulv2 as well. | 20:10 |
*** xyang1 has quit IRC | 20:11 | |
jeblair | we also need the executors to have an internal load average limit | 20:11 |
mordred | yah | 20:11 |
mordred | http://paste.openstack.org/show/622346/ | 20:11 |
jeblair | so they stop accepting new jobs after a certain load average | 20:11 |
mordred | that's where we're at right now | 20:11 |
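
The load average limit jeblair describes could look roughly like the check below. This is only a sketch under stated assumptions: `should_accept_jobs` and the per-CPU threshold are hypothetical names and values, not anything from zuul, and `os.getloadavg()` is only available on Unix-like systems.

```python
import os


def should_accept_jobs(max_load_per_cpu=2.5, load_1min=None, cpu_count=None):
    """Return True while the 1-minute load average is under the limit.

    `load_1min` and `cpu_count` can be passed in for testing; otherwise
    they are read from the running system.
    """
    if load_1min is None:
        load_1min = os.getloadavg()[0]  # (1min, 5min, 15min) tuple
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    # Scale the limit by CPU count so the same setting works on any host size.
    return load_1min < max_load_per_cpu * cpu_count
```

An executor would call something like this before taking a new job and simply skip the job while it returns False, leaving the work for a less loaded executor.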
*** esberglu has quit IRC | 20:11 | |
jeblair | nicely distributed! | 20:12 |
mordred | IKR? | 20:12 |
openstackgerrit | Andreas Jaeger proposed openstack-infra/project-config master: Remove some pypy jobs that don't work https://review.openstack.org/504748 | 20:12 |
mordred | jeblair: oh - are we still split with mergers? | 20:12 |
jeblair | mordred: yes, we only have 4 | 20:12 |
jeblair | others are still zuulv2, in case of rollback | 20:12 |
mordred | yah | 20:12 |
fungi | so all the executors are running around 125% of ram... i guess we want a minimum of 4 more? | 20:13 |
openstackgerrit | Ihar Hrachyshka proposed openstack-infra/openstack-zuul-jobs master: Removed confusing comments https://review.openstack.org/508601 | 20:14 |
jeblair | i just emitted suggestions for zuul patches folks can write in #zuul | 20:14 |
jeblair | i need to go work on the project template fix now | 20:15 |
*** e0ne has quit IRC | 20:15 | |
mordred | jeblair: kk | 20:15 |
*** e0ne has joined #openstack-infra | 20:15 | |
fungi | thanks! | 20:15 |
*** e0ne has quit IRC | 20:16 | |
*** e0ne has joined #openstack-infra | 20:16 | |
openstackgerrit | Dirk Mueller proposed openstack-infra/openstack-zuul-jobs master: Drop requirements-python34 job https://review.openstack.org/508602 | 20:16 |
*** e0ne has quit IRC | 20:16 | |
*** Goneri has quit IRC | 20:17 | |
*** e0ne has joined #openstack-infra | 20:17 | |
*** e0ne has quit IRC | 20:17 | |
*** jcoufal has quit IRC | 20:17 | |
*** e0ne has joined #openstack-infra | 20:17 | |
*** kjackal_ has quit IRC | 20:17 | |
*** e0ne has quit IRC | 20:18 | |
*** e0ne has joined #openstack-infra | 20:18 | |
openstackgerrit | Andreas Jaeger proposed openstack-infra/project-config master: Remove some pypy jobs that don't work https://review.openstack.org/504748 | 20:18 |
*** e0ne has quit IRC | 20:19 | |
mordred | clarkb, fungi: if you have a sec, https://review.openstack.org/#/c/508568/1 | 20:22 |
mordred | oh - piddle - there's a spurious change in that ... | 20:22 |
openstackgerrit | Merged openstack-infra/infra-manual master: Ectomy Jenkins from the Infra Manual narrative https://review.openstack.org/436455 | 20:22 |
clarkb | I'm just about done with lunch and dinner back to reviewing shortly | 20:23 |
mordred | fungi, clarkb: let me fix that in a followup - that change is needed to unbreak some people's jobs and with queue length I'd hate to send it through another check cycle - whereas the tox.ini change won't really break anything | 20:24 |
*** yamamoto has joined #openstack-infra | 20:24 | |
openstackgerrit | Monty Taylor proposed openstack-infra/openstack-zuul-jobs master: Remove spurious change to tox.ini https://review.openstack.org/508607 | 20:25 |
openstackgerrit | Dirk Mueller proposed openstack-infra/openstack-zuul-jobs master: Remove rpm-packaging-tox-lint legacy job https://review.openstack.org/508609 | 20:28 |
*** yamamoto has quit IRC | 20:29 | |
openstackgerrit | Dirk Mueller proposed openstack-infra/project-config master: Remove legacy-rpm-packaging-tox-lint https://review.openstack.org/508610 | 20:30 |
*** jtomasek has joined #openstack-infra | 20:31 | |
openstackgerrit | Trevor McCasland proposed openstack-infra/subunit2sql master: Add subunit2sql CLI option to use non_subunit_name https://review.openstack.org/507192 | 20:32 |
openstackgerrit | Trevor McCasland proposed openstack-infra/subunit2sql master: Add subunit2sql CLI option to use non_subunit_name https://review.openstack.org/507192 | 20:36 |
openstackgerrit | Trevor McCasland proposed openstack-infra/subunit2sql master: Add subunit2sql CLI option to use non_subunit_name https://review.openstack.org/507192 | 20:41 |
*** jtomasek has quit IRC | 20:42 | |
*** rhallisey has quit IRC | 20:42 | |
*** rlandy has quit IRC | 20:44 | |
openstackgerrit | Clark Boylan proposed openstack-infra/zuul-jobs master: Make fetch-tox-output more resilient https://review.openstack.org/508563 | 20:45 |
clarkb | mordred: ^ fixed an issue there (that testing caught \o/) | 20:45 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Fix bug with multiple project-templates https://review.openstack.org/508612 | 20:46 |
jeblair | clarkb, mordred: ^ lists are mutable | 20:46 |
jeblair | clarkb, mordred: i did a bunch of nice inheritance path enhancements as part of tracking that down; i'm cleaning that up now as a follow-up change | 20:47 |
clarkb | jeblair: shiny | 20:47 |
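
As a generic illustration of the "lists are mutable" class of bug (not the actual zuul code, which is not shown in this log), the sketch below merges templates by extending a shared list in place, so a later project that uses only the first template inherits jobs from the second. All names here are hypothetical.

```python
def combine_buggy(templates, names):
    """Merge job lists, but alias the first template's list instead of copying."""
    jobs = templates[names[0]]        # same list object as the template
    for name in names[1:]:
        jobs.extend(templates[name])  # silently mutates the shared template
    return jobs


def combine_fixed(templates, names):
    """Merge job lists into a fresh list, leaving the templates untouched."""
    jobs = []
    for name in names:
        jobs.extend(templates[name])
    return jobs
```

With the buggy version, combining `["a", "b"]` for one project changes what template `a` means for every project processed afterwards, which matches the symptom of templates seeming to "apply to later projects".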
*** slaweq has joined #openstack-infra | 20:47 | |
*** trown is now known as trown|outtypewww | 20:47 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Map pipeline precedence to nodepool node priority https://review.openstack.org/508613 | 20:48 |
*** jdandrea_ has joined #openstack-infra | 20:48 | |
*** jdandrea_ has quit IRC | 20:48 | |
*** nikhil has quit IRC | 20:49 | |
*** Sukhdev has joined #openstack-infra | 20:49 | |
openstackgerrit | Trevor McCasland proposed openstack-infra/subunit2sql master: Add subunit2sql CLI option to use non_subunit_name https://review.openstack.org/507192 | 20:50 |
*** bnemec has quit IRC | 20:51 | |
*** jtomasek has joined #openstack-infra | 20:51 | |
*** jdandrea_ has joined #openstack-infra | 20:51 | |
clarkb | jeblair: mordred each of those zuul fixes lgtm | 20:52 |
jeblair | clarkb: cool, i'm going to -1 mordreds since he's working on tests, but i agree it looks good | 20:52 |
clarkb | jeblair: can has quick review on https://review.openstack.org/#/c/508563/ ? | 20:53 |
mordred | clarkb, jeblair: should we start working on spinning up additional executors? and/or should we start converting v2 mergers to v3 mergers? (are we ready for that or do we wanna keep them still) | 20:54 |
*** jdandrea_ has quit IRC | 20:55 | |
clarkb | at this point do we see ourselves rolling back? its still doable since everything is in a separate config fs tree (though in some cases shared repo) | 20:55 |
clarkb | I feel like we've been pretty committed to rolling forward | 20:55 |
clarkb | so converting old mergers is probably fine | 20:55 |
jeblair | i've been so heads down i don't feel like i have a great handle on how big the fire is, so will rely on others to evaluate general likelihood of rollback. i will say that with the project-templates fix, i don't know of any zuul bugs i would consider blockers. the closest is the high amount of pain that dynamic configs will cause us for the next several days at least until we can optimize that. | 20:56 |
*** kgiusti has left #openstack-infra | 20:57 | |
clarkb | jeblair: right now I think the biggest fires are mostly around job config bugs. Like the missing workspace in publish-docs path AJaeger_ fixed and adding requirements to required repos. Basically things that we'd have a hard time fixing if we rolled back | 20:57 |
mordred | clarkb, jeblair: I think it seems like, while we have fires, they're all mostly roll-forward-and-fix-the-job types of fires, and mostly things (other than config optimization) that are fairly easy to fix once the problem is spotted | 20:58 |
clarkb | I need to catch up on the situation with multinode jobs too (but again I think its mostly little corner case edges we are finding that automated conversion alone isn't likely to catch) | 20:58 |
mordred | clarkb: yah - I think we've got most of the systemic multinode issues and have the pile of edge cases left | 20:58 |
clarkb | if we want to maybe run v2 and v3 concurrently then I'd entertain rollback otherwise I think we press forward and fix bugs | 20:58 |
clarkb | basically the only way we find and fix these is by stubbing our toes on them | 20:59 |
*** armax has joined #openstack-infra | 21:00 | |
mordred | and at this point, folks have already gone through an amount of disruption - rolling back and rolling forward again later seems like it's likely to pile on with more disruption than fixing bugs as we find them will | 21:01 |
SamYaple | please dont introduce v2 again | 21:01 |
AJaeger_ | Let's move forward... | 21:02 |
SamYaple | +9001 | 21:02 |
clarkb | jeblair: one additional possible zuul bug. legacy-tempest-dsvm-nnet has branches set to stable/newton and yet runs against d-g master changes | 21:02 |
clarkb | SamYaple: AJaeger_ yup I think that is consensus just wanted to make sure we weighed the options properly. | 21:02 |
jeblair | clarkb: is it used in project-templates at all? | 21:02 |
clarkb | jeblair: yes | 21:02 |
clarkb | integrated-gate-nova-net is the project-template | 21:02 |
clarkb | jeblair: is that the same bug as the one you fixed? | 21:03 |
jeblair | clarkb: let's assume it's fixed by my change until we see otherwise | 21:03 |
clarkb | kk | 21:03 |
AJaeger_ | regarding multi-node, dmsimard has quite a few open changes that need review love: https://review.openstack.org/#/q/status:open++topic:zuulv3-multinode | 21:03 |
jeblair | clarkb: (let's check though :) | 21:03 |
clarkb | AJaeger_: thanks, will review | 21:03 |
clarkb | AJaeger_: looks like these are for doing native zuul multinode (should review but likely less urgent for now until we get happy with v3 as is) | 21:05 |
AJaeger_ | clarkb: I see... | 21:06 |
* AJaeger_ waves good night | 21:06 | |
clarkb | I've just approved https://review.openstack.org/#/c/508460/1 | 21:07 |
boris_42_ | fungi: so can i help somehow? | 21:08 |
fungi | boris_42_: did you see any of the analysis i posted earlier of the several different kinds of job failures in your jobs? | 21:08 |
clarkb | ^.*requirements-py[2,3].txt$ the comma in the regex there doesn't do what I think it thinks it does :) | 21:09 |
SamYaple | question.. is it generally better to have long running jobs, or lots of short jobs? For LOCI I will be building openstack projects and was considering a job per project with each distro instead of a job per distro building all projects | 21:09 |
boris_42_ | fungi: yep I saw, but not sure that I completely understand the part about requirements | 21:09 |
mordred | clarkb: it really doesn't. :) | 21:09 |
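
To make clarkb's point concrete: inside a character class a comma is just another literal character, so `[2,3]` matches `2`, `,` or `3`. The short sketch below compares the pattern as written with a tightened version; the tightened pattern is an assumption about the intent, not something stated in the channel. It also escapes the dot, which is unescaped in the original and so matches any character.

```python
import re

# The pattern exactly as quoted in the channel.
as_written = re.compile(r"^.*requirements-py[2,3].txt$")

# Assumed intent: only "py2" or "py3", followed by a literal ".txt".
likely_intent = re.compile(r"^.*requirements-py[23]\.txt$")


def matches(pattern, name):
    """Return True if `name` matches `pattern` from the start of the string."""
    return pattern.match(name) is not None
```

For example, `matches(as_written, "requirements-py,.txt")` is true, while the tightened pattern rejects that name and still accepts `requirements-py2.txt` and `requirements-py3.txt`.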
boris_42_ | fungi: why do we need to have rally in requirements? | 21:09 |
SamYaple | s/each distro/all distros/ | 21:10 |
mordred | SamYaple: the answer is a *VERY* firm 'it depends' :) | 21:10 |
*** thorst has quit IRC | 21:10 | |
SamYaple | haha. well more context, each *project* takes about 3 minutes to build. and this way it would allow me to exclude projects should certain files not get changed (something we may or may not do) | 21:10 |
mordred | SamYaple: lots of shorter jobs means that resources can be recycled and used by other things on a more granular level - but there is still a recycle cost ... also, at the moment, we have single-sized flavors - so you'll get an 8G node whether you need it or not | 21:11 |
SamYaple | right, so im trying to figure out if that "recycle cost" is high. | 21:12 |
SamYaple | if i end up building all openstack projects that means a patchset could have 40 or so 3-5 minute gates | 21:12 |
fungi | boris_42_: one of them needed to add openstack/dib-utils to the required-projects list for that job, it looked like. the other two seemed to be the same sort of shell parsing error, looking back at them now | 21:13 |
SamYaple | rather than 3-4 50m gates | 21:13 |
fungi | boris_42_: http://logs.openstack.org/76/507276/5/gate/legacy-rally-dsvm-verify-light-discover-resources/f455652/job-output.txt.gz#_2017-09-29_17_55_14_381078 and http://logs.openstack.org/76/507276/5/gate/legacy-rally-dsvm-cli/d2adbe9/job-output.txt.gz#_2017-09-29_17_55_21_873727 | 21:13 |
mordred | SamYaple: yah - but, if you put in exclusions, you could wind up with only 3-4 3-5 minute gates much of the time yeah? | 21:13 |
SamYaple | thats the desire, yes | 21:13 |
SamYaple | though certain patches will still trigger them all | 21:13 |
mordred | yah- but that's life- you'll need to use that amount of resources on such a patch no matter how you split them | 21:14 |
SamYaple | fair enough | 21:14 |
clarkb | re recycle cost its the boot and delete of openstack VMs for the most part | 21:14 |
mordred | SamYaple: I think right now it's probably not a TON of difference because of quota - however, as we roll out ability to use less resources for a given job, the split jobs may allow you to take advantage of that more | 21:14 |
clarkb | which depending on cloud, region, and load that varies in cost significantly even over the course of a day | 21:14 |
mordred | it does | 21:15 |
mordred | it's also worth noting that on some of our clouds the upper bound on quota is number of available ips | 21:15 |
SamYaple | is this quota per project or for all of infra? | 21:15 |
SamYaple | oh good point about the ips | 21:15 |
mordred | SamYaple: all of infra- we currently have one total quota - but we currently only calculate it in terms of number of servers | 21:15 |
SamYaple | hmmm. well i think ill start with split projects and only setup 6-7 core projects until we can get the exclusions working the way we want | 21:16 |
*** vhosakot has joined #openstack-infra | 21:16 | |
mordred | SamYaple: tobiash has written code to look at nova flavors and quota and whatnot and actually allow calculating what the actual quota and actual usage are - which then allows for a 1 G node to count less towards quota than an 8 G node | 21:16 |
SamYaple | we can always roll back to a fat gate job | 21:16 |
mordred | SamYaple: yah - re-organizing it should be easy to do as you poke at it | 21:16 |
SamYaple | much easier now too :) | 21:17 |
*** jamesdenton has joined #openstack-infra | 21:17 | |
SamYaple | i think per project gate would be the easy call if i could get a 1GB instance | 21:18 |
boris_42_ | fungi: so where code of jobs is now located ? | 21:19 |
boris_42_ | fungi: does it make to move these jobs to our project as we need to fix them in any case? | 21:20 |
boris_42_ | make sense* | 21:20 |
fungi | mordred: looking at one of the shell parsing errors for the rally job failures, is it possible $ZUUL_PROJECT is no longer being set? http://git.openstack.org/cgit/openstack-infra/openstack-zuul-jobs/tree/playbooks/legacy/rally-dsvm-verify-light-discover-resources/run.yaml#n31 | 21:20 |
fungi | http://logs.openstack.org/76/507276/5/gate/legacy-rally-dsvm-verify-light-discover-resources/f455652/job-output.txt.gz#_2017-09-29_17_55_14_381078 | 21:20 |
*** edmondsw has quit IRC | 21:20 | |
*** jpena|off has quit IRC | 21:21 | |
mordred | fungi: it should be set by environment: '{{ zuul | zuul_legacy_vars }}' | 21:21 |
*** dprince has quit IRC | 21:21 | |
mordred | fungi: although we somehow mis-parsed the shebang line on that one | 21:21 |
fungi | oh! | 21:22 |
fungi | good eye | 21:22 |
fungi | that's a broken shebang | 21:22 |
*** amoralej has quit IRC | 21:22 | |
fungi | and i bet in the past we just ignored it | 21:22 |
mordred | fungi: so we should turn line 29: #/bin/bash -xe - into set -x and set -e - and then add executable: /bin/bash | 21:22 |
mordred | fungi: I believe you're likely right | 21:22 |
mordred | fungi: so - executable: /bin/bash as a sibling to the chdir: at the bottom of the task - and then expanding it into set commands instead of a shebang should fix that one right up | 21:23 |
fungi | mordred: boris_42_: looks like that same broken shebang appears in half a dozen different rally job definitions according to git grep | 21:24 |
boris_42_ | fungi: ya most of Rally jobs are broken | 21:24 |
fungi | boris_42_: http://git.openstack.org/cgit/openstack-infra/openstack-zuul-jobs/tree/playbooks/legacy/rally-dsvm-verify-light-discover-resources/run.yaml#n29 | 21:24 |
fungi | boris_42_: the #/bin/bash should be #!/bin/bash | 21:24 |
*** ltomasbo has quit IRC | 21:24 | |
boris_42_ | fungi: and what about -xe ? | 21:25 |
fungi | so the error was copied over from the old jobs, but zuul v2's executor wasn't as picky about that typo and just ignored the line and fed it directly to the shell parser | 21:25 |
fungi | boris_42_: keep the -xe, i just mean the typo in the original jobs (which we copied over into the new job definitions verbatim) was missing a ! after the # | 21:25 |
*** yamamoto has joined #openstack-infra | 21:25 | |
*** slaweq has quit IRC | 21:26 | |
boris_42_ | fungi: okay let me propose patch that fixes the things | 21:26 |
boris_42_ | in all places | 21:26 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: SourceContext improvements https://review.openstack.org/508620 | 21:26 |
jeblair | clarkb: ^ new shiny | 21:26 |
*** slaweq has joined #openstack-infra | 21:26 | |
fungi | boris_42_: `git grep '#/bin/'` in openstack-infra/openstack-zuul-jobs should show you the ones that are missing ! | 21:26 |
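
For anyone who wants the same check outside of git, the `git grep '#/bin/'` search can be approximated with a few lines of Python. This is a rough sketch with a hypothetical function name; unlike the grep, it only flags lines that start (after indentation) with `#/bin/`, which is how the broken shebang appears in the playbook.

```python
import re

# "#" immediately followed by "/bin/", i.e. a shebang missing its "!".
BROKEN_SHEBANG_RE = re.compile(r"^\s*#/bin/")


def find_broken_shebangs(text):
    """Return (line_number, stripped_line) pairs for shebangs missing the '!'."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if BROKEN_SHEBANG_RE.match(line)
    ]
```

A correct `#!/bin/bash` line is not reported, because the pattern requires the `/` to follow the `#` directly.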
boris_42_ | fungi: thanks for help! | 21:27 |
clarkb | FYI I'm digging into why multinode legacy jobs all seem to think they don't have a second node | 21:27 |
clarkb | or attempting to at least | 21:27 |
fungi | boris_42_: i think you must have made the mistake in one and then copied it to the others, because only rally jobs seem to have that mistake | 21:27 |
boris_42_ | fungi: I believe so =( | 21:27 |
clarkb | http://logs.openstack.org/51/500351/18/check/legacy-grenade-dsvm-neutron-multinode-live-migration/365ed5b/ example job (note the inventory looks correct so guessing its something runtime with d-g) | 21:27 |
*** ltomasbo has joined #openstack-infra | 21:28 | |
fungi | boris_42_: but separately, i expect legacy-rally-dsvm-keystone-v2api-rally will still need (at least) openstack/dib-utils added to its required-projects list based on the error i saw in that one | 21:28 |
*** jpena|off has joined #openstack-infra | 21:29 | |
*** amoralej has joined #openstack-infra | 21:29 | |
fungi | clarkb: changes to how we store multinode metadata in /etc/nodepool maybe? as in which files we make present in the new primary-less multinode world? | 21:29 |
clarkb | fungi: ya or not setting the vars that trigger multinode code paths, gonna finish reviewing jeblair's change then hope to dig in properly | 21:30 |
*** thorst has joined #openstack-infra | 21:30 | |
*** slaweq has quit IRC | 21:31 | |
*** yamamoto has quit IRC | 21:31 | |
fungi | yep, i actually popped in on a break to review the inheritance fix. still mired in food-related stuff for a while yet | 21:31 |
openstackgerrit | Boris Pavlovic proposed openstack-infra/openstack-zuul-jobs master: Fix typo in rally jobs #/bin/bash -> #!/bin/bash https://review.openstack.org/508622 | 21:31 |
cloudnull | looking at the zuul status page, after a job completes we're seeing something like this "http://zuulv3.openstack.org/legacy-openstack-ansible-linters" - which has posted "node_failure" | 21:32 |
cloudnull | the post is testing pr https://review.openstack.org/#/c/508281/ | 21:32 |
clarkb | boris_42_: you'll actually probably want to fix it in the way mordred described | 21:32 |
clarkb | boris_42_: 21:23:21 mordred | fungi: so we should turn line 29: #/bin/bash -xe - into set -x and set -e - and then add executable: /bin/bash | 21:32 |
cloudnull | against our main repo, any advice? | 21:32 |
boris_42_ | @clarkb sure | 21:32 |
*** slaweq has joined #openstack-infra | 21:34 | |
clarkb | cloudnull: I think that means your new base job is leaving out the bits that configure things to use our log server | 21:35 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add helpful error message about required-projects https://review.openstack.org/508576 | 21:35 |
clarkb | cloudnull: I would look at the other base jobs to see what they include related to log server data | 21:35 |
*** srobert has quit IRC | 21:35 | |
*** baoli has quit IRC | 21:35 | |
*** thorst has quit IRC | 21:35 | |
clarkb | cloudnull: though you use the legacy-base as your parent so now I'm just confused | 21:36 |
boris_42_ | clarkb: just one short question about set -x and set -e should I put these commands before shebang or inside it ? | 21:37 |
jeblair | clarkb, cloudnull: node_failure means unable to get a node from nodepool; bad things happened earlier today; is it very recent? (since last zuul restart) | 21:37 |
*** hemna_ has quit IRC | 21:38 | |
cloudnull | i ran the job like an hour ago | 21:38 |
cloudnull | maybe it was pre restart ? | 21:39 |
jeblair | cloudnull: i think the restart was a few hours ago now; it would probably be good to investigate the failure then. maybe an infra-root can look at it? | 21:40 |
clarkb | fungi: http://logs.openstack.org/02/508302/3/check/legacy-tempest-dsvm-neutron-multinode-full/b3ea860/logs/etc/nodepool/sub_nodes_private.txt.gz the empty sub nodes private file is why multinode isn't working | 21:40 |
clarkb | looking to sort that out now | 21:40 |
mordred | clarkb: the nodeset change should fix that | 21:42 |
mordred | clarkb: https://review.openstack.org/#/c/508568/ | 21:43 |
clarkb | mordred: ya just read the when on that and it relies on the group | 21:43 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Map pipeline precedence to nodepool node priority https://review.openstack.org/508613 | 21:43 |
mordred | clarkb, jeblair: now with a test! | 21:43 |
mordred | Shrews: your patch to add change id into the node request would make reading the asserts in that test nicer | 21:45 |
jeblair | mordred: generally lgtm, question inline | 21:46 |
boris_42_ | mordred: hi there, can you elaborate about mordred | fungi: so we should turn line 29: #/bin/bash -xe - into set -x and set -e - and then add executable: /bin/bash | 21:46 |
boris_42_ | mordred: not sure that I understand why #!/bin/bash -xe won't work .. | 21:47 |
clarkb | boris_42_: because it is being executed by ansible as a shell script now. And by default it will use sh | 21:47 |
mordred | boris_42_: sure! it's because putting a shebang line into an ansible shell: block doesn't work | 21:47 |
clarkb | boris_42_: basically the shebang is no longer interpreted | 21:47 |
boris_42_ | ah ok | 21:47 |
jeblair | mordred, clarkb: i'm chasing down a branch matcher problem i observed when working on the project-template thing | 21:48 |
mordred | boris_42_: so if we want it to run under bash, we need to set that in the executable: parameter, and if we want -x or -e we need to use set -e and/or set -x ... we did this on the conversion for most of the jobs already ... | 21:48 |
mordred | boris_42_: but because the shebang line in that script happened to be malformed, our parser didn't catch it (oops) | 21:48 |
mordred | jeblair: kk | 21:48 |
boris_42_ | @mordred gottcha | 21:48 |
boris_42_ | going to refactor that piece | 21:49 |
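The point about shebangs can be illustrated outside Ansible. The sketch below is only an analogy in Python, assuming a POSIX `sh` is on the PATH: when script text is handed to a shell as an argument, a leading `#!` line is just a comment, so its `-e` flag never takes effect, while an explicit `set -e` does. The helper name is hypothetical.

```python
import subprocess


def run_under_sh(script):
    """Run script text with `sh -c`, so a first-line shebang is only a comment."""
    return subprocess.run(["sh", "-c", script], capture_output=True, text=True)


# The shebang's -e is never applied: `false` fails but the script carries on.
with_shebang = run_under_sh("#!/bin/bash -xe\nfalse\necho reached\n")

# An explicit `set -e` is honoured: the script stops at `false`.
with_set_e = run_under_sh("set -e\nfalse\necho reached\n")

print(repr(with_shebang.stdout), with_shebang.returncode)
print(repr(with_set_e.stdout), with_set_e.returncode)
```

This mirrors the advice given: choose the interpreter explicitly and put the flags in the script body rather than relying on the shebang.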
openstackgerrit | Merged openstack-infra/project-config master: Adjust branches for OSC jobs https://review.openstack.org/503500 | 21:49 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Map pipeline precedence to nodepool node priority https://review.openstack.org/508613 | 21:49 |
mordred | jeblair: ^^ fixed | 21:49 |
clarkb | bah new patchset before i could post my comments | 21:50 |
mordred | clarkb: sorry | 21:50 |
clarkb | in any case not worth a -1 but had a couple things | 21:50 |
*** esberglu has joined #openstack-infra | 21:50 | |
*** mat128 has joined #openstack-infra | 21:52 | |
*** thorst has joined #openstack-infra | 21:52 | |
*** hashar has quit IRC | 21:54 | |
openstackgerrit | Boris Pavlovic proposed openstack-infra/openstack-zuul-jobs master: Do not use shebang in rally legacy jobs https://review.openstack.org/508622 | 21:55 |
boris_42_ | okay fixed it ^ | 21:55 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Map pipeline precedence to nodepool node priority https://review.openstack.org/508613 | 21:56 |
*** esberglu has quit IRC | 21:56 | |
mordred | clarkb: I went ahead and fixed your request for comments- and reordered the changes so that sequence and priority were in different order to make it clearer | 21:56 |
*** thorst has quit IRC | 21:56 | |
mordred | boris_42_: looks great! | 21:57 |
openstackgerrit | Dean Troyer proposed openstack-infra/project-config master: Remove python-aodhclient as g-r.txt has aodhclient https://review.openstack.org/508626 | 21:57 |
openstackgerrit | Boris Pavlovic proposed openstack-infra/openstack-zuul-jobs master: Remove unused shebang from legacy jobs https://review.openstack.org/508627 | 21:59 |
clarkb | mordred: approved, thanks | 21:59 |
* clarkb reviews boris_42_'s change | 21:59 | |
openstackgerrit | Monty Taylor proposed openstack-infra/infra-manual master: Update project creators guide with zuul v3 information https://review.openstack.org/508596 | 22:00 |
clarkb | mordred: ^ reminds me, is docs publishing working yet? | 22:00 |
*** ijw has joined #openstack-infra | 22:01 | |
*** Swami has quit IRC | 22:01 | |
*** ijw has quit IRC | 22:02 | |
*** bobh has quit IRC | 22:05 | |
SamYaple | is zuul down right now? | 22:05 |
SamYaple | zuulv3.openstack.org/status.json just hangs | 22:05 |
*** tpsilva has quit IRC | 22:05 | |
mnaser | SamYaple i can confirm that behaviour on my side as well | 22:07 |
*** slaweq has quit IRC | 22:08 | |
clarkb | mordred: comments on 508596 | 22:09 |
* clarkb goes to look at zuulv3.o.o | 22:09 | |
clarkb | I confirm only zuul-web is running | 22:09 |
clarkb | jeblair: ^ you aren't in the process of restarting zuul are you? | 22:09 |
jlvillal | On the POST_FAILUREs I see on: https://review.openstack.org/508287 Should I wait until next week for a fix? | 22:10 |
clarkb | journalctl and the zuul logs don't seem to know why zuul-scheduler isn't running | 22:10 |
SamYaple | now its 503'ing | 22:11 |
SamYaple | so progress! | 22:11 |
clarkb | SamYaple: I think that means apache has noticed | 22:11 |
clarkb | jlvillal: yes there is a fix for that specific issue | 22:11 |
clarkb | jlvillal: just a matter of getting it merged | 22:11 |
mnaser | clarkb systemctl status might say reason for service exit? | 22:11 |
mordred | clarkb, jlvillal: http://paste.openstack.org/show/622352/ | 22:12 |
mordred | gha. jeblair ^^ ... don't know if that's fatal or not- but it's the most recent error in the log | 22:12 |
mordred | clarkb, jeblair: nope - that happens from time to time | 22:14 |
clarkb | mnaser: no luck from that, has the web server and nothing about scheduler | 22:14 |
jlvillal | clarkb, mordred Thanks | 22:14 |
clarkb | mordred: jeblair thinking we might consider merging those zuul patches, then get scheduler running again? | 22:14 |
mnaser | clarkb | 22:14 |
clarkb | but ya I'm not finding anything in logs about why it stopped running oh /me looks at syslog maybe it was oom | 22:14 |
mnaser | oops, sorry, early enter, not sure then :( | 22:14 |
clarkb | yup OOM | 22:15 |
*** slaweq has joined #openstack-infra | 22:15 | |
mordred | awesome | 22:15 |
boris_42_ | clarkb: mordred is this just race http://logs.openstack.org/76/507276/5/check/legacy-rally-dsvm-manila-multibackend/3de3226/logs/devstack-early.txt.gz#_2017-09-29_19_05_07_119 install_from_lib seems doesn't work .. | 22:15 |
clarkb | boris_42_: ianw was working to fix that | 22:16 |
clarkb | boris_42_: I don't understand all the details but I think there is a mailing list thread | 22:16 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Protect against builds dict changing while we iterate https://review.openstack.org/508629 | 22:16 |
clarkb | we don't have swap on zuulv3.o.o | 22:17 |
mordred | clarkb: yes - that is correct - I believe we spun that server up before fixing launch node to do the fix-swap dance | 22:18 |
fungi | clarkb: yeah, i mentioned that earlier in the week but at the time it wasn't using more than 50% of its available ram | 22:18 |
clarkb | maybe we should fix that then possibly merge some zuul fixes then start it again? | 22:18 |
openstackgerrit | Boris Pavlovic proposed openstack-infra/openstack-zuul-jobs master: Export missing projects in Rally legacy jobs https://review.openstack.org/508630 | 22:18 |
*** lukebrowning has quit IRC | 22:18 | |
SamYaple | can I statically define the path for the playbook with zuulv3? | 22:18 |
fungi | looks like the zuul-scheduler process was up over 18gib when the oom-killer decided to take action | 22:18 |
clarkb | SamYaple: yes I think there should be examples of that | 22:18 |
boris_42_ | @clarkb okay thanks, it's good that someone is working on it | 22:19 |
mordred | we actually don't have /dev/xvde1 mounted anywhere | 22:19 |
SamYaple | ok ill search for it | 22:19 |
boris_42_ | @clarkb i fixed a bit more of jobs | 22:19 |
clarkb | SamYaple: ya look in openstack-zuul-jobs and grep for playbook | 22:19 |
fungi | mordred: yes, i believe the server was built back when swap setup wasn't working with our launch script | 22:19 |
mordred | fungi, clarkb: we don't really need much disk on that box - we could just turn all of /dev/xvde1 into swap | 22:19 |
SamYaple | clarkb: found it in the docs https://docs.openstack.org/infra/manual/zuulv3.html#ansible-playbooks | 22:20 |
mordred | (as an easy way to deal with that, since we're down anyway) | 22:20 |
clarkb | arg /me has to deal with kids for a bit. But ya I think we should work on rough plan above (not sure how safe those changes are did any pass unittests before it crashed?) | 22:20 |
* fungi has no idea what the modern limits of a swap partition size are, nor what negative repercussions there might be if you gave the kernel that much | 22:20 | |
clarkb | if we want to do a bunch probably 2x memory is plenty | 22:20 |
mordred | clarkb, fungi, jeblair: also - I think https://review.openstack.org/508629 may help in some cases since we're seeing that exception from time to time in the logs | 22:20 |
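The class of error that change title describes ("builds dict changing while we iterate") can be shown with a small Python sketch. This is not the content of 508629, which is not quoted here; the function names and the sample dict are hypothetical, and the example is single-threaded for simplicity.

```python
def cancel_all_unsafe(builds):
    """Iterate the live dict while removing entries; raises RuntimeError."""
    for uuid in builds:
        del builds[uuid]


def cancel_all_safe(builds):
    """Iterate over a snapshot of the keys, so removal during the loop is fine."""
    cancelled = []
    for uuid in list(builds):
        cancelled.append(builds.pop(uuid))
    return cancelled
```

Taking a snapshot costs a small copy per loop, which is usually an acceptable trade for not raising mid-iteration.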
mordred | fungi: should we just run our normal swap script? | 22:21 |
*** mat128 has quit IRC | 22:21 | |
fungi | i believe so, yes | 22:21 |
mordred | clarkb: also - I agree re: landing the zuul patches - especially the one from jeblair | 22:21 |
mordred | ok. I'm going to do the swap bit right now | 22:21 |
fungi | though may want to move /opt contents out of the way before running that | 22:21 |
fungi | and then put them back after | 22:21 |
fungi | mordred: ^ | 22:22 |
*** jtomasek has quit IRC | 22:22 | |
mordred | fungi: ++ | 22:22 |
fungi | fewer surprises | 22:22 |
mordred | oh - the script does that for us | 22:22 |
fungi | ahh, nice | 22:22 |
*** jtomasek has joined #openstack-infra | 22:23 | |
*** d0ugal has quit IRC | 22:23 | |
fungi | for some reason i thought it ended up blowing away the contents of /opt on one of the mergers? executors? recently when we were fixing those up | 22:23 |
mordred | fungi: that was a different dance - on the executors we want the volume mounted on /var/lib/zuul instead of in /opt/ | 22:23 |
fungi | maybe that behavior got fixed | 22:23 |
fungi | ahh | 22:23 |
fungi | nm | 22:23 |
mordred | ok. swap enabled - volume mounted on /opt and fstab updated | 22:24 |
fungi | so we have 1x ram as swap, which is fine since each is around 16gb | 22:24 |
*** lukebrowning has joined #openstack-infra | 22:24 | |
SamYaple | from .zuul.yaml is there a way for me to pass variables in some form to the jobs? | 22:25 |
mordred | SamYaple: yup. it's "vars" on the job ... | 22:25 |
jeblair | clarkb, mordred: that doesn't look like an especially sustainable line on the zuul memory graph | 22:26 |
mordred | SamYaple: https://docs.openstack.org/infra/zuul/feature/zuulv3/user/config.html#attr-job.vars | 22:26 |
SamYaple | beautiful. exactly what i was hoping for | 22:26 |
*** lukebrowning has quit IRC | 22:26 | |
*** armax has quit IRC | 22:26 | |
mordred | jeblair: agree | 22:27 |
fungi | memory growth looks like it's been pretty steady today, yeah http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=63979&rra_id=all | 22:27 |
*** lukebrowning has joined #openstack-infra | 22:27 | |
*** yamamoto has joined #openstack-infra | 22:27 | |
*** jtomasek has quit IRC | 22:27 | |
fungi | hard to tell from the curve there whether it would have topped out around 16gb were there room to page some stuff out | 22:28 |
jeblair | the bug that i'm working on is another serious flaw related to project-templates; it will certainly produce erroneous configuration | 22:28 |
jeblair | it's not going to be fixed by adding a colon | 22:28 |
superdan | I think we still need this https://review.openstack.org/#/c/508519 to unblock nova, right? looks like it has had two +2s today, just not at the same time.. :) | 22:28 |
*** dhajare has quit IRC | 22:28 | |
*** gouthamr has quit IRC | 22:29 | |
openstackgerrit | Monty Taylor proposed openstack-infra/project-config master: Remove broken openstack-tox-pep8 variant https://review.openstack.org/508542 | 22:29 |
superdan | I hear +2s are happiest in pairs | 22:30 |
jeblair | mordred, fungi, clarkb: in the column of "reasons to roll back" i think we have: 1) project-template bug #2; 2) dynamic reconfiguration is very slow; 3) unsustainable memory use (leak?) | 22:30 |
jeblair | both #2 and #3 are things that would benefit from a period of zuulv3 running check jobs only | 22:30 |
fungi | if we roll back, what is the interim plan? would we unfreeze project-config? | 22:31 |
*** ijw has joined #openstack-infra | 22:31 | |
jeblair | #1 can probably be fixed in a day or so | 22:32 |
fungi | i worry that there are at least some fixes which have been made directly to migrated jobs now, which rerunning the migration script will lose | 22:32 |
*** baoli has joined #openstack-infra | 22:32 | |
jeblair | fungi: notably required-projects | 22:32 |
fungi | yup, and things like rally's typo'd shebangs | 22:32 |
*** yamamoto has quit IRC | 22:32 | |
mordred | fungi, jeblair: yah- I'm pretty sure re-running the migration script would cause way more problems than it would solve | 22:33 |
jeblair | it would cause a lot of problems | 22:33 |
SamYaple | i would take hourly restarts of zuulv3 over rolling back. people are just coming around to how to do things again | 22:33 |
boris_42_ | fungi: actually we can work on moving Rally jobs to Rally repo | 22:33 |
fungi | so we should likely leave zuul v2 configuration/jobs frozen if we roll back? | 22:33 |
boris_42_ | fungi: and fixing them there | 22:33 |
jeblair | SamYaple: hourly restarts aren't long enough to merge a single change | 22:33 |
SamYaple | jeblair: for you maybe! my jobs take 5 minutes to run! (but point taken) | 22:34 |
*** baoli_ has joined #openstack-infra | 22:34 | |
jeblair | SamYaple: for *us* | 22:34 |
mnaser | fwiw i have an empty weekend and i'd be more than happy to pick up any work to help clear this out and avoid a zuulv2 rollback :> | 22:34 |
SamYaple | :) | 22:34 |
mordred | jeblair: it was actually steady memory wise, it seems, until the most recent restart - so we may have a fairly small amount of things to examine to find the memory leak | 22:35 |
clarkb | ya I think if we rolled back we'd run v2 and v3 side by side | 22:35 |
jeblair | mordred: i believe the steady state was due to zookeeper being broken and nothing happening | 22:35 |
clarkb | v3 doing check only possibly | 22:35 |
jeblair | mordred: i think as long as zuulv3 is active it's leaking | 22:35 |
clarkb | and not do a migration script again just roll both sides forward | 22:36 |
fungi | i'm not super keen on rollback for fear of lost traction, but also don't want people losing their weekends to this | 22:36 |
jeblair | i will not be able to work on this over the weekend | 22:36 |
*** baoli has quit IRC | 22:37 | |
fungi | right. we've been hitting this pretty hard all week, so letting v2 take back gating and do check side by side with v3 into early next week sounds compelling | 22:37 |
jeblair | it looks like we got 6 hours of use out of v3 before it ran out of ram | 22:38 |
superdan | 5.5 hour cron restart and call it good! | 22:38 |
jeblair | swap will reduce the likelihood of oom killer, but it still may tank performance enough for us to need to restart | 22:38 |
jeblair | it's also worth noting that it took a couple of hours (because of the reconfig slowness) to re-enqueue changes, so expect a backlog of a couple hours after every restart. | 22:40 |
fungi | i'm leaning toward rollback at this point so we can focus on fixing what we know is broken, at least | 22:42 |
mnaser | wouldn't there be a large # of resource starvation as both jobs attempt to run | 22:42 |
fungi | but i still think v2 configuration needs to remain basically frozen | 22:42 |
mnaser | splitting resources across both nodepools | 22:42 |
fungi | mnaser: yeah, we'd probably have to give ~2/3 of the quota to v2 | 22:43 |
mordred | fungi: I think both configs will largely need to remain frozen - at least in terms of what's in what pipeline | 22:43 |
jeblair | mnaser: v3 would use less since we'd be doing check only | 22:43 |
jeblair | and i'd expect us to allow it to get fairly backlogged | 22:43 |
mordred | fungi: well - I take that back - iterating forward on v3 jobs should be fine | 22:44 |
fungi | if anything, a bit of backlog there probably helps expose issues | 22:44 |
SamYaple | wont a rollback right now break projects that converted their stuff to v3? | 22:44 |
jeblair | (actually running jobs is probably not strictly necessary to expose these issues) | 22:44 |
clarkb | SamYaple: no all your old jobs are still there | 22:44 |
fungi | unless we approved changes to delete the v2 jobs, which i doubt | 22:44 |
*** rbrndt has quit IRC | 22:44 | |
*** esberglu has joined #openstack-infra | 22:45 | |
SamYaple | ah i suppose that makes sense | 22:45 |
clarkb | how difficult would it be to run v3 as check only? | 22:45 |
mordred | SamYaple: not really - the v2 version of the jobs ... yah - what clarkb said - and you should still be able to iterate on your local versions of the jobs - it just won't gate on them | 22:45 |
mnaser | only thing to keep in mind is to make sure zuul doesnt vote even if it runs on check | 22:45 |
jeblair | mnaser: it's okay for it to vote, it's a different user than v2 | 22:45 |
mordred | mnaser: zuul voting in check is fine - yah, that | 22:45 |
mnaser | so jenkins can leave +1 and zuul -1 and it'll allow it to merge? dont know acls much but hey if it'll be okay you know better :) | 22:46 |
mordred | clarkb: so - that's the part I'm a smidge concerned about ... because we can write a script to remove pipeline config for things that aren't check ... | 22:46 |
*** xarses has quit IRC | 22:46 | |
fungi | though this is going to trip up those third-party-ci users who had configured their systems to only run jobs after zuul voted, since we're toggling the account name back again | 22:46 |
mordred | clarkb: but if people make changes to check jobs, re-applying the removed non-check entries will be very hard | 22:46 |
fungi | mnaser: correct, zuul v2 ("jenkins" account) is only configured to care about its own verify +1 | 22:47 |
fungi | so it'll ignore the zuul v3 ("zuul" account) votes | 22:47 |
mordred | (I mean, not impossible, but the re-applied gate entries will not reflect any changes made to check entries) | 22:47 |
jeblair | if folks want to keep limping along on v3, that's fine. i just wanted to be up front that if we were considering flipping the switch right now, i would not do it. and i'd say we probably need 2 weeks to deal with the performance/memory issues. they will not be easy to find or fix. | 22:48 |
SamYaple | as long as i can hammer out my new zuulv3 jobs this weekend, i suppose im ok with a solution that is easiest | 22:48 |
*** slaweq has quit IRC | 22:48 | |
fungi | SamYaple: with v3 voting in the check pipeline alongside v2, you can iterate on most jobs (aside from release, post, periodic) pretty easily | 22:49 |
clarkb | mordred: you mean in layout type stuff as we'd have to reconcile the delta? | 22:49 |
SamYaple | yea thats what it sounds like. and for this particular repo its noop on v2, so meh. i guess i dont have much of a stake in this race | 22:49 |
mordred | jeblair: my biggest concern with rolling back is the logistical flux that I think will be hard for people to reason about. but I also agree that finding and fixing the memory and performance issues is not likely to be straightforward or easy | 22:49 |
fungi | SamYaple: assuming v3 is running well enough to execute jobs and report back at any given point in time that is | 22:49 |
SamYaple | fungi: fair point | 22:50 |
*** esberglu has quit IRC | 22:50 | |
fungi | leaving v2 job configuration frozen for a couple weeks will be a tough sell, but i expect the community will understand the reasons | 22:51 |
cloudnull | I'm at a bit of a loss regarding what I'd need to change here: https://review.openstack.org/#/c/508281 test commits are still resulting in "node_failure", i have no idea why. | 22:51 |
jeblair | there is a possibility that it's not so much of a memory leak as just using an inordinate amount of memory. maybe it will level off at 20G. | 22:51 |
mordred | which is to say - I think I lean *slightly* toward limping forward, but not so much that i'd try to persuade anyone to change their mind if they leaned the other direction | 22:51 |
mordred | jeblair: this is an excellent point | 22:52 |
mnaser | will it actually really slow down if it swaps out a lot? it could just be idle memory that is largely unused :X | 22:52 |
fungi | i'm more concerned about the correctness of job selection, with whatever new bug it is jeblair has spotted which will take significant engineering to fix | 22:52 |
jeblair | fungi: expect a fix for that within a day; i wouldn't let it drive a rollback decision. | 22:53 |
*** thorst has joined #openstack-infra | 22:53 | |
clarkb | I am booked with family stuff this weekend and am traveling to a conference late next week | 22:53 |
clarkb | but otherwise happy to keep rolling forward | 22:53 |
mordred | we could leave it in place over the weekend and see how memory grows or doesn't with the swap in place, and stage a rollback if needed early monday (since we'll need to figure out cleanly splitting out the projects.yaml file anyway) | 22:53 |
clarkb | my biggest concern is making sure the rest of openstack is able to get their work done too | 22:54 |
mordred | yah | 22:54 |
mordred | I agree with that | 22:54 |
clarkb | mordred: thats a good point since we are quiet over weekend for most part | 22:54 |
fungi | i'm willing to check in periodically over the weekend and restart/reenqueue zuul if memory pressure reaches the danger zone, but can't commit to much more... and also wonder what the load is going to look like on it again when monday rolls around | 22:54 |
clarkb | its not a huge loss to delay any rollback based on more data gathering | 22:54 |
SamYaple | can we add swap and see if memory levels out over the weekend? | 22:54 |
mordred | yah - I don't intend to WORK on things over the weekend, but I can check in periodically for a restart if needed if it'll help us gather data | 22:54 |
mordred | SamYaple: we have added the swap - so that's in place | 22:55 |
fungi | SamYaple: mordred added swap moments ago | 22:55 |
jeblair | mordred: how much? | 22:55 |
SamYaple | got it. so that will be good data | 22:55 |
jeblair | 16 i see it now | 22:55 |
mordred | jeblair: 16G | 22:55 |
fungi | 16gb but we can up it | 22:55 |
mordred | jeblair: we have plenty of disk on the volume we could reallocate if we wanted | 22:55 |
SamYaple | any reason to not throw the entirety of /dev/xvde1 at swap? | 22:55 |
fungi | though if it burns through most of that then i don't think more swap is the answer anyway | 22:55 |
jeblair | yeah, let's actually do that. i normally would avoid this, but, if it turns out we're leaking idle memory and it doesn't impact performance, we'll get more data and save a debugging cycle | 22:56 |
*** yee379 has quit IRC | 22:56 | |
mordred | fungi: well - it might be if we learn that zuul levels off at 45G and we just need to boot a REALLY big server | 22:56 |
*** yee379 has joined #openstack-infra | 22:56 | |
SamYaple | right thats my point | 22:56 |
jeblair | obviously not a long term answer, but may help answer questions faster and possibly keep limping longer | 22:56 |
fungi | i can buy the moar datas argument | 22:56 |
mordred | jeblair: by "that" do you mean "just add the whole volume as swap" ? | 22:56 |
fungi | yeah | 22:56 |
jeblair | mordred: well, more, maybe not whole | 22:56 |
*** wolverineav has quit IRC | 22:57 | |
fungi | it's, what, an 80gb device? | 22:57 |
*** mat128 has joined #openstack-infra | 22:57 | |
jeblair | mordred: we're not using it for anything else? | 22:57 |
SamYaple | ive had 1TB swap before... more is fine surely | 22:57 |
*** dizquierdo has joined #openstack-infra | 22:57 | |
mnaser | uh if this server is at rax i'd suggest grabbing swap from the local drive because that wont go over the network | 22:57 |
mordred | 133G | 22:57 |
mnaser | (maybe make a swap file?) | 22:57 |
mordred | jeblair: we are not | 22:57 |
mordred | jeblair: it was completely unmounted before we ran the swap script | 22:57 |
jeblair | mordred: okay sure, whole thing i guess :) | 22:57 |
mordred | ok. let me do that real quick | 22:58 |
*** mat128 has quit IRC | 22:58 | |
jeblair | mnaser: i think this is a local volume | 22:58 |
jeblair | mnaser: i think this is the "ephemeral volume" you get with rax servers | 22:58 |
mnaser | okay cool, that's ideal | 22:58 |
fungi | we'll want to relocate the content for /opt back off it before we destroy that volume though | 22:58 |
fungi | (obviously) | 22:59 |
mnaser | im sure this is really obvious but i guess this memory leak started since the latest restart of zuul? | 22:59 |
SamYaple | mnaser: memory leak OR lots of memory usage. | 22:59 |
mnaser | http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=63979&rra_id=all | 22:59 |
mnaser | well it seemed to be pretty stable | 22:59 |
SamYaple | one of those is easier to solve :) | 22:59 |
mnaser | until the latest restart which changed the pattern of memory usage significantly | 23:00 |
fungi | mnaser: one interpretation is that it started once we got it under significant load by fixing the zk timeout issues | 23:00 |
clarkb | to make sure its clear, increase swap on zuulv3.o.o, use weekend to collect data with minimal impact, decide if rollback is necessary monday ish is the rough plan? | 23:00 |
fungi | clarkb: that sounds right to me | 23:00 |
SamYaple | +1 | 23:00 |
mordred | ok. we have 150G of swap now | 23:02 |
fungi | and did we want to go ahead and merge any of the pending zuul patches before restarting? all of them? specific ones? | 23:02 |
*** dizquierdo has quit IRC | 23:02 | |
mordred | fungi: I would like to merge jeblair's patch at least | 23:03 |
clarkb | ++ | 23:03 |
fungi | the [:] patch? i agree that one's pretty critical | 23:03 |
jeblair | heh, it will use (a little) more memory :) | 23:03 |
mordred | https://review.openstack.org/#/c/508629 and https://review.openstack.org/#/c/508613 are worth looks too | 23:03 |
mordred | as is https://review.openstack.org/#/c/508620/ | 23:04 |
*** slaweq has joined #openstack-infra | 23:04 | |
mordred | but I defer to jeblair on those - I've got the elevated bit - want me to merge the jeblair change? | 23:05 |
fungi | yeah | 23:07 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Fix bug with multiple project-templates https://review.openstack.org/508612 | 23:07 |
jeblair | 629 is a good fix but not having it won't cause any problems; let's leave it for normal process | 23:08 |
jeblair | other two are good candidates for force-merge before restarting | 23:08 |
fungi | 508553 would also be nice for any restarts, but can always use a local fork install in a venv with that | 23:08 |
fungi | er, i guess it doesn't even need installing. standalone utility script | 23:09 |
*** hongbin has quit IRC | 23:09 | |
fungi | so can just let it merge normally | 23:09 |
superdan | https://review.openstack.org/#/c/508519/6 | 23:10 |
superdan | oops | 23:10 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Map pipeline precedence to nodepool node priority https://review.openstack.org/508613 | 23:10 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Protect against builds dict changing while we iterate https://review.openstack.org/508629 | 23:10 |
mordred | had to rebase the pipeline precedence one - we both added tests | 23:10 |
jeblair | mordred: rebase looks good | 23:11 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Map pipeline precedence to nodepool node priority https://review.openstack.org/508613 | 23:11 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: SourceContext improvements https://review.openstack.org/508620 | 23:12 |
fungi | do we need to restart executors for any of those too, or just the scheduler? | 23:12 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Update zuul-changes script for v3 https://review.openstack.org/508553 | 23:12 |
jeblair | just sched | 23:12 |
mordred | k. I think that's everything - shall I do a kick from puppetmaster? | 23:13 |
jeblair | mordred: sounds good | 23:13 |
fungi | that does seem like the next step | 23:13 |
mordred | running puppet | 23:13 |
jeblair | i have rm'd the pidfile and run 'service zuul-scheduler stop' so that systemd will not be confused | 23:14 |
jeblair | mordred: you should be able to just start it normally when ready | 23:14 |
fungi | good call | 23:14 |
mordred | ok. it's done | 23:15 |
fungi | sad that systemd gets confused by services dying | 23:15 |
mordred | starting ... | 23:15 |
clarkb | fungi: it's because zuul manages its own pid file iirc | 23:15 |
fungi | clarkb: yeah, and backgrounds itself | 23:15 |
clarkb | a sigkill handler likely would be good | 23:15 |
clarkb | but later :) | 23:16 |
mordred | zuul scheduler started | 23:16 |
fungi | that was ~70 minutes from the oom event at 22:07 | 23:19 |
jeblair | i think we've exceeded cacti's swap limit; i will update it | 23:20 |
fungi | in case anyone's looking back at the timeline later | 23:20 |
SamYaple | fungi: i run systemctl restart for that reason. i think that was a design decision | 23:20 |
*** bobh has joined #openstack-infra | 23:21 | |
openstackgerrit | Andrea Frittoli proposed openstack-infra/devstack-gate master: Basic processing of test results https://review.openstack.org/507980 | 23:27 |
openstackgerrit | Andrea Frittoli proposed openstack-infra/devstack-gate master: Throwaway patch to check subunit file processing https://review.openstack.org/508171 | 23:27 |
*** rossella_s has quit IRC | 23:28 | |
*** yamamoto has joined #openstack-infra | 23:29 | |
clarkb | we should recheck various fix changes | 23:30 |
clarkb | mordred groups one and the more reliable tox logs one | 23:30 |
*** bobh has quit IRC | 23:31 | |
*** rossella_s has joined #openstack-infra | 23:32 | |
*** yamamoto has quit IRC | 23:35 | |
*** lbragstad has quit IRC | 23:35 | |
*** slaweq has quit IRC | 23:36 | |
*** bobh has joined #openstack-infra | 23:37 | |
mnaser | hate to be the bearer of bad news but i think no new jobs are getting queued? | 23:37 |
superdan | mnaser: they are for me | 23:38 |
mnaser | superdan i swear as you said that my jobs appeared | 23:38 |
mnaser | so uh, thanks! | 23:38 |
superdan | mnaser: you. are. welcome. | 23:38 |
fungi | it's his superpower | 23:38 |
*** esberglu has joined #openstack-infra | 23:39 | |
*** fried_rice is now known as efried_thbagh | 23:40 | |
*** esberglu has quit IRC | 23:44 | |
*** bobh has quit IRC | 23:44 | |
clarkb | http://logs.openstack.org/18/505418/13/check/openstack-tox-py27/0b6f094/zuul-info/ says xenial \o/ | 23:45 |
mordred | clarkb: WOOT | 23:48 |
clarkb | though nnet job still running against master apparently | 23:48 |
clarkb | mordred: before I edit https://review.openstack.org/#/c/508519/6/zuul.d/zuul-legacy-project-templates.yaml can you check my comment there that it is sane? | 23:49 |
*** slaweq has joined #openstack-infra | 23:50 | |
clarkb | mordred: looking at http://logs.openstack.org/12/506312/7/check/legacy-tempest-dsvm-nnet/54fc679/zuul-info/inventory.yaml it didn't apply the branch restriction | 23:51 |
clarkb | it says branches: {MatchAny:{BranchMatcher:master}} so guessing that is another configuration bug? | 23:51 |
*** zhurong has joined #openstack-infra | 23:52 | |
SamYaple | it cannot find my custom playbook. https://review.openstack.org/#/c/508625/ , im at a loss | 23:53 |
mordred | clarkb: yes - re https://review.openstack.org/#/c/508519/6/zuul.d/zuul-legacy-project-templates.yaml | 23:53 |
*** caphrim007_ has quit IRC | 23:54 | |
*** caphrim007 has joined #openstack-infra | 23:54 | |
mordred | clarkb: yes! this is, in fact, part of the problem with the thing jeblair is now working to fix | 23:55 |
jeblair | yeah, i've tracked it down to project-templates getting branch matchers when they shouldn't | 23:55 |
clarkb | ok, I'll work on getting mriedem's workaround in place then | 23:56 |
jeblair | this is probably also the cause of the earlier neutron not running stuff on stable... | 23:56 |
jeblair | what workaround? | 23:56 |
openstackgerrit | Merged openstack-infra/openstack-zuul-jobs master: Fix Kolla-Kubernetes missing deps. https://review.openstack.org/508597 | 23:56 |
openstackgerrit | Merged openstack-infra/openstack-zuul-jobs master: Drop non-legacy Puppet project templates https://review.openstack.org/508333 | 23:56 |
openstackgerrit | Merged openstack-infra/openstack-zuul-jobs master: Update ubuntu-xenial-2-node to match centos-7-2-node https://review.openstack.org/508568 | 23:56 |
openstackgerrit | Merged openstack-infra/openstack-zuul-jobs master: Remove spurious change to tox.ini https://review.openstack.org/508607 | 23:56 |
clarkb | well it may not work around it now that I think about it | 23:56 |
clarkb | jeblair: but 508519 | 23:56 |
clarkb | jeblair: it cleans things up around that job | 23:56 |
jeblair | clarkb: hrm, does the nnew-newton job have a branch matcher? | 23:57 |
jeblair | nnet-newton | 23:58 |
clarkb | jeblair: on the job itself I think | 23:58 |
clarkb | rather than on a template | 23:58 |
jeblair | clarkb: hrm, i don't see one | 23:58 |
*** caphrim007 has quit IRC | 23:59 | |
*** Sukhdev has quit IRC | 23:59 |
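
Editor's note on the 23:14 to 23:15 exchange: the scheduler manages its own pid file and backgrounds itself, so after it is killed (as in the OOM event) the pid file can be left behind and has to be removed by hand before a clean start. A minimal sketch of how such a leftover file could be detected is below. It is not taken from Zuul; the function names `pid_is_running` and `pidfile_is_stale` are hypothetical, and the check assumes a POSIX system.

```python
import os


def pid_is_running(pid: int) -> bool:
    """Return True if a process with this PID currently exists (POSIX only)."""
    try:
        # Signal 0 delivers nothing; the kernel only performs the
        # existence and permission checks.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def pidfile_is_stale(path: str) -> bool:
    """Return True if the pid file exists but does not name a live process."""
    try:
        with open(path) as f:
            pid = int(f.read().strip())
    except FileNotFoundError:
        return False  # no pid file at all, so nothing is stale
    except ValueError:
        return True  # unreadable contents cannot name a live process
    if pid <= 0:
        return True  # 0 and negative values address process groups, not a daemon
    return not pid_is_running(pid)
```

A check like this only shows that some process holds the recorded PID. Because PIDs are reused, it cannot prove that the process is the original daemon, which is one reason letting the service manager track the process directly is usually more robust.
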