clarkb | does anyone understand why atomic is building skopeo/ostree as part of the image pull too? | 00:01 |
---|---|---|
mordred | no. that doesn't make any sense to me | 00:01 |
clarkb | that also seems like a less than desirable feature (I mean it's called "atomic" because it uses atomic images but then we go compile things just for fun?) | 00:01 |
mordred | I would have thought one of the benefits of this as an exploded packaging format is pre-built binaries | 00:01 |
mnaser | i think it might be a layered thing | 00:01 |
mnaser | where it commits a layer of bins on top of the existing one | 00:02 |
mnaser | i guess it's time for me to write a magnum patch | 00:02 |
mnaser | boo | 00:02 |
clarkb | mnaser: ah so the error there looks like a compile fail but really it's a write-the-objects-to-disk error? I could see that | 00:02 |
* mordred nominates mnaser for core in all openstack projects | 00:02 | |
clarkb | mordred: corvus fwiw I'm not in a spot to redeploy the cluster today. I think we need to wait for mnaser to have new images regardless | 00:03 |
clarkb | mnaser: ^ starting with that might be the easiest path forward then those who can build new cluster can redeploy | 00:03 |
mordred | clarkb: kk | 00:03 |
mnaser | yeah i think the fail is that it tries to commit the new image on top of everything | 00:03 |
mnaser | clarkb: you can technically redeploy without us making changes yet :) | 00:03 |
clarkb | mnaser: by building my own images? | 00:03 |
mnaser | nope! | 00:03 |
mnaser | one second | 00:03 |
mnaser | so when you deploy magnum clusters | 00:04 |
mnaser | theres something called "labels" which are pretty much parameters | 00:04 |
mnaser | one of them is the image tag used to deploy cluster | 00:04 |
mnaser | in this case it's called kube_Tag | 00:05 |
mnaser | https://github.com/openstack/magnum/blob/3a50a242d34f9ee02bc782629e54710d437b3d23/magnum/drivers/common/templates/kubernetes/fragments/configure-kubernetes-minion.sh#L28-L29 | 00:05 |
mnaser | which gets injected later into heat | 00:05 |
clarkb | --labels kube_Tag=1.11.5-1 ? | 00:05 |
mnaser | https://github.com/openstack/magnum/blob/c8019ea77f33609452dd1a973e0f421b118c2079/doc/source/user/index.rst#kube-tag | 00:05 |
clarkb | cool I think that will work then. Good to know | 00:06 |
mnaser | kube_tag -- all lower case, sorry about that uppercase typo | 00:06 |
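A minimal sketch of how that label would be passed at cluster-creation time, assuming the standard `openstack coe cluster create` command from python-magnumclient; the cluster and template names below are placeholders, and the tag value is the one mentioned in the discussion above.

```sh
# Hypothetical example: override kube_tag so the new cluster deploys with the
# patched Kubernetes images instead of the template's default tag.
openstack coe cluster create nodepool-k8s \
  --cluster-template k8s-atomic-template \
  --labels kube_tag=1.11.5-1 \
  --node-count 1
```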
clarkb | unless it boots the existing atomic image, then fails to pull in the new ones :P | 00:06 |
clarkb | which is what I was trying to do :P | 00:06 |
mnaser | clarkb: it won't, it will actually pull down the new ones from the get go | 00:06 |
mnaser | and i guess we'll have.. enough space in the new cluster | 00:07 |
mnaser | im guessing logs and what not probably managed to fill up the cluster | 00:07 |
clarkb | ya if the old images fit then the new ones likely will in that case | 00:07 |
clarkb | mnaser: the bulk of it is atomic itself. 4.4GB/5GB | 00:07 |
mnaser | https://hub.docker.com/r/openstackmagnum/kubernetes-apiserver/tags/ | 00:07 |
mnaser | they don't seem to be much bigger | 00:07 |
clarkb | the journal is 500MB | 00:07 |
clarkb | which is the other space in use there I think | 00:07 |
clarkb | I can probably nuke some of the journal to pull in these new images too | 00:07 |
clarkb | mnaser: so ya probably what will happen is new images will get pulled in straight away before journal fills disk. Then journal will fill disk more quickly as slightly less disk is available for it | 00:08 |
mnaser | yeah, still should be fixed tho. | 00:09 |
clarkb | ++ | 00:09 |
mnaser | but just an easier workaround to help you get going | 00:09 |
clarkb | now I want to see about pruning the journal and rerunning the pull just to test all these assumptions | 00:09 |
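A hedged sketch of that journal pruning step, assuming a systemd journal on the Atomic host; the size cap below is an arbitrary example, not the value actually used.

```sh
# Illustrative only: shrink the persistent journal to free space, then confirm
# before re-running the image pull.
sudo journalctl --vacuum-size=100M
df -h /
```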
*** munimeha1 has quit IRC | 00:09 | |
fungi | infra-root: https://review.openstack.org/621258 is now un-wip'ed and passing jobs. please review so i can remove lists.o.o from the emergency disable list | 00:11 |
clarkb | fungi: done | 00:13 |
clarkb | fwiw making progress now that journal is pruned | 00:13 |
mordred | fungi: +2 - feel free to +A whenever it's the right time | 00:14 |
mordred | or I can if it's fine to roll live right now | 00:15 |
fungi | it's fine to go live now | 00:15 |
fungi | i've already hand-patched it in | 00:15 |
fungi | because i wanted to get notifications about it sent out before utc midnight | 00:15 |
clarkb | and ya confirmed on the minion we have a bunch of things running under docker on the docker volume | 00:17 |
clarkb | I wonder why the master isn't running any of those to spread out the load | 00:18 |
clarkb | #status log clarkb upgraded the Nodepool magnum k8s cluster by pulling images and rebasing/restarting services for k8s on the master and minion nodes. Magnum doesn't support these upgrades via the API yet. Note that due to disk space issues the master node had its journal cleaned up in order to pull the new images down | 00:19 |
openstackstatus | clarkb: finished logging | 00:19 |
clarkb | we are now patched against the cve | 00:20 |
fungi | i've added a nonmember discard filter of ^[0-9]+@qq\.com$ to the openstack-discuss ml config, after seeing that one or more of the aliased old list addresses is receiving spam from random addresses matching that pattern on the order of one every few seconds | 00:21 |
clarkb | mnaser: last question of the day. Do you know if the master node runs any pod workload? | 00:22 |
mnaser | clarkb: nope | 00:22 |
clarkb | mnaser: seems like maybe it doesn't in which case the docker volume isn't useful there | 00:22 |
fungi | i'm not thrilled with that discard filter, but i don't see how the moderation queue will be even remotely manageable otherwise | 00:22 |
mnaser | clarkb: yeah indeed, i would have to check with other magnum devs | 00:22 |
* mnaser is not thrilled at how the boot from volume stuff is not super clean | 00:23 | |
mnaser | :< | 00:23 |
clarkb | fungi: as long as that doesn't prevent people with qq.com addrs from registering that is probably a reasonable compromise | 00:23 |
clarkb | mnaser: that was an exciting end of the day for me :) I learned a bunch about magnum and atomic and a little about k8s too :P | 00:23 |
mnaser | and signed me up for work! :P | 00:23 |
fungi | clarkb: they can subscribe just fine and nonmembers using non-numeric @qq.com addresses can also send and land in moderation | 00:24 |
clarkb | ah it's the numeric addrs, got it (though those seem common on that platform?) | 00:24 |
fungi | it's just the all-numeric @qq.com addresses which need to be subscribed to send | 00:24 |
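A quick illustration of what that discard pattern does and does not match; the addresses below are invented examples.

```sh
# Only the all-numeric @qq.com address matches ^[0-9]+@qq\.com$ and would be
# discarded for nonmembers; the other two still go through normal moderation.
printf '12345678@qq.com\nalice@qq.com\nbob@example.com\n' | grep -E '^[0-9]+@qq\.com$'
```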
fungi | yeah, i think they at one point matched phone numbers for converged text messaging | 00:25 |
clarkb | ah | 00:25 |
clarkb | alright I've got to step out now. I've not really gotten away from my desk at all today | 00:25 |
clarkb | my brain is sufficiently fried. But k8s is patched so yay | 00:25 |
fungi | but i'll admit i'm not super familiar with qq.com's services | 00:25 |
*** jamesmcarthur has joined #openstack-infra | 00:26 | |
*** tosky has quit IRC | 00:27 | |
fungi | looks like it's also getting some spam from empty addresses like: "sales" <> | 00:28 |
fungi | i wonder how best to filter those | 00:28 |
mordred | clarkb: I believe this now makes you the resident infra k8s expert. congrats | 00:30 |
clarkb | uh oh | 00:31 |
*** jamesmcarthur has quit IRC | 00:31 | |
fungi | good jeorb | 00:33 |
*** bdodd_ has joined #openstack-infra | 00:37 | |
*** bdodd has quit IRC | 00:37 | |
*** yamamoto has quit IRC | 00:38 | |
*** bdodd has joined #openstack-infra | 00:44 | |
*** gfidente|afk has quit IRC | 00:45 | |
pabelanger | strongbad! | 00:45 |
mnaser | clarkb: mordred https://review.openstack.org/#/c/621734/ if you wanna follow progress but threw that up to see how it breaks and i'll test locally soon | 00:46 |
*** bdodd_ has quit IRC | 00:48 | |
*** bdodd__ has joined #openstack-infra | 00:48 | |
*** bdodd has quit IRC | 00:49 | |
fungi | pabelanger: well, that was actually coach z, but yes a homestar runner reference indeed | 00:49 |
pabelanger | fungi: ah, right. Had to google again. | 00:50 |
openstackgerrit | Merged openstack-infra/system-config master: Shut down openstack general, dev, ops and sigs mls https://review.openstack.org/621258 | 00:51 |
*** sthussey has quit IRC | 01:05 | |
*** bhavikdbavishi has joined #openstack-infra | 01:06 | |
*** bdodd has joined #openstack-infra | 01:06 | |
*** bdodd__ has quit IRC | 01:07 | |
*** lbragstad has quit IRC | 01:07 | |
*** yamamoto has joined #openstack-infra | 01:10 | |
clarkb | mwhahaha: http://logs.openstack.org/41/621341/1/gate/tripleo-ci-centos-7-standalone/0157d94/logs/undercloud/home/zuul/standalone_deploy.log.txt.gz#_2018-12-03_23_21_18 possible 7.6 fallout? it's weird because it says the skew and correction are 0, but it still tries 20 times then fails | 01:14 |
mwhahaha | That's chrony not able to sync | 01:14 |
mwhahaha | We syncwait 20 | 01:14 |
mwhahaha | Ntp can be touchy | 01:15 |
clarkb | mwhahaha: ya but why did it fail if the correction and skew are 0 | 01:15 |
*** wolverineav has quit IRC | 01:15 | |
clarkb | that should mean it is in sync | 01:15 |
mwhahaha | So if it happens more than a few times it might be 7.6 fallout | 01:15 |
mwhahaha | No it might not have been synced | 01:15 |
clarkb | then the skew number is buggy? | 01:15 |
mwhahaha | It's unlikely to be 0 | 01:15 |
clarkb | mwhahaha: ya but the stdout says 0 is my point | 01:15 |
*** wolverineav has joined #openstack-infra | 01:16 | |
mwhahaha | Yeah that's the output when chronyc can't sync | 01:16 |
clarkb | by can't sync you mean it couldn't talk to an ntp server so the output is "null" | 01:17 |
clarkb | ? | 01:17 |
mwhahaha | Yea | 01:18 |
mwhahaha | It's the same thing ntp does | 01:18 |
clarkb | do you know what sources it is looking for? | 01:19 |
mwhahaha | We just switched to chrony recently but ntp initial sync can be flakey | 01:19 |
mwhahaha | Ntp.pool.org | 01:19 |
mwhahaha | We likely need to expand the source list | 01:19 |
mwhahaha | I'll check it out later | 01:19 |
mwhahaha | I think I know the cause | 01:20 |
clarkb | chronyc makestep && chronyc waitsync 20 may be worthwhile too just in case the skew is large | 01:21 |
*** yamamoto has quit IRC | 01:21 | |
clarkb | (we've seen that unfortunately on some clouds they can be hours off) | 01:21 |
*** jamesmcarthur has joined #openstack-infra | 01:24 | |
*** bhavikdbavishi has quit IRC | 01:24 | |
*** wolverineav has quit IRC | 01:26 | |
*** wolverineav has joined #openstack-infra | 01:26 | |
mwhahaha | That's what we are doing. Anyway it's because we recently dropped the pool config and aren't supplying multiple servers any more. | 01:26 |
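A rough sketch of the two fixes being discussed, assuming stock chrony tooling; the pool hostname and source count below are illustrative rather than the exact TripleO configuration.

```sh
# chrony.conf fragment (example): use a pool of servers rather than a single
# server, so one unreachable host doesn't block the initial sync.
#   pool pool.ntp.org iburst maxsources 4
#
# Then step the clock immediately and wait up to 20 tries for sync, as
# suggested above, in case the initial offset is hours off.
sudo chronyc makestep
sudo chronyc waitsync 20
```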
mwhahaha | I'll propose a patch and ping juan | 01:27 |
*** yamamoto has joined #openstack-infra | 01:28 | |
*** jamesmcarthur has quit IRC | 01:34 | |
*** jamesmcarthur has joined #openstack-infra | 01:35 | |
*** takamatsu has quit IRC | 01:37 | |
*** jamesmcarthur has quit IRC | 01:38 | |
*** hongbin has joined #openstack-infra | 01:43 | |
*** jamesmcarthur has joined #openstack-infra | 01:49 | |
*** jamesmcarthur has quit IRC | 01:51 | |
fungi | #status log used rmlist to delete the openstack, openstack-dev, openstack-operators and openstack-sigs mailing lists on lists.o.o while leaving their archives in place | 01:52 |
openstackstatus | fungi: finished logging | 01:52 |
fungi | in case they're needed, i saved the final states of the mailman configs and subscriber lists for all of those in my homedir on the server | 01:53 |
fungi | so that we won't need to extract them from backups | 01:53 |
*** jamesmcarthur has joined #openstack-infra | 01:53 | |
*** jamesmcarthur has quit IRC | 01:57 | |
*** jamesmcarthur has joined #openstack-infra | 02:04 | |
*** agopi has quit IRC | 02:04 | |
*** jamesmcarthur has quit IRC | 02:07 | |
*** jamesmcarthur has joined #openstack-infra | 02:07 | |
*** agopi has joined #openstack-infra | 02:13 | |
*** gyee has quit IRC | 02:14 | |
*** mrsoul has quit IRC | 02:18 | |
openstackgerrit | Ian Wienand proposed openstack-infra/zuul-jobs master: mirror-workspace-git-repos: Explicitly show HEAD of checked out branches https://review.openstack.org/621840 | 02:21 |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config master: Add support for enabling the ARA callback plugin in install-ansible https://review.openstack.org/611228 | 02:29 |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config master: Prefix install_openstacksdk variable https://review.openstack.org/621462 | 02:29 |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config master: [to squash] Modifications to ARA installation https://review.openstack.org/621463 | 02:29 |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config master: functional-tests: collect and publish inner ARA results https://review.openstack.org/617216 | 02:29 |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config master: Revert "Make system-config-run-base-ansible-devel non-voting" https://review.openstack.org/621847 | 02:29 |
*** betherly has joined #openstack-infra | 02:38 | |
*** bhavikdbavishi has joined #openstack-infra | 02:39 | |
*** rlandy has quit IRC | 02:40 | |
*** wolverineav has quit IRC | 02:41 | |
*** jamesmcarthur has quit IRC | 02:42 | |
*** wolverineav has joined #openstack-infra | 02:42 | |
*** betherly has quit IRC | 02:43 | |
*** jamesmcarthur has joined #openstack-infra | 02:45 | |
*** wolverineav has quit IRC | 02:47 | |
*** psachin has joined #openstack-infra | 02:48 | |
*** wolverineav has joined #openstack-infra | 02:55 | |
*** wolverineav has quit IRC | 03:01 | |
*** eernst has joined #openstack-infra | 03:11 | |
*** hongbin_ has joined #openstack-infra | 03:11 | |
*** eernst has quit IRC | 03:12 | |
*** hongbin has quit IRC | 03:13 | |
*** hongbin has joined #openstack-infra | 03:22 | |
*** dklyle has joined #openstack-infra | 03:23 | |
*** onovy has quit IRC | 03:23 | |
*** hongbin_ has quit IRC | 03:23 | |
*** onovy has joined #openstack-infra | 03:25 | |
*** david-lyle has quit IRC | 03:25 | |
openstackgerrit | kangyufei proposed openstack/boartty master: Change openstack-dev to openstack-discuss https://review.openstack.org/621938 | 03:27 |
*** armax has quit IRC | 03:28 | |
*** wolverineav has joined #openstack-infra | 03:32 | |
*** betherly has joined #openstack-infra | 03:33 | |
*** betherly has quit IRC | 03:38 | |
openstackgerrit | Merged openstack-infra/nodepool master: Set pool for error'ed instances https://review.openstack.org/621681 | 03:42 |
*** ramishra has joined #openstack-infra | 03:46 | |
*** dave-mccowan has quit IRC | 03:53 | |
*** ykarel|away has joined #openstack-infra | 03:58 | |
*** mriedem_away has quit IRC | 03:58 | |
*** sshnaidm is now known as sshnaidm|afk | 04:04 | |
*** auristor has quit IRC | 04:05 | |
*** auristor has joined #openstack-infra | 04:05 | |
*** jamesmcarthur has quit IRC | 04:11 | |
tonyb | tobias-urdin: Sorry I let that slip off my radar. I'll queue it up for tomorrow | 04:19 |
*** ykarel|away has quit IRC | 04:24 | |
*** ykarel|away has joined #openstack-infra | 04:25 | |
*** ykarel|away is now known as ykarel | 04:27 | |
*** jamesmcarthur has joined #openstack-infra | 04:39 | |
*** agopi_ has joined #openstack-infra | 05:00 | |
*** agopi has quit IRC | 05:00 | |
*** diablo_rojo has quit IRC | 05:05 | |
*** agopi__ has joined #openstack-infra | 05:09 | |
*** agopi__ is now known as agopi | 05:10 | |
*** agopi_ has quit IRC | 05:12 | |
*** janki has joined #openstack-infra | 05:22 | |
*** hongbin has quit IRC | 05:39 | |
openstackgerrit | Manik Bindlish proposed openstack/os-testr master: Change openstack-dev to openstack-discuss https://review.openstack.org/622002 | 05:49 |
*** gcb_ has joined #openstack-infra | 06:06 | |
*** gcb_ has quit IRC | 06:11 | |
*** betherly has joined #openstack-infra | 06:13 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: web: update status page layout to 4 columns https://review.openstack.org/622010 | 06:15 |
*** betherly has quit IRC | 06:17 | |
*** ahosam has joined #openstack-infra | 06:22 | |
ianw | hrm, another weird ansible error on the devel branch job -> fatal: [trusty]: FAILED! => {"msg": "the connection plugin 'ssh' was not found"} | 06:32 |
*** apetrich has quit IRC | 06:37 | |
*** yboaron_ has joined #openstack-infra | 06:46 | |
*** jamesmcarthur has quit IRC | 06:47 | |
ianw | #status log fixed emergency file to re-enable bridge.o.o puppet runs (which stopped in http://grafana.openstack.org/d/qzQ_v2oiz/bridge-runtime?orgId=1&from=1543888040274&to=1543889448699) | 06:52 |
openstackstatus | ianw: finished logging | 06:52 |
ianw | it doesn't seem the switch to static inventory has happened yet, probably because of the broken runs. i guess it will take two more runs to fully deploy | 06:53 |
ianw | infra-root: ^ worth keeping an eye on | 06:53 |
*** takamatsu has joined #openstack-infra | 06:53 | |
*** wolverineav has quit IRC | 06:55 | |
*** wolverineav has joined #openstack-infra | 06:56 | |
*** quiquell|off is now known as quiquell | 06:59 | |
*** pcaruana has joined #openstack-infra | 07:10 | |
*** prometheanfire has quit IRC | 07:11 | |
*** AJaeger has quit IRC | 07:11 | |
*** prometheanfire has joined #openstack-infra | 07:13 | |
*** AJaeger has joined #openstack-infra | 07:15 | |
*** ahosam has quit IRC | 07:21 | |
*** apetrich has joined #openstack-infra | 07:24 | |
tobias-urdin | tonyb: thank you, got lost on me as well :) | 07:28 |
*** dpawlik has joined #openstack-infra | 07:29 | |
*** ykarel_ has joined #openstack-infra | 07:29 | |
*** ykarel has quit IRC | 07:31 | |
*** rcernin has quit IRC | 07:38 | |
*** jtomasek has joined #openstack-infra | 07:41 | |
*** wolverineav has quit IRC | 07:45 | |
*** wolverineav has joined #openstack-infra | 07:47 | |
*** quiquell is now known as quiquell|brb | 07:50 | |
*** wolverineav has quit IRC | 07:52 | |
*** adriancz has joined #openstack-infra | 07:57 | |
*** longkb has joined #openstack-infra | 08:00 | |
*** chkumar|off is now known as chandan_kumar | 08:05 | |
*** quiquell|brb is now known as quiquell | 08:07 | |
*** takamatsu has quit IRC | 08:15 | |
*** ralonsoh has joined #openstack-infra | 08:20 | |
*** gema has joined #openstack-infra | 08:28 | |
*** ykarel_ is now known as ykarel | 08:29 | |
*** ianychoi has quit IRC | 08:30 | |
*** ianychoi has joined #openstack-infra | 08:30 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Set type for error'ed instances https://review.openstack.org/622101 | 08:44 |
*** jamesmcarthur has joined #openstack-infra | 08:44 | |
*** takamatsu has joined #openstack-infra | 08:46 | |
*** cgoncalves has joined #openstack-infra | 08:48 | |
*** jamesmcarthur has quit IRC | 08:50 | |
*** tosky has joined #openstack-infra | 08:51 | |
*** ykarel is now known as ykarel|lunch | 08:51 | |
*** jpena|off is now known as jpena | 08:55 | |
*** agopi has quit IRC | 08:56 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Set id for error'ed instances https://review.openstack.org/622108 | 08:56 |
*** agopi has joined #openstack-infra | 08:56 | |
*** ahosam has joined #openstack-infra | 08:57 | |
*** jamesmcarthur has joined #openstack-infra | 08:58 | |
*** wolverineav has joined #openstack-infra | 08:59 | |
*** jamesmcarthur has quit IRC | 09:02 | |
*** jpich has joined #openstack-infra | 09:02 | |
*** agopi_ has joined #openstack-infra | 09:06 | |
*** agopi has quit IRC | 09:09 | |
*** rascasoft has quit IRC | 09:09 | |
*** rascasoft has joined #openstack-infra | 09:10 | |
*** takamatsu has quit IRC | 09:14 | |
*** shardy has joined #openstack-infra | 09:15 | |
*** shardy has quit IRC | 09:16 | |
*** takamatsu has joined #openstack-infra | 09:16 | |
*** shardy has joined #openstack-infra | 09:16 | |
*** agopi_ is now known as agopi | 09:22 | |
*** ahosam has quit IRC | 09:22 | |
*** lbragstad has joined #openstack-infra | 09:24 | |
*** ykarel|lunch is now known as ykarel | 09:26 | |
*** priteau has joined #openstack-infra | 09:30 | |
*** d0ugal has quit IRC | 09:31 | |
*** aojea has joined #openstack-infra | 09:32 | |
*** jamesmcarthur has joined #openstack-infra | 09:33 | |
*** jamesmcarthur has quit IRC | 09:38 | |
*** e0ne has joined #openstack-infra | 09:44 | |
*** sshnaidm|afk is now known as sshnaidm | 09:46 | |
*** takamatsu has quit IRC | 09:49 | |
*** takamatsu has joined #openstack-infra | 09:50 | |
*** derekh has joined #openstack-infra | 09:51 | |
*** yamamoto has quit IRC | 09:59 | |
openstackgerrit | Jonathan Rosser proposed openstack-infra/project-config master: Add centos/suse to OSA grafana dashboard https://review.openstack.org/622169 | 10:00 |
*** gfidente has joined #openstack-infra | 10:02 | |
*** dtantsur|afk is now known as dtantsur | 10:07 | |
*** bhavikdbavishi1 has joined #openstack-infra | 10:08 | |
openstackgerrit | Manik Bindlish proposed openstack/os-performance-tools master: Change openstack-dev to openstack-discuss https://review.openstack.org/622173 | 10:08 |
*** yamamoto has joined #openstack-infra | 10:08 | |
*** yamamoto has quit IRC | 10:08 | |
openstackgerrit | Quique Llorente proposed openstack-infra/zuul master: Add default value for relative_priority https://review.openstack.org/622175 | 10:09 |
*** bhavikdbavishi has quit IRC | 10:09 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 10:09 | |
*** takamatsu has quit IRC | 10:13 | |
*** ralonsoh has quit IRC | 10:23 | |
*** ralonsoh has joined #openstack-infra | 10:23 | |
*** ahosam has joined #openstack-infra | 10:29 | |
*** electrofelix has joined #openstack-infra | 10:34 | |
*** jamesmcarthur has joined #openstack-infra | 10:34 | |
*** bhavikdbavishi1 has joined #openstack-infra | 10:44 | |
*** bhavikdbavishi has quit IRC | 10:48 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 10:48 | |
*** takamatsu has joined #openstack-infra | 10:50 | |
ssbarnea|rover | I observed some spam in job logs which looks like it's related to the zuul logger, with "waiting for logger", like http://logs.openstack.org/99/618899/3/gate/tripleo-ci-centos-7-standalone/d0e6fe3/job-output.txt.gz | 10:52 |
ssbarnea|rover | is this one-off or a more serious problem? | 10:52 |
*** bhavikdbavishi has quit IRC | 10:52 | |
*** jpena is now known as jpena|brb | 10:54 | |
ssbarnea|rover | created as https://review.openstack.org/#/c/622187/ | 10:57 |
openstackgerrit | Sorin Sbarnea proposed openstack-infra/elastic-recheck master: Query: [primary] Waiting for logger https://review.openstack.org/622210 | 11:01 |
*** takamatsu has quit IRC | 11:04 | |
*** takamatsu has joined #openstack-infra | 11:05 | |
*** ahosam has quit IRC | 11:05 | |
*** ahosam has joined #openstack-infra | 11:07 | |
*** yboaron_ has quit IRC | 11:15 | |
*** yboaron_ has joined #openstack-infra | 11:15 | |
*** yamamoto has joined #openstack-infra | 11:22 | |
*** longkb has quit IRC | 11:22 | |
*** yamamoto has quit IRC | 11:35 | |
*** e0ne has quit IRC | 11:35 | |
*** e0ne has joined #openstack-infra | 11:38 | |
*** bhavikdbavishi has joined #openstack-infra | 11:54 | |
*** vabada has joined #openstack-infra | 11:59 | |
*** vabada has quit IRC | 11:59 | |
*** vabada has joined #openstack-infra | 12:00 | |
*** gfidente has quit IRC | 12:06 | |
*** yboaron_ has quit IRC | 12:07 | |
*** yboaron_ has joined #openstack-infra | 12:12 | |
*** dave-mccowan has joined #openstack-infra | 12:12 | |
*** d0ugal has joined #openstack-infra | 12:16 | |
*** ahosam has quit IRC | 12:19 | |
*** lbragstad has quit IRC | 12:21 | |
*** lbragstad has joined #openstack-infra | 12:22 | |
*** lbragstad has quit IRC | 12:23 | |
*** lbragstad has joined #openstack-infra | 12:24 | |
*** lbragsta_ has joined #openstack-infra | 12:26 | |
*** ykarel is now known as ykarel|afk | 12:29 | |
*** lbragsta_ has quit IRC | 12:31 | |
*** lbragstad has quit IRC | 12:31 | |
*** quiquell is now known as quiquell|lunch | 12:33 | |
pabelanger | ssbarnea|rover: that usually happens when the zuul_console port on the remote node is closed, and zuul is not able to stream logs | 12:35 |
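A hedged way to check for that condition from the executor or another host; the port number is assumed to be the zuul_console default, and the hostname is a placeholder.

```sh
# If the console streamer port on the test node is closed or filtered, the job
# output shows repeated "Waiting for logger" lines while the executor retries.
nc -zv <remote-node-ip> 19885   # 19885 assumed as the zuul_console default port
```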
*** lbragstad has joined #openstack-infra | 12:38 | |
*** jamesmcarthur has quit IRC | 12:39 | |
*** Douhet has quit IRC | 12:41 | |
*** jpena|brb is now known as jpena|lunch | 12:42 | |
*** Douhet has joined #openstack-infra | 12:42 | |
*** ykarel|afk is now known as ykarel | 12:43 | |
*** tpsilva has joined #openstack-infra | 12:44 | |
*** ahosam has joined #openstack-infra | 12:47 | |
*** bhavikdbavishi has quit IRC | 12:50 | |
*** ykarel is now known as ykarel|afk | 12:51 | |
*** bhavikdbavishi has joined #openstack-infra | 12:56 | |
*** psachin has quit IRC | 12:59 | |
*** udesale has joined #openstack-infra | 13:00 | |
*** ahosam has quit IRC | 13:01 | |
*** rh-jelabarre has joined #openstack-infra | 13:01 | |
*** boden has joined #openstack-infra | 13:05 | |
*** jcoufal has joined #openstack-infra | 13:08 | |
openstackgerrit | Filippo Inzaghi proposed openstack-dev/hacking master: Change openstack-dev to openstack-discuss https://review.openstack.org/622317 | 13:08 |
*** sshnaidm is now known as sshnaidm|afk | 13:14 | |
*** jtomasek_ has joined #openstack-infra | 13:15 | |
openstackgerrit | Filippo Inzaghi proposed openstack-dev/pbr master: Change openstack-dev to openstack-discuss https://review.openstack.org/622321 | 13:18 |
*** jtomasek has quit IRC | 13:18 | |
*** rlandy has joined #openstack-infra | 13:27 | |
*** bhavikdbavishi has quit IRC | 13:29 | |
*** yboaron_ has quit IRC | 13:29 | |
*** psachin has joined #openstack-infra | 13:38 | |
*** kgiusti has joined #openstack-infra | 13:40 | |
*** priteau has quit IRC | 13:40 | |
*** jpena|lunch is now known as jpena | 13:40 | |
*** priteau has joined #openstack-infra | 13:42 | |
*** quiquell|lunch is now known as quiquell | 13:44 | |
*** jaosorior has joined #openstack-infra | 13:45 | |
*** jamesmcarthur has joined #openstack-infra | 13:46 | |
openstackgerrit | Filippo Inzaghi proposed openstack-infra/bindep master: hange openstack-dev to openstack-discuss https://review.openstack.org/622325 | 13:46 |
*** eharney has joined #openstack-infra | 13:47 | |
openstackgerrit | Filippo Inzaghi proposed openstack-infra/elastic-recheck master: Change openstack-dev to openstack-discuss https://review.openstack.org/622326 | 13:49 |
*** janki has quit IRC | 13:50 | |
openstackgerrit | Filippo Inzaghi proposed openstack-infra/gear master: Change openstack-dev to openstack-discuss https://review.openstack.org/622327 | 13:51 |
*** jamesmcarthur has quit IRC | 13:51 | |
*** psachin has quit IRC | 13:51 | |
*** gfidente has joined #openstack-infra | 13:55 | |
openstackgerrit | Filippo Inzaghi proposed openstack-infra/git-review master: Change openstack-infra to openstack-discuss https://review.openstack.org/622328 | 13:57 |
e0ne | hi. could anybody please help me why CI didn't start for https://review.openstack.org/#/c/580469/? | 13:58 |
e0ne | I've got the same issue with the related patch | 13:58 |
e0ne | https://review.openstack.org/#/c/601687/3 | 13:58 |
*** sshnaidm|afk is now known as sshnaidm | 13:59 | |
frickler | e0ne: you have a dependency cycle between these two | 14:00 |
openstackgerrit | Filippo Inzaghi proposed openstack-infra/glean master: Change openstack-dev to openstack-discuss https://review.openstack.org/622329 | 14:00 |
frickler | e0ne: one depends on the other, which is rebased onto the first | 14:00 |
e0ne | frickler: thanks! I missed this somehow :( | 14:01 |
*** sthussey has joined #openstack-infra | 14:02 | |
*** udesale has quit IRC | 14:08 | |
openstackgerrit | Merged openstack-infra/gear master: Change openstack-dev to openstack-discuss https://review.openstack.org/622327 | 14:14 |
*** irclogbot_1 has quit IRC | 14:15 | |
dmsimard | ianw: meant to ping you last night for the ara things but got sidetracked, ping me when you're around ? | 14:16 |
*** larainema has joined #openstack-infra | 14:25 | |
*** janki has joined #openstack-infra | 14:29 | |
*** mriedem has joined #openstack-infra | 14:33 | |
dmsimard | ianw: most notably, your review makes me wonder at what point should we perhaps consider a standard "install-ansible" zuul-jobs role | 14:33 |
*** ginopc has quit IRC | 14:33 | |
*** zul has quit IRC | 14:37 | |
openstackgerrit | Filippo Inzaghi proposed openstack-infra/grafyaml master: Change openstack-infra to openstack-discuss https://review.openstack.org/622338 | 14:38 |
fungi | skimming the zuul status page, looks like things are going really well | 14:40 |
*** ginopc has joined #openstack-infra | 14:41 | |
fungi | tripleo and nova have burned a lot of their tokens and have changes waiting for hours to get node assignments, but other lower-activity projects are getting nodes straight away with no problem | 14:41 |
*** zul has joined #openstack-infra | 14:41 | |
*** bhavikdbavishi has joined #openstack-infra | 14:43 | |
cmurphy | is the asterisk server okay? it's been a while since I've used it so it might just be me but calling into it with a sip client or telephone gives me a busy signal | 14:44 |
*** janki has quit IRC | 14:44 | |
fungi | server's reachable over ssh | 14:44 |
fungi | asterisk process is running since september 27 | 14:45 |
*** xek has quit IRC | 14:45 | |
fungi | i don't have a sip client handy but can confirm the dial-in number is returning busy | 14:46 |
fungi | i wonder if the account for the dial-in trunk has run out of prepaid minutes again | 14:46 |
fungi | i know clarkb was talking to the osf finance person at the summit about getting a different account set up which bills the foundation directly | 14:47 |
fungi | but i don't recall where that got to | 14:47 |
openstackgerrit | Filippo Inzaghi proposed openstack-infra/infra-manual master: Change openstack-infra to openstack-discuss https://review.openstack.org/622341 | 14:48 |
fungi | nothing jumping out at me in the asterisk service logs | 14:48 |
fungi | interesting that sip connectivity would also be broken as i didn't think that relied on the dial-in account | 14:49 |
openstackgerrit | Jens Harbott (frickler) proposed openstack-infra/project-config master: Add placement into integrated queue https://review.openstack.org/622342 | 14:50 |
cmurphy | it's possible my client is broken, it turns out opensuse doesn't have an official jitsi package and the rpm i downloaded from jitsi might be meant for redhat | 14:50 |
cmurphy | but that's why i was hoping dialing in would work | 14:50 |
fungi | i could probably install jitsi on my workstation but don't have a speaker/microphone to test it with | 14:51 |
cmurphy | for me the connection timed out without reaching the server so you'd still be able to see if it gets past that | 14:51 |
fungi | interesting. that could be a routing or packet filtering problem too | 14:52 |
amorin | hey all | 14:52 |
fungi | hey amorin, did you manage to confirm the disk performance strangeness we were seeing in bhs1? | 14:53 |
*** udesale has joined #openstack-infra | 14:53 | |
*** irclogbot_1 has joined #openstack-infra | 14:53 | |
amorin | I am starting working on it | 14:54 |
amorin | so I'd like to be sure | 14:54 |
amorin | is the flavor ssd-osFoundation-3 ? | 14:54 |
fungi | checking | 14:54 |
amorin | how did you test the IO ? | 14:54 |
cmurphy | the message in jitsi when i use the address sip:conference@pbx.openstack.org is "The remote party has not replied!The call will be disconnected" | 14:54 |
*** eharney has quit IRC | 14:56 | |
fungi | amorin: https://git.openstack.org/cgit/openstack-infra/project-config/tree/nodepool/nl04.openstack.org.yaml says we are booting all our images in bhs1 with flavor ssd-osFoundation-3 | 14:56 |
amorin | ok | 14:56 |
amorin | I am booting right now an image with debian 9 | 14:56 |
amorin | how are you measuring IO? | 14:56 |
amorin | using fio? | 14:56 |
fungi | amorin: we were just trying a basic dd from /dev/zero to a file on the rootfs with a blocksize around 1MB and file size around 4GB to see the write speed. factor of 20x performance difference between equivalent instances in gra1 and bhs1 | 14:57 |
amorin | ok | 14:57 |
fungi | amorin: it first came to our attention that jobs which preallocate a non-sparse swapfile were taking an additional 10+ minutes in bhs1 waiting on that step to complete, which is what got us digging deeper on possible disk performance differences there | 14:58 |
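For context, a sketch of the kind of non-sparse swapfile preallocation those jobs do; the size is an example, not the exact value the jobs use.

```sh
# Non-sparse preallocation writes every block, so creation time tracks raw disk
# write speed -- hence the extra 10+ minutes when the disk is slow.
sudo dd if=/dev/zero of=/swapfile bs=1M count=8192   # 8G example size
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
```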
*** zul has quit IRC | 14:58 | |
fungi | but more generally it seems to also be impacting the speed at which any i/o-bound operations complete (package installation, log flushing, et cetera) | 14:59 |
*** florianf is now known as florianf|biab | 14:59 | |
amorin | make sense if disk is slowing down the whole instance | 14:59 |
fungi | so a far higher proportion of jobs running in bhs1 were basically hitting their configured timeouts before they could complete | 15:00 |
fungi | i did some rough analysis of job timeouts based on our logstash data, and jobs running in bhs1 were 20x more likely to timeout than jobs running in gra1 | 15:01 |
fungi | interesting that the proportion there was roughly the same as the difference in disk write speed... i wouldn't have expected quite so close a correlation | 15:01 |
fungi | and that was after we halved our max-servers there to account for any possible cpu contention due to the 2:1 oversubscription ratio | 15:02 |
*** dpawlik has quit IRC | 15:04 | |
fungi | cmurphy: ahh, okay then that error very well may indicate some trouble with the service itself. i'll see if i can repeat your findings from here | 15:04 |
cmurphy | thanks fungi | 15:04 |
fungi | cmurphy: on a related note, i didn't see puppet log making any local changes on lists.katacontainers.io | 15:05 |
fungi | so that does indeed seem to have been a no-op | 15:06 |
fungi | as hoped | 15:06 |
cmurphy | fungi: that's good | 15:06 |
cmurphy | fungi: there's another patch in the same state that needs a recheck https://review.openstack.org/615656 | 15:07 |
fungi | cmurphy: wow, so it looks like there is no jitsi package in debian these days. even ubuntu seems to have dropped it from universe after 14.04lts (likely because they were importing it from debian?) | 15:08 |
cmurphy | fungi: oh dear :( | 15:08 |
fungi | researching a bit now to see if i can tell why | 15:08 |
cmurphy | their website does seem to dissuade people from using it anymore https://jitsi.org/ | 15:09 |
cmurphy | it's "legacy" | 15:09 |
openstackgerrit | Merged openstack-infra/project-config master: Set placement's gate queue to integrated https://review.openstack.org/621267 | 15:09 |
openstackgerrit | Sorin Sbarnea proposed openstack-infra/elastic-recheck master: Query: HTTPConnectionPool(host='tempest-sendmail.tripleo.org', https://review.openstack.org/622352 | 15:10 |
fungi | cmurphy: yeah, looks like it was dropped from debian/unstable over a year ago after it failed to meet qa requirements for either of the previous two stable debian releases | 15:11 |
pabelanger | cmurphy: I can try to debug pbx.o.o if you'd like | 15:11 |
pabelanger | if you give me 1 min, you can try your call again | 15:11 |
fungi | pabelanger: that would be lovely if you have a moment! | 15:11 |
cmurphy | thanks pabelanger | 15:11 |
pabelanger | cmurphy: okay, try your call again | 15:12 |
fungi | pabelanger: what did you find? looks like you restarted the service | 15:13 |
cmurphy | pabelanger: "Initiating call" | 15:13 |
pabelanger | fungi: I think there was a deadlock in asterisk, when I tried to enable sip debugs, the CLI froze | 15:14 |
*** jamesmcarthur has joined #openstack-infra | 15:14 | |
cmurphy | "The remote party has not replied!The call will be disconnected" | 15:14 |
fungi | i'm able to dial into the bridge | 15:14 |
pabelanger | cmurphy: oh, I think you are using ipv6 | 15:14 |
pabelanger | let me see if we have pbx.o.o setup for that | 15:14 |
pabelanger | cmurphy: if you can confirm on your client side | 15:15 |
fungi | looks like the server itself is reachable on its v6 address | 15:15 |
pabelanger | Yah, we have it enabled. I am unsure if it was ever tested / worked | 15:15 |
openstackgerrit | Sorin Sbarnea proposed openstack-infra/elastic-recheck master: Query: [primary] Waiting for logger https://review.openstack.org/622210 | 15:15 |
cmurphy | pabelanger: when i ping pbx.openstack.org it uses the ipv4 address | 15:15 |
pabelanger | Call-ID: 5678768139207f7e86d484327d1687b6@0:0:0:0:0:0:0:0 | 15:16 |
*** mriedem is now known as mriedem_afk | 15:16 | |
pabelanger | I suspect that is the issue, let me see if I can fix | 15:16 |
fungi | cmurphy: your ping utility may default to v4 and need a -6 option or to be invoked as ping6 instead | 15:16 |
cmurphy | i think I changed one of my network configs to favor ipv4 always because my office was having issues | 15:16 |
cmurphy | not sure how to tell what jitsi is using, it doesn't log the address | 15:17 |
openstackgerrit | Sorin Sbarnea proposed openstack-infra/elastic-recheck master: Query: HTTPConnectionPool(host='tempest-sendmail.tripleo.org', https://review.openstack.org/622352 | 15:17 |
fungi | cmurphy: you could try connecting by ip address instead of hostname, probably? | 15:17 |
cmurphy | will try that | 15:17 |
pabelanger | cmurphy: yah, I can see your client is not replying to our 200 OK | 15:17 |
fungi | sip:6000@23.253.226.32 | 15:17 |
openstackgerrit | Filippo Inzaghi proposed openstack-infra/infra-specs master: Change openstack-infra to openstack-discuss https://review.openstack.org/622355 | 15:17 |
fungi | pabelanger: any idea what a better alternative to jitsi is we could recommend? looks like it's basically abandoned upstream and been dropped from most modern distros | 15:18 |
pabelanger | fungi: I'd have to look, I'm a few years out of date on SIP clients sadly. | 15:19 |
cmurphy | same result with the ipv4 address | 15:19 |
fungi | pabelanger: no worries, thought you might know off the top of your head | 15:19 |
pabelanger | cmurphy: okay if I post your IP address here? If not, I can PM | 15:19 |
corvus | fungi: should i pay the phone bill again? | 15:19 |
cmurphy | pabelanger: yeah i think it's fine | 15:19 |
openstackgerrit | Merged openstack-infra/infra-manual master: Change openstack-infra to openstack-discuss https://review.openstack.org/622341 | 15:19 |
pabelanger | fungi: At one point, I wanted to build a very simple webrtc app that we hosted on pbx.o.o (or some other server) and people just used their browsers | 15:19 |
pabelanger | cmurphy: From: "cmurphy" <sip:cmurphy@195.135.221.2:60998;transport=udp>;tag=3e41c1c2 | 15:19 |
pabelanger | cmurphy: is that your current IP? | 15:20 |
fungi | corvus: seems not to have been the problem this time | 15:20 |
cmurphy | pabelanger: yep looks like | 15:20 |
corvus | fungi: well, the account balance is low; if it's not the problem now, it will be shortly | 15:20 |
cmurphy | maybe a firewall issue in my office? | 15:20 |
corvus | looks like we have $7.26 | 15:21 |
ssbarnea|rover | can someone help me with few elastic-recheck reviews? https://review.openstack.org/#/q/project:openstack-infra/elastic-recheck+status:open+is:mergeable | 15:21 |
fungi | cmurphy: sip can be sensitive to some sorts of address translation and state tracking | 15:21 |
pabelanger | cmurphy: for some reason, you are not getting our replies to your invite. You could try using TCP for your client, and see if your firewall is better with those packets | 15:21 |
pabelanger | usually in this case, I would say look to your firewall and see why the packets are not getting to your client | 15:22 |
corvus | fwiw i can sip connect | 15:22 |
fungi | reading up at https://jitsi.org/ it seems they've basically switched focus to webrtc-based video conferencing tools and ceased development on sip | 15:22 |
pabelanger | corvus: Yup, see your attempt | 15:22 |
openstackgerrit | Filippo Inzaghi proposed openstack-infra/jeepyb master: Change openstack-infra to openstack-discuss https://review.openstack.org/622357 | 15:22 |
cmurphy | pabelanger: corvus okay then it is definitely me, i can try to figure out how to switch to tcp and/or try again from home | 15:22 |
corvus | and i can dial in on pstn | 15:23 |
openstackgerrit | David Moreau Simard proposed openstack-infra/puppet-openstackci master: Add AFS mirror support https://review.openstack.org/529376 | 15:23 |
fungi | yeah, i was able to dial in with my phone after pabelanger restarted asterisk | 15:23 |
pabelanger | fungi: cmurphy: I still use ekiga as a client, maybe see if you have a package for that | 15:23 |
cmurphy | pabelanger: i do! excellent | 15:23 |
pabelanger | great | 15:24 |
fungi | debian seems to have dropped ekiga from unstable as well, but more recently than jitsi. checking why | 15:24 |
cmurphy | and dialing in now works | 15:24 |
pabelanger | Yah, wouldn't surprise me that everybody is switching to webrtc | 15:25 |
openstackgerrit | David Moreau Simard proposed openstack-infra/puppet-openstackci master: Add AFS mirror support for RHEL/CentOS https://review.openstack.org/528739 | 15:25 |
fungi | ekiga was removed from debian/unstable two weeks ago at the maintainer's request, noting it's been abandoned upstream since 2013 and hadn't been updated with support for newer ptlib, so blocking removal of openssl 1.0 | 15:26 |
pabelanger | Yah, so I think there is a deadlock in chan_sip for some reason, not worth debugging at this point. We are running an old version of asterisk, and we also need to upgrade to bionic. | 15:26 |
fungi | https://bugs.debian.org/911593 | 15:26 |
openstack | Debian bug 911593 in ftp.debian.org "RM: ekiga -- ROM; RoQA; unmaintained, depends on ptlib which depends on openssl1.0" [Normal,Open] | 15:26 |
openstackgerrit | David Moreau Simard proposed openstack-infra/puppet-openstackci master: Add AFS mirror support for RHEL/CentOS https://review.openstack.org/528739 | 15:28 |
openstackgerrit | Filippo Inzaghi proposed openstack-infra/openstack-zuul-jobs master: Change openstack-infra to openstack-discuss https://review.openstack.org/622360 | 15:30 |
*** xek has joined #openstack-infra | 15:30 | |
amorin | fungi: can you give me the dd command you used ? | 15:34 |
amorin | I cant reproduce | 15:34 |
amorin | I have even better perf on BHS1 than on GRA1 | 15:34 |
openstackgerrit | Filippo Inzaghi proposed openstack-infra/openstackid master: Change openstack-infra to openstack-discuss https://review.openstack.org/622362 | 15:35 |
fungi | dd if=/dev/zero of=foo bs=1M count=4096 | 15:35 |
fungi | amorin: it's possible we were seeing i/o competition with other instances actively running jobs. we were testing while the region was still in use so now that we've turned it off completely that could explain why you don't see it | 15:36 |
amorin | yes maybe | 15:37 |
fungi | we could turn our max-servers for bhs1 back up to what we were running before and see if you can reproduce then, i suppose | 15:38 |
openstackgerrit | Filippo Inzaghi proposed openstack-infra/os-loganalyze master: Change openstack-dev to openstack-discuss https://review.openstack.org/622363 | 15:38 |
openstackgerrit | Filippo Inzaghi proposed openstack-infra/project-config master: Change openstack-infra to openstack-discuss https://review.openstack.org/622365 | 15:40 |
*** dpawlik has joined #openstack-infra | 15:40 | |
dmsimard | fungi: iteratively perhaps ? like not necessary 150 out of the gate | 15:40 |
dmsimard | necessarily* | 15:40 |
*** zul has joined #openstack-infra | 15:41 | |
fungi | well, if we *do* want to crank it back up to 159 i can simply un-wip https://review.openstack.org/621251 | 15:41 |
amorin | wait a little | 15:42 |
fungi | but sure, we can also try going back to 79 instead if that would help | 15:42 |
fungi | just let us know how we can assist | 15:42 |
*** florianf|biab is now known as florianf | 15:43 | |
*** graphene has joined #openstack-infra | 15:43 | |
*** dpawlik has quit IRC | 15:45 | |
openstackgerrit | Filippo Inzaghi proposed openstack-infra/python-storyboardclient master: Change openstack-dev to openstack-discuss https://review.openstack.org/622368 | 15:45 |
clarkb | fungi: cmurphy: I did bring it up with scott and he had said he would look at the options. But I don't know if he has done that and if he has if there were conclusions | 15:46 |
fungi | clarkb: do you still need me to chair the infra/opendev meeting today, or do you expect to be around for that? | 15:48 |
clarkb | fungi: I will be around | 15:49 |
clarkb | this cold won't go away so I'm hiding at home | 15:49 |
fungi | sorry to hear that! | 15:49 |
openstackgerrit | Filippo Inzaghi proposed openstack-infra/statusbot master: Change openstack-infra to openstack-discuss https://review.openstack.org/622375 | 15:51 |
openstackgerrit | Filippo Inzaghi proposed openstack-infra/storyboard master: Change openstack-dev to openstack-discuss https://review.openstack.org/622377 | 15:52 |
*** ykarel|afk is now known as ykarel | 15:53 | |
amorin | we are doing iotune on the flavor | 15:53 |
amorin | the thing is that we are setting the same value for both bhs1 and gra1 | 15:53 |
amorin | but I think this is not needed at all on your flavors | 15:54 |
amorin | you are alone on the hosts so | 15:54 |
amorin | I will try to disable this | 15:54 |
*** kjackal has quit IRC | 15:55 | |
openstackgerrit | Filippo Inzaghi proposed openstack-infra/system-config master: Change openstack-infra to openstack-discuss https://review.openstack.org/622380 | 15:55 |
*** kjackal has joined #openstack-infra | 15:55 | |
clarkb | fungi: I see you are already -2ing some of those ^ | 15:56 |
openstackgerrit | Filippo Inzaghi proposed openstack-infra/yaml2ical master: Change openstack-infra to openstack-discuss https://review.openstack.org/622382 | 15:56 |
fungi | yep | 15:57 |
clarkb | I'm looking at the zuul status page and trying to decipher if the new behavior is doing what we expect. One thing I notice is that nova has a large backlog in check (not completely unexpected) but the tripleo projects don't | 15:57 |
clarkb | I'm guessing that is because tripleo while using many resources is spread out among project repos? | 15:58 |
clarkb | its possible this may need further tuning | 15:58 |
fungi | tripleo-heat-templates had a large backlog earlier | 15:58 |
fungi | and tripleo-ci a modest backlog as well | 15:58 |
clarkb | nova has ~31 changes in check | 15:59 |
clarkb | ~20 for tripleo-heat-templates | 15:59 |
clarkb | (I don't know that the behavior is wrong, mostly just trying to make sense of my observations at this point) | 16:00 |
openstackgerrit | Filippo Inzaghi proposed openstack-infra/zuul-base-jobs master: Change openstack-infra to openstack-discuss https://review.openstack.org/622386 | 16:01 |
openstackgerrit | Filippo Inzaghi proposed openstack-infra/zuul-jobs master: Change openstack-infra to openstack-discuss https://review.openstack.org/622387 | 16:02 |
openstackgerrit | Filippo Inzaghi proposed openstack-infra/zuul-sphinx master: Change openstack-infra to openstack-discuss https://review.openstack.org/622388 | 16:03 |
corvus | fungi: https://review.openstack.org/622341 | 16:04 |
corvus | AJaeger: ^ that change may not be trivial | 16:04 |
openstackgerrit | Jeremy Stanley proposed openstack-infra/infra-manual master: Revert "Change openstack-infra to openstack-discuss" https://review.openstack.org/622391 | 16:05 |
clarkb | heh apparently people were twittering about 15h queue delays | 16:06 |
clarkb | if only people would use the mailing list :/ | 16:06 |
fungi | corvus: thanks for spotting | 16:06 |
clarkb | ah its jd's thing https://twitter.com/openstackstatus | 16:07 |
clarkb | ? | 16:07 |
amorin | fungi: I tuned the flavor to improve iotune, this is far better now | 16:08 |
clarkb | jd_: fwiw we changed how job assignments are prioritized to give greater weight to projects that use less resources | 16:08 |
amorin | can you turn booting in this region back on? | 16:08 |
clarkb | jd_: so while the ~15h or whatever delay is accurate for some projects it doesn't paint the whole picture anymore | 16:08 |
amorin | root@amorin-bhs1:~# dd if=/dev/zero of=foo bs=1M count=4096 | 16:08 |
amorin | 4096+0 records in | 16:08 |
amorin | 4096+0 records out | 16:08 |
amorin | 4294967296 bytes (4.3 GB, 4.0 GiB) copied, 8.85547 s, 485 MB/s | 16:08 |
clarkb | amorin: that does look much better, thanks! | 16:09 |
fungi | jd_: specifically, we now dynamically round-robin job resources between gate queues so it's no longer a global fifo for the whole pipeline | 16:09 |
clarkb | fungi: do you already have a prepushed revert up for amorin that I can review? | 16:10 |
*** slittle1 has joined #openstack-infra | 16:10 | |
fungi | clarkb: https://review.openstack.org/621251 | 16:11 |
clarkb | amorin: ^ has been approved we should see that soon ish | 16:11 |
*** xek has quit IRC | 16:12 | |
*** pcaruana has quit IRC | 16:12 | |
amorin | looks good | 16:13 |
jd_ | fungi: clarkb: ok; fwiw I don't really maintain this anymore so if anyone wants to take it over I'm all for it | 16:17 |
*** armax has joined #openstack-infra | 16:22 | |
AJaeger | corvus: agreed, missed that it was infra for the infra-manual ;( fungi, thanks for reverting. | 16:22 |
openstackgerrit | Quique Llorente proposed openstack-infra/zuul master: Add default value for relative_priority https://review.openstack.org/622175 | 16:23 |
*** eharney has joined #openstack-infra | 16:24 | |
fungi | the author at least seems to have gotten the message after a dozen or so -2 votes on other changes like that and abandoned them | 16:25 |
*** hwoarang has quit IRC | 16:27 | |
*** electrofelix has quit IRC | 16:27 | |
openstackgerrit | James E. Blair proposed openstack-infra/nodepool master: Fix race when deleting Node znodes https://review.openstack.org/622403 | 16:28 |
AJaeger | fungi: wow, he's learning... | 16:29 |
openstackgerrit | Merged openstack-infra/infra-manual master: Revert "Change openstack-infra to openstack-discuss" https://review.openstack.org/622391 | 16:32 |
openstackgerrit | Merged openstack-infra/project-config master: Revert "Temporarily disable ovh-bhs1 in nodepool" https://review.openstack.org/621251 | 16:32 |
*** xek has joined #openstack-infra | 16:32 | |
*** larainema has quit IRC | 16:34 | |
openstackgerrit | Nate Johnston proposed openstack-infra/project-config master: Neutron grafana update for co-gating section https://review.openstack.org/622418 | 16:40 |
clarkb | fungi: I think what the current status is trying to tell me is that with current resources (previous to bhs1 going back in) we can service the first two changes for a project in check | 16:41 |
clarkb | right now the priority for the third change is high enough that it isn't getting serviced. Will be interesting to see if that changes with bhs1 capacity back in place | 16:41 |
fungi | clarkb: probably won't be a significant bump as it's only restoring an additional ~18% | 16:46 |
*** gyee has joined #openstack-infra | 16:47 | |
*** quiquell is now known as quiquell|off | 16:50 | |
*** mriedem_afk is now known as mriedem | 16:50 | |
openstackgerrit | Nate Johnston proposed openstack-infra/project-config master: Neutron grafana update for co-gating section https://review.openstack.org/622418 | 16:50 |
openstackgerrit | melissaml proposed openstack/os-testr master: Change openstack-dev to openstack-discuss https://review.openstack.org/622425 | 16:56 |
openstackgerrit | melissaml proposed openstack/os-testr master: Update the home-page URL https://review.openstack.org/622427 | 16:59 |
*** shardy has quit IRC | 17:00 | |
openstackgerrit | Doug Hellmann proposed openstack-infra/openstack-zuul-jobs master: stop publishing release notes using python 2 https://review.openstack.org/622430 | 17:02 |
*** ykarel is now known as ykarel|away | 17:03 | |
*** ramishra has quit IRC | 17:06 | |
*** takamatsu has quit IRC | 17:06 | |
*** graphene has quit IRC | 17:10 | |
*** wolverineav has quit IRC | 17:10 | |
clarkb | fungi: nova has 3 changes processing in check now | 17:10 |
clarkb | and bhs1 doesn't seem to be in use yet | 17:10 |
fungi | cool! | 17:11 |
clarkb | nl04 should be updating shortly based on syslog timestampos | 17:11 |
*** graphene has joined #openstack-infra | 17:11 | |
*** pcaruana has joined #openstack-infra | 17:12 | |
*** e0ne has quit IRC | 17:13 | |
*** ginopc has quit IRC | 17:14 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul master: Add governance document https://review.openstack.org/622439 | 17:14 |
*** ccamacho has quit IRC | 17:15 | |
*** dpawlik has joined #openstack-infra | 17:15 | |
*** xek has quit IRC | 17:16 | |
clarkb | bhs1 is enabled again | 17:17 |
*** xek has joined #openstack-infra | 17:17 | |
*** ykarel|away has quit IRC | 17:22 | |
dmsimard | yay, thanks amorin | 17:26 |
*** wolverineav has joined #openstack-infra | 17:29 | |
*** wolverineav has quit IRC | 17:30 | |
*** wolverineav has joined #openstack-infra | 17:30 | |
*** jcoufal has quit IRC | 17:32 | |
corvus | looks like we're still ramping up to full utilization; the executors are close to maxed out so they're taking on new jobs slowly | 17:34 |
openstackgerrit | Merged openstack-infra/nodepool master: Make launcher debug slightly less chatty https://review.openstack.org/621675 | 17:34 |
openstackgerrit | Merged openstack-infra/openstack-zuul-jobs master: stop publishing release notes using python 2 https://review.openstack.org/622430 | 17:34 |
clarkb | corvus: we just added 159 nodes to our capacity too (restoring bhs1) | 17:34 |
corvus | yeah, sorry that's what i was referring to | 17:34 |
fungi | i assumed that's what he meant by ramping up | 17:34 |
clarkb | ah | 17:35 |
*** jpich has quit IRC | 17:35 | |
clarkb | tripleo has just merged 12 changes together in the gate (as long as there are no last second merge conflicts) | 17:36 |
clarkb | so that is exciting | 17:36 |
*** bobh has joined #openstack-infra | 17:36 | |
fungi | one just hit a post_failure | 17:36 |
fungi | and is now showing a merge conflict | 17:37 |
openstackgerrit | Jonathan Rosser proposed openstack-infra/project-config master: Add centos/suse to OSA grafana dashboard https://review.openstack.org/622169 | 17:37 |
amorin | clarkb: dmsimard fungi what is the status of BHS1 nodes ? | 17:37 |
*** bobh_ has joined #openstack-infra | 17:37 | |
amorin | is IO better? | 17:37 |
*** jcoufal has joined #openstack-infra | 17:37 | |
clarkb | amorin: I haven't checked a specific instance yet but will do that now | 17:37 |
fungi | er, i guess the merge conflict is on a different change | 17:38 |
amorin | ok | 17:38 |
*** bobh_ has quit IRC | 17:38 | |
*** bobh_ has joined #openstack-infra | 17:39 | |
fungi | clarkb: dmsimard: looks like the ara change for logs.o.o may have significantly improved cpu utilization and also somewhat improved memory usage there, based on cacti graphs | 17:40 |
dmsimard | amorin: searching "node_provider:ovh-gra1 or node_provider:ovh-bhs1" on logstash.openstack.org seems to yield successful jobs as far as I can tell | 17:40 |
*** bobh has quit IRC | 17:41 | |
*** bhavikdbavishi has quit IRC | 17:41 | |
amorin | dmsimard glad to hear that | 17:41 |
clarkb | amorin: 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 4.89867 s, 219 MB/s | 17:41 |
amorin | sounds good | 17:41 |
clarkb | that looks much much better, thank you | 17:41 |
amorin | I applied the same on GRA1, so even GRA1 should be better | 17:41 |
fungi | clarkb: dmsimard: oh, or at least the ara change resulted in no negative impact on the server... the earlier cpu and memory increases seem to be part of a periodic pattern if you look at the monthly graphs | 17:41 |
amorin | I'll check if I can increase the IO a little bit more in the future | 17:41 |
clarkb | amorin: good to know re GRA1 | 17:42 |
dmsimard | fungi: the monthly graphs are indeed showing an odd pattern http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=138&rra_id=all | 17:42 |
fungi | dmsimard: i have a feeling that's the weekly log pruning cron | 17:43 |
dmsimard | that would make sense if it lasts ~3 days | 17:43 |
fungi | it takes a few days to run its course | 17:43 |
fungi | so, yeah | 17:43 |
*** bobh_ has quit IRC | 17:43 | |
amorin | also, we have a change in our commit pipeline to re-enable nested virtualisation, I hope I will be able to push that to GRA1 and BHS1 tomorrow | 17:44 |
amorin | I know that this is affecting the OSF somehow | 17:44 |
*** bobh has joined #openstack-infra | 17:45 | |
clarkb | amorin: its mostly that we have some jobs that test nested workloads (octavia tests nested load balancers for example) and qemu isn't really fast enough to do that reliably | 17:45 |
dmsimard | I remember an obscure bug we hit on OVH with... disk injection ? Not sure that ever got fixed | 17:46 |
*** wolverineav has quit IRC | 17:46 | |
*** derekh has quit IRC | 17:46 | |
*** wolverineav has joined #openstack-infra | 17:47 | |
dmsimard | Something along the lines of libguestfs using kvm instead of qemu | 17:47 |
*** wolverineav has quit IRC | 17:47 | |
*** kjackal has quit IRC | 17:47 | |
*** wolverineav has joined #openstack-infra | 17:47 | |
amorin | ah, I am not aware of this bug | 17:48 |
openstackgerrit | Nate Johnston proposed openstack-infra/openstack-zuul-jobs master: Replace neutron-grenade job with grenade-py3 https://review.openstack.org/622480 | 17:48 |
openstackgerrit | James E. Blair proposed openstack-infra/nodepool master: Fix race when deleting Node znodes https://review.openstack.org/622403 | 17:49 |
dmsimard | amorin: it was https://bugs.launchpad.net/nova/+bug/1735823 -- it's probably not specific to OVH but that's where we noticed it | 17:50 |
openstack | Launchpad bug 1735823 in OpenStack Compute (nova) "Nova can hang when creating a VM with disk injection" [Medium,In progress] - Assigned to Matt Riedemann (mriedem) | 17:50 |
amorin | ok, I'll check that | 17:52 |
mriedem | interesting http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Failed%20to%20force%20guestfs%20TCG%20mode%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22&from=7d | 17:53 |
mriedem | [None req-9d2a4bac-a4b3-4be2-a896-88c3cb7497cf tempest-ServerActionsTestJSON-983743848 tempest-ServerActionsTestJSON-983743848] Failed to force guestfs TCG mode. guestfs_set_backend_settings returned: None: NotImplementedError | 17:53 |
*** pcaruana has quit IRC | 17:56 | |
*** gfidente is now known as gfidente|afk | 18:04 | |
*** xek has quit IRC | 18:06 | |
*** xek has joined #openstack-infra | 18:06 | |
clarkb | dhellmann: the community goals email reminds me that I think you had said you had a plan to make openstack client startup quicker? or was that dtroyer? | 18:11 |
clarkb | I'd be curious to hear what that plan is if I am remembering correctly. And will try to help as much as possible to make those changes | 18:11 |
*** e0ne has joined #openstack-infra | 18:13 | |
TheJulia | clarkb: I think it was dtroyer. I replied to that thread specifically because I have questions regarding what we would consider scoping. If we do the full scope and the largest lift, then it should naturally fire up much faster if I understand the root problem as to why it fires up so slowly. | 18:19 |
clarkb | TheJulia: the issue with startup time is the use of entrypoints as implemented by pkg_resources | 18:20 |
clarkb | TheJulia: it scans your entire python path for all python package installations, inspects them for metadata then sorts them all. This scales with the number of packages and the speed of your filesystem | 18:20 |
TheJulia | yup | 18:20 |
TheJulia | That was my understanding as well | 18:20 |
*** jpena is now known as jpena|off | 18:20 | |
clarkb | with long lived processes this isn't an issue because you start it once and the cost is paid and you are done. But with osc as used by e.g. devstack we pay that cost every time we run osc | 18:20 |
clarkb | and it adds significant time to the devstack runs I think | 18:21 |
clarkb | (it's been a while since I looked at numbers) | 18:21 |
TheJulia | the last numbers I seem to remember were kind of mind boggling | 18:21 |
fungi | wasn't there a plan to have a persistent osc socket in devstack to reduce that impact? | 18:22 |
TheJulia | i wonder if it could make sense to have some sort of cache to prevent the scan | 18:22 |
TheJulia | or something like what fungi just mentioned | 18:22 |
clarkb | fungi: that was one idea thrown out at the time | 18:22 |
fungi | TheJulia: yeah, i vaguely recall dhellmann talking about stevedore caching entrypoints as one possibility | 18:22 |
clarkb | another idea was to stop using entrypoints for the "base" or "common" commands | 18:23 |
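[Editor's note] For illustration of the cost described above: pkg_resources builds its working set by scanning every installed distribution's metadata when it is imported, so a short-lived CLI pays that scan on every invocation. A minimal sketch, assuming nothing beyond the standard pkg_resources API; the 'console_scripts' group is just an example, osc's real plugin groups differ:

```python
import time

start = time.monotonic()
import pkg_resources  # the scan of all installed distributions happens at import time

# Listing a group reads entry point metadata for every distribution that
# advertises it; wall time grows with the number of installed packages and
# with filesystem speed, which is the startup cost being discussed.
plugins = list(pkg_resources.iter_entry_points('console_scripts'))

print(f"loaded {len(plugins)} entry points in {time.monotonic() - start:.2f}s")
```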
fungi | an interesting effect of the dynamic priority in zuul is that you can see which repos are the most active in check based on what's floated to the top due to getting delayed | 18:24 |
fungi | pretty much all tripleo, nova and a smidge of neutron at the moment. everyone else is getting nodes pretty quickly | 18:25 |
fungi | a networking-odl change just entered the check pipeline 6 minutes ago and already has about half its node requests fulfilled | 18:26 |
tosky | do you plan to publish some graphs? :) | 18:27 |
fungi | tosky: hard to know what to graph there | 18:27 |
tosky | and/or statistics | 18:27 |
*** auristor has quit IRC | 18:28 | |
fungi | i mean, we have general graphs like the ones at http://grafana.openstack.org/d/T6vSHcSik/zuul-status | 18:30 |
*** e0ne has quit IRC | 18:30 | |
fungi | and extremely detailed stats tracked at http://graphite.openstack.org/ | 18:30 |
fungi | but not sure what statistical insights can be gained from the recent change in prioritization | 18:31 |
clarkb | half an hour to the infra meeting. I failed at sending email yesterday. /me edits reminders on calendar for that | 18:31 |
clarkb | ok my phone will remind me next week | 18:32 |
dmsimard | fungi, clarkb: merging https://review.openstack.org/#/c/616306/ might help in giving us insight as to how the resources are distributed with the new algorithm -- would have actually been nice to have this *before* the algorithm change to compare the differences | 18:32 |
*** jrist has quit IRC | 18:32 | |
*** graphene has quit IRC | 18:34 | |
fungi | dmsimard: would it? the algorithm change wouldn't affect long-term utilization per project, only their relative throughput | 18:34 |
fungi | it wasn't about reducing anyone's utilization, more like letting someone with a couple of items in their hand jump in front of your massive cartload at the grocery checkout | 18:35 |
fungi | you're all still going through eventually | 18:36 |
fungi | jamesmcarthur: i'm guessing you want me to just discard that message of yours which just landed in the -discuss moderation queue since you seem to have resent it from a different address anyway? | 18:39 |
jamesmcarthur | ha fungi: I was scrambling to see if I had a login to the ml | 18:39 |
jamesmcarthur | yes please :) delete | 18:39 |
fungi | done | 18:40 |
fungi | i haven't added any other moderators yet because with the transition i'm still relying on a fair amount of local filtering and spam identification to make sure nothing slips through the cracks | 18:40 |
clarkb | dmsimard: fungi yup over the long term we use something like 30% of our available resources because weekends and holidays and hours where fewer people are working are still a thing for us | 18:41 |
clarkb | dmsimard: but also we can glean that info from the logs already | 18:41 |
clarkb | tobiash's change is a definite improvement over that but we do have some of the info | 18:41 |
dmsimard | fungi, clarkb: makes sense | 18:43 |
clarkb | dmsimard: the last two days its tripleo 40.9%, nova 9.9%, neutron 8.7% then puppet, helm , and osa make the range from 5-3% ish | 18:43 |
fungi | nice to see tripleo continuing to drop in that ranking | 18:44 |
fungi | er, well, not in ranking but at least in proportion | 18:44 |
clarkb | fungi: dmsimard the week prior was tripleo 36%, neutron 13%, nova 8.9% | 18:46 |
clarkb | I think that mostly confirms we aren't changing usage, just when you get to use it | 18:47 |
pabelanger | the RETRY_LIMIT of tripleo jobs yesterday will also eat up a lot of nodes, since we retry up to 3x | 18:50 |
clarkb | pabelanger: ya but it was failing pretty fast and I promoted the change as soon as it was approved | 18:51 |
clarkb | (I think we handled that reasonably well, the only followup on that is maybe adding an ansible-lint rule to require become when installing packages) | 18:51 |
*** betherly has joined #openstack-infra | 18:54 | |
*** ahosam has joined #openstack-infra | 18:56 | |
*** betherly has quit IRC | 18:58 | |
*** jcoufal has quit IRC | 19:01 | |
*** diablo_rojo has joined #openstack-infra | 19:01 | |
*** diablo_rojo has quit IRC | 19:01 | |
*** diablo_rojo has joined #openstack-infra | 19:01 | |
*** wolverineav has quit IRC | 19:02 | |
ianw | dmsimard: hey, i added the ara install stuff to the meeting topics. i dunno about a generic role in zuul-jobs, i'm not opposed if someone wants to write it for something, but i'm not sure i'd take it on just for the sake of it | 19:02 |
*** e0ne has joined #openstack-infra | 19:03 | |
dmsimard | ianw: ok, I'll be there for meeting | 19:03 |
*** wolverineav has joined #openstack-infra | 19:06 | |
*** imacdonn has joined #openstack-infra | 19:13 | |
*** ahosam has quit IRC | 19:19 | |
openstackgerrit | Merged openstack-infra/nodepool master: Fix race when deleting Node znodes https://review.openstack.org/622403 | 19:20 |
imacdonn | hi infra ... just reporting another case of something that looks like tests failing due to IP address conflict (ssh auth/e failures and timeouts): http://logs.openstack.org/48/622348/1/check/legacy-tempest-dsvm-full-lio-src-os-brick/d81b9e5/ | 19:21 |
*** prometheanfire has quit IRC | 19:21 | |
fungi | imacdonn: interesting, that's in ovh-bhs1 which we only turned back on to start using in the past couple hours | 19:22 |
*** graphene has joined #openstack-infra | 19:22 | |
*** prometheanfire has joined #openstack-infra | 19:22 | |
fungi | imacdonn: what about that looks like an ip address conflict? looks like tempest tests failing to delete volumes | 19:24 |
fungi | "Invalid volume: Volume status must be available or error or error_restoring or error_extending or error_managing and must not be migrating, attached, belong to a group, have snapshots or be disassociated from snapshots after volume transfer." | 19:24 |
imacdonn | fungi: maybe I pasted the wrong link .. seeing things like: | 19:24 |
dmsimard | ssbarnea|rover: (now offtopic) openstack tries really hard to not use closed source software, "free software needs free tools", four opens, etc. | 19:24 |
imacdonn | 2018-12-04 18:42:52.326095 | primary | 2018-12-04 18:39:26,546 32281 INFO [paramiko.transport] Authentication (publickey) failed. | 19:24 |
imacdonn | 2018-12-04 18:42:52.326217 | primary | 2018-12-04 18:39:26,688 32281 WARNING [tempest.lib.common.ssh] Failed to establish authenticated ssh connection to cirros@172.24.5.2 (Authentication failed.). Number attempts: 4. Retry after 5 seconds. | 19:24 |
imacdonn | and: 2018-12-04 18:51:27.023717 | primary | tempest.lib.exceptions.SSHTimeout: Connection to the 172.24.5.2 via SSH timed out. | 19:25 |
fungi | imacdonn: 172.24.5.2 would be an address of a nested vm created by devstack (that's an rfc-1918 address), not a job node interface's address | 19:25 |
ianw | #status log moved bridge.o.o /etc/ansible/hosts/openstack.yaml to a .old file for clarity, as it is not (and perhaps was never) used | 19:26 |
openstackstatus | ianw: finished logging | 19:26 |
*** graphene has quit IRC | 19:27 | |
imacdonn | fungi: OK ... but seems that not being able to ssh to it is still bad ? | 19:27 |
*** jrist has joined #openstack-infra | 19:29 | |
fungi | imacdonn: sure, but you're reporting an issue entirely encapsulated within devstack, nothing to do with our ci system's connectivity to the job node | 19:29 |
*** graphene has joined #openstack-infra | 19:29 | |
*** florianf has quit IRC | 19:30 | |
imacdonn | fungi: I didn't think I made any assertion about where the problem lies :) | 19:30 |
*** dpawlik has quit IRC | 19:30 | |
fungi | imacdonn: you mentioned thinking it was due to an ip address conflict | 19:31 |
*** dpawlik has joined #openstack-infra | 19:31 | |
fungi | for which the only ip address conflicts within the infra team's domain of control are conflicts over the addresses by which the ci system reaches its job nodes | 19:31 |
imacdonn | fungi: OK, I guess my report was not useful, so please disregard. | 19:32 |
fungi | imacdonn: sorry, i was merely confused why you were mentioning what looked like a devstack problem | 19:33 |
imacdonn | fungi: I thought it looked similar to issues we saw some weeks ago, where it appeared that the test was trying to ssh to a VM that it had created, but was failing to login, which seemed like it may be actually connecting to some other host using that IP address | 19:34 |
*** dtantsur is now known as dtantsur|afk | 19:35 | |
imacdonn | fungi: I don't have any visibility into the infrastructure that these tests run on, so it's difficult to diagnose beyond that | 19:35 |
fungi | imacdonn: yeah, and i don't have enough insight into devstack/tempest to be able to guess what's gone wrong there. you may find a better audience for devstack and tempest failures in #openstack-qa as those aren't infra projects | 19:40 |
*** wolverineav has quit IRC | 19:40 | |
fungi | looks like our connectivity to the node where that ran was fine | 19:40 |
*** wolverineav has joined #openstack-infra | 19:41 | |
*** wolverineav has quit IRC | 19:46 | |
*** jamesmcarthur has quit IRC | 19:58 | |
*** e0ne has quit IRC | 19:58 | |
*** aojea has quit IRC | 19:59 | |
corvus | mordred: oh, were you suggesting that if we're going to use a ssg for opendev, we should go ahead and use gatsby? that makes sense. | 19:59 |
openstackgerrit | François Magimel proposed openstack-infra/infra-manual master: Fix some URL redirections and broken links https://review.openstack.org/622581 | 20:01 |
clarkb | mordred: corvus: it seems that gatsby isn't really any different than any other static site generator? You write markdown or rst, compile it to static site with html/js/css and then serve that built content. | 20:04 |
clarkb | from that I have two questions 1) why gatsby instead of the million other static site generators, 2) where does netlify-cms fit into that? | 20:04 |
clarkb | infra-root I am going to reboot the opendev ns server that needs a reboot now | 20:05 |
clarkb | since nothing is really using opendev yet I think it's safe to just do the reboot right? | 20:06 |
clarkb | In the future we might update NS records? | 20:06 |
fungi | rebooting an authoritative nameserver when there's another one responding should be fine | 20:06 |
*** agopi has quit IRC | 20:06 | |
fungi | even if the domains it serves are in use | 20:07 |
clarkb | do resolvers check both in that case? | 20:07 |
clarkb | priority probably plays into it too? | 20:07 |
fungi | yes, if a resolver can't reach one of the listed nameservers for a domain it will try another | 20:07 |
clarkb | ok motd no longer says reboot required | 20:07 |
fungi | if it needs to try too many then the client's request may time out before the recursive resolver provides it an answer | 20:08 |
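[Editor's note] A hedged way to sanity-check the failover behaviour fungi describes before rebooting one authoritative server: query each listed nameserver directly and confirm it answers for the zone. This assumes dnspython 2.x is available; the zone name is only an example.

```python
import dns.resolver

zone = "opendev.org"

# Find the advertised authoritative nameservers for the zone.
for ns in dns.resolver.resolve(zone, "NS"):
    ns_host = ns.target.to_text()
    ns_ip = list(dns.resolver.resolve(ns_host, "A"))[0].to_text()

    # Ask that specific server (not the local recursive resolver) for the SOA.
    direct = dns.resolver.Resolver(configure=False)
    direct.nameservers = [ns_ip]
    try:
        soa = direct.resolve(zone, "SOA")
        print(f"{ns_host} ({ns_ip}) answered, serial {list(soa)[0].serial}")
    except Exception as exc:  # broad on purpose for a quick check
        print(f"{ns_host} ({ns_ip}) did not answer: {exc}")
```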
clarkb | nsd failed to start so I'm debugging that now | 20:09 |
clarkb | hrm journalctl says it returned 1 too many times in quick succession so systemd stopped trying to start it | 20:11 |
clarkb | I don't see any logs indicating what the actual failur was though | 20:11 |
clarkb | I think networking wasn't up yet | 20:12 |
*** wolverineav has joined #openstack-infra | 20:12 | |
clarkb | I stopped/started it by hand and it is running now | 20:12 |
clarkb | a race in the init system I guess | 20:13 |
*** manjeets_ has joined #openstack-infra | 20:13 | |
clarkb | maybe we need to add an After=networking.target or whatever the appropriate incantation is? | 20:13 |
*** manjeets has quit IRC | 20:14 | |
fungi | i take it the nsd package isn't providing its own systemd unit file? or is this merely a bug in the one it shipped? | 20:18 |
fungi | infra-root: if i can get some quick eyes on https://review.openstack.org/619056 that's one more mailing list which can be retired | 20:22 |
clarkb | fungi: my guess is it's a bug in the one that is shipped. But we can add a drop-in override file iirc to change the order without modifying the main unit | 20:24 |
clarkb | I'm going to grab lunch then I'll dig back into my todo list and check if nsd starting before network has an easy fix | 20:28 |
clarkb | before I do that. brainstorming around new zuul behavior | 20:29 |
clarkb | corvus: ^ I wonder if switching from change count based numbers to node use might give a better approximation of what we are trying to do there? | 20:29 |
clarkb | I don't think that will make nova or tripleo move through any more quickly | 20:30 |
clarkb | but if we did that then we might be able to account for gate and check? | 20:30 |
*** graphene has quit IRC | 20:30 | |
*** xek has quit IRC | 20:31 | |
*** graphene has joined #openstack-infra | 20:32 | |
fungi | i agree that node count for the requested jobs is a more accurate representation of the cost than just treating every change equally, but i wonder how you use that to prioritize since you ideally satisfy all node requests for a change to keep things from sitting around half-tested | 20:36 |
*** ahosam has joined #openstack-infra | 20:36 | |
openstackgerrit | Merged openstack-infra/system-config master: bridge.o.o : install ansible 2.7.3 https://review.openstack.org/617218 | 20:37 |
*** jrist has quit IRC | 20:39 | |
corvus | fungi: i think we should continue to set the priority to the same value for every job in a change, but if we based the priority on nodes requested by changes ahead in the queue, then projects with large node counts will be "demerited" faster. so a second system-config change which ran all of our jobs might be waiting behind the third zuul change which only ran a few jobs. | 20:44 |
*** takamatsu has joined #openstack-infra | 20:44 | |
fungi | i guess that could work | 20:44 |
corvus | (i mean, we could also have different priority values for jobs within a change; that would be a more complex and different behavior) | 20:46 |
*** ralonsoh has quit IRC | 20:47 | |
pabelanger | interesting idea using node requests for priority also | 20:48 |
fungi | corvus: oh, and you were asking about openstack-discuss metrics. looking back at my notes i was performing cross-analysis of addresses seen posting to the old lists which are subscribed to the new. for example, there were 245 distinct addresses seen posting 10 or more messages to the old lists so far in 2018 and 148 of those addresses (60%) have subscribed to the new list | 20:50 |
fungi | 70% of the addresses which posted 20 messages or more this year have subscribed to the new list | 20:52 |
fungi | 86% of those which posted 50 messages or more | 20:52 |
fungi | clearly a long tail there, a lot of which i expect is due to address changes | 20:53 |
*** eharney has quit IRC | 20:53 | |
*** auristor has joined #openstack-infra | 20:53 | |
fungi | 100 messages only ratchets it up another percent | 20:54 |
*** gfidente|afk is now known as gfidente | 20:59 | |
*** dpawlik has quit IRC | 21:01 | |
*** wolverineav has quit IRC | 21:01 | |
*** wolverineav has joined #openstack-infra | 21:04 | |
*** eharney has joined #openstack-infra | 21:04 | |
*** wolverineav has quit IRC | 21:04 | |
*** wolverineav has joined #openstack-infra | 21:04 | |
*** wolverineav has quit IRC | 21:15 | |
*** kjackal has joined #openstack-infra | 21:18 | |
openstackgerrit | Kendall Nelson proposed openstack-infra/infra-specs master: StoryBoard Story Attachments https://review.openstack.org/607377 | 21:19 |
*** wolverineav has joined #openstack-infra | 21:20 | |
*** wolverineav has quit IRC | 21:20 | |
*** wolverineav has joined #openstack-infra | 21:20 | |
clarkb | ianw: fungi: https://review.openstack.org/#/c/621847/1 confuses me slightly. Wasn't there an error after we got the fix in that precipitated the nonvoting change? | 21:23 |
clarkb | it certainly appears to work on that change though | 21:23 |
clarkb | the issue was that ssh is not a valid connection plugin iirc | 21:24 |
clarkb | do we know what caused that? maybe ansible fixed that in a race with our jobs running? | 21:24 |
fungi | clarkb: yes, without knowing what triggered the subsequent ssh errors, i can only guess we raced with a change landing upstream in ansible | 21:24 |
ianw | clarkb: there was https://review.openstack.org/#/c/621633/ proposed, which used a block: in the handler, which doesn't work. after investigating that, it's a known "problem" (the github issue seems unclear on whether it's a bug or a feature) | 21:25 |
ianw | corvus: ^ could probably abandon that for clarity? | 21:26 |
clarkb | ianw: ya then we merged 621634 which is the fix for that via the listeners addition and the next runs after that had errors saying ssh was an unknown connection plugin | 21:26 |
clarkb | ianw: it was that second error that caused us to switch to non voting | 21:26 |
ianw | oh right, yeah i saw that once | 21:26 |
corvus | abandoned | 21:26 |
corvus | i'm on the fence -- i could go either way, but i think maybe keeping it non-voting and just trying to keep an eye on it might be the way to go. | 21:27 |
clarkb | the recent commit log for ansible doesn't show a likely fix. I guess I should git log -p | 21:27 |
ianw | i kind of like that it has actually found things that saved us from having production issues | 21:28 |
*** armax has quit IRC | 21:28 | |
ianw | although, i guess it caused production issues as it had to be fixed before other things could move in | 21:28 |
ianw | dmsimard: so following on from meeting conversation, i'd propose we squash https://review.openstack.org/621463 into your original ara installation review | 21:30 |
corvus | yeah. the job is great. | 21:30 |
ianw | and then create a new review that enables it for production bridge.o.o | 21:30 |
*** jrist has joined #openstack-infra | 21:31 | |
*** kgiusti has left #openstack-infra | 21:31 | |
*** tpsilva has quit IRC | 21:32 | |
clarkb | that error comes out of lib/ansible/executor/task_executor.py attempting to load the connection plugins based on the host config | 21:33 |
clarkb | I don't see any code touching that file in the last day | 21:33 |
ianw | corvus: one thing that would have made it a bit quicker to bisect is something like -> https://review.openstack.org/621840 . i think there's many ways to do something similar (stamp the heads in logs) so interested in your thoughts | 21:34 |
ianw | unfortunately ansible --version doesn't show the git head unless it's installed with "pip install -e" | 21:34 |
ianw | (which does work, https://review.openstack.org/621471, but i thought a common zuul solution is probably nicer) | 21:35 |
clarkb | https://github.com/ansible/ansible/pull/49249 is suspicious given it deals with handlers though | 21:36 |
clarkb | I do wonder though if ansible even intends for it to be consumed this way | 21:36 |
clarkb | it's good data for us, but I'm also not sure how actionable it is (we were 1 for 2 yesterday) | 21:37 |
corvus | clarkb: no i don't think so, which is why i think non-voting may be the best approach | 21:38 |
corvus | the branch is called 'devel' :) | 21:38 |
*** wolverineav has quit IRC | 21:38 | |
*** eharney has quit IRC | 21:40 | |
*** wolverineav has joined #openstack-infra | 21:40 | |
fungi | perhaps a more effective job to peek at if it's failing around time for ansible release candidates? | 21:42 |
clarkb | dmsimard: for https://review.openstack.org/#/c/611228/10/playbooks/host_vars/bridge.openstack.org.yaml we would start writing a sqlite database on production bridge.o.o too right? what happens when we mix the ansible cron, adhoc ansible, and sqlite there? Will that cause ansible to fail or just sqlite to record junk data? | 21:43 |
clarkb | I think I'm ok with it as long as it's not going to cause the cron to fail if someone runs an ad hoc command or vice versa | 21:44 |
pabelanger | ianw: clarkb: running ansible from devel is a little hard, I'd maybe say doing so around RC time, per fungi or maybe just run it from a stable branch | 21:44 |
clarkb | pabelanger: in this case we are interested in testing how future ansible will break us so stable branches probably aren't what we want. But ya maybe we only look at it when it is RC time | 21:45 |
pabelanger | ianw: clarkb: often in ansible-network, we've had devel break us for some random reason, so we do non-voting also but really considering dropping it shortly | 21:45 |
dmsimard | clarkb: the vars for using mysql are set up in the hostvars on bridge.o.o | 21:45 |
clarkb | dmsimard: not on that change, it is using sqlite aiui | 21:45 |
clarkb | dmsimard: ya the default is sqlite | 21:45 |
pabelanger | clarkb: yah, I think RC time is when things really stabilize on devel, IIRC. Until then, it is pretty wild west | 21:45 |
dmsimard | clarkb: if you use the install-ansible role without anything, yes -- but in the hostvars for bridge.o.o we set up mysql and the authentication | 21:46 |
clarkb | dmsimard: we don't I linked to the host vars for bridge.openstack.org and that is commented out | 21:46 |
dmsimard | in the private vars? equivalent to hiera | 21:46 |
* clarkb looks | 21:47 | |
dmsimard | clarkb: /etc/ansible/hosts/host_vars/bridge.openstack.org.yaml | 21:47 |
dmsimard | It may not reflect the latest patchsets, need to do a full review | 21:47 |
clarkb | ya ok so it's set in private data, now I know what my comments on that change are :) | 21:48 |
dmsimard | ianw: yup let me review | 21:49 |
clarkb | ok left notes on https://review.openstack.org/#/c/611228/10 and its parent (which makes the ansible devel job voting again) | 21:50 |
openstackgerrit | James E. Blair proposed openstack-infra/infra-specs master: Move project-hosting spec to completed https://review.openstack.org/622609 | 21:51 |
dmsimard | clarkb: can you expand on "Can we instead enable this just for testing to start?" ? | 21:52 |
dmsimard | What is "this" ? | 21:52 |
dmsimard | mysql implementation ? | 21:52 |
dmsimard | the in-job nested report ? | 21:52 |
clarkb | dmsimard: enable ara on "bridge.openstack.org" only in the test jobs to start | 21:53 |
dmsimard | ok | 21:54 |
fungi | okay, gonna go find an early dinner, but should be back soon | 21:55 |
*** bobh has quit IRC | 21:57 | |
*** jaosorior has quit IRC | 22:00 | |
*** boden has quit IRC | 22:07 | |
clarkb | diablo_rojo: fungi comments on https://review.openstack.org/#/c/607377/6 curious to know what you think | 22:18 |
clarkb | that is the bulk of reviews done. Time to debug nsd ansible | 22:18 |
*** ahosam has quit IRC | 22:19 | |
*** slaweq has quit IRC | 22:20 | |
openstackgerrit | Nate Johnston proposed openstack-infra/openstack-zuul-jobs master: Add nodeset ubuntu-bionic-2-node for bionic multinode testing https://review.openstack.org/622613 | 22:21 |
clarkb | https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/ yes freedesktop.org how did you know :) | 22:22 |
clarkb | this nsd unit problem has its own faq entry | 22:22 |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config master: Add support for enabling the ARA callback plugin in install-ansible https://review.openstack.org/611228 | 22:23 |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config master: Prefix install_openstacksdk variable https://review.openstack.org/621462 | 22:23 |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config master: [to squash] Modifications to ARA installation https://review.openstack.org/621463 | 22:23 |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config master: functional-tests: collect and publish inner ARA results https://review.openstack.org/617216 | 22:23 |
ianw | dmsimard: ^ responded on that variable | 22:23 |
diablo_rojo | clarkb, looking now | 22:24 |
dmsimard | ianw: oi, I was working on the squash :p | 22:25 |
dmsimard | ianw: damn it, I think gerrit hijacked my ctrl+f and I didn't see it | 22:26 |
diablo_rojo | clarkb, seems like a good route to me, I think we might want to be able to set when the link expires based on if its private or not? But other than that, I think being more careful is better? | 22:27 |
ianw | dmsimard: it's all similar for ansible & openstacksdk ... it's a lot of variable swizzling but it does make it a bit clearer when the role is called i think | 22:27 |
clarkb | diablo_rojo: ya we should be able to make things more or less accessible depending on that | 22:27 |
diablo_rojo | clarkb, then I think it sounds good. | 22:28 |
dmsimard | ianw: so here's what I'd do: squash 621463 into 611228 but move the .zuul.yaml change to 617216 | 22:28 |
dmsimard | how does that sound ? | 22:29 |
*** takamatsu has quit IRC | 22:29 | |
ianw | dmsimard: sure, want me to do that, or do you have it in progress? | 22:30 |
dmsimard | have it almost done | 22:30 |
*** bobh has joined #openstack-infra | 22:30 | |
dmsimard | ianw: good idea to use ara master with devel :) | 22:32 |
openstackgerrit | David Moreau Simard proposed openstack-infra/system-config master: Add support for enabling the ARA callback plugin in install-ansible https://review.openstack.org/611228 | 22:32 |
openstackgerrit | David Moreau Simard proposed openstack-infra/system-config master: functional-tests: collect and publish inner ARA results https://review.openstack.org/617216 | 22:32 |
ianw | that -devel job will be testing ansible devel, openstacksdk master and ara master. it's a pretty good canary | 22:33 |
ianw | yay, cloud-launcher is running again on bridge. looks like 20 minutes is about the baseline | 22:33 |
openstackgerrit | David Shrewsbury proposed openstack-infra/nodepool master: Add cleanup routine to delete empty nodes https://review.openstack.org/622616 | 22:33 |
*** dave-mccowan has quit IRC | 22:33 | |
dmsimard | ianw: I guess we should move https://review.openstack.org/#/c/611228/12/playbooks/bridge.yaml to 617216 too | 22:33 |
*** rcernin has joined #openstack-infra | 22:34 | |
dmsimard | as well as the testinfra test | 22:35 |
dmsimard | it's a self nit, mostly wanted to keep the scope of the patches separate :p | 22:35 |
ianw | dmsimard: yeah, can we also base it on https://review.openstack.org/#/c/621462/ (renames of the other variables) so it's all consistent | 22:36 |
clarkb | ok I've confirmed that the nsd unit from ubuntu is buggy | 22:36 |
dmsimard | ianw: will do | 22:37 |
*** slaweq has joined #openstack-infra | 22:37 | |
clarkb | aha, it would work if we listened on ::, 0.0.0.0, 127.0.0.1 or ::1 | 22:39 |
clarkb | interesting | 22:39 |
*** slaweq has quit IRC | 22:42 | |
*** mriedem has quit IRC | 22:44 | |
*** rcernin_ has joined #openstack-infra | 22:45 | |
*** rcernin has quit IRC | 22:45 | |
openstackgerrit | David Moreau Simard proposed openstack-infra/system-config master: Prefix install_openstacksdk variable https://review.openstack.org/621462 | 22:46 |
openstackgerrit | David Moreau Simard proposed openstack-infra/system-config master: Add support for enabling the ARA callback plugin in install-ansible https://review.openstack.org/611228 | 22:46 |
openstackgerrit | David Moreau Simard proposed openstack-infra/system-config master: Enable ARA reports for system-config bridge CI jobs https://review.openstack.org/617216 | 22:46 |
dmsimard | ianw: ^ hopefully squashed and rebased everything in the right order | 22:48 |
dmsimard | https://review.openstack.org/#/c/621847/ and https://review.openstack.org/#/c/621463/ are not included in the tree | 22:49 |
ianw | yep they can be abandoned | 22:51 |
*** slaweq has joined #openstack-infra | 22:53 | |
*** wolverineav has quit IRC | 22:53 | |
openstackgerrit | Clark Boylan proposed openstack-infra/system-config master: Update nsd systemd unit deps https://review.openstack.org/622620 | 22:53 |
clarkb | corvus: fungi ^ fyi I think that should fix it though I've not tested it yet. Our integration testing should at least test it isn't more broken than today | 22:54 |
*** wolverineav has joined #openstack-infra | 22:56 | |
*** wolverineav has quit IRC | 23:01 | |
clarkb | ianw: dmsimard stack lgtm | 23:01 |
*** priteau has quit IRC | 23:03 | |
*** wolverineav has joined #openstack-infra | 23:05 | |
*** bobh has quit IRC | 23:07 | |
*** slaweq has quit IRC | 23:10 | |
*** jamesdenton has quit IRC | 23:10 | |
*** rcernin_ has quit IRC | 23:12 | |
*** rcernin has joined #openstack-infra | 23:13 | |
*** witek has quit IRC | 23:14 | |
*** witek has joined #openstack-infra | 23:16 | |
corvus | clarkb: slick | 23:16 |
fungi | okay, fed and catching up | 23:18 |
*** gema has quit IRC | 23:18 | |
ianw | clarkb: want to just look over https://review.openstack.org/#/c/621231 (add stein ubuntu-cloud repo) but i think it's fine | 23:21 |
mnaser | hm | 23:26 |
mnaser | openstack/nova seems to have a 13h queue | 23:26 |
mnaser | is that based on the new changes? | 23:26 |
fungi | mnaser: yes, nova changes will wait for other projects if there are more outstanding nova changes than outstanding changes for those other projects | 23:28 |
fungi | also mikal seems to have pushed a ~20-change series for nova in check earlier today which contributed to slowing testing for them | 23:28 |
mnaser | might be rolling in late, but wouldn't this behavior maybe encourage one big patch rather than a bunch of small ones? | 23:29 |
mnaser | and yeah, it seems like one bad stack can just halt a whole project :\ | 23:29 |
fungi | or might encourage projects to split into multiple repos instead of relying on a single repo too | 23:29 |
fungi | mnaser: the alternative was one stack of nova changes halting testing for everyone else while we tested them | 23:30 |
mnaser | im trying to think how to ideally resolve this :\ | 23:31 |
mnaser | while i see how we've solved things, i feel like it also caused other bottlenecks in other ways | 23:31 |
fungi | basically with the dynamic priorities algorithm, more active projects now wait longer for some of their changes so that less active projects get more prompt results | 23:31 |
mnaser | fungi: for example, should we maybe run openstack-tox-lower-constraints only when lower-constraints.txt changes? | 23:33 |
mnaser | just maybe trying to identify ways where we can reduce our usage and have things run faster | 23:33 |
mnaser | or maybe openstack-tox-docs only when docs change | 23:33 |
fungi | mnaser: that supposes that changes people propose to the code don't introduce an expectation on newer versions of their dependencies | 23:34 |
fungi | also, whittling away at jobs like unit tests and docs builds aren't where you're going to see gains | 23:35 |
mnaser | i hate to say it but *maybe* we should fast fail if one job fails rather than running the rest | 23:35 |
fungi | find ways to run fewer multi-node several-hour integration tests | 23:35 |
mnaser | sure it might result in more iterations | 23:35 |
fungi | are you seeing lots of changes running where the quick-to-complete jobs failed? | 23:36 |
fungi | pep8 didn't pass but devstack+tempest worked and ran for two hours? | 23:36 |
mnaser | or at smaller scale, if unit tests failed, but it still ran the whole devstack+tempest anyway (and would probably have failed too) | 23:37 |
clarkb | keep in mind that still the biggest hog is the tripleo gate resets (one just happened recently) | 23:38 |
fungi | so, like, getting devstack+tempest to abort on the first test failure rather than running the rest of the battery? | 23:38 |
clarkb | I think the most significant change would be investing in reliable testing | 23:38 |
clarkb | (which is why I keep pushing that angle) | 23:38 |
mnaser | fungi: kinda, or making some sort of dependency of linters => unit => func => integration | 23:38 |
mnaser | clarkb: i agree, but even if everything is reliable, a 20-patch stack will slow down an entire project for a whole day | 23:39 |
fungi | yes, spending lots of effort finding ways to run 5% fewer jobs is basically admitting defeat on fixing the bugs which cause gate resets and eat far more resources than that | 23:39 |
mnaser | well, no one is solving those bugs unfortunately, so its kinda like | 23:39 |
mnaser | the better outcome of the two bad outcomes | 23:39 |
mnaser | :( | 23:39 |
clarkb | mnaser: that simplification is too simple :) the only reason check is slow like that is that gate is monopolizing the resources | 23:40 |
fungi | mnaser: well, the check pipeline is only getting a small slice of the resource allocation because the gate pipeline still has absolute priority over it | 23:40 |
clarkb | gate is monopolizing the resources because queues keep restarting due to bugs | 23:40 |
mnaser | man at this point even i'll help out track these down :\ | 23:40 |
*** udesale has quit IRC | 23:40 | |
fungi | mnaser: we could "optimize" by making check higher priority and letting approved changes take however long they take to eventually merge | 23:40 |
mnaser | for example 619701,5 in gate reset over `openstack-tox-lower-constraints` | 23:41 |
mnaser | that can't be that hard to find.. | 23:41 |
fungi | amounts to basically test things in check during the week when developers are awake and active, and then merging approved changes on the weekends | 23:41 |
clarkb | mnaser: ya I try to pick a few off the pile every week (yesterday was the centos 7.6 related failures) | 23:42 |
clarkb | today I've got a much longer todo list so haven't been digging into individual failures | 23:42 |
mnaser | i try to look after openstack-ansible as much as possible | 23:43 |
mnaser | i dunno how bad we are :) | 23:43 |
clarkb | well OSA is ~4.5% of the total resource usage | 23:44 |
fungi | good news: openstack-ansible results should be coming faster than tripleo, nova and neutron ;) | 23:44 |
clarkb | really we are going to see the most impact improving the tests of the projects consuming the most resources | 23:44 |
fungi | but yeah, fixing bugs which only impact osad can at best chip away at <5% of the resource utilization | 23:44 |
fungi | fixing bugs which impact tripleo on the other hand gets to eat away at their 40% | 23:45 |
fungi | (or dropping more of their jobs, or removing additional nodes from their multi-node jobs) | 23:45 |
clarkb | top of integrated gate is grenade and tempest failing on a glance change | 23:47 |
clarkb | in aggregate integrated gate is also a big consumer | 23:47 |
fungi | it includes 2 of the top 3 repos, for sure | 23:47 |
fungi | i'm guessing tripleo-heat-templates is the highest consumer repo anyway | 23:48 |
*** gfidente has quit IRC | 23:48 | |
clarkb | fungi: yes at 15 ish percent | 23:49 |
clarkb | mnaser: that lower constraints failure appears to be a valid nova unittest failure | 23:51 |
clarkb | in test_add_exmods | 23:51 |
clarkb | *test_add_extra_exmods | 23:51 |
mnaser | spoke too soon | 23:53 |
mnaser | all our centos jobs are broken because of https://github.com/CentOS/sig-cloud-instance-images/issues/133 | 23:53 |
mnaser | oh well | 23:53 |
mnaser | interesting feature idea that just came to me | 23:54 |
mnaser | 'pause' jobs for a project .. but not set them to non voting | 23:54 |
mnaser | i don't want to go non voting on centos-7 and start landing stuff that breaks it, but i don't want to waste upstream CI resources for constant failures anyway | 23:54 |
*** tosky has quit IRC | 23:55 | |
mnaser | maybe a job that hard-fails right away (which runs on zuul-executor only, avoiding allocating a node?) | 23:55 |
*** pbourke has quit IRC | 23:55 | |
clarkb | mnaser: you're thinking just short-circuit and return failure until some reset is made, so that nodes aren't consumed for that? | 23:55 |
fungi | tripleo seems to want something similar, as they have a lot of external dependencies which can break them and then tend to just abandon every approved change to clear out their gate queue if they know they'll all fail anyway | 23:56 |
mnaser | clarkb: yeah | 23:56 |
clarkb | that is theoretically possible, I don't think zuul would do it for us today | 23:56 |
mnaser | but we can probably do it using that similar job that had rtd warning fail | 23:56 |
clarkb | out of curiosity why does the centos version matter if you are using a container? I thought the whole point was isolating those concerns :P | 23:56 |
*** pbourke has joined #openstack-infra | 23:56 | |
*** rascasoft has quit IRC | 23:57 | |
*** pabelanger has quit IRC | 23:57 | |
*** haleyb has quit IRC | 23:57 | |
mnaser | clarkb: we build venvs and tag them per OS release, the container running the repo is 7.5 so it builds things against 7.5.. when the host (7.6) tries to download something, it gets a 404 because it can't find the package | 23:57 |
mnaser | s/tries to download something/tries to download a built venv/ | 23:57 |
clarkb | mnaser: it's probably safer to rely on wheels rather than whole venvs? | 23:58 |
mnaser | this is all fixed thanks to the work of odyssey4me so this will slowly disappear out of OSA (this cycle hopefully) | 23:58 |
clarkb | they should install quite quickly | 23:58 |
mnaser | exactly. enter python_venv_builder role we're transitioning to :) | 23:58 |
*** rascasoft has joined #openstack-infra | 23:58 | |
mnaser | instead we build the venv inside the container by installing the wheels | 23:58 |
mnaser | https://github.com/openstack/ansible-role-python_venv_build | 23:59 |