Wednesday, 2019-04-17

openstackgerritMerged openstack/tripleo-validations master: Add rabbitmq-limits role
openstackgerritMerged openstack/paunch stable/queens: docker/compose: quote health-cmd
openstackgerritMerged openstack/paunch stable/rocky: docker/compose: quote health-cmd
openstackgerritMerged openstack/tripleo-heat-templates stable/queens: Fixed wrong cinder store user name
openstackgerritMerged openstack/tripleo-validations master: Add check-latest-minor-version role
*** dsneddon has joined #tripleo02:49
openstackgerrityatin proposed openstack-infra/tripleo-ci master: [DNM] Test integrity of container-build job
*** udesale has joined #tripleo04:37
openstackgerritChandan Kumar (raukadah) proposed openstack-infra/tripleo-ci master: Enable ansible-role-collect-logs role in TOCI
*** ykarel|away has joined #tripleo04:43
*** ramishra has joined #tripleo04:50
*** ratailor has joined #tripleo04:51
*** ratailor has joined #tripleo04:53
*** absubram has joined #tripleo05:01
openstackgerritMerged openstack/tripleo-common master: Inject validation roles path in ansible config
*** absubram has joined #tripleo05:10
*** ykarel|away is now known as ykarel05:10
Tenguweshay: why so?05:33
Tengudue to the ovn-northd thingy? if so, it would be better to merge the correction -.-05:33
Tenguuhu, another service is down :D. the validation IS working.05:35
*** quiquell|off is now known as quiquell|rover05:47
Tenguyeah, that one will help getting info about the failed container :D05:49
quiquell|roverTengu I am going to test with this05:49
*** janki has joined #tripleo05:49
Tenguquiquell|rover: lemme know how it goes.05:49
quiquell|roverAnd we will see stuff going on there05:49
Tenguwould love to see it merged as well.05:50
Tenguso many +2, and no +w, it's sad.05:50
quiquell|roverI am not allowed to +w them05:51
Tengusad change is sad.05:52
Tenguoh. OH! we can now run the validations provided by tripleo-validations within tripleo-heat-templates \o/05:56
Tengumy patch adding the related paths to ansible config has merged \o/\o/05:56
Tenguin-flight validations are WORKING05:56
* Tengu does a little victory dance05:56
marioso/ Tengu so with that we can re-enable the check on fs 56?06:02
mariosTengu: yea looks like it :D fails on the docker-py thing (
Tengumarios: yeah, with the new ovn-northd health check we should be able to re-enable on fs056 :)06:08
Tenguthanks for the vote!06:08
Tengumarios: care to +w that one as well?
Tenguthat would be really cool to get some more logs from the containers, especially the ones launched by neutron.06:09
mariosTengu: will check in a minute (only has 4 +2 though is it enough?)06:09
TenguI think it's enough, especially if beagles voted for it06:10
Tengubrb, breakfast.06:10
mariosTengu: inline question was looking for the logs06:22
mariosTengu: cant find the keepalived one but dnsmasq one empty? is it ok ?
Tengumarios: well, that's what beagles said in his own comment. might need to tweak the commands directly06:23
Tengumaybe (probably?) some don't output to the stdout.06:23
mariosTengu: ah missed the comment06:23
mariosTengu: and i only found that dnsmasq one not the others but i guess depends on the job06:24
marios here i mean06:24
Tengudepends on the job, yeah06:24
mariosTengu: k do we want to tweak it now or you happy to merge as is?06:25
*** dsneddon has quit IRC06:25
Tengumarios: we can merge it, and see how we can get some more output as a follow-up, per service06:25
Tenguat least there's the basis for something good with that patch. tweaking the commands is another task, hence another change06:26
mariosTengu: ack06:27
Tenguthank you :)06:29
mariosTengu: np just clicked some buttons06:30
Tenguwell, there were important buttons ;)06:30
Tenguhm. *they, even.06:33
Tengupfff. morning. brain's still booting up.06:33
* marios hugs coffee06:33
Tengualready had one. need to pump it to the brain, taking time.06:34
Tengulike an old diesel.06:34
*** udesale has joined #tripleo06:56
*** hberaud|gone is now known as hberaud07:39
*** ykarel is now known as ykarel|lunch07:43
gchamoulTengu: woot!07:52
Tengugchamoul: for the in-flight? btw, I'll need to push some doc about them - any idea where?07:54
gchamoulTengu: to me, the in-flight should be somewhere in tripleo-docs07:55
Tengusure, and probably not that far from the validation framework or related.07:55
Tengusince it can run those very same validations from within the run :)07:55
gchamoulTengu: indeed07:55
Tengugchamoul: new file in the install/validations then?07:57
gchamoulTengu: are you talking about a tripleo-docs place?07:58
Tengugchamoul: yeah07:58
Tenguthere's a doc/sources/install/validations directory. probably can add something in there.07:58
gchamoulTengu: maybe I need to check though07:58
openstackgerritGael Chamoulaud proposed openstack/tripleo-validations master: [WIP] Update tripleo-validations documentation
gchamouljust a rebase ^^08:02
*** quique|rover|brb is now known as quiquell|rover08:05
*** dsneddon has joined #tripleo08:07
openstackgerritChandan Kumar (raukadah) proposed openstack/tripleo-quickstart master: Run novajoin tempest tests in all TLS job using os_tempest
openstackgerritPiotr Kopec proposed openstack/tripleo-heat-templates master: Allow NovaNfs parameters to be role specific
shyambI see a bug opened against this issue08:22
shyamb bug 1629331 in openstack-tripleo "docker container ironic_inspector_dnsmasq continuously restarts" [High,New] - Assigned to jslagle08:22
openstackgerritCédric Jeanneret proposed openstack/tripleo-docs master: New documentation for in-flight validations
shyambOne workaround is given on the bug08:22
shyambI am not sure how correct it is08:23
shyambTengu: Any thoughts here?08:23
openstackgerritCédric Jeanneret proposed openstack/tripleo-docs master: New documentation for in-flight validations
*** suuuper has joined #tripleo08:28
*** shyamb has quit IRC08:30
*** ykarel|lunch is now known as ykarel08:31
*** shyamb has joined #tripleo08:31
openstackgerritMerged openstack/tripleo-common master: Correct ovn-dbs health check
quiquell|roverbeagles: ping08:34
openstackgerritRajesh Tailor proposed openstack/tripleo-heat-templates master: Disable libvirtd service on controller node
openstackgerritMerged openstack/puppet-tripleo master: Enable file logging for podman neutron sidecars
openstackgerritGael Chamoulaud proposed openstack/tripleo-common master: Adding support for the new validation framework
openstackgerritGael Chamoulaud proposed openstack/tripleo-common master: Use 'DEFAULT_VALIDATIONS_BASEDIR' variable from
*** udesale has joined #tripleo09:20
openstackgerritGael Chamoulaud proposed openstack/tripleo-validations master: Adds roles support in the script generating the validations doc.
openstackgerritGael Chamoulaud proposed openstack/tripleo-validations master: Deletes validations directory
Tengunumans: heya! do you know anything about the neutron-haproxy-ovn-metadata container?09:33
numansTengu, I know a bit. But dalvarez knows it well.09:34
Tenguquiquell|rover: -^^09:34
numansdalvarez, can you please answer Tengu here.09:34
Tengunumans: thanks :)09:34
dalvarezsure, what's the question Tengu ?09:34
dalvarezi can try to answer, not sure if i'll do well :p09:35
Tengudalvarez: so quiquell|rover and I are trying to understand why the container fails running on fedora, and is exited with a 137 exit code.09:35
openstackgerritMartin Schuppert proposed openstack/tripleo-heat-templates master: Use oslo_messaging_rpc_port for nova rpc healthchecks
quiquell|roverdalvarez: LP
openstackLaunchpad bug 1824977 in tripleo "fedora-28 standalone failing at neutron-haproxy-ovnmeta service" [Critical,In progress]09:35
quiquell|roverdalvarez, Tengu: I have still the node at my tenant09:35
quiquell|roverif you want to poke there09:36
Tenguquiquell|rover: perfect :) I let you check with dalvarez then09:36
Tenguand if we could get some logs from haproxy, that would be really, really helpful.09:36
quiquell|roverTengu: do we need something else to have those logs ?09:36
Tenguquiquell|rover: quick question: what's the SELinux state on the vm?09:37
Tenguquiquell|rover: since haproxy doesn't write logs, we have to use the /dev/log socket. not sure about the haproxy.cfg though09:37
dalvarezquiquell|rover: Tengu that's like a very limited amount of info :P i checked the job log... what's in  /var/log/extras/failed_containers.log ?09:37
quiquell|roverTengu: Permissive09:37
Tenguquiquell|rover: :/ so not related to selinux.09:38
dalvarezquiquell|rover: Tengu not sure if it's specific to haproxy-ovnmeta though, but anyways can you check the ovn metadata agent logs?  /var/log/containers/neutron...09:38
Tengudalvarez: if it was easy... ;)09:38
quiquell|roverdalvarez: we don't have anything else the neutron-haproxy-ovnmeta is not dumping anything09:38
quiquell|roverdalvarez: Tengu has tried to run it but nothing appears there09:38
quiquell|roverdalvarez: it's only f28 and tempest is passing though09:39
dalvarezTengu: i shook my magic wand but nothing popped up :P09:39
dalvarezquiquell|rover: you mean that even with that error tempest is passing?09:39
dalvarezthat's odd, it shouldn't be, as instances won't fetch metadata without ovn metadata agent running09:39
Tengudalvarez: I tried to shake my sleeve, nothing either :(09:40
quiquell|roverdalvarez: all green here
quiquell|roverdalvarez: I can give you access to the node09:40
dalvarezquiquell|rover: then there's something else :) or something's already running and binding on the haproxy port09:40
dalvarezquiquell|rover: k i don't have much time now but i can try for 10 mins09:41
dalvarezquiquell|rover: let's go to that node and do some tmux09:41
dalvarezi'll bring some coffee there09:41
quiquell|roverdalvarez: ack thanks! just let me know when you have some time09:42
dalvarezquiquell|rover: go go let's go now :)09:42
quiquell|roverdalvarez: tmate or pub key ?09:43
dalvarezquiquell|rover: tmate works for me :)09:45
*** holser_ is now known as holser|lunch09:49
openstackgerritArx Cruz proposed openstack/tripleo-quickstart-extras master: Removing test_update_port_with_two_security_groups_and_extra_attributes
*** florianf has quit IRC10:00
*** dsneddon has quit IRC10:02
quiquell|roverTengu: do you know where are the undercloud creds at standalone ?10:03
quiquell|roverTengu: I mean at zuul home dir10:03
quiquell|roverA ok is at .config10:03
dalvarezquiquell|rover: Tengu numans mystery solved :)
dalvarezquiquell|rover: Tengu numans when haproxy sidecar container is no longer needed, the ovn metadata agent kills it, hence you see that 137. It's all good10:04
dalvarezand expected10:04
openstackgerritRabi Mishra proposed openstack/puppet-tripleo master: Set octavia provider_drivers config option correctly
quiquell|roverconfirmed and reproduced10:08
*** dsneddon has joined #tripleo10:08
quiquell|roverTengu: do you know why validate-services is done before tempest ?10:09
*** avivgt has quit IRC10:13
*** dsneddon has quit IRC10:14
*** janki has quit IRC10:16
*** dsneddon has joined #tripleo10:27
openstackgerritSagi Shnaidman proposed openstack/tripleo-quickstart-extras master: Add explicit DNS forwarders to TLS job
*** sanjayu_ has quit IRC10:31
sshnaidmquiquell|rover, Tengu ^^10:31
openstackgerritGabriele Cerami proposed openstack-infra/tripleo-ci master: [WIP] build containers: make tripleo-repos inclusion optional
quiquell|roversshnaidm: +2 let's wait for fs039 result10:33
jaosoriorsshnaidm: Added myself to the review; I'll wait for CI10:33
*** dsneddon has quit IRC10:37
openstackgerritQuique Llorente proposed openstack/tripleo-quickstart-extras master: Run validate-services before tempest
quiquell|roverTengu: ^ does this make any sense ?10:39
quiquell|roverdalvarez: ^10:39
quiquell|roverLet's add a depends on with revert on deactivation10:39
Tengudalvarez: oh wow... would be maybe better to actually "CLI stop container"?10:41
Tenguquiquell|rover: no real reason - I pushed that check there, that's all10:42
quiquell|roverTengu: let's just validate after the deploy10:42
Tenguif it's better suited elsewhere, move it10:42
Tenguah, nah10:42
TenguI did that in order to have a "lifetime" span10:42
quiquell|roverwe can filter those containers and all10:42
Tenguwe can indeed filter out the exit code 13710:42
quiquell|rovernot sure10:42
quiquell|roverwe can end with false positives there10:42
Tenguwe usually get something else when it crashes.10:42
quiquell|roverDoes it make sense to check after tempest ?10:43
Tenguwell, I wanted to ensure we have the state after some time10:43
Tengui.e. some time to let things crash if they have to10:43
quiquell|roverI have stopped the disabling of validate-services on f2810:43
quiquell|roverfrom merging10:43
Tenguif you run the validation right after the deploy, it MIGHT happen things are all good, and a moment later, after some use, things crash.10:43
quiquell|roverI prefer to do validate-services before tempest than not doing it at all10:44
quiquell|roverI see10:44
Tenguquiquell|rover: or we can put a filter for "known" container10:44
Tengulike that ovn-metadata thingy10:44
quiquell|roverbut we will have to maintain and all10:44
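The "filter for known containers" idea discussed above could look roughly like the sketch below. It is only an illustration under stated assumptions: the allow-list `EXPECTED_KILLED`, the function name, and the sample container names are hypothetical, and the real validate-services check may be structured quite differently. It also shows the maintenance cost quiquell mentions, since the allow-list has to be kept up to date by hand.

```python
# Hypothetical allow-list: name prefixes of sidecar containers that the
# Neutron agent is expected to kill (exit status 137) when no longer needed.
EXPECTED_KILLED = ("neutron-haproxy-ovnmeta",)


def unexpected_failures(containers):
    """Return the (name, exit_code) pairs that should fail the validation."""
    failures = []
    for name, exit_code in containers:
        if exit_code == 0:
            continue
        known_sidecar = name.startswith(EXPECTED_KILLED)
        if exit_code == 137 and known_sidecar:
            # Expected: the agent killed a sidecar it no longer needs.
            continue
        failures.append((name, exit_code))
    return failures


# Illustrative input only; these are not taken from a real job log.
sample = [
    ("neutron-haproxy-ovnmeta-1234", 137),
    ("nova_compute", 137),
    ("keystone", 1),
    ("glance_api", 0),
]
print(unexpected_failures(sample))
```

Note that a 137 from a container outside the allow-list is still reported, which limits the false-positive risk raised in the discussion.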
*** boazel_ has joined #tripleo10:44
Tenguright. hm.10:44
quiquell|roverok I am going to unblock the disabling of f2810:44
quiquell|roverand we work this out10:44
Tenguthe right thing would be to actually stop the container instead of killing its running process10:45
dalvarezTengu: quiquell|rover neutron code is not aware of containers so it kills the process it's monitoring10:45
dalvarezwhen it's no longer needed10:45
Tengudalvarez: hmm. the script that actually start the container has the needed info.10:45
Tengudalvarez: maybe time to make some changes in order to make neutron self-aware?10:46
dalvarezTengu: we deploy a wrapper to replace /bin/haproxy with a script that calls podman/docker...10:46
quiquell|roverTengu: btw validate-services is deactivated at centos-7-standalone10:46
dalvarezTengu: i dont think so :) why? anyways it can be proposed and discussed in neutron10:46
Tengudalvarez: that script generates the container name. so we might push that name somewhere, and let neutron kill the container instead of the service?10:46
dalvarezTengu: that's very tripleo specific as we're using podman there but neutron just monitors the processes10:47
dalvarezthis is also true not only for haproxy but for keepalived, dnsmasq, etc.10:47
*** boazel has quit IRC10:47
Tengudalvarez: yep, there are "some" wrappers for that indeed.10:48
Tengustill... that would prevent dangling containers.10:48
Tengutherefore limit disk usage10:48
Tenguand make things cleaner10:48
Tengudalvarez: using "kill" is fine in a non-container world. maybe a thing like "if <file> exists, execute that file, else kill"10:49
Tengudalvarez: that would let the "creation wrapper" generate a file with the right content, and neutron will just call that file.10:49
Tengudalvarez: consider that as a "nice possibility to put hooks in there"10:50
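The "if the file exists, execute it, else kill" hook Tengu proposes can be sketched as follows. This is not Neutron code: the function name `stop_service`, the hook path argument, and the choice to pass the PID to the hook are all assumptions made for illustration. The `run` and `kill` parameters exist only so the logic can be exercised without touching real processes.

```python
import os
import signal
import subprocess


def stop_service(pid, hook_path, run=subprocess.run, kill=os.kill):
    """Run the hook script if it exists; otherwise kill the process.

    Returns "hook" or "kill" to indicate which path was taken.
    """
    if os.path.isfile(hook_path):
        # A deployment tool (for example the wrapper that created the
        # container) could drop a script here that stops the container.
        run([hook_path, str(pid)], check=True)
        return "hook"
    # Non-containerized default: behave as before and kill the process.
    kill(pid, signal.SIGKILL)
    return "kill"
```

With such a hook, the wrapper that starts a sidecar container could write a small script that stops and removes that container, while deployments without the file would keep the current kill behavior.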
dalvarezTengu: yeah i'll raise it in our meetings but not sure it'll be accepted :) ie. to make openstack components container-aware10:51
dalvarezTengu: this is the reason why the sidecar containers were created10:51
dalvarezanyways let's explore it10:51
Tengudalvarez: maybe not "container-aware", but more "hooks-aware"10:52
dalvarezright, the hook side can be explored, that sounds reasonable10:52
Tenguthat's wider than just container if it can execute a script "if script is present".10:52
Tenguthanks :)10:52
quiquell|roverdalvarez: we have a different one related to neutron
openstackLaunchpad bug 1824315 in tripleo "periodic fedora28 standalone job failing at test_volume_boot_pattern" [Critical,In progress] - Assigned to Quique Llorente (quiquell)10:52
quiquell|roverdalvarez: maybe you can help there too10:52
Tengudalvarez: do you want me to take part in the meeting?10:52
dalvarezTengu: or you can start a thread then we can explore the hook idea10:53
Tengudalvarez: on what ML?10:54
dalvarezTengu: openstack-dev ?10:54
Tengu*openstack-discuss then10:54
dalvarezyeah sry10:54
dalvarezi keep forgetting the rename10:54
Tengunp :). lemme start a thread in there.10:54
dalvarezTengu: i believe it may be relevant for other components as well10:54
dalvarezcool :)10:54
Tenguguess I have to add [neutron][tripleo] in the topic right?10:54
quiquell|roverdalvarez: forget about last LP was meaning nova10:55
dalvarezTengu: +1 ... although i think it'd be nice to call other components attention10:55
dalvarezquiquell|rover: yeah i was reading, confused :)10:55
quiquell|roverdalvarez: it's for nova, bit of a brain fart10:55
dalvarezTengu: not sure if there's such wrappers/sidecars for nova, cinder.. etc10:55
dalvarezsry folks need to afk10:55
Tengudalvarez: not sure there are for other than neutron.10:55
dalvarezack then10:56
Tengulet's start with that one, and we'll see how it goes.10:56
Tenguthanks for your feedback and time!10:56
openstackgerritMerged openstack-infra/tripleo-ci master: Disable validations on f28 standalone
openstackgerritQuique Llorente proposed openstack-infra/tripleo-ci master: Activate validate-services for standalone
quiquell|roverTengu, dalvarez: ^ activating it with depends on running before tempest, let's see how this pans out11:00
openstackgerritMarios Andreou proposed openstack/tripleo-common master: docker-rm: check if rpm dependency is actually installed
mariosTengu: EmilienM updated ^11:00
*** gkadam is now known as gkadam-afk11:06
Tengulemme finish my mail for that new neutron feature: hooks :)11:06
mariosTengu: ack (also updating i think there is nit doing some testing now)11:09
mariosTengu: no actually is ok i think11:10
Tengumarios: lemme check that.11:11
mariosTengu: thanks11:11
Tenguhmm, that might work. Have you tested that somewhere?11:11
mariosTengu: just locally i played with it just now seems like the conditions work. otherwise lets see what it does with that standalone-upgrade job :D11:13
Tengumarios: even in "--check" with ansible-playbook?11:13
mariosTengu: didn't try that i was just running those tasks in a local play11:13
Tengumarios: would be good to ensure it's working in --check as well. apparently there are some downstream job playing that, and it might fail.11:14
mariosTengu: ack.. so --check gives warning about using command and error on the py2_docker_installed.rc == 0): 'dict object' has no attribute 'rc'11:15
Tengumarios: you can add a condition before the py2_docker_installed.rc:  py2_docker_installed is changed11:16
mariosTengu: but i don't get it. what does it mean about no attribute rc11:16
Tengumarios: you can debug: var=py2_docker_installed11:16
mariosTengu: hm but i added changed_when: false on that11:16
chandankumarmandre: hey11:16
mariosTengu: because of a linting issue i hit in another task earlier11:16
Tenguyou'll see, in that case you won't get the "rc" in the dict, since it's "skipped"11:16
chandankumarmandre: please have a look at this patch little improvement in tempest container11:17
Tengumarios: or "rc in py2_docker_installed" then11:17
*** cylopez has joined #tripleo11:17
mariosTengu: k will try that thanks11:17
Tengumarios: and you'll need that for the other resource using that py2_docker_installed.rc11:17
mariosTengu: yeah was just thinking that11:17
Tengumarios: another way is to disable the calls in check mode, using "check_mode: no"11:18
mariosTengu: i mean its gonna be a bit verbose...11:18
Tenguthat's as you want.11:18
Tenguboth work well.11:18
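The "no attribute 'rc'" error comes from the registered result of a task that was skipped in check mode: the command never ran, so there is no return code to read. The Python sketch below only models that situation with plain dictionaries; the two result shapes are approximations of what Ansible registers, not exact copies, and the helper name is hypothetical. It mirrors the "rc in py2_docker_installed" guard suggested above.

```python
def docker_py2_installed(result: dict) -> bool:
    """Guarded equivalent of the condition `py2_docker_installed.rc == 0`."""
    # Only look at "rc" when the command actually ran and reported one.
    return "rc" in result and result["rc"] == 0


# Approximate shape of a registered result after a normal run.
normal_run = {"changed": False, "rc": 0, "stdout": ""}

# Approximate shape when the command task is skipped in check mode.
check_mode_run = {"changed": False, "skipped": True}

print(docker_py2_installed(normal_run))
print(docker_py2_installed(check_mode_run))
```

The other option mentioned in the discussion, `check_mode: no` on the command task, avoids the guard by making the command run even under `--check`, so the registered result always carries `rc`.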
*** panda is now known as panda|lunch11:18
mariosTengu: added note for now
mariosTengu: to be honest I've never tried that with dry run before merging any other ansible tasks and wasn't aware something else is relying on that11:21
mariosTengu: i mean, if this breaks it, then probably lots of other things will too cos i dont think anyone checks that !11:21
mariosTengu: i can add the check_mode: no there if you think its necessary (-1 it :D) otherwise i wont11:22
*** janki has quit IRC11:22
*** holser|lunch is now known as holser_11:26
*** rh-jelabarre has joined #tripleo11:26
Tengumarios: hm, can't say for sure if we will hit an issue or not. maybe, maybe not. I'm even unable to tell how ppl are running that "check" for the generated playbooks.11:26
Tengumarios: you can see the issue here: bug 1697507 in openstack-tripleo-heat-templates "Forward logging to haproxy.log file task failing in dry-run mode" [High,Modified] - Assigned to cjeanner11:27
mariosTengu: do you recall what/where is checking that. and do you mean somewhere we actually run with dry-run before executing or something?11:27
mariosTengu: tx looking11:27
Tengumarios: downstream at least. and it's not the "--dry-run" apparently.11:27
Tenguthere are some jenkins things running the --check, with plain "ansible-playbook <playbook> --check <other options>"11:28
Tengucan't say more :(.11:28
mariosTengu: ack thanks for pointer11:28
mariosTengu: i thought it was something worse like we are running it in the tripleo code or something11:28
Tengumarios: well, it's actually some promotion blocker downstream11:28
openstackgerrityatin proposed openstack-infra/tripleo-ci master: Allow to override kolla-build rpm config
ykarelpanda|lunch, ^^11:29
mariosTengu: the error he has there is the same one i saw .. dict has no attribute rc11:29
mariosTengu: but i don't get why that is a promotion blocker?11:29
Tengumarios: he's a QE working on rhosp-15 integration, and apparently there are QE tests with that kind of run in order to ensure things are working as expected.11:30
Tengucan't say more. you'd rather have a quick chat with Victor if you want details11:30
mariosTengu: ack ok thanks11:30
EmilienMI'll look the patch shortly11:31
mariosEmilienM: ack hoping the standalone-upgrade will be green there now lets see11:31
Tengumarios: added link to the BZ as an answer, for reference.11:31
mariosTengu: thank you11:32
*** dsneddon has joined #tripleo11:32
openstackgerritFrancesco Pantano proposed openstack/tripleo-quickstart-extras master: [DNM] Added quickstart-centos-ceph-nautilus repo
openstackgerritSagi Shnaidman proposed openstack/tripleo-quickstart-extras master: DNM: test build image in SA
openstackgerritFrancesco Pantano proposed openstack/tripleo-quickstart-extras master: Added quickstart-centos-ceph-nautilus repo
*** dsneddon has quit IRC11:38
*** ratailor has quit IRC11:48
*** shyamb has joined #tripleo11:48
*** gkadam-afk is now known as gkadam11:51
Tengubeagles: heya! can you ping me once you're online? It's about :). Thanks!11:51
*** cylopez has quit IRC11:52
beaglesTengu: sure - I'm semi-nomadic while son is at appointment atm so I might get interrupted11:52
beaglesTengu: neutron-keepalived-state-change only runs as a daemon11:53
Tengubeagles: but without detaching to the background right?11:53
beaglesTengu: but that only means that it cannot run in its own container - so it needs to be exec'ed into something else is all11:53
beaglesTengu: yeah it detaches pretty sure11:53
beaglesTengu: I was just bringing it up to explain why were exec'ing it in the first place is all :)11:54
Tengubeagles: ok. so if it detaches, we don't really need the "--detach" option for podman. hopefully.11:54
beaglesTengu: instead of podman run11:54
beaglesTengu: yeah fingers crossed11:54
Tengunow I get it. so once I correct the CMD for podman, we'll be clean.11:54
Tenguand we'll be able to actually test that somewhere.11:54
beaglesI think so - nice catch11:54
Tenguwell, thank you for the review and for catching my own mistake ;)(11:55
openstackgerritCédric Jeanneret proposed openstack/puppet-tripleo master: Correct how podman exec is called for the neutron-keepalived-state-change
Tengulet's go then.11:56
beaglescool thanks for the catch and the fix!11:56
Tenguthere are some differences between podman exec and docker exec.11:56
Tenguhopefully those won't make the world burn.11:56
Tengubeagles: btw, I would love some discussions about "how to get some logs from the sidecars" launched by Neutron. At least the haproxy one doesn't output anything - I didn't check the others.... I guess you wouldn't be against some "-d" flag wherever it's possible right? since it's going to the container stdout...11:58
Tenguas soon as we get things to the stdout, that will be written in a dedicated file11:58
beaglesTengu: dnsmasq doesn't either afaict - logging to stdout might work best for debugging - and something is better than the nothing there is now11:59
*** thrash|g0ne is now known as thrash11:59
Tengubeagles: I'll try to get some "-d"12:00
beaglesTengu: ack12:00
Tengujust some mtg in the way, I'll work on that after it.12:01
openstackgerritSorin Sbarnea proposed openstack/tripleo-quickstart master: Test script using pytest/molecule
*** rlandy has joined #tripleo12:20
*** boazel_ has quit IRC12:29
*** shyamb has quit IRC12:37
bogdandoTengu there is debug parameter may be a proper switching trigger for advanced stdout logs12:40
*** artom has quit IRC12:42
ccamachomwhahaha o/12:42
ccamacho:) quick question12:43
ccamachowe started to hit the issue downstream when testing 13->1412:43
openstackgerritChandan Kumar (raukadah) proposed openstack/tripleo-quickstart master: Run novajoin tempest tests in all TLS job using os_tempest
ccamachodo we have BZs for tracking the fix?!12:43
ccamacholbezdick ^12:44
Tengubogdando: for haproxy, yeah. but not for dnsmasq. that one is a monster =.=12:46
bogdandoTengu: yeah, just saying let's control that behavior to not make this excessive logging as a default behavior :)12:47
*** mcornea has joined #tripleo12:47
Tengubogdando: hmm yeah. have to check how to get that into puppet-tripleo, since wrappers are generated there.12:48
TenguTHAT said... we might configure haproxy to actually push its thing to /dev/stdout12:49
bogdandoTengu: some examples
Tenguso that we will get its standard logs without debug.12:49
Tengubogdando: yeah, well, we can forget about dnsmasq being polite anyway :/12:50
bogdandoTengu: yeah, especially given that removing podman run -a flag in units would render all syslog logging efforts for podman futile12:50
bogdandowould it btw?..12:50
* bogdando thinks of disastrous UX regressions...12:50
Tengubogdando: hmmm well, stdout (aka old journald) is already written on the disk12:51
Tenguso it's not THAT bad12:51
Tengubut indeed, journalctl -u tripleo_foo won't output much12:51
bogdandoTengu: yes, I was just suddenly stunned by that questions, would /dev/log we pass for all containers take any effect when executed via podman run w/o -a12:51
Tengubogdando: no reasons it wouldn't.12:52
bogdandowell. hope so12:52
Tenguthe -a attaches the container stdout to the systemd process12:52
Tenguwhile /dev/log is a standard socket12:52
bogdandoit also takes care of journald integration12:52
Tenguno real reasons the -a will change that.12:52
Tenguwell, yes, but not with /dev/log12:53
Tengusame goal, different ways.12:53
*** pcaruana has joined #tripleo12:53
bogdandoTengu: ack12:54
bogdandothen it's fine12:54
Tenguthe -a is compensated by the --log-driver and stuff12:55
*** dsneddon has quit IRC12:57
*** shyamb has joined #tripleo12:58
*** amoralej is now known as amoralej|lunch13:01
*** Goneri has joined #tripleo13:05
*** dtantsur|brb is now known as dtantsur13:07
openstackgerritMartin Schuppert proposed openstack/tripleo-heat-templates master: Avoid concurrent nova cell_v2 discovery instances
mwhahahaccamacho: you are using the wrong containers13:19
mwhahahaccamacho: see bug 1700096 in openstack-tripleo-common "Mistral version mismatch between system libs and tripleo container" [High,Closed: notabug] - Assigned to apetrich13:20
*** mjturek has joined #tripleo13:22
*** shardy has joined #tripleo13:23
*** dsneddon has joined #tripleo13:26
*** Vorrtex has joined #tripleo13:27
openstackgerritBogdan Dobrelya proposed openstack/tripleo-docs master: Clarify DCN requirements for provider networks
ccamachothanks mwhahaha13:34
*** ykarel is now known as ykarel|afk13:36
*** ykarel|afk has quit IRC13:41
*** vinaykns has joined #tripleo13:45
*** quiquell|rover has quit IRC14:00
*** quiquell has joined #tripleo14:00
*** quiquell is now known as quiquell|off14:00
*** amoralej|lunch is now known as amoralej14:06
openstackgerritGael Chamoulaud proposed openstack/tripleo-common master: Use 'DEFAULT_VALIDATIONS_BASEDIR' variable from
*** boazel_ has joined #tripleo14:10
sshnaidmrlandy|ruck, can we merge it?  039 job fails for different reason now, dns resolving is ok14:11
sshnaidmalso it passed in previous run :(14:12
*** aakarsh has quit IRC14:12
rlandy|rucksshnaidm; lol - was just going to say still failing14:12
*** boazel has quit IRC14:12
rlandy|rucksshnaidm: k - let's merge this as it gets the job further at least14:14
*** itlinux has joined #tripleo14:19
*** dsneddon has joined #tripleo14:22
*** ykarel|afk has joined #tripleo14:24
openstackgerritMartin Schuppert proposed openstack/tripleo-heat-templates master: Avoid concurrent nova cell_v2 discovery instances
*** mjturek has quit IRC14:25
*** ykarel|afk is now known as ykarel14:25
*** dsneddon has quit IRC14:28
openstackgerritMerged openstack/tripleo-heat-templates master: Add OS::TripleO::NovaAZConfig
*** dsneddon has joined #tripleo14:30
openstackgerritMerged openstack/tripleo-ha-utils master: Switch to {{ undercloud_user }} for rsync
*** dsneddon has quit IRC14:36
*** jchhatbar has quit IRC14:53
*** jchhatbar has joined #tripleo14:54
*** weshay is now known as weshay|rover14:57
TenguI just remember why we don't see the haproxy logs from the sidecars :D14:59
Tenguwe do mount the host /dev/log inside the container - but the syslog configuration is set to ignore them.15:00
TenguGuess we can do something nice and clean, like updating the local0 to something else in order to allow to get a dedicated file in the right location.15:00
Tenguthat would already help a lot.15:00
mwhahahapffft logs who needs them15:01
Tengumore over..... maybe some tweaks can be done in the haproxy config.15:01
Tengumwhahaha: I know right ;). logs are like metrics: useless15:01
Tenguthat's why we have to get them XD15:01
mwhahahaif you can't see the error, is it really an error?15:02
Tenguthat's why I'm not happy when CI folks are removing my service validation thing (*stares at weshay|rover*)15:02
Tenguostrich policy is bad ;)15:03
*** mcornea has joined #tripleo15:03
Tengulet's get something nice and neat.15:03
*** dsneddon has joined #tripleo15:07
*** dsneddon has quit IRC15:14
*** absubram has joined #tripleo15:17
Tengubeagles, dalvarez : quick question: what about allowing the neutron "haproxy" metadata agent get a new param, "log_facility", instead of the hard-coded "local0" we have? That would allow a nice separation at syslog level, meaning "dedicated haproxy file" for that metadata agent.15:24
Tenguwe can set that parameter default value to "local0" in order to avoid breaking things, but...15:25
*** dsneddon has joined #tripleo15:25
Tenguin tripleo, it would be neat to get a different value in order to differentiate that haproxy instance from the standard frontend.15:25
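A rough sketch of the proposal above: make the syslog facility in the generated haproxy configuration a parameter that defaults to "local0", so existing deployments are unaffected. This is not Neutron's actual template code. The function name, the parameter names, and the restriction to local0 through local7 are assumptions for illustration; the only thing taken as given is haproxy's `log <address> <facility>` directive.

```python
# Assumption: only the local0..local7 facilities are offered here, since
# those are the ones normally reserved for site-specific use in syslog.
VALID_FACILITIES = {f"local{i}" for i in range(8)}


def haproxy_log_line(log_facility: str = "local0",
                     address: str = "/dev/log") -> str:
    """Render the haproxy `log` directive for the given syslog facility."""
    if log_facility not in VALID_FACILITIES:
        raise ValueError(f"unsupported syslog facility: {log_facility!r}")
    return f"    log {address} {log_facility}"


# The default keeps today's behavior; a deployment tool could pick another
# facility and route it to a dedicated file in its syslog configuration.
print(haproxy_log_line())
print(haproxy_log_line("local3"))
```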
*** dsneddon has quit IRC15:32
*** jchhatbar has quit IRC15:33
*** jchhatbar has joined #tripleo15:33
*** sanjayu_ has quit IRC15:45
*** boazel has joined #tripleo15:53
*** iurygregory has quit IRC15:53
*** boazel_ has quit IRC15:56
*** bogdando has quit IRC16:01
*** ykarel is now known as ykarel|away16:05
openstackgerritRabi Mishra proposed openstack/puppet-tripleo master: Set octavia provider_drivers config option correctly
*** shyamb has quit IRC16:06
*** baha has quit IRC16:06
openstackgerritMarios Andreou proposed openstack-infra/tripleo-ci master: commenting for maintainability
*** suuuper has quit IRC16:09
beaglesTengu: those config files are generated by neutron, correct? We'd have to either edit the cfg in place or submit a patch to neutron.16:11
beaglesTengu: I don't disagree though16:11
beaglesTengu: one thing we need to think about is the lifecycles of networks/routers etc16:12
beaglesTengu: and when are the logs going to get cleaned up16:12
Tengubeagles: yeah, generated within neutron, already checked where/how.16:17
*** marios has quit IRC16:18
Tengubeagles: regarding the logs: logrotate can take care of them eventually. Or we can tweak some tmpwatch call just for those logs.16:18
beaglesTengu: ack16:18
openstackgerritMerged openstack/tripleo-heat-templates master: Add Etcd to DistributedCompute roles
Tengubeagles: I can try to provide a WIP patch tomorrow, but I'll need some help I think :).16:22
Tenguanyway. signing-off for real now :). See you tomorrow!16:23
*** dsneddon has joined #tripleo16:32
*** florianf has quit IRC16:35
beaglesTengu: cheers16:37
*** khyr0n has quit IRC16:47
*** hberaud is now known as hberaud|gone17:02
*** avivgt has quit IRC17:03
*** kopecmartin is now known as kopecmartin|off17:05
*** dsneddon has joined #tripleo17:08
openstackgerritAlex Schultz proposed openstack/tripleo-common master: Remove ProcessTemplatesAction as base class
*** mcornea has quit IRC17:25
openstackgerritPiotr Kopec proposed openstack/tripleo-heat-templates stable/rocky: Allow NovaRbdPoolName to be role specific
*** aakarsh has quit IRC17:27
*** krypto has joined #tripleo17:28
*** aakarsh has joined #tripleo17:33
kryptohi all, i am trying to add a new node to the stack, but it's failing because of an existing node. I don't see any commands being executed on that node; there is nothing in "/var/log/os-apply-config.log", while on other nodes i can see some changes in log files17:34
*** dsneddon has joined #tripleo17:34
kryptois there an agent in tripleo that runs on all nodes17:34
kryptolast message during stack update after that it fails after timeout17:35
krypto2019-04-17 17:14:48Z [ControllerIpListMap]: UPDATE_COMPLETE  state changed17:35
krypto2019-04-17 17:14:48Z [AllNodesValidationConfig]: UPDATE_COMPLETE  state changed17:35
*** boazel_ has joined #tripleo17:35
kryptoafter posting here i noticed that os-collect-config is not running on old node17:37
*** mcornea has joined #tripleo17:38
*** boazel has quit IRC17:38
*** arxcruz is now known as arxcruz|off|2317:53
openstackgerritPiotr Kopec proposed openstack/tripleo-heat-templates stable/queens: Allow NovaRbdPoolName to be role specific
mwhahahakrypto: what version18:02
mwhahahaand what task times out18:02
*** amoralej is now known as amoralej|off18:03
*** Goneri has quit IRC18:06
kryptomwhahaha Newton ...this is a lab environment simulating production18:07
mwhahahawhat does os-collect-config show in the journal18:08
kryptothis is the last message18:08
krypto2019-04-17 17:14:48Z [AllNodesValidationConfig]: UPDATE_COMPLETE  state changed18:08
krypto Stack overcloud UPDATE_FAILED18:08
kryptoHeat Stack update failed.18:08
kryptousr/bin/python /usr/bin/os-collect-config18:09
kryptoSource [request] Unavailable.18:09
kryptoNo local metadata found (['/var/lib/os-collect-config/local-data'])18:09
kryptoNo auth_url configured.18:09
mwhahahayea that message is always there18:09
kryptowhen i run it manually it exits with above messages18:09
mwhahahai think18:09
mwhahahayou might check the heat api logs18:09
mwhahahamake sure it's not erroring18:09
kryptoon other nodes it's running fine with this message: "/var/lib/os-collect-config/local-data not found. Skipping18:10
kryptoNo local metadata found (['/var/lib/os-collect-config/local-data'])"18:10
kryptoat least the service is running fine18:10
*** dmacpher_ has joined #tripleo18:11
*** psachin has quit IRC18:13
*** dmacpher has quit IRC18:13
*** rfolco has quit IRC18:20
*** Goneri has joined #tripleo18:21
*** rfolco has joined #tripleo18:21
openstackgerritNagasai Vinaykumar Kapalavai proposed openstack/puppet-tripleo master: Configuration changes to support Qdr-mesh topology.
*** ykarel|away has quit IRC19:02
*** holser_ has joined #tripleo19:12
*** boazel has joined #tripleo19:18
*** raildo has joined #tripleo19:20
*** boazel_ has quit IRC19:21
*** jcoufal has joined #tripleo19:21
*** artom has quit IRC19:22
*** khyr0n has quit IRC19:26
kryptomwhahaha i don't see anything strange in heat api logs; when i run python /usr/bin/os-collect-config it exits and doesn't stay up19:34
*** Goneri has joined #tripleo19:34
kryptoSource [request] Unavailable.19:35
mwhahahayea it runs periodically19:36
kryptothis file is also empty /etc/os-collect-config.conf19:39
kryptoi guess it happened because during stack update nova service on director was not responding  and metadata was not reachable19:40
*** rlandy|ruck is now known as rlandy|afk19:46
kryptois there a way to regenerate /etc/os-net-config/config.json and /etc/os-collect-config.conf19:49
*** dciabrin_ has joined #tripleo19:54
*** dciabrin has quit IRC19:55
*** raildo has quit IRC20:02
openstackgerritRafael Folco proposed openstack-infra/tripleo-ci master: DNM: Test push containers for buildah jobs
*** jcoufal has quit IRC20:22
openstackgerritRafael Folco proposed openstack-infra/tripleo-ci master: DNM: Test push containers for buildah jobs
*** jcoufal has joined #tripleo20:30
*** snecklifter has joined #tripleo20:30
snecklifterlaunchpad broken for anyone else or just me?20:30
*** jcoufal has quit IRC20:39
dsneddonkrypto, I'm not aware of any way to generate /etc/os-net-config/config.json and /etc/os-collect-config.conf without doing a stack update.20:55
dsneddonkrypto, Do you see anything on the logs on the host about os-net-config? I wonder if the /etc/os-net-config/config.json was empty because of a failure. I would grep /var/log/messages for "os-net-config" or "network_config" and see if you have logs about whether the networking is at fault.20:57
kryptodsneddon heat stack-update was ignoring this node and there were no new logs in /var/log/os-apply-config.log, probably because os-collect was not running. Finally i copied /etc/os-net-config/config.json from another node (changed IP) and copied /etc/os-collect-config.conf from another node (changed metadata values) and now the service is running20:58
kryptodsneddon you are right after reboot this node had only pxe nic ... in what case would os-collect remove interface configuration20:59
*** boazel_ has joined #tripleo21:01
ade_leeEmilienM, mwhahaha - hey guys, any tips on debugging a failing overcloud deploy where I'm getting a ValueError  Value must be valid JSON:21:01
*** snecklifter has quit IRC21:01
ade_leehappening at the beginning of the deployment after the pre-deploy validations21:01
ade_leeunfortunately, it doesn't tell me what json is broken ..21:02
mwhahahawould need the whole output21:02
mwhahahathat's a generic error21:02
ade_leeother than: ValueError: Value must be valid JSON: Extra data: line 1 column 6 - line 1 column 14 (char 5 - 13)21:02
mwhahahaneed the surrounding lines, that doesn't mean anything other than some json is bad21:03
ade_leemwhahaha, can give you a tmate if you can jump on and see ..21:03
*** boazel has quit IRC21:04
*** rh-jelabarre has quit IRC21:09
*** rh-jelabarre has joined #tripleo21:09
*** slaweq has quit IRC21:24
*** mcornea has quit IRC21:35
dsneddonkrypto, If os-net-config fails, it will fall back to a configuration where every NIC gets set to DHCP. And since only the PXE NIC probably has a DHCP server on the network, you only see that NIC active.21:50
kryptoThanks dsneddon that was the issue with that node21:51
dsneddonkrypto, Which version are you deploying? You probably want to have "" calling os-net-config, not os-apply-config. If you see "os-apply-config" in your NIC config templates, I can guide you to make those changes.21:51
kryptoafter i started os-net-collect it applied all the configuration and node is functional now ,however deployment failed21:54
dsneddonkrypto, OK, I think you are doing the right thing for Newton, we still relied on os-collect-config to run the network configuration.21:57
kryptoi am still learning triple-O, the guy who deployed this lab left our company...21:58
kryptocan you help me understand what this error means openstack stack failures list overcloud --long21:59
krypto  resource_type: OS::Heat::StructuredDeployment21:59
krypto  physical_resource_id: ff40d821-5439-43eb-a2e5-93d74db2d82521:59
krypto  status: CREATE_FAILED21:59
krypto  status_reason: |21:59
krypto    CREATE aborted21:59
krypto  deploy_stdout: |21:59
krypto  deploy_stderr: |21:59
mwhahahasounds like validation issues21:59
mwhahahai think that's where it does connectivity checks22:01
kryptois there a log i can look to catch these errors
mwhahahathe heat engine log might have something22:04
mwhahahanot completely sure22:04
kryptoi havent enabled debug in heat possibly because of that ...i couldnt find anything in heat-engine.log22:06
mwhahahain newton you could query heat for the actual errors usually22:06
*** rh-jelabarre has quit IRC22:08
*** boazel_ has quit IRC22:11
kryptoThanks i will check it22:21
openstackgerritMerged openstack/tripleo-common master: Uncap jsonschema
openstackgerritMerged openstack/tripleo-heat-templates master: Fix CI ipv6 NIC config default route
openstackgerritEmilien Macchi proposed openstack/tripleo-heat-templates master: Scale-down tasks for RHSM
*** aakarsh has quit IRC22:49
gouthamrx-posting from #rdo:23:00
gouthamrhi folks, i'm looking for a stable/rocky backport to kolla to be available in a container image23:01
gouthamrbackport in question ( merged 3 days ago23:01
gouthamri looked here: and if i understand this correctly, it will take longer for getting newer container images promoted?23:01
*** rlandy|afk is now known as rlandy|ruck23:03
gouthamrunsure of the cadence here, any pointers appreciated :)23:05
*** raildo has quit IRC23:07
EmilienMweshay|rover: FYI
EmilienMweshay|rover: we're going to branch, probably tomorrow23:07
weshay|roverEmilienM ah.. thanks23:12
weshay|roverrfolco rlandy|ruck HEADS UP23:12
rlandy|ruckk - thanks for the advance warning23:12
weshay|roverit's like the first time :P23:13
weshay|roveronly took EmilienM 10 yrs23:13
* weshay|rover runs23:13
EmilienM18? wow I got an upgrade23:13
* mwhahaha rm -rfs all the things23:14
EmilienMYES to that23:15
mwhahahaEmilienM: btw placement is a Train thing23:16
mwhahahanova is planning on supporting it for stein still i think23:16
mwhahahaif i recall the ML posts23:16
EmilienMmwhahaha: right but they plan to remove it after PTG I've heard23:16
EmilienMwhich will break our master jobs in promotion23:16
EmilienMas the placement integration isn't finished at all23:17
EmilienMstill under review and failing heavily23:17
EmilienMso yeah kind of worried23:17
mwhahahahorrible breakages in m123:17
EmilienManyway I'm out23:17
mwhahahacan't be worse than the other twelve times they've done it23:17
* mwhahaha runs away23:17
EmilienMlol that was a valid troll23:18
weshay|roverwe don't need  no stinking jobs in nova23:20
weshay|roverEmilienM is this where we allow packages for f28 builds?
weshay|roverrlandy|ruck ^23:37
weshay|roverrlandy|ruck I see setuptools is there for python2/323:37
weshay|roverrlandy|ruck ya.. so if the rpm is available23:39
weshay|roverperhaps it's not installed and we have to update a spec23:39
* weshay|rover looks at the bug23:39
weshay|roverer.. rlandy|ruck
rlandy|ruckweshay|rover; sec - fixing ci.centos failures - it was my fault :(23:42
weshay|roverIT'S ALL YOUR FAULT23:42
*** khyr0n has joined #tripleo23:43
rlandy|ruckweshay|rover: ^^ that should fix one set of problems23:47
*** absubram has quit IRC23:50
weshay|roverrlandy|ruck afaict.. /me adding notes to the bug23:50
weshay|roverI don't see why we would get that import error yet23:50
weshay|roverrlandy|ruck so... this is a good one for the repro23:51
weshay|roverI'll fire one up23:51
rlandy|ruckwe are only seeing that in promotions23:52
weshay|roverrlandy|ruck ya.. that is even weirder23:52
weshay|roverbecause it's all tripleo code23:52
rlandy|ruckalso it's fedora23:52
rlandy|ruckwe don't run that as commonly23:52
*** boazel_ has joined #tripleo23:52
weshay|roverit's running in check23:53
weshay|roverkicked a repro23:54
rlandy|ruckweshay|rover: thanks - can you vote on so we can hopefully clear that error23:55
*** boazel has quit IRC23:55
*** khyr0n has quit IRC23:56
*** khyr0n has joined #tripleo23:59

