Friday, 2020-06-26

openstackgerritMerged openstack/octavia stable/train: Update the lb_id on an amp earlier if we know it  https://review.opendev.org/70661500:21
*** yamamoto has joined #openstack-lbaas00:36
*** yamamoto has quit IRC00:41
*** jamesden_ has joined #openstack-lbaas00:43
*** jamesdenton has quit IRC00:43
*** armax has quit IRC00:50
*** armax has joined #openstack-lbaas00:58
*** yamamoto has joined #openstack-lbaas01:12
*** yamamoto has quit IRC01:38
*** vishalmanchanda has joined #openstack-lbaas02:29
*** yamamoto has joined #openstack-lbaas02:42
*** yamamoto has quit IRC02:43
*** yamamoto has joined #openstack-lbaas02:43
*** hongbin has joined #openstack-lbaas02:59
*** psachin has joined #openstack-lbaas03:29
*** hongbin has quit IRC03:33
*** armax has quit IRC04:05
*** spatel has joined #openstack-lbaas04:17
*** spatel has quit IRC04:21
*** aannuusshhkkaa has quit IRC05:37
*** maciejjozefczyk has joined #openstack-lbaas05:50
*** maciejjozefczyk has quit IRC05:51
openstackgerritCarlos Goncalves proposed openstack/octavia master: DNM: ARM64 support  https://review.opendev.org/73809606:22
*** rpittau|afk is now known as rpittau06:29
*** riuzen has joined #openstack-lbaas06:59
*** riuzen has quit IRC07:01
*** maciejjozefczyk has joined #openstack-lbaas07:25
*** stingrayza has joined #openstack-lbaas07:26
*** stingray- has joined #openstack-lbaas07:28
*** also_stingrayza has quit IRC07:29
*** stingrayza has quit IRC07:31
openstackgerritAnn Taraday proposed openstack/octavia master: Preupgrade check for amphorav2 provider  https://review.opendev.org/73555607:41
*** ataraday_ has joined #openstack-lbaas07:54
openstackgerritCarlos Goncalves proposed openstack/octavia master: DNM: ARM64 support  https://review.opendev.org/73809608:11
*** stingray- is now known as stingrayza08:36
openstackgerritCarlos Goncalves proposed openstack/octavia-tempest-plugin master: DNM: CentOS 8 controller and amphora job  https://review.opendev.org/69845008:50
*** born2bake has joined #openstack-lbaas09:01
openstackgerritCarlos Goncalves proposed openstack/octavia-tempest-plugin master: DNM: CentOS 8 controller and amphora job  https://review.opendev.org/69845009:05
*** ramishra has quit IRC09:09
*** luksky has joined #openstack-lbaas09:18
openstackgerritCarlos Goncalves proposed openstack/octavia master: Introduce an image driver interface  https://review.opendev.org/73801709:23
openstackgerritCarlos Goncalves proposed openstack/octavia master: Add amphora image tag capability to Octavia flavors  https://review.opendev.org/73752809:23
openstackgerritCarlos Goncalves proposed openstack/octavia master: DNM: ARM64 support  https://review.opendev.org/73809609:24
openstackgerritCarlos Goncalves proposed openstack/octavia master: DNM: ARM64 support  https://review.opendev.org/73809609:49
*** ramishra has joined #openstack-lbaas09:52
*** rpittau is now known as rpittau|bbl10:04
openstackgerritMerged openstack/octavia stable/train: Workaround peer name starting with hyphen  https://review.opendev.org/73243010:15
cgoncalvescores, please review these two backports so that we can cut a train dot release: https://review.opendev.org/#/c/738023/ and https://review.opendev.org/#/c/737155/10:19
openstackgerritCarlos Goncalves proposed openstack/octavia master: DNM: ARM64 support  https://review.opendev.org/73809611:22
*** gcheresh has joined #openstack-lbaas11:34
*** gcheresh has quit IRC11:44
*** rpittau|bbl is now known as rpittau12:12
openstackgerritCarlos Goncalves proposed openstack/octavia master: DNM: ARM64 support  https://review.opendev.org/73809612:33
*** gcheresh has joined #openstack-lbaas12:57
*** jamesden_ is now known as jamesdenton12:59
*** stingrayza has quit IRC13:00
*** TrevorV has joined #openstack-lbaas13:25
ataraday_johnsom, Hi! I realized that the failover refactor broke some amphorav2 functionality: for example, the update_vrrp_conf() method changed and the v2 tasks were not updated. If you have a WIP patch for the v2 refactor, maybe you can upload it and we can work on it together to speed things up.13:28
openstackgerritCarlos Goncalves proposed openstack/octavia master: DNM: ARM64 support  https://review.opendev.org/73809613:46
*** yamamoto has quit IRC13:54
*** gthiemon1e is now known as gthiemonge13:55
*** yamamoto has joined #openstack-lbaas13:56
*** stingrayza has joined #openstack-lbaas13:57
*** gcheresh has quit IRC14:19
*** armax has joined #openstack-lbaas14:24
*** ataraday_ has quit IRC14:51
openstackgerritCarlos Goncalves proposed openstack/octavia master: DNM: ARM64 support  https://review.opendev.org/73809615:26
openstackgerritCarlos Goncalves proposed openstack/octavia master: DNM: ARM64 support  https://review.opendev.org/73809615:30
*** gcheresh has joined #openstack-lbaas15:31
*** yamamoto has quit IRC15:44
*** rpittau is now known as rpittau|afk15:59
*** vishalmanchanda has quit IRC16:15
*** psachin has quit IRC16:18
*** yamamoto has joined #openstack-lbaas16:20
*** yamamoto has quit IRC16:27
openstackgerritCarlos Goncalves proposed openstack/octavia master: DNM: ARM64 support  https://review.opendev.org/73809616:37
xgermancgoncalves: you are welcome16:45
cgoncalvesxgerman, lol. you've just +W'd them. danke dir!16:46
johnsomxgerman While you are on a roll: https://review.opendev.org/#/c/738070/16:47
johnsomThat actually broke our OSA friends16:48
johnsomThanks!16:49
xgermansure thing!16:49
johnsomcgoncalves Thanks for getting stable/train moving again. We should try to get that endpoint_type patch in the stable/train release as the bad patch was backported there.16:50
johnsomWow, there is an octavia patch in experimental that has run 28 hr 21 min16:51
johnsomIt's the v2 patch16:53
*** rouk has joined #openstack-lbaas17:04
roukquestion, i have a user who "accidentally" deleted their certs out of barbican before setting the new ones on the listener, and setting the new one on the listener errors due to the old ones, whats the best workaround?17:06
johnsomThat was fixed a while ago. What version of Octavia are you running?17:07
roukuhhh, sec17:07
johnsomhttps://review.opendev.org/#/c/691987/17:08
rouk5.0.1, latest train.17:11
johnsomThat issue was a confluence of open bugs coming together. There is an open bug in barbican to allow us to "reserve" pkcs12 bundles like we did for the raw containers format. So when we switched to pkcs12 bundles all of a sudden people could delete in use barbican content. Which exposed that our API thought they could not do that. So we had to go fix all of that in the API.17:11
roukshould i be building from stable/train?17:11
johnsomOk, one second let me check the status of that patch in train. We are planning a train stable branch release very soon.17:11
roukyeah, if stable/train has it, or i can safely cherrypick it, ill get that built and deployed to fix this for us.17:13
johnsomYeah, stable/train has these patches. I'm just not sure if it has been "released" yet. Github is not being nice to me today.17:15
johnsomThere were a series of patches related to that issue. It might be hard to cherry pick.17:15
roukwell if its in stable/train im fine building that17:16
roukwe are just using release packages currently for octavia17:16
johnsomAs a short term workaround, you could update the barbican href in the database to point to the new href.17:16
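The short-term workaround johnsom mentions boils down to a single UPDATE on the listener row. Below is a minimal sketch, not a tested procedure. It uses an in-memory SQLite table as a toy stand-in for the Octavia database; the table name `listener` and column `tls_certificate_id` are assumptions about the schema, and the IDs and hrefs are made up for illustration. Against a real deployment you would back up the database first and verify the schema for your release.

```python
import sqlite3


def repoint_listener_cert(conn, listener_id, new_href):
    """Point one listener's certificate reference at a new Barbican href.

    Returns the number of rows changed (expected: 1).
    """
    cur = conn.execute(
        "UPDATE listener SET tls_certificate_id = ? WHERE id = ?",
        (new_href, listener_id),
    )
    conn.commit()
    return cur.rowcount


# Toy stand-in for the real database (assumed schema, invented values).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE listener (id TEXT PRIMARY KEY, tls_certificate_id TEXT)"
)
conn.execute(
    "INSERT INTO listener VALUES (?, ?)",
    ("listener-1", "https://barbican.example/v1/secrets/old"),
)

updated = repoint_listener_cert(
    conn, "listener-1", "https://barbican.example/v1/secrets/new"
)
```

Parameter binding (`?`) is used instead of string formatting so the href is never interpolated into the SQL text.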
roukwill the listener delete?17:16
rouki could dump the listener and tell the user to remake it, at worst.17:16
johnsomNo17:16
cgoncalvesfix not released yet in train17:17
johnsomOk. Yeah, we are cutting a new train release fairly soon. I would expect one early next week at the latest.17:17
roukah okay, ill get stable/train built and deployed then, had to do that for a lot of projects in stein too... as missing patches in release versions plagues every openstack project basically.17:19
johnsomYeah, we hit gate breakage from the python3 switch, so we couldn't do stable releases for a while.17:19
roukdoes the patch have any messaging that tells the user that the cert is gone? i have it in my logs, but the user just gets a bricked LB with everything in error.17:21
openstackgerritCarlos Goncalves proposed openstack/octavia-tempest-plugin master: Define and use octavia nodesets  https://review.opendev.org/73824617:22
johnsomWell, the patches just make actions work when the content is missing.17:23
johnsomIf they try to use a missing pkcs12 bundle, the api and client both tell the user it is invalid17:24
*** gcheresh has quit IRC17:24
roukalright, not really an issue, just curious is all17:24
*** maciejjozefczyk has quit IRC17:28
johnsomrouk I just checked on delete, it appears the delete fix was in 5.0.1, so they should be able to delete the listener and recreate.17:29
johnsomJust update isn't in 5.0.117:29
roukdelete doesnt work, stuck complaining17:39
roukno error returned to client, same octavia-listener-delete/update flow that is stuck complaining about certs17:40
johnsomYeah, I know you don't have update, but I would have expected the delete to work with 5.0.117:40
roukyeah no dice, same revert based on the cert retrieval error17:41
openstackgerritMerged openstack/octavia stable/train: Fix netcat option in udp_check.sh for CentOS/RHEL  https://review.opendev.org/73802318:35
johnsomUgh, one of the stable/train patches landed on a rax host, 600 seconds a test19:11
openstackgerritMerged openstack/octavia stable/train: Fix batch member create for v1 amphora driver  https://review.opendev.org/73715519:23
openstackgerritMerged openstack/octavia master: Fix neutron subnet lookup ignoring endpoint_type  https://review.opendev.org/73807019:23
johnsomWell, at least it merged19:24
openstackgerritMichael Johnson proposed openstack/octavia stable/ussuri: Fix neutron subnet lookup ignoring endpoint_type  https://review.opendev.org/73826419:24
openstackgerritMichael Johnson proposed openstack/octavia stable/train: Fix neutron subnet lookup ignoring endpoint_type  https://review.opendev.org/73826519:24
roukthis would be in the haproxy logs on the amphora, right? had every amphora in dev go error on cert rotation after 18 days on train octavia.amphorae.drivers.haproxy.exceptions.InternalServerError: Internal Server Error19:31
johnsomNo, the reason there should be in the main controller logs. It will also be in the amphora logs, likely syslog or amphora-agent19:32
roukno reason i can find in controller logs19:33
johnsomIt should be in the line or two above the "internal server error line"19:33
roukhttp://paste.openstack.org/show/PRXAusVIUR5QTPo3igK9/19:34
johnsomYeah, it's still above those lines.19:36
roukmm, figured it would be in the stack, looked higher, its just a generic message so my search didnt see it: Amphora agent returned unexpected result code 500 with response {'error': 'write() argument must be str, not bytes', 'http_code': 500}19:36
rouksmells like python219:37
roukthe cert being used is being represented as b'XXX' in the logs.19:37
johnsomAh, yeah, that was fixed too. It was a python issue19:37
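The error rouk found is the generic Python 3 failure when bytes are written to a text-mode stream. The sketch below only illustrates that class of bug and one common way around it; it is not the actual Octavia fix, and `write_cert` and the placeholder PEM data are hypothetical.

```python
import io

# Placeholder certificate data as bytes, like the b'XXX' seen in the logs.
pem_bytes = b"-----BEGIN CERTIFICATE-----\nXXX\n-----END CERTIFICATE-----\n"


def write_cert(stream, pem):
    """Write PEM data to a text stream, decoding bytes first if needed."""
    if isinstance(pem, bytes):
        pem = pem.decode("utf-8")
    stream.write(pem)


# Reproduce the failure: a text-mode stream rejects bytes with TypeError.
broken = io.TextIOWrapper(io.BytesIO(), encoding="utf-8", newline="")
try:
    broken.write(pem_bytes)
    raised = False
except TypeError:
    raised = True

# Same write through the helper, which decodes before writing.
raw = io.BytesIO()
fixed = io.TextIOWrapper(raw, encoding="utf-8", newline="")
write_cert(fixed, pem_bytes)
fixed.flush()
written = raw.getvalue()
```

The other common option is to keep the data as bytes and open the destination in binary mode instead of decoding.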
roukthis might be 5.0.019:38
roukchecking19:38
roukoh, too late, already updated the containers, dont have records in this dev spot.19:38
johnsomIt was this bug: https://review.opendev.org/#/c/719922/19:39
roukit was kolla as of march 25th ish19:39
openstackgerritCarlos Goncalves proposed openstack/octavia master: DNM: ARM64 support  https://review.opendev.org/73809619:43
roukjohnsom: yeah checked and 5.0.1 doesnt have that fix19:49
roukkinda a timebomb :/19:49
roukglad im pushing stable/train today.19:49
*** TrevorV has quit IRC19:50
*** yamamoto has joined #openstack-lbaas20:25
*** yamamoto has quit IRC20:30
*** aannuusshhkkaa has joined #openstack-lbaas20:31
*** shtepanie has joined #openstack-lbaas20:41
shtepaniehi! does anyone have any advice or tips on how to debug a test case from Zuul that timed out? https://zuul.opendev.org/t/openstack/build/73e396cdd62f40cca2914f7e0be683a020:46
johnsomI can take a look20:46
johnsomshtepanie So that job is "non-voting" which means we know it is sometimes experiencing problems not related to the patch.20:47
johnsomThat said, I will still take a quick look20:47
shtepanieah i see, thanks!20:48
johnsomSo this one, show_loadbalancer provisioning_status failed to update to ACTIVE within the required time 900. Current status of show_loadbalancer: PENDING_CREATE20:49
johnsomFor whatever reason nova was unable to boot a VM inside the 900 second timeout.20:49
shtepaniewhere did you look to find that error?20:50
johnsomOr it never was reachable on the network.20:50
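The failure quoted above is what a poll-until-status loop reports when its deadline passes. A minimal sketch of such a loop follows; `wait_for_status`, `FakeClock`, and their parameters are hypothetical and are not the tempest plugin's actual waiter. The clock and sleep functions are injectable so the example runs instantly.

```python
import time


def wait_for_status(get_status, target="ACTIVE", timeout=900, interval=1,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll get_status() until it returns target or timeout seconds pass."""
    deadline = clock() + timeout
    status = get_status()
    while status != target:
        if clock() >= deadline:
            raise TimeoutError(
                f"provisioning_status failed to update to {target} within "
                f"the required time {timeout}. Current status: {status}"
            )
        sleep(interval)
        status = get_status()
    return status


class FakeClock:
    """Simulated clock so the demo does not really wait."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


# A load balancer that never leaves PENDING_CREATE, as in the failed job.
fake = FakeClock()
try:
    wait_for_status(lambda: "PENDING_CREATE", timeout=900, interval=5,
                    clock=fake.time, sleep=fake.sleep)
    timed_out = False
except TimeoutError:
    timed_out = True
```

The loop cannot tell why the status never changed. As in the discussion, the cause has to be found in the service logs.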
johnsomhttps://zuul.opendev.org/t/openstack/build/73e396cdd62f40cca2914f7e0be683a0/log/job-output.txt#6061220:50
johnsomThat is the direct link. I usually start with the "job-output.txt" log file. Then I will dig into the other logs as I see where the error might be20:51
aannuusshhkkaaSo how do we go about debugging this?20:51
openstackgerritCarlos Goncalves proposed openstack/octavia master: DNM: ARM64 support  https://review.opendev.org/73809620:52
johnsomWell, I would say that you don't need to debug it as it is unrelated to your patch. But if you are interested in trying to figure it out, we would drill down into the deeper level logs to see if there are hints.20:53
johnsomMy next stop is usually the screen-o-cw.txt log, which is an Octavia worker log file: https://zuul.opendev.org/t/openstack/build/73e396cdd62f40cca2914f7e0be683a0/log/controller/logs/screen-o-cw.txt#176020:54
johnsomI can confirm here that the only issue/error is that the controller can't reach the VM being booted.20:54
johnsomAgain at this point it is either a nova or neutron failure.20:54
johnsomAs you can see in the logs, we try for a long time....20:55
shtepanieah ok, thanks for clarifying!20:56
aannuusshhkkaaohh how can you be sure that it wasn't a failure on octavia side?20:56
johnsomNext I would typically look in the libvirt logs to see if the hypervisor kernel crashed. Unfortunately this multi-node job doesn't collect those logs.20:57
johnsomI highly suspect that was the issue here.20:57
aannuusshhkkaaohh.. so how do we do away with the time out error?20:58
aannuusshhkkaawithout knowing what caused it20:58
johnsomThere is a bit of art to this. I look for any log messages at the "ERROR" level, there is a filter at the top. I also then look at what is in the logs. In this case I see it's just repeatedly attempting to connect to the amphora. If that part of the code was broken, the other scenario tests would have all failed as well.20:59
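For a log that has been downloaded locally, the same first pass (pulling out ERROR-level messages) can be sketched in a few lines. `error_lines` is a hypothetical helper, and the sample lines are invented for illustration rather than taken from the job's logs.

```python
import re

# Match ERROR as a whole word so words that merely contain it are skipped.
ERROR_RE = re.compile(r"\bERROR\b")


def error_lines(lines):
    """Return (line_number, text) pairs for lines mentioning ERROR."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if ERROR_RE.search(line)
    ]


# Invented sample in a typical "timestamp LEVEL logger: message" layout.
sample = [
    "2020-06-26 20:00:01 INFO octavia.worker: creating amphora\n",
    "2020-06-26 20:05:00 WARNING octavia.worker: could not connect, retrying\n",
    "2020-06-26 20:15:01 ERROR octavia.worker: timed out waiting for amphora\n",
]

hits = error_lines(sample)
```

For a real file, pass the open file object, for example `error_lines(open("job-output.txt"))`, and then read the lines just above each hit, since the cause is often logged before the error itself.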
johnsomWell, since it's non-voting, it won't block your patch or stop people from reviewing it. If Zuul gives it a +1 you are good to go.21:00
johnsomIf you want to see if you can clear it, and you know it's an un-related issue to the patch, you can post a comment of "recheck" which will tell zuul to re-run the tests. However, we use those sparingly as they use resources to run. Also in this case, it's a non-voting job, so there is no real need/reason to recheck it.21:01
shtepaniewill recheck run all of the tests?21:02
johnsomYou just got the unlucky roll of the dice on that one. It landed on a test host with a broken hypervisor21:03
aannuusshhkkaaaah okay so we leave it as it is.. how can you be sure that it will not cause any issues in production level code?21:03
johnsomYes, it reruns all of the tests21:03
johnsomThe tests that don't say "non-voting" are the tests that ensure that.21:03
johnsomBasically I can tell that the failure was not in our code by looking at our logs. To isolate it more I would dig through the nova and neutron logs. But those services' logs are not always so helpful. If we collected the libvirt logs in that job, it would probably be very clear with a kernel trace in the log.21:05
johnsomThis is one of the challenges of relying on other projects. Sometimes they are broken and impact us.21:06
aannuusshhkkaawhy dont we collect libvirt logs?21:06
aannuusshhkkaaalso, at this stage, do we inform the nova or neutron team about potential bugs?21:06
johnsomThat I don't know. The parent job created by the openstack-qa team doesn't collect those logs in the multi-node jobs. I think it really should.21:06
aannuusshhkkaaokay cool21:07
johnsomIf we can isolate it yes, we would open a bug for them. In the case of the qemu/kvm crash, I opened a kernel bug for it: https://bugzilla.kernel.org/show_bug.cgi?id=19252121:08
openstackbugzilla.kernel.org bug 192521 in kvm "KVM: entry failed, hardware error 0x0" [High,New] - Assigned to virtualization_kvm21:08
*** armax has quit IRC21:10
aannuusshhkkaagotcha! thanks for the clarification.. so now you will be able to review and merge our branch to master right?21:10
johnsomYes, the test passed so that is a good sign it is ready for reviews.21:10
aannuusshhkkaaalrighty!21:11
johnsomI don't think I will have time to review that today, but early next week. Maybe others will get to it sooner.21:12
aannuusshhkkaaThat's okay.. just making sure there are no blockers from our end..21:13
johnsomNope, looks good21:13
shtepaniethanks!!21:13
aannuusshhkkaaThank you, johnsom! :)21:14
johnsomNo problem21:14
*** armax has joined #openstack-lbaas21:26
openstackgerritMerged openstack/octavia master: Fix error on devstack cleanup  https://review.opendev.org/73551021:38
*** servagem has quit IRC22:01
*** luksky has quit IRC22:22
roukaw did vip acls not make it into train?22:58
rouknevermind, just didnt show up in the patch notes for the release, but its in the api and yaml notes23:02
johnsomACLs are listed in the features of 5.0.0, just spelled out as "access control list"23:15
roukah so it is, gets lost in the list, i am blind as a bat apparently.23:15
johnsomHa, blinded by the shiny list of new features!23:16
roukevery release octavia gets much nicer, ill say that much23:16
johnsomWe try....23:16
roukwhile every release i evaluate magnum and it falls over.23:16
roukeven though its my next biggest feature request after LBs were23:17
*** born2bake has quit IRC23:45
