Monday, 2020-04-13

*** yamamoto has quit IRC00:34
*** rcernin has joined #openstack-lbaas01:59
*** yamamoto has joined #openstack-lbaas02:16
*** ramishra has joined #openstack-lbaas03:08
*** ramishra has quit IRC03:08
*** ramishra has joined #openstack-lbaas03:08
*** psachin has joined #openstack-lbaas03:27
*** psachin has quit IRC04:13
*** yamamoto has quit IRC05:20
*** tkajinam has quit IRC05:45
*** tkajinam has joined #openstack-lbaas05:46
*** vishalmanchanda has joined #openstack-lbaas06:00
*** gcheresh has joined #openstack-lbaas07:25
*** ataraday_ has joined #openstack-lbaas07:25
*** dayou has quit IRC08:30
*** dayou has joined #openstack-lbaas08:32
*** zasherif has joined #openstack-lbaas08:33
*** dayou has quit IRC08:38
*** dayou has joined #openstack-lbaas08:39
*** maciejjozefczyk has joined #openstack-lbaas08:41
*** dayou has quit IRC08:48
*** dayou has joined #openstack-lbaas08:50
*** dayou has quit IRC08:50
rm_workcgoncalves: :)08:54
openstackgerritAdam Harwell proposed openstack/octavia-tempest-plugin master: Remove old api_v1_enabled cruft from job defs  https://review.opendev.org/71387908:56
*** gcheresh has quit IRC08:57
rm_worksorrison: you think you might have some time to finish up https://review.opendev.org/#/c/695349/ ? make it actually use a different network such that it proves all the networking code works?09:00
rm_workcgoncalves: hoping to get in https://review.opendev.org/#/c/589180/ today09:01
*** JayLiu has quit IRC09:24
*** zasherif has quit IRC09:26
openstackgerritAdam Harwell proposed openstack/octavia master: WIP: Failover stop threshold  https://review.opendev.org/65681109:53
*** born2bake has joined #openstack-lbaas10:01
rm_workFYI I'm picking up work on https://review.opendev.org/#/c/656811/ again10:14
rm_workI think I should be able to get something up and working soon with that base I had started on earlier10:14
*** tkajinam has quit IRC10:31
rm_workinteresting, just noticed we have amps going to ERROR on cert rotation from housekeeping... it's getting a 500 response from the amp agent on the rotate attempt10:54
rm_workall i've got from the amp-agent log is this:11:01
rm_work[05/Apr/2020:23:25:20 +0000] "PUT /1.0/certificate HTTP/1.1" 500 209 "-" "Octavia HaProxy Rest Client/0.5 (https://wiki.openstack.org/wiki/Octavia)"11:01
*** gcheresh has joined #openstack-lbaas11:02
openstackgerritMerged openstack/octavia-tempest-plugin master: Add devstack plugin support  https://review.opendev.org/70845111:04
rm_worksorry, new one: [10/Apr/2020:23:25:24 +0000] "PUT /1.0/certificate HTTP/1.1" 500 209 "-" "Octavia HaProxy Rest Client/0.5 (https://wiki.openstack.org/wiki/Octavia)"11:06
rm_workfile appears changed at the right time:11:06
rm_work-rw-rw----. 1 root root 0 Apr 10 23:25 /etc/octavia/certs/server.pem11:06
rm_workbut it's empty11:06
rm_workso, cert maybe didn't get sent over right? O_o11:07
rm_workbroke during stream read?11:07
rm_workunclear11:07
rm_workbut this is going to happen to all our amps and break them, I think11:08
rm_workI might have to turn off housekeeping until i figure this out <_<11:08
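
One way a zero-byte server.pem could be left behind, sketched under assumptions: opening the target with "wb" truncates it before any data arrives, so a failure while reading the upload stream leaves an empty file. The helper names (naive_write, atomic_write, BrokenStream) and file names below are hypothetical and are not taken from the amphora agent; this only illustrates the failure mode rm_work speculates about and a temp-file-plus-rename alternative.

```python
import os
import tempfile


def naive_write(path, stream):
    # Opening with "wb" truncates the existing file immediately,
    # before a single byte has been read from the stream.
    with open(path, "wb") as f:
        for chunk in iter(lambda: stream.read(4096), b""):
            f.write(chunk)


def atomic_write(path, stream):
    # Write to a temp file in the same directory, then rename over the
    # target only after the whole stream has been read successfully.
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            for chunk in iter(lambda: stream.read(4096), b""):
                f.write(chunk)
        os.replace(tmp, path)
    except BaseException:
        os.unlink(tmp)
        raise


class BrokenStream:
    """Hypothetical stand-in for a request stream that fails mid-read."""

    def read(self, size=-1):
        raise OSError("stream interrupted")


with tempfile.TemporaryDirectory() as workdir:
    naive_path = os.path.join(workdir, "naive.pem")
    atomic_path = os.path.join(workdir, "atomic.pem")
    for p in (naive_path, atomic_path):
        with open(p, "wb") as f:
            f.write(b"OLD CERT")

    try:
        naive_write(naive_path, BrokenStream())
    except OSError:
        pass
    naive_size = os.path.getsize(naive_path)  # old cert is gone

    try:
        atomic_write(atomic_path, BrokenStream())
    except OSError:
        pass
    with open(atomic_path, "rb") as f:
        atomic_content = f.read()  # old cert survives the failed upload
```

With the naive version, the failed read leaves the target truncated; with the rename-based version, the previous certificate stays in place until a complete replacement exists.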
*** ccamposr has joined #openstack-lbaas11:15
openstackgerritAdam Harwell proposed openstack/octavia master: Use routed network filter if it exists  https://review.opendev.org/70615311:23
*** sapd1 has joined #openstack-lbaas11:34
*** sapd1 has quit IRC11:36
rm_workjohnsom: commented on failover rewrite, one issue with a missing requires, i think. also, would be nice to just temporarily shove back in the logic that deletes spares for this cycle -- shouldn't be TOO hard? realized that'd be a kind of annoying functionality gap to have between cycles11:40
*** sapd1 has joined #openstack-lbaas11:41
*** sapd1 has quit IRC11:41
*** sapd1 has joined #openstack-lbaas11:57
*** sapd1 has quit IRC12:04
*** tkajinam has joined #openstack-lbaas12:05
rm_workWOAH12:17
rm_workdid SQLite fix transactions?! O_O12:17
rm_workoctavia.tests.functional.db.test_repositories.AllRepositoriesTest.test_sqlite_transactions_broken12:17
rm_workstarted failing12:18
rm_workalong with another test that uses rollbacks12:18
rm_workguessing in SQLAlchemy 1.3.1612:23
rm_workhttps://docs.sqlalchemy.org/en/13/changelog/changelog_13.html#change-4786673f9a875e26cf92b01ec92c497512:23
rm_workmaybe?12:23
rm_worktesting12:24
rm_workhmm no12:26
rm_worki feel like that REALLY should be it though12:36
rm_worknot sure why i can't replicate locally12:37
*** gcheresh has quit IRC13:21
*** sapd1 has joined #openstack-lbaas13:27
*** ataraday_ has quit IRC13:28
*** rcernin has quit IRC13:41
*** gcheresh has joined #openstack-lbaas14:26
*** maciejjozefczyk has quit IRC14:35
*** gcheresh has quit IRC14:35
*** servagem has joined #openstack-lbaas14:36
johnsomMorning folks. rm_work are you working on the housekeeping cert issue?14:55
rm_workhaven't gotten a chance yet14:56
rm_workworking this sqlite issue14:56
rm_workit's blocking gates14:56
rm_workand my internal build also14:56
openstackgerritMerged openstack/octavia master: Imported Translations from Zanata  https://review.opendev.org/71915815:22
rm_workjohnsom: i wonder if it's a py3 thing?16:00
rm_workjohnsom: i guess it's prolly difficult for you to look into that... but if you could fix your failover per my comments that'd be sweet :)16:00
johnsomThe foreign key thing?16:00
rm_workyes16:00
rm_workerr no16:00
rm_workthat's what I am working on right now16:01
johnsomSeems unlikely to be py3 related16:01
rm_worki meant the amp cert refresh issue16:01
johnsomOh16:01
rm_workbut since you prolly can't easily replicate16:01
johnsomMaybe. or single-proc related16:01
rm_workcould be16:01
*** gthiemonge has quit IRC16:01
rm_workbut i doubt it16:01
rm_workthe path is super simple for this cert thing16:01
johnsomWell, we can drop the interval to short windows for debug.16:01
rm_workpretty much the only thing i can see that'd kill it would be flask not passing the stream through right, or else something with opening the file16:02
johnsomBut, yeah, as soon as I'm done digging out of e-mail my plan was to wrap up failover today16:02
rm_workkk16:02
rm_worki commented16:02
johnsomYeah, I still have some good comments from Ann to address as well.16:02
*** gthiemonge has joined #openstack-lbaas16:03
rm_workbut yeah until i fix this gate blocker, we're stuck16:14
*** armax has joined #openstack-lbaas16:28
rm_workjohnsom: ummm wat: https://github.com/openstack/octavia/commit/19d80f11a43d95d93e774b21b1f070f1fdd3f0d916:38
rm_worki'm super confused16:38
rm_workcfg.CONF.register_cli_opts(healthmanager_opts, group='health_manager')16:38
rm_workwhy was that ever a cli opt?16:38
rm_workit's making my unit test runs fail16:39
rm_workthat should be a normal register_opts() shouldn't it?16:39
johnsomYeah, probably16:39
rm_workT_T16:39
rm_workweird16:39
rm_workwill need to revert that specific bit prolly in my other gate fix16:40
johnsomThat wasn't the issue for that patch however, it was needed for the core cli opts16:40
rm_workyeah we prolly should have commented that the HM opts weren't actually cli lol16:40
johnsomThey have been that way a long time16:41
rm_workyes but i assume it's a bug16:41
rm_workbad copy/paste or something16:41
rm_workthere's nothing CLI-ey about HMs16:41
rm_workunless I'm misunderstanding what CLI means in this case16:41
johnsomYeah, we don't have any options there that really need to be set via cli16:42
johnsomIt was done in 2015 lol16:43
rm_workso yeah ok16:43
rm_workwill put that back16:43
rm_workand un-cli it16:43
*** maciejjozefczyk has joined #openstack-lbaas16:56
rm_workhmm i am not having great luck replicating this sqlite issue17:33
rm_worki installed the newest version and rebuilt my python against it17:33
johnsomIt repros for me17:33
rm_workand i can get it to replicate... sometimes17:33
rm_workbut not reliably17:33
rm_workand it only does it if i run the entire test suite T_T17:33
johnsomYeah, I have only done one functional run, so ...17:33
rm_workthough the FK issue does seem more reliable, but i don't know what it's about yet, i had assumed it was related17:34
*** gcheresh has joined #openstack-lbaas17:52
*** gthiemonge has quit IRC17:56
johnsomYeah, something is up, unit test output: Failed to fetch load_balancer 0f5dd3ab-f6cd-4904-8260-f3458b9cf3ea from DB. Retrying for up to 60 seconds.17:56
johnsomFailed to fetch load_balancer 0f5dd3ab-f6cd-4904-8260-f3458b9cf3ea from DB. Retrying for up to 60 seconds.17:56
johnsomFailed to fetch load_balancer 0f5dd3ab-f6cd-4904-8260-f3458b9cf3ea from DB. Retrying for up to 60 seconds.17:56
*** gthiemonge has joined #openstack-lbaas17:57
*** ccamposr__ has joined #openstack-lbaas17:58
rm_workerr which issue is that18:00
johnsomJust popped up on a single unit test run18:00
johnsomoctavia.tests.unit.controller.worker.v1.test_controller_worker.TestControllerWorker.test_create_load_balancer_single18:00
rm_workhmm18:00
*** ccamposr has quit IRC18:00
johnsomI think this new sqlalchemy is broken for sqlite18:01
rm_worki think it's actually a new sqlite version?18:21
rm_work... unclear18:21
johnsomMy money is on sqlalchemy18:21
rm_workwell, also, by "broken" you mean "fixed"18:21
rm_workor something?18:21
rm_work<_<18:21
johnsomYeah, it passes functional with 1.3.1518:24
johnsom1.3.16 bombs18:24
rm_workhmm18:25
rm_worki was poking zzzeek about it earlier18:25
rm_workhe said the change in 1.3.16 shouldn't affect this18:25
rm_workunless we're doing wonky stuff18:25
johnsomThere are two suspect changes, one with an order by removed, the other is the autocommit with sqlite18:25
rm_work[07:46:03] zzzeek:it's important however if you are doing any tinkering with that connection in your application's setup18:26
rm_work[07:46:37] zzzeek:as far as the sqlalchemy change it just added some more options to the set_isolation_level() method and if you aren't giving sqlalchemy any execution_options(isolation_level) settings, that change would not impact you18:26
rm_workthough i don't think we do that18:26
rm_workbut yes the timing is so close...18:26
rm_workit's POSSIBLE the ubuntu image updated to include a newer sqlite18:27
rm_workbut ... yeah much more likely it seems that sqlalchemy release 3 days ago caused this18:27
johnsomI just went into the tox venv and installed the other version of sqlalchemy, problem went away18:27
johnsomWe do use autocommit and we use non-autocommit transactions18:28
rm_workk18:28
johnsomNeither are "wonky"18:28
rm_workwell, what do you think is the best move here18:28
rm_workwe can pin it and deal with this after U18:28
rm_workor ... actually we can't, can we18:29
rm_workwe don't control our own upper-constraints18:29
rm_workhmm18:29
johnsomYeah, I think we can pin it, just not sure if that is the right answer or not18:29
rm_workwell, we are running out of days to merge stuff18:30
rm_worki wanted to merge the UDP thing today18:30
rm_workbut that may be a pipe dream18:30
*** gthiemonge has quit IRC18:31
*** gthiemonge has joined #openstack-lbaas18:31
johnsomSo the test in question does:18:39
johnsomStart a non-autocommit transaction18:39
johnsominside that, a subtransaction to create the LB, then another subtransaction to create the pool. The latter can no longer (as of 1.3.16) see the results of the first subtransaction, the LB create18:40
*** ccamposr__ has quit IRC18:41
rm_workso that test was actually bad?18:43
rm_workbecause that seems like18:43
johnsomNo, that is the code, not the test really18:43
rm_workan issue for a WORKING transaction model?18:43
rm_workermm18:43
rm_workhh18:43
rm_work*hmm18:43
johnsomThe LB create one, when the subtransaction closes it should now be visible in the main transaction, but it's not now18:44
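
The visibility johnsom expects can be sketched with the standard-library sqlite3 module, assuming plain SAVEPOINTs stand in for the subtransactions. This is not Octavia's repository code and does not go through SQLAlchemy; the table and savepoint names are hypothetical. It only shows the expected semantics: once the first nested unit is released, its row is visible to later work in the same outer transaction, and everything still rolls back together.

```python
import sqlite3

# isolation_level=None puts the driver in autocommit mode so that
# BEGIN / SAVEPOINT / ROLLBACK are issued explicitly by this script.
conn = sqlite3.connect(":memory:", isolation_level=None)
cur = conn.cursor()
cur.execute("PRAGMA foreign_keys = ON")
cur.execute("CREATE TABLE load_balancer (id TEXT PRIMARY KEY)")
cur.execute(
    "CREATE TABLE pool ("
    "id TEXT PRIMARY KEY, "
    "lb_id TEXT NOT NULL REFERENCES load_balancer(id))"
)

cur.execute("BEGIN")  # outer, non-autocommit transaction

# First nested unit: create the load balancer.
cur.execute("SAVEPOINT create_lb")
cur.execute("INSERT INTO load_balancer VALUES ('lb-1')")
cur.execute("RELEASE SAVEPOINT create_lb")

# Second nested unit: the LB row should be visible here, and the
# foreign key on pool.lb_id should be satisfied.
cur.execute("SAVEPOINT create_pool")
visible_in_outer = cur.execute(
    "SELECT COUNT(*) FROM load_balancer WHERE id = 'lb-1'"
).fetchone()[0]
cur.execute("INSERT INTO pool VALUES ('pool-1', 'lb-1')")
cur.execute("RELEASE SAVEPOINT create_pool")

# Rolling back the outer transaction discards both nested units.
cur.execute("ROLLBACK")
rows_after_rollback = cur.execute(
    "SELECT COUNT(*) FROM load_balancer"
).fetchone()[0]
conn.close()
```

If the second unit could not see the first unit's row, the pool insert would fail the foreign key check, which would be consistent with the FK failures mentioned earlier in the channel, though that link is an assumption here.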
rm_workmay need to poke at zzzeek18:45
johnsomYeah, in that patch I don't know why they are setting the isolation level to "" nor what that means to the stack below sqlalchemy18:48
*** larsks has joined #openstack-lbaas18:53
*** larsks has left #openstack-lbaas18:53
rm_worki asked him to join here18:55
rm_workwe'll see if he does :D18:55
rm_workmeanwhile i need to go sleep18:55
rm_workfirst thing i'll be looking at is the cert-refresh housekeeping issue18:58
johnsomok18:58
rm_workbecause i have had to shut down the service in our clouds18:58
rm_workit was just going through and breaking all the amps i just fixed18:59
rm_workand because of the way it broke them, i need your failover patch to actually restore them18:59
rm_worklol18:59
rm_workwhich means i need to be able to actually build a package, which requires tests passing T_T19:00
rm_workfff19:00
*** sapd1 has quit IRC19:05
*** ccamposr has joined #openstack-lbaas19:08
*** maciejjozefczyk has quit IRC19:13
*** vishalmanchanda has quit IRC19:39
*** irclogbot_3 has quit IRC19:51
*** irclogbot_1 has joined #openstack-lbaas19:52
*** gcheresh has quit IRC20:13
*** maciejjozefczyk has joined #openstack-lbaas20:34
*** servagem has quit IRC20:43
*** maciejjozefczyk has quit IRC20:56
*** gthiemonge has quit IRC21:01
*** gthiemonge has joined #openstack-lbaas21:01
*** maciejjozefczyk has joined #openstack-lbaas21:26
