Sam-I-Amrussellb: moo?00:49
openstackgerritMerged openstack/networking-ovn: Vagrant: Add live migration support
*** palexster has joined #openstack-neutron-ovn09:21
*** roeyc has joined #openstack-neutron-ovn11:45
*** azbiswas has joined #openstack-neutron-ovn12:00
*** azbiswas has quit IRC12:05
*** azbiswas has joined #openstack-neutron-ovn14:02
*** azbiswas has quit IRC14:07
Sam-I-Amrussellb: moo15:52
russellbgrabbing coffee, then i have a meeting ..15:53
Sam-I-Amrussellb: mmmm coffee15:53
Sam-I-Amrussellb: ovs creates br-int automatically, no?15:54
Sam-I-Amtrying to determine the source of weirdness15:54
*** azbiswas_ has joined #openstack-neutron-ovn16:50
Sam-I-Amrussellb: moo17:11
Sam-I-Amrussellb: ovn-controller relies on at least external-ids:ovn-remote ... but the other external-ids like ovn-bridge=br-int and setting provider nets can wait until after ovn-controller starts?17:13
*** azbiswas has joined #openstack-neutron-ovn17:18
*** salv-orlando has joined #openstack-neutron-ovn17:20
Sam-I-Amlooks like we do that stuff before starting ovn-controller17:20
russellbSam-I-Am: yes, bridge mappings can wait.  ovn-bridge should be set ahead of time if you want it to be something other than br-int17:22
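The ordering described above can be sketched as two `ovs-vsctl` invocations, one issued before ovn-controller starts and one that can follow later. This is a minimal illustration, not the devstack or Ansible code: the helper name `external_ids_cmd`, the remote address `tcp:192.0.2.10:6642`, the bridge name `br-ovn`, and the mapping `physnet1:br-provider` are all hypothetical placeholders. The sketch only builds the argument lists and does not run them.

```python
def external_ids_cmd(**settings):
    """Build an ovs-vsctl argv that sets external-ids on the Open_vSwitch row.

    Keyword names use underscores and are converted to the dashed key names
    (for example ovn_remote -> ovn-remote). Hypothetical helper for illustration.
    """
    args = ["ovs-vsctl", "set", "open_vswitch", "."]
    for key, value in settings.items():
        args.append("external-ids:{}={}".format(key.replace("_", "-"), value))
    return args


# Set before starting ovn-controller: the remote, plus ovn-bridge only if the
# integration bridge should be something other than br-int (placeholder values).
before_start = external_ids_cmd(
    ovn_remote="tcp:192.0.2.10:6642",
    ovn_bridge="br-ovn",
)

# Bridge mappings for provider networks can wait until after startup.
after_start = external_ids_cmd(ovn_bridge_mappings="physnet1:br-provider")

print(" ".join(before_start))
print(" ".join(after_start))
```

Each list could be handed to `subprocess.run` on a host that has Open vSwitch installed.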
Sam-I-Amok. i'm trying to map what we do in devstack to ansible.17:22
*** salv-orl_ has quit IRC17:22
Sam-I-Ambecause we were missing a couple of those things17:22
russellbi hope you guys can share the ansible goodness eventually17:23
Sam-I-Amyeah, its pretty much consuming all my time17:24
Sam-I-Amthe latest problem was moving from using devstack (and manually starting stuff) to debian packages... which were somewhat broken.17:25
russellbimportant stuff ...17:25
russellb(not sarcasm)17:25
Sam-I-Amyeah, i know17:26
Sam-I-Ami think we're carrying a local patch for the deb stuff, which i dont like17:26
Sam-I-Ami think those init scripts were written for plain old ovs, so bringing ovn into the picture made things fuzzy17:27
openstackgerritRachappa B Goni proposed openstack/networking-ovn: Fix avoids one more consumer creation for PLUGIN topic with insufficient endpoints when REPORTS topic is not present or missing.
Sam-I-Amrussellb: probably going to miss the meeting today17:43
russellbyour presence will be missed :)17:44
Sam-I-Ami dont have too much to add at this point17:44
russellbyou mean as far as a status update?17:47
*** salv-orl_ has joined #openstack-neutron-ovn17:49
*** salv-orlando has quit IRC17:52
russellbOVN meeting in 9 minutes or so in #openvswitch18:06
*** azbiswas has joined #openstack-neutron-ovn18:50
azbiswaszhouhan: when you get a chance, would love your comments before I provide another patch based on latest comments19:03
zhouhanazbiswas: sure, I will review asap19:04
azbiswasit may be time to split up the acl code into a different file, let me know your thoughts in the review.19:05
azbiswaszhouhan: thanks19:05
russellbregXboi: reviewed the split db patch, had some minor feedback20:15
russellbthanks for pushing it through20:15
regXboirussellb: ack - I'll look when the email lands20:15
regXboirussellb: I'm looking at the persistence of flows for incremental processing patch and I'm looking at needing to add not one but TWO additional hashmaps to be able to catch all the corner cases :(20:16
*** azbiswas_ has joined #openstack-neutron-ovn20:18
*** azbiswas has quit IRC20:18
regXboione of the tests runs the dbs split, the other doesn't20:29
russellbheh ok20:29
regXboithe rest of the comments make sense on first blush, but I'll look in more detail when I have some cycles20:29
russellb2016-03-03 21:14:56.148 | {2} setUpClass (tempest.scenario.test_network_basic_ops.TestNetworkBasicOps) ... SKIPPED: security-group extension not enabled.21:20
russellbjust noticed that in a tempest log ...21:20
russellbseveral other security group tests are running, so i'm not sure what that's about21:21
russellb... it's coming from tempest.conf ...21:27
Sam-I-Amrussellb: wut21:35
russellbskipping a lot of tests it seems21:36
* russellb tenses up a bit waiting for the test report on that21:42
azbiswasrussellb: A concurrency question - stress reliever.21:45
russellbi'm not sure that's a stress reliever :)21:45
azbiswasclass DelLogicalPortCommand(BaseCommand):21:45
russellbbut sure!21:45
azbiswas        lswitch.verify('ports')21:45
azbiswas        ports.remove(lport)21:45
azbiswasI have a log from our scale test showing an exception in ports.remove(lport): value lport not found21:46
azbiswasAm trying to come up with a scenario how that is possible21:46
azbiswasassuming multiple neutron servers21:46
russellbeven with multiple servers, this data shouldn't be changing out from under this code21:47
azbiswasthe verify should have caused a retry21:47
russellbi don't think so21:47
russellbverify would cause a retry if the error came from the server21:47
russellbsounds like we're getting a local python error21:47
russellbodd thing is we thought the port should exist based on an earlier check: lport = idlutils.row_by_value(self.api.idl, 'Logical_Port',21:48
russellb'name', self.lport)21:48
azbiswasit is a local python error21:48
russellbso we found the port in the Logical_Port table, but not in the ports list of the logical switch21:49
russellbnot sure i can explain how21:49
russellbsimple explanation would be if we're trying to delete it from the wrong switch21:49
russellbis this environment using ports on a provider network by chance?21:50
russellband was it recently upgraded?21:50
azbiswasthe logs are from 18th Feb, so it has the DB corruption fix21:50
russellbok, nevermind21:50
azbiswasand yes there are ports on provider network21:51
russellbi was thinking about what would happen if you upgraded a running system to this commit:
russellbthat would break21:51
azbiswasRight - we are not doing that21:51
azbiswasWe are starting a run with that fix today - clean setup21:52
russellbi'm stumped by that error21:53
russellbdefinitely not obvious how that could happen21:53
russellbunless we just passed in the wrong lswitch21:53
russellborrrrrrrr if we tried to delete it twice in the same transaction?21:53
azbiswasI've never seen that yet (wrong lswitch that is)21:54
* russellb guessing21:54
azbiswasAnother unlikely OR - AddLogicalPortCommand never added lport to lswitch list21:55
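Two of the guesses above (wrong lswitch, or deleting the same port twice) can be reproduced with plain Python lists, since `list.remove` raises `ValueError` when the value is absent. This is only a sketch: `LogicalSwitch`, `remove_port`, and `try_remove` are hypothetical stand-ins, and nothing here models the OVSDB IDL, transactions, or retries.

```python
class LogicalSwitch:
    """Hypothetical stand-in for a Logical_Switch row; not the real IDL row class."""

    def __init__(self, name):
        self.name = name
        self.ports = []


def remove_port(lswitch, lport):
    """Mimic the ports.remove(lport) step quoted in the discussion."""
    ports = lswitch.ports
    ports.remove(lport)  # list.remove raises ValueError when lport is absent


def try_remove(lswitch, lport):
    """Return a short status string instead of propagating the exception."""
    try:
        remove_port(lswitch, lport)
        return "removed"
    except ValueError:
        return "not found"


sw_a = LogicalSwitch("sw-a")
sw_b = LogicalSwitch("sw-b")
sw_a.ports.append("lport-1")

wrong_switch = try_remove(sw_b, "lport-1")   # port exists, but on another switch
first_delete = try_remove(sw_a, "lport-1")   # normal case
second_delete = try_remove(sw_a, "lport-1")  # same port deleted a second time

print(wrong_switch, first_delete, second_delete)
```

Both failure paths end in the same local Python error, so the traceback alone would not distinguish them.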
russellbare you seeing retries get handled correctly?21:56
azbiswasthere are many transaction timeouts in the logs21:56
russellbi haven't looked through to see what happens in the case of a retry (or timeout)21:56
azbiswasyes retries are handled correctly21:56
azbiswastimeouts - not sure21:57
azbiswascan a timeout result in half of a transaction commit?21:58
russellbgood question :)21:58
russellbi don't know21:58
russellbsounds worth looking into to make sure we handle it gracefully ...21:58
russellbseems unlikely21:58
russellbsince that's kind of the point of a transaction21:58
russellbsounds like our plugin is really slow, though21:59
azbiswasit's slow in create_port when the number of ports in a network/security group starts getting big.22:00
russellbbut you're fixing that!  ;-)22:00
azbiswasthe current patch doesn't target that explicitly, that create_port O(N^2) loop still exists - I will look into it though22:01
russellbO(N^2) because of remote_group ?22:02
russellblet's just not support remote_group!22:02
azbiswasmy guess is yes22:02
russellbor not use it in scale tests to make the results look better :-p22:03
azbiswasright, but we are using provider networks for now, so all those 800+ VMs end up in the same security group.22:03
azbiswasnot remote per se22:04
russellbright, but for every port you create, it's recreating the ACLs for all 800 ports22:04
russellbbecause of that rule22:04
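The quadratic cost described above can be made concrete with a small count. This is a simplified model, assuming each create_port rebuilds the ACLs for every port currently in the group (including the new one); the function name `acl_rebuilds` is hypothetical and the model ignores everything else the plugin does.

```python
def acl_rebuilds(num_ports):
    """Count per-port ACL rebuilds when ports are added one at a time.

    Assumes every create_port refreshes the ACLs of all current group members.
    """
    total = 0
    members = 0
    for _ in range(num_ports):
        members += 1
        total += members  # rebuild ACLs for all current members
    return total


# 800 ports in one security group, as in the scale test mentioned above.
total_for_800 = acl_rebuilds(800)
print(total_for_800)  # 800 * 801 / 2 = 320400
```

Under this model the total grows as N(N+1)/2, which is why keeping N small per group, or avoiding the per-member rebuild, changes the picture so much.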
azbiswasI'll try and optimize that as much as I can. But we may have to play some games within the scale test itself and keep N down.22:06
russellbor consider changing the default security group22:06
russellband file a bug to improve remote_group_id support at scale22:06
*** yarkot_ has joined #openstack-neutron-ovn22:07
russellbat scale == for a security group with many ports22:07
*** rtheis has quit IRC22:07
azbiswasright, shard that group to keep N limited within - will that work?22:08
*** jckasper has joined #openstack-neutron-ovn22:08
azbiswasi.e. multiple copies of the same security group22:08
russellbwould result in different behavior though22:09
russellbi've got some ideas ... i'll think about it some more22:10
russellbi need to run for now though22:10
azbiswasI did have a different question on a different topic - will catch you later on that22:10
russellbi'm not in a huge rush, just headed home for the day22:10
azbiswasit's quick - the ovsdb client to ovsdb server interaction - the probe timeout22:12
azbiswasi saw the patch proposed for ovn-controller to ovn sb db22:12
azbiswaswe may need something similar for ovn plugin to ovn nb22:13
azbiswasthe initial pull of data (> 8 MB), leaves the probe hanging behind and not processed immediately22:13
russellbit may be doable already22:13
russellbgrep for "probe_interval" in the python ovs lib22:13
russellbit's in there, just not sure about the interface for setting it22:14
azbiswaswe've set the server side to 0 for now22:14
russellbthere's a set_probe_interval()22:14
azbiswasthat's the client side22:14
azbiswasand that's set to 022:14
azbiswaswe need to set it on server side22:14
russellbyou're supposed to be able to do that with ovsdb-server config22:15
russellbbut i haven't looked at how22:15
azbiswasok let me look22:15
azbiswasthat's easy then22:15
azbiswasI'm done with my questions22:15
russellbhappy to chat :)22:15
russellbthanks again for all your help22:15
* russellb out22:16
*** yarkot_ has quit IRC22:16
*** regXboi has quit IRC22:19
*** brad_behle has quit IRC22:59
