18:19:22 <SumitNaiksatam> #startmeeting networking_policy 18:19:23 <openstack> Meeting started Thu Jan 4 18:19:22 2018 UTC and is due to finish in 60 minutes. The chair is SumitNaiksatam. Information about MeetBot at http://wiki.debian.org/MeetBot. 18:19:24 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. 18:19:27 <openstack> The meeting name has been set to 'networking_policy' 18:19:27 <tbachman> SumitNaiksatam: not an issue in your part of the world ;) 18:19:38 <SumitNaiksatam> tbachman: :-) 18:19:51 <SumitNaiksatam> (but we dont have solar panels either) 18:20:13 <SumitNaiksatam> i still pretty much have the one point agenda item - Pike Sync 18:20:21 <SumitNaiksatam> thanks for merging the patches from the other repos 18:20:39 <SumitNaiksatam> the main server patch - #link https://review.openstack.org/518183 18:21:19 <SumitNaiksatam> oh wow, i just noticed the lastest set of comments 18:21:40 <rkukura> last ones from me, I think ;) 18:22:23 <SumitNaiksatam> yeah, last couple of iterations are your comments 18:22:34 <SumitNaiksatam> anyway, i havent had a chance to look at these latest comments 18:22:37 <SumitNaiksatam> so i will do those offline 18:23:01 <rkukura> main thing is a few remaining places where ensure_tenant calls were moved inside transactions 18:23:15 <SumitNaiksatam> “ensure_tenant() calls in the apic_aim L3_router driver and in the GBP plugin" 18:23:22 <rkukura> right 18:23:24 <SumitNaiksatam> i think i already moved all the GBP plugin calls 18:24:02 <rkukura> I also noticed a couple places where ensure_tenant() calls are completely missing, but that should probably be addressed via a followon patch and backported 18:24:21 <SumitNaiksatam> I can put a revisit wherever that is the case 18:24:25 <rkukura> ok 18:24:55 <rkukura> I was also hoping we’d be able to use reader transactions in some places, but I think the way status gets written to the DB may be preventing that 18:25:37 <tbachman> annakk: hi! 18:25:41 <rkukura> hi annakk! 18:25:42 <annakk> hi, sorry I'm late 18:25:49 <SumitNaiksatam> rkukura: yes, we have to use writer there (that discussion happened in some of the earlier patchsets, it was raised by annakk) 18:26:00 <SumitNaiksatam> annakk: hi, no worries, we are just getting “warmed” up 18:26:17 * tbachman hands another hot beverage to rkukura 18:26:22 <SumitNaiksatam> tbachman: lol 18:26:41 <rkukura> in general, would a writer transaction nested inside a reader transaction promote the reader transaction to a writer? 18:27:03 <tbachman> can you have a transaction inside a transaction? 18:27:12 <tbachman> (can you *start* a new transaction inside a transaction) 18:27:26 <SumitNaiksatam> rkukura: so i had wrapped most of the ml2plus methods in the transactions earlier, with the assumption that the post_commit for the plugin/driver combination is essentially a no-op 18:27:26 <rkukura> if so, I think it would be best if the get methods generally used readers, and we nested a writer if we need to update the status 18:27:36 <SumitNaiksatam> tbachman: that depends 18:27:56 <SumitNaiksatam> if you nest a writer inside a reader i think it leads to a new transaction 18:28:00 * tbachman needs to read up on mysql 18:28:08 <tbachman> sqlalchemy, that is 18:28:27 <SumitNaiksatam> tbachman: this is not sqlalchemy specific since we are dealing with the oslo_db abstraction 18:28:29 <rkukura> SumitNaiksatam: it is still possible to use regular ML2 MDs, like ovs or linuxbridge, with ml2plus 18:28:35 <SumitNaiksatam> rkukura: one sec 18:29:08 <SumitNaiksatam> rkukura: so after removing those (per your review comment yesterday), it led to this error: 18:29:14 <SumitNaiksatam> #link http://logs.openstack.org/83/518183/44/check/legacy-group-based-policy-dsvm-aim/0f7a787/logs/screen-q-svc.txt.gz?level=ERROR 18:29:36 <SumitNaiksatam> i had seen this before as well (and had done the wrapping in response) 18:30:04 <SumitNaiksatam> i was able to get around this but adding a writer transaction block at another place inside the aim_mapping PD 18:30:29 <SumitNaiksatam> i will add a note in the code where i added that to explain 18:31:25 <SumitNaiksatam> i think the security_group DB module/handling is doing something, which causes the security_group_rule to be created in a different transaction 18:31:39 <SumitNaiksatam> however putting the transaction block in the PD fixes that 18:31:40 <rkukura> OK, if we need to use writers to avoid this, fine, but we still should make sure the ensure_tenant() calls are before the transaction starts 18:32:29 <SumitNaiksatam> so, as it stands ensure_tenants() are all outside teh transaction, and i have in fact removed the wrapping transaction from all the ml2plus methods 18:32:45 <SumitNaiksatam> i will move the ensure_tenant out of the l3_plugin transaction 18:32:57 <rkukura> they are inside in the apic_aim l3_router driver and in the GBP plugin - see my comments in gerrit 18:33:40 <rkukura> https://review.openstack.org/#/c/518183/45/gbpservice/neutron/services/grouppolicy/plugin.py@467 18:33:42 <SumitNaiksatam> “i will move the ensure_tenant out of the l3_plugin transaction” 18:34:30 <SumitNaiksatam> the other comment was here: #link https://review.openstack.org/#/c/518183/43/gbpservice/neutron/plugins/ml2plus/drivers/apic_aim/mechanism_driver.py 18:34:49 <SumitNaiksatam> i dont recollect if i had to change this before or after the eager loading 18:34:52 <SumitNaiksatam> change 18:35:00 <SumitNaiksatam> but at one point this was definitely not working 18:35:11 <SumitNaiksatam> i will try to revert and see what happens 18:35:29 <rkukura> OK, and if it doesn’t work, please add a REVISIT 18:36:05 <SumitNaiksatam> okay 18:36:08 <rkukura> thanks 18:36:10 <SumitNaiksatam> was planning to do that 18:36:19 <SumitNaiksatam> this comment - #link https://review.openstack.org/#/c/518183/43/gbpservice/neutron/plugins/ml2plus/patch_neutron.py 18:36:26 <SumitNaiksatam> i have changed the wording 18:36:45 <SumitNaiksatam> but the reason i had put ml2plus is because the patching happens only when ml2plus is ussed 18:36:48 <SumitNaiksatam> *used 18:37:01 <rkukura> ok 18:37:40 <SumitNaiksatam> also there are at least a couple of places in ml2plus where we send out notifications from within a transaction 18:37:42 <annakk> I probably didn't look well enough, but I didn't see any keystone stuff in _ensure_tenant - just comments that its for UTs only.. 18:37:44 <rkukura> There are of course cases where neutron methods are called within GBP transactions, so we still need that monkey patch, right? 18:37:56 <rkukura> ? 18:38:27 <SumitNaiksatam> rkukura: the patching is not needed for resource_mapping 18:39:08 <SumitNaiksatam> but more importantly, per earlier comment, its needed just for ml2plus too since we have some notifications going from within the transactions which needed to be queued 18:39:35 <rkukura> ensure_tenant() calls the project_name_cache.ensure_project(), which calls keystone if the project is not already in the cache 18:40:10 <SumitNaiksatam> annakk: i missed that part too 18:40:57 <SumitNaiksatam> so that covers patchset 43 18:41:05 <SumitNaiksatam> on patchset 45, rkukura’s latest comments 18:41:51 * tbachman hands SumitNaiksatam an adult beverage, pats him on the back 18:42:10 <SumitNaiksatam> tbachman: lol 18:42:20 <rkukura> thanks tbachman ;) 18:42:25 <SumitNaiksatam> was just quickly going through the comments 18:42:29 <tbachman> I can’t remember what my personal best is for patch sets 18:42:31 <tbachman> it’s pretty high 18:43:07 <SumitNaiksatam> i think we covered the ensure_tenant (yes, i will move the GBP plugin as well) and the writer versus reader for the get_resources 18:43:17 <rkukura> ok, great 18:43:20 <SumitNaiksatam> “Should the session.begin() calls in these _ml2_md_extend_<resource>_dict()" 18:43:26 <SumitNaiksatam> annakk: had raised this one earlier too 18:43:48 <SumitNaiksatam> her question was whether these can be eliminated 18:44:16 <SumitNaiksatam> and the issue at that time was that the DB object used in there would be wierdly if the subntransaction was not created 18:44:45 <rkukura> I have no idea if they are really needed 18:44:49 <SumitNaiksatam> however that was before i made the earlier mentioned change to wrap the create_security_group (inside the aim_mapping PD) in a transaction 18:45:21 <SumitNaiksatam> so i will revisit if we can completely eliminate the creation of the (sub)transaction here 18:45:36 <rkukura> Ideally, the extend functions would always be called inside the main API transaction, but that isn’t always the case 18:45:51 <SumitNaiksatam> rkukura: yeah, thats the problem 18:46:26 <SumitNaiksatam> the thing is that these changes have to be tried independently 18:46:39 <SumitNaiksatam> otherwise you dont know what is breaking what 18:46:59 * tbachman nods 18:46:59 <SumitNaiksatam> so its a new incremental patchset for each of these changes 18:47:22 <rkukura> If it were me, I’d probably convert the begins to readers for now, and then see if they can be eliminated in a separate patch 18:47:22 <SumitNaiksatam> and the gate craps out almost 2 in 3 times on the HOST_IP error 18:47:31 <tbachman> :( 18:47:44 <rkukura> aren’t you able to run the UTs locally? 18:47:45 <SumitNaiksatam> i tried earlier to fix it, and thought i had, but its come back again 18:48:03 <SumitNaiksatam> rkukura: this does not manifest in the UTs 18:48:08 <rkukura> oh 18:48:30 <SumitNaiksatam> in fact i cannot even locally reproduce some of the errors even if i run devstack 18:48:49 <SumitNaiksatam> i think its a timing issue, since neutron is doing some expunging of DB objects 18:49:12 <tbachman> number of threads? 18:49:22 <SumitNaiksatam> synth_db_objs] 18:49:22 <SumitNaiksatam> Jan 03 23:45:08.268557 ubuntu-xenial-infracloud-chocolate-0001674228 neutron-server[14315]: ERROR gbpservice.neutron.services.grouppolicy.policy_driver_manager File "/opt/stack/new/neutron/neutron/objects/base.py", line 405, in _load_object 18:49:24 <SumitNaiksatam> Jan 03 23:45:08.268733 ubuntu-xenial-infracloud-chocolate-0001674228 neutron-server[14315]: ERROR gbpservice.neutron.services.grouppolicy.policy_driver_manager context.session.expunge(obj.db_obj) 18:49:25 <SumitNaiksatam> Jan 03 23:45:08.268896 ubuntu-xenial-infracloud-chocolate-0001674228 neutron-server[14315]: ERROR gbpservice.neutron.services.grouppolicy.policy_driver_manager File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 1586, in expunge 18:49:26 <SumitNaiksatam> Jan 03 23:45:08.269072 ubuntu-xenial-infracloud-chocolate-0001674228 neutron-server[14315]: ERROR gbpservice.neutron.services.grouppolicy.policy_driver_manager state_str(state)) 18:49:27 <SumitNaiksatam> Jan 03 23:45:08.269255 ubuntu-xenial-infracloud-chocolate-0001674228 neutron-server[14315]: ERROR gbpservice.neutron.services.grouppolicy.policy_driver_manager InvalidRequestError: Instance <SecurityGroupRule at 0x7f0674653590> is not present in this Session 18:49:30 <SumitNaiksatam> tbachman: possibly 18:50:25 <SumitNaiksatam> anyway, so on patchset 45, i think most of the other comments are questions or comments which i can answer inline 18:50:35 <rkukura> agreed 18:50:36 <SumitNaiksatam> rkukura: will reach out if something needs more discussion 18:50:41 <rkukura> sure 18:50:55 <tbachman> SumitNaiksatam: thanks for tackling this beast! 18:50:56 <SumitNaiksatam> hopefully track to knock this off immedicately after this meeting 18:51:06 <SumitNaiksatam> tbachman: no worries 18:51:20 <SumitNaiksatam> i already created the pike branches for the other repos 18:51:27 <SumitNaiksatam> since the patches were merged 18:51:27 <rkukura> nice 18:51:40 <rkukura> I’ll be ready to re-review 18:52:04 <SumitNaiksatam> rkukura: thanks, i guess not the best way to keep your spirits up! :-) 18:52:25 <rkukura> no problem 18:52:52 <SumitNaiksatam> annakk: were you able to try this with your NSX devstack? 18:53:00 <rkukura> I admit I sometimes post my review after a few comments, without looking at everything, but I got throught everything this time 18:53:05 <annakk> not yet, planning to 18:53:18 <SumitNaiksatam> annakk: okay 18:53:41 <tbachman> timecheck 18:54:00 <SumitNaiksatam> rkukura: np, i try to post as i go along too, since it gives the author more immediate feedback 18:54:08 <SumitNaiksatam> tbachman: good point 18:54:16 <SumitNaiksatam> apologies again for starting late today 18:54:22 <tbachman> SumitNaiksatam: no worries! 18:54:33 <SumitNaiksatam> tbachman: so the weather okay in your neck of the woods? 18:54:41 <tbachman> cold, but not that much snow 18:54:48 <tbachman> east of us got a lot 18:54:52 <tbachman> way east 18:54:57 <tbachman> (near the ocean) 18:55:38 <SumitNaiksatam> tbachman: storm/hurricane/bomb-cyclone? 18:55:43 <tbachman> heh 18:55:46 <tbachman> nothing bad 18:55:51 <tbachman> It’s all relative 18:56:00 <tbachman> just cold ;) 18:56:13 <rkukura> here, we are getting plenty of snow (about a foot so far today) but the wind isn’t too bad 18:56:25 <SumitNaiksatam> rkukura: oh good, reassuring 18:56:30 <SumitNaiksatam> annakk: did you have anything else to bring up today? 18:56:41 <annakk> no :) 18:57:46 <SumitNaiksatam> alrighty then, thanks for making it (especially rkukura and tbachman) :-) 18:57:50 <tbachman> SumitNaiksatam: thanks for running the meeeting! 18:57:52 <SumitNaiksatam> stay warm and stay safe 18:57:56 <tbachman> SumitNaiksatam: ack ;) 18:57:56 <rkukura> thanks SumitNaiksatam! 18:57:58 <SumitNaiksatam> bye! 18:58:00 <rkukura> bye 18:58:02 <SumitNaiksatam> #endmeeting