00:01:32 <thinrichs> #startmeeting CongressTeamMeeting
00:01:32 <openstack> Meeting started Thu Jun 23 00:01:32 2016 UTC and is due to finish in 60 minutes. The chair is thinrichs. Information about MeetBot at http://wiki.debian.org/MeetBot.
00:01:33 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
00:01:35 <openstack> The meeting name has been set to 'congressteammeeting'
00:01:48 <ekcs> hi
00:01:58 <aimeeu> Hi again
00:02:08 <ramineni_> hi
00:02:09 <thinrichs> ekcs, aimeeu: hi!
00:02:12 <thinrichs> ramineni_: hi
00:02:18 <thinrichs> masahito is I think out of town still
00:02:46 <thinrichs> Agenda for the week…
00:02:48 <thinrichs> 1. Gate
00:02:58 <thinrichs> 2. Low-hanging bugs for newcomers
00:03:15 <thinrichs> 3. Status updates
00:03:18 <thinrichs> Anything else?
00:04:27 <thinrichs> #topic Gate
00:04:37 <thinrichs> ramineni_: how is the gate looking?
00:05:22 <ramineni_> thinrichs: right now, it looks green, series of patches merged in tempest, causing our gate to fail
00:05:55 <ramineni_> thinrichs: fixed the same in both master and stable/mitaka
00:06:32 <thinrichs> ramineni_: Nice!
00:06:51 <thinrichs> That's an easy agenda item then
00:07:30 <thinrichs> Moving on, unless there's anything to discuss about the HA tests, which are the ones that have been broken of late.
00:08:43 <thinrichs> #topic Low-hanging bugs
00:08:51 <ramineni_> thinrichs: no, HA tests failing because we are using endpoints_client and service client which got changed in tempest
00:09:07 <ramineni_> thinrichs: otherwise they are fine
00:09:54 <thinrichs> ramineni_: makes sense
00:10:26 <thinrichs> aimeeu is a new contributor (if you remember from last week) and is looking for a couple of bugs to start with
00:10:56 <aimeeu> I did come across another one that was abandoned a year ago: https://bugs.launchpad.net/congress/+bug/1415199
00:10:56 <openstack> Launchpad bug 1415199 in congress "Refactor test_neutron_driver" [Medium,Confirmed] - Assigned to Cleber Rosa (cleber-gnu)
00:11:12 <thinrichs> We used to have bugs in launchpad marked with 'low-hanging-fruit'.
00:11:26 <aimeeu> Only 2 marked that way
00:11:35 <thinrichs> Do we have bugs aimeeu might be able to work on that we haven't registered in launchpad?
00:12:12 <ekcs> Here is a simple one assigned to me but I haven’t done it. #link https://bugs.launchpad.net/congress/+bug/1501579
00:12:12 <openstack> Launchpad bug 1501579 in congress "Calling plexxi_driver.execute(...) throws exception" [Medium,New] - Assigned to Eric K (ekcs)
00:12:28 <thinrichs> aimeeu: The tests are in a state of flux because we basically rewrote them all for a new architecture we're almost finished implementing.
00:13:06 <thinrichs> So we should probably abandon the bug you mentioned above.
00:13:11 <aimeeu> OK
00:13:34 <thinrichs> The new tests could require that kind of work too, but we would need to look.
00:14:01 <thinrichs> ekcs: thanks.
00:14:18 <thinrichs> aimeeu: you could work on the one ekcs is pointing out. Seems like there's just a method to add.
00:14:51 <aimeeu> Got it - just reassigned it to myself
00:15:11 <thinrichs> Does anyone else have bugs they're not actively working on? Especially if they're good starters, let's unassign ourselves.
00:16:22 <thinrichs> Can we all take an action item to look for 1-2 low-hanging bugs? Then aimeeu can have a few to choose from, and we make it easier for others to jump in and help out.
00:16:44 <ramineni_> thinrichs: ok
00:16:56 <aimeeu> thinrichs: thanks!
00:17:13 <ekcs> got it.
00:17:31 <thinrichs> #action Everyone will try to file a couple low-hanging bugs for starting points for newcomers
00:17:52 <thinrichs> #topic status updates
00:18:15 <thinrichs> ekcs: how is the HA discussion going?
00:19:52 <ekcs> I haven’t done much this past week cuz I got pulled into something urgent at work. But I think we’re all in fairly good agreement on the spec. Andrew from redhat also gave some comments. I’ll touch things up in the next couple days (a few clarifications requested and fill in a few TODO sections) then I think it’s ready to merge.
00:20:47 <thinrichs> ekcs: That was my reading too, that we're done with design and need to move on to implementation
00:21:11 <ekcs> the other thing I’m working on is looking for threading issues with the switch to new-arch. A lot of things that used to be safe because there was no blocking are now suspect because rpc.cast/call blocks and yields.
00:21:46 <thinrichs> ekcs: Totally important
00:21:48 <ekcs> and i’m done with the urgent project so I should be able to get more done in the coming week.
00:23:18 <thinrichs> I could see a number of bugs coming out of that analysis, and they could easily be super-hard to write tests for.
00:23:43 <thinrichs> ramineni_: want to discuss your status?
00:24:00 <ramineni_> thinrichs: sure
00:24:35 <ramineni_> last week worked on supporting keystone v3 and use of sessions for all the datasources and also made the default auth_url as v3 for creating datasources in devstack plugin
00:25:05 <ramineni_> still some datasources are throwing errors, i have to look at it
00:25:14 <thinrichs> Do all the services support v3?
00:25:22 <ramineni_> yes
00:25:27 <thinrichs> Do we?
00:26:09 <ramineni_> yes, creating the client with v3 shouldnt throw any error, ill check again
00:27:08 <thinrichs> Just wanted to check
00:27:13 <ramineni_> thats it from my side
00:27:40 <thinrichs> Ok. I'll go.
00:28:04 <thinrichs> I've been underwater for weeks now.
00:28:14 <thinrichs> I've been struggling just to keep up with reviews.
00:28:37 <thinrichs> But I'm hoping to have more time again.
00:29:24 <thinrichs> I was planning on testing out a multi-process deployment
00:29:42 <thinrichs> and maybe add docs around how to do that
00:31:05 <thinrichs> Or are there other things we think are more pressing before we begin the HA work?
00:31:32 <ekcs> that sounds right.
00:32:31 <thinrichs> Okay. That's my plan then.
00:32:38 <thinrichs> #topic open discussion
00:32:50 <ramineni_> thinrichs: https://review.openstack.org/#/c/329772/
00:32:50 <patchbot> ramineni_: patch 329772 - congress - Fix listing of datasources
00:32:50 <thinrichs> Anything else we should discuss today?
00:33:21 <ramineni_> thinrichs: im thinking should we delete the datasource when we fail to load driver happens
00:33:23 <ramineni_> ?
00:34:02 <thinrichs> So actually remove it from the database?
00:34:09 <ramineni_> thinrichs: yes
00:35:25 <thinrichs> (ekcs: here's the problem ramineni_ is working on. User creates datasource with driver D. Then removes D from etc/congress/congress.conf and restarts congress. Today there's a fatal error b/c Congress can't reinstantiate the datasource b/c D is unavailable.)
00:35:42 <thinrichs> The question is what do we do in that case?
00:36:01 <ekcs> thanks. just read up on it.
00:36:02 <thinrichs> I'm looking to see what happens if there's a policy referencing that datasource…
00:36:20 <thinrichs> Does Congress let you delete such a datasource or does it block it? I think it blocks it.
00:37:41 <ramineni_> thinrichs: it doesnt create policy
00:37:58 <ramineni_> thinrichs: it fails here https://github.com/openstack/congress/blob/master/congress/harness.py#L394
00:38:12 <ramineni_> but we silently log the exception and continue
00:38:30 <ramineni_> so, after that if we list the datasources, it throws internal server error
00:38:59 <thinrichs> What's the behavior we think is right…
00:39:16 <thinrichs> there's a datasource in the DB but the driver for that DB is not available
00:39:36 <thinrichs> I typically think that deleting something out of the DB is dangerous when the user didn't ask us to do it.
00:40:18 <ekcs> Yea I’m with thinrichs on that. if someone dropped in a new config by mistake, that shouldn’t delete DS and especially not policies.
00:40:53 <thinrichs> So let's imagine we leave the DS in the DB.
00:40:54 <ramineni_> ekcs: policies are not created for that datasource
00:41:17 <thinrichs> What problems does that cause?
00:41:32 <ekcs> We could prompt the user to ok the deletion. options: 1. delete the DS and continue 2. fail and you fix the config before relaunching.
00:42:16 <thinrichs> What if it's a script that's starting up Congress?
00:42:36 <thinrichs> Interactive stuff seems scary
00:42:56 <ramineni_> so, we should mark as disabled
00:43:01 <ramineni_> and leave it in DB
00:43:01 <ekcs> anyway this should be a rare situation and i think the easiest thing is to document and let the user resolve it by fixing config or fixing DB. problem is it’s hard to fix DB without congress. so maybe we have a separate congress switch that says clean the DB based on config.
00:43:02 <ramineni_> ?
00:43:41 <ekcs> i feel like disabling adds too much complexity for something that doesn’t happen much anyway. cuz every other part of the system may need to account for that case.
00:44:06 <ramineni_> ekcs: ya, i agree
00:45:05 <thinrichs> +1 to it being a rare situation
00:45:14 <ekcs> my vote would be for: congress just fails with good error message.
in doc direct user to either add the config back or launch congress with a new switch (with warning) to launch and delete DS.
00:45:28 <ramineni_> ekcs: you experienced this issue without changing config?
00:45:41 <thinrichs> And we probably shouldn't be doing anything substantial right now
00:46:03 <ekcs> ramineni_: no. did I suggest that?
00:46:08 <ramineni_> ekcs: you have raised the bug right
00:46:19 <ekcs> ramineni_: no. This is first time i heard of it.
00:46:40 <ekcs> ok wait.
00:46:54 <ramineni_> https://bugs.launchpad.net/congress/+bug/1564152
00:46:54 <openstack> Launchpad bug 1564152 in congress "When 1 driver fails to load, all drivers fail to list (Error 500)" [Medium,In progress] - Assigned to Anusha (anusha-iiitm)
00:47:12 <ekcs> I did raise the bug.
00:47:18 <ramineni_> may be you have experienced in totally different scenario?
00:48:42 <ekcs> but yea different scenario. like if there is bug in a driver and that driver has failure, then listing all drivers fail and don’t give any info.
00:49:33 <ekcs> imagine someone added a new thirdparty driver. and lists drivers. ideally he should see the third party driver failing and others working in the list. user gets no info from list or from horizon.
00:50:01 <ekcs> this is helpful especially if someone is developing their custom driver.
00:50:30 <thinrichs> ekcs: +1 to your suggestion about missing driver
00:50:46 <ramineni_> ekcs: but if error, it doesnt let you create datasource right
00:52:20 <thinrichs> ekcs: fixing that bug of yours will take some work
00:53:22 <ekcs> hmm I didn’t document it super clearly. I forget what error I introduced. but it’s something like this. list_drivers() calls each individually configured driver for info(). If some of those info() methods fail, the whole list_drivers() fail. Ideally list_drivers() still gets the info on other drivers.
00:54:00 <thinrichs> ekcs: is the obvious fix (adding a try/except around the info()) a good first approx?
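[Editor's note: a minimal sketch of the try/except idea raised at 00:54:00, not Congress's actual code. Only the names list_drivers() and info() and the 'last error' idea come from the log; FakeDriver, the dict keys, and the last_error field layout are hypothetical, chosen for illustration.]

```python
import logging

LOG = logging.getLogger(__name__)


class FakeDriver:
    """Hypothetical stand-in for a configured datasource driver."""

    def __init__(self, name, broken=False):
        self.name = name
        self.broken = broken

    def info(self):
        # A broken driver raises, which is the failure mode described in the log.
        if self.broken:
            raise RuntimeError("driver %s failed to report info" % self.name)
        return {"id": self.name, "description": "%s driver" % self.name}


def list_drivers(drivers):
    """Collect info() from every driver, isolating per-driver failures."""
    results = []
    for driver in drivers:
        try:
            entry = dict(driver.info())
            entry["last_error"] = None
        except Exception as exc:
            # Deliberately broad: one faulty driver should not break the listing.
            LOG.exception("info() failed for driver %s", driver.name)
            entry = {"id": driver.name, "last_error": str(exc)}
        results.append(entry)
    return results
```

[With this shape, a caller such as a dashboard page could still render the working drivers and show the recorded error next to the failing one, instead of the whole request ending in a 500.]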
00:54:07 <thinrichs> That could be something for aimeeu
00:54:16 <ekcs> that way when you look on horizon for instance, the page will list drivers and show ‘last error’ on each of the drivers. right now the page just doesn’t load. or shows 500.
00:54:24 <ekcs> yea that’s basically what I had in mind.
00:54:29 <aimeeu> I thought of the try/catch - I can take a look if you like
00:54:55 <ekcs> but I should document better how to reproduce.
00:55:30 <thinrichs> ekcs: maybe in addition to a few more lines about how to reproduce, you could add the low-hanging tag so aimeeu can easily find it
00:55:38 <ekcs> ok
00:55:50 <thinrichs> 5 minutes left. Anything else for today?
00:56:06 <ekcs> ramineni_: still a good thought on what to do when driver removed from config tho.
00:56:40 <ekcs> ramineni_: i think the delete drivers patch is a good thing to have (triggered by special switch)
00:57:14 <ramineni_> ekcs: hmm
00:57:24 <ekcs> nothing else from me.
00:57:35 <ramineni_> ekcs: or fail it, as they have messed up with config ?
00:57:52 <ekcs> yea fail should be default behavior
00:58:04 <thinrichs> ramineni_: so I think the proposal is: (i) congress fails to start if it has DS without a driver and (ii) add a switch to congress server that causes it to delete all DSs without drivers
00:58:22 <thinrichs> ekcs: is that right?
00:58:29 <ekcs> yea
00:58:40 <ekcs> but again maybe not even that important to add (ii) right now
01:00:25 <thinrichs> Out of time for today. We can continue on #congress if need be.
01:00:30 <thinrichs> #endmeeting
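[Editor's note: a rough sketch of the two-part proposal summarized at 00:58:04, under stated assumptions. The meeting did not settle on names or an interface; MissingDriverError, check_datasources, the delete_missing flag, and the dict layout of a datasource record are all hypothetical, and the real switch (if added) may look quite different.]

```python
class MissingDriverError(Exception):
    """Hypothetical error raised when a stored datasource has no loadable driver."""


def check_datasources(datasources, available_drivers, delete_missing=False):
    """Validate stored datasources against the configured drivers at startup.

    datasources: list of dicts with 'name' and 'driver' keys (assumed layout).
    available_drivers: set of driver names loaded from the config.
    delete_missing: stands in for the proposed opt-in switch, part (ii).

    Returns (kept, deleted). Raises MissingDriverError by default, part (i).
    """
    kept = [ds for ds in datasources if ds["driver"] in available_drivers]
    missing = [ds for ds in datasources if ds["driver"] not in available_drivers]

    if missing and not delete_missing:
        names = ", ".join(
            "%s (driver %s)" % (ds["name"], ds["driver"]) for ds in missing
        )
        # Default behavior: refuse to start and tell the operator how to recover.
        raise MissingDriverError(
            "Datasources without a configured driver: %s. Restore the driver "
            "in the config, or restart with the delete switch to remove them."
            % names
        )

    # Only reached with no missing drivers, or with the explicit opt-in.
    return kept, missing
```

[Failing by default keeps the concern raised at 00:39:36 intact: nothing is removed from the database unless the operator asked for it.]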