*** markwash has joined #openstack-neutron | 00:40 | |
*** WackoRobie has joined #openstack-neutron | 00:40 | |
*** yamahata has joined #openstack-neutron | 00:47 | |
*** clev has quit IRC | 00:52 | |
*** markwash has quit IRC | 01:19 | |
*** WackoRobie has quit IRC | 01:20 | |
*** WackoRobie has joined #openstack-neutron | 01:21 | |
*** markwash has joined #openstack-neutron | 01:21 | |
*** WackoRob_ has joined #openstack-neutron | 01:24 | |
*** WackoRobie has quit IRC | 01:24 | |
*** WackoRob_ has quit IRC | 01:26 | |
*** zzelle has quit IRC | 01:36 | |
*** harlowja has joined #openstack-neutron | 01:55 | |
*** Jianyong has joined #openstack-neutron | 02:02 | |
*** krast has joined #openstack-neutron | 02:15 | |
*** WackoRobie has joined #openstack-neutron | 02:16 | |
*** WackoRobie has quit IRC | 02:16 | |
openstackgerrit | Gary Duan proposed a change to openstack/neutron: FWaaS integration with service type framewrok https://review.openstack.org/60699 | 02:19 |
*** clev has joined #openstack-neutron | 02:22 | |
*** krast has quit IRC | 02:28 | |
openstackgerrit | A change was merged to openstack/python-neutronclient: Remove a debugging print statement https://review.openstack.org/62617 | 02:31 |
*** krast has joined #openstack-neutron | 02:31 | |
*** WackoRobie has joined #openstack-neutron | 02:31 | |
*** clev has quit IRC | 02:32 | |
*** WackoRobie has quit IRC | 02:32 | |
*** markwash has quit IRC | 02:32 | |
*** markwash has joined #openstack-neutron | 02:35 | |
*** alexpilotti has quit IRC | 02:38 | |
*** vkozhukalov has joined #openstack-neutron | 02:54 | |
*** harlowja is now known as harlowja_away | 02:55 | |
*** alexpilotti has joined #openstack-neutron | 02:59 | |
*** alexpilotti has quit IRC | 03:07 | |
openstackgerrit | Zhang Hua proposed a change to openstack/neutron: Clean up ML2 Manager https://review.openstack.org/61351 | 03:08 |
*** yamahata has quit IRC | 03:25 | |
*** yamahata has joined #openstack-neutron | 03:28 | |
*** WackoRobie has joined #openstack-neutron | 03:42 | |
*** WackoRobie has quit IRC | 03:47 | |
*** gongysh has joined #openstack-neutron | 04:01 | |
*** jecarey has quit IRC | 04:08 | |
*** HenryG has quit IRC | 04:22 | |
*** harlowja_away is now known as harlowja | 04:31 | |
*** harlowja has quit IRC | 04:35 | |
*** networkstatic is now known as networkstatic_zZ | 04:37 | |
*** networkstatic_zZ is now known as networkstatic | 04:43 | |
*** chandankumar has joined #openstack-neutron | 04:44 | |
*** HenryG has joined #openstack-neutron | 04:51 | |
*** markwash has quit IRC | 05:08 | |
*** yfried has quit IRC | 05:29 | |
*** markwash has joined #openstack-neutron | 05:30 | |
*** markwash has quit IRC | 05:36 | |
openstackgerrit | enikanorov proposed a change to openstack/neutron: Fix empty network deletion in db_base_plugin for postgesql https://review.openstack.org/63597 | 06:15 |
*** irenab_ has joined #openstack-neutron | 06:16 | |
*** yfried has joined #openstack-neutron | 06:19 | |
*** ljjjustin has joined #openstack-neutron | 06:22 | |
*** AMike has quit IRC | 06:30 | |
*** ashaikh has joined #openstack-neutron | 06:32 | |
*** gdubreui has quit IRC | 06:34 | |
openstackgerrit | Jenkins proposed a change to openstack/neutron: Imported Translations from Transifex https://review.openstack.org/63056 | 06:38 |
*** vkozhukalov has quit IRC | 06:50 | |
*** ashaikh has quit IRC | 06:53 | |
*** otherwiseguy has quit IRC | 06:54 | |
*** yfried has quit IRC | 06:56 | |
*** yfried has joined #openstack-neutron | 07:02 | |
*** yfried has quit IRC | 07:05 | |
*** yfried has joined #openstack-neutron | 07:06 | |
*** garyk has joined #openstack-neutron | 07:15 | |
*** networkstatic has quit IRC | 07:21 | |
*** jlibosva has joined #openstack-neutron | 07:39 | |
*** evgenyf has joined #openstack-neutron | 07:43 | |
marun | salv-orlando: poing | 07:51 |
salv-orlando | marun: piong | 07:52 |
marun | salv-orlando: I'm not sure I see how the notification patch could possibly cause problems. | 07:52 |
salv-orlando | marun: I'm pretty sure it's not your patch causing those failures | 07:52 |
salv-orlando | I just wanted a confirmation from you | 07:52 |
marun | salv-orlando: i'm also not clear on why successful resource creation has anything to do with agent's being up or down | 07:53 |
marun | salv-orlando: I'm tempted to -1 the devstack patch accordingly | 07:53 |
salv-orlando | go for it | 07:53 |
salv-orlando | but you should not -1 something because it's not clear for you :) | 07:53 |
marun | salv-orlando: i don't want to be a jerk, though. what am i missing? | 07:53 |
salv-orlando | seriously, I can explain. Have you looked at the comment I left on bug 1253896 | 07:54 |
*** csd has quit IRC | 07:54 | |
marun | salv-orlando: why on earth would resource creation (setting db state) have anything to do with propagating that state to an agent? | 07:54 |
marun | salv-orlando: looking... | 07:54 |
salv-orlando | I can explain here... | 07:54 |
marun | salv-orlando: ok | 07:54 |
salv-orlando | basically that happens because when you create a port with the ml2 plugin, it does something which is called a binding. | 07:55 |
salv-orlando | basically associate either the local VLAN or the provider attributes | 07:55 |
salv-orlando | to do so the ml2 plugin calls the "agent mechanism driver" | 07:55 |
salv-orlando | which if there is no agent available, just does not create this binding | 07:55 |
salv-orlando | then when the agent asks for the port details, since there are no bindings the plugin does not return the details | 07:56 |
salv-orlando | and the port is not wired. | 07:56 |
salv-orlando | Now devstack starts neutron server, creates the resources and then starts the agent. | 07:56 |
salv-orlando | Your next question should be "why this does not happen always then?" :) | 07:57 |
salv-orlando | because the port which does not get wired is the DHCP port, which is created asynchronously when the dhcp agent receives a notification that the subnet has been created. | 07:57 |
marun | so is the ml2 plugin just retarded?? | 07:57 |
salv-orlando | Therefore we are observing what is a race between the dhcp and the ovs agent | 07:57 |
*** amuller has joined #openstack-neutron | 07:58 | |
salv-orlando | I think there might be something to fix wrt this behaviour in the ml2 plugin as well. | 07:58 |
salv-orlando | indeed if you're unable to bind a port to a "segment", one should not put the port "DOWN" as the consumers of the neutron API might expect it to go UP eventually | 07:58 |
salv-orlando | the port should be put in ERROR | 07:58 |
marun | salv-orlando: right | 07:59 |
salv-orlando | but perhaps there's even more to do in the ml2 plugin. It's just that we need to involve the people which have been developing the mechanism driver framework | 07:59 |
marun | salv-orlando: <expletive deleted> | 07:59 |
salv-orlando | and I would like to stop seeing these failures for bug 1253896 | 07:59 |
marun | salv-orlando: can you explain this port binding thing again? | 08:00 |
salv-orlando | marun: if what you've deleted relates to ml2 developers not being involved as expected in bug triaging and fixing, I kind of understand your concern. | 08:00 |
salv-orlando | marun: sure. | 08:00 |
salv-orlando | when a port is created, ml2 plugin binds it to a "segment". A segment is either a local vlan identifier or a provider network specification. In the case of gate tests it's always a local vlan id. This local vlan id will be set by the agent on the ovs port. | 08:01 |
marun | local vlan id?? | 08:01 |
marun | are you saying the ml2 plugin now allocates the local vlan on the neutron server instead of it being done at the agent level like with the ovs plugin? | 08:02 |
* marun confused | 08:02 | |
*** amuller has quit IRC | 08:03 | |
salv-orlando | this is what I read from the code. I am confused too because I did not understand the necessity of it. However, I do not have enough info on the thought process which led to this decision. | 08:03 |
*** bvandenh has quit IRC | 08:03 | |
salv-orlando | Bottom line is that if there is no agent for a port, this "segment binding" is not created. | 08:03 |
*** amuller has joined #openstack-neutron | 08:03 | |
marun | salv-orlando: <expletive deleted> | 08:04 |
salv-orlando | when this happens, get_device_details in neutron/plugins/ml2/rpc.py does not return details for the port, and as a consequence the agent puts the port on the DEAD VLAN | 08:04 |
salv-orlando | as you might understand, I agree there is probably an action to be taken on the ML2 plugin, but yesterday I thought that this might involve engaging with the developers, rediscussing the issue, reaching consensus on a solution, and then finding someone to implement it. | 08:05 |
marun | salv-orlando: ok, so I'll agree to not -1 the devstack patch if we file a bug ensuring this massive cock-up gets fixed. | 08:05 |
salv-orlando | marun: and this is why I went for the devstack patch. | 08:05 |
marun | salv-orlando: understood. | 08:06 |
salv-orlando | I will create a bug report, hopefully tonight in the neutron meeting I will find some time to query some ml2 dev | 08:06 |
marun | salv-orlando: cool. | 08:06 |
* marun regrets not looking closely at ml2 | 08:06 | |
salv-orlando | I tried yesterday on IRC, but as you might expect the room was desert. | 08:06 |
marun | salv-orlando: weekend before western holidays? yeah | 08:07 |
salv-orlando | marun: I think I already told you what I regret about ml2 | 08:07 |
marun | salv-orlando: I don't think so - maybe you can enlighten me? | 08:07 |
salv-orlando | We probably rushed a bit in making it the default plugin. | 08:07 |
marun | salv-orlando: ah, yes. | 08:08 |
marun | salv-orlando: given the bug in question, I would entirely agree with that assessment. | 08:08 |
marun | salv-orlando: The fact that it was allowed to introduce a dependency on agent liveness indicates a lamentable lack of distributed systems experience on the part of everyone involved. :( | 08:09 |
marun | salv-orlando: (though that criticism could probably be leveled at large swaths of openstack) | 08:10 |
*** dzyu has joined #openstack-neutron | 08:10 | |
salv-orlando | marun: your last comment is correct. The fact is that I'm the first who make mistakes when it comes to distributed systems designs. And I did not get a good peer review as I usually do, I would have made errors which are far worse. | 08:11 |
salv-orlando | last sentence was supposed to start with "And If I did not get..." | 08:11 |
marun | salv-orlando: Hey, doesn't armax have a phd in distributed systems? | 08:14 |
*** ljjjustin has quit IRC | 08:14 | |
salv-orlando | marun: technically I too have it | 08:14 |
salv-orlando | note the "technically" | 08:14 |
salv-orlando | marun: and I even TA-ed the distributed system class and implemented demo versions for student of several consensus protocols including paxos | 08:15 |
salv-orlando | marun: but still, I make poor mistakes | 08:15 |
marun | salv-orlando: oy. optimistically, at least, things can only improve ;) | 08:16 |
marun | salv-orlando: and I get it - this stuff is hard. we don't know our own ignorance | 08:17 |
salv-orlando | marun: right. I will report this bug, and then seek to elicit some feedback from the ML2 team. I might be ok to keep agents and server loosely coupled as they are now, I think the server should just not make assumptions like the one that an agent is always available, and if it does, take appropriate reactions - like failing the port create or setting it in ERROR state like nova does when there is no compute node to schedule an instance | 08:19 |
marun | salv-orlando: ok | 08:20 |
marun | salv-orlando: why is it ok to require the agent to be running though? | 08:20 |
marun | salv-orlando: why can't the server decide and notify the agent? | 08:21 |
marun | salv-orlando: or is the implication that both the server and the agent are now making decisions that have to be synchronized? | 08:21 |
* marun feels a headache coming on | 08:21 | |
salv-orlando | marun: I can't give an answer to this question without hearing from the people who designed and developed the ml2 mechanism framework. | 08:27 |
marun | salv-orlando: fair enough | 08:27 |
salv-orlando | guesswork leads to headaches, and I don't take pills | 08:27 |
marun | salv-orlando: :) | 08:27 |
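The race salv-orlando describes above can be sketched in a few lines of Python. This is only an illustrative model, not the ml2 code: `Server`, `Port`, `agent_wire`, the `mark_error_on_failure` flag and the `DEAD_VLAN` value are all hypothetical names chosen for the example; only `get_device_details` is a name taken from the discussion. It shows a port created while no agent is alive getting no segment binding, the agent then receiving no details and parking the port on the dead VLAN, and the suggested alternative of marking such a port ERROR instead of leaving it DOWN.

```python
from dataclasses import dataclass, field
from typing import Optional

# Illustrative value only; an assumption made for this sketch.
DEAD_VLAN = 4095


@dataclass
class Port:
    port_id: str
    status: str = "DOWN"
    segment: Optional[int] = None  # local VLAN id, set only if binding succeeds


@dataclass
class Server:
    """Hypothetical stand-in for the server-side plugin."""
    agent_alive: bool = False
    mark_error_on_failure: bool = False  # the behaviour proposed in the chat
    ports: dict = field(default_factory=dict)
    next_vlan: int = 1

    def create_port(self, port_id):
        port = Port(port_id)
        if self.agent_alive:
            # Binding succeeds: associate the port with a segment.
            port.segment = self.next_vlan
            self.next_vlan += 1
        elif self.mark_error_on_failure:
            # Proposed fix: tell API consumers the port will not come up.
            port.status = "ERROR"
        # Otherwise the port silently stays DOWN with no binding.
        self.ports[port_id] = port
        return port

    def get_device_details(self, port_id):
        port = self.ports[port_id]
        if port.segment is None:
            return None  # no binding, so no details for the agent
        return {"port_id": port_id, "vlan": port.segment}


def agent_wire(server, port_id):
    """Return the VLAN the (hypothetical) agent would put the port on."""
    details = server.get_device_details(port_id)
    return DEAD_VLAN if details is None else details["vlan"]
```

In this model, a port created before the agent starts (as with the DHCP port in the devstack ordering described above) ends up on `DEAD_VLAN`, while a port created afterwards is wired normally.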
*** vkozhukalov has joined #openstack-neutron | 08:29 | |
*** ygbo has joined #openstack-neutron | 08:43 | |
*** rkukura has quit IRC | 08:45 | |
*** rkukura has joined #openstack-neutron | 08:45 | |
marun | rkukura: are you really up | 08:47 |
marun | ? | 08:47 |
*** fouxm has joined #openstack-neutron | 08:50 | |
*** jlibosva has quit IRC | 08:50 | |
*** jlibosva has joined #openstack-neutron | 08:51 | |
*** salv-orlando_ has joined #openstack-neutron | 08:54 | |
*** rossella_s has joined #openstack-neutron | 08:56 | |
*** salv-orlando has quit IRC | 08:57 | |
*** salv-orlando_ is now known as salv-orlando | 08:57 | |
*** safchain has joined #openstack-neutron | 08:58 | |
openstackgerrit | Aaron Rosen proposed a change to openstack/neutron: Bump api_workers from 0 to 4 https://review.openstack.org/59787 | 09:02 |
*** salv-orlando has quit IRC | 09:07 | |
*** salv-orlando has joined #openstack-neutron | 09:07 | |
*** ijw has joined #openstack-neutron | 09:13 | |
*** ihrachyshka has quit IRC | 09:14 | |
openstackgerrit | Sylvain Afchain proposed a change to openstack/neutron: Dnsmasq uses all agent IPs as nameservers https://review.openstack.org/61067 | 09:15 |
marun | salv-orlando: ping | 09:18 |
salv-orlando | hi martun | 09:18 |
salv-orlando | marun | 09:18 |
marun | salv-orlando: so, I'm guilty of thinking too much... | 09:18 |
marun | salv-orlando: I have one question, and then I'm hoping you would be willing to review the first part of a design doc regarding the dhcp agent | 09:19 |
salv-orlando | marun: and now you've got a headache? I'm sorry I'm out of drugs | 09:19 |
marun | salv-orlando: (i promise, it's not too painful) | 09:19 |
salv-orlando | marun: go ahead | 09:19 |
marun | salv-orlando: are networks that aren't isolated possible? | 09:20 |
marun | salv-orlando: i.e. is neutron going to be ns-isolated only at some point in the future | 09:20 |
marun | salv-orlando: the question is related to the safety of applying state changes to multiple networks out-of-order | 09:21 |
salv-orlando | marun: I think so, but I don't know when that will happen. I think the option of running neutron without overlapping ips (hence no namespaces), will stay for havana | 09:22 |
marun | salv-orlando: I guess the potential exists in that scenario for db changes across network to be applied in the wrong order by the dhcp agent | 09:22 |
salv-orlando | marun: perhaps that will go away once all major linux distros will support namespace, but then there's the user perspective (there are some who really don't want to use namespaces - not guessing here, I've met at least one of those users) | 09:22 |
marun | salv-orlando: Arg | 09:23 |
salv-orlando | marun: then that could be a problem regardless of whether we're using namespaces or not, I guess? | 09:23 |
marun | salv-orlando: No, I don't think so. | 09:23 |
marun | salv-orlando: so, example. | 09:23 |
marun | salv-orlando: network A has allocation range X1,X2 and network B has allocation range Y1,Y2 | 09:24 |
marun | salv-orlando: network A kills all ports on allocation range X2 and drops it, network B adds allocation range equivalent to X2 and adds ports | 09:24 |
marun | salv-orlando: with network isolation, this doesn't matter. | 09:24 |
*** ijw has quit IRC | 09:25 | |
marun | salv-orlando: without network isolation, well, it's possible that if we don't globally order operations we could have ports on X2 still there while ports are allocated on the new range on network b | 09:25 |
marun | salv-orlando: does that make sense? | 09:25 |
*** ijw has joined #openstack-neutron | 09:25 | |
marun | salv-orlando: In havana, well, there is no guarantee of notification ordering, so things are just going to suck anyway | 09:26 |
marun | salv-orlando: if we can guarantee notification ordering per-network, though, network isolation will prevent anything bad from happening cross-network | 09:26 |
salv-orlando | 6 | 09:26 |
salv-orlando | 35 | 09:26 |
salv-orlando | marun: sorry cat on keyboard | 09:26 |
marun | heh | 09:27 |
salv-orlando | marun: summarising, the current handling of port_update, but even just stashing all the messages in a set and then processing them in the main loop creates a window of opportunity for some range of IPs to be defined at the same time on more than one dnsmasq instance. | 09:28 |
salv-orlando | is that correct? | 09:28 |
marun | salv-orlando: yes | 09:29 |
salv-orlando | and the way for sorting it would be to put notifications in a queue and serially processing them? | 09:29 |
marun | salv-orlando: I have something prepared, I'll email. | 09:29 |
salv-orlando | marun: the only thing I have to say about serial processing is that in our cloud, where we use the dhcp agent, we found out that at some scale (500 networks for instance), the dhcp agent would take about 35 minutes to do the initial synchronization | 09:31 |
salv-orlando | this is why arosen did that think about parallel processing for dhcp namespaces | 09:31 |
salv-orlando | think/thing | 09:31 |
marun | hold that thought | 09:31 |
salv-orlando | holding that | 09:31 |
marun | sent to your gmail | 09:31 |
salv-orlando | ok | 09:31 |
marun | salv-orlando: this is just the start - I haven't written the part about how notifications are actually processed. suffice to say, though, that operation ordering only has to be (necessarily) per-network, and there will be room to optimize by properly separating db to agent sync from agent to dnsmasq sync. | 09:34 |
salv-orlando | I am reading it, but I won't be able to answer immediately. At first glance it makes sense - the problem with agent counters reminds me of vector clocks but I'm not sure. I think it's a good idea to send it to the mailing list. | 09:34 |
marun | salv-orlando: I intend to post it as a blueprint design, I was just hoping to get some initial feedback from you as to whether I was missing anything obvious. | 09:34 |
marun | salv-orlando: it is kind of like vector clocks, but since we have the db we don't have to take into account multiple hosts. we have one source of truth - the db. | 09:35 |
salv-orlando | I will be able to send you my feedback in a few hours - I'm sorry for the delay, but I have something else I'm trying to finish for today. | 09:35 |
openstackgerrit | enikanorov proposed a change to openstack/neutron: Add security groups tables for ML2 plugin via migration https://review.openstack.org/63585 | 09:36 |
salv-orlando | anyway, from a process perspective, something else to think about is how to implement this incrementally so we do not end up with a bunch of large patches to review a few days before the I-3 deadline | 09:36 |
marun | salv-orlando: no worries. I'm going to sleep now anyway. :) | 09:36 |
*** ijw has quit IRC | 09:36 | |
marun | salv-orlando: I don't think this will be too complicated, and I'm hoping to make the rationale clear to reviewers via a clear design doc. | 09:37 |
salv-orlando | marun: cool. | 09:37 |
marun | salv-orlando: the 3 phases I anticipate are 1) ordered notifications, 2) separation of db to agent and agent to dnsmasq sync, and 3) optimize for performance | 09:38 |
marun | salv-orlando: I'm hoping this effort might prove useful for the other agents if they have a similar issue with processing notifications out of order. | 09:39 |
salv-orlando | I'm not sure if markmcclain is cooking as well something for #2, you might try to ping him tomorrow | 09:39 |
marun | salv-orlando: i think he is, yes. | 09:40 |
marun | salv-orlando: I'm less clear on whether he's doing #1, though, so that effort at least might be useful. | 09:42 |
salv-orlando | marun: I and mark have never talked about #1 | 09:44 |
marun | salv-orlando: I think it's going to impact all of the agents. | 09:45 |
marun | salv-orlando: especially when multiple api workers are used. | 09:45 |
salv-orlando | marun: Sure. It does not have to be something specific to the dhcp agent. | 09:46 |
marun | salv-orlando: 'nite! | 09:51 |
salv-orlando | marun: good night | 09:51 |
marun | salv-orlando: thank you for the conversation, helpful as always. | 09:51 |
marun | :) | 09:51 |
*** majopela|lunch has joined #openstack-neutron | 09:51 | |
salv-orlando | marun: you're welcome | 09:51 |
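The design doc marun mentions is not part of this log, so the following is only a guess at what "ordered notifications per network" could look like, not his proposal. It assumes each notification carries a per-network sequence number assigned from the database (the single source of truth mentioned above); `PerNetworkOrderer`, `receive` and the `seq` field are hypothetical names. Notifications for one network are applied strictly in order and out-of-order arrivals are buffered, while different networks never wait on each other.

```python
from collections import defaultdict


class PerNetworkOrderer:
    """Apply notifications in sequence order, independently per network."""

    def __init__(self, apply):
        self._apply = apply  # callback: apply(network_id, seq, payload)
        self._next_seq = defaultdict(lambda: 1)  # next expected seq per network
        self._pending = defaultdict(dict)  # buffered out-of-order payloads

    def receive(self, network_id, seq, payload):
        if seq < self._next_seq[network_id]:
            return  # already applied: drop the duplicate or stale message
        self._pending[network_id][seq] = payload
        self._drain(network_id)

    def _drain(self, network_id):
        # Apply every buffered notification that is now contiguous.
        pending = self._pending[network_id]
        while self._next_seq[network_id] in pending:
            seq = self._next_seq[network_id]
            self._apply(network_id, seq, pending.pop(seq))
            self._next_seq[network_id] = seq + 1
```

Because ordering is tracked per network, a late message for network A does not hold up network B, which leaves room for the parallel processing across networks that salv-orlando raises as a scaling concern.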
*** chandankumar has quit IRC | 09:56 | |
openstackgerrit | Yves-Gwenael Bourhis proposed a change to openstack/neutron: Make dnsmasq aware of all names https://review.openstack.org/52930 | 09:58 |
*** chandankumar has joined #openstack-neutron | 09:59 | |
*** ijw has joined #openstack-neutron | 10:00 | |
*** dzyu has quit IRC | 10:01 | |
*** chandankumar has quit IRC | 10:06 | |
*** chandankumar has joined #openstack-neutron | 10:08 | |
*** majopela|lunch has quit IRC | 10:10 | |
enikanorov__ | salv-orlando: hi. I wanted to ask about whether fixing bugs of ovs plugin is something desirable? afaik it is deprecated in favor of ml2 | 10:17 |
salv-orlando | idk but as long as it's there if there's a bug i'd fix it, unless it requires major changes in the plugin code | 10:23 |
enikanorov__ | ok i see | 10:23 |
*** evgenyf has quit IRC | 10:24 | |
*** rossella_s has quit IRC | 10:28 | |
gongysh | ping garyk | 10:28 |
gongysh | ping amotoki | 10:29 |
gongysh | ping safchain | 10:30 |
safchain | gongysh, Hi | 10:30 |
gongysh | safchain: how is the l3 agent HA going? | 10:31 |
safchain | gongysh, I hope and I plan to submit the first patches next week | 10:31 |
*** ijw has quit IRC | 10:32 | |
gongysh | ok thanks | 10:33 |
*** gongysh has quit IRC | 10:33 | |
*** evgenyf has joined #openstack-neutron | 10:48 | |
*** rossella_s has joined #openstack-neutron | 10:50 | |
*** Jianyong has quit IRC | 10:52 | |
*** bvandenh has joined #openstack-neutron | 10:58 | |
*** markvoelker has quit IRC | 11:04 | |
*** bvandenh has quit IRC | 11:09 | |
Qlawy | Hi, I have an issue with neutron and traffic to "outside" | 11:13 |
Qlawy | I have all-in-one installation | 11:14 |
Qlawy | I can ping vms from openstack, I can ping openstack from vms | 11:14 |
Qlawy | but | 11:14 |
Qlawy | when I try to ping vms from another machine which is in same network as openstack I cant do this | 11:14 |
Qlawy | when I want to ping that host from VM, I also cant do this | 11:15 |
Qlawy | i can bet its some kind of security-group/rules issue but no idea how to fix it | 11:16 |
Qlawy | another way to solve it is to connect that "another machine" into neutrons network ;/ | 11:17 |
openstackgerrit | Berezovsky Irena proposed a change to openstack/neutron: Add update from agent to plugin on device up https://review.openstack.org/53609 | 11:29 |
*** rossella_s has quit IRC | 11:29 | |
*** krast has quit IRC | 11:31 | |
openstackgerrit | Berezovsky Irena proposed a change to openstack/neutron: Add update from agent to plugin on device up https://review.openstack.org/53609 | 11:54 |
*** zzelle has joined #openstack-neutron | 11:56 | |
openstackgerrit | mouad benchchaoui proposed a change to openstack/neutron: Make the metadata namespace proxy transparent https://review.openstack.org/28137 | 12:00 |
openstackgerrit | Oleg Bondarev proposed a change to openstack/neutron: LBaaS: agent monitoring and instance rescheduling https://review.openstack.org/59743 | 12:03 |
*** jlibosva has quit IRC | 12:38 | |
sdague | salv-orlando: so I think https://bugs.launchpad.net/tempest/+bug/1253896 has changed its signature, I'm now seeing a lot of no route to host issues | 12:42 |
sdague | http://logs.openstack.org/30/63530/3/gate/gate-tempest-dsvm-neutron/11c8e98/console.html#_2013-12-23_12_39_15_616 | 12:47 |
*** jlibosva has joined #openstack-neutron | 12:52 | |
*** b3nt_pin has quit IRC | 12:55 | |
*** ashaikh has joined #openstack-neutron | 12:59 | |
*** yamahata has quit IRC | 13:02 | |
*** yamahata has joined #openstack-neutron | 13:03 | |
*** aymenfrikha has joined #openstack-neutron | 13:10 | |
*** yamahata has quit IRC | 13:15 | |
*** b3nt_pin has joined #openstack-neutron | 13:19 | |
*** b3nt_pin is now known as beagles | 13:20 | |
*** markwash has joined #openstack-neutron | 13:31 | |
*** yamahata has joined #openstack-neutron | 13:32 | |
*** Jianyong has joined #openstack-neutron | 13:37 | |
*** irenab_ has quit IRC | 13:47 | |
*** jdev789 has joined #openstack-neutron | 13:48 | |
*** julim has joined #openstack-neutron | 13:54 | |
openstackgerrit | Sylvain Afchain proposed a change to openstack/neutron: L3 Agent can handle many external networks https://review.openstack.org/59359 | 13:55 |
openstackgerrit | Sylvain Afchain proposed a change to openstack/neutron: L3 Agent can handle many external networks https://review.openstack.org/59359 | 13:58 |
*** WackoRobie has joined #openstack-neutron | 14:00 | |
*** yfried has quit IRC | 14:00 | |
*** yfried has joined #openstack-neutron | 14:13 | |
*** ihrachys has quit IRC | 14:16 | |
*** ihrachys has joined #openstack-neutron | 14:17 | |
*** ihrachys has quit IRC | 14:17 | |
*** ihrachys has joined #openstack-neutron | 14:17 | |
*** ihrachys has quit IRC | 14:18 | |
*** ihrachys has joined #openstack-neutron | 14:19 | |
*** alexpilotti has joined #openstack-neutron | 14:20 | |
*** markwash has quit IRC | 14:24 | |
*** ihrachys has quit IRC | 14:30 | |
*** ihrachys has joined #openstack-neutron | 14:31 | |
*** rossella_s has joined #openstack-neutron | 14:41 | |
*** bvandenh has joined #openstack-neutron | 14:46 | |
*** clev has joined #openstack-neutron | 14:48 | |
*** nijaba has quit IRC | 14:50 | |
openstackgerrit | Sylvain Afchain proposed a change to openstack/neutron: L3 Agent can handle many external networks https://review.openstack.org/59359 | 14:51 |
*** nijaba has joined #openstack-neutron | 14:57 | |
*** nijaba has quit IRC | 14:57 | |
*** nijaba has joined #openstack-neutron | 14:57 | |
*** rwsu has joined #openstack-neutron | 14:59 | |
*** rustlebee is now known as russellb | 14:59 | |
*** nijaba has quit IRC | 15:01 | |
*** nijaba has joined #openstack-neutron | 15:04 | |
*** WackoRobie has quit IRC | 15:19 | |
*** Jianyong has quit IRC | 15:28 | |
*** jlibosva has quit IRC | 15:30 | |
openstackgerrit | Xiang Hui proposed a change to openstack/neutron: Add options for all commands executing run_vsctl https://review.openstack.org/58954 | 15:32 |
*** otherwiseguy has joined #openstack-neutron | 15:39 | |
*** ashaikh has quit IRC | 15:46 | |
openstackgerrit | Andreas Jaeger proposed a change to openstack/python-neutronclient: Fix description of ListSubnet https://review.openstack.org/63619 | 15:46 |
*** rossella_s_ has joined #openstack-neutron | 15:51 | |
openstackgerrit | Aleks Chirko proposed a change to openstack/neutron: Bugfix and refactoring for ovs_lib flow mgnt methods https://review.openstack.org/58533 | 15:52 |
*** rossella_s has quit IRC | 15:52 | |
*** rossella_s_ is now known as rossella_s | 15:52 | |
*** yfried has quit IRC | 15:56 | |
*** yfried has joined #openstack-neutron | 15:57 | |
*** vkozhukalov has quit IRC | 15:57 | |
*** ihrachys has quit IRC | 16:03 | |
*** SumitNaiksatam has quit IRC | 16:08 | |
*** yfried has quit IRC | 16:08 | |
*** garyk has quit IRC | 16:13 | |
openstackgerrit | A change was merged to openstack/neutron: BigSwitch: Fixes floating IP backend updates https://review.openstack.org/63047 | 16:13 |
openstackgerrit | Andreas Jaeger proposed a change to openstack/python-neutronclient: Fix description of ListSubnet https://review.openstack.org/63619 | 16:13 |
*** WackoRobie has joined #openstack-neutron | 16:18 | |
*** SumitNaiksatam has joined #openstack-neutron | 16:30 | |
*** rossella_s_ has joined #openstack-neutron | 16:48 | |
*** rossella_s has quit IRC | 16:49 | |
*** rossella_s_ is now known as rossella_s | 16:49 | |
*** rossella_s has quit IRC | 16:58 | |
*** yamahata has quit IRC | 16:58 | |
*** jdev789 has quit IRC | 17:02 | |
*** rossella_s has joined #openstack-neutron | 17:05 | |
*** alexpilotti has quit IRC | 17:07 | |
*** mlavalle has joined #openstack-neutron | 17:11 | |
anteaya | rossella_s mlavalle beagles hello | 17:12 |
anteaya | I have to leave for a meeting very soon | 17:12 |
beagles | hi | 17:12 |
mlavalle | anteaya: hello | 17:12 |
anteaya | so if you three want to have an update | 17:12 |
anteaya | I encourage it | 17:12 |
anteaya | and I will read the backscroll when I return | 17:12 |
anteaya | how are you doing? | 17:12 |
mlavalle | we are scheduled to talk at 18:00UTC | 17:12 |
rossella_s | anteaya: hello | 17:13 |
anteaya | okay great | 17:14 |
anteaya | 45 minutes | 17:14 |
anteaya | I won't be here for it, so don't wait for me | 17:14 |
mlavalle | ok, will do and keep you posted | 17:15 |
anteaya | thanks | 17:17 |
*** bvandenh has quit IRC | 17:18 | |
*** ygbo has quit IRC | 17:27 | |
*** AndreyGrebenniko has quit IRC | 17:27 | |
*** akamyshnikova_ has quit IRC | 17:27 | |
*** skraynev has quit IRC | 17:27 | |
*** obondarev has quit IRC | 17:27 | |
*** enikanorov has quit IRC | 17:27 | |
*** skraynev has joined #openstack-neutron | 17:28 | |
*** akamyshnikova_ has joined #openstack-neutron | 17:28 | |
*** enikanorov has joined #openstack-neutron | 17:28 | |
*** obondarev has joined #openstack-neutron | 17:30 | |
openstackgerrit | Aleks Chirko proposed a change to openstack/neutron: Bugfix and refactoring for ovs_lib flow mgnt methods https://review.openstack.org/58533 | 17:30 |
*** ijw has joined #openstack-neutron | 17:30 | |
openstackgerrit | Svetlana Dobogoeva proposed a change to openstack/neutron: Added unit tests for module neutron/plugins/nicira/api_client/client.py https://review.openstack.org/59948 | 17:33 |
*** amuller has quit IRC | 17:33 | |
anteaya | enikanorov__: this is currently the #2 gate blocking bug: https://bugs.launchpad.net/nova/+bug/1254890 | 17:36 |
anteaya | dims has offered a patch which he links to at the bottom of the bug report | 17:36 |
anteaya | if you have the time to evaluate the patch and review it prior to the meeting, we might be able to get that patch merged and that bug addressed today | 17:37 |
anteaya | would be great if that happened, if you feel the patch addresses the bug | 17:37 |
anteaya | thanks | 17:37 |
enikanorov__ | anteaya: yes, i have seen this. however i think that the patch needs to be tested against jobs on which the failures are seen | 17:37 |
enikanorov__ | i mean neutron-pg and neutron-pg-isolated | 17:37 |
anteaya | enikanorov__: great | 17:37 |
enikanorov__ | and nova patches are not tested against them | 17:37 |
anteaya | okay | 17:37 |
anteaya | what needs to happen? | 17:37 |
*** yfried has joined #openstack-neutron | 17:37 | |
enikanorov__ | so in fact i don't mind letting this patch land, but i'm neither a nova expert nor a nova core | 17:38 |
enikanorov__ | i just would like to see it tested with those jobs | 17:38 |
anteaya | yfried: you would like a patch tested against https://review.openstack.org/#/c/62702/ which is failing in jenkins | 17:39 |
anteaya | yfried: what do you need to do to edit the patch so that it passes tests? | 17:39 |
anteaya | enikanorov__: that is fair | 17:39 |
anteaya | enikanorov__: I would like to stay and help you get the testing done to your satisfaction | 17:40 |
anteaya | but if I do that I will be late for an appointment | 17:40 |
anteaya | can you ask clarkb or jeblair perhaps in -infra to see if they can help you learn how to get the tests to run on or with that patch? | 17:41 |
enikanorov__ | i'll talk to dims about it | 17:41 |
anteaya | I think you may have asked on the weekend but it was very quiet on the weekend | 17:41 |
anteaya | thanks | 17:41 |
enikanorov__ | afaik that will require creating a special configuration so 'check experimental' could run neutron-pg* jobs against nova patches | 17:41 |
* anteaya is afk for the rest of the day | 17:41 | |
anteaya | enikanorov__: okay | 17:41 |
anteaya | do you know how to create a check experimental job? | 17:42 |
anteaya | this patch might help provide a bit of a template: https://review.openstack.org/#/c/62930/ | 17:42 |
*** safchain has quit IRC | 17:42 | |
*** amrit has quit IRC | 17:43 | |
*** harlowja has joined #openstack-neutron | 17:50 | |
*** ijw has quit IRC | 17:50 | |
*** rossella_s_ has joined #openstack-neutron | 17:52 | |
*** rossella_s has quit IRC | 17:54 | |
*** rossella_s has joined #openstack-neutron | 17:55 | |
*** garyk has joined #openstack-neutron | 17:56 | |
*** rossella_s_ has quit IRC | 17:57 | |
*** ijw has joined #openstack-neutron | 18:00 | |
rossella_s | mlavalle: beagles: hello! shall we start? | 18:01 |
mlavalle | rossella_s, beagles: hi | 18:01 |
mlavalle | yeah | 18:01 |
mlavalle | let's do it | 18:01 |
rossella_s | did you receive my email? | 18:01 |
beagles | yup | 18:01 |
rossella_s | https://etherpad.openstack.org/p/full-neutron-gate-tests | 18:01 |
mlavalle | rossella_s: thank you for the runs and the log files | 18:01 |
mlavalle | and the etherpad | 18:02 |
rossella_s | mlavalle: my pleasure | 18:02 |
mlavalle | At first look, the picture doesn't look that ugly | 18:02 |
rossella_s | indeed | 18:02 |
mlavalle | there are 3 or 4 tests failing consistently | 18:02 |
rossella_s | yes | 18:02 |
rossella_s | but there's always some random failure | 18:03 |
mlavalle | and 3 of them have to do with ssh to a floating ip | 18:03 |
rossella_s | yes | 18:03 |
rossella_s | what's usually the cause of that? | 18:03 |
mlavalle | right now there is a lot of activity trying to fix floating ip propagation and timeouts | 18:04 |
beagles | time seems to be the cause of that most of the time I think | 18:04 |
mlavalle | I know yfried has been working over the past few days specifically on that | 18:04 |
mlavalle | in fact, he has been working with two of the scripts we are seeing failing | 18:05 |
mlavalle | test_network_basic_ops and test_cross_tenant_connectivity | 18:05 |
beagles | mmm.. afaik that is really only that he is removing the timeout on the floating IP propagation, that won't affect the other failure modes | 18:06 |
mlavalle | so it seems to me that it is logical we are seeing these issues in this job | 18:06 |
rossella_s | beagles: what are the other failure mode? | 18:06 |
rossella_s | s/mode/modes | 18:06 |
beagles | ping and ssh themselves failing | 18:06 |
beagles | the floating IP check was something I added because the ping test was failing on me all of the time and I felt it was cart before horse to ping before we had some idea that it had actually been allocated | 18:07 |
mlavalle | ahhhh, ok | 18:07 |
beagles | but really in the end I don't think it'll make a difference.. it takes too long for the floating IP to become active in some cases | 18:07 |
beagles | and to fix the tests, we'll need to fix that performance issue... still... | 18:08 |
beagles | if it takes a long time for the vm to become active, it will still fail | 18:08 |
mlavalle | so, beagles, is that performance issue with the FLIP's and VM's something you are actively involved in? | 18:09 |
beagles | um not active in the openstack sense, sorry... but with actually booting etc. | 18:09 |
rossella_s | who's working on that? | 18:09 |
mlavalle | would you say we need to make yfried aware of what we are seeing? | 18:09 |
beagles | mlavalle: not at the moment. You could say that it is vying for top spot | 18:09 |
beagles | yfried is aware.. or should be, we discussed it at length | 18:10 |
beagles | there is an email thread as well... test_network_basic_ops and the "FloatingIPChecker" control point | 18:10 |
beagles | on openstack-dev | 18:10 |
mlavalle | so I suggest that one action is to track what is happening with this discussion on test_network_basic_ops and the "FloatingIPChecker". If that gets fixed, I think we won't see these issues in the job anymore | 18:13 |
mlavalle | does this make sense to you? | 18:13 |
rossella_s | mlavalle: I agree | 18:13 |
beagles | sure | 18:13 |
mlavalle | before we distribute tasks, let's discuss another issue | 18:14 |
rossella_s | there's another test that is consistently failing, tempest.cli.simple_read_only.test_neutron.SimpleReadOnlyNeutronClientTest.test_neutron_security_group_rule_list | 18:14 |
mlavalle | I didn't have time to analyze that one | 18:14 |
mlavalle | what did you find? | 18:14 |
rossella_s | mlavalle: sorry, I haven't checked either | 18:14 |
rossella_s | I can do that later | 18:14 |
mlavalle | that's fine, I thought you had gone through it | 18:15 |
mlavalle | so, let me bring up another issue | 18:15 |
*** vkozhukalov has joined #openstack-neutron | 18:15 | |
mlavalle | I went through the console log of the job | 18:15 |
mlavalle | you will find several instances of skipped tests | 18:16 |
mlavalle | for example tempest.api.compute.servers.test_virtual_interfaces.VirtualInterfacesTestJSON.test_list_virtual_interfaces[gate] ... skipped u'Skipped until Bug: 1183436 is resolved. | 18:17 |
*** rossella_s_ has joined #openstack-neutron | 18:17 | |
mlavalle | if you look at bug 1183436, https://bugs.launchpad.net/tempest/+bug/1183436 | 18:18 |
*** rossella_s has quit IRC | 18:19 | |
*** rossella_s_ is now known as rossella_s | 18:19 | |
mlavalle | you will see that the test method is skipped because Neutron cannot list virtual interfaces | 18:19 |
mlavalle | the way nova can | 18:19 |
mlavalle | so a few months ago, someone decided to just skip the test | 18:20 |
mlavalle | am I making sense so far? | 18:20 |
beagles | yup | 18:20 |
mlavalle | so, when we go to the nova / neutron parity conversation, what do these skips mean? | 18:21 |
mlavalle | I can assure you there are several of them | 18:21 |
mlavalle | in the tempest code | 18:21 |
beagles | yeah, I would expect so.. 1s | 18:21 |
*** WackoRobie has quit IRC | 18:21 | |
dims | enikanorov, hop onto -infra channel | 18:22 |
beagles | hrmm | 18:22 |
mlavalle | I think we need to bring this to the attention of the larger Neutron team | 18:22 |
mlavalle | and as a team make the decision whether we skip those tests or we are going to work to make those tests run | 18:22 |
beagles | well.. as part of the parity question we need to triage the bugs related to same and figure out if they need to be implemented or not | 18:23 |
beagles | in September there were several APIs in the nova-neutron interface that hadn't been implemented | 18:24 |
mlavalle | beagles: yeap, this is the conversation I want to have | 18:24 |
beagles | arosen, by virtue of volunteering to take the lead on that module, could use that info for sure | 18:25 |
mlavalle | it seems to me that the very next step in this regard is to go through the log of the job and find all the skips that might be relevant and then proceed to that triaging | 18:25 |
*** alexpilotti has joined #openstack-neutron | 18:25 | |
mlavalle | makes sense? | 18:25 |
beagles | yup | 18:25 |
mlavalle | how about you rossella_s | 18:26 |
mlavalle | ? | 18:26 |
rossella_s | mlavalle: yes, makes sense | 18:26 |
beagles | in some cases it is pretty easy on the code from, just search for NotImplementedError() exceptions :) | 18:26 |
beagles | s/from/front/ | 18:26 |
rossella_s | :) | 18:27 |
mlavalle | ok, so to summarize, we have two next steps: | 18:27 |
mlavalle | 1) follow up with the FLIP propagation / timeout fixes and make sure they are going to contribute to fix this job | 18:28 |
mlavalle | 2) Compile all the skipped tests in the job that are relevant to the Neutron / nova parity conversation and start triaging them | 18:29 |
mlavalle | am I missing something? | 18:29 |
rossella_s | 3) check the SimpleReadOnlyNeutronClientTest.test_neutron_security_group_rule_list failure | 18:29 |
mlavalle | yes. good catch | 18:30 |
mlavalle | beagles: could you help with 1? seems like you've been closer to the FLIP's conversation | 18:32 |
beagles | mlavalle: yes | 18:32 |
mlavalle | rossella_s: would you take 2 and 3? | 18:33 |
*** fouxm has quit IRC | 18:33 | |
rossella_s | mlavalle: yes, np. But I'm on holidays till the 3rd. | 18:34 |
*** fouxm has joined #openstack-neutron | 18:34 | |
beagles | actually same here.. who is *not* off until 2014? | 18:35 |
rossella_s | :D | 18:35 |
mlavalle | rossella_s: let's do something then…. I will make as much progress as I can on this from now until the 3rd | 18:35 |
mlavalle | when you come back and you take it from there | 18:35 |
rossella_s | mlavalle: ok, deal! | 18:35 |
mlavalle | I might not be able to do a lot, because I'm working on the api tests stuff | 18:35 |
mlavalle | but i'll give it a try | 18:35 |
rossella_s | I am working on that too. I hope my patches get merged today | 18:36 |
mlavalle | one final thing: I would like to update the larger team today on what we have agreed | 18:36 |
mlavalle | rossella_s: would you like to do that update? | 18:37 |
rossella_s | mlavalle: you mean at the neutron meeting? | 18:37 |
mlavalle | yes | 18:37 |
rossella_s | yes, ok. | 18:38 |
rossella_s | :) | 18:38 |
mlavalle | ok, I will put your name down in the agenda | 18:38 |
mlavalle | anything else? | 18:38 |
rossella_s | so what should I say? just a small summary of what we have been doing and our action list? | 18:38 |
mlavalle | yeah, just that we've been working on this and that we have 3 action items | 18:39 |
rossella_s | mlavalle: ok | 18:39 |
rossella_s | sound good | 18:39 |
rossella_s | s/sound/sounds | 18:39 |
mlavalle | I think we are done | 18:39 |
rossella_s | yes! bye! see you later at the neutron meeting! | 18:40 |
mlavalle | enjoy your time off!!! | 18:40 |
mlavalle | Happy holidays! | 18:40 |
rossella_s | yes! you too, I hope you are taking some days off | 18:40 |
mlavalle | yeah, some days | 18:40 |
mlavalle | :-) | 18:40 |
rossella_s | enjoy then! | 18:41 |
mlavalle | beagles: talk to you later | 18:41 |
beagles | cheers | 18:41 |
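
Action item 2 above (compiling the skipped tests from the job's console log) could be started with a small script along these lines. This is only a sketch: the line format is assumed from the single console-log excerpt quoted in the discussion, and `skips_by_bug`, `SKIP_RE`, and `BUG_RE` are hypothetical names, not part of tempest.

```python
import re
from collections import defaultdict

# Assumed format, based on the one excerpt quoted in the log:
#   <test id> ... skipped u'<reason>
SKIP_RE = re.compile(r"^(?P<test>\S+)\s+\.\.\.\s+skipped\s+u?'(?P<reason>.*)$")
# Pulls a Launchpad bug number out of a reason such as "Bug: 1183436".
BUG_RE = re.compile(r"Bug:?\s*#?(\d+)", re.IGNORECASE)


def skips_by_bug(lines):
    """Group skipped test ids by the bug number named in the skip reason."""
    grouped = defaultdict(list)
    for line in lines:
        match = SKIP_RE.match(line.strip())
        if match is None:
            continue  # not a skip line (passed, failed, or unrelated output)
        bug = BUG_RE.search(match.group("reason"))
        key = bug.group(1) if bug else "no-bug"
        grouped[key].append(match.group("test"))
    return dict(grouped)
```

Feeding it the lines of a console log (for example `skips_by_bug(open(path))`, with `path` pointing at a downloaded log) gives one list of test ids per bug, which is a reasonable starting point for the triage.
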
*** markmcclain has joined #openstack-neutron | 18:46 | |
*** ijw has quit IRC | 18:47 | |
*** ijw has joined #openstack-neutron | 18:47 | |
*** rossella_s_ has joined #openstack-neutron | 18:51 | |
*** rossella_s has quit IRC | 18:52 | |
*** rossella_s_ is now known as rossella_s | 18:52 | |
*** fouxm has quit IRC | 18:54 | |
*** rossella_s_ has joined #openstack-neutron | 18:59 | |
*** rossella_s has quit IRC | 19:02 | |
*** rossella_s_ is now known as rossella_s | 19:02 | |
yfried | mlavalle: beagles: can you guys fill me in on your discussion? Are you aware of this bug https://launchpad.net/bugs/1262529 and these patches https://review.openstack.org/#/c/63637/ https://review.openstack.org/#/c/63627/ | 19:04 |
*** otherwiseguy has quit IRC | 19:04 | |
beagles | yfried: yup yup and yup | 19:04 |
yfried | I would like to help, especially if this is an issue I'm already part of, but participating during this hour is really hard for me | 19:05 |
beagles | yfried: basically it was a question regarding timeout failures that were occurring in the experimental queue with various connectivity failures | 19:05 |
yfried | beagles: they are completely untouched. I was hoping to have your opinion on them | 19:05 |
beagles | yfried: in some cases it was ssh timing out, in one case the floating IP propagation test was timing out | 19:06 |
beagles | yfried: yeah, I'll get to them today | 19:08 |
*** ijw has quit IRC | 19:09 | |
yfried | beagles: mlavalle: I'm mostly unavailable now, so I would appreciate it if you could send me your conclusions. I would like to participate in solving this issue. I will make an effort to go over this discussion later (probably tomorrow) but if you could post it to the mailing list as part of the thread it would be great. | 19:10 |
*** rossella_s_ has joined #openstack-neutron | 19:12 | |
mlavalle | yfried: will do, thanks for your help | 19:12 |
*** ijw has joined #openstack-neutron | 19:12 | |
*** garyk has quit IRC | 19:12 | |
beagles | okay, who has seen "Duplicate test id detected......." in tempest.log when running testr? | 19:13 |
beagles | bloody annoying | 19:13 |
*** rossella_s has quit IRC | 19:14 | |
*** rossella_s_ is now known as rossella_s | 19:14 | |
*** yamahata__ has joined #openstack-neutron | 19:19 | |
sdague | beagles: it's usually always correct. You have an autosave file getting picked up or something? | 19:19 |
beagles | sdague: hah, i bet that's it | 19:20 |
*** geekinutah1 has joined #openstack-neutron | 19:23 | |
beagles | sdague: mmm... not quite it | 19:25 |
*** marios_ has joined #openstack-neutron | 19:25 | |
*** ijw has quit IRC | 19:25 | |
*** ijw has joined #openstack-neutron | 19:26 | |
beagles | sdague: tox -esmoke seems to run, but testr based invocations do not | 19:26 |
*** yamahata_ has quit IRC | 19:26 | |
*** geekinutah has quit IRC | 19:26 | |
*** marios has quit IRC | 19:26 | |
*** ijw_ has joined #openstack-neutron | 19:30 | |
*** ijw has quit IRC | 19:30 | |
sdague | beagles: ValueError: Duplicate test id detected: tempest.api.compute.test_quotas.QuotasTestJSON.test_compare_tenant_quotas_with_default_quotas[gate,smoke] | 19:32 |
sdague | ? | 19:32 |
beagles | sdague: Duplicate test id detected: tempest.api.compute.test_authorization.AuthorizationTestJSON.test_change_password_for_alt_account_fails | 19:32 |
*** rossella_s_ has joined #openstack-neutron | 19:34 | |
sdague | hmmm... I'm getting a different duplicate error on list here | 19:36 |
*** geekinutah1 is now known as geekinutah | 19:36 | |
*** rossella_s has quit IRC | 19:38 | |
*** rossella_s_ is now known as rossella_s | 19:38 | |
beagles | sdague: I just tried hacking testsuite.py to dump the list to stdout.. and after a bit of hacking about, there is definitely a bit of duplication where it isn't expected | 19:46 |
sdague | yeh, wonder how close to being awake lifeless is | 19:56 |
sdague | this doesn't seem right | 19:56 |
sdague | I even went and purged all my pyc files | 19:56 |
beagles | same here | 19:57 |
*** aymenfrikha has quit IRC | 19:59 | |
beagles | sdague: code in testtools.py was recently (is Oct 25 recent?) changed | 20:05 |
*** armax has joined #openstack-neutron | 20:06 | |
beagles | for py 3.3 compat issue | 20:06 |
sdague | yeh, the fact that it works under tox is interesting though | 20:08 |
beagles | sdague: mmm yeah | 20:09 |
lifeless | beagles: sdague: I'm just stepping out | 20:11 |
sdague | lifeless: ok, when are you stepping back? | 20:11 |
lifeless | try testtools 0.9.34 and failing that shoot me a mail and I can look later | 20:11 |
lifeless | sdague: in 3 weeks :) | 20:11 |
*** ijw has joined #openstack-neutron | 20:11 | |
sdague | already using testtools 0.9.34 | 20:12 |
lifeless | sdague: [I will have a look my evening - about 9 hours from now probably] | 20:12 |
lifeless | however, that test is robust, so I think it's basically an issue in tempest; may be the loader logic isn't quite right, or another possible cause | 20:12 |
lifeless | is importing a class rather than a module | 20:13 |
lifeless | so file a) test_foo.py | 20:13 |
lifeless | class TestBar(TestSuite):.... | 20:13 |
lifeless | file b | 20:13 |
lifeless | from test_foo import TestBar | 20:13 |
lifeless | will cause TestBar to be instantiated twice in two files. | 20:13 |
lifeless | HTH, must run now. | 20:13 |
sdague | ok | 20:13 |
sdague | well I can 100% reproduce this issue | 20:15 |
*** ijw_ has quit IRC | 20:15 | |
sdague | oh, I think it's the eval error problem | 20:17 |
sdague | where there is a bad test somewhere, and because testr / subunit / discover swallow all the useful output it shows up as a duplicate test id error | 20:17 |
beagles | oh | 20:18 |
beagles | hah | 20:18 |
*** ijw has quit IRC | 20:19 | |
*** ijw has joined #openstack-neutron | 20:20 | |
beagles | sdague: python-mock | 20:23 |
*** gdubreui has joined #openstack-neutron | 20:23 | |
beagles | sdague: imported indirectly through tempest/tests/test_rest_client.py | 20:23 |
beagles | sdague: installing on my system resolved the issue at any rate | 20:23 |
sdague | ah, yeh, that's why we created the tox targets for various test sets | 20:25 |
beagles | yup, makes sense | 20:25 |
sdague | I really just wish the toolchain would fail with messages that make debugging possible :) | 20:25 |
beagles | sdague: +100 | 20:25 |
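
The failure mode lifeless describes (a test class imported into a second module being collected twice) can be reproduced with the standard library alone. What follows is a minimal sketch that assumes plain `unittest` loading rather than testr/subunit discovery; the module names and helper functions are illustrative only. It does not reproduce the root cause actually found above, which was a missing python-mock install.

```python
import types
import unittest
from collections import Counter


class TestBar(unittest.TestCase):
    def test_something(self):
        self.assertTrue(True)


def make_module(name, **attrs):
    """Build an in-memory module so the sketch needs no files on disk."""
    module = types.ModuleType(name)
    module.__dict__.update(attrs)
    return module


# "file a" defines the class; "file b" only does `from test_foo import TestBar`.
test_foo = make_module("test_foo", TestBar=TestBar)
test_other = make_module("test_other", TestBar=TestBar)


def iter_tests(suite):
    """Flatten nested suites into individual test cases."""
    for item in suite:
        if isinstance(item, unittest.TestSuite):
            yield from iter_tests(item)
        else:
            yield item


def duplicate_ids(modules):
    """Return the test ids collected more than once across the modules."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite(loader.loadTestsFromModule(m) for m in modules)
    counts = Counter(test.id() for test in iter_tests(suite))
    return sorted(test_id for test_id, n in counts.items() if n > 1)
```

Calling `duplicate_ids([test_foo, test_other])` reports the `TestBar.test_something` id once as a duplicate, because the loader collects every `TestCase` subclass it finds in a module's namespace, whether it was defined or merely imported there. Importing the module instead of the class avoids the second collection.
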
*** rossella_s_ has joined #openstack-neutron | 20:32 | |
*** ijw_ has joined #openstack-neutron | 20:33 | |
*** rossella_s has quit IRC | 20:34 | |
*** rossella_s_ is now known as rossella_s | 20:34 | |
*** ijw has quit IRC | 20:36 | |
*** rossella_s_ has joined #openstack-neutron | 20:43 | |
*** rossella_s has quit IRC | 20:47 | |
*** rossella_s_ is now known as rossella_s | 20:47 | |
*** amotoki_ has joined #openstack-neutron | 21:00 | |
*** WackoRobie has joined #openstack-neutron | 21:02 | |
*** amotoki is now known as __amotoki__ | 21:06 | |
*** amotoki_ is now known as amotoki | 21:06 | |
*** WackoRobie has quit IRC | 21:07 | |
*** carl_baldwin has joined #openstack-neutron | 21:12 | |
*** irenab_ has joined #openstack-neutron | 21:14 | |
*** ijw_ has quit IRC | 21:28 | |
*** vkozhukalov has quit IRC | 21:35 | |
*** mestery has joined #openstack-neutron | 21:50 | |
openstackgerrit | Claudiu Belu proposed a change to openstack/neutron: Fixes Hyper-V ceilometer metrics enabling after service restart https://review.openstack.org/63840 | 22:00 |
markmcclain | dkehn, salv-orlando, marun: picking up where we left off | 22:01 |
salv-orlando | I'm here | 22:02 |
markmcclain | as salv-orlando pointed out we do have a tricky problem | 22:02 |
dkehn | yes | 22:02 |
markmcclain | how to avoid branches | 22:02 |
salv-orlando | patch the code to run the schema changes at startup? idk if that can possibly work | 22:03 |
markmcclain | not excited about magic code at start-up | 22:03 |
marun | hmm, that begs the question. is it possible to add new migrations (i.e. change schema) to stable via a backport? | 22:04 |
salv-orlando | or add a havana-2 revision between havana and the first icehouse revision | 22:04 |
salv-orlando | and tell users to do a neutron-db-manage upgrade head? | 22:04 |
markmcclain | I think we insert a havana2 migration | 22:04 |
markmcclain | that fixes the problem for stable | 22:04 |
salv-orlando | marun: db migrations in stable are -2'ed unless there is a seriously good reason for them | 22:04 |
marun | salv-orlando: is eventual consistency a seriously good reason enough? | 22:05 |
marun | salv-orlando: i'm not trying to be a dick, it is an honest question. | 22:05 |
markmcclain | for those running trunk we could create a smarter revision that checks for the presence of the proper table | 22:05 |
markmcclain | so that way we can fix it both places | 22:06 |
salv-orlando | marun: I think the question to ask is whether there is a fundamental use case broken rather than whether there's eventual consistency or not. But let's first finish this other discussion | 22:06 |
marun | salv-orlando: ok | 22:06 |
*** amotoki has quit IRC | 22:06 | |
markmcclain | the havana-2 migration would need the same smarts | 22:06 |
markmcclain | we'd just have to insert into two places on the timeline | 22:06 |
salv-orlando | markmcclain: how can we avoid then to run it twice for new deployments? | 22:07 |
dkehn | markmcclain: who or what checks for the proper table: the migration or something else? | 22:07 |
salv-orlando | with an explicit check in the 2nd? | 22:07 |
markmcclain | I think the migration has to check to make sure the state exists to modify the tables | 22:07 |
markmcclain | if not then skip | 22:07 |
markmcclain | that way the 2nd time alembic sees it | 22:08 |
markmcclain | the migration would be a no-op | 22:08 |
marun | +1 | 22:08 |
*** rossella_s has quit IRC | 22:08 | |
*** ijw has joined #openstack-neutron | 22:08 | |
marun | that's a nice property of code-based migrations - they can be smart | 22:08 |
markmcclain | for havana.2 release we'll just have to update the release notes | 22:09 |
*** irenab_ has quit IRC | 22:09 | |
markmcclain | I'd be happy to work on this | 22:09 |
markmcclain | dkehn: sorry didn't clarify that alembic would check | 22:10 |
salv-orlando | ok sounds like it's sorted? | 22:10 |
dkehn | markmcclain: thx, I guess the question is why aren't we doing it now? just curious | 22:10 |
salv-orlando | I think both migrations should also be aware that if the lbaas plugin is enabled they should not run | 22:10 |
markmcclain | maybe.. I'd prefer for the migration to be aware of the requirement that the table exist at the specific moment in the timeline if ml2 is enabled | 22:11 |
marun | salv-orlando: does that imply that the plugin's migration isn't smart enough to skip if the state is already as desired? | 22:12 |
markmcclain | so in case there is some other driver migration we missed this one will still correct the db | 22:12 |
salv-orlando | makes sense to me. | 22:12 |
markmcclain | marun: right now.. no | 22:12 |
markmcclain | alembic's smarts are at migration creation time | 22:12 |
marun | markmcclain: i guess it would be a lot of work to always do that :/ | 22:12 |
markmcclain | so I'll be adding a bit of defensive code | 22:13 |
*** ijw has quit IRC | 22:13 | |
markmcclain | unless someone thinks I'm heading down the wrong path | 22:13 |
markmcclain | I'll work on coding this up and add you all as reviewers :) | 22:13 |
markmcclain | now onto the eventual consistency question…. | 22:14 |
*** armax has left #openstack-neutron | 22:14 | |
dkehn | markmcclain: in the for-what-it's-worth category, enabling q-lbaas does get devstack up and working; waiting on all the exercises | 22:15 |
marun | I have heard lots of complaints that Neutron falls over when booting large numbers of vm's concurrently. | 22:16 |
dkehn | markmcclain: still fails on the floating_ips part | 22:16 |
marun | Given that scalability is an issue, having multiple workers for api and rpc is a matter of when, not if. | 22:17 |
markmcclain | marun: right we run 10+ instances in our cloud | 22:17 |
markmcclain | marun: I've got to clean up a patch that separates the api and rpc processes | 22:17 |
markmcclain | so that they can be scaled independently | 22:18 |
markmcclain | right now their entangled | 22:18 |
markmcclain | s/their/they're/ | 22:18 |
marun | scaling them independently is the starting point | 22:18 |
marun | but once that is done, the dhcp agent, at least, is a walking zombie of race conditions | 22:18 |
marun | so eventual consistency is needed to make sure that those races don't exhibit | 22:19 |
markmcclain | I wouldn't limit to dhcp.. l3 has the same basic workflow | 22:19 |
marun | i'm just talking about what I know, hopefully the solution for dhcp can be applied generally. | 22:19 |
salv-orlando | well, the l2 agent has no workflow at all then. | 22:19 |
marun | So, that said, the issue isn't that we're going to have eventually consistent agents | 22:20 |
marun | the issue is whether the patches that implement it can/should be backported to havana | 22:20 |
marun | i have some people clamoring for fixes for grizzly, but that seems like a bridge too far | 22:21 |
markmcclain | marun: if they fix bugs and can be deployed without migrations or invocation changes then I think we should consider them for backport | 22:21 |
marun | it will require some minor db changes - an extra field to the networks table should be it | 22:21 |
markmcclain | then sadly folks will need Icehouse to enable it | 22:21 |
marun | :( | 22:22 |
marun | that really sucks | 22:22 |
markmcclain | given current velocity of new features | 22:22 |
marun | is there no way at least to give people the option? | 22:22 |
markmcclain | marun new migrations aren't allowed | 22:22 |
marun | It's pretty embarrassing to be working on a project that is so broken in grizzly and havana by default, and not be allowed to fix it. | 22:23 |
marun | But I guess them's the breaks. | 22:23 |
markmcclain | marun: yes there are bugs but that is the nature of releasing every 6 mos | 22:23 |
markmcclain | not sure how we provide an optional backport | 22:24 |
markmcclain | it's also one of the reasons several different deployments track trunk | 22:24 |
marun | Well, for RH we have to support what we ship. | 22:24 |
salv-orlando | un0 | 22:24 |
markmcclain | marun: I understand that | 22:24 |
marun | grizzly/havana basically can't scale with the oss plugins, so I guess we'll have to maintain our own branches until icehouse ships | 22:25 |
salv-orlando | unless one deems the current code so broken that a rewrite is needed. Then the backport will be easy ;) | 22:25 |
markmcclain | marun: yeah not excited about random forks of backports either | 22:25 |
markmcclain | because that makes bug triaging harder | 22:25 |
markmcclain | will have to think about how we can approach it | 22:26 |
marun | markmcclain: yes, food for thought. | 22:26 |
markmcclain | at best Havana is as far back as I'd like to go | 22:26 |
marun | markmcclain: I would agree with that. Maintaining grizzly isn't in the cards. | 22:26 |
marun | markmcclain: So, to keep in mind - eventual consistency should require at most one field addition per agent-type. Possibly fewer. | 22:27 |
markmcclain | yeah.. I think the first step is to get this fixed in trunk | 22:27 |
marun | markmcclain: understood. | 22:27 |
markmcclain | then we'll be able to definitively catalog the changes needed | 22:27 |
markmcclain | which will probably tell us whether we can make a reasonable backport available | 22:28 |
marun | agreed. | 22:29 |
markmcclain | I'm afraid that if we begin this discussion without specifics it will get derailed quickly | 22:29 |
marun | ok, that's all. thank you for giving me some perspective on the issue. | 22:30 |
markmcclain | because folks will fill the gaps how they see fit | 22:30 |
*** yamahata has joined #openstack-neutron | 22:30 | |
marun | I wasn't expecting anything definitive, I just wanted to know in general some of the challenges ahead. | 22:30 |
markmcclain | ok | 22:30 |
markmcclain | migrations and changes in invocation, configuration are the biggest no-nos | 22:31 |
markmcclain | because those require changes to deployment tools | 22:31 |
marun | right | 22:31 |
markmcclain | otherwise logic changes are acceptable in backports | 22:31 |
markmcclain | we're really not changing the API so we should be good on that front | 22:32 |
marun | and if there is sufficient perceived demand for a given patch, even if it does break deployment tools, then it's possible at least. | 22:32 |
markmcclain | yeah if the fixes are good enough | 22:32 |
markmcclain | we can consider ways to enable it optionally in backports | 22:33 |
markmcclain | the migrations also cause the same problems we were discussing with ml2 bugs | 22:33 |
marun | i'm afraid i'm still not entirely clear on the migrations issue… | 22:33 |
markmcclain | so adding that fix is actually a good test for how we could make changes like this | 22:33 |
marun | is it that we can't interject a migration in the sequence? | 22:33 |
markmcclain | so we cannot change or alter migrations once we release | 22:34 |
markmcclain | the problem is some are running the latest release and others are running trunk | 22:34 |
markmcclain | so the insertion point is different for those users | 22:34 |
marun | that would seem like a failing of the migration mechanism | 22:34 |
marun | so long as a migration can be made no-op if it is not required | 22:34 |
marun | and then it can be injected earlier in the chain without issue (if sufficiently orthogonal) | 22:35 |
markmcclain | well we're going to have to add code to make those migrations sufficiently optional | 22:35 |
markmcclain | the ml2 fix will be a good proving ground for this | 22:35 |
marun | ah, ok. | 22:36 |
marun | ok, that's all from me. | 22:36 |
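
The "smart" migration markmcclain describes checks the schema first and becomes a no-op when the work is already done, so the same revision can sit at two points on the timeline. Here is a minimal sketch of that idea, using the standard-library `sqlite3` module as a stand-in for Alembic. The table name `example_ml2_table`, its columns, and the function names are hypothetical; a real Neutron migration would be written with Alembic's own operations.

```python
import sqlite3

TABLE = "example_ml2_table"  # hypothetical name, for illustration only


def table_exists(conn, name):
    """Check the SQLite catalog for a table with the given name."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None


def upgrade(conn):
    """Create the table only if it is missing.

    Returns True when the schema was changed and False when the
    migration was a no-op, so running it a second time is harmless.
    """
    if table_exists(conn, TABLE):
        return False  # state already as desired: skip
    conn.execute(
        "CREATE TABLE " + TABLE + " (id TEXT PRIMARY KEY, network_id TEXT)"
    )
    return True
```

A deployment that hits the revision at the havana-2 insertion point creates the table there; when the same logic is reached again at the Icehouse insertion point, `upgrade` sees the table and returns without touching the schema.
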
*** carl_baldwin has quit IRC | 22:37 | |
*** mlavalle has quit IRC | 22:37 | |
*** HenryG has quit IRC | 22:38 | |
*** jaypipes has joined #openstack-neutron | 22:38 | |
*** HenryG has joined #openstack-neutron | 22:38 | |
*** HenryG has quit IRC | 22:43 | |
*** HenryG has joined #openstack-neutron | 22:44 | |
*** zzelle has quit IRC | 22:50 | |
*** julim has quit IRC | 22:56 | |
*** clev has quit IRC | 22:57 | |
*** carl_baldwin has joined #openstack-neutron | 23:00 | |
*** carl_baldwin has quit IRC | 23:03 | |
*** carl_baldwin2 has joined #openstack-neutron | 23:03 | |
*** markwash has joined #openstack-neutron | 23:09 | |
*** carl_baldwin2 has quit IRC | 23:12 | |
*** yamahata has quit IRC | 23:20 | |
*** markmcclain has quit IRC | 23:24 | |
*** markmcclain has joined #openstack-neutron | 23:26 | |
*** ijw has joined #openstack-neutron | 23:29 | |
*** harlowja_ has joined #openstack-neutron | 23:29 | |
*** harlowja has quit IRC | 23:29 | |
jaypipes | salv-orlando: ahoy. anything I can do to assist with https://bugs.launchpad.net/tempest/+bug/1253896? It's blocking some patches I have for Keystone and have a few cycles if you need assistance. | 23:31 |
anteaya | jaypipes: this patch is about to merge | 23:32 |
anteaya | https://review.openstack.org/#/c/63641/ | 23:32 |
salv-orlando | jaypipes: analyse log files and find root causes. What is reported in bug 1253896 is not a bug but rather a failure manifestation. We have found several root causes now, but looking at failure reports I know there are others | 23:32 |
anteaya | pray it addressed 1253896 | 23:33 |
salv-orlando | There is also a patch from marun which addresses another failure mode of bug 1253896 | 23:33 |
*** ijw has quit IRC | 23:33 | |
jaypipes | salv-orlando: ok... will do my best to dig around. | 23:33 |
anteaya | salv-orlando: awesome, which patch? | 23:33 |
*** ijw has joined #openstack-neutron | 23:34 | |
salv-orlando | https://review.openstack.org/#/c/61168/ | 23:34 |
salv-orlando | anteaya: ^ | 23:34 |
* anteaya clicks | 23:34 | |
salv-orlando | marun did not add to the commit message this bug, but the patch will prevent failures in dhcp setup because of missed notifications | 23:34 |
*** SumitNaiksatam has quit IRC | 23:35 | |
salv-orlando | that failure will manifest in tempest as ssh timeout, but the truth is that the ssh connection fails because the vm never got an ip | 23:35 |
jaypipes | salv-orlando: patch now merged. | 23:36 |
jaypipes | salv-orlando: the first patch, that is.. to devstack. | 23:36 |
salv-orlando | jaypipes: good. We'll see if failure rate goes down. The next 7-8 days won't count however | 23:37 |
*** markwash has quit IRC | 23:37 | |
jaypipes | salv-orlando: how come? | 23:37 |
salv-orlando | jaypipes: christmas and new years' day | 23:38 |
jaypipes | salv-orlando: oh, I see what you mean... yes :) | 23:38 |
openstackgerrit | A change was merged to openstack/neutron: Send DHCP notifications regardless of agent status https://review.openstack.org/61168 | 23:40 |
openstackgerrit | A change was merged to openstack/neutron: Imported Translations from Transifex https://review.openstack.org/63056 | 23:40 |
jaypipes | salv-orlando: and now marun's patch is merged... :) | 23:41 |
salv-orlando | ok | 23:41 |
jaypipes | salv-orlando: what is the difference between an agent's admin_state_up and active flags? | 23:44 |
*** nijaba has quit IRC | 23:45 | |
jaypipes | salv-orlando: oh, never mind... see it now... | 23:45 |
jaypipes | sorry for the noise! | 23:45 |
salv-orlando | jaypipes: cool | 23:45 |
salv-orlando | no worries | 23:45 |
*** nijaba has joined #openstack-neutron | 23:45 | |
*** nijaba has quit IRC | 23:45 | |
*** nijaba has joined #openstack-neutron | 23:45 | |
*** ijw_ has joined #openstack-neutron | 23:52 | |
*** ijw has quit IRC | 23:56 |