openstackgerrit | Aaron Rosen proposed openstack/networking-ovn: Disable more IPv6 tests. https://review.openstack.org/270552 | 00:12 |
---|---|---|
*** chandrav has joined #openstack-neutron-ovn | 00:15 | |
*** chandrav has quit IRC | 00:44 | |
*** flaviof has quit IRC | 01:02 | |
*** salv-orlando has quit IRC | 01:09 | |
*** gangil has quit IRC | 01:14 | |
*** gangil has joined #openstack-neutron-ovn | 01:16 | |
*** gangil has quit IRC | 01:24 | |
openstackgerrit | Aaron Rosen proposed openstack/networking-ovn: random test https://review.openstack.org/270502 | 01:49 |
*** gangil has joined #openstack-neutron-ovn | 01:54 | |
*** gangil has quit IRC | 01:58 | |
*** gangil has joined #openstack-neutron-ovn | 02:03 | |
*** gangil has quit IRC | 02:07 | |
*** gangil has joined #openstack-neutron-ovn | 02:10 | |
*** arosen has quit IRC | 02:13 | |
*** roeyc has quit IRC | 02:17 | |
*** gangil has quit IRC | 02:18 | |
*** fzdarsky has quit IRC | 02:19 | |
*** yamamoto has joined #openstack-neutron-ovn | 02:23 | |
*** fzdarsky has joined #openstack-neutron-ovn | 02:25 | |
*** yamamoto has quit IRC | 02:37 | |
*** yamamoto has joined #openstack-neutron-ovn | 02:37 | |
*** roeyc has joined #openstack-neutron-ovn | 02:41 | |
*** yamamoto has quit IRC | 02:46 | |
*** roeyc has quit IRC | 02:49 | |
*** gangil has joined #openstack-neutron-ovn | 03:07 | |
*** yamamoto_ has joined #openstack-neutron-ovn | 03:39 | |
*** armax has quit IRC | 03:42 | |
openstackgerrit | li,chen proposed openstack/networking-ovn: HOST_IP is missing in computenode-local.conf.sample https://review.openstack.org/270594 | 04:01 |
*** armax has joined #openstack-neutron-ovn | 04:20 | |
*** chenli has joined #openstack-neutron-ovn | 04:34 | |
chenli | hello guys, I have built a 2-node devstack environment with OVN enabled, following http://docs.openstack.org/developer/networking-ovn/testing.html. | 04:35 |
chenli | Then I created 10 instances | 04:35 |
chenli | But none of the instances on the second node get an IP address from DHCP | 04:35 |
chenli | can anyone help me? | 04:35 |
*** gangil has quit IRC | 04:42 | |
*** mamulsow has quit IRC | 04:48 | |
*** mamulsow has joined #openstack-neutron-ovn | 04:49 | |
*** gangil has joined #openstack-neutron-ovn | 04:51 | |
*** mamulsow has quit IRC | 04:54 | |
*** roeyc has joined #openstack-neutron-ovn | 05:33 | |
*** numans has joined #openstack-neutron-ovn | 06:17 | |
*** chandrav has joined #openstack-neutron-ovn | 06:22 | |
*** chandrav has quit IRC | 07:17 | |
*** stac- has quit IRC | 07:23 | |
*** fzdarsky has quit IRC | 07:27 | |
*** fzdarsky has joined #openstack-neutron-ovn | 07:27 | |
*** chandrav has joined #openstack-neutron-ovn | 07:33 | |
*** stac has joined #openstack-neutron-ovn | 07:47 | |
*** numans has quit IRC | 07:52 | |
*** roeyc has quit IRC | 07:57 | |
*** roeyc has joined #openstack-neutron-ovn | 07:59 | |
*** gangil has quit IRC | 08:15 | |
*** numans has joined #openstack-neutron-ovn | 08:23 | |
*** ajo_ is now known as ajo | 08:24 | |
*** chandrav has quit IRC | 08:28 | |
*** stac has quit IRC | 08:31 | |
*** chandrav has joined #openstack-neutron-ovn | 08:32 | |
*** armax has quit IRC | 08:36 | |
*** stac has joined #openstack-neutron-ovn | 08:40 | |
*** chandrav has quit IRC | 09:33 | |
*** openstackgerrit has quit IRC | 10:02 | |
*** openstackgerrit has joined #openstack-neutron-ovn | 10:02 | |
*** roeyc has quit IRC | 10:06 | |
*** roeyc has joined #openstack-neutron-ovn | 10:12 | |
*** roeyc has quit IRC | 10:44 | |
*** ajo has quit IRC | 11:24 | |
*** ajo has joined #openstack-neutron-ovn | 11:25 | |
*** yamamoto_ has quit IRC | 11:31 | |
*** chenli has quit IRC | 12:46 | |
*** chenli has joined #openstack-neutron-ovn | 12:47 | |
*** flaviof has joined #openstack-neutron-ovn | 13:01 | |
*** chenli has quit IRC | 13:09 | |
*** rtheis has joined #openstack-neutron-ovn | 13:33 | |
*** yamamoto has joined #openstack-neutron-ovn | 13:48 | |
*** yamamoto has quit IRC | 13:49 | |
*** yamamoto has joined #openstack-neutron-ovn | 13:52 | |
*** dslevin has quit IRC | 13:54 | |
* mestery yawns and sips coffee | 14:10 | |
Sam-I-Am | mornings | 14:10 |
Sam-I-Am | i have not coffeed yet | 14:10 |
mestery | Sam-I-Am: Get on that man! | 14:11 |
Sam-I-Am | ovsing before coffee is dangerous | 14:11 |
Sam-I-Am | yeah, i'm slackin' | 14:11 |
* Sam-I-Am looks at pile of things | 14:11 | |
mestery | :) | 14:12 |
Sam-I-Am | gates are still broken | 14:13 |
mestery | Yes | 14:14 |
Sam-I-Am | think we need to figure out how to make the ovn gate actually test provider nets | 14:16 |
mestery | Sam-I-Am: The Vagrant updates I have out for review do that, so it's possible | 14:16 |
mestery | But yes, we need to do that | 14:16 |
Sam-I-Am | but last time i looked into that, i got deep into neutron-legacy and almost jumped off my roof | 14:17 |
*** stac has quit IRC | 14:18 | |
Sam-I-Am | it looks easy... tell ovn there's a public net, put a flat neutron net there, drop an ip on br-ex (or whatever) ... everything else is the same | 14:18 |
Sam-I-Am | right now the gate puts qg* into br-ex which bypasses... a lot of things | 14:19 |
mestery | True | 14:19 |
Sam-I-Am | which is how i came across the larger neutron gate problem | 14:19 |
Sam-I-Am | the only reason it works at all is because the l3 agent has external_network_bridge=br-ex | 14:19 |
Sam-I-Am | should be fun when that option finally goes away | 14:19 |
Sam-I-Am | if you look at the nets, the pub net is type vxlan... yet it works... because of that option | 14:20 |
mestery | Heh | 14:20 |
mestery | :) | 14:20 |
Sam-I-Am | thats when i spent too long in the devstack code and went blind | 14:20 |
*** yamamoto has quit IRC | 14:28 | |
russellb | misc idea ... a bug tag for bugs that require work in OVN itself (not just networking-ovn) | 14:33 |
russellb | but i can't think of a good name for it | 14:33 |
russellb | naming is hard. | 14:34 |
*** stac has joined #openstack-neutron-ovn | 14:34 | |
mestery | lol | 14:35 |
mestery | russellb: ovn-ovs? ovn-github? ovn-upstream? | 14:35 |
russellb | ovn-upstream! | 14:35 |
russellb | perfect. | 14:35 |
russellb | thanks. | 14:35 |
mestery | :) | 14:36 |
russellb | i even made it an official tag in launchpad. | 14:36 |
russellb | watch out! | 14:37 |
Sam-I-Am | wooo | 14:37 |
Sam-I-Am | russellb: any more suggestions for my ovn gate logic patch? | 14:38 |
russellb | not right now, no | 14:39 |
russellb | oh | 14:39 |
Sam-I-Am | ok. didnt know if we wanted to do anything re: dougs comment | 14:39 |
russellb | i commented | 14:39 |
Sam-I-Am | i think i fixed your comment | 14:39 |
Sam-I-Am | unless you commented again | 14:39 |
russellb | i suggested we remove tests from the regex | 14:39 |
russellb | as suggested by doug | 14:39 |
*** dslev has joined #openstack-neutron-ovn | 14:39 | |
Sam-I-Am | i did that | 14:39 |
Sam-I-Am | i'm quick. like a rabbit, but not like rabbitmq. | 14:39 |
*** regXboi has joined #openstack-neutron-ovn | 14:41 | |
russellb | heh, +1 | 14:54 |
Sam-I-Am | thx | 14:55 |
Sam-I-Am | that file is a mess though | 14:55 |
Sam-I-Am | russellb: what impact does the mtu on the system-level bridge interface have? | 14:56 |
Sam-I-Am | if any | 14:57 |
russellb | not sure | 14:57 |
Sam-I-Am | can one set the mtu of an ovs bridge? | 14:57 |
Sam-I-Am | i'm trying my mtu experiments on ovs now | 14:57 |
Sam-I-Am | seems like an ovs bridge uses the mtu of the smallest interface on the bridge, which makes sense | 14:58 |
*** yamamoto has joined #openstack-neutron-ovn | 15:03 | |
mestery | russellb: Looks like blp and han provided nice feedback on your OVN provider patches! Yay! | 15:12 |
russellb | mestery: yep, i need to re-spin them today | 15:12 |
mestery | #awesomesauce | 15:13 |
russellb | :) | 15:13 |
*** shettyg has joined #openstack-neutron-ovn | 15:19 | |
*** mamulsow has joined #openstack-neutron-ovn | 15:23 | |
regXboi | russellb: ping - shouldn't ovn-controller be dropping a PID file in /usr/local/var/run/openvswitch so that ovs-appctl will work? | 15:38 |
*** fzdarsky has quit IRC | 15:39 | |
russellb | regXboi: is it not? | 15:39 |
*** fzdarsky has joined #openstack-neutron-ovn | 15:39 | |
russellb | oh, we're not running with --pid-file | 15:39 |
russellb | same for ovn-northd | 15:39 |
russellb | wanna patch devstack/plugin.sh ? | 15:40 |
russellb | .. and put it in the queue of stuff we can't merge yet :( | 15:40 |
mestery | Man, we gotta get the dsvm jobs fixed | 15:40 |
mestery | so many patches | 15:40 |
regXboi | sure, I'm on it | 15:40 |
mestery | o_o | 15:40 |
russellb | i have some ideas to chase, but long todo list too | 15:40 |
mamulsow | russellb: let me know when you want to start looking at ovn-controller again | 15:42 |
russellb | mamulsow: main question i have right now is what impact patch #2 has vs just patch #1 | 15:43 |
russellb | perf wise | 15:43 |
russellb | on an idle system | 15:43 |
mamulsow | I cleaned up our environment last night, but the latest ovn-controller from your branch is intermittently spinning at 100% with nothing in the cloud | 15:43 |
mamulsow | other ovn-controller nodes were all 0% | 15:44 |
mamulsow | haven't had a chance to look into where it's spinning its time, but I can do that now | 15:44 |
mamulsow | spending its time* | 15:44 |
*** fzdarsky has quit IRC | 15:45 | |
mamulsow | yeah, on an idle system the latest build from your ovn-controller-perf branch is spinning at 100% | 15:46 |
russellb | ok thanks | 15:49 |
russellb | weird, that was supposed to make it do *less* | 15:50 |
russellb | heh | 15:50 |
*** fzdarsky has joined #openstack-neutron-ovn | 15:52 | |
mamulsow | oh, I think it is doing less work, but isn't sleeping at the end of each poll interval | 15:56 |
mamulsow | checking... | 15:57 |
openstackgerrit | Ryan Moats proposed openstack/networking-ovn: Run northd and ovn-controller with --pidfile https://review.openstack.org/270879 | 15:58 |
regXboi | russellb: ^^^^^ | 15:58 |
russellb | ack | 15:58 |
*** numans has quit IRC | 15:58 | |
regXboi | now to restack with it and verify :) | 15:59 |
russellb | mamulsow: it should be waiting on poll_block(), in theory, unless it's constantly waking back up because we're skipping code that should be handling something making poll wake up immediately... | 16:01 |
russellb | possible | 16:01 |
regXboi | mestery: you should have made something else out of that | 16:01 |
russellb | seems like i should be able to reproduce that if so | 16:01 |
russellb | mamulsow: oh, yes, it's spinning in a trivial test setup for me, oops | 16:02 |
russellb | sorry. | 16:02 |
russellb | i'll figure it out | 16:02 |
mamulsow | cool, I like easy problems :) | 16:03 |
russellb | also, reminder for everyone interested, OVN meeting is today in about 2 hours (see topic) | 16:04 |
russellb | this patch makes my laptop very sad :-p | 16:06 |
*** yamamoto has quit IRC | 16:11 | |
*** yamamoto has joined #openstack-neutron-ovn | 16:11 | |
*** yamamoto has quit IRC | 16:11 | |
regXboi | and yes, that patch worked | 16:14 |
russellb | great thanks regXboi | 16:15 |
regXboi | sc68cal: where are we w.r.t all dsvm jobs getting unwedged? | 16:15 |
regXboi | ugh - wrong handle | 16:16 |
regXboi | Sam-I-Am: ^^^^ | 16:16 |
Sam-I-Am | regXboi: good question | 16:17 |
Sam-I-Am | i think its this patch - https://review.openstack.org/#/c/270417/ | 16:18 |
Sam-I-Am | which was failing due to the nova-net pub/priv thing | 16:19 |
regXboi | yeah but that patch failed the recheck | 16:22 |
regXboi | I think | 16:22 |
regXboi | yeah, it did | 16:22 |
regXboi | so this is now waiting on ... what? | 16:22 |
russellb | mamulsow: i just dropped that 2nd patch from the branch for now ... | 16:22 |
mamulsow | okay | 16:22 |
mamulsow | so I added some debug statements and started it (patch #2) again and it's working great right now | 16:23 |
mamulsow | the environment has about 20 routers and several VMs | 16:23 |
regXboi | Sam-I-Am: I'm watching -infra now | 16:24 |
russellb | hm, well something is broken with it .. it made ovn-controller spin at 100% even without creating any resources | 16:24 |
regXboi | so I see what's going on | 16:24 |
mamulsow | other ovn-controllers are at about 10 - 30%, patched one is at 0% | 16:24 |
Sam-I-Am | regXboi: the talk is actually in -qa | 16:24 |
Sam-I-Am | as i later found out | 16:24 |
Sam-I-Am | so many irc | 16:24 |
mamulsow | yeah, I'll try creating/deleting some stuff to see if I can get it to hit the 100% case again | 16:25 |
mamulsow | when it works it works great though :) | 16:25 |
mamulsow | oh | 16:27 |
mamulsow | russellb: I created one network and it started busy spinning | 16:28 |
mamulsow | it's no longer waiting at poll_block | 16:28 |
russellb | ok | 16:28 |
mamulsow | ovsdb_changed = 0 | 16:28 |
mamulsow | both before and after | 16:28 |
mamulsow | I didn't catch it when I made the change, but I assume it was true once | 16:29 |
russellb | right | 16:29 |
russellb | something in the true path clears an fd that's causing poll to wake up | 16:29 |
russellb | not sure what, needs a closer look | 16:29 |
russellb | mamulsow: mestery btw, there's some config things you could consider to isolate your VMs from ovn-controller (and other openstack agents, like nova-compute) | 16:59 |
russellb | you can configure nova to exclude a CPU from the set of CPUs VMs are allowed to use | 16:59 |
russellb | so then you'd have 1 CPU free for everything else | 16:59 |
mestery | coolio! | 16:59 |
russellb | just a random thought | 16:59 |
mamulsow | sounds like a good idea | 16:59 |
russellb | probably worthwhile though | 17:00 |
russellb | as ovn-controller is going to spike as you load the system with create/deletes, same with nova-compute i bet | 17:00 |
russellb | to some degree anyway | 17:00 |
russellb | but i'd definitely like to get ovn-controller down to where it's truly idling when the env is idling, that's a bug IMO | 17:00 |
russellb | we should be able to do that soonish | 17:00 |
russellb | but like i said, you'll still see spikes that are normal (as it calculates new state in response to changes) | 17:01 |
mestery | Thanks for the advice russellb, mamulsow I think we should look at implementing this. | 17:02 |
mestery | :) | 17:02 |
mamulsow | yeah, these boxes have 56 cores and are mostly idle, I wasn't so much concerned about the CPU usage itself, but I was worried that it appeared OVN controller wasn't able to keep up with the requests it was getting | 17:02 |
mamulsow | but yes, that does seem like a good thing to do anyway | 17:03 |
russellb | i don't think it's not keeping up | 17:04 |
russellb | i think it's actually keeping up way too aggressively, heh | 17:04 |
*** flaviof is now known as flaviof_afk | 17:05 | |
*** gangil has joined #openstack-neutron-ovn | 17:05 | |
* mestery heads out for lunch and will be back in an hour or so | 17:06 | |
*** salv-orlando has joined #openstack-neutron-ovn | 17:10 | |
*** roeyc has joined #openstack-neutron-ovn | 17:12 | |
*** roeyc has quit IRC | 17:13 | |
*** roeyc has joined #openstack-neutron-ovn | 17:15 | |
*** flaviof_afk is now known as flaviof | 17:15 | |
*** arosen has joined #openstack-neutron-ovn | 17:16 | |
*** rtheis has quit IRC | 17:28 | |
*** armax has joined #openstack-neutron-ovn | 17:28 | |
*** rtheis has joined #openstack-neutron-ovn | 17:29 | |
*** rtheis has quit IRC | 17:33 | |
mamulsow | well, I put in a dumb hack that's working pretty well, added a 'rerun' variable that gets set to true any time ovsdb_changed so it runs twice through the true case | 17:34 |
mamulsow | obviously it would be better to find the fd that's not getting cleared | 17:34 |
mamulsow | but this is working well for me now | 17:35 |
mamulsow | other interesting insight from that is that a second run through the true case clears the fd | 17:37 |
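The busy spin discussed above (poll waking up immediately because a readable fd is never cleared) can be reproduced with a toy loop. This is only a sketch using Python's `select.poll` and a pipe, not ovn-controller's C code; `count_wakeups` and its parameters are hypothetical names chosen for illustration.

```python
import os
import select
import time


def count_wakeups(drain, duration=0.2):
    """Count how often poll() reports an event while one byte sits in a pipe.

    drain=True reads the byte when handling the event, which clears readiness.
    drain=False skips the handler, so the fd stays readable and poll() never blocks.
    """
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"x")              # a single pending event
    poller = select.poll()
    poller.register(read_fd, select.POLLIN)

    wakeups = 0
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        events = poller.poll(50)          # block for up to 50 ms
        if not events:
            continue                      # timed out: the loop really slept
        wakeups += 1
        if drain:
            os.read(read_fd, 1)           # handling the event clears the fd

    os.close(read_fd)
    os.close(write_fd)
    return wakeups
```

With `drain=True` the loop wakes once and then sleeps in `poll()`; with `drain=False` it wakes on every iteration, which is the 100% CPU pattern. The sketch does not identify which fd is involved in the real code.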
regXboi | ok, 270417 has merged - let's see if that makes a difference or not | 17:40 |
Sam-I-Am | regXboi: woohoooo | 17:43 |
*** rtheis has joined #openstack-neutron-ovn | 17:43 | |
* regXboi watches jobs in zuul and prepares rechecks | 17:45 | |
Sam-I-Am | trigger happy? | 17:46 |
regXboi | I like to think of it as "being prepared" :) | 17:47 |
regXboi | russellb: I may be a bit distracted at the start of today's IRC meeting | 17:53 |
* regXboi has a rescheduled scrum call running at the same time and has to channel mestery while on it | 17:53 | |
Sam-I-Am | regXboi: you need more ram | 17:53 |
regXboi | Sam-I-Am: I could use more parallel processing | 17:53 |
Sam-I-Am | downloadmorebrain.com ? | 17:54 |
regXboi | has somebody registered that? | 17:54 |
Sam-I-Am | looks that way | 17:55 |
* regXboi sighs | 17:56 | |
Sam-I-Am | my brain is full | 17:56 |
Sam-I-Am | i need to increase the mtu | 17:56 |
Sam-I-Am | did you read my mtu ramblings from the weekend? | 17:56 |
regXboi | where were they? | 17:57 |
* regXboi was chasing getting ovn running most of last weekend | 17:57 | |
Sam-I-Am | http://lists.openstack.org/pipermail/openstack-dev/2016-January/084303.html | 17:57 |
* regXboi queues up to read later | 17:58 | |
Sam-I-Am | i am not responsible for drain bramage resulting from it | 17:58 |
Sam-I-Am | tl;dr - i think i figured out the primary problem we're having, at least for phys nets with mtu = 1500 | 17:58 |
Sam-I-Am | the next step is seeing what happens with phys net mtu > 1500. i.e., do the 'middle things' use it, or do we need to do something else | 17:59 |
regXboi | ok, I read the tl;dr - /me now weeps | 17:59 |
Sam-I-Am | the next step is 'what happens in ovn' | 17:59 |
Sam-I-Am | regXboi: it should make you happy | 18:00 |
regXboi | no it makes me weep | 18:00 |
Sam-I-Am | weep for joy? | 18:00 |
regXboi | no weep | 18:00 |
regXboi | but that's for a little later - time for scrum call now | 18:00 |
*** flaviof is now known as flaviof_afk | 18:05 | |
*** flaviof_afk is now known as flaviof | 18:05 | |
arosen | join #openvswitch | 18:09 |
arosen | doh :( | 18:09 |
*** shettyg1 has joined #openstack-neutron-ovn | 18:11 | |
*** shettyg has quit IRC | 18:12 | |
*** chandrav has joined #openstack-neutron-ovn | 18:14 | |
*** numans has joined #openstack-neutron-ovn | 18:16 | |
*** zhouhan has joined #openstack-neutron-ovn | 18:18 | |
*** azbiswas has joined #openstack-neutron-ovn | 18:19 | |
mestery | lol arosen :) | 18:36 |
*** rtheis has quit IRC | 18:49 | |
*** rtheis has joined #openstack-neutron-ovn | 18:50 | |
numans | russellb, for multiple entries of same mac in Logical_Port.addresses, it needs a fix right ? (https://review.openstack.org/#/c/269897/2) | 18:52 |
*** rtheis has quit IRC | 18:54 | |
*** numans has quit IRC | 18:58 | |
*** rtheis has joined #openstack-neutron-ovn | 19:09 | |
*** roeyc has quit IRC | 19:13 | |
*** roeyc has joined #openstack-neutron-ovn | 19:22 | |
*** roeyc has quit IRC | 19:23 | |
*** roeyc has joined #openstack-neutron-ovn | 19:29 | |
*** roeyc has quit IRC | 19:42 | |
*** salv-orlando has quit IRC | 19:42 | |
regXboi | so, it looks like the dsvm stuff is somewhat uncorked | 19:52 |
regXboi | and maybe totally uncorked | 19:55 |
arosen | what was it? | 20:01 |
regXboi | ok... good news - 269121 passed dsvm :) - so if somebody wants to pull the W+1 on it - I'm going to recheck my other patch | 20:01 |
arosen | a change in ovn ? | 20:01 |
arosen | or openstack ;) | 20:01 |
regXboi | no - it was changes needed in openstack | 20:01 |
arosen | which one? | 20:02 |
regXboi | 270417 was one of them | 20:02 |
regXboi | I don't remember the other | 20:02 |
russellb | huh. | 20:02 |
regXboi | but thanks for rechecking 270879 - that was where I was going | 20:03 |
arosen | ah this keystone thing? | 20:03 |
regXboi | that was part of it yes | 20:03 |
regXboi | extra delays leading to race conditions in nova-net | 20:03 |
arosen | do we understand the race conditions? | 20:04 |
regXboi | um... yes - it's nova networks | 20:04 |
arosen | what's the race condition? | 20:04 |
arosen | it seems like the tempest test did: | 20:04 |
arosen | delete_vm() | 20:04 |
regXboi | nova networks API does *not* hang around and wait for a valid response code before continuing | 20:04 |
arosen | didn't wait long enough for it to be deleted (and have the ports cleaned up) | 20:04 |
regXboi | it's async | 20:04 |
arosen | before doing delete network. | 20:04 |
russellb | arosen: that's what i was thinking too | 20:05 |
russellb | arosen: and looping in ml2 would mask it | 20:05 |
arosen | :) | 20:05 |
russellb | seems worthy of a dev list post | 20:05 |
regXboi | I filed a bug a while back against nova about that problem | 20:05 |
arosen | well i wonder if it's a bug in nova then or tempest. | 20:05 |
* regXboi goes and looks for it | 20:05 | |
russellb | arosen: that'd be a bug in ... tempest and neutron, depending on perspective | 20:06 |
regXboi | yeah... take a look at https://bugs.launchpad.net/nova/+bug/1497740 | 20:06 |
openstack | Launchpad bug 1497740 in OpenStack Compute (nova) "nova API proxy to neutron should avoid race-ful behavior" [Medium,Confirmed] | 20:06 |
russellb | 1) tempest needs to make sure it's a valid time to delete a network before trying to delete it ... | 20:06 |
russellb | 2) neutron shouldn't mask things in a stupid way | 20:06 |
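The suspected ordering problem (deleting a network while the ports of an asynchronously deleted VM may still exist) can be illustrated with a toy model. This is a sketch under assumptions: `FakeCloud`, `NetworkBusy`, and `wait_until` are hypothetical stand-ins, not tempest, nova, or neutron APIs.

```python
import time


class NetworkBusy(Exception):
    """Raised by the toy model when a network still has ports."""


class FakeCloud:
    """Toy async API: ports linger for a few status checks after delete_server()."""

    def __init__(self, linger_checks=3):
        self.ports = 1
        self._remaining = linger_checks

    def delete_server(self):
        # Returns immediately; cleanup happens "in the background".
        pass

    def ports_gone(self):
        if self._remaining > 0:
            self._remaining -= 1
            return False
        self.ports = 0
        return True

    def delete_network(self):
        if self.ports:
            raise NetworkBusy("network still has ports")


def wait_until(condition, timeout=5.0, interval=0.01):
    """Poll condition() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    raise TimeoutError("condition not met before timeout")
```

Calling `delete_network()` right after `delete_server()` fails in this model, while waiting on `ports_gone` first lets it succeed. Whether the real failures follow this pattern is still an open question in the discussion.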
regXboi | that's w.r.t. floating IPs specifically, but I can believe the same problem is elsewhere | 20:06 |
arosen | i'll recheck all the patches up there. | 20:06 |
regXboi | arosen, I was going to do that, but if you have a quicker way :) | 20:07 |
regXboi | but anyway, russellb, you want to pull the +A lever on 269121? | 20:08 |
arosen | regXboi: i just do it manually. | 20:08 |
arosen | done anyways. | 20:08 |
regXboi | arosen: ok, thx | 20:08 |
arosen | could write a script to do it ;) | 20:08 |
russellb | regXboi: done | 20:08 |
regXboi | mahalo - /me now goes to zuul page to watch the various patches | 20:08 |
regXboi | 178826 just merged | 20:09 |
regXboi | yay! | 20:09 |
russellb | merged? | 20:09 |
russellb | that merged a long time ago right? | 20:09 |
regXboi | hmm, that's not what I saw just now - let me double check | 20:10 |
regXboi | yeah this merged back on 1/13 | 20:10 |
regXboi | hmmm | 20:13 |
russellb | arosen: you think we should try to work on a tempest patch? | 20:13 |
*** rtheis has quit IRC | 20:13 | |
russellb | i probably don't have time this week though, maybe we should record it in a bug for now | 20:14 |
regXboi | russellb: I was trying out master this morning with the northd and controller processes in debug mode and I'm seeing a bunch of row_event messages in q-svc log - is that expected? | 20:14 |
russellb | regXboi: sounds like it | 20:14 |
arosen | russellb: we can. I want to look at nova as well and see if the issue could be there too. | 20:15 |
regXboi | so if northd and controller are logging in info, they don't report row events? | 20:15 |
arosen | it seems like after nova marks a vm as deleted the vm's ports could still be around | 20:15 |
russellb | arosen: i think vm delete is an async request | 20:15 |
arosen | or it could be tempest not waiting on the vm to be deleted (and checking that it's deleted). | 20:15 |
russellb | arosen: is tempest doing anything to wait for a vm to go deleted? | 20:15 |
russellb | yeah, need to look at tempest code, i haven't yet | 20:15 |
arosen | let me look :) | 20:15 |
russellb | i'm still guessing | 20:15 |
russellb | heh | 20:15 |
russellb | i have a big pile of ovn patches i'm trying to revise | 20:15 |
*** rtheis_ has joined #openstack-neutron-ovn | 20:17 | |
arosen | it looks like tempest does wait. | 20:17 |
arosen | https://github.com/openstack/tempest/blob/master/tempest/common/waiters.py#L103 | 20:18 |
*** rtheis__ has joined #openstack-neutron-ovn | 20:18 | |
*** rtheis has joined #openstack-neutron-ovn | 20:20 | |
*** rtheis has quit IRC | 20:20 | |
*** rtheis has joined #openstack-neutron-ovn | 20:20 | |
*** rtheis_ has quit IRC | 20:21 | |
*** rtheis__ has quit IRC | 20:22 | |
*** azbiswas has quit IRC | 20:28 | |
*** azbiswas has joined #openstack-neutron-ovn | 20:29 | |
arosen | let me play around with nova and see what it does if i put a huge sleep in port_delete in neutron. | 20:29 |
arosen | i'll dig into it. | 20:29 |
arosen | going to grab a quick lunch though bbl | 20:29 |
*** azbiswas has quit IRC | 20:33 | |
russellb | sounds good | 20:34 |
* russellb revising his provider network fixes for ovn | 20:34 | |
*** chandrav has quit IRC | 20:40 | |
*** gangil has quit IRC | 20:45 | |
*** chandrav has joined #openstack-neutron-ovn | 20:53 | |
*** dslev has quit IRC | 21:00 | |
*** manand has joined #openstack-neutron-ovn | 21:05 | |
*** salv-orlando has joined #openstack-neutron-ovn | 21:11 | |
russellb | i don't think our job is fixed | 21:13 |
russellb | same errors happening on the rechecks | 21:13 |
*** roeyc has joined #openstack-neutron-ovn | 21:14 | |
*** zhouhan has quit IRC | 21:16 | |
*** gangil has joined #openstack-neutron-ovn | 21:19 | |
*** dslev has joined #openstack-neutron-ovn | 21:19 | |
mamulsow | russellb: nothing urgent, but FYI, I've been doing some scale testing with your latest (I believe it's patch #1 level now) and see some interesting log messages | 21:21 |
russellb | only on the patched host? | 21:21 |
mamulsow | I've got them all patched now, I could switch one back though to test | 21:22 |
mamulsow | http://paste.openstack.org/show/484616/ | 21:22 |
russellb | arosen: looking at one of the latest NetworkInUse failures, and i dug into the specific test code ... | 21:23 |
russellb | def _delete_server(self, server): | 21:23 |
russellb | self.servers_client.delete_server(server['id']) | 21:23 |
russellb | waiters.wait_for_server_termination(self.servers_client, server['id']) | 21:23 |
russellb | 21:23 | |
russellb | so it waits for the server to be terminated, in theory | 21:23 |
openstackgerrit | Merged openstack/networking-ovn: Make master networking-ovn work with stable/liberty https://review.openstack.org/269121 | 21:23 |
mestery | mamulsow: Now that ^^^ has merged, we can likely move away from my private fork we're using for testing. | 21:24 |
mestery | mamulsow: I assume regXboi tested this on stable/liberty even :) | 21:24 |
mamulsow | mestery: nice! | 21:24 |
regXboi | I'm pretty sure I did - it all blurs together now | 21:25 |
russellb | seems we're passing some now at least | 21:26 |
mestery | russellb: progress? | 21:27 |
russellb | yes progress of some sort | 21:27 |
russellb | mamulsow: re: log messages, looks "normal" if ovn-controller is under load | 21:28 |
russellb | the INFO stuff is a little noisy under high load it seems | 21:28 |
mamulsow | yeah, it was about 600 routers, 600 networks/subnets at the time, and still in the process of creating things | 21:29 |
*** chandrav has quit IRC | 21:29 | |
russellb | mamulsow: so ovn-controller is just pegged recalculating state as things are changing | 21:31 |
russellb | you've basically hit a bottleneck | 21:31 |
*** azbiswas has joined #openstack-neutron-ovn | 21:31 | |
mamulsow | it's not *too* noisy, it's adding a few lines every 5 seconds or so, a lot better than many of the other logs on this system | 21:31 |
russellb | 1) you may want to make the nova config change i suggested earlier to reserve a CPU for system stuff | 21:31 |
russellb | 2) we need to improve ovn-controller to help this bottleneck | 21:31 |
russellb | 3) even if it's pegged, it's likely still processing everything, it's just a question of how long it's taking | 21:32 |
russellb | and if it can keep up with the rate of change you want to inflict on the env | 21:32 |
mamulsow | yep, 1.4 second poll interval is not bad | 21:32 |
russellb | yeah, so that means it's re-adjusting to new state in 1.4 seconds or so | 21:32 |
mamulsow | as long as that doesn't get into the 30 second range | 21:32 |
russellb | which should apply all changes that have happened in the last 1.4 seconds or whatever | 21:32 |
russellb | something like that | 21:33 |
russellb | i'd also verify other ways, like look at how long it takes ports to come up | 21:33 |
mamulsow | yeah | 21:33 |
russellb | right now we have it so that neutron port state won't be up until OVN says it's up | 21:33 |
mamulsow | so is this applying just the things that changed since the last time or is it trying to apply everything | 21:33 |
russellb | and OVN won't say it's up until ovn-controller has done its thing for the new port | 21:33 |
russellb | right now ovn-controller does a full calculation of state every time. blp said in our meeting today that he'd try to prioritize working on incremental updates now that we know this is a bottleneck | 21:34 |
russellb | it calculates full state, and then in practice only applies differences | 21:34 |
russellb | but there's lots of room to be smarter | 21:34 |
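The "calculate full state, then apply only the differences" behaviour described above can be sketched with set arithmetic. This is an illustration only: the flow strings and the `compute_desired` / `reconcile` names are hypothetical and do not reflect ovn-controller's actual data structures.

```python
def compute_desired(ports):
    """Recompute the complete desired flow set from scratch (toy format)."""
    return {f"port={name},mac={mac}" for name, mac in ports.items()}


def reconcile(installed, desired):
    """Return (to_add, to_remove) so only the differences are applied."""
    to_add = desired - installed
    to_remove = installed - desired
    return to_add, to_remove


installed = compute_desired({"p1": "aa", "p2": "bb"})
desired = compute_desired({"p1": "aa", "p3": "cc"})
to_add, to_remove = reconcile(installed, desired)
```

The cost in this model is in `compute_desired`, which runs over everything on each change even though `reconcile` touches little. An incremental approach would avoid the full recomputation.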
mamulsow | cool, sounds good | 21:34 |
mamulsow | so far this is still looking pretty good even as it's working now | 21:35 |
russellb | ok great | 21:35 |
mamulsow | I'll let you know how it looks when I get to 4k routers | 21:35 |
mamulsow | :) | 21:35 |
russellb | we'll keep working on improvements too of course | 21:35 |
russellb | this is hugely helpful | 21:35 |
*** chandrav has joined #openstack-neutron-ovn | 21:36 | |
mamulsow | hugely helpful for me too :) | 21:38 |
mamulsow | thanks so much for your help so far, definitely seeing gains with your patch | 21:38 |
russellb | great, that was low hanging fruit | 21:41 |
russellb | that patch is going to change, ben told me he had some suggestions | 21:41 |
russellb | but it's a test env after all :) | 21:41 |
russellb | arosen: from my digging through the neutron code, it should be deleting ports before the instance is marked as deleted in the db, so as long as all tests are waiting like the one i looked at earlier, it should be OK... | 21:43 |
russellb | digging through *nova* code | 21:43 |
mestery | russellb: neutron mid-cycle in Rochester, MN at IBM: https://etherpad.openstack.org/p/neutron-mitaka-midcycle | 21:43 |
mestery | A chance to hack on OVN perhaps? | 21:43 |
russellb | i thought that wasn't happening for some reason | 21:44 |
mestery | It wasn't ... and then armax and dougwig made it happen :P | 21:44 |
russellb | how do i get there, snowmobile? | 21:44 |
mestery | And whiskey. Whiskey keeps you warm. | 21:44 |
russellb | lol | 21:44 |
russellb | mestery: i put it on my calendar to think about at least | 21:45 |
Sam-I-Am | i thought the neutron mid-cycle was just for a few things... 2 weeks ago | 21:45 |
russellb | but what a terrible place to go in Feb :-p | 21:45 |
Sam-I-Am | march sounds like a 3/4 cycle | 21:45 |
Sam-I-Am | or the week prior | 21:45 |
mestery | :) | 21:46 |
* russellb trolling | 21:46 | |
russellb | mestery: 4 full days? | 21:47 |
russellb | or more likely 2 full and 2 partial, depending on travel schedule ... | 21:47 |
Sam-I-Am | russellb: got time for some questions? | 21:48 |
mestery | Yeah, that's what I was thinking | 21:48 |
mestery | 2 full, 2 partial | 21:48 |
mestery | I may even drive home a couple nights for kids activities (shhhh, don't tell) | 21:48 |
Sam-I-Am | mestery: aren't you up that way? | 21:48 |
Sam-I-Am | minnesooooota | 21:48 |
mestery | Sam-I-Am: About 1.5 hours away | 21:48 |
* regXboi gets in queue behind Sam-I-Am: | 21:49 | |
regXboi | speak for yourself - it's more like 6 hours | 21:49 |
Sam-I-Am | regXboi: at least you can avoid air travel | 21:49 |
russellb | quick questions at least | 21:50 |
russellb | i need to leave in a few minutes | 21:50 |
regXboi | the beauty of being in the middle of the country - anything from about OKC to Denver to Fargo to Chicago can be a drive | 21:50 |
Sam-I-Am | russellb: doh, how about tomorrow? | 21:50 |
russellb | i'll be around | 21:50 |
Sam-I-Am | russellb: i guess the quick question is - has someone thought about mtu considerations in ovn. | 21:51 |
regXboi | russellb: pointer to what AddLogicalPortCommand actually does vis a vis OVS | 21:51 |
Sam-I-Am | adds a logical port | 21:51 |
russellb | Sam-I-Am: yes, but i'm not sure OVN needs to do anything | 21:51 |
* regXboi looking for a secret decoder ring in terms of the protocol, Sam-I-Am | 21:51 | |
russellb | same issue as with vxlan or gre today | 21:52 |
Sam-I-Am | or should i say &addlogicalportcommand :) | 21:52 |
russellb | regXboi: AddLogicalPortCommand creates a new row in the Logical_Port table in the OVN_Northbound db | 21:52 |
russellb | well, and adds it to the ports column of a logical switch in the Logical_Switch table | 21:52 |
regXboi | russellb: ok, does the client do anything in addition to sending the new row? | 21:52 |
*** azbiswas has quit IRC | 21:52 | |
regXboi | does it pull the table first? | 21:52 |
russellb | http://openvswitch.org/support/dist-docs/ovn-nb.5.html | 21:53 |
regXboi | does it pull the table afterwards? | 21:53 |
russellb | is the schema docs | 21:53 |
russellb | the client actually has a local cache of the entire db | 21:53 |
russellb | that it receives updates for from the db | 21:53 |
regXboi | async updates or sync updates? | 21:53 |
russellb | it gets a dump at startup, and then async updates over time | 21:53 |
regXboi | ok, so that likely isn't it | 21:53 |
russellb | depending on what you mean by sync or async | 21:53 |
regXboi | well, would it refresh the cache *before* making the update | 21:54 |
regXboi | er the add I mean | 21:54 |
russellb | no | 21:54 |
regXboi | ok, so that's not it | 21:54 |
regXboi | ok, I'll likely need to add some instrumentation into ovn-controller to trace the code | 21:54 |
Sam-I-Am | russellb: well, it might need to pull in an mtu config option. i'm mainly concerned about the mismatch problem we're discovering in the ml2 drivers. mtus can (and should) change to account for overhead, as long as the change happens in something that can emit icmp pmtu messages. | 21:54 |
Sam-I-Am | russellb: i dont know enough about ovs and the l3 implementation to know if it'll do this | 21:55 |
regXboi | but that will be part of tomorrow's headache | 21:55 |
russellb | Sam-I-Am: *nods*, what would it do with an mtu config option | 21:55 |
russellb | right now we rely on the admin configuring the dhcp server opt for it | 21:55 |
Sam-I-Am | russellb: that only helps traffic outbound from a vm | 21:55 |
Sam-I-Am | its the easy part of this | 21:56 |
russellb | ok | 21:56 |
russellb | then i don't understand the hard part :) | 21:56 |
Sam-I-Am | inbound traffic | 21:56 |
Sam-I-Am | host outside openstack sends packet with max mtu, packet needs to traverse a tunnel to get to the vm. what happens there? with the ml2 drivers, it's just discarded because the disparity occurs on a layer-2 device that can't speak icmp for pmtu discovery. | 21:57 |
*** azbiswas has joined #openstack-neutron-ovn | 21:58 | |
Sam-I-Am | tl;dr on my mtu experiments over the weekend - we need to do mtu changes to account for tunnels in the layer-3 device that routes between the provider net and overlay net | 21:59 |
*** dslev has quit IRC | 21:59 | |
openstackgerrit | Merged openstack/networking-ovn: Run northd and ovn-controller with --pidfile https://review.openstack.org/270879 | 22:00 |
regXboi | russellb: ok I see now - I'll revisit my instrumentation run with that in mind | 22:01 |
*** s3wong has joined #openstack-neutron-ovn | 22:01 | |
Sam-I-Am | russellb: the other side of the coin is making sure that if the phys net supports a large mtu, that large mtu is implemented in all the necessary places | 22:01 |
russellb | Sam-I-Am: ok, this is probably an area i don't understand well enough to know what should change, but i'd definitely be interested in feedback | 22:01 |
russellb | i need to go for now though | 22:01 |
Sam-I-Am | russellb: yeah, thats why we need a bit more time | 22:02 |
Sam-I-Am | i can fill you in on All the Things | 22:02 |
Sam-I-Am | the goal - get it right the first time :) | 22:02 |
*** shettyg1 has quit IRC | 22:02 | |
Sam-I-Am | because its a mess in neutron now | 22:02 |
russellb | ok cool | 22:10 |
russellb | i think my brain is fried for today | 22:10 |
regXboi | russellb: oh I ack that | 22:10 |
russellb | have a nice evening everyone | 22:10 |
mestery | you too russellb | 22:10 |
*** dslev has joined #openstack-neutron-ovn | 22:14 | |
openstackgerrit | Kyle Mestery proposed openstack/networking-ovn: Vagrant: Completely redo the Vagrant configuration https://review.openstack.org/269255 | 22:15 |
*** chandrav has quit IRC | 22:19 | |
*** regXboi has quit IRC | 22:19 | |
*** chandrav has joined #openstack-neutron-ovn | 22:25 | |
openstackgerrit | Merged openstack/networking-ovn: Devstack: cleanup datapath https://review.openstack.org/269938 | 22:25 |
openstackgerrit | Merged openstack/networking-ovn: devstack: Move tox install. https://review.openstack.org/267145 | 22:28 |
openstackgerrit | Merged openstack/networking-ovn: Deployment: Update with OVN DB requirements https://review.openstack.org/268717 | 22:28 |
Sam-I-Am | somanymerges | 22:32 |
*** dslev has quit IRC | 22:34 | |
* mestery loves it | 22:35 | |
*** chandrav has quit IRC | 22:40 | |
openstackgerrit | Merged openstack/networking-ovn: Updated from global requirements https://review.openstack.org/268471 | 22:41 |
*** chandrav has joined #openstack-neutron-ovn | 22:43 | |
*** chandrav has quit IRC | 22:53 | |
*** jckasper has joined #openstack-neutron-ovn | 23:00 | |
*** chandrav has joined #openstack-neutron-ovn | 23:01 | |
*** roeyc has quit IRC | 23:05 | |
*** chandrav has quit IRC | 23:06 | |
*** roeyc has joined #openstack-neutron-ovn | 23:16 | |
openstackgerrit | Matthew Kassawara proposed openstack/networking-ovn: Modify docs build environment https://review.openstack.org/271091 | 23:44 |
openstackgerrit | Aaron Rosen proposed openstack/networking-ovn: Add missing call to self._process_l3_update/delete() https://review.openstack.org/270509 | 23:49 |
openstackgerrit | Merged openstack/networking-ovn: Vagrant: Completely redo the Vagrant configuration https://review.openstack.org/269255 | 23:52 |
*** azbiswas has quit IRC | 23:55 | |
*** rtheis has quit IRC | 23:57 |
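
At 21:52 russellb describes AddLogicalPortCommand as creating a new row in the Logical_Port table of the OVN_Northbound db and adding it to the ports column of a row in the Logical_Switch table. As a rough illustration, that pair of changes could be expressed as a single OVSDB `transact` request in the JSON-RPC format of RFC 7047. This is only a sketch: the helper name, the switch and port names, and the choice to match the switch by name are assumptions, and the real command goes through the OVS Python IDL rather than building JSON by hand.

```python
import json


def build_add_logical_port(switch_name, port_name, request_id=0):
    """Build an OVSDB 'transact' request that adds a port to a switch.

    Hypothetical helper for illustration only; not networking-ovn code.
    """
    # Operation 1: insert the new Logical_Port row and give it a
    # transaction-local name so the next operation can refer to it.
    insert_port = {
        "op": "insert",
        "table": "Logical_Port",
        "row": {"name": port_name},
        "uuid-name": "new_port",
    }
    # Operation 2: add the new row's UUID to the 'ports' set of the
    # matching Logical_Switch row (matched by name here, an assumption).
    attach_to_switch = {
        "op": "mutate",
        "table": "Logical_Switch",
        "where": [["name", "==", switch_name]],
        "mutations": [["ports", "insert", ["named-uuid", "new_port"]]],
    }
    return {
        "method": "transact",
        "params": ["OVN_Northbound", insert_port, attach_to_switch],
        "id": request_id,
    }


if __name__ == "__main__":
    print(json.dumps(build_add_logical_port("sw0", "sw0-port1"), indent=2))
```

Both operations travel in one transaction, which is consistent with the answer at 21:54 that the client does not refresh its local cache before making the add.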
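
Sam-I-Am's inbound-traffic point (21:57 to 21:59) can be made concrete with a little arithmetic. The sketch below assumes VXLAN over IPv4 as the tunnel type, which is an assumption for illustration (the log also mentions gre, and other encapsulations have different overheads). It computes the MTU left for the overlay network and shows what a layer-3 device is expected to do with an oversized IPv4 packet. The function names are hypothetical.

```python
# Bytes that VXLAN over IPv4 adds inside the underlay's IP MTU.
OUTER_IPV4 = 20      # outer IPv4 header, no options
OUTER_UDP = 8        # outer UDP header
VXLAN_HEADER = 8     # VXLAN header
INNER_ETHERNET = 14  # encapsulated Ethernet header, no VLAN tag
VXLAN_IPV4_OVERHEAD = OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER + INNER_ETHERNET


def overlay_mtu(physical_mtu, overhead=VXLAN_IPV4_OVERHEAD):
    """Largest inner IP packet that fits in one underlay packet."""
    return physical_mtu - overhead


def inbound_action(packet_len, dont_fragment, mtu):
    """What an IPv4 router does with a packet bound for a link with this MTU."""
    if packet_len <= mtu:
        return "forward"
    # With DF set the router drops the packet and sends ICMP
    # "fragmentation needed" so the sender can lower its path MTU.
    return "icmp-frag-needed" if dont_fragment else "fragment"


if __name__ == "__main__":
    mtu = overlay_mtu(1500)
    print(mtu, inbound_action(1500, True, mtu))
```

A layer-2 device has no equivalent of the `icmp-frag-needed` branch, which is why the oversized packet is silently discarded in the ml2 case described in the log.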