*** fnaval has quit IRC | 00:18 | |
*** salmankhan has quit IRC | 00:26 | |
*** irclogbot_1 has quit IRC | 00:38 | |
*** Swami has quit IRC | 01:01 | |
openstackgerrit | Michael Johnson proposed openstack/octavia-tempest-plugin master: Add traffic tests using an IPv6 VIP https://review.openstack.org/611980 | 01:11 |
*** aojea has joined #openstack-lbaas | 01:23 | |
openstackgerrit | Adam Harwell proposed openstack/octavia master: WIP: Floating IP Network Driver (spans L3s) https://review.openstack.org/435612 | 01:25 |
*** aojea has quit IRC | 01:27 | |
*** yamamoto has joined #openstack-lbaas | 01:46 | |
*** dayou has quit IRC | 02:20 | |
*** dayou has joined #openstack-lbaas | 02:21 | |
*** dayou has quit IRC | 02:25 | |
*** dayou has joined #openstack-lbaas | 02:26 | |
*** dayou has quit IRC | 02:29 | |
*** dayou has joined #openstack-lbaas | 02:33 | |
*** dayou has quit IRC | 02:34 | |
lxkong | hi guys, if I want to delete a member from a pool programmatically using python-octaviaclient and the pool id is invalid, can I get an exception with an error code like 404, or just a normal python exception? | 02:34 |
lxkong | how does python-octaviaclient deal with that? | 02:34 |
*** dayou has joined #openstack-lbaas | 02:36 | |
*** dayou has quit IRC | 02:37 | |
rm_work | hmmm | 02:38 |
*** dayou has joined #openstack-lbaas | 02:38 | |
rm_work | lets see | 02:38 |
rm_work | so, at the CLI it seems to do a pool list and look to see if the ID is in the list | 02:39 |
rm_work | but not sure if that's actually how python-octaviaclient will respond | 02:40 |
rm_work | I think in general you could expect it to look up the pool first, and give you an error based on it not existing, yeah | 02:40 |
rm_work | personally, i think it's a bit silly to have to specify a pool ID to delete a member, so I was looking at possibly making members a top-level object | 02:41 |
rm_work | so you could just delete a member directly without pool info | 02:41 |
rm_work | but that's a ways off | 02:41 |
*** dayou has quit IRC | 02:42 | |
*** dayou has joined #openstack-lbaas | 02:44 | |
*** dayou has quit IRC | 02:45 | |
*** dayou has joined #openstack-lbaas | 02:46 | |
*** dayou has quit IRC | 02:47 | |
*** dayou has joined #openstack-lbaas | 02:49 | |
*** yamamoto has quit IRC | 02:49 | |
lxkong | rm_work: yeah, but when heat deletes the member resource, heat doesn't see a 404 or anything like it; it complains rather than just ignoring that resource and continuing to delete the rest. | 03:04 |
lxkong | so i'm wondering if it's a bug for python-octaviaclient | 03:04 |
*** dayou has quit IRC | 03:05 | |
lxkong | when it is trying to delete a member and that pool doesn't exist, should it throw an HTTP not found exception? | 03:05 |
lxkong | that's what heat expects | 03:05 |
rm_work | hmmm | 03:11 |
*** hongbin has joined #openstack-lbaas | 03:12 | |
rm_work | it SHOULD 404 I think | 03:13 |
rm_work | but the way we handle return codes is a little wonky IMO | 03:13 |
rm_work | would need to test it honestly <_< | 03:14 |
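For context on the behaviour lxkong and heat are after, here is a minimal sketch (assuming direct calls to the Octavia v2 API with the requests library, not python-octaviaclient internals) of a member delete that surfaces the 404 so callers can treat "already gone" as success:

```python
# Minimal sketch, not python-octaviaclient code: issue the member DELETE
# straight against the Octavia v2 API and make the 404 visible to the caller,
# so a tool like heat can treat "pool/member already gone" as success.
import requests  # assumption: plain HTTP client used purely for illustration


def delete_member(octavia_endpoint, token, pool_id, member_id):
    # Path as recalled from the Octavia v2 API reference; verify against your
    # deployment's endpoint layout.
    url = "{}/v2.0/lbaas/pools/{}/members/{}".format(
        octavia_endpoint, pool_id, member_id)
    resp = requests.delete(url, headers={"X-Auth-Token": token})
    if resp.status_code == 404:
        # The pool (or member) does not exist; nothing left to delete.
        return False
    resp.raise_for_status()
    return True
```

Whether python-octaviaclient surfaces an equivalent "not found" error is what the story filed below ends up tracking.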
*** dayou has joined #openstack-lbaas | 03:20 | |
*** yamamoto has joined #openstack-lbaas | 04:21 | |
*** sapd1_ has quit IRC | 04:35 | |
*** sapd1 has joined #openstack-lbaas | 04:36 | |
*** hongbin has quit IRC | 04:47 | |
lxkong | rm_work: i created a ticket here https://storyboard.openstack.org/#!/story/2004283 for tracking this issue | 05:11 |
*** yboaron_ has joined #openstack-lbaas | 05:55 | |
*** dayou has quit IRC | 06:14 | |
*** dayou has joined #openstack-lbaas | 06:15 | |
*** aojea has joined #openstack-lbaas | 06:27 | |
*** aojea has quit IRC | 06:31 | |
*** ccamposr has joined #openstack-lbaas | 06:43 | |
*** velizarx has joined #openstack-lbaas | 07:06 | |
*** pcaruana has joined #openstack-lbaas | 07:36 | |
*** yamamoto has quit IRC | 07:43 | |
*** yboaron_ has quit IRC | 07:52 | |
*** yamamoto has joined #openstack-lbaas | 07:57 | |
*** apuimedo has joined #openstack-lbaas | 08:00 | |
*** dayou has quit IRC | 08:35 | |
*** yboaron_ has joined #openstack-lbaas | 08:52 | |
*** dims has quit IRC | 08:52 | |
*** dims has joined #openstack-lbaas | 08:53 | |
*** dims has quit IRC | 08:58 | |
*** dims has joined #openstack-lbaas | 08:59 | |
*** dayou has joined #openstack-lbaas | 09:04 | |
*** yamamoto has quit IRC | 09:14 | |
*** abaindur has quit IRC | 09:31 | |
*** dayou has quit IRC | 09:38 | |
*** yamamoto has joined #openstack-lbaas | 09:41 | |
*** dayou has joined #openstack-lbaas | 10:05 | |
*** salmankhan has joined #openstack-lbaas | 10:20 | |
*** yboaron_ has quit IRC | 10:30 | |
*** yboaron_ has joined #openstack-lbaas | 10:30 | |
*** yamamoto has quit IRC | 10:39 | |
openstackgerrit | ZhaoBo proposed openstack/octavia master: Add client_ca_tls_container_ref to Octavia v2 listener API https://review.openstack.org/612267 | 10:58 |
openstackgerrit | ZhaoBo proposed openstack/python-octaviaclient master: Add 'client_ca_tls_container_ref' in Listener on client side https://review.openstack.org/616158 | 10:59 |
*** sapd1 has quit IRC | 11:12 | |
*** sapd1 has joined #openstack-lbaas | 11:12 | |
*** yamamoto has joined #openstack-lbaas | 11:16 | |
*** aojea_ has joined #openstack-lbaas | 11:18 | |
*** aojea_ has quit IRC | 11:23 | |
*** yamamoto has quit IRC | 11:25 | |
*** yboaron_ has quit IRC | 11:35 | |
*** yboaron_ has joined #openstack-lbaas | 11:47 | |
*** yamamoto has joined #openstack-lbaas | 12:12 | |
rm_work | lxkong: hmm ok, i'll see if i can look at that sometime soon -- still catching up from being out on vacation for three weeks :P | 12:46 |
*** velizarx has quit IRC | 13:24 | |
*** velizarx has joined #openstack-lbaas | 13:35 | |
*** pcaruana has quit IRC | 13:45 | |
*** apuimedo has quit IRC | 13:59 | |
*** pcaruana has joined #openstack-lbaas | 14:00 | |
*** aojea_ has joined #openstack-lbaas | 14:04 | |
*** yamamoto has quit IRC | 14:29 | |
*** pcaruana has quit IRC | 14:33 | |
*** ccamposr has quit IRC | 14:34 | |
*** pcaruana has joined #openstack-lbaas | 14:34 | |
*** apuimedo has joined #openstack-lbaas | 14:36 | |
*** apuimedo has quit IRC | 14:36 | |
*** celebdor has joined #openstack-lbaas | 14:36 | |
*** yamamoto has joined #openstack-lbaas | 14:52 | |
*** pck has joined #openstack-lbaas | 14:53 | |
*** aojea_ has quit IRC | 15:04 | |
*** ccamposr has joined #openstack-lbaas | 15:18 | |
*** aojea_ has joined #openstack-lbaas | 15:30 | |
*** aojea_ has quit IRC | 15:35 | |
*** aojea_ has joined #openstack-lbaas | 15:35 | |
*** aojea_ has quit IRC | 15:35 | |
*** aojea_ has joined #openstack-lbaas | 15:36 | |
*** yboaron_ has quit IRC | 15:37 | |
*** yboaron has joined #openstack-lbaas | 15:39 | |
*** ivve has joined #openstack-lbaas | 15:44 | |
*** aojea_ has quit IRC | 15:56 | |
*** aojea_ has joined #openstack-lbaas | 15:57 | |
*** aojea_ has quit IRC | 16:06 | |
*** fnaval has joined #openstack-lbaas | 16:07 | |
*** hvhaugwitz has quit IRC | 16:07 | |
*** hvhaugwitz has joined #openstack-lbaas | 16:08 | |
cgoncalves | FYI, I might join today's meeting a bit late | 16:30 |
johnsom | Ok, thanks | 16:36 |
*** pcaruana has quit IRC | 16:38 | |
*** yboaron has quit IRC | 16:49 | |
*** irclogbot_1 has joined #openstack-lbaas | 17:03 | |
*** velizarx has quit IRC | 17:07 | |
*** velizarx has joined #openstack-lbaas | 17:08 | |
*** velizarx has quit IRC | 17:12 | |
*** yamamoto has quit IRC | 17:17 | |
rm_work | oh there's a meeting! I can make it today! :P | 17:35 |
johnsom | Noon pacific time | 17:36 |
*** ccamposr has quit IRC | 17:38 | |
*** yamamoto has joined #openstack-lbaas | 17:56 | |
*** salmankhan has quit IRC | 18:04 | |
*** yamamoto has quit IRC | 18:09 | |
*** kosa777777 has joined #openstack-lbaas | 18:09 | |
*** aojea_ has joined #openstack-lbaas | 18:26 | |
*** Swami has joined #openstack-lbaas | 18:30 | |
*** aojea_ has quit IRC | 18:57 | |
Swami | johnsom: Have you seen such errors in octavia 'octavia-health-manager-json.log.1:{"stack_trace": "Traceback (most recent call last):\n File \"/opt/stack/venv/octavia-20180814T160839Z/lib/python2.7/site-packages/octavia/controller/healthmanager/health_drivers/update_db.py\", line 336, in update_stats\n self._update_stats(health_message)\n File \"/opt/stack/venv/octavia-20180814T160839Z/lib/python2.7/site-packages/octavia/controller/healthmanager/ | 19:07 |
Swami | health_drivers/update_db.py\", line 382, in _update_stats\n 'request_errors': stats['ereq']}\nKeyError: 'ereq'\n", "process": 44590, "project_name": "", "tags": "octavia-health-manager", "color": "", "@timestamp": "2018-11-06T23:35:22.214Z", "isotime": "2018-11-06T23:35:22.214219+00:00", "host": "hlm003-cp1-c1-m3-mgmt", "thread_name": "MainThread", "logger_name": "octavia.controller.healthmanager.health_drivers.update_db", "path": | 19:07 |
Swami | "/opt/stack/venv/octavia-20180814T160839Z/lib/python2.7/site-packages/octavia/controller/healthmanager/health_drivers/update_db.py", "message": "update_stats encountered an unknown error", "extra_keys": ["project", "version"], "resource": "", "level": "ERROR", "@version": "1", "user_identity": "", "project": "unknown", "instance": "", "version": "unknown", "lineno": 338, "user_name": "", "error_summary": "KeyError: 'ereq'", "type": "octavia"}' | 19:07 |
johnsom | Swami Are you running a Rocky or newer Amphora image with an older release health manager? | 19:10 |
Swami | johnsom: We are testing migration from an old (Newton) deployment to Pike. | 19:12 |
*** yboaron has joined #openstack-lbaas | 19:15 | |
johnsom | Swami Ok, yeah, I think I see the issue. (just a note, newton and pike are before we asserted our upgrade tags and had a grenade gate). So the Amp image is newton, and the HM is pike. The stats message from the amp doesn't have the "ereq" stat in it. | 19:15 |
Swami | johnsom: yes | 19:16 |
Swami | johnsom: is that normal. | 19:16 |
johnsom | So this line is blowing up: https://github.com/openstack/octavia/blob/stable/pike/octavia/controller/healthmanager/health_drivers/update_db.py#L382 | 19:16 |
johnsom | Two options: | 19:16 |
johnsom | 1. Upgrade the amps first. | 19:17 |
johnsom | 2. modify the HM to have a default value of 0 on line 382 | 19:17 |
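A hedged sketch of option 2; only the stats['ereq'] access comes from the traceback above, the surrounding structure is illustrative, and the real code lives in octavia/controller/healthmanager/health_drivers/update_db.py:

```python
# Sketch of option 2: tolerate stats messages from older (Newton) amphorae
# that do not include the 'ereq' request-error counter. Illustrative only;
# the real _update_stats handler has more fields and a different shape.
def build_listener_stats(raw_stats):
    return {
        # .get() with a default of 0 instead of raw_stats['ereq'] avoids the
        # KeyError when an older amphora image omits the counter.
        'request_errors': raw_stats.get('ereq', 0),
    }
```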
Swami | johnsom: Can we upgrade a Pike AM in a Newton environment and make the Amp failover to Pike Amphora, will that work. | 19:18 |
Swami | I think I tried updating the Glance image from the Newton image to the Pike image, but it did not fail over to the new Pike Amphora. | 19:19 |
johnsom | If you meant HM instead of AM, yes it should work. The only harm from that error is the stats won't update for the LB until the amp is updated | 19:19 |
Swami | johnsom: let me rephrase your statement, so that my understanding is right. | 19:21 |
Swami | If I need to migrate a Newton based repo that has an active Newton Amphora in place, I need to upgrade the Amphora VM first to have the Pike version of Amphora instead of the Newton version Amphora. | 19:22 |
Swami | Then once I have the Pike version of Amphora, then I need to modify the HM to have a default value of 0 on line 382 and restart the controller. Then migrate to Pike. | 19:23 |
johnsom | That should work, yes. However, I also pointed out that the error you see will only impact the statistics updating for the LB. The LB and amphora will still be operational. So you could do the control plane update and note to users that stats won't update until you have failed over the amphora to the new image. | 19:24 |
johnsom | Swami, no you don't need to modify the code for the default value. That was an option, not an "AND". | 19:25 |
johnsom | For migration, here are two options: | 19:25 |
Swami | johnsom: The issue that I am having right now is that the old Amphora did not fail over to the new Pike image after moving to Pike and rebooting the compute node. | 19:25 |
johnsom | 1. Update the amphora to a Pike image, then update control plane. | 19:25 |
johnsom | 2. Update the control plane, then update the amphora, but note that stats won't update until the amp is updated. | 19:26 |
Swami | So the 'Error log' that I am seeing might not be related to losing my Amphora VM. Is that right? | 19:26 |
Swami | johnsom: So after step2, before I reboot the node, I should manually failover the Amphora VM to pick up the new Pike image. Right? | 19:27 |
johnsom | Right, you will need to manually cause the amp to failover to the new image. We do not have an automatic failover for that so that operators can schedule it should they be running in a non-HA topology. | 19:28 |
Swami | johnsom: Sorry you mentioned two options. So I can either use option1 or option 2. | 19:28 |
johnsom | We have an API call for failover. LB failover will do it sequentially across the amps, Amphora failover will only do one amp. | 19:29 |
johnsom | Right, 1 or 2 | 19:29 |
Swami | johnsom: But didn't you mention that the failover API is not in Pike. Or has it landed in Pike. | 19:29 |
johnsom | Swami You are right, sorry. I forgot that it didn't make the Pike release. | 19:30 |
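For releases that do have the failover API (it landed after Pike, as just noted), the call is a bare PUT against either the load balancer or a single amphora. A hedged sketch, with endpoint paths as recalled from the Octavia v2 API reference rather than verified against this deployment:

```python
# Hedged sketch of triggering failover via the Octavia v2 API on releases
# newer than Pike. The endpoint paths are assumptions from memory of the API
# reference; check them against the release you are running.
import requests


def failover_load_balancer(endpoint, token, lb_id):
    # Fails over every amphora behind the LB, one at a time.
    url = "{}/v2.0/lbaas/loadbalancers/{}/failover".format(endpoint, lb_id)
    requests.put(url, headers={"X-Auth-Token": token}).raise_for_status()


def failover_amphora(endpoint, token, amphora_id):
    # Fails over only the named amphora.
    url = "{}/v2.0/lbaas/amphorae/{}/failover".format(endpoint, amphora_id)
    requests.put(url, headers={"X-Auth-Token": token}).raise_for_status()
```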
Swami | johnsom: So what we did here is: updated the control plane to Pike, then updated the Glance image to the Pike Amphora, but did not update the Amphora VM that was running, then triggered a reboot of the compute node. Then we lost the VM. | 19:31 |
*** hyang has joined #openstack-lbaas | 19:32 | |
johnsom | Swami Rebooting the compute node won't necessarily failover the amp. What do you mean by "lost the VM"? The LB went to error and the amp was deleted? | 19:33 |
Swami | johnsom: yes | 19:36 |
Swami | johnsom: The LB went to error state and the Amp was deleted. | 19:36 |
johnsom | Can you provide the health manager logs? The stats error should not cause that. | 19:36 |
Swami | johnsom: Sure I will check it out and will post it in the pastbin. Give me a minute. | 19:37 |
johnsom | Swami Thanks | 19:37 |
hyang | Hi there, I have a question about amphora networks. In my cluster there are two networks, public and private, and both are flat-vlan type. If I create the amphora using the private network as the management net and create the vip-address from the public network, can I add my instance from the private network as a backend member? Currently I get an ERROR operating status after adding the member, so I'm wondering if it is possible to do so. | 19:42 |
hyang | btw, same member can be added to the amphora with vip-address in private network and everything works fine. | 19:44 |
johnsom | Hmmm, should be possible to have the lb-mgmt-net the same as a member network, but I don't know that anyone has tried it. Technically inside the amp, the VIP and member networks are inside their own network namespace. I'm just not sure if it would get confused that the network is already "plugged" into the instance. | 19:47 |
*** aojea_ has joined #openstack-lbaas | 19:49 | |
hyang | johnsom at least in my environment, using the member network for lb-mgmt-net as well works fine. I just don't know whether the vip-address has to be in the same network too? | 19:51 |
johnsom | No, it should not need to be in the same network | 19:52 |
*** kosa777777 has quit IRC | 19:54 | |
johnsom | #startmeeting Octavia | 20:00 |
openstack | Meeting started Wed Nov 7 20:00:03 2018 UTC and is due to finish in 60 minutes. The chair is johnsom. Information about MeetBot at http://wiki.debian.org/MeetBot. | 20:00 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 20:00 |
*** openstack changes topic to " (Meeting topic: Octavia)" | 20:00 | |
openstack | The meeting name has been set to 'octavia' | 20:00 |
cgoncalves | Hi | 20:00 |
johnsom | Hi folks | 20:00 |
johnsom | Guess you aren't going to be late cgoncalves.... | 20:00 |
johnsom | We will see how many folks have the wrong time.. grin | 20:00 |
xgerman_ | o/ | 20:00 |
johnsom | #topic Announcements | 20:01 |
*** openstack changes topic to "Announcements (Meeting topic: Octavia)" | 20:01 | |
johnsom | OpenStack Berlin Summit is next week | 20:01 |
johnsom | We have some nice sessions: | 20:01 |
johnsom | https://wiki.openstack.org/wiki/Octavia/Weekly_Meeting_Agenda#Meeting_2018-10-31 | 20:01 |
johnsom | #link https://wiki.openstack.org/wiki/Octavia/Weekly_Meeting_Agenda#Meeting_2018-10-31 | 20:01 |
johnsom | Should we cancel next weeks meeting? I know Carlos and German will be at the Summit? | 20:02 |
xgerman_ | yeah, cancel! | 20:02 |
cgoncalves | yeah, we never know which bus we'll be taking on Wednesday evening ;) | 20:02 |
johnsom | So the next announcement: Next weeks meeting is canceled for the Summit. | 20:03 |
johnsom | lol | 20:03 |
johnsom | I will send out an e-mail after the meeting | 20:03 |
johnsom | I think that is all I have for announcements. Anyone else? | 20:03 |
nmagnezi | o/ sorry to be late | 20:04 |
johnsom | #topic Brief progress reports / bugs needing review | 20:04 |
nmagnezi | The joy of DST.. | 20:04 |
*** openstack changes topic to "Brief progress reports / bugs needing review (Meeting topic: Octavia)" | 20:04 | |
johnsom | Yep, DST.... | 20:05 |
johnsom | So I was out a few days on a long weekend trip. | 20:05 |
johnsom | I did get the basis for the octavia-lib work done before I left. | 20:05 |
johnsom | I still have some constants updates to work on. | 20:06 |
johnsom | This week I was looking at a bug that was reported for running tempest against OSP 13. Turned out to be some missing tempest.conf settings, so no work to be done there. | 20:06 |
johnsom | I plan to finish up the that octavia-lib work and then get started on flavors. | 20:07 |
johnsom | We still have a lot of open patches in need of reviews..... | 20:07 |
johnsom | #link https://review.openstack.org/#/c/589292/ | 20:07 |
johnsom | ^^^ that is a good one to get in to get the train rolling on getting IPv6 fixed and being tested. | 20:07 |
johnsom | Any other progress updates? | 20:08 |
nmagnezi | johnsom, re: tempest in osp13, this might help: https://review.openstack.org/#/c/571177/15/config_tempest/services/octavia.py@25 | 20:08 |
xgerman_ | like every week: #link https://review.openstack.org/#/c/613685/ #link https://review.openstack.org/#/c/585864/ and #link https://review.openstack.org/#/c/604479/ | 20:08 |
xgerman_ | reviews would be appreciated | 20:09 |
johnsom | nmagnezi Hmm, thanks! Didn't know that existed. So the member role is _member_ on OSP13? not just member? | 20:09 |
cgoncalves | po-tay-to, po-tah-to | 20:10 |
nmagnezi | Yeah, I think you'll actually find them both | 20:11 |
nmagnezi | I can double check | 20:11 |
johnsom | Well, I told the user to use member, so I hope it works.... grin | 20:11 |
nmagnezi | Kinda out sick now, so maybe cgoncalves can keep me honest here | 20:11 |
cgoncalves | I'm also not sure which one is set | 20:12 |
johnsom | Well, I guess I will find out if they come back..... | 20:12 |
cgoncalves | 'member', judging by a Rocky/OSP14 undercloud deployment | 20:12 |
rm_work | oops got distracted. hey o/ | 20:13 |
johnsom | Welcome back rm_work | 20:13 |
johnsom | Any other progress updates today? | 20:13 |
rm_work | ummm | 20:14 |
rm_work | this: | 20:14 |
openstackgerrit | Adam Harwell proposed openstack/octavia master: Fix possible state machine hole in failover https://review.openstack.org/616287 | 20:14 |
cgoncalves | yet again, nothing from my side. tripleo/puppet work continues | 20:14 |
johnsom | Ok, moving on then | 20:15 |
johnsom | #topic Open Discussion | 20:15 |
*** openstack changes topic to "Open Discussion (Meeting topic: Octavia)" | 20:15 | |
johnsom | Other topics today? | 20:15 |
rm_work | I won't be making it to the summit after all, if anyone was expecting me to be there still | 20:16 |
xgerman_ | That was quick today :-) | 20:16 |
rm_work | i think i have mentioned that | 20:16 |
cgoncalves | so I'll be stuck with German? :/ | 20:17 |
johnsom | Bummer. | 20:17 |
johnsom | Yep | 20:17 |
xgerman_ | +1 | 20:17 |
xgerman_ | cgoncalves: you won’t even notice I am there | 20:17 |
johnsom | You guys will nail it. I have faith. | 20:17 |
cgoncalves | xgerman_, I hope I will. we have 3 sessions to present together | 20:18 |
johnsom | Maybe I should send out a deprecation reminder e-mail on Friday. Just to make things lively for you all. | 20:18 |
xgerman_ | ok, I better show up then (halfway sober) | 20:18 |
xgerman_ | oh, my twitter feed already has someone who is looking forward to the talk (so he can tell us about his disasters) | 20:19 |
johnsom | Joy | 20:19 |
cgoncalves | johnsom, have you been contacted asking for some sort of delay of neutron-lbaas EOL? | 20:19 |
cgoncalves | I have not, just asking | 20:19 |
johnsom | Nope, full steam ahead! | 20:19 |
cgoncalves | nice! | 20:20 |
xgerman_ | yeah, let’s accelerate that ;-) | 20:21 |
johnsom | Ok, if we don't have any more topics, I will close it out. | 20:21 |
johnsom | I have posted the slides I said I would create for you all. Let me know if you want/need changes. | 20:21 |
*** aojea_ has quit IRC | 20:21 | |
*** aojea_ has joined #openstack-lbaas | 20:21 | |
johnsom | Thanks folks | 20:22 |
johnsom | #endmeeting | 20:22 |
*** openstack changes topic to "OpenStack PTG etherpad: https://etherpad.openstack.org/p/octavia-stein-ptg" | 20:22 | |
openstack | Meeting ended Wed Nov 7 20:22:50 2018 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 20:22 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/octavia/2018/octavia.2018-11-07-20.00.html | 20:22 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/octavia/2018/octavia.2018-11-07-20.00.txt | 20:22 |
openstack | Log: http://eavesdrop.openstack.org/meetings/octavia/2018/octavia.2018-11-07-20.00.log.html | 20:22 |
*** abaindur has joined #openstack-lbaas | 20:35 | |
*** abaindur has quit IRC | 20:35 | |
*** abaindur has joined #openstack-lbaas | 20:35 | |
*** aojea_ has quit IRC | 20:43 | |
*** aojea_ has joined #openstack-lbaas | 20:46 | |
*** hogepodge has quit IRC | 20:47 | |
*** mnaser has quit IRC | 20:48 | |
*** hogepodge has joined #openstack-lbaas | 20:48 | |
*** mnaser has joined #openstack-lbaas | 20:49 | |
*** LutzB has quit IRC | 20:54 | |
*** abaindur has quit IRC | 20:56 | |
*** abaindur has joined #openstack-lbaas | 20:56 | |
*** aojea_ has quit IRC | 20:56 | |
*** aojea_ has joined #openstack-lbaas | 20:58 | |
*** aojea_ has quit IRC | 21:05 | |
*** aojea_ has joined #openstack-lbaas | 21:06 | |
*** hyang has quit IRC | 21:08 | |
*** abaindur has quit IRC | 21:09 | |
*** abaindur has joined #openstack-lbaas | 21:11 | |
Swami | johnsom: here is the health-manager log. http://paste.openstack.org/show/734371/. | 21:27 |
*** salmankhan has joined #openstack-lbaas | 21:34 | |
johnsom | Swami Ok, I think I know what happened. So the HM decided the amphora was failed, can't say why exactly, but it attempted to do the failover. As part of that it expects nova to be able to start a VM inside the amp_active_retries and amp_active_wait_sec timeout period in the conf. I assume that since you rebooted the compute host, nova was unable to start an instance inside that time period, thus the aborted failover. | 21:38 |
johnsom | In Pike, that default timeout is 100 seconds | 21:39 |
*** hyang has joined #openstack-lbaas | 21:39 | |
*** yboaron has quit IRC | 21:41 | |
Swami | johnsom: So will it just retry for the same host, or will it try to schedule it to other nodes. Since we had 4 different compute nodes. | 21:44 |
johnsom | Swami That is up to nova. I would have thought it would have scheduled it on one of the other nodes.... | 21:44 |
Swami | johnsom: But my concern is the same. I thought, if the node where the Amphora was initially located has rebooted or is in a different state, should it not schedule it to another active node? | 21:45 |
*** ivve has quit IRC | 21:46 | |
johnsom | Swami Yes, nova should certainly do that. We have no control over the scheduling of the VM instances in nova. | 21:46 |
johnsom | At the time of the error, we had just asked nova to boot us a new VM. It should have done so on one of the working compute nodes. | 21:47 |
Swami | johnsom: So should I check the nova scheduler log to see if there was any message in there. | 21:48 |
johnsom | All I can see from our side is we waited 100 seconds for nova to start the VM, and then the timeout expired | 21:48 |
johnsom | Swami Yes, I would try to track what nova was doing with the boot request. | 21:50 |
Swami | johnsom: Thanks will check it out. | 21:50 |
*** hyang has joined #openstack-lbaas | 21:59 | |
*** fnaval has quit IRC | 22:54 | |
*** aojea_ has quit IRC | 23:24 | |
*** aojea_ has joined #openstack-lbaas | 23:27 | |
*** aojea_ has quit IRC | 23:32 | |
*** hyang has quit IRC | 23:37 | |
Swami | johnsom: I verified the nova-scheduler log and nova-compute log. It seems that the scheduler is scheduling to a different valid node that is active (which is expected). The VM is getting spawned on the compute host, and while it is getting spawned we see a delete being issued based on the health manager timeout. I posted some of the critical logs here. http://paste.openstack.org/show/734382/ | 23:53 |
johnsom | Swami When failover starts, the first thing it does is call delete on the failed amphora (in case the cloud only has space for one). It's the build call after that that nova failed | 23:54 |
Swami | johnsom: Why should it schedule first and then call delete? Am I missing something? | 23:56 |
johnsom | Swami It calls delete on the old instance. Calls nova create. Waits 100 seconds for nova to start the instance. Reports the failure, then will clean up any resources that were created. | 23:57 |
johnsom | See in your log, this is the key: | 23:57 |
johnsom | Expected: {\'task_state\': [u\'spawning\']}. Actual: {\'task_state\': u\'deleting\'}\\n"], "class": "UnexpectedDeletingTaskStateError"}', u'ending': True} | 23:58 |
Swami | johnsom: Yes I see the message. | 23:58 |
johnsom | The instance should have been in "spawning" but for whatever reason nova hadn't started it | 23:58 |
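A rough sketch of the wait described above, not the actual Octavia taskflow task; the retries and wait-seconds knobs correspond to the amp_active_retries and amp_active_wait_sec options johnsom mentioned, whose defaults presumably give the 100-second window:

```python
# Hedged sketch: poll nova until the replacement amphora VM goes ACTIVE,
# giving up after amp_active_retries polls spaced amp_active_wait_sec seconds
# apart. Both options live in octavia.conf; the [controller_worker] section is
# my recollection, so verify against your release's configuration reference.
import time


def wait_for_amphora_active(nova, server_id, retries=10, wait_sec=10):
    for _ in range(retries):
        server = nova.servers.get(server_id)   # python-novaclient style call
        if server.status == 'ACTIVE':
            return server
        if server.status == 'ERROR':
            raise RuntimeError('nova reported the amphora build as failed')
        time.sleep(wait_sec)
    raise RuntimeError('amphora VM %s not ACTIVE after %d seconds'
                       % (server_id, retries * wait_sec))
```

If nova routinely needs longer than that window to reschedule and boot on a surviving compute node, raising those two options is the knob Swami asks about next.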
Swami | johnsom: So do you think we should increase the time for amp_active_wait_sec | 23:59 |