ignaziocassano | Hello, could anyone please help with Octavia on Wallaby? | 08:55 |
---|---|---|
gthiemonge | ignaziocassano: hi | 08:56 |
ignaziocassano | the amphora agent logs "Closing connection." | 08:56 |
ignaziocassano | But if I try an openssl connect from the controllers, it works | 08:56 |
ignaziocassano | gthiemonge Hi | 08:57 |
ignaziocassano | I installed octavia using kolla wallaby | 08:57 |
ignaziocassano | the amphora image is the wallaby version | 08:58 |
ignaziocassano | the amphora agent can connect to port 5555 on the controllers | 08:58 |
ignaziocassano | I did not understand why it does not work | 08:58 |
ignaziocassano | Could anyone help me, please? | 09:00 |
gthiemonge | ignaziocassano: do you have any other errors in the system logs of the amphora? (not the amphora-agent logs) | 09:06 |
gthiemonge | it looks like the connection is established then it goes into timeout | 09:07 |
ignaziocassano | gthiemonge oh yes, in worker.log I have : 2021-07-07 10:39:07.751 19 WARNING octavia.amphorae.drivers.haproxy.rest_api_driver [-] Could not connect to instance. Retrying.: requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='10.102.191.17', port=9443): Read timed out. (read timeout=10.0) | 09:07 |
ignaziocassano | it retries many times, then it ends with octavia.amphorae.driver_exceptions.exceptions.TimeOutException: contacting the amphora timed out | 09:08 |
ignaziocassano | If I try to connect from the controller to the amphora agent on port 9443, it responds (I tried with telnet and with openssl) | 09:10 |
ignaziocassano | agargano: are you from Italy? | 09:12 |
gthiemonge | ignaziocassano: and any errors with 'journalctl -le' in the amphorae? I've seen some similar errors in the past, because of memory allocation errors in haproxy | 09:13 |
ignaziocassano | <gthiemonge> I am going to verify | 09:14 |
ignaziocassano | gthiemonge: no errors with journalctl -le | 09:20 |
ignaziocassano | agargano: do you have the same issue? | 09:21 |
agargano | ignaziocassano: yes | 09:21 |
gthiemonge | ignaziocassano: do you use ACTIVE_STANDBY loadbalancers? | 09:24 |
ignaziocassano | gthiemonge: the topology is SINGLE | 09:25 |
gthiemonge | ignaziocassano: so when I see those messages (Read timed out), it's either an issue in the amphora (in that case you should see some logs about that), or a connectivity issue with the amphora (basically a cloud outage) | 09:39 |
ignaziocassano | gthiemonge: I created the image twice. I'll try to create it on CentOS | 09:41 |
ignaziocassano | the problem was the mtu | 10:44 |
ignaziocassano | gthiemonge: the problem was the mtu: setting it to 1550 works fine | 10:45 |
ignaziocassano | sorry 1500 | 10:45 |
ignaziocassano | it is strange because the network is 9000, probably some switch port is not 9000 | 10:45 |
frickler | johnsom: focal has haproxy 2.0.13, is there any news on https://storyboard.openstack.org/#!/story/1650270 (haproxy multi cpu)? | 14:19 |
johnsom | frickler So, HAProxy decided to enable the (excellent) multi-threading code path by default for 2.x. So, if you build an amp with multi-core you will immediately get the benefit. | 14:20 |
johnsom | frickler We have various optimization ideas on top of that on the Octavia roadmap, such as interrupt CPU pinning, etc. but I don't think anyone is working on that yet. Basically enhancements in the amphora agent to implement optimizations. | 14:22 |
johnsom | frickler Also, we highly recommend not enabling nbproc. There are many issues with it. | 14:23 |
frickler | johnsom: so that means an amphora based on focal will use multiple CPUs by default, do I understand this correctly? | 14:31 |
johnsom | frickler That is correct | 14:32 |
frickler | johnsom: great, thx | 14:32 |
johnsom | Any amphora with 2.x or newer will use the cores | 14:32 |
opendevreview | Merged openstack/octavia-dashboard master: Drop horizon-nodejs10-jobs template https://review.opendev.org/c/openstack/octavia-dashboard/+/795594 | 15:39 |
gthiemonge | #startmeeting Octavia | 16:00 |
opendevmeet | Meeting started Wed Jul 7 16:00:42 2021 UTC and is due to finish in 60 minutes. The chair is gthiemonge. Information about MeetBot at http://wiki.debian.org/MeetBot. | 16:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 16:00 |
opendevmeet | The meeting name has been set to 'octavia' | 16:00 |
johnsom | o/ | 16:00 |
gthiemonge | Hi everyone | 16:00 |
johnsom | Some folks are on vacation I think, might be a quiet week | 16:01 |
haleyb | hi | 16:01 |
gthiemonge | Don't be shy | 16:01 |
gthiemonge | #topic Announcements | 16:02 |
gthiemonge | Next week is Xena-2 milestone | 16:02 |
gthiemonge | perhaps we need to publish an intermediate release for python-octaviaclient | 16:03 |
gthiemonge | johnsom: ^ | 16:03 |
gthiemonge | we missed it for Xena-1 | 16:03 |
johnsom | Yeah, we have some good backports/patches as well, so worth a release | 16:03 |
gthiemonge | ok, I'll need your help for it | 16:04 |
johnsom | Ok, ping me | 16:04 |
gthiemonge | johnsom: thanks | 16:04 |
johnsom | Shameless plug: | 16:04 |
johnsom | #link https://review.opendev.org/q/project:openstack/python-octaviaclient+status:open+owner:johnsomor%2540gmail.com | 16:04 |
gthiemonge | yeah we need to get those commits | 16:05 |
gthiemonge | other core reviewers are on vacation | 16:06 |
gthiemonge | any other announcements? | 16:07 |
gthiemonge | #topic Brief progress reports / bugs needing review | 16:09 |
gthiemonge | I have spent most of my time on downstream stuff | 16:09 |
gthiemonge | but I've also been working on fixing the two-node job and the centos stream job | 16:10 |
gthiemonge | centos stream support is now broken on devstack (issue with libvirt bindings) I hope it will be fixed soon | 16:10 |
johnsom | Yeah, I had some vacation time. Not much new on Octavia | 16:10 |
gthiemonge | and two-node job is still V-1 because of unrelated CI issues | 16:10 |
haleyb | i've been working on the health monitor backport for OVN, and trying to fix its gates, plan on getting to multi-VIP rebase this week | 16:12 |
gthiemonge | haleyb: great! | 16:12 |
haleyb | gthiemonge: i have to review your network interface change again as well | 16:12 |
gthiemonge | I'll have to rebase it, it's in merge conflict :/ | 16:13 |
gthiemonge | but the conflict is probably in a file that I have deleted | 16:14 |
johnsom | Yeah, I don't think I got back to that review yet either | 16:14 |
johnsom | sigh | 16:14 |
gthiemonge | FYI a revert has been proposed in devstack | 16:14 |
gthiemonge | #link https://review.opendev.org/c/openstack/devstack/+/799251 | 16:15 |
gthiemonge | merging this commit would break our IPV6-based scenario tests | 16:15 |
johnsom | Joy | 16:16 |
gthiemonge | #topic Open Discussion | 16:18 |
gthiemonge | one last topic about CI issues: | 16:18 |
gthiemonge | some jobs (noop-api) have been frequently failing in timeout since the beginning of June | 16:18 |
gthiemonge | it seems that the duration of those jobs has increased a lot | 16:18 |
gthiemonge | for instance the duration of the scoped-tokens jobs was between 1h30 and 1h45 last month | 16:19 |
gthiemonge | it is now around 2h | 16:19 |
gthiemonge | #link https://zuul.openstack.org/builds?job_name=octavia-v2-dsvm-noop-api-scoped-tokens&project=openstack%2Foctavia-tempest-plugin | 16:19 |
gthiemonge | and sometimes it hits the timeout (2h15) | 16:19 |
gthiemonge | it seems that something has changed between June 6th and 11th | 16:20 |
johnsom | sqlalchemy 1.4 was added to upper constraints June 9th | 16:20 |
johnsom | #link https://review.opendev.org/c/openstack/requirements/+/788339 | 16:21 |
gthiemonge | yeah, I checked the new octavia/octavia-tempest-plugin/devstack commits, I didn't see anything there | 16:22 |
gthiemonge | so sqlalchemy might be a suspect | 16:23 |
gthiemonge | My first idea was to increase the value of the timeout for the noop-api jobs (+15 or 30min) | 16:24 |
gthiemonge | but I believe that johnsom has some smarter ideas ;-) | 16:24 |
johnsom | That means we wait longer.... sad face | 16:24 |
johnsom | My thoughts would be to try to narrow down what is slow. If it's the SQL calls, see if there is some new magic incantation we missed (probably not). Then I might explore moving the sqlite file to /dev/shm so it runs in RAM vs. the slow disks on these test hosts. Though RAM is certainly a premium on these test nodes, so that may not be a good idea. Frankly I have no idea what size that file gets during a test run. | 16:25 |
johnsom | Or we split the jobs up more. | 16:25 |
johnsom | These are no-op, so really should be darn fast. | 16:26 |
gthiemonge | 573 tests :D | 16:26 |
johnsom | That isn't that much. grin | 16:27 |
gthiemonge | johnsom: I can work on your first point: analyzing what is slow | 16:28 |
johnsom | Cool. I just can't spare the time right now | 16:28 |
gthiemonge | perhaps we could isolate one part of the code that is problematic | 16:28 |
johnsom | I would be curious to know. | 16:31 |
gthiemonge | if I don't find anything, we can try to move the sqlite file | 16:31 |
gthiemonge | or just increase the timeout | 16:31 |
gthiemonge | yeah | 16:31 |
johnsom | FYI, there is a comment on the discuss mailing list: | 16:32 |
johnsom | #link http://lists.openstack.org/pipermail/openstack-discuss/2021-July/023502.html | 16:32 |
johnsom | About barbican integration in dashboard. | 16:32 |
johnsom | I don't have a stack with Octavia and dashboard at the moment to test this out. | 16:33 |
gthiemonge | this is weird, because barbicanAPI.getCertificates is part of octavia-dashboard | 16:33 |
gthiemonge | johnsom: I tried to reproduce it on master, I didn't find anything | 16:33 |
johnsom | There is a code check in the plugin that looks in the service catalog for key-manager, that should disable any calls to the barbican client. But this seemed like some javascript issues I don't understand and would need to run it myself. | 16:34 |
gthiemonge | hmm | 16:34 |
johnsom | It's possible that message comes out on the console all the time, but doesn't impact functionality | 16:34 |
gthiemonge | I will take another look | 16:36 |
johnsom | In my experience, almost every page spews something in the browser console these days. Most of it doesn't matter. | 16:36 |
gthiemonge | any other topics for this meeting? | 16:41 |
johnsom | That is all I have today. | 16:41 |
gthiemonge | ok | 16:42 |
gthiemonge | thanks everyone! | 16:42 |
gthiemonge | #endmeeting | 16:42 |
opendevmeet | Meeting ended Wed Jul 7 16:42:53 2021 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 16:42 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/octavia/2021/octavia.2021-07-07-16.00.html | 16:42 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/octavia/2021/octavia.2021-07-07-16.00.txt | 16:42 |
opendevmeet | Log: https://meetings.opendev.org/meetings/octavia/2021/octavia.2021-07-07-16.00.log.html | 16:42 |
johnsom | Thanks Greg | 16:43 |
rm_work | +W'd your client backports | 16:51 |
johnsom | Cool, thanks! | 16:54 |
opendevreview | Merged openstack/python-octaviaclient stable/wallaby: Support pagination for 'list' API calls https://review.opendev.org/c/openstack/python-octaviaclient/+/796200 | 17:02 |
opendevreview | Merged openstack/python-octaviaclient stable/wallaby: Improve the client performance on large clouds https://review.opendev.org/c/openstack/python-octaviaclient/+/796203 | 17:04 |
opendevreview | Merged openstack/python-octaviaclient stable/victoria: Support pagination for 'list' API calls https://review.opendev.org/c/openstack/python-octaviaclient/+/796201 | 17:05 |
opendevreview | Merged openstack/python-octaviaclient stable/victoria: Improve the client performance on large clouds https://review.opendev.org/c/openstack/python-octaviaclient/+/796204 | 17:17 |
opendevreview | Merged openstack/python-octaviaclient stable/ussuri: Support pagination for 'list' API calls https://review.opendev.org/c/openstack/python-octaviaclient/+/796202 | 17:17 |
opendevreview | Merged openstack/python-octaviaclient stable/ussuri: Improve the client performance on large clouds https://review.opendev.org/c/openstack/python-octaviaclient/+/796205 | 17:20 |
opendevreview | Merged openstack/python-octaviaclient stable/train: Support pagination for 'list' API calls https://review.opendev.org/c/openstack/python-octaviaclient/+/796161 | 17:20 |
opendevreview | Merged openstack/python-octaviaclient stable/train: Improve the client performance on large clouds https://review.opendev.org/c/openstack/python-octaviaclient/+/796162 | 17:20 |
opendevreview | Douglas Mendizábal proposed openstack/octavia master: Replace md5 for fips https://review.opendev.org/c/openstack/octavia/+/798146 | 18:35 |
*** gthiemon1e is now known as gthiemonge | 20:57 |
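
The MTU problem reported at 10:44 (a network configured for 9000 where some hop apparently only passes 1500) is the kind of fault that lets small probes such as telnet or an openssl handshake succeed while larger responses time out. A minimal sketch of how one might check this from a controller, assuming IPv4 without IP options and the Linux iputils `ping` (the helper names are hypothetical, and the amphora address is the one from the worker.log line above): compute the largest ICMP payload that fits in a single unfragmented packet and ping with the don't-fragment flag set.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo request header


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    if mtu <= IPV4_HEADER + ICMP_HEADER:
        raise ValueError(f"MTU {mtu} is too small to carry an ICMP echo")
    return mtu - IPV4_HEADER - ICMP_HEADER


def ping_command(host: str, mtu: int) -> list[str]:
    """Build an iputils ping command that forbids fragmentation (-M do)."""
    return ["ping", "-M", "do", "-c", "3", "-s", str(max_ping_payload(mtu)), host]


if __name__ == "__main__":
    # Hypothetical check: compare the claimed path MTU with the fallback value.
    for mtu in (9000, 1500):
        print(mtu, " ".join(ping_command("10.102.191.17", mtu)))
```

If the command built for 9000 fails while the one for 1500 succeeds, some device on the path is probably not configured for jumbo frames, which would match what ignaziocassano suspected about a switch port.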