*** ysandeep|out is now known as ysandeep|ruck | 03:12 | |
*** raukadah is now known as chandankumar | 04:22 | |
*** ysandeep|ruck is now known as ysandeep|ruck|afk | 04:32 | |
*** ysandeep|ruck|afk is now known as ysandeep|ruck | 05:08 | |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_rally master: Return upgrade jobs to voting https://review.opendev.org/c/openstack/openstack-ansible-os_rally/+/848778 | 06:43 |
---|---|---|
jrosser_ | morning | 08:23 |
*** ysandeep|ruck is now known as ysandeep|ruck|lunch | 08:43 | |
noonedeadpunk | o/ | 09:04 |
damiandabrowski[m] | hi! | 09:05 |
jrosser_ | doh got to be careful with those +W regarding gate queues | 09:08 |
noonedeadpunk | oh yes | 09:08 |
jrosser_ | i still am 50/50 about if the shared queue is a good idea or no | 09:09 |
jrosser_ | like random db errors and MODULE FAILURE still happens too much | 09:09 |
jrosser_ | and those are soooo hard to debug, i don't know what to do about them | 09:09 |
jrosser_ | it was on my mind to make an AIO and take that MODULE FAILURE prone facts gathering task and just run it endlessly in a loop | 09:10 |
jrosser_ | "see stdout stderr for details" it says and theres just nothing to see | 09:11 |
noonedeadpunk | oh yes, it's weird | 09:39 |
noonedeadpunk | regarding shared queue - yeah, I don't know. But I'd tried it just to be sure it doesn't work for us indeed. | 09:40 |
noonedeadpunk | As zuul and infra folks were quite convincing about need of that | 09:41 |
jrosser_ | yeah, and it then also becomes quite important which order we +W things in | 09:42 |
noonedeadpunk | And I don't think this is smth you can catch locally... | 09:42 |
jrosser_ | which is a total change of workflow for all of us | 09:42 |
noonedeadpunk | Oh really? I thought for some reason that without shared queues +w is important and at least that will be fixed... | 09:43 |
jrosser_ | +W order (and maybe something to do with topics?) decides the order the patches stack up in the gate queue i think | 09:43 |
jrosser_ | i was 8-O about that | 09:43 |
noonedeadpunk | but how that would affect our workflow... | 09:44 |
jrosser_ | well we allow many things to go in parallel | 09:44 |
noonedeadpunk | we still have depends-on and when it's defined queue should understand that | 09:44 |
jrosser_ | and quite often we make a mistake with dependencies or something that causes an os_<> role patch to fail | 09:44 |
jrosser_ | anyway | 09:45 |
noonedeadpunk | yeah, dunno | 09:45 |
noonedeadpunk | I mean it's controversial for sure | 09:45 |
noonedeadpunk | But dunno if we should try that or not | 09:46 |
*** ysandeep|ruck|lunch is now known as ysandeep|ruck | 09:53 | |
opendevreview | Merged openstack/openstack-ansible-os_rally master: Control rally-openstack installed version https://review.opendev.org/c/openstack/openstack-ansible-os_rally/+/848666 | 10:35 |
*** dviroel|out is now known as dviroel | 11:25 | |
mgariepy | woohoo !!! openstack-ansible 25.0.0: Ansible playbooks for deploying OpenStack | 14:32 |
jrosser_ | deploy! | 14:39 |
mgariepy | i usually wait a couple of months :D | 14:40 |
*** dviroel is now known as dviroel|lunch | 14:59 | |
b1tsh1ft3r | Running train release here, is there a specific tag or playbook for removing a single controller node from the cluster? I know one exists for removing/adding compute nodes. | 15:08 |
noonedeadpunk | b1tsh1ft3r: I'm pretty sure we don't have anything to remove compute, only to add | 15:16 |
jrosser_ | removing a controller is pretty challenging too | 15:16 |
jrosser_ | depending on what you want to do it is better to "replace it in place" rather than try to remove | 15:17 |
b1tsh1ft3r | Well.. looking to remove it entirely to use the gear somewhere else. Best i could do for now was change up haproxy to not use the services on the node and then power it off for now. | 15:21 |
jrosser_ | it could easily be that the memcached config in all the hosts needs updating | 15:22 |
jrosser_ | perhaps oslo.cache deals with a missing server, not sure | 15:23 |
*** ysandeep|ruck is now known as ysandeep|dinner | 15:28 | |
b1tsh1ft3r | noonedeadpunk im thinking of the openstack-ansible-ops remove_compute_node.yml playbook. | 15:29 |
noonedeadpunk | oh, well. I didn't know about it lol | 15:31 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: Add default rate-limits for API endpoints and Horizon authentication https://review.opendev.org/c/openstack/openstack-ansible/+/848659 | 15:31 |
b1tsh1ft3r | Heh, no worry. I figured if the compute node removal playbook existed, surely a controller would. Oh well | 15:32 |
jrosser_ | i think it doesn't exist because to do it properly is really tricky | 15:33 |
b1tsh1ft3r | Yeah i could see that for sure. It's tied into quite a lot. | 15:33 |
*** ysandeep|dinner is now known as ysandeep | 16:03 | |
*** dviroel|lunch is now known as dviroel | 16:08 | |
*** ysandeep is now known as ysandeep|out | 16:08 | |
spatel | anyone here from cumulus world? | 18:08 |
spatel | i need help to setup switch | 18:09 |
mgariepy | i know a little bit | 18:26 |
mgariepy | what do you need ? | 18:26 |
mgariepy | spatel, ^^ | 18:40 |
spatel | give me a sec.. | 18:45 |
spatel | mgariepy i am learning cumulus linux | 19:11 |
spatel | i used vagrant to bring up one cumulus linux | 19:11 |
spatel | but it's not allowing me to run "net add" etc.. (100% related to permission or privilege issue) | 19:12 |
spatel | trying to understand how do i give full access to vagrant user so it can run all NCLU commands | 19:13 |
spatel | This is very nice doc but somehow not working for me - https://docs.nvidia.com/networking-ethernet-software/cumulus-linux-41/System-Configuration/Network-Command-Line-Utility-NCLU/#:~:text=To%20add%20a%20new%20user,group%20%60netedit'%20...&text=You%20can%20use%20the%20adduser%20command%20for%20local%20user%20accounts%20only. | 19:13 |
spatel | users_with_edit = root, cumulus | 19:15 |
spatel | groups_with_edit = netedit | 19:15 |
spatel | https://paste.opendev.org/show/bDI0aMXdlkoInJkR1d1E/ | 19:16 |
mgariepy | ho. no idea. we do run it on our switches and we do always use the cumulus user. | 19:18 |
mgariepy | our testbed are on gns3 | 19:18 |
mgariepy | we do have some ansible playbook to manage the configuration | 19:19 |
spatel | I did switch to cumulus user but still same issue | 19:20 |
mgariepy | you probably need to complete the command | 19:20 |
mgariepy | `net show configuration commands` | 19:21 |
mgariepy | does that display the running config ? | 19:21 |
mgariepy | you also should have tab completion | 19:22 |
noonedeadpunk | spatel: never trust nvidia docs :p | 19:22 |
spatel | :( | 19:22 |
mgariepy | lol. | 19:22 |
spatel | They own cumulus lol | 19:23 |
noonedeadpunk | that's what I learned working with vGPUs | 19:23 |
mgariepy | and yet they push for sonicos. | 19:23 |
noonedeadpunk | they also own GRID licensing. Doesn't mean their docs are always relevant | 19:23 |
mgariepy | https://en.wikipedia.org/wiki/SONiC_(operating_system) | 19:23 |
noonedeadpunk | and not misleading or just wrong | 19:23 |
noonedeadpunk | btw, if they own cumulus, why do they ship their drivers in packed qcow images of Ubuntu? | 19:24 |
noonedeadpunk | They can't get it working as well? :p | 19:25 |
noonedeadpunk | s/drivers/license server | 19:25 |
spatel | hmm | 19:26 |
mgariepy | lol. | 19:26 |
noonedeadpunk | sry, just got some beer and every time I hear nvidia I feel so frustrated with them.... | 19:26 |
spatel | I am building ovn lab using cumulus spine-leaf network | 19:26 |
mgariepy | i've seen quite a lot of ugly stuff from corporate trying to do linux stuff | 19:27 |
mgariepy | so that `net show configuration commands` does it works ?? | 19:28 |
spatel | all net show command works | 19:28 |
mgariepy | ok that's a good start | 19:28 |
spatel | only net add or privilege command not working :( | 19:28 |
mgariepy | from cumulus and vagrant user? | 19:28 |
spatel | cumulus@spine-1:~$ id cumulus | 19:28 |
spatel | uid=1000(cumulus) gid=1000(cumulus) groups=1000(cumulus),24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev),993(netedit) | 19:28 |
spatel | I am cumulus user and it has permission of netedit group | 19:29 |
mgariepy | then ? what does it output ? `net add hostname spine1`? | 19:30 |
spatel | do i need to restart any service etc ? | 19:30 |
mgariepy | if you edit the configuration of certain services i think you do. | 19:31 |
spatel | https://paste.opendev.org/show/bg8FKSb9lZ8q0wgMuo2s/ | 19:31 |
spatel | it doesn't understand "add" | 19:31 |
mgariepy | net help ? | 19:31 |
spatel | net show config works | 19:31 |
spatel | https://paste.opendev.org/show/bKrfc6jr6Wtz4pi9vGdG/ | 19:32 |
mgariepy | hmm. weird. | 19:32 |
mgariepy | what pkg do you have installed? | 19:32 |
spatel | I just did vagrant init cumulus-vx | 19:34 |
spatel | vagrant up | 19:34 |
spatel | and machine was ready in few min | 19:34 |
jamesdenton | what version of cumulus do you end up with on that? | 19:35 |
jrosser | i never did anything more that run up one of the examples but there is stuff here https://air.nvidia.com/Login | 19:35 |
jrosser | nvidia air is a virtualised network-lab-as-a-service you can try stuff out in | 19:36 |
spatel | Cumulus Linux 5.1.0 | 19:36 |
mgariepy | installing vagrant and virtual box on a spare server to test it | 19:37 |
spatel | jrosser I am trying to build lab on my desktop | 19:37 |
jamesdenton | https://docs.nvidia.com/networking-ethernet-software/cumulus-linux-50/System-Configuration/NVIDIA-User-Experience-NVUE/ | 19:37 |
spatel | Here is the lab which i am trying to mimic using their vagrant files - https://ltomasbo.wordpress.com/2021/02/04/ovn-bgp-agent-testing-setup/ | 19:38 |
jamesdenton | nclu may be deprecated now? | 19:38 |
mgariepy | nclu has been replaced by curl ? | 19:39 |
jamesdenton | lol | 19:39 |
jamesdenton | plausible. but, NVUE CLI | 19:39 |
jamesdenton | "In addition to the nv show commands, Cumulus Linux continues to provide a subset of the NCLU net show commands." | 19:39 |
mgariepy | hahah | 19:39 |
jamesdenton | IIRC this change made me migrate my lone Cumulus-based switch -> SONiC | 19:40 |
mgariepy | yeah | 19:41 |
mgariepy | SONiC seems quite nice. | 19:41 |
noonedeadpunk | as I said - never trust nvidia docs :D | 19:41 |
mgariepy | and the cumulus support is just... | 19:41 |
mgariepy | terrible .. | 19:41 |
jamesdenton | last convo i had with Nvidia, my takeaway was they were going to double down on cumulus and maybe slowly pull away from Onyx | 19:42 |
jamesdenton | so, maybe it will get better | 19:42 |
mgariepy | we have a site with probably close to 100 of cumulus switch. | 19:43 |
mgariepy | when the network engineer needs support it's always painful | 19:43 |
jamesdenton | how do you manage them? | 19:44 |
mgariepy | they are managed via ansible | 19:44 |
mgariepy | some homemade playbooks. | 19:44 |
jamesdenton | how do you store their configuration? in inventory files or do you have some better way? | 19:45 |
mgariepy | it's somewhat bad and in inventory... | 19:45 |
spatel | let me post this issue in reddit or stackoverflow to see what other folks thinking | 19:47 |
mgariepy | does `nv set system` works? | 19:48 |
spatel | if nclu is deprecated then what is the alternative way? | 19:48 |
mgariepy | https://docs.nvidia.com/networking-ethernet-software/cumulus-linux-51/System-Configuration/NVIDIA-User-Experience-NVUE/NVUE-CLI/ | 19:48 |
mgariepy | this ? ^^ | 19:48 |
spatel | i ran this command - nv set system and it works but what is this? | 19:48 |
spatel | no error in command | 19:48 |
mgariepy | nv set system hostname leaf01 | 19:48 |
mgariepy | that's the new command.. | 19:48 |
spatel | it works - nv set system hostname spine-1 | 19:49 |
spatel | damn, now i need to translate all net add to nv stuff | 19:49 |
mgariepy | cumulus linux 5 is nv now .. | 19:49 |
mgariepy | so.. | 19:49 |
spatel | noonedeadpunk agreed never trust nvidia doc | 19:50 |
mgariepy | well you were not reading the right doc ;p | 19:50 |
* mgariepy not defending nvidia | 19:50 | |
spatel | This is handy - https://docs.nvidia.com/networking-ethernet-software/knowledge-base/Configuration-and-Usage/Network-Configuration/NCLU-to-NVUE-Commands/ | 19:50 |
spatel | They should update doc (with if / else version ) :D | 19:51 |
mgariepy | the version is in the url ;p | 19:51 |
mgariepy | save you some trouble.. go with sonic :P | 19:52 |
mgariepy | jamesdenton, is sonic working correctly on your side ? | 19:52 |
spatel | sonic switches? | 19:52 |
mgariepy | i did try to convince the network guy to try it but i'm not sure it will be done. | 19:53 |
mgariepy | https://github.com/sonic-net/SONiC/blob/master/doc/SONiC-User-Manual.md#sonic-user-manual | 19:53 |
jamesdenton | "working correctly" | 19:54 |
jamesdenton | if you mean, forwarding traffic, then yes | 19:55 |
jamesdenton | i have it installed on an SN2100 | 19:55 |
mgariepy | what no advanced configuration ? | 19:55 |
jamesdenton | i think i had immediate issues with some of the CLI commands, and needed to patch them. But nothing too advanced, no. | 19:55 |
jamesdenton | i've only got 1. and it's powered off right now to save me a few bucks in power costs | 19:56 |
spatel | what HW are you guys running sonic ? | 19:58 |
jamesdenton | Mellanox/NVIDIA SN2100 here | 19:58 |
spatel | hmm | 19:58 |
jamesdenton | i can only imagine support is worse than Cumulus Linux, though, since it's DIY | 19:59 |
mgariepy | well at least sonic is open.. | 20:00 |
mgariepy | not a big fan of support contract here.. | 20:00 |
jamesdenton | true | 20:01 |
mgariepy | that's why i love osa :) haha | 20:01 |
noonedeadpunk | mgariepy: they should have training for finding right doc | 20:02 |
noonedeadpunk | which you also should never trust:D | 20:02 |
mgariepy | lol | 20:02 |
mgariepy | yeah or just.. net add .. error. you should switch to nv instead.. lol | 20:03 |
spatel | nv also giving tough time.. crashing daemon | 20:08 |
spatel | I am going to switch image version to 3.x | 20:08 |
jamesdenton | 3.x. The dark ages. | 20:09 |
mgariepy | yep. | 20:09 |
mgariepy | indeed it's quite old. | 20:09 |
mgariepy | have a nice weekend guys. i'm out until monday. | 20:09 |
jamesdenton | see ya! enjoy your time off | 20:09 |
mgariepy | thanks | 20:09 |
spatel | I want to match with author version for POC. i am not going to use cumulus in my life... hehe | 20:09 |
noonedeadpunk | oh, that's sweet! have great time! | 20:09 |
spatel | This is for simplicity.. otherwise my goal is to use Cisco lab | 20:10 |
noonedeadpunk | *a great time | 20:10 |
noonedeadpunk | I see you love pain spatel :D | 20:10 |
jamesdenton | it's a good learning exercise, if it works | 20:11 |
spatel | I will make it work so we can implement in OSA :) | 20:11 |
jamesdenton | -1 | 20:12 |
noonedeadpunk | works for me then :) | 20:12 |
jamesdenton | :D | 20:12 |
spatel | -1 scare the shit out of me...lol | 20:12 |
noonedeadpunk | doesn't reject my previous replica though hehe | 20:12 |
noonedeadpunk | -1 should not. jamesdenton was generous and haven't put -2 :[ | 20:12 |
spatel | lol | 20:13 |
spatel | ok guys time for some beers... my brain exploding watching errors whole day :D | 20:14 |
noonedeadpunk | cheers! | 20:15 |
spatel | i will see you tomorrow!! good night | 20:15 |
jamesdenton | enjoy | 20:15 |
*** dviroel is now known as dviroel|out | 20:50 |