Monday, 2020-03-30

<openstackgerrit> Ian Wienand proposed zuul/zuul-jobs master: test-upload-logs-swift: revert download script  https://review.opendev.org/715755  [02:11]
<openstackgerrit> Ian Wienand proposed zuul/zuul-jobs master: bulk-download : role with script to download all log files  https://review.opendev.org/715756  [02:11]
<kevinz> ping ianw: Hi  [03:28]
<kevinz> Recently there have been some node failures in Linaro US: http://zuul.openstack.org/builds?job_name=kolla-build-debian-source-aarch64&job_name=kolla-publish-debian-source-aarch64&job_name=kolla-ansible-debian-source-aarch64  [03:28]
<ianw> kevinz: ok, give me a sec and i can poke at some logs  [03:29]
<ianw> kevinz: here's one of the failures -> http://paste.openstack.org/show/791301/  [03:44]
<ianw> kevinz: looks like that's pretty consistent, floating ip failures  [03:44]
<openstackgerrit> Ian Wienand proposed zuul/zuul-jobs master: local-log-download : role with script to download all log files  https://review.opendev.org/715756  [04:13]
<openstackgerrit> Ian Wienand proposed zuul/zuul-jobs master: local-log-download : role with script to download all log files  https://review.opendev.org/715756  [04:44]
<openstackgerrit> Ian Wienand proposed zuul/zuul-jobs master: local-log-download : role with script to download all log files  https://review.opendev.org/715756  [04:51]
<chandankumar> ianw, Hello  [04:52]
<chandankumar> ianw, we created the openstack-tempest-skiplist repo yesterday https://review.opendev.org/#/c/713809/ - please add me to the reviewer group for this project: https://review.opendev.org/#/admin/groups/2083,members  [04:53]
<ianw> chandankumar: np, should be done  [04:55]
<chandankumar> ianw, thanks :-)  [04:55]
<openstackgerrit> Ian Wienand proposed zuul/zuul-jobs master: local-log-download : role with script to download all log files  https://review.opendev.org/715756  [05:00]
<kevinz> ianw: thanks a lot! Is the floating IP essential? I thought it would use IPv6 public IPs only  [06:03]
<kevinz> since we don't have enough floating ips actually  [06:04]
<kevinz> ianw: could we set Linaro US not to use floating ips?  [06:06]
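For reference: nodepool's OpenStack driver exposes a pool-level switch for exactly this. A rough sketch of what turning floating IPs off for the Linaro US pool might look like -- provider, pool, and label names here are illustrative, not the real configuration:

    # nodepool.yaml (sketch); assumes the pool-level auto-floating-ip option
    providers:
      - name: linaro-us          # illustrative provider name
        driver: openstack
        cloud: linaro-us
        pools:
          - name: main
            # rely on the tenant network's routable (IPv6) addressing
            # instead of attaching a floating IP to every node
            auto-floating-ip: false
            labels:
              - name: ubuntu-bionic-arm64
                diskimage: ubuntu-bionic-arm64
                flavor-name: ci-standard   # illustrative flavor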
<openstackgerrit> OpenStack Proposal Bot proposed openstack/project-config master: Normalize projects.yaml  https://review.opendev.org/715772  [06:08]
<openstackgerrit> Merged openstack/project-config master: Normalize projects.yaml  https://review.opendev.org/715772  [06:37]
<dtantsur> AJaeger: morning! thanks for your suggestion on the ironic hacking 2.0 patch, totally missed it. I wonder if somebody has to update https://docs.openstack.org/hacking/latest/user/usage.html#local-checks  [08:53]
<AJaeger> dtantsur: yes, let me do the update... Check topic:update-hacking for my weekend fun ;)  [09:28]
<dtantsur> AJaeger: oh, you did have a lot of fun :)  [09:29]
<dtantsur> AJaeger: FYI we're handling ironic projects now, no need to bother with them (unless we miss something)  [09:29]
<AJaeger> dtantsur: Great! One less on my plate!  [09:30]
<dtantsur> I can assure you, you don't want to fix W504 all over the ironic codebase :D  [09:31]
<AJaeger> dtantsur: I disabled W504 everywhere ;)  [09:31]
<dtantsur> I quite like to have either 503 or 504 enabled for consistency  [09:31]
<dtantsur> (and 504 seems to be preferred, apparently)  [09:31]
<AJaeger> Oh, there's hacking 3.0 out???  [09:31]
<dtantsur> WUT  [09:31]
<dtantsur> rpittau: we're too slow ^^^  [09:32]
<rpittau> what the...........  [09:32]
* rpittau flips table  [09:32]
<dtantsur> by the time we update ironic, hacking will have as many versions as firefox  [09:32]
<rpittau> lol  [09:32]
<AJaeger> just minimal changes, shouldn't hurt us.  [09:32]
<dtantsur> ooookay, lemme update my patches while they're not yet numerous  [09:32]
<rpittau> famous last words? :P  [09:32]
<rpittau> so we go for 3.0?  [09:33]
<dtantsur> rpittau: let's try?  [09:33]
<rpittau> let's  [09:33]
<dtantsur> AJaeger: do they have a changelog other than git log?  [09:33]
<openstackgerrit> Sorin Sbarnea proposed zuul/zuul-jobs master: Allow configure-mirrors to enable extra repos  https://review.opendev.org/693887  [09:40]
<openstackgerrit> Sorin Sbarnea proposed zuul/zuul-jobs master: Allow configure-mirrors to enable extra repos  https://review.opendev.org/693887  [09:41]
<openstackgerrit> Andreas Jaeger proposed openstack/hacking master: Document new way of registering local plugins  https://review.opendev.org/715894  [09:42]
<AJaeger> dtantsur: I'm not aware of anything besides git log/review.opendev.org for hacking  [09:43]
<AJaeger> dtantsur: please review 715894 to address the point you raised  [09:44]
<dtantsur> thx!  [09:44]
<openstackgerrit> Sorin Sbarnea proposed zuul/zuul-jobs master: Improve job and node information banner  https://review.opendev.org/677971  [09:47]
<openstackgerrit> Andreas Jaeger proposed openstack/hacking master: Document new way of registering local plugins  https://review.opendev.org/715894  [09:48]
<openstackgerrit> Sorin Sbarnea proposed zuul/zuul-jobs master: Avoid confusing rsync errors when source folders are missing  https://review.opendev.org/670044  [09:52]
<AJaeger> dtantsur: I've done bifrost, see https://review.opendev.org/715617  [10:17]
<dtantsur> thanks AJaeger  [10:18]
<AJaeger> dtantsur: I think that's the only ironic one I did - I leave the rest to you and rpittau ;). Now updating for hacking 3.0  [10:18]
<rpittau> AJaeger: sounds good :)  [10:20]
<openstackgerrit> Matthew Treinish proposed openstack/pbr master: Update python requires packaging metadata for package  https://review.opendev.org/715917  [11:23]
<chandankumar> AJaeger, Hello  [12:00]
<chandankumar> AJaeger, is it possible to enable noop zuul jobs for openstack-tempest-skiplist?  [12:00]
<chandankumar> to merge a few patches there, for example https://review.opendev.org/#/c/715871/  [12:00]
<AJaeger> chandankumar: sure, just merge a change in your repo to add it ;)  [12:05]
<AJaeger> chandankumar: amend the change to add a .zuul.yaml file with the noop jobs and you should be good  [12:07]
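A minimal .zuul.yaml of the kind AJaeger describes might look like the sketch below; the built-in noop job always succeeds, so changes can merge while the repo has no real jobs yet:

    # .zuul.yaml (sketch): run only the always-passing noop job
    - project:
        check:
          jobs:
            - noop
        gate:
          jobs:
            - noop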
<chandankumar> AJaeger, ah, will do that, thanks :-)  [12:08]
<openstackgerrit> Mohammed Naser proposed openstack/openstack-zuul-jobs master: DNM: test inline pep8 (should not fail)  https://review.opendev.org/715928  [12:14]
<openstackgerrit> Mohammed Naser proposed openstack/openstack-zuul-jobs master: DNM: this _should_ be a failing change  https://review.opendev.org/715930  [12:36]
<openstackgerrit> Merged openstack/project-config master: Add Shrews to alumni  https://review.opendev.org/715373  [12:36]
<openstackgerrit> Merged openstack/project-config master: Replace python-charm-jobs to py3 job  https://review.opendev.org/714796  [12:38]
<openstackgerrit> Grzegorz Grasza proposed openstack/project-config master: Add ability to push signed tags to tripleo-ipa  https://review.opendev.org/715932  [12:52]
<fungi> chandankumar: ianw: just now catching up on scrollback in here, but the project creation didn't complete. we're still working through failures in jeepyb, but i fell asleep  [12:57]
<mordred> fungi: SLEEP  [12:59]
<openstackgerrit> Sorin Sbarnea proposed zuul/zuul-jobs master: tox: allow tox to be upgraded  https://review.opendev.org/690057  [13:03]
<openstackgerrit> Sorin Sbarnea proposed zuul/zuul-jobs master: install-docker: allow removal of conflicting packages  https://review.opendev.org/702304  [13:05]
<fungi> chandankumar: ianw: oh. except that somehow that acl eventually got applied  [13:07]
<fungi> on closer inspection, it took two passes of manage-projects to fully provision that project. discussion in #opendev, but a fix for that was approved minutes ago  [13:31]
<openstackgerrit> Sorin Sbarnea proposed zuul/zuul-jobs master: install-docker: allow removal of conflicting packages  https://review.opendev.org/702304  [13:45]
<openstackgerrit> Monty Taylor proposed openstack/project-config master: Run manage-projects on gerrit related changes  https://review.opendev.org/715945  [13:46]
<Tengu> hello there! not sure this is the right place to ask, but: how may I get a new "stable/branch" in a new code repository? it's for tripleo/validations-libs and tripleo/validations-common - they currently have "only" master, and we'd need stable/train in addition...  [13:47]
<AJaeger> Tengu: the release team will create branches for you.  [13:54]
<Tengu> AJaeger: do I need to make a ticket/request somewhere?  [13:54]
<Tengu> or is it linked to rdo directly?  [13:55]
<AJaeger> Tengu: https://docs.openstack.org/project-team-guide/stable-branches.html  [13:55]
<Tengu> thanks  [13:56]
<AJaeger> Tengu: so, create a request in the releases repo  [13:56]
<Tengu> AJaeger: ok :). Will check that after my current call  [13:57]
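For context, requests like this are proposed as deliverable files in the openstack/releases repo. A rough, hypothetical sketch of what such a file might contain -- the path, model, and location value are placeholders that the release team would validate:

    # deliverables/train/validations-libs.yaml (hypothetical sketch)
    ---
    team: tripleo
    type: other
    release-model: cycle-with-intermediary
    repository-settings:
      openstack/validations-libs: {}
    branches:
      - name: stable/train
        location: 1.0.0   # placeholder: the tag/commit to branch from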
<artom> donnyd, o/ OpenEdge (That's FN's new name, right?) doing OK? I got a couple of NODE_FAILUREs  [14:09]
<openstackgerrit> Albin Vass proposed zuul/zuul-jobs master: Adds roles to install and run hashicorp packer  https://review.opendev.org/709292  [14:11]
<donnyd> artom: Checking now  [14:13]
<donnyd> Well that is quite a bit more than a couple  [14:13]
<artom> donnyd, I'm a selfish prick, I speak only for my patches ;)  [14:16]
<donnyd> Oh well, that was easy to figure out... had a jenkins server run away with all my resources  [14:16]
<donnyd> seriously, thank you for the heads up.  [14:16]
* artom imagines a British butler running away with rackmounts  [14:16]
<artom> Laughing maniacally  [14:16]
<artom> donnyd, thank you for providing the resources :)  [14:17]
<artom> donnyd, btw, would you consider adding the nested-virt label/flavor? We (whitebox) will probably move some of our tests to a job that runs on those, to avoid being 100% dependent on Open Edge (not that we don't trust you, but single point of failure and all that)  [14:17]
<donnyd> I will add any flavor you want - but I do believe it's enabled by default  [14:18]
<artom> Oh right, it's there already. Ignore me :)  [14:19]
<donnyd> But if you need something special - always ask, because if I can do it - I will  [14:20]
<artom> :)  [14:20]
<donnyd> artom: keep me in the loop if you have any more issues getting to OE  [14:30]
<artom> donnyd, will do - thanks again, it's appreciated  [14:32]
<artom> donnyd, hrmm, so actually  [14:32]
<artom> donnyd, do your machines have SRIOV-capable network cards?  [14:32]
<openstackgerrit> Sorin Sbarnea proposed zuul/zuul-jobs master: Allow configure-mirrors to enable extra repos  https://review.opendev.org/693887  [14:37]
<donnyd> yes - but it's not set up ATM  [14:37]
<artom> donnyd, interesting :) Good to know, I don't have a specific ask for now, I need to mull it over some more  [14:39]
<donnyd> artom: I have Intel X520 nics  [14:42]
<donnyd> there are 8 per hypervisor, with 4 currently in use  [14:42]
<artom> donnyd, so, the context is - Red Hat (my employer) is working on a bunch of Nova features involving what I like to call "exotic hardware"  [14:42]
<artom> VGPUs, FPGAs, that kind of stuff  [14:43]
<artom> Unless RH ponies up the hardware, we'll never have upstream CI for that  [14:43]
<fungi> well, or unless it can be effectively emulated  [14:43]
<artom> So I've been kinda half-assedly pushing for us to pony up the hardware and set up that CI  [14:43]
<fungi> and you could certainly have upstream ci performing unit testing on all the bits of the driver even if it's not an integrated functional test with the hardware  [14:44]
<artom> half-assedly because our internal hardware situation is a mess, and because I don't want to do devops for a CI cloud  [14:44]
<artom> fungi, yeah, we did that in the past for NUMA and PCI  [14:44]
<artom> And while it's better than nothing, real hardware integration tests are definitely a massive advantage  [14:44]
<donnyd> artom: IMO - 3P CI is not as good as 1P CI  [14:45]
<artom> donnyd, we don't even have 3p  [14:45]
<fungi> i would argue that you at least want thorough unit testing with as much code coverage of the driver as possible, regardless of whether or not you also have functional tests with representative hardware  [14:45]
<artom> fungi, right, I'm not disagreeing with you  [14:45]
<artom> Just saying there's a testing gap there :)  [14:45]
<artom> So I figured I could start with something smaller in scope and simpler, namely SRIOV CI  [14:46]
<artom> Which we also don't really have - I guess Mellanox have a 3rd party one?  [14:46]
<artom> So... if donnyd enabled SRIOV on OE, *and* Nova does the work to enable 2-level passthrough of PFs, we would cover the SRIOV CI bit  [14:47]
<artom> *But* we'd still have the GPU, FPGA gap  [14:47]
<artom> Whereas if I managed to get that RH 3P CI up and running  [14:47]
<clarkb> note we have gpus available in small quantities  [14:47]
<artom> We'd have the groundwork to just add cards/machines to that CI for any future needs  [14:48]
<clarkb> it's like 1 gpu-enabled test node at a time iirc, but >0  [14:48]
<artom> So I think the RH CI is the more "correct" idea, as it lays the groundwork for future exotic hardware testing  [14:49]
<artom> Whereas SRIOV testing on OE is limited in scope  [14:49]
<artom> Despite being easier (for me, at least) to get started with  [14:49]
<donnyd> It would be better to grab the right hardware and send it to a trusted 1P CI provider  [14:49]
<artom> We do have a deal with Vexxhost...  [14:50]
<donnyd> 3P CIs only work so well and are only so useful  [14:50]
<clarkb> iirc the reason nova wasn't super interested in the gpus we have is they can't do virtualization of gpu resources to split them up - it's all 1:1 pci passthrough  [14:50]
<artom> It's just... very much outside my scope as a dev engineer  [14:50]
<donnyd> mnaser: is pretty trustworthy - maybe some agreement to load "exotic" hw into 1P CI is a more scalable and reliable method to success for such a thing  [14:51]
<artom> donnyd, I don't doubt mnaser's quality as a person and business :)  [14:51]
<donnyd> I would think... if we look at how well the 3P CI thing has worked over time, it appears to be hit or miss to me  [14:51]
<artom> TripleO seem to have made it work  [14:52]
<donnyd> Yes they have - with a significant amount of kinetic effort  [14:52]
<artom> Like everything else TripleO...  [14:53]
<donnyd> and when someone (company A) wants to get a special test env set up - they have to build the whole thing instead of just plugging it into something that already runs and drives  [14:53]
<clarkb> and is reliable  [14:53]
<clarkb> that seems to be the big thing third party ci underestimates  [14:53]
<donnyd> As our resources grow tighter - I would think the model would work better... but there has to be a level of trust between the 1P CI providers  [14:53]
<artom> donnyd, it's a very valid point...  [14:53]
<artom> donnyd, I could try and bubble that up the chain  [14:54]
<artom> (My internal RH management chain)  [14:54]
<donnyd> and if someone wants to go at it like that... well, you are a 1P provider - or you're not. It's possible that some may see the value in contributing  [14:54]
<donnyd> It makes sense to me - but also I sit on the 1P provider side of the fence. If I need to get a feature into Openstack - it would be super duper easy for me to do so - because I already contribute direct resources that are already plugged in and proven  [14:56]
<artom> donnyd, so out of curiosity - and I want to emphasize, this is really just me asking questions - if RH came to you with a proposal to add hardware to OE  [14:58]
<artom> Presumably with some amount of money being exchanged in some way  [14:58]
<donnyd> Call me crazy - but that is my thought. If RH or Intel or Nvidia needed something special (for example - not an exclusive list) - I don't think it's too much to ask to find a partner who already does CI... or contribute to the general pool themselves with a trusted, proven build system. Also, in the event we ever needed an audit or something like that - we won't have a bunch of build systems behind some nebulous wall.  [14:59]
<artom> Would you be open to that?  [14:59]
<artom> "contribute to the general pool themselves with a trusted proven build system"  [14:59]
<artom> We don't have that :/  [14:59]
<artom> Not for running external CI workloads, at any rate  [14:59]
<artom> We're a software-first kind of shop, running clouds is new to us  [15:00]
<artom> We're getting there, but it's still WIP  [15:00]
<openstackgerrit> Sorin Sbarnea proposed zuul/zuul-jobs master: install-docker: allow removal of conflicting packages  https://review.opendev.org/702304  [15:00]
<donnyd> I would happily host them, likely for free - but if I am being honest - I think vexxhost / limestone / rackspace (etc) are better candidates for this type of proposal. I run my cloud to gain experience and because it is just fun for me. I don't have a fancy degree, so I have to learn through doing to back up my skill set in the market. All the other 1P providers make their living from Openstack and already have real data centers and prod workloads running. I am pretty sure they would happily take on such a project.  [15:04]
<mordred> mnaser: ^^  [15:06]
<donnyd> artom: I would ask why the TripleO CI is 3P to begin with  [15:11]
<donnyd> I would not ask them to fund the whole CI... I can tell you from experience how much it costs each month (about $600-1K for me). I would ask why 3P is easier to maintain, easier to wire up, ... just easier. Also, I have a pretty myopic view of the world - it's through my tiny little lens of personal learning. So of course take all that in context.  [15:13]
<mordred> artom: 1P CI is pluggable already  [15:13]
<mordred> we have an extensible system that accepts resources from multiple parties  [15:13]
<artom> mordred, but only in the sense that each party spins up their own cloud, and then plugs into nodepool, right?  [15:13]
<donnyd> It's on my hit list - we need resources... and I am willing to bet all these 3P CIs could really make a discernible difference in how efficient and effective our CI really is. If we could get all the 3P to convert to 1P - and just get a small amount of general purpose workloads - we could make our CI do more with the already dwindling resources... also schedule jobs where they belong  [15:13]
<mordred> artom: that's right - or hands resources to someone already running a cloud.  [15:14]
<artom> mordred, I was thinking more of the following use case: I want to test a specific piece of hardware, but don't have the capacity/will/resources to actually run a cloud. But I'm willing to buy the hardware itself. Can I send it to an existing 1P provider?  [15:14]
<mordred> it is not required that someone run a cloud themselves - as donnyd pointed out, there are plenty of vendors out there already doing that  [15:14]
<artom> mordred, ah, so how would that work?  [15:14]
<mordred> artom: in theory that should be possible - but so far nobody has decided to do it :)  [15:15]
<mordred> artom: that said...  [15:15]
<mordred> it's worth noting there is another facet that makes some things more 3PCI than 1PCI - and that's the ability of the general population of openstack to debug/reproduce/fix issues  [15:15]
<donnyd> Vendor A signs a partnering agreement with Provider B - Provider B creates appropriate provisions and labels in nodepool - Vendor A schedules jobs against that  [15:15]
<donnyd> We literally already do this right now with OE(FN)  [15:16]
<mordred> so if the special hardware is a million dollar SAN - it's still unlikely to go into 1PCI, because actually gating on it working potentially presents an undue burden on the developers if something goes wrong - the number of people able to fix the issue is... low  [15:16]
<mordred> donnyd: and yup - that  [15:16]
<fungi> a big hurdle for many of the third-party ci systems is that they're run in corporate labs with draconian firewall rules which preclude things like our zuul servers making connections to api endpoints they might be hosting  [15:16]
<artom> mordred, the "general population" thing is a different problem. We (RH) have massive customers that use all that stuff (think telcos), but don't participate upstream one iota  [15:17]
<mordred> so - there's two axes - one is general purpose things that we just don't have access to, but which would make sense if someone wanted to fund them - GPUs probably fit into this category  [15:17]
<artom> So they're there, but the only way they have to influence/improve the community is through us (RH)  [15:17]
<mordred> artom: that's not what I mean  [15:17]
<fungi> and getting their companies' network security overseers to okay external access for lab environments, or to create entirely separate networks to put these resources on, is more than they want to deal with  [15:17]
<artom> mordred, ah, you mean in terms of the infra team being able to manage it?  [15:18]
<mordred> what I mean is that causing all of the openstack developers to be bound by the health of gating on a piece of gear that most of the openstack developers do not have the ability to interact with or fix when it goes wrong is a fundamental issue - and those sorts of things are organized into 3PCI quite on purpose - it's not just a matter of availability or willingness of a vendor to manage it  [15:18]
<donnyd> There are many ways to slice this up - but really it boils down to making 1P CI the preferred method. Something like a partnering agreement can be done directly between vendor dev shops and existing 1P CI providers... no need to involve corporate IT in this  [15:19]
<mordred> artom: I mean that openstack developers in general should have a reasonable expectation of being able to debug an issue if it's something we're going to gate on  [15:19]
<donnyd> but for the million dollar SAN example - 3P makes sense  [15:19]
<mordred> has very little to do with infra  [15:19]
<artom> mordred, I see  [15:20]
<mordred> for general purpose specialized hardware that just isn't in our clouds - like ARM or GPUs - that's a thing where working with providers to make sure it's provided in some clouds is a great way forward to allow people to work on it  [15:20]
<mordred> because those things aren't crazy for normal devs to be expected to debug if the gate goes south  [15:20]
<mordred> so there's some things not in the 1PCI system just because nobody has provided them, which would be very happy additions - and some things where it's still likely not a good idea for the entire community to be gated by them  [15:21]
<donnyd> The existing providers can already get devs that direct debugging access. We all already have public facing clouds that are designed with that intent in mind.  [15:21]
<mordred> yeah  [15:21]
<artom> mordred, ok, I see what you mean  [15:21]
<donnyd> it's not a one size fits all answer... it's more about lowering the barriers to getting devs access to the things they *need* to be successful, and lowering the burden for contribution  [15:22]
<mordred> artom: but back to your original question - I bet there are still a bunch of ways in which RH could choose to send money to some of the existing cloud providers to add capabilities for specific things  [15:23]
<donnyd> For example, wouldn't it be cheaper (and easier) for RH to just buy CI resources from an existing CI provider?  [15:23]
<mordred> both for 1PCI and 3PCI use cases - because even if it's a 3PCI use case - it still might be... yeah ^^  [15:23]
<mordred> that  [15:23]
<artom> mordred, I'd agree - it's just way outside my usual scope  [15:23]
<mordred> artom: same :)  [15:23]
<sean-k-mooney> mordred: well, part of the issue is if we want to set up a public ci we basically need to disconnect from the redhat network  [15:24]
<sean-k-mooney> the current provider we have for upshift is connected to the redhat network  [15:24]
<clarkb> one thing I've seen over the years is that devs will demand X, we provide it, then the actual uptake of that feature is really slow. That then feeds back into the system as "this wasn't important after all". We've seen that with multi node testing, the gpu resources mentioned above, and more. I mention this because one thing to be wary of is doing a bunch of work then having the result sit idle  [15:24]
<clarkb> sometimes starting small and iterating is a good thing  [15:24]
<clarkb> now multinode testing is common and many people make use of it, but for about a year no one would touch it  [15:25]
<clarkb> which can be very discouraging  [15:25]
<mordred> sean-k-mooney: yah. that's why just paying mnaser (or someone) to run the capacity you need instead of trying to run your own cloud *might* be a more cost-effective choice  [15:25]
<artom> clarkb, I guess it's chicken and egg a bit? It appears, but is not necessarily super stable, devs are wary of using it, so it doesn't improve, etc etc  [15:25]
<clarkb> artom: I guess? In the case of multinode testing it was super unstable because openstack was unstable :)  [15:26]
<donnyd> I think it really falls back to why you should buy it instead of build it for *most* cases. Clouds are hard... if they were easy, well, everybody would already have one  [15:26]
<clarkb> I think the correct reaction is to realize "we demanded this, now it is up to us to make our software work with it"  [15:26]
<artom> donnyd, they do. It's called AWS ;)  [15:26]
<donnyd> where is the barf emoji when you really need it  [15:27]
<donnyd> LOL  [15:27]
<artom> clarkb, fwiw literally nothing is stopping you from testing with gpus today aiui  [15:27]
<clarkb> the only things are the assumption it's not possible, and the annoyance that the gpu hardware isn't new enough to do the virtualized gpu stuff  [15:28]
<artom> clarkb, except my own ignorance ;)  [15:28]
<donnyd> doesn't vexxhost already have those labels, clarkb?  [15:28]
<mnaser> i don't think our gpus support vgpus  [15:28]
<clarkb> but if you start with pci passthrough testing and that all works, it's so much easier to say "we can apply this to the newer thing if we had it"  [15:28]
<clarkb> mnaser: correct  [15:28]
<clarkb> mnaser: I'm suggesting we not worry about that to start  [15:28]
<clarkb> but as it stands, zero progress has been made  [15:28]
<clarkb> if instead we start by doing what we can, then we make some progress and have an argument for the future in order to make more progress  [15:29]
<clarkb> but as far as I know the whole thing has been DOA because there's no vgpu  [15:29]
<artom> clarkb, so in essence you're saying "create CI jobs that use GPU PCI passthrough as an argument to add vGPU capability"?  [15:29]
<sean-k-mooney> mnaser: they do not  [15:29]
<sean-k-mooney> mnaser: i confirmed that with you shortly after you added them to the ci pool  [15:30]
<clarkb> artom: right, because it shows there is actual interest and something is working. What we are showing today is no one cares enough to do the basic thing  [15:30]
<donnyd> right - wouldn't it be easier for the vendor wanting to test VGPUs to just send you the right gear, mnaser?  [15:30]
<fungi> artom: if the goal is to test gpus, then have jobs which make use of pci passthrough to interact with the gpus. if the goal is not actually using gpus then the "need" for gpu-enabled test nodes may have been a mischaracterization  [15:30]
<sean-k-mooney> donnyd: said vendor being nvidia means that is not going to happen  [15:30]
<sean-k-mooney> also licensing would still be a pain  [15:30]
<artom> fungi, I guess the latter then. I wasn't involved in those discussions, but I'm assuming GPUs were requested for vGPU stuff, not plain old PCI passthrough.  [15:31]
<sean-k-mooney> artom: correct, it was for the vgpu testing  [15:31]
<clarkb> except we don't have first party pci passthrough testing either...  [15:31]
<sean-k-mooney> which needs a specific sku  [15:32]
<fungi> in which case do you really need gpus to test gpu virtualization features?  [15:32]
<clarkb> literally this kills two birds with one stone, but devs are not interested unless they get extra features  [15:32]
<donnyd> sean-k-mooney: hence the partnering agreement  [15:32]
<clarkb> I get it, we can't test the extra features, but we can test all the other bits  [15:32]
<artom> fungi, yeah, because the physical card provides the virtual GPUs via the mdev mechanism  [15:32]
<artom> Can't test the latter if you don't have the actual card :)  [15:32]
<fungi> seems like if you don't need an actual gpu to handle some workload, you could just mock the vgpu interface  [15:32]
<clarkb> and by not doing what we can, it takes the wind out of the sails for doing more  [15:32]
<sean-k-mooney> fungi: qemu can't virtualize fake gpus capable of testing mdev-based vgpu  [15:32]
<sean-k-mooney> fungi: so the only way to test it is on baremetal ironic nodes  [15:33]
<fungi> sean-k-mooney: because nobody has written the software yet? do we know anyone who writes software? ;)  [15:33]
<artom> fungi, I don't know much about that stuff, but that would be an entirely new kernel module IIUC :P  [15:33]
<fungi> i hear those are software too  [15:33]
<artom> You're not genuinely suggesting we start writing kernel-level mocks  [15:33]
<clarkb> artom: sean-k-mooney: yes, I think we've all accepted that. What I'm suggesting is that the pci passthrough case is a literal example artom made above, and we can test that as far as I know, but no one is willing to  [15:33]
<sean-k-mooney> fungi: for what it's worth, i have looked at faking this before but it won't really help improve testing  [15:34]
<fungi> artom: why not mock the interactions in the driver?  [15:34]
<clarkb> and by not doing that work we've taken all the momentum behind getting to the point of doing what you really want and thrown it away  [15:34]
<sean-k-mooney> for example, we can create mdevs and check that that logic works using the fake tty mdev driver  [15:34]
<artom> fungi, those are functional tests, and we have those  [15:34]
<artom> fungi, but functional tests are only as good as the mocks they use  [15:34]
<artom> And we've been burned by that before  [15:35]
<fungi> sure, and integration tests are only as good as the integrations they test  [15:35]
<sean-k-mooney> yep  [15:35]
<sean-k-mooney> fungi: nova has tried to require third-party ci before merging some of those features in the past  [15:35]
<sean-k-mooney> but we have not always been successful in getting the vendor to set it up  [15:35]
<sean-k-mooney> we were able to get intel to provide a hardware-based third party ci for virtual persistent memory, but now that they have pulled back that is not really maintained, and the cyborg ci is not going to be put in place either  [15:37]
<artom> clarkb, so, let's say I get a job merged that uses the GPUs we do have  [15:38]
<artom> clarkb, tbh, I don't see how that's an argument to then go out and buy different GPUs  [15:38]
<fungi> yep, there's probably also a limit to what nova should expect to have to test, if what's actually happening is interactions with some hypervisor and not directly with the hardware. can't the hypervisor maintainers test those hardware interactions?  [15:38]
<clarkb> artom: it supports the idea that people actually care about that testing  [15:38]
<clarkb> artom: right now the message we are giving is that no one cares enough to do a simple test  [15:38]
<artom> Because they're completely different features. It's like saying "we have NUMA CI jobs, now please buy some FPGAs"  [15:39]
<clarkb> artom: also, from experience we tend to learn a lot setting up the simple case  [15:39]
<clarkb> artom: from an implementation detail perspective maybe, but for end users they are very related  [15:39]
<artom> fungi, that's kind of a different issue - libvirt isn't really a hypervisor  [15:40]
<clarkb> artom: if I know that gpu pci passthrough is working and well tested, I'm more likely to use that feature  [15:40]
<clarkb> then I might want to enable vgpus and find "oh, that isn't tested due to some technical issue, let's help there"  [15:40]
<clarkb> right now we've put the car in park and given up before leaving the driveway  [15:40]
<artom> clarkb, ah, I see  [15:40]
<artom> clarkb, well, I was seeing it as two cars  [15:40]
<fungi> artom: i was using "hypervisor" as shorthand for nova backend, my point was does libvirt test that its support for those things works?  [15:41]
<artom> Except one is missing, and the one we have is actually a bike  [15:41]
<mordred> bikes provide good exercise  [15:41]
<artom> fungi, I have no idea  [15:41]
<artom> (about the libvirt testing)  [15:41]
<fungi> like, is nova's job to test libvirt for the libvirt maintainers?  [15:41]
<artom> kashyapc would know  [15:41]
<fungi> or can nova just assume that libvirt tests that its support for these things works, and so only test its libvirt api compliance?  [15:42]
<artom> fungi, well, libvirt doesn't fully encapsulate/abstract the hw  [15:42]
<clarkb> from experience, most of the big leaps in ci capability that have been made started small on a bike  [15:42]
<clarkb> multi node testing is a major example of this  [15:42]
<fungi> so at what point does nova talk directly to the kernel modules for this stuff? are those kernel modules tested, and can nova just test its compliance with the kernel module's api?  [15:43]
<artom> fungi, I think we're just debating the value of full stack integration tests at this point, no? :)  [15:43]
<artom> Or at least, whether it's OpenStack CI's role to perform them  [15:44]
<clarkb> artom: I think fungi is saying a similar thing to what I'm saying  [15:44]
<clarkb> if we start by doing what we can, that shows interest and motivation in the space  [15:44]
<clarkb> and you can often turn that into further development  [15:44]
<artom> clarkb, fair enough - snowball effect and all that  [15:44]
<clarkb> I'm talking about it from a test resource perspective, and fungi is talking about it from a mocks/fakes perspective  [15:45]
<clarkb> both push the problem space forward in different ways  [15:45]
<fungi> artom: take it the other way: shouldn't you then want to test different bios loadouts on the servers too, because you can't trust that the bios is thoroughly tested by its maintainers? there's always going to be a point at which you say "this is the scope of what we feel is reasonable to test, and we trust that the people who build the things we're interacting with test what they're responsible for"  [15:45]
<mnaser> I think there is a massive difference between testing pci passthrough and vgpus  [15:46]
<artom> fungi, right, it's definitely a big grey area  [15:46]
<artom> fungi, tending towards pitch black when you're at BIOS level ;)  [15:46]
<clarkb> mnaser: from a technical perspective, yes  [15:46]
<mnaser> one is straight up just an extra thing we pass to libvirt, the other is a big combination of interactions with the kernel, placement service and external software all together  [15:46]
<clarkb> mnaser: but from users driving use cases they are more closely related  [15:46]
<artom> fungi, my understanding as a Nova dev is - functional tests have lied to us in the past  [15:47]
<artom> And integration tests on real hardware are better  [15:47]
<artom> Obviously taking into account cost of resources, etc  [15:47]
<mnaser> If we want to test pci passthrough, we can probably do it in other ways that don't need a GPU  [15:48]
<mnaser> but we would be pretty much testing libvirt at that point  [15:48]
<artom> mnaser, I agree, but I also get clarkb's point - if a user wants GPUs in their instance, there's not much difference between PCI passthrough and VGPUs  [15:48]
<fungi> but vgpu provisioning doesn't go through libvirt?  [15:48]
<artom> The operator will make that distinction, but not the user  [15:48]
<artom> fungi, no, the kernel IIRC  [15:49]
<fungi> got it, so nova's host agent talks to the kernel driver?  [15:49]
<artom> fungi, I'm not the expert on this - but I believe it's the deployer/operator's job to set it up, and Nova then reads from /sys and stuff  [15:50]
<fungi> okay, so sysfs interactions  [15:51]
<fungi> that does seem like something which could be recorded and played back, at least  [15:51]
<fungi> (in the absence of actual representative hardware)  [15:52]
<fungi> and then devs could even run those tests locally without needing a fancy gpu too  [15:52]
<artom> I'm pretty sure we have func tests that do something like that  [15:53]
<fungi> as long as the jobs don't go so far as to try to put a workload on the gpu from a guest  [15:53]
<artom> In any case, I guess I need to put my money where my mouth is and start using the GPUs we have  [15:53]
<artom> ... and then ask for more money for VGPU-capable hw :)  [15:53]
<donnyd> artom: or ask the vendor that makes them to collaborate with someone who can lower the burden  [15:54]
<donnyd> :)  [15:54]
<artom> donnyd, yeah, I believe NVIDIA were giving away GPUs left and right... but to the wrong company ;)  [15:54]
<fungi> worth talking to knikolla as he may have some info on what the gpu situation is like in moc/cloudlab  [15:55]
<artom> I need to go feed the kiddos, before they explode the house  [15:55]
<fungi> i saw lots of fancy presentations at the open cloud workshop earlier this month about testing gpu- and fpga-enabled systems  [15:56]
<donnyd> fungi: links?  [15:56]
* fungi checks to see if the recordings have gone up yet  [15:57]
<artom> https://massopen.cloud/events/2020-open-cloud-workshop/ I'm guessing  [15:57]
<fungi> yeah, there  [15:57]
<fungi> i especially liked the talk on a 100% free/libre open source toolchain for configuring fpgas  [15:57]
* fungi breaks himself of the habit of saying "programming" where fpgas are concerned  [15:58]
<knikolla> for openstack, i believe we're using PCI passthrough. https://github.com/CCI-MOC/rhosp-director-config this is our full tripleo config in case anyone finds it helpful.  [15:58]
<fungi> knikolla: okay, so no vgpu availability in there yet, i guess  [15:59]
<knikolla> I don't think that is available in queens yet, which is what we're still on.  [16:00]
<clarkb> it also requires specific hardware  [16:03]
<clarkb> (which you may or may not have)  [16:03]
<sean-k-mooney> knikolla: the issue is that without some changes to nova and maybe qemu we can't do double passthrough  [16:03]
<sean-k-mooney> knikolla: so even if you had the correct gpus available, unless the ci could spin up the host as an ironic node it would not allow us to test with them  [16:04]
<knikolla> yeah, we've created a VM flavor which fully eats up the entire GPU node  [16:04]
<knikolla> but for us that's usually less of an issue, since we also allow people to reserve bare metal nodes  [16:05]
<knikolla> and usually people who need specialized hardware do that  [16:05]
<knikolla> our PCI passthrough is mostly for OpenShift running on top of OpenStack to see the GPUs and run containers on those.  [16:06]
<sean-k-mooney> knikolla: right, but unless you modified the xml to have a virtual iommu, placed the gpu into a separate iommu group by adding it to a different pcie bridge, configured the q35 machine type and used a uefi image  [16:06]
<sean-k-mooney> then i don't think we would be able to create VFs in the vm to then create the mdevs for the l2 guest to use  [16:06]
<sean-k-mooney> also we would need nested virt  [16:06]
<sean-k-mooney> so one of the issues is that while there are ways to test specific hardware in a vm, it's not supported by nova, so the first layer vm would have to be created by something that is not openstack at the moment  [16:08]
<sean-k-mooney> i have done some testing like this manually using libvirt, but just enough to know what the gaps are in nova and how much work it is to close them  [16:08]
<sean-k-mooney> for what it's worth, amd's vGPU technology, called MxGPU, just uses sriov  [16:09]
<sean-k-mooney> and requires no licensing  [16:09]
<knikolla> sean-k-mooney: we haven't needed double pci passthrough, so haven't experimented with that.  [16:16]
<knikolla> but your last message gives me quite a bit to look up and learn about :)  [16:17]
<sean-k-mooney> knikolla: you can't do it with openstack, but https://www.berrange.com/posts/2017/02/16/setting-up-a-nested-kvm-guest-for-developing-testing-pci-device-assignment-with-numa/ actually does work  [16:17]
<sean-k-mooney> the issue is normally if you do that you can't create VFs in the first level vm to assign to the second level guest  [16:18]
<sean-k-mooney> i believe it's possible to get it to work, but it's non trivial  [16:19]
<sean-k-mooney> i have done a double passthrough of a full nic using that method  [16:19]
<fungi> and just to clarify, the reason i suggested knikolla is that he's one of the folks i can think of who might have access to the hardware necessary to work out how to test such features (and mordred has chatted with him in the past about us possibly getting some small nodepool quota in their environment)  [16:22]
<sean-k-mooney> cool  [16:22]
<sean-k-mooney> it would really have to be an ironic node to test with, given the current limitations  [16:23]
<knikolla> i could see us again offering hardware to help test, especially in the context of #openinfralabs  [16:23]
<sean-k-mooney> that said, i would probably prioritize normal pci passthrough and sriov testing over vGPU testing, as that is more generally useful and requires less specific hardware  [16:24]
<sean-k-mooney> fungi: does infra currently run its own cloud, by the way? i think it did/does  [16:25]
<clarkb> we did, but that hardware went away  [16:26]
<sean-k-mooney> ah  [16:26]
<clarkb> I'm not sure we're interested in taking that on again  [16:26]
<sean-k-mooney> understandable  [16:26]
<clarkb> running openstack is easy compared to dealing with "datacenter flooded", "we lost your rails", "the network switch mysteriously became a bridge"  [16:26]
<clarkb> and so on  [16:26]
<sean-k-mooney> ya  [16:27]
<sean-k-mooney> or "the datacenter cooling system is dead and so is the backup, rooms are hitting 70C, shut off your workload NOW! we are powering off the racks"  [16:28]
<sean-k-mooney> that happened when i was an intern  [16:28]
<sean-k-mooney> actually i think it was more like 40C in the room, the servers were at 70 and were starting to trip their overheat protection  [16:30]
<fungi> yes, operating openstack, even remotely, was not that challenging for us. dealing with the inevitable hands-on tasks in data centers for hardware which was "donated" to us because it was too old for the cloud provider to reliably use, that was the issue  [16:30]
<artom> I read that as "the network switch mysteriously became a fridge" and was really confused  [16:30]
<fungi> iot  [16:30]
<fungi> in the case in question, the tor switch was shared with some other users on different vlans, and i think the switch admin (we didn't have config/management access to it) didn't set a large enough memory allocation for the bridge tables  [16:31]
<knikolla> obligatory reference to "when sysadmins ruled the earth"  [16:31]
<fungi> itym "systems reliability engineers" ;)  [16:32]
* fungi still misses being called a "sysop"  [16:32]
<clarkb> "datacenter flooded" isn't made up, if anyone is wondering :)  [16:32]
<fungi> welcome to texas!  [16:33]
<clarkb> turns out having datacenters in a swamp that gets hit by hurricanes leads to that  [16:33]
<sean-k-mooney> wait, flooded and texas? isn't most of it a desert?  [16:33]
<fungi> that's what they want you to believe  [16:33]
<sean-k-mooney> ah  [16:33]
<sean-k-mooney> hurricanes, ya  [16:33]
<fungi> hewlett packard made some interesting facilities choices, we'll just leave it at that  [16:34]
<sean-k-mooney> artom: just to conclude this, we might want to see if we can fake out the mdevs using the tty driver upstream to add some additional testing, but for now i think we have to just make do with what we have unless we can get REXCI running  [16:46]
<donnyd> artom: if you are interested in SR-IOV things that can be done with OE - maybe with a small amount of assistance I can get you something to support the effort  [16:51]
<sean-k-mooney> donnyd: we are missing openstack support, so the only way to do it would be via ironic or statically provisioned nodes in nodepool right now  [16:52]
<donnyd> Ironic isn't hard - I am thinking that is in the realm of possibilities - any specific hw requirements if that is the way?  [16:53]
<donnyd> I have some xeon-d based boxes I have no issue with inserting into the mix  [16:54]
<sean-k-mooney> just some nics that support sriov. often the onboard nics, if they are intel, will be just fine for that  [16:54]
<sean-k-mooney> we would need to create a new label in upstream nodepool to then consume them, but after that is done test jobs could be created.  [16:55]
<sean-k-mooney> for the vgpu testing we might be able to cheat and use https://github.com/torvalds/linux/tree/f97c81dc6ca5996560b3944064f63fc87eb18d00/samples/vfio-mdev  [16:55]
<donnyd> pretty sure this is the board I have in stock - https://www.supermicro.com/en/products/motherboard/X10SDV-4C-TLN4F  [16:56]
<donnyd> will that work?  [16:56]
<sean-k-mooney> that's a sample mdev driver that creates a virtual serial port  [16:56]
<sean-k-mooney> from a nova point of view it would be effectively the same as a vgpu  [16:56]
<sean-k-mooney> donnyd: let me check  [16:56]
<sean-k-mooney> oh, xeon-d - ya, it should work, i'll just check the nic  [16:57]
<sean-k-mooney> "Dual LAN with Intel® Ethernet Controller I350-AM2"  [16:57]
<sean-k-mooney> "Dual LAN with 10Gbase-T"  [16:57]
<sean-k-mooney> so yes, the i350 supports sriov  [16:57]
<sean-k-mooney> the 10G nics probably also do  [16:58]
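To make that concrete, the new nodepool label sean-k-mooney mentions could look roughly like the sketch below -- all names are hypothetical, and the pool could just as well be backed by Ironic-provisioned or static nodes:

    # nodepool.yaml fragment (sketch): hypothetical SR-IOV-capable label
    labels:
      - name: ubuntu-bionic-sriov
        min-ready: 0
    providers:
      - name: openedge             # illustrative provider name
        driver: openstack
        cloud: openedge
        pools:
          - name: sriov
            labels:
              - name: ubuntu-bionic-sriov
                diskimage: ubuntu-bionic
                flavor-name: sriov-capable   # flavor exposing i350/X520 VFs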
<artom> sean-k-mooney, donnyd, so, with RHEx CI (the internal Red Hat 3P CI I wanted), the plan was always to use SRIOV as a gateway to a larger scope (vGPUs, FPGAs)  [17:04]
<artom> donnyd's argument that it's better to centralize in 1P than have every vendor reproduce their own entire cloud for their own bit of hardware in their own 3P CI speaks to me a lot  [17:04]
<artom> RH already has a contract with mnaser, and donnyd seems open to a similar arrangement  [17:05]
<artom> So I'm wondering whether we shouldn't pivot to that approach  [17:05]
<artom> I'll talk to Eoghan about it  [17:05]
<donnyd> sean-k-mooney: I will get ironic up and running - it was already on my hit list anyway  [17:06]
<sean-k-mooney> artom: this is also interesting https://github.com/torvalds/linux/commit/d61fc96f47fdac1f031ed4eafa9106fe10cdaa37#diff-a85d93c1c9bb0ede2e7ef1beaa2534af - apparently there is a sample display driver.  [17:07]
<sean-k-mooney> i have used the mtty serial driver for testing before  [17:07]
<sean-k-mooney> but we might be able to use the sample display driver for testing without vgpu hardware  [17:08]
<sean-k-mooney> looks like it supports multiple types too  [17:08]
<sean-k-mooney> https://github.com/torvalds/linux/blob/f97c81dc6ca5996560b3944064f63fc87eb18d00/samples/vfio-mdev/mdpy.c#L51-L53  [17:09]
<sean-k-mooney> it was only added in kernel 4.18, however  [17:09]
<sean-k-mooney> still, ubuntu 20.04 will certainly work for that, and maybe 18.04 if we are using the hardware enablement kernel in the vms  [17:10]
<sean-k-mooney> i think 18.04 originally shipped with 4.15, but you can get much later kernels - just not sure what the gate is using  [17:10]
<clarkb> sean-k-mooney: we install the default kernel by default (which is the older one), but you can install and reboot onto a newer kernel if you need it  [17:11]
<sean-k-mooney> clarkb: we would just do that as a pre playbook, right?  [17:12]
<clarkb> ya  [17:12]
<sean-k-mooney> adding the appropriate wait and reconnect logic  [17:12]
<clarkb> if you codesearch .yaml files for reboots you'll probably find a small number of examples  [17:12]
<sean-k-mooney> i have compiled that before and used the mtty driver. i might try and play with it later in the week  [17:13]
<sean-k-mooney> if i can get it to work locally then i can try it in the ci  [17:13]
<sean-k-mooney> clarkb: would you have any issue with uploading a second copy of one of the sample images with different device metadata?  [17:21]
<sean-k-mooney> if i am able to make the mdpy driver work, i think we would have to set hw_machine_type=q35 in the image  [17:21]
<sean-k-mooney> i know nodepool can do this, but not sure if that would be ok  [17:22]
<clarkb> sean-k-mooney: will nova fall back gracefully if it can't provide a q35 machine?  [17:23]
<clarkb> if so, maybe we can just set it and not worry?  [17:23]
<clarkb> note rax is not kvm, so I think that won't apply there either  [17:23]
<clarkb> but the q35 machine type is old enough I would expect the other clouds to work  [17:24]
<sean-k-mooney> clarkb: it will give a "no valid host". it also would only work for qemu-based clouds - not sure if that includes xen or not  [17:24]
<sean-k-mooney> ya, it's almost 10 years old  [17:24]
<clarkb> ya, pretty sure it won't work on rax but would elsewhere  [17:24]
<clarkb> sean-k-mooney: it's done per provider too, so we could turn it on a provider at a time to be conservative  [17:24]
<sean-k-mooney> ya. let me see if this is feasible locally first - if it is i might propose some patches  [17:25]
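A sketch of what that per-provider image metadata might look like, assuming nodepool's provider diskimage meta field is passed through as glance image properties (provider name illustrative):

    # nodepool.yaml fragment (sketch): enable q35 one provider at a time
    providers:
      - name: some-kvm-cloud     # illustrative; skip non-kvm providers like rax
        diskimages:
          - name: ubuntu-bionic
            meta:
              hw_machine_type: q35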
<donnyd> seems like a federated placement service would be pretty slick for this use case  [17:27]
<sean-k-mooney> nodepool fills in the gap via the labels  [17:30]
<sean-k-mooney> but it could be  [17:30]
<donnyd> oh yes, nodepool is the only existing mechanism we have for defining a resource... just a random thought  [17:33]
<donnyd> artom: I have been watching OE for a while now and all seems to be good to go. Much appreciated heads up. I pretty much depend on it because it's a one man show over here  [17:34]
<artom> donnyd, thank you again, it's appreciated :)  [17:35]
<openstackgerrit> Jeremy Stanley proposed openstack/project-config master: Replace incident channel with opendev-meeting  https://review.opendev.org/716038  [18:10]
<openstackgerrit> Mohammed Naser proposed openstack/project-config master: vexxhost: add repos for exporters  https://review.opendev.org/714965  [18:14]
<openstackgerrit> Merged zuul/zuul-jobs master: test-upload-logs-swift: revert download script  https://review.opendev.org/715755  [18:14]
<openstackgerrit> Merged openstack/project-config master: Add nginx-ingress-controller armada app to StarlinX  https://review.opendev.org/714686  [18:47]
<openstackgerrit> Andreas Jaeger proposed openstack/pbr master: Update hacking for Python3  https://review.opendev.org/716059  [19:14]
<openstackgerrit> Merged openstack/project-config master: vexxhost: add repos for exporters  https://review.opendev.org/714965  [19:15]
<openstackgerrit> Andreas Jaeger proposed openstack/pbr master: Update hacking for Python3  https://review.opendev.org/716059  [19:17]
<openstackgerrit> Andreas Jaeger proposed openstack/cookiecutter master: Update hacking version of new repo  https://review.opendev.org/716065  [19:42]
<openstackgerrit> Merged openstack/project-config master: Add xstatic-** projects for vitrage-dashboard  https://review.opendev.org/704133  [20:56]
<openstackgerrit> Merged openstack/project-config master: Add Rook to StarlingX  https://review.opendev.org/713650  [21:13]
<openstackgerrit> Merged openstack/project-config master: Add Cert-Manager Armada app to StarlingX  https://review.opendev.org/714689  [21:13]
<openstackgerrit> Merged zuul/zuul-jobs master: Improve job and node information banner  https://review.opendev.org/677971  [21:37]
