15:00:30 #startmeeting RDO meeting - 2016-07-20
15:00:30 Meeting started Wed Jul 20 15:00:30 2016 UTC. The chair is imcsk8. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:30 Useful Commands: #action #agreed #help #info #idea #link #topic.
15:00:30 The meeting name has been set to 'rdo_meeting_-_2016-07-20'
15:00:38 #topic roll call
15:00:51 o/
15:01:01 ¯\_(ツ)_/¯
15:01:05 o/
15:01:16 o/
15:01:19 #chair apevec coolsvap jruzicka rbowen mengxd
15:01:19 Current chairs: apevec coolsvap imcsk8 jruzicka mengxd rbowen
15:01:26 o/
15:01:32 ☃
15:01:48 #chair jjoyce trown
15:01:48 Current chairs: apevec coolsvap imcsk8 jjoyce jruzicka mengxd rbowen trown
15:02:21 ok, let's start
15:02:25 \m/ -_- \m/
15:02:32 #chair eggmaster
15:02:32 Current chairs: apevec coolsvap eggmaster imcsk8 jjoyce jruzicka mengxd rbowen trown
15:02:46 trown, it's too hot for a snowman
15:02:55 #topic newton2 testday readiness
15:02:58 wishful thinking
15:03:15 is that the testday readiness summary? :)
15:03:24 heh
15:03:24 lol, it does work there too
15:03:33 so we're down to 1 issue
15:03:46 though I think there is a better chance of promotion than of building a snowman outside right now
15:03:51 so we have that going for us
15:04:02 there is a weirdo failure on scen001
15:04:23 dmsimard, ^ has it fixed for the next run iiuc?
15:04:31 and an introspection issue that we've confirmed https://review.openstack.org/#/c/344792/ fixes
15:04:32 it's only puppet scn1
15:04:36 overcloud deploy just failed on HA as well, which would not be introspection
15:04:46 doh
15:04:57 o/
15:05:00 waiting on logs
15:05:10 \o/
15:05:15 #chair chandankumar
15:05:16 Current chairs: apevec chandankumar coolsvap eggmaster imcsk8 jjoyce jruzicka mengxd rbowen trown
15:05:25 apevec: yeah, gerrit replication hadn't been working for the weirdo repositories since the gerrit replication revamp
15:05:32 so the fix landed in the gerrit repo but wasn't replicated to github
15:05:34 fixed it this morning
15:05:35 HA still has a high transient failure rate, so it might not be a new issue
15:05:45 trown: see my comment re: firewalld and networkmanager
15:05:47 ?
15:05:59 trown: it might explain the flapping to a certain extent
15:06:32 dmsimard: hmm, maybe you could uninstall those at the beginning of weirdo?
15:06:34 o/
15:06:43 #chair jtomasek
15:06:43 Current chairs: apevec chandankumar coolsvap eggmaster imcsk8 jjoyce jruzicka jtomasek mengxd rbowen trown
15:06:46 o/
15:06:59 #chair florianf
15:06:59 Current chairs: apevec chandankumar coolsvap eggmaster florianf imcsk8 jjoyce jruzicka jtomasek mengxd rbowen trown
15:07:02 trown: would ideally not manage that in weirdo (i.e., bake it into the image for review.rdo and do it some other way for ci.centos)
15:07:12 but yeah, we should definitely try it
15:07:14 k
15:07:20 #chair jrist
15:07:20 see if that fixes some flapping
15:07:44 jrist: only chairs can chair
15:07:51 oh :)
15:07:52 o/
15:08:03 #chair jrist
15:08:03 Current chairs: apevec chandankumar coolsvap eggmaster florianf imcsk8 jjoyce jrist jruzicka jtomasek mengxd rbowen trown
15:08:31 so the summary for the meeting minutes is: not ready, but we have a good chance?
15:09:19 yeah, and
15:09:20 we can retrigger after the introspection fix is merged and built in dlrn
15:09:41 #action dmsimard to investigate whether removing firewalld and networkmanager from the default centos installations can help alleviate flapping results
15:09:44 (18. in current issues)
15:10:13 dmsimard, ^ is there more info on that? where have you seen it in logs?
15:10:37 NM was supposed to work fine w/ Packstack
15:10:42 imcsk8, ^ ?
15:10:48 apevec: it's part intuition, part it's documented to remove those anyway, part these are not installed upstream
15:11:06 apevec: i've been testing packstack without disabling NM for a while now with no problems
15:11:14 imcsk8, what about firewalld ?
15:11:27 it could be disabled w/o removing?
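[Editor's note: a minimal sketch of the "disabled w/o removing" approach floated above, assuming a systemd-based CentOS host. The helper name `disable_firewalld` is hypothetical, not something from weirdo or packstack; removing the packages outright is what the #action above is meant to evaluate.]

```shell
# Hypothetical helper illustrating "disable without removing": the
# firewalld package stays installed, only the unit is stopped and
# kept from starting at boot.
disable_firewalld() {
  # --now stops the running unit in addition to disabling it at boot
  systemctl disable --now firewalld
}

# NetworkManager is left alone here, since Packstack reportedly works
# fine with it; only firewalld needs to be out of the way.
```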
15:11:35 firewalld has to be disabled at least
15:13:59 ok, who will watch over 18. and retrigger the promotion pipeline when it gets built in RDO Trunk?
15:14:48 apevec, dmsimard https://github.com/puppetlabs/puppetlabs-firewall/blob/master/manifests/linux/redhat.pp#L29
15:14:50 oh, it is not +W yet? https://review.openstack.org/#/c/344792/
15:15:00 the firewall puppet module disables firewalld
15:15:04 #action trown babysit instack-undercloud patch
15:15:17 apevec: no, it is not passing upstream CI, because of upstream CI issues
15:15:59 seriously?
15:16:06 (the firewalld thing)
15:16:08 ya.. downloading packages
15:16:12 weee
15:16:32 anything else on this topic?
15:16:56 ok, let's continue
15:17:30 #topic RDO CI: POWER nodes sizing (w/ mengxd)
15:18:09 yes, i want to discuss with the team to understand the h/w requirements for RDO CI
15:18:29 https://ci.centos.org/view/rdo/view/all/
15:18:44 k
15:19:17 from the above link, i can see there are about 5 nodes used for RDO CI. Is this number correct?
15:20:08 nodes?
15:20:19 physical servers
15:20:53 that is the physical servers in the node pool in the CI system, right?
15:20:53 we have the slave governed to start 15 max at any time
15:21:13 mengxd, atm I see all 15 running
15:21:46 so there will be at least 15 servers/nodes reserved by rdo-ci atm.. plus any that have "leaked"
15:21:50 weshay: i think that 15 is the CI pipelines, not the nodes
15:22:11 15 physical servers
15:22:47 mengxd, weshay: We're in the process of adding more slave capacity
15:22:48 i saw 15 pipelines under rdo-ci-slave01
15:22:57 mostly to distribute the load and have redundancy
15:23:15 dmsimard, ya..
we should then limit each to 5 jobs
15:23:15 if we end up w/ 3
15:23:15 3 slaves
15:23:16 We will lower the amount of threads on the current slave and use 3 more slaves (2 are under testing right now)
15:23:20 mengxd, not sure what you mean re: pipelines
15:23:29 dmsimard++
15:23:30 weshay: Karma for dmsimard changed to 3 (for the f24 release cycle): https://badges.fedoraproject.org/tags/cookie/any
15:24:04 mengxd, each job when running uses 1 physical server
15:24:32 weshay: do we really need one physical server for a job? i thought they are running on top of VMs
15:24:40 mengxd, yes we need that
15:25:01 mengxd: there will be an openstack cloud available for virtual workloads
15:25:05 soon
15:25:07 these jobs are deploying openstack on vms on top of the physical server
15:25:29 mengxd: we try not to run jobs directly on the slaves as they are static, we run the jobs on ephemeral nodes
15:25:49 hopefully we have no jobs running on the slaves
15:25:59 ok, then if i need to enable RDO on ppc64le, what are the minimal h/w requirements? what will trigger the ci job?
15:26:07 weshay: we might have some things.. like lint jobs or things like that
15:26:33 mengxd: we don't have ppc64le available in the ci.centos environment, I don't think
15:26:37 mengxd: would need to check.
15:26:53 you are right, but i have an interest in enabling that
15:27:07 so i want to get some idea about the h/w requirements if we do so.
15:27:11 mengxd: okay, right
15:27:21 http://docs.openstack.org/developer/tripleo-docs/environments/virtual.html
15:27:38 for CI, we need to test w/ HA..
which requires 64gb of memory on the host
15:27:44 same w/ upgrades
15:27:47 #chair weshay
15:27:48 Current chairs: apevec chandankumar coolsvap eggmaster florianf imcsk8 jjoyce jrist jruzicka jtomasek mengxd rbowen trown weshay
15:27:48 please provide multiple scenarios
15:27:59 mengxd: This is the documentation for the hardware currently on ci.centos.org: https://wiki.centos.org/QaWiki/PubHardware
15:28:08 #chair number80
15:28:08 Current chairs: apevec chandankumar coolsvap eggmaster florianf imcsk8 jjoyce jrist jruzicka jtomasek mengxd number80 rbowen trown weshay
15:28:28 #chair dmsimard
15:28:28 Current chairs: apevec chandankumar coolsvap dmsimard eggmaster florianf imcsk8 jjoyce jrist jruzicka jtomasek mengxd number80 rbowen trown weshay
15:28:43 mengxd: Jobs are currently running on nodes with 32GB ram, 8 cores, and disk space varying between 200GB and 500GB I believe
15:28:43 so from the 1st link, it seems one big POWER server is enough for the tripleo test
15:29:03 mengxd: some jobs are designed to fit within 8GB of ram, others really require 32GB at the very least.
15:29:31 dmsimard, mengxd really 32gb is the min, 64 is ideal
15:29:45 weshay: there's no 64GB anywhere on ci.centos, where's that number from ?
15:29:50 w/o 64 we can't test a supported deployment
15:29:59 dmsimard, aye I know
15:30:13 in fact he's probably the most familiar, ha
15:30:13 mengxd: we'll also need enough capacity for gating DLRN changes (I mean packaging changes)
15:30:34 number80: that's from review.rdo though
15:30:45 dmsimard: well, we can plug in external CI
15:30:48 so what will trigger a RDO CI job now?
15:30:49 number80: though it could be third party
15:30:56 isn't 65Gb too much??
15:31:02 (64)
15:31:04 imcsk8: tripleo.
15:31:06 for 3o? nope
15:31:06 mengxd, an update to a yum repo
15:31:23 mengxd, and we also have periodic triggers
15:31:26 actually 32GB is barely enough
15:31:30 btw, i can run full Tempest with 16GB memory on a CentOS VM, not sure why we need 64GB here
15:31:33 mengxd: we periodically check if new RDO packages have been built and if so, we trigger a series of jobs to check whether those packages work well.
15:31:48 mengxd: tripleo has particular requirements
15:31:49 are you running full Tempest?
15:32:04 mengxd: packstack and puppet-openstack don't require more than 8GB of RAM
15:32:06 mengxd, for HA, 3 controllers, 2 compute, 1 ceph is the min officially supported arch
15:32:15 we don't attempt that today
15:32:20 but that is the CI requirement
15:32:33 we work within the current hardware atm
15:32:38 and test the rest elsewhere
15:33:22 ok, so even for tripleo, we can test RDO with VMs (nested virtualization), right?
15:33:37 we're very happy w/ what we have.. but those are the requirements we've been given
15:33:45 mengxd: yes, we test with nested virtualization in the review.rdoproject.org and review.openstack.org environments.
15:33:56 mengxd: well, wait, I read that wrong
15:34:28 mengxd: tripleo does nested virt itself (through tripleo-quickstart), the job is not designed to run on a VM (since then you'd end up with ultimately 3 layers of nested virt)
15:34:47 dmsimard, that does work though
15:34:57 weshay: quickstart on VMs ? must be slow, no ?
15:35:07 ya..
not saying it's ideal.. but it works
15:35:11 ok
15:35:20 that explains why jobs are so long
15:35:28 ok, usually how long does each CI job take?
15:35:32 number80: which jobs ?
15:35:37 dmsimard: tripleo
15:35:48 mengxd: 3 to 5 hours
15:36:00 mengxd: packstack and puppet openstack finish within 45 minutes, tripleo takes several hours
15:36:03 mengxd, w/ quickstart a min job is 1:15 and upgrade or scale can run as much as 3.5 hrs
15:36:11 weshay: are you including the image build in that time ?
15:36:27 dmsimard, the image build is only done in the promotion pipeline
15:36:35 fair
15:36:51 we manage to get the average below 3 hours?
15:36:53 ok, and how frequently will a CI job be triggered?
15:36:54 woot
15:37:23 we are working on downloading an already deployed stack and restarting it..
15:37:26 mengxd: well, that's up to you I guess ? I'm not sure where you want this job and what you want it to test
15:37:38 so that upgrades, scale etc. don't have to deploy the initial cloud each time
15:37:52 which will bring run times down to 1.5 hrs for upgrades, which is our longest job
15:38:05 it's only WIP atm
15:38:41 weshay: that is really nice to have, since it can save a lot of time.
15:38:50 agree.. upgrades are terrible
15:39:24 I have to step out for an appointment, I'll be afk for a bit
15:39:35 dmsimard: how about the current RDO CI? Is it reporting on every community patch-set?
15:39:43 mengxd, no
15:40:07 mengxd, we poll the git repo every 4hrs or so.. check if there is a change.. if true; then execute
15:40:15 other jobs are configured to run once a day
15:40:30 for instance
15:40:30 https://ci.centos.org/view/rdo/view/promotion-pipeline/
15:40:33 are triggered off the yum repo
15:40:48 https://ci.centos.org/view/rdo/view/tripleo-gate/ are triggered off of changes to CI src..
15:40:54 so every patch
15:41:02 https://ci.centos.org/view/rdo/view/tripleo-periodic/
15:41:06 are triggered once a day
15:41:14 guys, i think we're getting a little off topic and we have other stuff to address
15:41:28 sure
15:41:42 weshay: so i just want to get an estimate of the CI h/w reqs
15:41:53 maybe we can talk on the mailing list
15:42:37 or here after the meeting ;)
15:42:43 sure
15:42:43 mengxd, sure np.. we're very grateful for what we have in ci.centos but the requirements of tripleo are what they are :(
15:44:15 is there anything else? do you want an action to continue this topic on the ML?
15:44:55 yes, i will send out a note to the mailing list for further discussion
15:45:23 #action mengxd to send a message to the Mailing List about CI requirements
15:45:25 so we can move on to other topics
15:45:36 next topic
15:45:48 #topic tripleo-ui packaging
15:46:47 jrist: ^^
15:47:01 honza, jtomasek, florianf
15:47:04 so yeah
15:47:12 thanks imcsk8
15:47:18 we've got to get tripleo-ui packaged
15:47:26 and we've got some first steps with upstream openstack
15:47:27 #chair jrist
15:47:28 Current chairs: apevec chandankumar coolsvap dmsimard eggmaster florianf imcsk8 jjoyce jrist jruzicka jtomasek mengxd number80 rbowen trown weshay
15:47:42 but there is a concern about what we need to do for compilation
15:47:51 i.e. all of the possible dependencies
15:48:00 note, tripleo-ui is npm/react based
15:48:03 which dependencies are problematic in particular?
15:48:15 well, it is npm based, so there are many npm packages
15:48:29 jruzicka: we would like to understand what might already be packaged
15:48:39 or if there is an npm repo that we can work from, instead of packaging
15:48:42 if that is not possible
15:48:53 we will have to package the dependencies that are not already in RPM
15:49:36 does anyone have any insight? is this something we can or need to set up another meeting for
15:49:44 to not derail the #rdo meeting
15:50:10 in general, we're talking hundreds of dependencies, given the nature of how npm packages work
15:50:20 last count was 856, but it will reduce a little
15:50:23 the whole ecosystem is very similar to this https://fedoraproject.org/wiki/KojiMavenSupport
15:50:41 to give a bit of background: this discussion has been going on for some days now on various channels. Initially we were hoping to find a way to deliver the compiled/minified JS packages with the UI package.
15:50:51 honza mentioned that there might be a fedora NPM registry
15:50:58 but that it won't be ready until next year
15:51:01 an approved npm registry where the dependencies could be sourced would be a nice solution
15:51:30 i think it was number80 who suggested we might be able to get away with only packaging the build toolchain to start and work on the rest later
15:52:50 yes
15:52:54 honza, yes, how big is the toolchain?
15:53:06 the problem is that the build toolchain is most of the deps
15:53:12 say 500
15:53:13 apevec: i'll let jtomasek answer that one
15:53:37 maybe less if we don't need to include testing tools
15:53:53 do you have a dep tree? this sounds insane :)
15:54:00 apevec, hi there
15:54:18 yeah, sounds pretty insane if no npm rpms are available ATM
15:54:21 apevec, do you have time for a delorean issue talk?
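[Editor's note: a rough illustration of why the direct list in `package.json` understates the problem. The JSON below is a trimmed stand-in, not the real tripleo-ui file; the ~856 figure above refers to the full transitive tree, which `npm ls` would enumerate in a real checkout.]

```shell
# Trimmed stand-in for a package.json (NOT the real tripleo-ui one).
cat > /tmp/pkg.json <<'EOF'
{
  "dependencies": {
    "react": "15.2.1",
    "react-dom": "15.2.1",
    "redux": "3.5.2"
  },
  "devDependencies": {
    "babel-core": "6.11.4",
    "karma": "1.1.2"
  }
}
EOF
# Count only the directly declared deps by matching version pins:
# 3 runtime + 2 dev = 5. Each of these pulls in its own tree, which
# is how a short list balloons into hundreds of packages.
grep -cE '"[^"]+": *"[0-9]' /tmp/pkg.json   # prints 5
```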
15:54:22 jtomasek: we can ignore testing tools, and not request strict unbundling for the toolchain
15:54:37 sshnaidm, we're in the meeting
15:55:02 number80: does that mean we would not need to package the dependencies of the toolchain dependencies?
15:55:34 this is the list of the direct app dependencies https://github.com/openstack/tripleo-ui/blob/master/package.json
15:56:08 jtomasek: top priority is to build from sources and have the toolchain available
15:56:23 unbundling will be ongoing work but not a blocker for this package
15:56:57 (and if this lands in RHOSP, you'd have to do it anyway)
15:57:18 this is the full dependency tree http://paste.openstack.org/show/538860/
15:57:30 * jrist winces
15:57:44 we're in the process of cutting down some of them, but not by a lot
15:57:47 number80: this is supposed to land in RHOSP10
15:57:49 what a magnificent tree!
15:57:54 jruzicka: :)
15:57:55 !!
15:57:56 imcsk8: Error: "!" is not a valid command.
15:58:13 guys, we're almost at the top of the hour. can we proceed with the next topic?
15:58:30 ok
15:58:36 * chandankumar thinks https://github.com/ralphbean/npm2spec might be useful for creating npm specs for packages
15:58:43 thanks chandankumar
15:58:48 chandankumar: nice
15:58:54 imcsk8: sounds like we need to have another separate meeting. thanks
15:58:54 #topic rdopkg 0.38 released
15:59:06 yeah, that's just quick info
15:59:37 the new version contains bugfixes, a cbsbuild command by number80, and 1000 fewer sloc of obsolete code
15:59:41 number80++
15:59:50 nice!
15:59:59 18.5 % of the code is gone ;)
16:00:08 cool!!
16:00:18 sweet!
16:00:27 so let me know if something broke, as always, and that's all :)
16:00:34 ok, next
16:00:38 #topic Chair for next meetup
16:00:42 jruzicka: awesome
16:01:17 i am up for chairing.
16:01:23 hguemar proposed DLRN: Add rdopkg reqcheck output in CI runs http://review.rdoproject.org/r/1275
16:01:36 #action chandankumar to chair next meeting
16:01:57 #topic open floor
16:02:10 is there anything else?
or should we finish?
16:03:05 ok, closing meeting
16:03:08 1
16:03:10 2
16:03:11 3
16:03:15 C-C-COMBO BREAKER
16:03:18 4
16:03:20 #endmeeting