14:01:17 #startmeeting tripleo
14:01:18 Meeting started Tue Sep 18 14:01:17 2018 UTC and is due to finish in 60 minutes. The chair is jaosorior. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:01:19 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:01:23 The meeting name has been set to 'tripleo'
14:01:28 #topic agenda
14:01:31 * Review past action items
14:01:33 * One off agenda items
14:01:35 * Squad status
14:01:36 o/
14:01:37 * Bugs & Blueprints
14:01:39 * Projects releases or stable backports
14:01:41 * Specs
14:01:43 * open discussion
14:01:43 o/
14:01:45 Anyone can use the #link, #action and #info commands, not just the moderator!
14:01:47 Hey folks! who's around?
14:01:50 o/
14:01:52 o/ hello
14:01:52 o/
14:01:54 hi
14:01:55 0/
14:01:56 \o
14:01:57 o/
14:01:57 o/
14:02:04 hi
14:02:04 hi2u
14:02:09 hey
14:02:21 o/
14:02:28 o/
14:02:32 o/
14:02:47 o/
14:02:49 o7
14:02:56 hi, I'm trying oooquickstart with rhos-14 and I'm getting an import error for zmq. I'm being told that zmq shouldn't be used in osp14, is it something I'm doing wrong? I'm using all the defaults
14:03:00 o/
14:03:02 hi (half here)
14:03:07 haha o7, is that like a salute?
14:03:15 uh, sorry, it seems there is a meeting
14:03:18 I will come later
14:03:23 jrist: yeah I saw tengu do it once and figured it looked a little cool
14:03:51 o/
14:03:55 o/
14:04:05 o/
14:04:09 yea it's a salute
14:04:14 dpeacock+1
14:04:25 #topic review past action items
14:04:32 None.
14:04:45 #topic one off agenda items
14:04:47 #link https://etherpad.openstack.org/p/tripleo-meeting-items
14:05:06 o/
14:06:06 #topic Stein forum topic proposals
14:06:22 So... we have until the 26th of September to propose topics for the Stein Summit
14:06:28 jaosorior: are we past one off agenda items?
14:06:30 Here's the etherpad:
14:06:40 o/
14:06:41 #link https://etherpad.openstack.org/p/tripleo-forum-stein
14:06:51 jrist: no, this is one of the "one off agenda items"
14:07:07 oh ok :)
14:07:11 thanks
14:07:19 So, if you have a topic to bring up to the Summit, which is coming soon, please add it to the etherpad.
14:07:37 #topic citellus POC
14:07:39 iranzo: ^^
14:07:44 jaosorior: yup :)
14:08:08 weshay++ some months ago worked on a review at https://review.openstack.org/#/c/553571/4 )
14:08:23 that used a set of scripts that are used by support folks
14:08:30 to detect known issues on osp deployments
14:08:39 from either live systems or sosreports
14:09:01 unlike others, goal was to enable operators, and sysadmins in general to run them and of course write new ones
14:09:07 by using bash or other languages of choice
14:09:17 and some common agreements (return codes and stderr for messages)
14:09:33 it can drop a json that can be parsed later by other tools to get overall status
14:10:03 so... thoughts?
14:10:14 review was 'rechecked' as it passed some time ago, but to get updated results
14:10:14 URGENT TRIPLEO TASKS NEED ATTENTION
14:10:16 https://bugs.launchpad.net/tripleo/+bug/1786764
14:10:16 Launchpad bug 1786764 in tripleo "Wrong versions of tripleo-common in container images updated in CI" [Critical,In progress] - Assigned to Emilien Macchi (emilienm)
14:10:16 https://bugs.launchpad.net/tripleo/+bug/1792296
14:10:17 https://bugs.launchpad.net/tripleo/+bug/1792343
14:10:18 https://bugs.launchpad.net/tripleo/+bug/1792560
14:10:18 https://bugs.launchpad.net/tripleo/+bug/1792862
14:10:19 https://bugs.launchpad.net/tripleo/+bug/1792870
14:10:19 Launchpad bug 1792296 in tripleo "Overcloude deploy error:Timed out waiting for messages from Execution" [Critical,In progress] - Assigned to Quique Llorente (quiquell)
14:10:20 https://bugs.launchpad.net/tripleo/+bug/1792872
14:10:20 https://bugs.launchpad.net/tripleo/+bug/1792892
14:10:20 Launchpad bug 1792343 in tripleo "[tripleo] rocky baremetal deployment fails with jq: error: Could not open file /var/lib/heat-config/deployed/.notify.json" [Critical,Triaged] - Assigned to Quique Llorente (quiquell)
14:10:21 Launchpad bug 1792560 in tripleo "Upgrades in CI still using Q->master instead of R->master release" [Critical,In progress] - Assigned to Jiří Stránský (jistr)
14:10:21 iranzo: I see the POC patch is for CI; how would that be used?
14:10:23 Launchpad bug 1792862 in tripleo "[master] Telemetry Tempest integration tests failed giving Unable to complete operation on subnet b13c76ec-c851-48d0-91a9-245a2fdcad9b: One or more ports have an IP allocation from this subnet." [Critical,Fix committed] - Assigned to Mehdi Abaakouk (sileht)
14:10:24 Launchpad bug 1792870 in tripleo "[Rocky] FS01 periodic job failed at overcloud prepare image giving Error: Unable to establish IPMI v2 / RMCP+ session\n'.: ProcessExecutionError: Unexpected error while running command" [Critical,Triaged] - Assigned to Quique Llorente (quiquell)
14:10:25 Launchpad bug 1792872 in tripleo "[queens] overcloud prepare image failed by giving IronicAction.node.set_provision_state failed: 'NoneType' object has no attribute '__getitem_" [Critical,Triaged] - Assigned to Quique Llorente (quiquell)
14:10:26 Launchpad bug 1792892 in tripleo "Undercloud upgrades fails at ERROR! Invalid callback for stdout specified: yaml" [Critical,In progress] - Assigned to Quique Llorente (quiquell)
14:10:40 ah that bot is well timed
14:10:45 capturing errors and making it easy to find the associated data is awesome..
14:10:51 jaosorior: ^
14:10:57 citellus looked pretty awesome
14:11:02 iranzo++
14:11:06 weshay++
14:11:11 weshay, iranzo: It sounded really nice; that's why I asked iranzo to bring it up here :D
14:11:17 we need to find a better way to package/deploy that
14:11:22 right now i'd be -2 on that review
14:11:28 it is delivered as pip package if needed
14:11:28 just asking for more details cause I'm interested.
14:11:32 that is built automatically
14:11:33 ya.. that was not ready for merge
14:11:34 that website isn't working
14:11:36 or a container
14:11:40 weshay: iranzo do we have an example of the kind of check it will be used for ?
14:11:42 can someone summarize? is it just about associated data?
14:11:42 mwhahaha, jaosorior so we need to think about not adding to CI
14:11:47 chem: yes, one sec
14:11:54 mwhahaha, jaosorior we need to think about adding it to the project
14:11:57 oh nevermind, it had an extra )
14:12:01 chem: https://asciinema.org/a/169814
14:12:06 i don't know if we want to add it to the project
14:12:08 asciinema of operation
14:12:25 and as example of checks: non configured token expirations, wrong ntp sync setup
14:12:32 distributing it as a container might be interesting
14:12:34 like when people used chronyc instead of ntpd
14:12:48 mwhahaha, iranzo getting consensus on the tool to use for error detection is difficult at best
14:12:52 missing support in system for virtualization, etc
14:12:54 what is citellus-master.tgz? is that a tarball of the citellus git repo?
14:13:03 slagle: yes, it's an old tarball of the repo
14:13:20 iranzo: can we do that a different way? just use git clone or something
14:13:23 i would rather see it added as a new service rather than added to quickstart or integrated into tripleo itself
14:13:39 it might make sense to have it on the undercloud itself
14:13:40 slagle: yes or even add it to the requirements.txt
14:13:46 +1
14:13:58 iranzo: hum so static report of sosreport, nice. Plugged into the ci it would only check logs then, no triggering another process
14:14:01 from undercloud, you can pass an ansible-host-file that uploads executes and brings back data from the hosts
14:14:15 iranzo: (afraid of timeout in ci)
14:14:24 chem: yes, and the json can be easily parsed for the errors
14:14:41 let me show you one example of the json
14:14:45 if I recall it was all very fast.. a number of seconds
14:14:53 web UI: https://htmlpreview.github.io/?https://github.com/citellusorg/citellus/blob/master/doc/sampleweb/citellus.html
14:15:05 https://github.com/citellusorg/citellus/blob/master/doc/sampleweb/citellus.json
14:15:07 json generated
14:15:11 yes, it runs usually under 10 seconds
14:15:16 (for the sosreport check)
14:15:18 it seems like something that should be integrated with the UI itself
14:15:31 for live it can take a bit more because of system logs parsed, etc
14:15:39 but it's not like a century
14:15:47 goal was always to make it damn fast and easy to get new things there
14:15:54 to highlight what could go wrong
14:16:09 you can even define sizing data based on number of cpus, etc and report if a setting is not 'suited' to host setup
14:16:12 if you want to give hints there
14:16:17 weshay, iranzo: so, is that a service that would constantly be running on the nodes? or is it a one-off script you run and pulls down a bunch of info?
14:16:43 I would say a one-off executed as part of the tests could be ok
14:16:50 in support we used it to check what customers had wrongly setup
14:16:55 based on our past experience
14:17:07 and test 'id's' are unique so you can define which ones should be reported or not
14:17:13 based on regexps to the full path
14:17:19 like "all for openstack", or all for "clock"
14:17:24 mwhahaha: if it's in the product then maybe that could be used for the 2 days validation framework currently WIP by Tengu
14:17:29 or report openstack issues but not bugzillas
14:17:47 chem: yea that was my thought as it makes sense to have available on the undercloud for subsequent runs
14:17:54 but it's a delivery thing into tripleo
14:17:58 ie rpm or containers or whatever
14:18:12 when it checks several hosts, it also aggregates data and shows a combined status, also reporting things that could be problematic
14:18:19 think of different OS releases
14:18:21 on the same environment
14:18:25 or sharing same iscsi initiator name
14:18:27 it's likely that we'd want it wrapped up in a container so it could be pulled in on the undercloud and executed via the UI/CLI
14:18:30 among different hosts
14:19:10 iranzo, weshay: does the POC patch have an output already from the deployment?
14:19:12 so i think it's a good idea to have, but we need to figure out the best way to integrate the execution/results
14:19:13 mwhahaha: I've made some attempts to use a Super privileged container, but it still requires work, as the actual 'tests' are written for either checking the path in sosreport or the path in a live system
14:19:13 mwhahaha: that would be cool
14:19:18 jaosorior, it's so old
14:19:25 I'll refresh it at some point
14:19:30 sure
14:19:31 so trying to run 'live' against a folder might give some strange output for some commands
14:19:36 iranzo: seems like all features are there :)
14:19:52 citellus++ :)
14:19:59 think of 'rpm -qa' inside the container instead of the actual 'host' folder
14:20:15 * weshay notes.. it would be very cool to have the same tool used to diagnose errors up and down stream
14:20:21 yes!
14:20:48 so it sounds like there's more work on the citellus front to figure out distribution
14:20:49 which is why I would hope we can avoid making this a ci project
14:21:09 weshay, iranzo: So, the tool does sound quite cool. But yeah, like mwhahaha said, it would be best if we could have it packaged and integrated properly.
14:21:12 the list of plugins in the tarball is at this commit: https://github.com/citellusorg/citellus/tree/1ee1c6a36f51e8a7c809d5162004fb57ee99b168
14:21:18 jaosorior: what kind of packaging do you mean?
14:21:30 project has automatic creation via dockerhub of container and pypi package upload
14:21:31 iranzo, jaosorior ok.. the ci team will help get it packaged
14:21:35 ok
14:22:00 so we need it rpm'd to be properly integrated
14:22:02 mwhahaha: well, the patch does say POC in the title. I assumed weshay and iranzo meant to show how it works.
14:22:05 +1
14:22:06 then we'd like to have it work from within a container
14:22:11 cause that's what we have
14:22:14 sure
14:22:15 Hello everyone, a quick question: Is containerized undercloud installation available on Queens? (I know it's the default of Rocky)
14:22:22 geneliu: no
14:22:27 jaosorior: I can show it :) what do you use? bjns? asciinema?
14:22:54 iranzo: asciinema is fine :)
14:22:54 flash cards
14:22:58 lol
14:23:14 poster board diorama
14:23:21 must include at least 2 historical figures
14:23:34 weshay may be used as 1 of the figures
14:23:36 jaosorior: then https://asciinema.org/a/169814 should work
14:23:56 for later processing I would just run and get the json or report to console the 'failed' tests
14:24:16 internally it uses 'priority' which is the likelihood of a failing test bringing down your environment
14:24:21 and can also be filtered based on it
14:24:29 weshay: so, the CI team would help with the packaging/containerization of citellus?
14:24:34 and tests are 'info', 'skipped' 'ok' or 'failed'
14:25:02 so first steps would be to package and containerize?
14:25:02 and this is for the old-group processing https://asciinema.org/a/170429
14:25:18 (now just outputs a json to make it more easily consumable based on the information generated)
14:25:22 thanks for the answer, mwhahaha!
14:25:52 weshay: yeah.
14:26:20 Trying to deploy rocky with OVN, and getting : resources.ServiceChain: Property error: resources[43].properties: Property DockerOvnMetadataImage not assigned
14:27:06 rook: you're missing the ovn metadata container from the prepare (also we're in a meeting)
14:27:07 Where must this get defined -- and why isn't it defined in the environment file passed ?
14:27:20 erm
14:27:21 looking
14:28:03 any other question about the tool? something I missed?
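[Editor's sketch, not part of the meeting] The filtering described above (test states of 'info', 'skipped', 'ok' or 'failed', a 'priority' value, and selection by regexps on the full test path) could look roughly like the following. This is only an illustration under stated assumptions: the JSON layout, the field names `state`, `priority` and `message`, the plugin paths, and the `select_results` helper are all hypothetical and are not taken from citellus itself.

```python
import json
import re

# Hypothetical results document. The real citellus JSON schema is not shown in
# the log, so every key and path below is an assumption made for illustration.
RESULTS = json.loads("""
{
  "/plugins/openstack/keystone/token_expiration.sh":
    {"state": "failed", "priority": 700, "message": "token expiration not configured"},
  "/plugins/system/clock/ntp_sync.sh":
    {"state": "failed", "priority": 500, "message": "wrong ntp sync setup"},
  "/plugins/openstack/galera/cluster_status.sh":
    {"state": "skipped", "priority": 900, "message": "galera not installed"},
  "/plugins/system/virtualization.sh":
    {"state": "ok", "priority": 300, "message": ""}
}
""")


def select_results(results, path_pattern=".*", states=("failed",), min_priority=0):
    """Return the entries whose path matches the regexp, whose state is in
    `states`, and whose priority is at least `min_priority`."""
    regex = re.compile(path_pattern)
    return {
        path: result
        for path, result in sorted(results.items())
        if regex.search(path)
        and result["state"] in states
        and result["priority"] >= min_priority
    }


if __name__ == "__main__":
    # Report only failed tests, as suggested for later processing.
    for path, result in select_results(RESULTS).items():
        print(f"{result['state']:>7}  {path}: {result['message']}")
```

Calling `select_results(RESULTS, "clock")` would narrow the report to the "clock" tests, in the spirit of the "all for openstack" or "all for clock" selection mentioned in the discussion.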
14:28:22 Alex Schultz proposed openstack/tripleo-quickstart-extras master: Revert "Revert "Add ability to have dlrn build all the packages at once"" https://review.openstack.org/603406
14:28:38 iranzo: not from my side. I like it though
14:28:56 there's a recording from devconf.cz in youtube
14:28:58 let me find uri
14:29:08 https://www.youtube.com/watch?v=SDzzqrUdn5A&t=1257s&index=1&list=LLyqRUm2tl7NOBlSL4Gz0e_Q
14:29:22 #link https://www.youtube.com/watch?v=SDzzqrUdn5A&t=1257s&index=1&list=LLyqRUm2tl7NOBlSL4Gz0e_Q
14:29:31 and project uri
14:29:36 #link https://citellus.org
14:29:44 #link https://github.com/citellusorg/citellus
14:29:54 Alex Schultz proposed openstack/tripleo-quickstart-extras master: Revert "Revert "Add ability to have dlrn build all the packages at once"" https://review.openstack.org/603406
14:29:57 iranzo: thanks
14:30:03 jaosorior++ to you
14:30:03 iranzo, is this tool mostly used w/ tirpleo?
14:30:09 tripleo
14:30:16 weshay: no, it can be used even outside triple-O
14:30:29 I'll catch up later
14:30:33 thanks iranzo++
14:30:35 and has few dependencies so if properly written, the bash scripts for tests, could be run on any platform
14:30:47 we do use them even for openshift (reduced number of plugins)
14:30:58 general systems with services like firewall, containers, clock sync, etc
14:31:09 it's pretty generic as the project itself runs the plugins
14:31:19 and each plugin 'decides' whether or not it has to run based on the requirements
14:31:38 think for example of something checking galera db status, if the package/process is not there, the plugin skips itself
14:32:00 so can be used for testing the deployed status or even infra for other projects
14:32:25 weshay++
14:32:50 let's move to the next topic in the agenda
14:32:54 #topic Workflow for wrapping register, introspect and provide
14:32:58 jrist: ^^
14:33:43 akrivoka: ^^
14:34:50 yeah so
14:34:53 this was proposed last year
14:34:56 this time. ;)
14:35:02 it was recently abandoned due to age
14:35:12 we're just wondering if this has priority or what
14:35:32 because it really didn't get many reviews or attention, despite asking
14:35:42 maybe akrivoka can chime in
14:36:08 and maybe someone like dtantsur or bfournie
14:37:09 jrist: sure, feedback from them, and if we can get d0ugal to look at it again for the mistral side of things, it would be good.
14:38:24 ok so I am bringing it up here because it has been difficult getting feedback from everyone
14:38:28 jrist: this would be shared by the CLI and the UI, right?
14:38:35 that is definitely the purpose
14:39:22 Added myself as a reviewer; I'll give it a check, but would sure hope that dtantsur and d0ugal would look at it as well.
14:40:10 jrist: thanks for bringing it up
14:40:31 #topic Squad status
14:40:33 ci
14:40:35 #link https://etherpad.openstack.org/p/tripleo-ci-squad-meeting
14:40:37 upgrade
14:40:39 #link https://etherpad.openstack.org/p/tripleo-upgrade-squad-status
14:40:41 containers
14:40:43 #link https://etherpad.openstack.org/p/tripleo-containers-squad-status
14:40:45 integration
14:40:47 #link https://etherpad.openstack.org/p/tripleo-integration-squad-status
14:40:49 ui/cli
14:40:51 #link https://etherpad.openstack.org/p/tripleo-ui-cli-squad-status
14:40:53 validations
14:40:55 #link https://etherpad.openstack.org/p/tripleo-validations-squad-status
14:40:57 networking
14:40:59 #link https://etherpad.openstack.org/p/tripleo-networking-squad-status
14:41:01 workflows
14:41:03 #link https://etherpad.openstack.org/p/tripleo-workflows-squad-status
14:41:05 security
14:41:07 #link https://etherpad.openstack.org/p/tripleo-security-squad
14:41:09 edge
14:41:11 #link https://etherpad.openstack.org/p/tripleo-edge-squad-status
14:42:13 So, from the security squad side, we'll be working on reviving the patches that restrict SSH access for the overcloud (namely https://review.openstack.org/#/c/582437/ and https://review.openstack.org/#/c/582436/ )
14:42:31 other than this, not a lot to share
14:42:41 any other squad that would like to share some highlights?
14:42:48 it seems 1400UTC thursday will be a good time for the edge squad meeting
14:43:01 i'll send an update to the ML, and let's plan to meet this thursday
14:43:13 slagle++
14:45:11 any other updates from other squads?
14:45:38 jaosorior, ci notes are summed up in the etherpad
14:45:48 great, thanks!
14:46:05 #topic bugs & blueprints
14:46:08 #link https://launchpad.net/tripleo/+milestone/rocky-rc2
14:46:10 #link https://launchpad.net/tripleo/+milestone/stein-1
14:46:12 For Stein we currently have 28 blueprints open in Launchpad.
14:46:14 Bugs: 12 rocky-rc2, 772 (+32) stein-1. 103 (0) open Storyboard bugs.
14:46:16 #link https://storyboard.openstack.org/#!/project_group/76
14:47:03 we have a whole bunch of alert bugs open
14:47:12 would be nice to get those addressed this week
14:47:37 they're all assigned to folks, so hopefully they do get addressed :)
14:48:21 #topic projects releases or stable backports
14:48:30 we're going to cut rc2 this week
14:48:42 which if nothing major is wrong, will be the ga for rocky
14:49:37 * fultonj hopes to land https://review.openstack.org/#/q/topic:bug/1769769+(status:open+OR+status:merged) in rocky
14:49:43 so if there are any major blockers for rocky, please raise them now
14:50:16 mwhahaha: well, TLS everywhere is broken for rocky. Hope to get support for owalsh (and maybe beagles) on that.
14:50:21 Ronelle Landy proposed openstack/tripleo-quickstart-extras master: Resolve get_build_command output before container_build_id https://review.openstack.org/602734
14:51:01 mwhahaha: there may be some openvswitch upgrade from queens to rocky as well - looking into this right now
14:51:02 k those will likely need to follow up after GA
14:51:27 mwhahaha: it won't affect things until we have a 2.10 package though
14:51:33 k
14:52:37 #topic specs
14:52:40 #link https://review.openstack.org/#/q/project:openstack/tripleo-specs+status:open
14:52:54 it's a good time to review specs folks!
14:54:06 #topic open discussion
14:54:11 standalone installer works w/ ceph; docs patch https://review.openstack.org/#/c/602710
14:54:17 Anything else folks want to bring up to the meeting?
14:54:18 nice
14:54:21 fultonj++
14:54:26 fultonj: nice!
14:54:35 FYI.. ci folks will be on irc to help, chat etc
14:54:49 we may also be on bluejeans..
14:55:11 weshay++
14:55:17 7050859455 weshay community call next
14:55:56 is there a plan to replace some scenario ci jobs with stand alone versions?
14:56:11 i remember the topic came up at PTG but am not sure if there's a squad with a todo list for it
14:56:55 fultonj: right i don't think we've planned the work items or assigned to anyone yet
14:57:02 fultonj: but there was no opposition at ptg at least
14:57:29 i think we'll do that after we get a fedora28 standalone going
14:57:39 mwhahaha: ok, that makes sense
14:57:41 fultonj: so i guess tripleo ci team will at least be involved there but it would likely require input from other squads too
14:57:42 it's part of further optimizations, we've got some other hurdles first
14:57:48 like tempest on standalone (still not merged)
14:59:41 stupid mirrors
14:59:56 Anything else folks want to bring up?
15:00:32 Alright! thanks for joining!
15:00:34 #endmeeting
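[Editor's sketch, not part of the meeting] The plugin convention discussed under the citellus topic (tests written in bash or any other language, a result signalled through the return code, messages written to stderr, and plugins that skip themselves when their requirement is missing) can be sketched as a tiny runner. Everything here is an assumption for illustration: the numeric exit codes, the `run_plugin` and `python_plugin` helpers, and the demo plugins are hypothetical, and the plugins are written as Python one-offs only so the example runs without bash.

```python
import subprocess
import sys

# Placeholder exit codes chosen for this sketch; the log does not state the
# values that citellus actually uses.
STATES = {10: "ok", 20: "failed", 30: "skipped", 40: "info"}


def run_plugin(argv):
    """Run one plugin command and classify it by its return code.

    The plugin's stderr is treated as its human-readable message.
    """
    proc = subprocess.run(argv, capture_output=True, text=True)
    # Any exit code outside the agreed set is treated as a failure.
    state = STATES.get(proc.returncode, "failed")
    return {"state": state, "rc": proc.returncode, "message": proc.stderr.strip()}


def python_plugin(source):
    """Build a command line that runs `source` as a stand-alone plugin."""
    return [sys.executable, "-c", source]


# A plugin that decides for itself whether it has to run: if the (made-up)
# binary it depends on is absent, it reports 'skipped' instead of failing.
GALERA_CHECK = """
import shutil, sys
if shutil.which("hypothetical-galera-binary-for-demo") is None:
    sys.stderr.write("galera not installed, nothing to check")
    sys.exit(30)
sys.exit(10)
"""

# A plugin that always fails, standing in for a detected misconfiguration.
TOKEN_CHECK = """
import sys
sys.stderr.write("token expiration is not configured")
sys.exit(20)
"""

if __name__ == "__main__":
    for name, source in (("galera", GALERA_CHECK), ("token", TOKEN_CHECK)):
        print(name, run_plugin(python_plugin(source)))
```

Because the runner only looks at the exit code and stderr, a bash script following the same agreement could be passed to `run_plugin` unchanged, for example as `["bash", "some_check.sh"]` (a made-up file name).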