*** UtahDave has joined #tripleo | 00:00 | |
*** UtahDave has quit IRC | 00:06 | |
*** UtahDave has joined #tripleo | 00:07 | |
*** matsuhashi has joined #tripleo | 00:09 | |
*** cd-undercloud has joined #tripleo | 00:13 | |
cd-undercloud | ************** overcloud complete status=0 ************ | 00:13 |
*** cd-undercloud has quit IRC | 00:13 | |
lifeless | SpamapS: btw - https://blueprints.launchpad.net/nova/+spec/instance-group-api-extension | 00:22 |
*** UtahDave has quit IRC | 00:27 | |
SpamapS | lifeless: right I thought maybe that was something people were already looking at | 00:27 |
SpamapS | lifeless: oh and huzzah ;) | 00:27 |
lifeless | indeed | 00:29 |
lifeless | SpamapS: can you review https://review.openstack.org/#/q/status:open+project:openstack-infra/tripleo-ci,n,z please? | 00:29 |
lifeless | simple stuff | 00:29 |
lifeless | SpamapS: and will take CI a big step forwards | 00:30 |
*** matsuhashi has quit IRC | 00:33 | |
openstackgerrit | A change was merged to openstack-infra/tripleo-ci: Install dependencies from prepare_tripleo.sh https://review.openstack.org/68653 | 00:43 |
*** ccrouch has joined #tripleo | 00:43 | |
openstackgerrit | A change was merged to openstack-infra/tripleo-ci: Switch TRIPLEO_ROOT https://review.openstack.org/68654 | 00:44 |
ccrouch | question about "state" on the overcloud nodes | 00:44 |
ccrouch | there is "precious state" which iiuc goes in /mnt/state | 00:45 |
ccrouch | there will also eventually be a readonly / filesystem | 00:45 |
ccrouch | but is there also "non precious state" e.g. pids and such, which would get written to /var/run ? | 00:46 |
*** matsuhashi has joined #tripleo | 00:48 | |
lifeless | ccrouch: sure, e.g. the neutron dnsmasq conf files are written to /var/run which is a tmpfs on Ubuntu | 00:51 |
ccrouch | ok great. I wasn't sure how far the readonly / was going to extend | 00:52 |
lifeless | who knows :P | 00:54 |
ccrouch | :-) | 00:54 |
ccrouch | i guess if its tmpfs mounted it doesnt really matter | 00:54 |
ccrouch | i was thinking of any other parts of the filesystem actually mapped to persistent disk | 00:55 |
lifeless | I think it would be surprising | 00:55 |
lifeless | /mnt is enough of a delta from regular ubuntu/fedora etc | 00:55 |
lifeless | having some bits sticky and some not - harder to reason about | 00:55 |
ccrouch | oh i agree, if we can get anything that we care about being persistent over to /mnt/state that would be optimal | 00:56 |
ccrouch | and everything else that needs to get written going to tmpfs | 00:57 |
ccrouch | then we should be in good shape | 00:57 |
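The split discussed above (precious state under /mnt/state, everything else on tmpfs) can be checked mechanically. The following is a minimal sketch, not TripleO code: it assumes mount information in the format of /proc/mounts, and the `SAMPLE_MOUNTS` text and the `filesystem_type` helper are hypothetical names made up for illustration.

```python
# Hypothetical mount table in /proc/mounts format (device, mount point, type, ...).
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 ro,relatime 0 0
tmpfs /var/run tmpfs rw,nosuid 0 0
/dev/sdb1 /mnt ext4 rw,relatime 0 0
"""


def filesystem_type(path, mounts_text):
    """Return the filesystem type of the deepest mount point containing path."""
    best_point, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest mount point; a later entry of equal length
            # shadows an earlier one.
            if len(mount_point) >= len(best_point):
                best_point, best_type = mount_point, fs_type
    return best_type
```

On a real node the text would come from reading /proc/mounts, and because /var/run is often a symlink to /run, the path should be resolved with `os.path.realpath` before the lookup.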
*** nosnos has joined #tripleo | 00:58 | |
*** cd-undercloud has joined #tripleo | 01:01 | |
cd-undercloud | ************** overcloud complete status=0 ************ | 01:01 |
*** cd-undercloud has quit IRC | 01:01 | |
pleia2 | lifeless: I've been Ubuntu and OpenStack eventing all weekend while you've been working a lot on our CI :) could use a state of the union tomorrow morning to see where I should jump back in | 01:01 |
ccrouch | lifeless: your statement about /var/run and tmpfs holds for Fedora too: http://fedoraproject.org/wiki/Features/var-run-tmpfs | 01:01 |
*** UtahDave has joined #tripleo | 01:05 | |
lifeless | pleia2: ok | 01:08 |
lifeless | ccrouch: cool | 01:08 |
*** jcooley_ has joined #tripleo | 01:12 | |
lifeless | pleia2: next thing I think would be to work on the tripleo prepare scripts so they work on fedora | 01:12 |
pleia2 | lifeless: perfect, I started on friday to run a few of them manually but didn't get very far, things broke down pretty quick | 01:14 |
pleia2 | lifeless: do we want to make the current scripts distro-agnostic-ish or run different ones for fedora? | 01:14 |
pleia2 | (I've been assuming the former) | 01:15 |
lifeless | pleia2: I would assume the former | 01:15 |
lifeless | like devstack-gate | 01:15 |
pleia2 | yeah | 01:15 |
SpamapS | lifeless: in the same vein as what ccrouch was discussing earlier.. I think a push for readonly / soon will help us catch things that are writing state to "not /mnt" earlier and thus will help us avoid "oops we lost X" scenarios | 01:44 |
lifeless | yes | 01:44 |
lifeless | or find things we don't want to keep | 01:44 |
lifeless | like | 01:44 |
lifeless | I'm thinking we don't want to keep the ovs config | 01:45 |
lifeless | as I think thats whats messing up reboots | 01:45 |
SpamapS | lifeless: interesting | 01:47 |
*** cd-undercloud has joined #tripleo | 01:47 | |
cd-undercloud | ************** overcloud complete status=0 ************ | 01:47 |
*** cd-undercloud has quit IRC | 01:47 | |
SpamapS | wow this is cool | 01:48 |
SpamapS | we could actually calculate our downtime now :) | 01:48 |
SpamapS | or I should say uptime | 01:48 |
lifeless | yup | 01:48 |
lifeless | 50% or something | 01:48 |
SpamapS | I think we're down for 17 minutes out of every 45 - 50 | 01:48 |
SpamapS | could be more like 20 | 01:49 |
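The uptime estimate above can be worked through from the cd-undercloud "overcloud complete" announcements in this log. This is only a sketch: it assumes each announcement marks the end of one redeploy cycle, the timestamps have minute resolution, and the 17 and 20 minute downtime figures are SpamapS's estimates rather than measurements. The function names are made up for illustration.

```python
from datetime import datetime

# Times at which cd-undercloud announced "overcloud complete" in this log.
ANNOUNCEMENTS = ["00:13", "01:01", "01:47", "02:35", "03:21", "04:08", "04:55"]


def cycle_minutes(times):
    """Minutes between consecutive announcements."""
    parsed = [datetime.strptime(t, "%H:%M") for t in times]
    return [int((b - a).total_seconds() // 60) for a, b in zip(parsed, parsed[1:])]


def uptime_fraction(downtime_minutes, cycle):
    """Fraction of a cycle during which the cloud is up."""
    return 1 - downtime_minutes / cycle


cycles = cycle_minutes(ANNOUNCEMENTS)
mean_cycle = sum(cycles) / len(cycles)
for down in (17, 20):
    print(f"{down} min down per {mean_cycle:.0f} min cycle: "
          f"{uptime_fraction(down, mean_cycle):.0%} up")
```

The intervals all fall inside the "45 - 50" minute range quoted in the channel, so the rough figures given there are consistent with the log itself.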
*** nosnos_ has joined #tripleo | 01:53 | |
*** nosnos has quit IRC | 01:54 | |
*** matsuhashi has quit IRC | 02:26 | |
*** matsuhashi has joined #tripleo | 02:32 | |
*** morazi has joined #tripleo | 02:32 | |
*** cd-undercloud has joined #tripleo | 02:35 | |
cd-undercloud | ************** overcloud complete status=0 ************ | 02:35 |
*** cd-undercloud has quit IRC | 02:35 | |
lifeless | StevenK: btw man apt-mirror makes my eyes bleed | 02:39 |
lifeless | StevenK: what does the 'clean' option do? | 02:40 |
openstackgerrit | A change was merged to openstack/diskimage-builder: Fix ramdisk element for openSUSE https://review.openstack.org/68418 | 02:45 |
greghaynes | huh, it appears my overcloud compute node kernel panicked when booting during resize2fs | 02:47 |
SpamapS | greghaynes: I've seen that too! | 02:48 |
greghaynes | scary | 02:49 |
SpamapS | greghaynes: it is possible it is due to the virsh stuff we do, not a real resize bug | 02:49 |
* greghaynes screenshots and rebuilds | 02:51 | |
lifeless | SpamapS: not really | 02:51 |
lifeless | SpamapS: its a real bug | 02:51 |
SpamapS | lifeless: ugh | 02:51 |
lifeless | SpamapS: hey, can you do a review for me please? | 02:51 |
lifeless | https://review.openstack.org/#/c/68143/ | 02:52 |
lifeless | broke tests | 02:52 |
SpamapS | lifeless: that was the one I was thinking might cause the resize bug | 02:52 |
lifeless | SpamapS: how so? | 02:53 |
SpamapS | lifeless: oh... IIRC sbader pointed out one that was specifically only with "really big resize values" .. which we have. | 02:53 |
SpamapS | lifeless: oh I was thinking maybe we were writing to our images somehow | 02:53 |
*** sdake has joined #tripleo | 02:53 | |
SpamapS | but that is just wishful thinking :) | 02:53 |
lifeless | it is ;) | 02:53 |
lifeless | dan has seen it on bare metal | 02:54 |
StevenK | lifeless: Yes, the man page is *horrible*. | 02:54 |
StevenK | lifeless: clean tells apt-mirror to delete files not referenced by the mirror, and can be told to miss directories by use of skip-clean | 02:55 |
lifeless | SpamapS: that review is needed for ci test runs | 02:55 |
openstackgerrit | A change was merged to openstack/tripleo-incubator: Destroy seed domains before copying new files. https://review.openstack.org/68143 | 02:55 |
lifeless | SpamapS: woo | 02:55 |
greghaynes | So when heat gets wedged in 'CREATE_IN_PROGRESS' due to something like that kernel panic, how do you force it to give up hope of success? | 02:55 |
SpamapS | greghaynes: :( delete it | 02:56 |
SpamapS | greghaynes: note that the recently landed abandon/adopt feature might also work. | 02:56 |
clarkb | stack deletes were a very common thing for me during the tripleo sprint | 02:56 |
lifeless | greghaynes: nova stop $instanceid | 02:56 |
lifeless | greghaynes: nova start $instanceid | 02:56 |
greghaynes | oo | 02:56 |
SpamapS | lifeless: I tried that when I hit it. No dice. | 02:57 |
lifeless | greghaynes: heat is waiting for the waitcondition to fire, and the resize happens after we've deployed | 02:57 |
lifeless | greghaynes: if that trips the panic again, stop it, then take a copy of the VM disk file so we can reproduce | 02:57 |
SpamapS | true would be quite useful | 02:58 |
lifeless | SpamapS: https://jenkins02.openstack.org/job/gate-tripleo-deploy/20/console | 02:58 |
lifeless | SpamapS: testing https://review.openstack.org/#/c/68308/ | 02:58 |
SpamapS | wait | 02:59 |
greghaynes | panic's again! | 02:59 |
SpamapS | _TESTING_?! | 02:59 |
* greghaynes copies | 02:59 | |
* SpamapS preps happy dance | 02:59 | |
SpamapS | we test stuff now? | 02:59 |
lifeless | SpamapS: not kept up on email :) | 03:00 |
lifeless | SpamapS: if you have time, reviewing stevenk's tie patch for debmirror would be useful | 03:00 |
SpamapS | lifeless: no I saw that we have some kind of test somewhere of something | 03:01 |
lifeless | SpamapS: toci is online, non-voting | 03:01 |
greghaynes | The .qcow2 is sufficient to grab right? We dont do any mangling inside of glance we might suspect? | 03:01 |
SpamapS | lifeless: yeah I've seen things pop up | 03:01 |
lifeless | greghaynes: sufficient yes | 03:01 |
SpamapS | reasonable expectation that the kernel is the same | 03:02 |
lifeless | oh right | 03:02 |
lifeless | greghaynes: yeah, grab the kernel and ramdisk from glance | 03:02 |
greghaynes | ok | 03:02 |
lifeless | you can get their image id out of nova show $id | 03:03 |
lifeless | oh god the logging burns | 03:06 |
lifeless | 2014-01-27 03:05:57.926 | http://pypi.python.org/simple/python-ironicclient/ uses an insecure transport scheme (http). Consider using https if pypi.python.org has it available | 03:06 |
lifeless | 2014-01-27 03:05:58.056 | Requirement already up-to-date: python-ironicclient in /opt/stack/new/tripleo-incubator/openstack-tools/lib/python2.7/site-packages | 03:07 |
lifeless | spam after spam after spam | 03:07 |
SpamapS | lovely spaaaaaaam | 03:08 |
SpamapS | glorious spaaaaaaaam | 03:08 |
lifeless | apS | 03:08 |
SpamapS | exactly | 03:08 |
SpamapS | t-minus 52 minutes to sunday night happy hour sushi and sake bombs | 03:09 |
SpamapS | I shall toast to rebuild | 03:09 |
SpamapS | StevenK: note that your README.md needs a little tweaking (posted comments inline) | 03:10 |
lifeless | nuts | 03:10 |
lifeless | 2014-01-27 01:58:52.418 | Calling <function virsh_start at 0x7f8aba439c80> with: ['start', 'seed_1'] | 03:10 |
lifeless | 2014-01-27 01:58:52.421 | error: Domain is already active | 03:10 |
SpamapS | lifeless: so I feel like we need a new name for "rolling updates" | 03:14 |
SpamapS | coordinated updates is rattling around in my brain at the moment. | 03:14 |
lifeless | graceful? | 03:15 |
lifeless | its the distinguishing thing for me | 03:15 |
lifeless | coordinated would work too | 03:15 |
lifeless | hmm, so we got | 03:16 |
lifeless | 2014-01-27 01:38:59.115 | pulling/updating tripleo-incubator | 03:16 |
lifeless | 2014-01-27 01:38:59.116 | Already up-to-date. | 03:16 |
greghaynes | does rolling updates essentially mean restart nodes in certain order so app-level HA prevents downtime? | 03:17 |
lifeless | yah | 03:17 |
lifeless | plus allow the nodes to offload stuff before restarting | 03:17 |
*** matsuhashi has quit IRC | 03:18 | |
SpamapS | greghaynes: and also trigger rollback if error rates climb | 03:20 |
SpamapS | but that is more canary than rolling/coordinated/graceful | 03:20 |
openstackgerrit | lifeless proposed a change to openstack/tripleo-incubator: Show what revision we're on in pull-tools. https://review.openstack.org/69267 | 03:20 |
greghaynes | oo, thatd be a fun one to figure out | 03:20 |
lifeless | SpamapS: care to land ^ - can't tell what is going on with the CI failure - dunno if the code is bad or we weren't running the code desired | 03:20 |
lifeless | CANARY | 03:20 |
*** cd-undercloud has joined #tripleo | 03:21 | |
cd-undercloud | ************** overcloud complete status=0 ************ | 03:21 |
*** cd-undercloud has quit IRC | 03:21 | |
SpamapS | lifeless: any reason you're not doing git log -n 1 ? | 03:22 |
lifeless | didn't know about it | 03:22 |
SpamapS | :) | 03:22 |
clarkb | git log -1 :P | 03:22 |
SpamapS | 6 or 1/2 dozen | 03:23 |
lifeless | uhm oh because its unneeded | 03:23 |
SpamapS | lifeless: use that. Nice to get the whole commit :) | 03:23 |
lifeless | SpamapS: no thanks | 03:23 |
SpamapS | but.. dates.. | 03:23 |
SpamapS | humans | 03:23 |
SpamapS | etc. | 03:23 |
SpamapS | no? | 03:23 |
lifeless | SpamapS: there are two cases | 03:23 |
lifeless | three | 03:24 |
StevenK | git rev-parse | 03:24 |
lifeless | a) its not being altered, so we'll see what commit from trunk it ran, and git show will show us | 03:24 |
StevenK | Rather than log | head -n 1 | 03:24 |
lifeless | StevenK: shows nothing | 03:24 |
lifeless | SpamapS: b) it is being altered but is the top and a fastforward, in which case the ref will match that in gerrit | 03:25 |
lifeless | SpamapS: c) it is the zuul ref, in which case its inaccessible to mere humans | 03:25 |
lifeless | SpamapS: but also will just have a one-line 'merge x' message AIUI | 03:25 |
*** panda has quit IRC | 03:26 | |
lifeless | StevenK: if you're looking to tune it | 03:27 |
lifeless | git rev-list HEAD --max-count=1 | 03:27 |
lifeless | probably | 03:27 |
StevenK | rev-parse HEAD | 03:28 |
lifeless | ah | 03:29 |
lifeless | what about | 03:29 |
lifeless | git log -1 --pretty=oneline | 03:29 |
lifeless | b948d4a23f6dda93bf8d7b5893d78c9eabb11bcd Merge "Fix ramdisk element for openSUSE" | 03:29 |
lifeless | SpamapS: ^? | 03:29 |
SpamapS | lifeless: sorry I really don't know what you're saying. log with just commit vs. log with the commit/author/date ... whats the resistance to the latter? | 03:29 |
lifeless | SpamapS: verbosity in a tool that looks at lots of repositories | 03:30 |
SpamapS | ah short is good | 03:30 |
lifeless | SpamapS: I want enough to diagnose | 03:30 |
lifeless | SpamapS: not enough to drown | 03:30 |
SpamapS | mmmk | 03:31 |
lifeless | oneline has the commit message first line and the hash | 03:31 |
SpamapS | lifeless: I'm about to shut down. You going to change it or just want the head -n 1? | 03:31 |
SpamapS | I get the brevity desire for sure | 03:31 |
openstackgerrit | lifeless proposed a change to openstack/tripleo-incubator: Show what revision we're on in pull-tools. https://review.openstack.org/69267 | 03:32 |
lifeless | ^ | 03:32 |
lifeless | SpamapS: ^ | 03:32 |
greghaynes | or to shed - log --pretty=format:'%h : %s' | 03:32 |
lifeless | of course, we probably won't be running that version of the tool to get diagnostics if its not actually running trunk | 03:32 |
lifeless | gnar :) | 03:32 |
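The `git log -1 --pretty=oneline` form settled on above prints the full hash followed by the first line of the commit message. A small sketch of how a tool that walks many repositories might capture and split that line follows. The helper names `parse_oneline` and `head_oneline` are hypothetical and are not taken from pull-tools.

```python
import subprocess


def parse_oneline(line):
    """Split one line of `git log --pretty=oneline` output into (hash, subject)."""
    commit_hash, _, subject = line.strip().partition(" ")
    return commit_hash, subject


def head_oneline(repo_dir):
    """Return (hash, subject) for HEAD of the repository at repo_dir."""
    out = subprocess.check_output(
        ["git", "log", "-1", "--pretty=oneline"], cwd=repo_dir, text=True
    )
    return parse_oneline(out)
```

Keeping the output to one line per repository matches the stated goal: enough to diagnose, not enough to drown.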
lifeless | lol | 03:33 |
lifeless | we log the ssh private key | 03:33 |
lifeless | perhaps not ideal | 03:33 |
SpamapS | hahaha | 03:34 |
SpamapS | ok, date night time | 03:35 |
openstackgerrit | A change was merged to openstack/tripleo-incubator: Show what revision we're on in pull-tools. https://review.openstack.org/69267 | 03:36 |
*** panda has joined #tripleo | 03:41 | |
openstackgerrit | lifeless proposed a change to openstack-infra/tripleo-ci: Be verbose in toci_devtest.sh. https://review.openstack.org/69270 | 03:53 |
openstackgerrit | lifeless proposed a change to openstack/tripleo-incubator: Make the devtest scripts for toci run with -x. https://review.openstack.org/69271 | 03:55 |
lifeless | and omg we had a success | 03:56 |
lifeless | -> seed deployed | 03:56 |
lifeless | http://logs.openstack.org/08/68308/2/check/gate-tripleo-deploy/d356b09/console.html | 03:56 |
lifeless | https://review.openstack.org/#/c/68308/ | 03:56 |
*** UtahDave has quit IRC | 03:59 | |
lifeless | ok, so I think the issue is that running pull-tools shouldn't be done here | 04:01 |
lifeless | as we have devstack-gate wrapping things up for us | 04:01 |
lifeless | zuul wise | 04:01 |
clarkb | oh yeah, d-g should sort that out for you | 04:04 |
lifeless | yeah | 04:04 |
openstackgerrit | lifeless proposed a change to openstack/tripleo-incubator: Don't try to pull-tools when there is a zuul ref. https://review.openstack.org/69274 | 04:04 |
openstackgerrit | lifeless proposed a change to openstack/tripleo-incubator: Make the devtest scripts for toci run with -x. https://review.openstack.org/69271 | 04:04 |
lifeless | clarkb: d-g will reset things back to master when there is no zuul change for it, right ? | 04:06 |
lifeless | clarkb: (node reuse concerns) | 04:06 |
clarkb | lifeless: yes it will reset to $ZUUL_BRANCH | 04:07 |
clarkb | and if there is no $ZUUL_BRANCH in the project it will fall back to master | 04:07 |
lifeless | cool | 04:08 |
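The reset rule clarkb describes can be written down as a tiny function. This is a sketch of the rule as stated in the channel only, not devstack-gate's actual code; the function name and its arguments are assumptions made for illustration.

```python
def branch_to_reset_to(zuul_branch, project_branches, default="master"):
    """Pick the branch a project is reset to before a test run.

    Use the branch Zuul is testing when the project has it,
    otherwise fall back to the default branch.
    """
    if zuul_branch and zuul_branch in project_branches:
        return zuul_branch
    return default
```

Under this rule a reused node should never keep a stale checkout for a project that has no change under test, which is the concern raised above.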
lifeless | clarkb: doesn't seem to be doing quite that | 04:08 |
*** cd-undercloud has joined #tripleo | 04:08 | |
cd-undercloud | ************** overcloud complete status=0 ************ | 04:08 |
*** cd-undercloud has quit IRC | 04:08 | |
lifeless | clarkb: http://logs.openstack.org/71/69271/2/check/gate-tripleo-deploy/7373fdc/console.html | 04:09 |
lifeless | clarkb: /opt/stack/new still has the trunk revision of this code | 04:09 |
lifeless | https://review.openstack.org/#/c/69271/ | 04:10 |
lifeless | which is stacked on | 04:10 |
lifeless | https://review.openstack.org/#/c/69274/1 | 04:10 |
lifeless | which has this patch https://review.openstack.org/#/c/69274/1/scripts/pull-tools | 04:11 |
lifeless | but you can see the log: | 04:11 |
lifeless | http://logs.openstack.org/71/69271/2/check/gate-tripleo-deploy/7373fdc/console.html | 04:11 |
lifeless | mmm, possibly I missed something | 04:11 |
* lifeless looks harder | 04:11 | |
lifeless | oh right, we have some stuff to get the tools | 04:12 |
clarkb | lifeless: you want to look in the setup workspace script to see all of the git updating | 04:12 |
*** werebutt has joined #tripleo | 04:12 | |
*** werebutt has left #tripleo | 04:12 | |
openstackgerrit | lifeless proposed a change to openstack/tripleo-incubator: Make the devtest scripts for toci run with -x. https://review.openstack.org/69271 | 04:14 |
openstackgerrit | lifeless proposed a change to openstack/tripleo-incubator: Don't try to git pull when there is a zuul ref. https://review.openstack.org/69274 | 04:14 |
*** ramishra has joined #tripleo | 04:28 | |
greghaynes | hrm, qemu-nbd in overcloud-novacompute seems to be spinning on a lockfile | 04:34 |
lifeless | greghaynes: I thought I disabled that | 04:36 |
greghaynes | its /var/lock/qemu-nbd-nbd0 | 04:36 |
*** coolsvap has joined #tripleo | 04:38 | |
*** ramishra has quit IRC | 04:39 | |
*** matsuhashi has joined #tripleo | 04:40 | |
openstackgerrit | lifeless proposed a change to openstack/tripleo-image-elements: Source devtest_variables in tripleo-cd. https://review.openstack.org/69279 | 04:41 |
openstackgerrit | lifeless proposed a change to openstack/tripleo-image-elements: Disable libvirt file injection. https://review.openstack.org/69280 | 04:41 |
lifeless | greghaynes: pull that into your /opt/stack/os-apply/config/templates/etc/nova/nova.conf and do an os-collect-config --force --one | 04:41 |
lifeless | omg | 04:43 |
lifeless | there is a dedicated channel on cable here at the moment | 04:43 |
lifeless | 'twilight' | 04:44 |
lifeless | W | 04:44 |
lifeless | T | 04:44 |
lifeless | F | 04:44 |
*** jcooley_ has quit IRC | 04:50 | |
greghaynes | lifeless: tyty | 04:53 |
greghaynes | hahaha | 04:53 |
*** cd-undercloud has joined #tripleo | 04:55 | |
cd-undercloud | ************** overcloud complete status=0 ************ | 04:55 |
*** cd-undercloud has quit IRC | 04:55 | |
*** akuznetsov has joined #tripleo | 04:59 | |
*** ramishra has joined #tripleo | 05:05 | |
*** jcooley_ has joined #tripleo | 05:11 | |
*** jcooley_ has quit IRC | 05:19 | |
*** jcooley_ has joined #tripleo | 05:20 | |
*** jcooley_ has quit IRC | 05:24 | |
*** noslzzp has joined #tripleo | 05:46 | |
*** noslzzp has quit IRC | 05:49 | |
*** ramishra has quit IRC | 05:51 | |
*** ramishra has joined #tripleo | 05:57 | |
*** jcooley_ has joined #tripleo | 06:01 | |
*** rpodolyaka1 has joined #tripleo | 06:01 | |
*** rpodolyaka1 has left #tripleo | 06:10 | |
*** rpodolyaka1 has joined #tripleo | 06:10 | |
*** jcooley_ has quit IRC | 06:14 | |
*** AaronGr is now known as AaronGr_Zzz | 06:29 | |
lifeless | fuck yeah, SUCCESS from check | 06:32 |
*** boris-42 has quit IRC | 06:34 | |
* lifeless toasts status=0 | 06:35 | |
openstackgerrit | lifeless proposed a change to openstack/tripleo-incubator: Stop sourcing devtest_seed.sh. https://review.openstack.org/69286 | 06:42 |
*** ramishra has quit IRC | 07:00 | |
*** e0ne has joined #tripleo | 07:09 | |
*** morazi has quit IRC | 07:16 | |
*** jcoufal has joined #tripleo | 07:18 | |
*** lsmola_ has joined #tripleo | 07:31 | |
*** mrunge has joined #tripleo | 07:34 | |
*** coolsvap has quit IRC | 07:39 | |
*** jprovazn has quit IRC | 07:52 | |
*** jprovazn has joined #tripleo | 07:53 | |
*** pblaho has joined #tripleo | 08:04 | |
*** jtomasek has joined #tripleo | 08:07 | |
*** e0ne has quit IRC | 08:12 | |
*** markmc has joined #tripleo | 08:18 | |
*** coolsvap has joined #tripleo | 08:21 | |
*** d0ugal has joined #tripleo | 08:24 | |
openstackgerrit | Ralf Haferkamp proposed a change to openstack/diskimage-builder: Include /lib64 into the deploy ramdisk on openSUSE https://review.openstack.org/69295 | 08:35 |
openstackgerrit | Ralf Haferkamp proposed a change to openstack/diskimage-builder: Add bash as a dependency to the deploy ramdisk https://review.openstack.org/69296 | 08:35 |
*** boris-42 has joined #tripleo | 08:36 | |
*** matsuhashi has quit IRC | 08:36 | |
*** matsuhashi has joined #tripleo | 08:37 | |
*** ifarkas has joined #tripleo | 08:40 | |
*** jistr has joined #tripleo | 08:48 | |
openstackgerrit | A change was merged to openstack/diskimage-builder: Mount root filesystem readonly during boot https://review.openstack.org/68675 | 08:56 |
Ng | morning | 09:03 |
rlandy | rpodolyaka1: hello - ... status update on devtest with tuskar | 09:06 |
jprovazn | morning | 09:11 |
jprovazn | lifeless: ping | 09:11 |
*** e0ne has joined #tripleo | 09:15 | |
openstackgerrit | Dirk Mueller proposed a change to openstack/diskimage-builder: Add a service mapping for openSUSE https://review.openstack.org/68051 | 09:17 |
*** derekh has joined #tripleo | 09:18 | |
rpodolyaka1 | rlandy: hey! | 09:18 |
lifeless | jprovazn: pong | 09:20 |
lifeless | rpodolyaka1: oh hai | 09:20 |
lifeless | rpodolyaka1: all the baremetal rebuild stuff landed... and then on the weekend I found new issues :) | 09:20 |
lifeless | rpodolyaka1: I'm going to ask StevenK if he thinks he can polish the rough fixes up, just thought you should know | 09:21 |
rpodolyaka1 | lifeless: morning! | 09:21 |
jprovazn | lifeless: hi, thanks for looking at the rabbitmq patch, what do you mean by the last comment here: https://review.openstack.org/#/c/68392/2/elements/rabbitmq-server/os-refresh-config/post-configure.d/10-rabbitmq-hostnames ? (This looks like something we should push upstream - a hosts.d/ thing.) | 09:21 |
rpodolyaka1 | lifeless: new issues in preserve ephemeral nova series? or rebuild story overall? | 09:22 |
rlandy | rpodolyaka1: so ... I got as far as registering the baremetal nodes with the undercloud - then the errors started | 09:22 |
lifeless | rpodolyaka1: https://review.openstack.org/#/c/69060/ | 09:23 |
lifeless | jprovazn: well wouldn't it be nice if rather than editing hosts | 09:23 |
lifeless | jprovazn: we could drop a file in /etc/hosts.d/somethingorother | 09:24 |
jprovazn | lifeless: ah, ok, will check this option | 09:25 |
lifeless | jprovazn: you don't need to block on it | 09:25 |
jprovazn | even better | 09:25 |
lifeless | jprovazn: but I see it as part of our job to see things that are hard and make them easier | 09:25 |
lifeless | jprovazn: e.g.: implement for use a hosts.d -> hosts idempotent script | 09:25 |
lifeless | jprovazn: then we can use hosts.d | 09:25 |
lifeless | jprovazn: and separately we can send a patch to libc or whatever to make it an intrinsic feature | 09:26 |
jprovazn | lifeless: I see, thanks | 09:26 |
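The "hosts.d -> hosts idempotent script" idea can be sketched roughly as below. Everything here is hypothetical: /etc/hosts.d is not an existing feature, and the marker strings and function names are invented for this example. The point is only that entries from fragment files live inside a marked block, so regenerating the file twice gives the same result.

```python
import os

# Hypothetical markers delimiting the generated part of the hosts file.
BEGIN = "# BEGIN hosts.d managed block"
END = "# END hosts.d managed block"


def read_fragments(directory):
    """Collect non-empty lines from fragment files, in sorted file name order."""
    lines = []
    for name in sorted(os.listdir(directory)):
        with open(os.path.join(directory, name)) as handle:
            lines.extend(l.rstrip("\n") for l in handle if l.strip())
    return lines


def render_hosts(current_text, fragment_lines):
    """Return hosts content with the managed block rebuilt from the fragments."""
    kept, inside = [], False
    for line in current_text.splitlines():
        if line == BEGIN:
            inside = True
        elif line == END:
            inside = False
        elif not inside:
            kept.append(line)  # hand-written entries are left untouched
    return "\n".join(kept + [BEGIN] + list(fragment_lines) + [END]) + "\n"
```

An element would then drop a file such as a rabbitmq hostnames fragment into the directory and rerun the script, instead of editing the hosts file in place.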
rpodolyaka1 | lifeless: oh, nice catch | 09:29 |
lifeless | rpodolyaka1: SpamapS figured it out, I just wrote the code | 09:29 |
lifeless | rpodolyaka1: but yeah, it was a bit WTF | 09:29 |
rpodolyaka1 | lifeless: I bet :) | 09:29 |
rpodolyaka1 | rlandy: please elaborate on errors :) | 09:30 |
lifeless | DATA. moah DATA | 09:31 |
rlandy | rpodolyaka1: yes (just waiting conversation above to conclude) | 09:39 |
openstackgerrit | Dougal Matthews proposed a change to openstack/python-tuskarclient: Remove concepts that no longer exist in the API https://review.openstack.org/69306 | 09:39 |
rpodolyaka1 | rlandy: so what errors do you see? | 09:40 |
lifeless | rlandy: be bold :) | 09:40 |
rlandy | pressure, pressure | 09:40 |
rlandy | rpodolyaka1: the - I think they are unrelated (2002, "Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)") None None (HTTP 500) | 09:41 |
lifeless | rlandy: where do you see that ? | 09:41 |
rlandy | more ... ERROR: HTTPConnectionPool(host='192.0.2.2', port=8774): Max retries exceeded with url: /v2/2db5418a351a4c58bf7cb09dc29c3218/os-baremetal-nodes (Caused by <class 'socket.error'>: [Errno 111] Connection refused | 09:41 |
jamezpolley | lifeless: is the meeting happening at the usual time this week? https://wiki.openstack.org/wiki/Meetings/TripleO doesn't seem to have been updated since mid-december | 09:41 |
rlandy | the errors are returned directly from the setup-baremetal command | 09:42 |
lifeless | jamezpolley: yes | 09:42 |
lifeless | jamezpolley: its terrible for you | 09:42 |
lifeless | jamezpolley: we should perhaps start doing alternating times | 09:42 |
rpodolyaka1 | rlandy: is mysqld up and running? nova-api? | 09:42 |
*** athomas has joined #tripleo | 09:43 | |
rlandy | rpodolyaka1: yes ... ssh'ed into the undercloud | 09:43 |
rlandy | mysql is up | 09:43 |
rlandy | as is nova-api | 09:43 |
rlandy | nova baremetal-node-list is interesting | 09:44 |
rpodolyaka1 | ? | 09:44 |
jamezpolley | lifeless: it's a mere 6am. I'm usually up around then anyway. | 09:44 |
lifeless | jamezpolley: oh, ok cool. | 09:44 |
lifeless | jamezpolley: winter may be worse? | 09:44 |
rlandy | rpodolyaka1: nova baremetal-node-list errors the first time and returns an output when rerun | 09:45 |
rpodolyaka1 | rlandy: hmm, maybe you run it the first time when os-refresh-config were still executing? | 09:45 |
lifeless | rlandy: did you check heat stack-list ? | 09:45 |
rlandy | rpodolyaka1: it's repeatable | 09:46 |
rpodolyaka1 | rlandy: interesting. can you show the error message it gives you? | 09:46 |
rlandy | heat stack-list returns the undercloud when the seed is source'd and nothing when undercloudrc is source'd | 09:46 |
rpodolyaka1 | yeah, that's correct. as you haven't created overcloud stack yet | 09:46 |
rpodolyaka1 | but undercloud is in CREATE_COMPLETE state, right? | 09:48 |
rlandy | correct - that part makes sense because I can't heat stack-create at this point | 09:48 |
jamezpolley | lifeless: 5am isn't great, no. We probably don't want to make it too much later though - 1900UTC is already 8pm London by then | 09:48 |
rlandy | correct | 09:48 |
rlandy | undercloud is functional | 09:48 |
lifeless | jamezpolley: what other meetings are doing is one week time A one week time B one week time A | 09:48 |
lifeless | jamezpolley: so one nice for west-coast-us-through-asian one nice for europe through nz | 09:49 |
lifeless | we have some indians here via the indian public cloud thing | 09:49 |
jamezpolley | lifeless: fortunately there's a long time before this becomes a problem. I'm fine with alternating times too | 09:49 |
lifeless | or you could man up and do 5am :P | 09:49 |
rlandy | rpodolyaka1: error from nova baremetal-node-list | 09:50 |
rlandy | ERROR: An unexpected error prevented the server from fulfilling your request. (OperationalError) (2002, "Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)") None None (HTTP 500) | 09:50 |
lifeless | rlandy: this is your undercloud or your seed ? | 09:50 |
rlandy | lifeless: undercloud | 09:50 |
lifeless | rlandy: is there anything in /var/log/mysql/error.log ? | 09:51 |
rlandy | yes - getting/pasting | 09:51 |
jamezpolley | 5am is doable for an irc meeting. It'd be a stretch if I had to turn on a camera | 09:51 |
lifeless | jamezpolley: oh god no, we don't need *that* | 09:51 |
marios | haha | 09:52 |
lifeless | https://review.openstack.org/#/c/62221 might be interesting to play with | 09:57 |
rlandy | lifeless: I got this in /mnt/state/var/log/mysql/error.log ... (seemed close enough - /var/log/mysql/error.log doesn't exist F19) http://fpaste.org/71896/81655613/ | 09:57 |
lifeless | oh right, yes :) | 09:57 |
lifeless | so something shut it down | 09:58 |
lifeless | 140127 9:54:02 [Note] /usr/libexec/mysqld: Normal shutdown | 09:58 |
rlandy | yes - I did - I restarted it | 09:59 |
rlandy | seemed like the thing to do when your services aren't being helpful :) | 10:00 |
rlandy | I take that back - there are more shutdowns in the log than one | 10:01 |
openstackgerrit | Dougal Matthews proposed a change to openstack/python-tuskarclient: Add bindings for overcloud API entrypoints https://review.openstack.org/69312 | 10:04 |
*** martyntaylor has joined #tripleo | 10:04 | |
lifeless | rlandy: so, I'd tail that log. then try to reproduce the problem | 10:06 |
rlandy | lifeless: ok | 10:07 |
rpodolyaka1 | rlandy: lifeless: hmm, if I'm not missing something, there is another strange thing here. Why are we trying to connect to MySQL via a unix domain socket at all? AFAIK, we should be using TCP sockets according to a connection string we pass to SQLAlchemy (e.g. mysql://unset:unset@localhost/nova_bm instead of mysql:///nova_bm?unix_socket=/var/lib/mysql/mysql.sock) | 10:08 |
lifeless | thats true too | 10:08 |
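The two connection-string shapes mentioned above can be told apart with the standard library alone. This is a sketch under assumptions, not SQLAlchemy's own URL handling, and `mysql_transport` is a made-up helper. One detail worth checking, offered as general MySQL client behaviour rather than something confirmed for this deployment: the MySQL C client library treats the host name `localhost` specially on Unix and typically connects through the socket file, so a URL with `@localhost` may still produce a socket error, while `127.0.0.1` forces TCP.

```python
from urllib.parse import parse_qs, urlsplit


def mysql_transport(url):
    """Guess which transport a MySQL connection URL is likely to use."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    if "unix_socket" in query:
        return "unix socket " + query["unix_socket"][0]
    if parts.hostname in (None, "localhost"):
        # Assumption: the client library maps localhost to its default socket.
        return "unix socket (client default)"
    return "tcp " + parts.hostname


for candidate in (
    "mysql://unset:unset@localhost/nova_bm",
    "mysql://unset:unset@127.0.0.1/nova_bm",
    "mysql:///nova_bm?unix_socket=/var/lib/mysql/mysql.sock",
):
    print(candidate, "->", mysql_transport(candidate))
```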
*** e0ne_ has joined #tripleo | 10:13 | |
openstackgerrit | Dougal Matthews proposed a change to openstack/python-tuskarclient: Add bindings for Resource Category API entrypoints https://review.openstack.org/69313 | 10:13 |
openstackgerrit | A change was merged to openstack/diskimage-builder: Add a service mapping for openSUSE https://review.openstack.org/68051 | 10:14 |
openstackgerrit | A change was merged to openstack/diskimage-builder: Use /usr/bin/env, not /bin/env https://review.openstack.org/68928 | 10:15 |
*** akrivoka has joined #tripleo | 10:16 | |
*** athomas has quit IRC | 10:16 | |
*** e0ne has quit IRC | 10:16 | |
*** max_lobur_afk is now known as max_lobur | 10:18 | |
rlandy | lifeless: rpodolyaka1: here is the command output and tail of mysql/error.log ... http://fpaste.org/71902/90817901/ | 10:19 |
*** frankbutt has joined #tripleo | 10:21 | |
*** frankbutt has left #tripleo | 10:21 | |
*** athomas has joined #tripleo | 10:23 | |
lifeless | markmc: hey, so RH folk @ a midcycle tripleo meetup - doable? [see my mail to -dev] | 10:23 |
rpodolyaka1 | rlandy: hmm, what's in the nova-api.log and os-collect-config.log (I'm wondering, if it's os-collect-config who restarts mysqld)? | 10:26 |
openstackgerrit | A change was merged to openstack/diskimage-builder: Include /lib64 into the deploy ramdisk on openSUSE https://review.openstack.org/69295 | 10:26 |
markmc | lifeless, probably a bit short notice to get as many people as last time | 10:26 |
markmc | lifeless, what dates are you thinking of? | 10:26 |
openstackgerrit | A change was merged to openstack/diskimage-builder: Add bash as a dependency to the deploy ramdisk https://review.openstack.org/69296 | 10:26 |
openstackgerrit | A change was merged to openstack/tuskar-ui: Adds basic deployment log API and tab https://review.openstack.org/68764 | 10:27 |
lifeless | markmc: well thats the thing, if folk are like 'we need 4 weeks warning but we can come' then I'd say - 3rd marchish perhaps - 5 weeks from now, which will let us get a couple more to come too | 10:30 |
lifeless | markmc: HP travel policy doesn't like < 4 weeks booking lead time anyhow | 10:31 |
lifeless | markmc: I have a separate trip at the end of march to florida, so early march then home then that trip is doable, but mid-march would suck | 10:31 |
markmc | lifeless, ok, I'll ask around | 10:32 |
markmc | lifeless, funnily enough, I'll be in SF on Mar 4th for a board meeting | 10:32 |
lifeless | markmc: so, we get one :) | 10:33 |
markmc | lifeless, it's not me you want so much though :) | 10:34 |
lifeless | markmc: I'd really like a cross section of CI/tuskar/overall-plumbing folk, if we can get it | 10:34 |
markmc | lifeless, yeah | 10:34 |
lifeless | markmc: we have made some huge progress recently | 10:34 |
markmc | lifeless, totally, it's very exciting | 10:34 |
lifeless | e.g. see https://review.openstack.org/#/c/68308/ - the jenkins check there that ran against the test broker | 10:35 |
rlandy | rpodolyaka1: I see os-apply-config.log in /var/log: http://fpaste.org/71905/39081862/ (would os-collect-config.log be somewhere else?). Tailing messages gives me: http://fpaste.org/71906/13908188/ | 10:35 |
markmc | lifeless, very very cool | 10:38 |
lifeless | markmc: yeah, soon as the RH region is online we can make it voting | 10:38 |
lifeless | markmc: ok, so let me know as soon as possible re: meetup; for now, gnight | 10:41 |
markmc | lifeless, yep, will do - thanks | 10:41 |
lifeless | derekh: I have pushed a bunch of things to infra config/ devstack-gate and tripleo all related to CI | 10:41 |
lifeless | derekh: you might like to review/approve/etc. as possible | 10:41 |
derekh | lifeless: cool, will do | 10:41 |
lifeless | gnight all | 10:41 |
rpodolyaka1 | rlandy: ok, so setup-baremetal fails because it's trying to insert a new entry having the mac-address value, that's already saved in the table (though it definitely should return a meaningful error message...) | 10:42 |
rpodolyaka1 | night, lifeless | 10:42 |
rpodolyaka1 | rlandy: but why mac-address value is 11:22:33:44:55:66... | 10:42 |
rpodolyaka1 | rlandy: anyway, you might try to delete all existing values using baremetal-node-delete/baremetal-interface-remove commands of novaclient | 10:44 |
rlandy | rpodolyaka1: ok - I see the duplicate error | 10:44 |
rpodolyaka1 | rlandy: and then run setup-baremetal again | 10:45 |
rlandy | rpodolyaka1: ok - trying that | 10:45 |
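The failure rpodolyaka1 describes (inserting an interface whose MAC address is already in the table) can be illustrated with a small sketch. This is not the setup-baremetal or nova code; `BaremetalRegistry`, `DuplicateMacError` and the method names are hypothetical, and the table is just an in-memory dict. It shows the "meaningful error message" behaviour and the delete-then-retry workaround.

```python
class DuplicateMacError(ValueError):
    """Raised when a MAC address is already registered (hypothetical)."""


class BaremetalRegistry:
    """Hypothetical in-memory stand-in for the baremetal interfaces table."""

    def __init__(self):
        self._mac_to_node = {}

    @staticmethod
    def _normalize(mac):
        # Compare MACs case-insensitively and ignore stray whitespace.
        return mac.strip().lower()

    def add_interface(self, node_id, mac):
        key = self._normalize(mac)
        if key in self._mac_to_node:
            # Fail with a readable message instead of a raw DB error.
            raise DuplicateMacError(
                "MAC %s is already registered to node %s"
                % (key, self._mac_to_node[key]))
        self._mac_to_node[key] = node_id

    def remove_node(self, node_id):
        """Drop every interface of node_id; return how many were removed."""
        stale = [m for m, n in self._mac_to_node.items() if n == node_id]
        for mac in stale:
            del self._mac_to_node[mac]
        return len(stale)
```

Removing the stale node first and then registering again mirrors the suggestion to delete the existing entries before re-running setup-baremetal.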
*** coolsvap_away has joined #tripleo | 11:05 | |
*** coolsvap has quit IRC | 11:06 | |
*** lucasagomes has joined #tripleo | 11:18 | |
openstackgerrit | A change was merged to openstack/tripleo-incubator: Don't try to git pull when there is a zuul ref. https://review.openstack.org/69274 | 11:26 |
openstackgerrit | A change was merged to openstack/tripleo-incubator: Make the devtest scripts for toci run with -x. https://review.openstack.org/69271 | 11:27 |
openstackgerrit | A change was merged to openstack/tripleo-image-elements: Remove deprecated option. https://review.openstack.org/69161 | 11:35 |
*** coolsvap_away is now known as coolsvap | 11:36 | |
*** e0ne has joined #tripleo | 11:39 | |
*** matsuhashi has quit IRC | 11:40 | |
*** e0ne_ has quit IRC | 11:42 | |
*** matsuhashi has joined #tripleo | 11:45 | |
*** matsuhashi has quit IRC | 12:12 | |
*** matsuhashi has joined #tripleo | 12:13 | |
*** noslzzp has joined #tripleo | 12:16 | |
*** e0ne_ has joined #tripleo | 12:16 | |
*** e0ne has quit IRC | 12:20 | |
openstackgerrit | A change was merged to openstack/diskimage-builder: Fix tftp mapping on openSUSE https://review.openstack.org/68962 | 12:22 |
*** coolsvap has quit IRC | 12:36 | |
openstackgerrit | Ladislav Smola proposed a change to openstack/tuskar-ui: Adding nova baremetal API https://review.openstack.org/68973 | 12:36 |
*** rpodolyaka1 has left #tripleo | 12:39 | |
*** CaptTofu has joined #tripleo | 12:43 | |
openstackgerrit | Derek Higgins proposed a change to openstack/tripleo-image-elements: Configure glance to use the internal swift endpoint https://review.openstack.org/68941 | 12:51 |
*** max_lobur is now known as max_lobur_afk | 13:07 | |
*** CaptTofu has quit IRC | 13:11 | |
*** CaptTofu has joined #tripleo | 13:11 | |
*** bcrochet has quit IRC | 13:12 | |
*** mrunge has quit IRC | 13:13 | |
*** bcrochet has joined #tripleo | 13:14 | |
*** CaptTofu has quit IRC | 13:16 | |
*** morazi has joined #tripleo | 13:16 | |
*** vkozhukalov has joined #tripleo | 13:20 | |
*** weshay has joined #tripleo | 13:29 | |
*** lblanchard has joined #tripleo | 13:41 | |
ProfFalken | hey all, just to let you know that for those who are interested, devtest will stand up on Ubuntu Precise (12.04LTS) without any amendments required to the scripts | 13:56 |
ProfFalken | backports does need to be enabled, and you need to get the disk layout correct but other than that it works fine | 13:57 |
ProfFalken | is there anywhere I can help document this for those who want to experiment but are forced by company policy to use Precise? | 13:57 |
*** jayg|g0n3 is now known as jayg | 14:01 | |
*** dprince has joined #tripleo | 14:05 | |
openstackgerrit | A change was merged to openstack/python-tuskarclient: Remove concepts that no longer exist in the API https://review.openstack.org/69306 | 14:05 |
*** matty_dubs|gone is now known as matty_dubs | 14:08 | |
akrivoka | ProfFalken: I have created a wiki page for devtest installation instructions, feel free to add it there if you want - https://wiki.openstack.org/wiki/Tuskar/Devtest | 14:16 |
openstackgerrit | A change was merged to openstack/tripleo-image-elements: Make nova and nova-kvm elements more compatible. https://review.openstack.org/68469 | 14:17 |
*** d0ugal has quit IRC | 14:19 | |
*** d0ugal has joined #tripleo | 14:22 | |
*** boris-42 has quit IRC | 14:23 | |
openstackgerrit | Ladislav Smola proposed a change to openstack/tuskar-ui: Adding nova baremetal API https://review.openstack.org/68973 | 14:23 |
*** julim has joined #tripleo | 14:31 | |
*** julim has quit IRC | 14:32 | |
*** julim has joined #tripleo | 14:35 | |
*** max_lobur_afk is now known as max_lobur | 14:45 | |
*** CaptTofu has joined #tripleo | 14:45 | |
openstackgerrit | Ryan Brady proposed a change to openstack/tripleo-heat-templates: Multiple cinder nodes in overcloud https://review.openstack.org/69375 | 14:47 |
ProfFalken | akrivoka: ok, thanks | 14:54 |
*** jdob has joined #tripleo | 14:56 | |
*** hewbrocca has joined #tripleo | 14:59 | |
*** morazi_ has joined #tripleo | 15:02 | |
*** morazi has quit IRC | 15:03 | |
rlandy | akrivoka: hi ... thanks for posting https://wiki.openstack.org/wiki/Tuskar/Devtest. I'm going to try those instructions. My attempt to include tuskar within devtest was not so successful | 15:03 |
openstackgerrit | Jiri Tomasek proposed a change to openstack/tuskar-ui: Updated Nodes Overview page https://review.openstack.org/69380 | 15:03 |
*** e0ne has joined #tripleo | 15:11 | |
*** e0ne_ has quit IRC | 15:11 | |
*** rwsu has joined #tripleo | 15:16 | |
*** cody-somerville has joined #tripleo | 15:17 | |
*** noslzzp has quit IRC | 15:23 | |
akrivoka | rlandy: great | 15:25 |
akrivoka | rlandy: let me know if I can help | 15:25 |
*** boris-42 has joined #tripleo | 15:26 | |
*** e0ne has quit IRC | 15:31 | |
*** e0ne has joined #tripleo | 15:31 | |
*** lucasagomes is now known as lucas-hungry | 15:35 | |
*** morazi_ is now known as morazi | 15:38 | |
*** coolsvap has joined #tripleo | 15:40 | |
*** noslzzp has joined #tripleo | 15:41 | |
*** vkozhukalov has quit IRC | 15:50 | |
*** sdake has quit IRC | 15:52 | |
*** jprovazn has quit IRC | 15:56 | |
*** ftcjeff has joined #tripleo | 15:56 | |
*** jistr has quit IRC | 15:58 | |
*** pblaho has quit IRC | 16:00 | |
*** AaronGr_Zzz is now known as AaronGr | 16:11 | |
*** nosnos_ has quit IRC | 16:14 | |
*** sdake has joined #tripleo | 16:15 | |
*** matsuhashi has quit IRC | 16:15 | |
*** julim has quit IRC | 16:16 | |
*** julim has joined #tripleo | 16:19 | |
*** e0ne_ has joined #tripleo | 16:24 | |
*** e0ne has quit IRC | 16:26 | |
*** noslzzp has quit IRC | 16:32 | |
*** jistr has joined #tripleo | 16:34 | |
*** lucas-hungry is now known as lucasagomes | 16:37 | |
*** noslzzp has joined #tripleo | 16:39 | |
openstackgerrit | A change was merged to openstack/tripleo-heat-templates: Allow setting a single NTP Server https://review.openstack.org/69173 | 16:39 |
*** rlandy is now known as rlandy|bbl | 16:40 | |
*** UtahDave has joined #tripleo | 16:42 | |
openstackgerrit | A change was merged to openstack/tripleo-incubator: Fix openSUSE detection https://review.openstack.org/68602 | 16:44 |
openstackgerrit | Marios Andreou proposed a change to openstack/tuskar: WIP: Using merge.py from tuskar to generate overcloud.yaml https://review.openstack.org/52045 | 16:49 |
marios | rbrady: ping - hey, made a comment on your https://review.openstack.org/#/c/69375/ - some progress with the computes (template validating)... but issues pulling in block-storage.yaml still chasing | 16:52 |
*** lsmola_ has quit IRC | 16:53 | |
*** e0ne_ has quit IRC | 16:54 | |
rbrady | marios: thanks | 16:57 |
*** markmc has quit IRC | 16:59 | |
*** morazi has quit IRC | 17:00 | |
rbrady | derekh: ping | 17:07 |
openstackgerrit | A change was merged to openstack/tripleo-image-elements: Enable xinetd service https://review.openstack.org/68966 | 17:08 |
derekh | rbrady: ack, on a call at the moment, but ask away, will answer when I can | 17:08 |
openstackgerrit | A change was merged to openstack/tripleo-image-elements: Configure glance to use the internal swift endpoint https://review.openstack.org/68941 | 17:08 |
rbrady | derekh: I'm just looking for the change you submitted last week for python six module | 17:09 |
derekh | rbrady: https://review.openstack.org/#/c/67178/ | 17:09 |
rbrady | derekh: thank you sir | 17:10 |
derekh | rbrady: np | 17:10 |
rbrady | derekh: just hit this in ucl | 17:10 |
*** morazi has joined #tripleo | 17:13 | |
*** bauzas has joined #tripleo | 17:14 | |
dkehn | lifeless: ping me today about the NTP, I think the wording that worries me is subnet | 17:19 |
SpamapS | dkehn: care to elaborate? | 17:21 |
SpamapS | dkehn: the desire is to be able to set a dhcp option for a subnet really.. NTP is just this particular use case | 17:21 |
dkehn | SpamapS: using the above ^^^^ context | 17:22 |
dkehn | SpamapS: just want to make sure I understand if its a configuration issue or a code change issue | 17:22 |
*** matty_dubs is now known as matty_dubs|lunch | 17:23 | |
dkehn | SpamapS: currently you should be able to put any option in there and it should work, in theory | 17:23 |
SpamapS | dkehn: via the API? | 17:23 |
dkehn | SpamapS: true the only use case we've really tested it with is boot option | 17:23 |
SpamapS | dkehn: and I'm not sure I see the context you're referring to | 17:24 |
dkehn | SpamapS: 2014-01-25 23:33:20[ lifeless] dkehn: ^ what do you think, add a feature to set the ntp servers for a subnet in neutron ? | 17:24 |
dkehn | SpamapS: just want to make sure I understand it | 17:25 |
SpamapS | ah | 17:30 |
SpamapS | dkehn: so subnets are the things that dnsmasq serves by way of dhcp agent right? | 17:30 |
dkehn | SpamapS: yes | 17:30 |
dkehn | SpamapS: actually in the dhcp/network-id./dhcp/ by network, then ports, | 17:31 |
dkehn | SpamapS: this is why I want to make sure I understand the use case | 17:32 |
*** jcooley_ has joined #tripleo | 17:34 | |
SpamapS | dkehn: We want to set ntp-server on DHCP, and it may very well need to be different based on what your subnet is. | 17:35 |
derekh | lifeless: could gate-tripleo-deploy be using VMs on the ci-overcloud more than once? | 17:36 |
derekh | http://logs.openstack.org/41/68941/2/check/gate-tripleo-deploy/761a743/console.html#_2014-01-27_12_58_39_313 | 17:36 |
dkehn | SpamapS: so currently the extra-dhcp-opts, work on a per port basis | 17:36 |
derekh | no sign of this line https://github.com/openstack/diskimage-builder/blob/master/elements/source-repositories/extra-data.d/98-source-repositories#L52 | 17:36 |
derekh | suggesting that the CACHE_PATH already exists... | 17:36 |
*** CaptTofu has quit IRC | 17:37 | |
*** CaptTofu has joined #tripleo | 17:37 | |
*** hewbrocca has quit IRC | 17:38 | |
*** max_lobur is now known as max_lobur_afk | 17:39 | |
*** marun has joined #tripleo | 17:39 | |
*** vkozhukalov has joined #tripleo | 17:40 | |
*** marun has quit IRC | 17:41 | |
*** CaptTofu has quit IRC | 17:42 | |
*** CaptTofu has joined #tripleo | 17:43 | |
SpamapS | dkehn: right we don't want per-port. We want a blanket "everybody on this subnet gets ntp-server=x.x.x.x | 17:46 |
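The behaviour being asked for here, a subnet-wide DHCP option that applies to everyone while per-port options keep working, can be sketched as a simple merge. This is only an illustration, not neutron code: the function names are hypothetical, subnet-level options did not exist in neutron at the time of this conversation, and letting the port value win on conflict is an assumption. The rendered lines use dnsmasq's `dhcp-option=option:<name>,<value>` syntax.

```python
def effective_dhcp_opts(subnet_opts, port_opts):
    """Merge subnet-wide DHCP options with per-port ones.

    Assumption for this sketch: a per-port value overrides the
    subnet-wide value for the same option name.
    """
    merged = dict(subnet_opts)
    merged.update(port_opts)
    return merged


def to_dnsmasq_lines(opts):
    """Render options as dnsmasq config lines, sorted by option name."""
    return ["dhcp-option=option:%s,%s" % (name, value)
            for name, value in sorted(opts.items())]


if __name__ == "__main__":
    subnet = {"ntp-server": "192.0.2.10"}       # blanket setting
    port = {"bootfile-name": "pxelinux.0"}      # existing per-port use
    for line in to_dnsmasq_lines(effective_dhcp_opts(subnet, port)):
        print(line)
```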
*** derekh has quit IRC | 17:55 | |
* Ng breaks for dinner&bedtimes | 17:58 | |
*** akuznetsov has quit IRC | 18:00 | |
*** hewbrocca has joined #tripleo | 18:02 | |
*** akuznetsov has joined #tripleo | 18:05 | |
*** julim has quit IRC | 18:05 | |
*** bauzas has quit IRC | 18:08 | |
wendar | lifeless: How up-to-date is the README in tripleo-incubator? It makes a lot of references to Grizzly. | 18:13 |
greghaynes | Morning | 18:13 |
*** matty_dubs|lunch is now known as matty_dubs | 18:14 | |
SpamapS | wendar: I haven't read it in a while.. | 18:15 |
*** martyntaylor has quit IRC | 18:16 | |
* SpamapS reading.. looking pretty good actually | 18:16 | |
SpamapS | - File injection is required due to the PXE boot configuration conflicting | 18:19 |
SpamapS | with Nova-network/Neutron DHCP (work is in progress to resolve this) | 18:19 |
SpamapS | wendar: ^^ thats not true | 18:19 |
wendar | SpamapS: okay, cool. so the details have been updated, it's just missing some mention of whether it's also "quite usable on Havana" and not just Grizzly | 18:19 |
* SpamapS starts edits | 18:19 | |
openstackgerrit | Clint "SpamapS" Byrum proposed a change to openstack/tripleo-incubator: Update README - file injection is not required https://review.openstack.org/69430 | 18:20 |
*** jcooley_ has quit IRC | 18:22 | |
openstackgerrit | Clint "SpamapS" Byrum proposed a change to openstack/tripleo-incubator: Heat is quite usable in more than Grizzly https://review.openstack.org/69431 | 18:22 |
SpamapS | wendar: thanks for pointing this out. New eyes always find bugs. :) | 18:23 |
*** jcooley_ has joined #tripleo | 18:23 | |
wendar | SpamapS: happy to start being useful :) | 18:23 |
greghaynes | SpamapS: does that depend on https://review.openstack.org/#/c/69280/ merging? | 18:24 |
*** akuznetsov has quit IRC | 18:25 | |
*** boris-42 has quit IRC | 18:26 | |
openstackgerrit | Clint "SpamapS" Byrum proposed a change to openstack/tripleo-incubator: Update os-*-config section of README https://review.openstack.org/69434 | 18:26 |
SpamapS | greghaynes: no. That is just a thing that is still turned on, but we don't actually use anymore. :) | 18:27 |
greghaynes | hrm | 18:27 |
greghaynes | I was pointed to pull that in to fix qemu-nbd spinning on a lockfile | 18:28 |
greghaynes | although verifying that works atm | 18:28 |
SpamapS | greghaynes: yeah, its generally the suck and horrible, but we don't _NEED_ injection anymore. | 18:28 |
greghaynes | gotcha | 18:29 |
*** boris-42 has joined #tripleo | 18:29 | |
openstackgerrit | Clint "SpamapS" Byrum proposed a change to openstack/tripleo-incubator: Update README as we do support updates now https://review.openstack.org/69436 | 18:31 |
*** jistr has quit IRC | 18:32 | |
openstackgerrit | Clint "SpamapS" Byrum proposed a change to openstack/tripleo-incubator: Update tripleo-heat-templates: mention merge tool https://review.openstack.org/69438 | 18:33 |
openstackgerrit | Clint "SpamapS" Byrum proposed a change to openstack/tripleo-incubator: Add in missing definite article in README https://review.openstack.org/69439 | 18:36 |
*** CaptTofu has quit IRC | 18:37 | |
*** CaptTofu has joined #tripleo | 18:37 | |
SpamapS | wendar: ^^ ok if you checkout that last review.. you get an up to date README :) | 18:38 |
wendar | SpamapS: awesome, thanks! | 18:39 |
*** rlandy|bbl is now known as rlandy | 18:42 | |
*** CaptTofu has quit IRC | 18:42 | |
*** jprovazn has joined #tripleo | 18:44 | |
*** CaptTofu has joined #tripleo | 18:45 | |
SpamapS | wendar: and really not that much was out of place. Thanks again for looking. :) | 18:45 |
openstackgerrit | Ana Krivokapic proposed a change to openstack/tuskar-ui: Run PEP8 check by default when tests are run https://review.openstack.org/69444 | 18:56 |
*** marun has joined #tripleo | 18:56 | |
wendar | SpamapS: Cool, yeah, that's what I'd expect. | 19:05 |
lifeless | o/ everybody | 19:07 |
jog0 | lifeless: o/ | 19:07 |
jog0 | just the man I wanted to see | 19:07 |
greghaynes | \o/ | 19:07 |
lifeless | ruh roh | 19:08 |
jog0 | was about to inquire about the next NVP | 19:08 |
jog0 | MVP | 19:08 |
jog0 | two things, what nova patches are outstanding? | 19:08 |
*** vkozhukalov has quit IRC | 19:08 | |
lifeless | https://trello.com/c/PFU8HYWw/39-nova-baremetal-rebuild-support-including-preserving-ephemeral-devices-if-that-isn-t-already-a-stock-nova-feature | 19:08 |
jog0 | and second, wanted to ask about why make kernel upgrades part of next | 19:08 |
*** noslzzp has quit IRC | 19:08 | |
jog0 | MVP | 19:08 |
lifeless | jog0: has a link to the review required to make things work at all | 19:08 |
jog0 | lifeless: now that gate is flowing I have novareview time | 19:08 |
lifeless | jog0: and in the same series there is another patch fixing a restart failure mode | 19:08 |
lifeless | jog0: though neither pass jenkins yet, just unit test tweaks needed | 19:09 |
jog0 | lifeless: ahh | 19:09 |
lifeless | jog0: I'm going to ask StevenK if he'd like to polish them up | 19:10 |
*** noslzzp has joined #tripleo | 19:10 | |
lifeless | jog0: since I'm a bit of a bottleneck atm | 19:10 |
jog0 | lifeless: cool, I will follow those two and when jenkins is happy I will review 'em | 19:10 |
lifeless | jog0: you could eyeball the code changes now | 19:10 |
lifeless | jog0: because we're running them ;) | 19:10 |
lifeless | jog0: functional jenkins is happy | 19:11 |
jog0 | well functional tests don't cover bare metal | 19:11 |
lifeless | I know :) | 19:11 |
lifeless | cd-undercloud is exercising it hourly | 19:12 |
lifeless | wendar: a little stale ;) | 19:12 |
lifeless | dkehn: ping about ntp | 19:12 |
dkehn | lifeless: pong | 19:13 |
lifeless | SpamapS: is tripleo-cd disabled ? | 19:13 |
lifeless | dkehn: so you said subnet worries you w.r.t. ntp ? | 19:13 |
jog0 | lifeless: I like patches like those: clear and small | 19:14 |
lifeless | jog0: its how I roll | 19:14 |
jog0 | gave both a +0 | 19:14 |
SpamapS | lifeless: not by me. Looking. | 19:14 |
lifeless | jog0: awesome, thanks | 19:14 |
jog0 | lifeless: so for the next MVP, why is kernel based upgrades included | 19:14 |
dkehn | lifeless: doesn't worry me, just need a clarification. currently we wrt opt on a per port basis, for a subnet, would require code change, | 19:14 |
jog0 | vs doing non kernel based upgrades in this MVP and kernel based in next | 19:15 |
lifeless | jog0: ok so; I'd flip it around, why shouldn't we include them ? | 19:15 |
lifeless | jog0: right now we upgrade the kernel on cd-overcloud every couple of days | 19:15 |
jog0 | lifeless: because the extra complexity of adding HA in everywhere and live migration support | 19:15 |
SpamapS | lifeless: seems to be stuck in git log -1 --pretty=oneline | 19:15 |
SpamapS | lifeless: DOOHHHH | 19:15 |
SpamapS | lifeless: upstart uses pty's for logging | 19:15 |
lifeless | SpamapS: dodah? | 19:16 |
SpamapS | lifeless: pager is running | 19:16 |
greghaynes | lol | 19:16 |
jog0 | lifeless: kernel upgrades are definitely important, in fact russellb is working on that in the gate right now | 19:16 |
lifeless | f* git and the horse it rode in on | 19:16 |
jog0 | but I think doing non-kernel based upgrades is hard | 19:16 |
lifeless | jog0: huh? | 19:16 |
lifeless | jog0: I don't understand what russelb is working on | 19:16 |
jog0 | https://bugs.launchpad.net/bugs/1254890 | 19:17 |
jog0 | bumping the kernel version that we gate on | 19:17 |
lifeless | oh, totally unrelated | 19:18 |
lifeless | this is *deployment* | 19:18 |
jog0 | lifeless: yes, I only brought it up because it shows how important it is to have the right kernel | 19:18 |
lifeless | jog0: so, if we do non-HA that implies a non-reboot code path | 19:18 |
jog0 | lifeless: yes | 19:18 |
SpamapS | lifeless: you fixing, or shall I? | 19:19 |
lifeless | the non-reboot code path *still* needs rolling heat support and no-downtime neutron migrations, because we have to restart services | 19:19 |
jog0 | lifeless: right, and rolling nova upgrades | 19:19 |
jog0 | nova-compute | 19:19 |
lifeless | so that implies some form of HA | 19:19 |
jog0 | lifeless: true | 19:19 |
lifeless | so we just got back to having HA | 19:19 |
jog0 | I should have said non-reboot code path | 19:19 |
lifeless | therefore it's a paradox to say non-HA and VMs are not interrupted | 19:19 |
lifeless | ok, so the non-reboot codepath | 19:20 |
jog0 | yeah, I used the wrong words | 19:20 |
lifeless | we either have to put extra code in to guarantee the kernel doesn't change | 19:20 |
lifeless | that is something like | 19:20 |
lifeless | on the first build cache the kernel files in glance | 19:20 |
lifeless | on every build after that *uninstall* the kernel the OS vendor image has and reinstall the files we want from glance | 19:20 |
lifeless | or | 19:20 |
jog0 | so I think we want logic to check if the kernel changed or not anyway | 19:21 |
lifeless | we have to accept that the next time the node is rebooted (e.g. power failure, whatever) the kernel it boots from (held by nova-bm) and the kernel in it's image (which is where extra modules are held) may differ | 19:21 |
jog0 | because if we make every upgrade require live-migration thats a lot of extra load | 19:21 |
lifeless | and it will then fail to come up in entirely hilarious ways | 19:21 |
lifeless | or | 19:21 |
lifeless | we need to update the kernel boot files held by nova-bm, and know that after the first such deploy we can't load modules anymore | 19:22 |
lifeless | jog0: so hang on, we're not talking optimisations yet | 19:22 |
jog0 | fair enough | 19:22 |
lifeless | jog0: we're comparing two different features we want both of | 19:22 |
lifeless | jog0: the question is do we do A(reboot required but users don't have to care) or B(when a reboot isn't /required/ don't do one) first | 19:23 |
lifeless | jog0: and I'm trying to show that B actually has a bunch of things impacting on it that don't make it this slam-dunk simpler problem | 19:23 |
lifeless | IMNSHO | 19:23 |
jog0 | lifeless: what about doing B but not handling the kernel issues | 19:24 |
lifeless | I think they are about equally hard to implement well, but A solves a lot more cases for us | 19:24 |
lifeless | jog0: then we'd introduce something unsafe | 19:24 |
jog0 | just assume kernel never changes | 19:24 |
jog0 | yes we would | 19:24 |
openstackgerrit | Clint "SpamapS" Byrum proposed a change to openstack/tripleo-incubator: Stop git using a pager in upstart console logs https://review.openstack.org/69447 | 19:24 |
jog0 | I think that this MVP is hard without adding in kernel changes | 19:24 |
openstackgerrit | A change was merged to openstack/tripleo-incubator: Stop git using a pager in upstart console logs https://review.openstack.org/69447 | 19:25 |
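The hang discussed above came from `git log` starting a pager because upstart gives the job a pty for logging. As a hedged sketch (not necessarily what change 69447 does), there are two standard ways to keep git from paging when it is run from a script: the `--no-pager` global option and the `GIT_PAGER` environment variable. The helper names below are hypothetical.

```python
import os
import subprocess


def git_log_oneline_cmd():
    """Build a 'git log' command line that never invokes a pager."""
    return ["git", "--no-pager", "log", "-1", "--pretty=oneline"]


def no_pager_env(base_env=None):
    """Copy an environment and set GIT_PAGER=cat so output passes through."""
    env = dict(os.environ if base_env is None else base_env)
    env["GIT_PAGER"] = "cat"
    return env


def last_commit(repo_dir):
    """Return the one-line summary of HEAD (needs git and a real repo)."""
    out = subprocess.check_output(
        git_log_oneline_cmd(), cwd=repo_dir, env=no_pager_env())
    return out.decode().strip()
```

Either mechanism alone is enough; setting both is harmless and covers scripts that call git indirectly.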
lifeless | jog0: like i said, kernels change quite regularly for cd-overcloud | 19:26 |
jog0 | and that we will quickly find doing no-downtime neutron migrations and rolling upgrades of OpenStack Services etc will be hard | 19:26 |
lifeless | jog0: making the assumption that the kernel never changes would lead to a dead overcloud pretty quickly | 19:26 |
lifeless | jog0: but we have to do those migrations *in both A and B* | 19:26 |
dkehn | lifeless: currently the dhcp_option are being stored in the port attributes, would need support for dhcp_option in the subnet attributes if this is going to be on a subnet basis | 19:26 |
lifeless | dkehn: additional support - yes. | 19:27 |
SpamapS | lifeless: we could pin the kernel. | 19:27 |
jog0 | lifeless: I just think that the proposed MVP is pretty big in scope | 19:27 |
lifeless | SpamapS: until the version we pinned drops out of the archive | 19:27 |
SpamapS | (just playing devil's advocate, I"m on board with focusing on A btw) | 19:27 |
dkehn | lifeless: additional support? | 19:28 |
lifeless | jog0: ok, I accept that; so for lean startup here, what we should focus on is how much we'll learn from A and B | 19:28 |
lifeless | dkehn: storing subnet wide options would be an additional thing in neutron | 19:28 |
dkehn | yes | 19:28 |
dkehn | lifeless: true it would be | 19:28 |
dkehn | lifeless: just want to make sure that's what your saking? | 19:29 |
dkehn | s/saling/asking | 19:29 |
lifeless | jog0: what do we expect to learn from A(rolling graceful image deploys) vs B(rolling graceful rsync deploys) | 19:29 |
lifeless | dkehn: it doesn't make a lot of sense to me to do per-server ntp settings | 19:29 |
lifeless | dkehn: I mean we could, today, as a workaround | 19:29 |
jog0 | lifeless: much of the same things | 19:29 |
jog0 | A and B both need no downtime to VMs etc which involve rolling upgrades etc | 19:30 |
lifeless | jog0: ok, so now lets look at implementation: the delta from do A to do B is: | 19:30 |
jog0 | so I am not saying do B | 19:30 |
lifeless | - 2 compute nodes | 19:30 |
jog0 | I am saying break down A into two phases | 19:31 |
lifeless | + either kernel failures or some associate | 19:31 |
lifeless | ... go on, am listening | 19:31 |
*** hewbrocca has quit IRC | 19:31 | |
jog0 | do A but pin kernel version in phase 1. phase 2 don't pin kernel | 19:31 |
lifeless | jog0: that is more work, not less | 19:32 |
jog0 | how so? | 19:32 |
lifeless | jog0: we need an entirely different deployment method, which we have prototyped | 19:32 |
lifeless | but that needs to be polished, integrated into heat in an appropriate way | 19:32 |
jog0 | what is the different deployment method? | 19:32 |
jog0 | the rsync? | 19:32 |
jog0 | ohh wait now I get it | 19:33 |
*** coolsvap has quit IRC | 19:33 | |
jog0 | upgrading a box with a reboot is easier than without | 19:33 |
jog0 | for images | 19:33 |
jog0 | thats what I was overlooking | 19:33 |
lifeless | it's an optimisation | 19:33 |
lifeless | jog0: have you watched my disk-image-builder video from LCA? | 19:34 |
*** jprovazn has quit IRC | 19:34 | |
lifeless | jog0: I think you might get some insight from it | 19:34 |
jog0 | lifeless: link? | 19:34 |
*** martyntaylor has joined #tripleo | 19:34 | |
lifeless | http://mirror.linux.org.au/linux.conf.au/2014/Friday/111-Diskimage-builder_deep_dive_into_a_machine_compiler_-_Robert_Collins.mp4 | 19:34 |
jog0 | lifeless: thanks | 19:35 |
jog0 | watching now | 19:35 |
*** epim has joined #tripleo | 19:35 | |
jog0 | I'll come back after watching it | 19:35 |
SpamapS | I watched it, and it definitely crystalizes the things we've been driving toward. | 19:36 |
SpamapS | lifeless: so regarding easier/harder.. I'm not sure I agree that the rsync is harder, it is just less known. | 19:36 |
SpamapS | perhaps that is the definition of harder | 19:36 |
lifeless | SpamapS: we need patches in nova for it | 19:37 |
lifeless | SpamapS: unless we kamikaze on kernel changes within it | 19:37 |
*** martyntaylor has quit IRC | 19:38 | |
lifeless | SpamapS: to update the boot files for AMI/ARI/AKI image setups (which is an all-hypervisors thing) | 19:38 |
jog0 | SpamapS: ... there are known knowns; there are things we know that we know. | 19:38 |
jog0 | There are known unknowns; that is to say, there are things that we now know we don't know. | 19:38 |
jog0 | But there are also unknown unknowns – there are things we do not know we don't know. | 19:38 |
jog0 | ~ Donald Rumsfeld | 19:39 |
jog0 | rsync is more in the unknown unknown realm | 19:41 |
greghaynes | Seems like doing the new image/reboot first you also better learn which cases *require* non reboot, which makes sense given you want the general case to be reimage/reboot as it is much easier to test. | 19:41 |
SpamapS | lifeless: wait, I thought B was "no kernel upgrade" ?? | 19:41 |
greghaynes | Or at least thats what I gleaned from the LCA talk... curious how on target it is | 19:42 |
lifeless | SpamapS: B was no reboot | 19:42 |
lifeless | SpamapS: but the kernel changing in the image will impose kernel changing complexity on us | 19:42 |
*** hewbrocca has joined #tripleo | 19:42 | |
lifeless | I have to run, acct appointment - reachable on phone | 19:42 |
dkehn | lifeless: got it | 19:43 |
SpamapS | lifeless: ACK | 19:43 |
SpamapS | greghaynes: the most common update case is updating less than 50 files on the filesystem and none of those being the kernel, I think. | 19:44 |
greghaynes | Sure | 19:45 |
SpamapS | greghaynes: the complexity comes in identifying that you're in such a situation, and optimizing for it. | 19:45 |
greghaynes | I just mean the goal being actually doing CI/CD testing for the images, its a difference of testing the actual image vs hoping two deltas end up in the same state | 19:46 |
SpamapS | greghaynes: right we want to assert that there is no delta every time. | 19:46 |
greghaynes | :) | 19:47 |
greghaynes | So its nice to make that the general case (hence doing it first) | 19:47 |
rlandy | dprince: hi - can you tell me how I could enable iptables during image building? | 19:47 |
SpamapS | One thing I think we can do is test both .. reboot and no reboot.. and then if we can assert that the no-reboot one is running the things we expect.. we can take the no-reboot path from there, onward. | 19:47 |
SpamapS | like we could publish with the tested images, the tested upgrade paths | 19:48 |
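The idea floated here (reboot by default, and take the no-reboot path only when the kernel is unchanged and that exact upgrade path has been tested) can be written down as a tiny decision function. This is a sketch of the reasoning in the discussion, not TripleO code; the function name, the image identifiers and the shape of `tested_paths` are all hypothetical.

```python
def choose_upgrade_path(running_kernel, new_image_kernel,
                        old_image, new_image, tested_paths):
    """Pick 'reboot' or 'no-reboot' for moving old_image -> new_image.

    tested_paths is a set of (old_image, new_image) pairs for which CI
    asserted that the no-reboot result matches a fresh deploy.
    """
    if running_kernel != new_image_kernel:
        # Boot files and on-disk modules would diverge; reboot is the
        # only safe option.
        return "reboot"
    if (old_image, new_image) in tested_paths:
        return "no-reboot"
    # Untested delta: fall back to the general (reimage + reboot) case.
    return "reboot"


if __name__ == "__main__":
    tested = {("img-41", "img-42")}
    print(choose_upgrade_path("3.11.0-15", "3.11.0-15",
                              "img-41", "img-42", tested))
```

Defaulting to "reboot" keeps the general case as the one that is always exercised, with the no-reboot path as an optimisation that has to be earned by a passing test.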
*** openstackgerrit has quit IRC | 20:06 | |
*** openstackgerrit has joined #tripleo | 20:06 | |
*** julim has joined #tripleo | 20:06 | |
*** jcooley_ has quit IRC | 20:22 | |
*** jcooley_ has joined #tripleo | 20:22 | |
openstackgerrit | Ana Krivokapic proposed a change to openstack/tuskar-ui: WIP: Add node detail view https://review.openstack.org/69462 | 20:31 |
*** akrivoka has quit IRC | 20:41 | |
*** lucasagomes has quit IRC | 20:46 | |
*** athomas has quit IRC | 21:01 | |
*** jcoufal has quit IRC | 21:03 | |
*** rlandy has quit IRC | 21:05 | |
*** e0ne has joined #tripleo | 21:07 | |
lifeless | back | 21:15 |
lifeless | SpamapS: interesting idea | 21:16 |
lifeless | SpamapS: I like that better than a human saying 'oh yeah, this is a non-reboot case', or finding out at deploy time that actually a reboot REALLY WAS NEEDED | 21:16 |
lifeless | dprince: hey so - see the discussion above w/joe - does that cover your list questions as well ? | 21:16 |
lifeless | pleia2: you wanted a quick hangout? | 21:17 |
dprince | lifeless: perhaps. I'm not quite sure it is as easy as we think it is though. | 21:18 |
dprince | lifeless: I think I understand the general idea though for sure. | 21:19 |
lifeless | dprince: I'm sure it's not /easy/ I was more seeing if you were satisfied with the rationale for <set of things> as the next step vs <other set of things> | 21:22 |
lifeless | dprince: also - we're now passing check tests on all tripleo repos (and tripleo-ci when infra merge my patch) | 21:24 |
dprince | lifeless: yeah | 21:24 |
jog0 | lifeless: so just watched the video and while very interesting it didn't really answer anything about why do kernel in this MVP. but you answered that separately | 21:24 |
lifeless | dprince: they aren't voting yet - we need the RH region up for that per infra policy | 21:24 |
lifeless | jog0: ack; it was more that it pulled everything together | 21:25 |
jog0 | lifeless: it sure did | 21:25 |
jog0 | very interesting talk | 21:25 |
lifeless | dprince: where are we at on getting access ? | 21:25 |
dprince | lifeless: :(. So... while I certainly value the infra policy it would seem that any gating at all is valuable at this time. | 21:25 |
dprince | lifeless: why not break the rules | 21:26 |
lifeless | dprince: because if ci-overcloud wedges, without a fallback, infra have to have a firedrill to let us land anything at all | 21:26 |
dprince | lifeless: We are waiting for them to essentially unplug the cables and do final security tests to verify the rack has been properly disconnected from any internal networks. | 21:26 |
dprince | lifeless: once that happens it'll go public | 21:26 |
lifeless | dprince: https://bugs.launchpad.net/tripleo/?field.importance=CRITICAL | 21:27 |
lifeless | bug 1272803 | 21:27 |
lifeless | bug 1272969 | 21:27 |
lifeless | bug 1271344 | 21:27 |
lifeless | are all affecting the ci-overcloud | 21:27 |
jog0 | lifeless: btw for the cards for the next MVP a few quick questions | 21:27 |
lifeless | so its not an academic question | 21:27 |
jog0 | when you have a moment | 21:27 |
lifeless | jog0: shoot | 21:27 |
jog0 | so starting with the easy: | 21:28 |
jog0 | why do we need HA API? | 21:28 |
jog0 | or rather do we need HA API for all APIs? | 21:28 |
lifeless | jog0: APIs that I know of | 21:28 |
lifeless | metadata API | 21:28 |
lifeless | if thats not HA, then a rebooting VM will fail to come up | 21:28 |
lifeless | I don't know what the metadata API depends on but I wouldn't be surprised if it calls out to neutron for networking details | 21:29 |
lifeless | so we need HA for the neutron API | 21:29 |
jog0 | lifeless: both of those make sense, what about nova-compute-api | 21:29 |
lifeless | accessing the neutron API requires keystone | 21:29 |
lifeless | so keystone API | 21:29 |
lifeless | metadata API is nova-api | 21:29 |
lifeless | nova-compute doesn't have an API | 21:30 |
jog0 | metadata is one of several nova apis | 21:30 |
lifeless | jog0: they're all in the same process | 21:30 |
lifeless | jog0: so if we get one HA'd we get them all | 21:30 |
dprince | lifeless: re 1272969 (https://bugs.launchpad.net/tripleo/+bug/1272969) what if we just munge the config files in init-neutron-ovs so that dhcp is disabled? | 21:30 |
jog0 | lifeless: err osapi_compute | 21:30 |
jog0 | ahh we run them as an all in one | 21:30 |
jog0 | not everyone does that | 21:30 |
lifeless | jog0: its the default | 21:31 |
* dprince hates coupling... but that would seem to work | 21:31 | |
lifeless | jog0: if the default is wrong, change it :) | 21:31 |
jog0 | lifeless: true, but default isn't to run neutron either | 21:31 |
jog0 | anyway, makes sense | 21:31 |
lifeless | jog0: it was meant to be :) | 21:31 |
lifeless | jog0: in fact, installer docs steer people at neutron rather strongly, so I'd argue it is | 21:31 |
jog0 | I only ask because I assume we are just focusing on uptime for VMs and not APIs. although uptime for VMs means no downtime for many APIs as pointed out | 21:32 |
lifeless | jog0: totally get that | 21:32 |
lifeless | jog0: so basically the only VM consumed HTTP APIs I know of are nova metadata and heat | 21:32 |
lifeless | they are ones that if they go away VM's *per se* may glitch | 21:32 |
jog0 | lifeless: cool we are in agreement | 21:33 |
lifeless | jog0: maybe cinder? | 21:33 |
jog0 | so second question | 21:33 |
lifeless | jog0: nova metadata -> cinder that is | 21:33 |
lifeless | and or | 21:33 |
jog0 | hmm I don't think so, but we will find out soon enough | 21:33 |
lifeless | cinder being needed when live migrating a block storage using VM | 21:33 |
lifeless | you need to detach and reattach the volume | 21:33 |
lifeless | dprince: maybe; I think this is a whiteboard problem | 21:34 |
jog0 | lifeless: so that reminds me actually, for live migration are we going to look at distributed file systems? | 21:34 |
lifeless | dprince: would you like to do a voice call later today and try and nut out the design interactions of ovs, state, dhcp-all-interfaces ? | 21:34 |
lifeless | jog0: I don't think they help do they - you still need to block-migrate the ephemeral volume | 21:34 |
jog0 | lifeless: unless you use distributed file system for ephemeral | 21:35 |
jog0 | which some people do | 21:35 |
lifeless | jog0: ugh :) | 21:35 |
jog0 | not saying we should for the record | 21:35 |
lifeless | jog0: yeah, I know | 21:35 |
*** dprince has quit IRC | 21:35 | |
jog0 | anyway that is a detail that we can sort out later | 21:35 |
jog0 | what I was going to say was: | 21:35 |
lifeless | so my opinionated w/out data opinion is that users that ask for ephemeral are asking for local disk | 21:35 |
lifeless | users that want network disk should use cinder | 21:36 |
jog0 | I think the rolling-upgrade card is pretty big | 21:36 |
lifeless | and we should make cinder be backed by cluster//sheepdog//ceph | 21:36 |
jog0 | lifeless: I agree with that opinion | 21:36 |
jog0 | anyway I haven't played with live migration enough to have an informed opinion about the possible issues | 21:37 |
lifeless | jog0: depending on who you ask it's all terrible or just fine | 21:38 |
jog0 | so for rolling-upgrade | 21:38 |
lifeless | jog0: https://etherpad.openstack.org/p/Live_Migration might be relevant | 21:38 |
lifeless | right | 21:40 |
openstackgerrit | A change was merged to openstack/os-collect-config: Updated from global requirements https://review.openstack.org/69038 | 21:40 |
jog0 | thats a pretty big item | 21:43 |
jog0 | do we have an etherpad for it? | 21:43 |
lifeless | rolling upgrade ? | 21:43 |
pleia2 | lifeless: I think I'm ok, had some paperworky things to do this morning, but post lunch I'm now digging into the devstack scripts w/ fedora | 21:43 |
lifeless | pleia2: ack | 21:44 |
lifeless | oh crap I still have expenses to do :( | 21:44 |
jog0 | lifeless: yeah, etherpad for details on rolling upgrade item | 21:44 |
jog0 | or other doc | 21:44 |
lifeless | looking in heat blueprints | 21:46 |
SpamapS | Heat vms will just not get updated metadata | 21:48 |
SpamapS | gah | 21:48 |
SpamapS | pgup'd and forgot | 21:48 |
SpamapS | ignore me | 21:48 |
SpamapS | https://blueprints.launchpad.net/heat/+spec/rolling-updates | 21:49 |
SpamapS | ^^ rolling updates | 21:49 |
SpamapS | and some completely out of date total fiction http://wiki.openstack.org/Heat/Blueprints/RollingUpdates | 21:50 |
lifeless | so I think there are two things | 21:50 |
SpamapS | probably need to go through that wiki spec and rewrite it to reflect what we actually know now | 21:50 |
lifeless | there's canary controls | 21:50 |
jog0 | SpamapS: lol | 21:50 |
lifeless | and theres graceful N-at-a-time sequencing | 21:50 |
SpamapS | lifeless: right, the canary thing just makes N a calculation. | 21:51 |
lifeless | SpamapS: *and* possibly rollsback on OMG moments | 21:51 |
lifeless | I'll start an etherpad | 21:51 |
lifeless | because we have more than just heat scope | 21:52 |
SpamapS | Well with N-at-a-time do we roll back on one fail? | 21:52 |
lifeless | SpamapS: I was thinking rollback was an orthogonal thing | 21:52 |
lifeless | SpamapS: for first iteration | 21:52 |
*** jcooley_ has quit IRC | 21:53 | |
SpamapS | It is. Heat can either stop and whine, or rollback, on any failure. | 21:53 |
*** jcooley_ has joined #tripleo | 21:53 | |
lifeless | SpamapS: so at high scale we might want to add a third option of ignore failures | 21:54 |
SpamapS | I had always thought with a more convergence-focused Heat we could then argue for a third mode, which is to whine, but keep going in cases where that is allowed. | 21:54 |
lifeless | haha yes | 21:54 |
jog0 | lifeless: so there is the heat aspect to make rolling upgrades possible and then there is the how to actually do them per service | 21:55 |
jog0 | what order works, any gotchas etc | 21:55 |
*** cadenzajon has joined #tripleo | 21:56 | |
*** cadenzajon has left #tripleo | 21:56 | |
*** cadenzajon has joined #tripleo | 21:56 | |
greghaynes | lifeless: looks like the inject_partition=-2 hasnt fixed the qemu-nbd race cond for me | 21:58 |
openstackgerrit | James Slagle proposed a change to openstack/diskimage-builder: Add ability to use local cloud image https://review.openstack.org/68133 | 21:58 |
lifeless | jog0: right | 21:58 |
lifeless | greghaynes: it should have stopped qemu-nbd being used at all :( | 21:58 |
greghaynes | I can assume if the change is shown in /opt/stack/os-config-applier/templates/etc/nova/nova.conf on the node then the change is being used, yes? | 21:58 |
lifeless | greghaynes: possibly there is another place that qemu-nbd is being triggered from ? | 21:58 |
lifeless | greghaynes: once you do a os-collect-config --force --one - it should show up in /etc/nova/nova.conf | 21:59 |
greghaynes | ah, yep its shown in there :) | 21:59 |
* greghaynes investigates | 21:59 | |
lifeless | has nova-compute been restarted? | 22:00 |
greghaynes | Yes | 22:00 |
lifeless | jog0: SpamapS: https://etherpad.openstack.org/p/heat-features-tripleo should be updated perhaps | 22:00 |
lifeless | stevebaker: ^ | 22:00 |
lifeless | SpamapS: also speaking of convergence - https://etherpad.openstack.org/p/heat-workflow-vs-convergence | 22:01 |
*** CaptTofu has quit IRC | 22:06 | |
*** CaptTofu_ has joined #tripleo | 22:11 | |
jog0 | lifeless: perhaps the rolling upgrade card should be two: 1 for heat support and one for what order to upgrade in | 22:11 |
lifeless | jog0: you're thinking of the nuisance conductor thing ? | 22:12 |
*** lblanchard has quit IRC | 22:12 | |
jog0 | lifeless: yup | 22:12 |
jog0 | and of doing db migrations | 22:12 |
jog0 | in general | 22:12 |
lifeless | yeah, PITA stuff | 22:12 |
jog0 | lifeless: which is why I think that card is big | 22:13 |
jog0 | although once we have a good system to actually test ordering of upgrades out, everything becomes much clearer | 22:13 |
lifeless | I don't think it is really | 22:13 |
lifeless | if we have a dependency on the control plane in heat it will upgrade that first | 22:14 |
jog0 | so in nova we have done a lot of work on making RPC work across services with different versions | 22:15 |
jog0 | so new control plane can talk to old non control plane nodes | 22:15 |
jog0 | but I don't know the state of that for !nova | 22:15 |
lifeless | I think we need a https://etherpad.openstack.org/p/tripleo-mvp1 for this one | 22:16 |
lifeless | personally I think the compat stuff would be about 1000 times more obvious if each service was its own code base | 22:17 |
jog0 | lifeless: I think that is irrelevant to MVP1 right? | 22:17 |
jog0 | (I agree with you though) | 22:17 |
*** e0ne has quit IRC | 22:18 | |
jog0 | err MVP4 | 22:18 |
cadenzajon | I'm setting up a tripleo dev/test environment to get started with it and just ran across https://github.com/echohead/tripleo-dev it's pretty outdated, is there any use in it? or a new project that replaces it, beyond devtest.sh? | 22:18 |
lifeless | cadenzajon: https://git.openstack.org/cgit/openstack/tripleo-incubator | 22:18 |
*** rollerj has quit IRC | 22:19 | |
lifeless | wendar: got settled in now? | 22:20 |
wendar | lifeless: got all the set up out of the way | 22:20 |
lifeless | \o/ | 22:20 |
cadenzajon | lifeless: thanks. is there a "best fit" OS for hosting my dev/test tripleo VMs? Ubuntu, Redhat, etc? | 22:21 |
wendar | lifeless: From here, it's mostly about getting familiar with codebases and into a daily habit. | 22:21 |
SpamapS | cadenzajon: many of us are on Ubuntu, some are on Fedora | 22:25 |
SpamapS | cadenzajon: both of those need fairly recent versions.. Ubuntu 12.04 won't cut it. | 22:25 |
cadenzajon | spamaps: good to know... should I pull 13.10 and go to the latest release? | 22:28 |
*** jayg is now known as jayg|g0n3 | 22:29 | |
SpamapS | cadenzajon: 13.10 is what I'm on | 22:30 |
SpamapS | cadenzajon: I usually get the dev release around mid-dev-cycle (in fact upgrading my personal laptop to trusty right now) | 22:30 |
lifeless | jog0: ok so I think I see whats going on with communications | 22:31 |
lifeless | jog0: 'rolling upgrade' means a totally different thing in nova land to heat land | 22:31 |
jog0 | lifeless: what does it mean in heat land? | 22:31 |
lifeless | jog0: in nova land it refers to sequencing conductor -> db migrate -> other services | 22:31 |
lifeless | jog0: in heat land it refers to doing only part of a scaling group at once | 22:32 |
lifeless | jog0: the current behaviour is like this - say you have a 10 server scaling group and you do a stack-update that needs to change the servers (e.g. new image) | 22:32 |
lifeless | it will spin up 10 new servers, then delete the 10 old ones | 22:32 |
jog0 | lifeless: thats part of what it means in nova land. in nova land it means not needing to upgrade all nova-computes at the same time. but to do that we specify an upgrade order that we will test | 22:33 |
lifeless | jog0: if it's set to rebuild, it rebuilds all 10 at once | 22:33 |
SpamapS | jog0: Oh btw, regarding "dunno about other projects" ... Heat, for instance, requires you to stop heat-engine, db_sync, then start the new engine. | 22:33 |
SpamapS | jog0: but heat-api and heat-engine are, in theory, able to be upgraded not in that sequence. | 22:34 |
SpamapS | err | 22:34 |
SpamapS | s/not in that sequence/independent of one another/ | 22:34 |
jog0 | SpamapS: ack | 22:34 |
SpamapS | jog0: anyway, heat should switch to nova's object API and then the DB will work well too. :-P | 22:34 |
lifeless | https://etherpad.openstack.org/p/tripleo-rolling-upgrades | 22:35 |
jog0 | lifeless: so to make sure I have this right: heat land rolling upgrade means 'being able to run a stack-update not all at once' | 22:35 |
jog0 | to use some awkwardly phrased English | 22:35 |
lifeless | not quite | 22:38 |
SpamapS | note that there are two words being interleaved that mean different things | 22:38 |
SpamapS | upgrade != update | 22:38 |
SpamapS | update in this context is heat's term for changing a running stack | 22:39 |
SpamapS | upgrade is specifically an update of software to a new version (I think) | 22:39 |
lifeless | SpamapS: review please https://review.openstack.org/#/c/69270/ | 22:40 |
SpamapS | lifeless: that will print out the private key into the log... | 22:40 |
SpamapS | Have not been following things closely.. but perhaps thats a bad idea? | 22:41 |
lifeless | SpamapS: the private key that is randomly created everytime we boot a testenv, and which has limited privs to just copy the seed, enumerate vms and start and stop vms | 22:41 |
*** epim has quit IRC | 22:41 | |
*** CaptTofu_ has quit IRC | 22:41 | |
lifeless | SpamapS: its also echoed to the log by the ci-client | 22:41 |
SpamapS | Ok | 22:42 |
SpamapS | just checking before I +2 :) | 22:42 |
*** noslzzp has quit IRC | 22:42 | |
lifeless | SpamapS: http://logs.openstack.org/08/68308/2/check/gate-tripleo-deploy/d356b09/console.html | 22:42 |
lifeless | 2014-01-27 03:01:28.810 | 2014-01-27 03:01:15,055 - testenv-client - INFO - Received job : {"remote-operations":"1", "host-ip":"192.168.1.16", "seed-ip":"192.168.1.17", "node-macs":"52:54:00:f9:00:30 52:54:00:96:3c:28 52:54:00:30:29:10", "ssh-key":"LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQ | 22:42 |
lifeless | ... | 22:42 |
*** jdob has quit IRC | 22:42 | |
openstackgerrit | A change was merged to openstack-infra/tripleo-ci: Be verbose in toci_devtest.sh. https://review.openstack.org/69270 | 22:42 |
*** jtomasek has quit IRC | 22:42 | |
jog0 | lifeless: I agree with the 4 steps in https://etherpad.openstack.org/p/tripleo-rolling-upgrades | 22:43 |
jog0 | so we want to maintain API access while doing control plane upgrade right? at least for most APIs | 22:43 |
lifeless | StevenK: morning :) | 22:43 |
jog0 | because as SpamapS, it sounds like heat can't do a db upgrade life | 22:44 |
jog0 | live | 22:44 |
jog0 | not sure how well nova does it today either | 22:44 |
lifeless | so | 22:44 |
lifeless | we don't need to magically get it all right on day one | 22:44 |
lifeless | being able to file bugs about where things are not suitable for deployment is a good thing | 22:45 |
jog0 | lifeless: sounds good, I am just trying to better understand what our desired goal is | 22:46 |
lifeless | ALL THE THINGS | 22:46 |
jog0 | for MVP4 that is | 22:46 |
lifeless | jog0: right | 22:46 |
*** morazi has quit IRC | 22:46 | |
lifeless | so in my head the goal is to go from 'downtime of APIS and all VMS stopped during deployment' | 22:46 |
lifeless | to 'VM workload keep working but you might not be able to start new things // stop old things during deployment' | 22:47 |
*** hewbrocca has quit IRC | 22:48 | |
jog0 | lifeless: I like it, although in my mind VM workload keep working (mostly) is a much easier target. mostly here would mean transient issues that resolve themselves in some window of time | 22:49 |
jog0 | but yeah that goal means we can turn off APIs for short amounts of time while upgrading control plane | 22:49 |
jog0 | (I think) | 22:50 |
dkehn | lifeless: have sometime for a conversation ? | 22:50 |
lifeless | dkehn: sure | 22:50 |
dkehn | lifeless: ok which method, i.e. skype, gtalk, etc. | 22:51 |
lifeless | gtalk | 22:54 |
*** e0ne has joined #tripleo | 23:05 | |
dkehn | devananda: thx | 23:07 |
*** e0ne has quit IRC | 23:10 | |
*** matty_dubs is now known as matty_dubs|gone | 23:13 | |
*** sdague has quit IRC | 23:27 | |
*** sdague has joined #tripleo | 23:28 | |
*** jpeeler has quit IRC | 23:29 | |
*** jpeeler has joined #tripleo | 23:30 | |
*** ftcjeff has quit IRC | 23:36 | |
*** rbrady1 has joined #tripleo | 23:48 | |
*** clarkb has quit IRC | 23:48 | |
*** rbrady has quit IRC | 23:48 | |
*** clarkb has joined #tripleo | 23:49 | |
lifeless | lol | 23:51 |
lifeless | http://en.wikipedia.org/wiki/Jumbogram | 23:51 |
lifeless | An optional feature of IPv6, the jumbo payload option, allows the exchange of packets with payloads of up to one byte less than 4 GiB | 23:51 |
lifeless | I would like to see that | 23:51 |
greghaynes | haha | 23:53 |
greghaynes | im sure most routers will just do fine with that :p | 23:53 |
greghaynes | running heat stack-update on overcloud results in compute node erroring in nova-compute with trying to connect to mysql via local socket (which is actually running on the other node) | 23:55 |
greghaynes | known bug? | 23:55 |
lifeless | greghaynes: I believe that ronelle was seeing that yesterday | 23:55 |
lifeless | greghaynes: check that the mysql url is correct in /etc/nova/nova.conf | 23:55 |
greghaynes | It is | 23:56 |
lifeless | whee | 23:56 |
lifeless | file a bug :) | 23:56 |
greghaynes | What project do you think? | 23:56 |
lifeless | tripleo to start with | 23:56 |
greghaynes | ok | 23:56 |
greghaynes | Also got some info on whats happening with my qemu-nbd, it happens when nova booting demo image from overcloud, best guess is that its because we load in just the qcow2, not the kernel/initrd images. Sounds sane / should we still not be doing qemu-nbd for that case? | 23:58 |
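
Editor's note: the 21:50-21:54 exchange describes Heat-side rolling updates as graceful N-at-a-time sequencing, with a canary that "just makes N a calculation", and three possible reactions to a failure: stop and whine, roll back, or whine but keep going. The Python below is a minimal sketch of that idea only. It is not Heat's implementation or API; the names `batches`, `rolling_update`, `update`, `undo`, and `on_failure` are hypothetical and chosen for illustration.

```python
from typing import Callable, List, Tuple


def batches(servers: List[str], batch_size: int, canary: int = 0) -> List[List[str]]:
    """Split servers into update batches; an optional canary batch goes first."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    plan: List[List[str]] = []
    rest = list(servers)
    if canary > 0:
        # The canary is just a smaller first batch.
        plan.append(rest[:canary])
        rest = rest[canary:]
    for i in range(0, len(rest), batch_size):
        plan.append(rest[i:i + batch_size])
    return [b for b in plan if b]


def rolling_update(
    servers: List[str],
    update: Callable[[str], bool],   # hypothetical: returns True on success
    undo: Callable[[str], None],     # hypothetical: reverts one server
    batch_size: int,
    canary: int = 0,
    on_failure: str = "stop",        # "stop", "rollback", or "continue"
) -> Tuple[List[str], List[str]]:
    """Update servers batch by batch; return (updated, failed)."""
    if on_failure not in ("stop", "rollback", "continue"):
        raise ValueError("unknown on_failure mode: %r" % on_failure)
    updated: List[str] = []
    failed: List[str] = []
    for batch in batches(servers, batch_size, canary):
        for server in batch:
            if update(server):
                updated.append(server)
            else:
                failed.append(server)
        if failed and on_failure == "stop":
            break  # stop and whine: leave already-updated servers as they are
        if failed and on_failure == "rollback":
            for server in reversed(updated):
                undo(server)  # revert in reverse order of update
            updated = []
            break
        # "continue": record the failure but keep going with the next batch
    return updated, failed
```

With ten servers, `batch_size=4` and `canary=1`, the plan is one server, then four, four, and one. Rollback is treated here as a separate failure mode rather than part of the sequencing itself, matching the "rollback was an orthogonal thing" remark at 21:52.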