fungi | mordred: do you happen to know which provider/region test-mordred-config-drive is in (or an easy way for me to find out)? | 00:01 |
---|---|---|
fungi | nevermind, it's also ord | 00:02 |
mordred | fungi: you can always edit /var/cache/ansible-inventory/<tab> and look at the metadata in there associated with test-mordred-config-drive (in the future) | 00:03 |
mordred | there's probably a cooler way to do that | 00:03 |
mordred | but that's what I do :) | 00:03 |
fungi | looks like maybe the openstack_project::server class isn't applied to our afs servers, which explains why they didn't get my sources.list fix | 00:05 |
fungi | jeblair: ^ | 00:05 |
fungi | was that intentional or an oversight or do you even remember? | 00:05 |
fungi | sorry, s/probably/definitely/ | 00:06 |
jeblair | fungi: i don't recall; mordred ^? | 00:06 |
mordred | I also do not remember | 00:06 |
fungi | they're using template instead of server | 00:06 |
fungi | looks like maybe it was just a mistake since server is a very thin wrapper around template? | 00:07 |
clarkb | should I go ahead and reboot nodepool to pick up the code changes and patch kernel? | 00:18 |
fungi | clarkb: yeah that can happen. double-check you see that new package installed first | 00:20 |
fungi | clarkb: though worth noting, node utilization is dropping like a stone on the graphs | 00:21 |
fungi | and it's not clear to me that it's due to a drop in workload | 00:21 |
clarkb | fun | 00:23 |
fungi | oh, it's ticked back up now. it may have been one of those "zuul spent an hour handling a reconfigure" | 00:26 |
clarkb | aptitude show linux-image-3.13.0-76-generic State: installed | 00:30 |
clarkb | so I think I am going to go ahead and do nodepool | 00:30 |
clarkb | ready? | 00:30 |
clarkb | in progress now | 00:33 |
clarkb | Linux nodepool.openstack.org 3.13.0-76-generic #120-Ubuntu SMP Mon Jan 18 15:59:10 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | 00:34 |
clarkb | deficit calculations just happened so I think nodepool is running, will keep an eye on it | 00:35 |
fungi | excellent | 00:43 |
fungi | sorry, had to bring a new mattress up a couple flights of stairs, back now | 00:44 |
fungi | so anyway, 269930 has merged. once i see the updates propagate to the afs servers i'll catch them up | 00:45 |
fungi | i've gotten review-dev updated. checking into pypi.region-b.ord-1 now | 00:46 |
fungi | it probably just doesn't have enough disk space to apt-get anything | 00:46 |
fungi | er, pypi.region-b-geo-1 | 00:47 |
fungi | pypi.region-b.geo-1 | 00:47 |
fungi | hrm, no, it has disk space | 00:47 |
fungi | oh, it crashed a dpkg run at some point in the past, so it's not updating | 00:48 |
fungi | and that got it to finish installing linux-image-3.13.0-76-generic | 00:49 |
fungi | so that's just the afs servers missing now | 00:50 |
fungi | and looks like the puppet update is hitting them | 00:50 |
fungi | and now i've got the new kernel on those, so we should be installed everywhere. i'll run one more play to confirm that | 00:54 |
fungi | oh, right, i need to clear the hostlist cache | 00:56 |
fungi | as mordred suggested | 00:56 |
fungi | okay, the skipped ones show ok=1 changed=0, the non-skipped ones show ok=2 changed=1, none are unreachable or failed | 01:02 |
fungi | and the count on the non-skipped servers matches what puppetboard reports for trusty servers | 01:03 |
fungi | so i think we're ready to start rebooting things (aside from nodepool, which clarkb already rebooted) | 01:03 |
fungi | though i may need to take a backseat on this part, as the evening's wearing on here and i still need to unbox this mattress | 01:04 |
anteaya | sounds like pip has just created more fun and I'm too tired to get involved | 01:25 |
anteaya | anything I can do to help this effort before I sign off for the night? | 01:25 |
anteaya | okay good night | 01:32 |
anteaya | congratulations on the new mattress fungi, I hope it gives you restful sleep | 01:32 |
* anteaya heads off to get some restful sleep of her own | 01:33 | |
clarkb | fungi: is there a list of things that need rebooting? maybe you have that handy and can make an etherpad? | 01:47 |
fungi | i simply queried puppetboard, but can whip that up easily enough | 01:47 |
fungi | https://etherpad.openstack.org/p/CVE-2016-0728 | 01:54 |
fungi | that's courtesy of http://puppetboard.openstack.org/fact/operatingsystemrelease/14.04 | 01:54 |
jeblair | Linux codesearch.openstack.org 3.13.0-76-generic #120-Ubuntu SMP Mon Jan 18 15:59:10 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | 02:10 |
jeblair | is that the version we're looking for? | 02:10 |
jeblair | fungi, clarkb: ^? | 02:10 |
clarkb | jeblair: yes | 02:10 |
fungi | jeblair: lgtm | 02:11 |
fungi | i've also noted on the etherpad that clarkb already took care of nodepool.o.o | 02:11 |
clarkb | thanks I just got hit by a fulisade of house things so distracted. Appraisals are very interesting documents | 02:12 |
clarkb | I probably spelled that fwrong | 02:12 |
clarkb | here is 25 pages to read | 02:12 |
clarkb | have fun | 02:12 |
jeblair | i picked a few off, but am going for food now | 02:27 |
jeblair | jhesketh: and, ha! if you are bored, you could reboot some servers! :) | 02:27 |
jhesketh | jeblair: will do once I'm done here :-) | 02:28 |
jhesketh | (here being devstack promotes) | 02:28 |
* jhesketh didn't manage to keep up with everything in here... | 02:37 | |
jhesketh | have all those servers been upgraded and just need a restart? | 02:37 |
jhesketh | (those servers being the list in the etherpad) | 02:37 |
fungi | yep | 02:38 |
fungi | that would be our entire list of trusty servers. i've confirmed the new kernel package is installed on every one of them | 02:38 |
jhesketh | should I do any kind of alerts that services are going down for emergency restarts? | 02:38 |
fungi | i wouldn't for the majority of them | 02:38 |
fungi | though gerrit reboot and zuul reboot probably need to happen at the same time, and zuul's queue dumped/readded | 02:39 |
jhesketh | yeah that one won't be fun | 02:39 |
fungi | oh, wait, zuul doesn't need a reboot | 02:39 |
fungi | so maybe we can just hope review.o.o reboots quickly? | 02:40 |
fungi | pick a time when the gate is nowhere near reporting a change if you can | 02:40 |
fungi | zuul gets a free pass this time because it's on precise, so too old of a kernel to be affected | 02:40 |
fungi | we sort of lucked out that we haven't gotten as far as we'd like on the trusty migrations i guess | 02:41 |
jhesketh | my reading of the CVE is that it's not overly urgent since it requires an attacker to have an account already | 02:41 |
jhesketh | unless of course gerrit accounts count and then it's bad | 02:41 |
jhesketh | otherwise I can wait until later in my evening and things will be quieter for most | 02:41 |
fungi | yeah, non-urgent | 02:41 |
jhesketh | do we know if gerrit users can use the vector? | 02:41 |
fungi | they shouldn't be able to, no | 02:41 |
jhesketh | okay, I'll do the easy ones first and wait for a bit lower traffic for the harder ones | 02:42 |
fungi | but also you are well-located to be more awake than most of us at the times when activity on our systems is lowest | 02:42 |
jhesketh | things like ask.o.o are pretty public facing, what's your opinion on that one? | 02:42 |
jhesketh | indeed | 02:42 |
fungi | a few minutes of unannounced downtime for ask.o.o should be fine. i'm not sure the best way to reach the ask audience anyway since they're basically the people who can't suss mailing lists and don't even know what irc means | 02:44 |
jhesketh | hah | 02:44 |
jhesketh | hmm, afsd doesn't start automatically? | 02:47 |
fungi | maybe that was being saved for later | 02:59 |
fungi | could be there's still some tidying up to do on that deployment | 02:59 |
jhesketh | yeah I've made a note in the etherpad for somebody to look to | 03:01 |
jhesketh | *at even | 03:02 |
fungi | awesome--thanks | 03:30 |
-openstackstatus- NOTICE: review.openstack.org is being restarted to apply patches | 11:42 | |
*** ChanServ changes topic to "review.openstack.org is being restarted to apply patches" | 11:42 | |
*** ChanServ changes topic to "CVE-2016-0728 http://www.openwall.com/lists/oss-security/2016/01/19/2" | 11:53 | |
-openstackstatus- NOTICE: Restart done, review.openstack.org is available | 11:53 | |
*** AJaeger has joined #openstack-infra-incident | 12:01 | |
jhesketh | Most servers are restarted. I've updated the etherpad: https://etherpad.openstack.org/p/CVE-2016-0728 | 12:10 |
jhesketh | 5 servers need checking, probably the most important one being stackalytics.openstack.org | 12:10 |
jhesketh | I don't think the others matter but need checking sooner or later | 12:11 |
jhesketh | there are 5 more that still need restarting.. I suspect they won't be any trouble, but I had queries (see pad) | 12:12 |
*** crinkle_ has joined #openstack-infra-incident | 13:49 | |
*** lifeless_ has joined #openstack-infra-incident | 13:51 | |
*** ianw_ has joined #openstack-infra-incident | 13:54 | |
*** ianw has quit IRC | 13:56 | |
*** anteaya has quit IRC | 13:56 | |
*** clarkb has quit IRC | 13:56 | |
*** crinkle has quit IRC | 13:56 | |
*** lifeless has quit IRC | 13:56 | |
*** Zara has quit IRC | 13:56 | |
*** ianw_ is now known as ianw | 13:56 | |
*** Zara has joined #openstack-infra-incident | 13:56 | |
*** anteaya has joined #openstack-infra-incident | 14:03 | |
*** clarkb has joined #openstack-infra-incident | 14:04 | |
fungi | thanks jhesketh! | 15:25 |
*** crinkle_ is now known as crinkle | 18:04 | |
*** lifeless_ is now known as lifeless | 18:12 | |
*** AJaeger has left #openstack-infra-incident | 21:18 | |
*** ChanServ changes topic to "situation normal" | 21:54 | |
fungi | i guess we're still waiting for centos 7 kernel updates, unless we want to try the stap workaround ianw linked | 21:55 |
clarkb | I think the surface area is small enough on those that we can probably wait | 21:55 |
clarkb | selinux is supposedly something that makes it harder to exploit too and we run with that enabled | 21:56 |
*** mordred has quit IRC | 23:04 | |
*** mordred has joined #openstack-infra-incident | 23:06 |
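
The post-reboot check used in the log (pasting `uname -a` output and asking whether it is the right version) can be sketched as a small script. This is only an illustration, not something the team ran: the function names `kernel_release` and `is_patched` are hypothetical, and the only value taken from the log is the patched package version `3.13.0-76-generic`. The comparison is meaningful only for trusty hosts on the 3.13.0 series; precise hosts, which the log says were not affected, would need separate handling.

```python
import re
from typing import Tuple

# Patched trusty kernel named in the log: linux-image-3.13.0-76-generic
PATCHED = (3, 13, 0, 76)


def kernel_release(uname_line: str) -> Tuple[int, ...]:
    """Extract (major, minor, patch, abi) from `uname -a` or `uname -r` output."""
    match = re.search(r"\b(\d+)\.(\d+)\.(\d+)-(\d+)-generic\b", uname_line)
    if match is None:
        raise ValueError("no Ubuntu generic kernel release found in: %r" % uname_line)
    return tuple(int(part) for part in match.groups())


def is_patched(uname_line: str, minimum: Tuple[int, ...] = PATCHED) -> bool:
    """Return True if the running kernel is at least the patched release."""
    return kernel_release(uname_line) >= minimum


if __name__ == "__main__":
    line = ("Linux codesearch.openstack.org 3.13.0-76-generic #120-Ubuntu SMP "
            "Mon Jan 18 15:59:10 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux")
    print(is_patched(line))
```

Running it against the `uname -a` line jeblair pasted for codesearch.openstack.org reports the host as patched; a host still running an earlier 3.13.0 ABI number would be reported as needing a reboot.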