*** jhesketh_ is now known as jhesketh | 00:02 | |
anteaya | it sounds convincing to me | 00:15 |
nibalizer | yah we'd use dns | 00:18 |
fungi | pleia2: anteaya: yes, that's the long and short of it | 00:18 |
nibalizer | that way we can start up storyboard02.o.o, get it all working right, then cutover dns real fast as the only change | 00:18 |
nibalizer | as the standard-unit-of-upgrading | 00:18 |
fungi | puppetboard would show the name from nova/inventory though | 00:19 |
*** bkero has joined #openstack-sprint | 00:19 | |
nibalizer | that is true | 00:20 |
nibalizer | and that would be sorta confusing | 00:20 |
fungi | i honestly wonder whether using a separately-zoned subdomain for these wouldn't solve some of our dns automation issues | 00:20 |
anteaya | okay, confusion reigns in puppetboard, but the users can find the thing they are looking for | 00:20 |
anteaya | yay | 00:20 |
fungi | the launch can set up forward/reverse dns with the ordinal suffix, actual service addresses in the normal openstack.org zone are still maintained by hand though | 00:21 |
fungi | so storyboard.openstack.org ends up being a cname to storyboard02.infra.openstack.org or whatever | 00:22 |
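A quick sanity check for that kind of setup, assuming the CNAME and the delegated infra.openstack.org zone both existed (at this point neither does):

```bash
# would print the CNAME target if the alias were in place (hypothetical)
dig +short CNAME storyboard.openstack.org
# would list the nameservers the subdomain is delegated to (hypothetical)
dig +short NS infra.openstack.org
```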
* nibalizer has reviewed anything with the topic trusty-upgrades | 00:22 | |
nibalizer | anybody got a patch they are about to hit submit on before I turn in to a pumpkin? | 00:22 |
pleia2 | nibalizer: no no, a baseball | 00:22 |
fungi | we could (much more easily) delegate the infra.openstack.org zone to something maintained by automation that way | 00:22 |
nibalizer | pleia2: are we bringing a glove? | 00:23 |
pleia2 | nibalizer: I am not | 00:23 |
nibalizer | i have been told to get a giants cap | 00:23 |
nibalizer | we can use that | 00:23 |
fungi | nibalizer: pumpkinize thyself | 00:23 |
pleia2 | fungi: I think we're going down a rabbit hole with delegated dns | 00:23 |
fungi | maybe | 00:23 |
pleia2 | nibalizer: there you go :) | 00:23 |
fungi | dns automation is unlikely to happen in the nearish future otherwise though | 00:23 |
pleia2 | yeah | 00:23 |
fungi | it's the "two parties who aren't tightly coupled and don't coordinate are both responsible for maintaining the openstack.org domain, each with separate needs/goals" problem | 00:24 |
fungi | i'm unsure i'll be able to convince the other group who co-maintain that domain with us that we can move the whole thing somewhere with a better api | 00:25 |
fungi | but if we limit our automation updates to a subdomain, we can delegate that somewhere else they don't need to worry about | 00:25 |
pleia2 | yeah, that's fair | 00:26 |
fungi | that's the only reason i'm even considering it | 00:26 |
fungi | anyway, food for thought. that's something we can pretty easily switch to later without any significant change in the naming plan | 00:26 |
fungi | _if_ we discover that lack of autoconfigured dns records is a blocker for automating the rest | 00:27 |
* pleia2 nods | 00:27 | |
*** baoli has quit IRC | 00:32 | |
*** baoli has joined #openstack-sprint | 00:33 | |
*** baoli has quit IRC | 00:33 | |
fungi | oh, here's one possible gotcha with the ordinal suffix | 00:33 |
fungi | do any of our puppet modules do special things with $::fqdn? | 00:33 |
*** baoli has joined #openstack-sprint | 00:34 | |
fungi | like, assume it's the site name for a name-based virtual host? | 00:34 |
fungi | we've been operating on the fqdn==service paradigm in some places for long enough we may have baked that assumption into configuration management | 00:35 |
bkero | Yeah, there's stuff like this: | 00:37 |
bkero | class { 'openstack_project::mirror': | 00:37 |
bkero | vhost_name => $::fqdn, | 00:37 |
bkero | and stuff like this: | 00:37 |
bkero | controller_public_address => $::fqdn, | 00:37 |
bkero | openstackci_password => hiera('openstackci_infracloud_password'), | 00:38 |
fungi | that will all need fixing as we come across it | 00:40 |
bkero | oh definitely | 00:43 |
bkero | openstack_project::mirror, openstack_project::storyboard, openstack_project::storyboard::dev, openstack_project::infracloud::controller are the classes that need updating (these are the entries that use ::fqdn in site.pp) | 00:45 |
bkero | infracloud::controller might be able to get away with still using it, but I'm not familiar enough to say. Storyboard not so much. | 00:46 |
bkero | crinkle: ^ | 00:46 |
*** rfolco has quit IRC | 00:54 | |
*** rfolco has joined #openstack-sprint | 00:56 | |
*** anteaya has quit IRC | 01:18 | |
*** cdelatte has quit IRC | 01:37 | |
*** baoli has quit IRC | 03:05 | |
*** rfolco has quit IRC | 03:10 | |
*** baoli has joined #openstack-sprint | 03:44 | |
*** sivaramakrishna has joined #openstack-sprint | 04:26 | |
*** baoli has quit IRC | 05:16 | |
*** morgabra has quit IRC | 06:15 | |
*** zhenguo_ has quit IRC | 06:16 | |
*** zhenguo_ has joined #openstack-sprint | 06:18 | |
*** med_ has quit IRC | 06:21 | |
*** morgabra has joined #openstack-sprint | 06:23 | |
*** med_ has joined #openstack-sprint | 06:25 | |
*** med_ has quit IRC | 06:25 | |
*** med_ has joined #openstack-sprint | 06:25 | |
*** ig0r_ has joined #openstack-sprint | 08:40 | |
*** sivaramakrishna has quit IRC | 09:34 | |
*** cdelatte has joined #openstack-sprint | 11:03 | |
*** ig0r_ has quit IRC | 11:34 | |
*** rfolco has joined #openstack-sprint | 11:54 | |
*** ig0r_ has joined #openstack-sprint | 11:56 | |
*** pabelanger has joined #openstack-sprint | 12:09 | |
pabelanger | morning | 12:22 |
pabelanger | Going to start on zuul-mergers this morning | 12:23 |
pabelanger | looks like we can launch zm09.o.o then cycle out each of the other servers | 12:23 |
pabelanger | or take zm01.o.o out of service | 12:24 |
pabelanger | and upgrade to trusty | 12:24 |
*** baoli has joined #openstack-sprint | 12:55 | |
*** baoli_ has joined #openstack-sprint | 12:56 | |
*** baoli has quit IRC | 13:00 | |
fungi | bkero: and that's just in the system-config repo. taking the puppet-storyboard repo for example, we have 4 different places in its manifests where $::fqdn is used directly | 13:17 |
fungi | so while the new naming plan has merit, i think there's a lot of cleanup work to be done before we can get there | 13:18 |
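A rough way to audit that, assuming local clones of the repos (paths are illustrative):

```bash
# list direct uses of the fqdn fact in the manifests under discussion
grep -rn '\$::fqdn' system-config/manifests/site.pp
grep -rn '\$::fqdn' puppet-storyboard/manifests/
```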
pabelanger | Hmm | 13:38 |
pabelanger | so, I am confused. Where does group_names come from? https://github.com/openstack-infra/ansible-puppet/blob/master/tasks/main.yml#L17 | 13:38 |
pabelanger | or how is it setup | 13:39 |
fungi | pabelanger: http://docs.ansible.com/ansible/playbooks_variables.html#magic-variables-and-how-to-access-information-about-other-hosts | 13:54 |
fungi | "group_names is a list (array) of all the groups the current host is in" | 13:55 |
fungi | so ansible creates that based on our group definitions | 13:55 |
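One way to see the groups ansible computes for a host, assuming the inventory on puppetmaster.o.o (the host name is just an example):

```bash
# print the group membership ansible derives for the host
ansible zm01.openstack.org -m debug -a "var=group_names"
```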
pabelanger | fungi: Ya, I found that a few mins ago. Should have posted a follow up. Basically debugging why I cannot launch zm09.o.o and it looks to be a bug in your hiera data handling | 13:55 |
fungi | my hiera data handling is just fine, thank you ;) | 13:55 |
pabelanger | we fail to copy /etc/puppet/hieradata/production/group yaml files | 13:56 |
fungi | ahh, as in during launch-node.py or in general? | 13:56 |
pabelanger | ya, launch-node.py | 13:56 |
fungi | i'm not surprised. sounds like a bug/oversight | 13:58 |
pabelanger | I think there is a few issues, testing again as root to see what happens | 14:14 |
*** anteaya has joined #openstack-sprint | 14:14 | |
jeblair | pabelanger: i'd just drop zm01 and keep the numbers the same. we should be able to handle it. | 14:17 |
pabelanger | jeblair: Sure, I can do that. Is there a process to put zuul-merger into shutdown mode or just stopping the service is enough | 14:18 |
pabelanger | Also, it looks like our permissions on /etc/puppet/hieradata need to be updated, running launch-node.py as non-root user fails to populate some hieradata on the remote node. | 14:19 |
jeblair | pabelanger: nope, just stop | 14:20 |
pabelanger | jeblair: ack | 14:21 |
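For reference, the stop is just the init script; the service name is an assumption about these hosts:

```bash
# on the merger being retired
sudo service zuul-merger stop
```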
jeblair | nibalizer: the enumerated hostnames don't help with making cutover faster -- in your storyboard example, we would still need to cutover the production database, etc. the only thing the hostnames help with is not having to use uuids temporarily while we have 2 servers. | 14:22 |
pabelanger | okay, zm01.o.o service stopped | 14:31 |
pabelanger | launching zm01.o.o on trusty | 14:32 |
*** anteaya has quit IRC | 14:42 | |
*** bkero has quit IRC | 14:42 | |
pabelanger | okay, new server is up, but had these errors at the end of launch-node.py: http://paste.openstack.org/show/498627/ | 14:44 |
pabelanger | running /usr/local/bin/expand-groups.sh manually worked as expected | 14:48 |
pabelanger | jeblair: fungi: What is the procedure for the original zm01.o.o. Should I delete it or suspend it after DNS records have been updated? | 14:49 |
jeblair | pabelanger: i think you can delete it | 14:50 |
fungi | pabelanger: yeah, delete in nova | 14:50 |
pabelanger | jeblair: ack | 14:50 |
fungi | and in dns | 14:50 |
pabelanger | fungi: ack | 14:50 |
jeblair | anyone have any problems with cacti? if not, i'll delete the old one now | 14:50 |
fungi | oh, wait, you're just contracting and then expanding, rather than the other way around | 14:50 |
pabelanger | jeblair: nope, looks to be working here | 14:50 |
fungi | so no dns deletion needed, just dns changes i guess | 14:50 |
pabelanger | fungi: okay, great | 14:51 |
fungi | jeblair: cacti seems fine to me | 14:54 |
pabelanger | okay, I've decreased the TTL for zm02-zm08 to 5mins. Will help bring server online faster | 15:03 |
pabelanger | just waiting for zm01 to be picked up before moving on | 15:03 |
*** cdelatte has quit IRC | 15:35 | |
*** yujunz has joined #openstack-sprint | 15:36 | |
*** yujunz has quit IRC | 15:38 | |
jeblair | deleted old cacti | 15:39 |
*** bkero has joined #openstack-sprint | 15:40 | |
*** delattec has joined #openstack-sprint | 15:45 | |
*** anteaya has joined #openstack-sprint | 15:45 | |
pabelanger | jeblair: just a heads up, I had to run: ssh-keygen -f "/root/.ssh/known_hosts" -R cacti.openstack.org on puppetmaster.o.o | 15:54 |
pabelanger | jeblair: ansible was failing to SSH into the new server | 15:54 |
pabelanger | I had to do the same with zm01.o.o | 15:54 |
fungi | yep, host replacements require ssh key cleanup | 15:55 |
fungi | that's normal | 15:55 |
*** morgabra has quit IRC | 15:55 | |
*** morgabra has joined #openstack-sprint | 15:55 | |
jeblair | pabelanger: ah, thanks i forgot about that | 15:56 |
jeblair | pabelanger: though i had disabled it in ansible, or so i thought | 15:56 |
jeblair | pabelanger: was it trying to do that recently? | 15:56 |
pabelanger | jeblair: Ya, I just noticed it this run | 15:56 |
jeblair | pabelanger: hrm | 15:56 |
jeblair | it is in the emergency file | 15:56 |
pabelanger | jeblair: not sure how long it has been trying | 15:56 |
jeblair | 2fb22df0-c176-4885-9a66-5735519c719b # new cacti | 15:56 |
jeblair | cdaf8e04-9fce-4b70-af56-38b3208fe4b4 # old cacti | 15:56 |
fungi | as we discovered with the ssh timeouts a while back, ansible will still ssh into "disabled" hosts, it just won't run puppet apply | 15:56 |
pabelanger | odd | 15:56 |
jeblair | fungi: oh | 15:57 |
pabelanger | Ah, didn't know that | 15:57 |
*** delattec has quit IRC | 15:57 | |
*** anteaya has quit IRC | 15:57 | |
jeblair | i have removed it, so it should now be normal | 15:57 |
fungi | so related to my reply on the server naming thread, i'm planning to replace storyboard.openstack.org today with another storyboard.openstack.org and not storyboard01.openstack.org because i don't want to drag out our current upgrade efforts by insisting that we refactor our puppet modules to support arbitrary hostnames at the same time | 16:02 |
fungi | if replacing a non-numbered host with a numbered one works for certain classes/modules then i'm not opposed, and we should probably avoid baking further hostname assumptions into our config management, but for manifests which require potentially disruptive work to support that i'd rather separate that work from the upgrade work | 16:04 |
*** anteaya has joined #openstack-sprint | 16:06 | |
*** delattec has joined #openstack-sprint | 16:06 | |
pabelanger | fungi: what's the fix? rather then use ::fqdn, change it out for actually hostname? | 16:07 |
pabelanger | not sure that would work either... let me think about it for a bit | 16:10 |
fungi | pabelanger: the "fix" is to pass the name you want in as a parameter wherever you instantiate the module/class in question | 16:12 |
fungi | some of the hits for $::fqdn are just default values, so we may already be plumbed for it in a lot of places, but we're not necessarily passing those in today because we can just rely on the default | 16:13 |
anteaya | the freenode netsplits are taking me into the far corners of the universe, which while fun, may mean I disappear mid-conversation | 16:15 |
anteaya | don't take it personally | 16:15 |
anteaya | I'm trying to stay current by reading the logs | 16:15 |
pabelanger | fungi: so, looks like we needed to delete the original zm01.o.o DNS entries. Since we ended up with 2 of them | 16:17 |
anteaya | and I support using storyboard.openstack.org rather than 01 | 16:17 |
pabelanger | fungi: I would have expected the original to be overwritten like you mentioned too | 16:17 |
fungi | pabelanger: oh, yeah as i said modify the existing dns entries | 16:18 |
fungi | pabelanger: you obviously shouldn't add new ones | 16:18 |
fungi | that ends up turning it into a round-robin | 16:19 |
pabelanger | fungi: Yup, I'll do that moving forward | 16:19 |
*** ig0r_ has quit IRC | 16:24 | |
*** delatte has joined #openstack-sprint | 16:43 | |
*** delattec has quit IRC | 16:46 | |
pabelanger | Hmm | 16:49 |
pabelanger | graphite.o.o might be acting up again | 16:52 |
fungi | we get into an oom condition there semi-regularly | 16:53 |
anteaya | what do you mean when you say acting up | 16:53 |
pabelanger | http://grafana.openstack.org/dashboard/db/zuul-status?panelId=16&fullscreen doesn't appear to be updating any more | 16:53 |
fungi | some services not running? any recent oom killer entries in dmesg -T output? | 16:53 |
pabelanger | fungi: let me check | 16:53 |
pabelanger | last segfault was May 16 for dmesg | 16:55 |
pabelanger | maybe it isn't graphite but nodepool | 16:56 |
pabelanger | Hmm | 16:57 |
pabelanger | 2016-05-24 16:23:37,996 INFO nodepool.NodePool: Target jenkins01 is offline | 16:57 |
pabelanger | is last log entry for nodepool | 16:57 |
pabelanger | switching to openstack-infra | 16:57 |
fungi | ahh, yep, i put jenkins01 into prepare for shutdown yesterday as i had a glut of some 100 nodes in a ready state not running any jobs while we were under a backlog, and i didn't remember to restart it once it finished running the few jobs it did have in progress | 16:58 |
nibalizer | good morning | 17:05 |
anteaya | morning nibalizer | 17:05 |
bkero | howdy | 17:08 |
pabelanger | zm01.o.o looks to be good, I manually ran ansible | 17:11 |
pabelanger | ansible-playbook -vvv -f 10 /opt/system-config/production/playbooks/remote_puppet_else.yaml --limit zm01.openstack.org | 17:11 |
pabelanger | going to start on zm02.o.o shortly | 17:11 |
pabelanger | looks like zuul.o.o firewall is still referencing the old IP address of zm01.o.o. Waiting to see if puppet notices the difference | 17:20 |
nibalizer | pabelanger: woot | 17:22 |
pabelanger | fungi: I cannot remember, is iptables smart enough to refresh DNS every X mins or should I look to manually reload the rules | 17:24 |
fungi | you have to manually reload. host resolution is done when the rules are parsed | 17:26 |
pabelanger | okay, I suspected that was the issue | 17:26 |
pabelanger | fungi: which method do we use to reload iptables? | 17:26 |
pleia2 | probably should use service restart | 17:27 |
fungi | that aside, i'm not a fan of iptables rules that rely on host resolution. that means a compromised dns (without dnssec, mitm is relatively trivial) can adjust your firewall rules | 17:27 |
pleia2 | or reload for iptables | 17:27 |
pabelanger | Ya, I tend to do service iptables reload & | 17:27 |
pabelanger | or restart | 17:27 |
fungi | in at least most places we put ip addresses in our iptables rules | 17:27 |
pabelanger | but background it to make SSH happy | 17:27 |
pleia2 | screen++ | 17:28 |
pabelanger | pleia2: good choice | 17:28 |
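A minimal sketch of that reload, assuming the iptables-persistent service used on these trusty hosts:

```bash
# reload the ruleset so zm01's new address gets re-resolved;
# backgrounding keeps an ssh hiccup from interrupting it
sudo service iptables-persistent reload &
```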
pabelanger | okay, zm01.o.o now attached to gearman | 17:29 |
pabelanger | moving on to zm02.o.o | 17:30 |
*** baoli_ has quit IRC | 17:40 | |
pleia2 | do we have a story going for things to fix in the xenial upgrade path? I couldn't find one, and I know I'm getting ahead of ourselves, but I have a bunch of errors over here on my plate I want to dump somewhere | 17:41 |
pleia2 | could just create another etherpad too | 17:41 |
pleia2 | noted in current etherpad, but planet may need to wait until our xenial upgrade because of broken planet-venus package | 17:42 |
pleia2 | for fun I quickly tried it on a xenial system, but our server manifest needs a fair amount of love for 16.04 | 17:42 |
pleia2 | moving on to test services on static | 17:44 |
pabelanger | zm02.o.o online, and processing gearman events | 17:49 |
pabelanger | okay, moving to zm03.o.o now | 17:52 |
anteaya | pleia2: I think a story for xenial upgrades is a good idea | 18:02 |
pleia2 | ok, I'll put it in system-config | 18:03 |
anteaya | good idea | 18:05 |
pabelanger | zm03.o.o up and running now | 18:10 |
anteaya | nice work | 18:10 |
pabelanger | moving on to zm04.o.o now | 18:12 |
pabelanger | could almost write an ansible playbook for this! | 18:12 |
pabelanger | going real smooth now | 18:12 |
*** ig0r_ has joined #openstack-sprint | 18:23 | |
pabelanger | great! 4/8 zuul-mergers are now ubuntu-trusty | 18:31 |
fungi | hrm, one issue with following the launch readme to the letter. it says to use /opt/system-config/production/launch but seems to want permission to modify ../playbooks/remote_puppet_adhoc.retry from there | 18:35 |
fungi | jeblair: were you running from a local clone of system-config to which your account had write access maybe? | 18:35 |
jeblair | fungi: that is possible | 18:35 |
pabelanger | fungi: I've been using a local clone also | 18:36 |
fungi | giving that a try now | 18:36 |
jeblair | fungi, pabelanger: we can disable retry files in the ansible config... | 18:38 |
fungi | looking good so far. i wonder if there's a flag it needs to tell it to write retry files somewhere other than the playbooks directory, or whether we should just switch the docs | 18:39 |
fungi | oh, or that! | 18:39 |
pabelanger | jeblair: ++ | 18:39 |
jeblair | do we want to disable them everywhere? (i think probably so; i've never used them in our context) | 18:39 |
pabelanger | I've never used a .retry yet | 18:39 |
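A minimal sketch of disabling them globally, assuming ansible on puppetmaster.o.o reads /etc/ansible/ansible.cfg:

```bash
# add retry_files_enabled = False under the existing [defaults] section
sudo sed -i '/^\[defaults\]/a retry_files_enabled = False' /etc/ansible/ansible.cfg
```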
fungi | new problem... | 18:39 |
fungi | [WARNING]: log file at /var/log/ansible.log is not writeable and we cannot create it, aborting | 18:39 |
jeblair | fungi: i've seen that but i thought it was not fatal | 18:40 |
pabelanger | Ya, we need to update the permissions to puppet:puppet | 18:40 |
pabelanger | I am also launching a server ATM | 18:40 |
pabelanger | wonder if that is the reason | 18:40 |
fungi | oh, hrm yeah looks like i failed for some other unspecified reason. need to run with --keep and try again so i can see the puppet logs | 18:40 |
pabelanger | I had too many issues running as non-root this morning. I've since moved to running launch-node.py as root | 18:42 |
pabelanger | I believe there is a permission issue on /etc/puppet/hieradata for non-root users | 18:43 |
pabelanger | I haven't debugged it more | 18:43 |
pabelanger | I've also had to patch launch-node with the following hack: http://paste.openstack.org/show/498717/ | 18:44 |
pabelanger | otherwise group hieradata didn't seem to copy properly | 18:44 |
*** baoli has joined #openstack-sprint | 19:27 | |
pabelanger | taking a break after the meeting for a few minutes, before deciding which host to do next. | 19:35 |
anteaya | pabelanger: makes sense, nice work getting them all done | 19:36 |
pabelanger | anteaya: thanks. People who did the puppet manifests deserve the credit, I just pushed buttons :D | 19:38 |
*** baoli has quit IRC | 19:39 | |
*** ig0r_ has quit IRC | 19:39 | |
anteaya | pabelanger: pushing the buttons helps | 19:43 |
pabelanger | okay, going to work on zuul-dev.o.o now | 20:06 |
*** baoli has joined #openstack-sprint | 20:07 | |
anteaya | pabelanger: cool | 20:09 |
pabelanger | okay, zuul-dev.o.o online and running ubuntu-trusty | 20:23 |
anteaya | yay | 20:23 |
*** baoli has quit IRC | 20:29 | |
*** baoli has joined #openstack-sprint | 20:29 | |
*** baoli has quit IRC | 20:34 | |
*** baoli has joined #openstack-sprint | 20:35 | |
pabelanger | okay, so it looks like graphite.o.o is ready. Moving to it | 20:37 |
pabelanger | going to need some help migrating the volume for it however | 20:38 |
anteaya | go go graphite | 20:42 |
pabelanger | I believe we'll need to do http://docs.openstack.org/infra/system-config/sysadmin.html#cinder-volume-management first on the current graphite.o.o server | 20:42 |
pabelanger | fungi: jeblair: do you have a moment to confirm ^. I need to do those steps for graphite.o.o to persist data | 20:44 |
anteaya | pabelanger: not sure about them but I'm in the tc meeting, 15 minutes remaining | 20:44 |
pabelanger | anteaya: ack | 20:44 |
fungi | pabelanger: yes, you'll need to deactivate the logical volumes on the current production server, detach them and attach them to the replacement server | 20:45 |
pabelanger | I stand corrected, it seems there already is a volume | 20:46 |
pabelanger | | 505ff749-bf4a-4881-8a4e-ff2f50d1e0ca | graphite.openstack.org/main02 | in-use | 1024 | Attached to graphite.openstack.org on /dev/xvde | | 20:46 |
fungi | there are more steps than that but it's the gist (umount /var/lib/graphite/storage, vgchange -a n main, openstack server volume detach...) | 20:46 |
jeblair | pabelanger: right, there should be an existing volume and what fungi said. | 20:47 |
pabelanger | fungi: okay, let me get the replacement server up first, confirm puppet is happy, then review the steps to migrate the volume | 20:47 |
fungi | we haven't directly documented moving a volume from one server to another but that document gets you basic familiarity with the toolset there | 20:49 |
fungi | gist is to make sure you don't yank volumes out from under a server and corrupt them | 20:50 |
pabelanger | right | 20:50 |
pabelanger | okay, new server is online | 20:50 |
fungi | make sure they're firmly settled and taken out of service at both the filesystem level and volume group level before detaching them at the cinder level | 20:50 |
pabelanger | let me stop graphite from running on the old server | 20:51 |
jeblair | pabelanger: stop apache, carbon-cache, statsd | 20:51 |
pabelanger | jeblair: thanks | 20:52 |
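Pulling the steps discussed above into one place, a hedged sketch of the move (mountpoint, VG/LV names, and service names are taken from this conversation; the openstack CLI subcommands are assumptions worth double-checking):

```bash
# on the old graphite.o.o: quiesce everything using the volume
sudo service apache2 stop && sudo service carbon-cache stop && sudo service statsd stop
sudo umount /var/lib/graphite/storage      # release the filesystem
sudo vgchange -a n main                    # deactivate the LVM volume group

# from a host with openstack credentials: move the cinder volume
openstack volume list                      # note the volume id(s)
openstack server remove volume graphite.openstack.org <volume-id>
openstack server add volume <new-server-id> <volume-id>

# on the replacement server: reactivate and mount
sudo vgchange -a y main
sudo mount /dev/main/graphite /var/lib/graphite/storage
```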
pabelanger | okay, unmount /var/lib/graphite/storage next | 20:52 |
pabelanger | umount* | 20:53 |
pabelanger | vgchange -a n main next | 20:53 |
pabelanger | 0 logical volume(s) in volume group "main" now active | 20:54 |
pabelanger | fungi: confirming, that is correct? ^ | 20:54 |
fungi | pabelanger: yep. you can also confirm with vgs | 20:55 |
fungi | odd that lvs still shows the graphite logvol | 20:57 |
fungi | i don't remember if it's supposed to or not | 20:58 |
pabelanger | I'm not familiar enough with vgs: http://paste.openstack.org/show/498746/ | 20:58 |
pabelanger | I assume that looks correct? | 20:58 |
anteaya | pabelanger: i have opened the paste and stared at it | 20:59 |
anteaya | I have no idea what it should look like though | 21:00 |
pabelanger | fungi: jeblair: okay, I see 2 volumes attached to graphite.o.o using openstack volume list | 21:00 |
*** rfolco has quit IRC | 21:03 | |
pabelanger | fungi: pvs shows both volumes on the original host | 21:05 |
pabelanger | which I believe is expected | 21:05 |
fungi | i wonder if that's https://launchpad.net/bugs/1088081 happening | 21:05 |
openstack | Launchpad bug 1088081 in lvm2 (Ubuntu) "udev rules make it impossible to deactivate lvm volume group with vgchange -an" [High,Confirmed] - Assigned to Dimitri John Ledkov (xnox) | 21:05 |
jeblair | fungi, pabelanger: that looks correct to me | 21:06 |
jeblair | what's wrong? | 21:06 |
fungi | oh, i simply can't remember whether lvs and vgs show the logvol and vg even after making the vg unavailable with vgchange | 21:06 |
pabelanger | Nothing's wrong, I am just confirming things are correct. First time detaching/attaching IPs | 21:06 |
pabelanger | err | 21:07 |
pabelanger | volumes | 21:07 |
jeblair | you use lvs to confirm that it's inactive -- the "o" attr in lvs says that it's "open" (which == active) | 21:07 |
jeblair | that is absent in lvs on graphite, so we're good | 21:07 |
jeblair | LV VG Attr LSize Origin Snap% Move Log Copy% Convert | 21:07 |
jeblair | graphite main -wi--- 2.00t | 21:07 |
jeblair | compare to another host: | 21:07 |
jeblair | LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert | 21:07 |
jeblair | cacti main -wi-ao--- 20.00g | 21:07 |
pabelanger | great, continuing to detach the volume from openstack | 21:07 |
jeblair | (oh, a for active, o for open == mounted. but yeah, you want them both gone) | 21:08 |
fungi | ahh, for some reason i was thinking there was an equivalent flag in the vgs output but not finding it in the manpage | 21:09 |
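For the record, the attribute column being read here can be pulled out directly:

```bash
# 'a' = active, 'o' = open/mounted; both should be gone before detaching
sudo lvs -o lv_name,vg_name,lv_attr
```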
pabelanger | okay, currently detaching volumes | 21:20 |
*** baoli has quit IRC | 21:25 | |
*** baoli has joined #openstack-sprint | 21:26 | |
pabelanger | still detaching | 21:35 |
pabelanger | fungi: is detaching a volume a time consuming process? | 21:46 |
anteaya | I'm not of the belief detaching a volume takes a long time | 21:49 |
anteaya | I don't recall it taking this long for other processes I have witnessed | 21:49 |
pabelanger | we are up to 45mins now | 21:52 |
pabelanger | to detach 2 volumes from graphite.o.o | 21:52 |
anteaya | that seems odd | 21:52 |
anteaya | I don't have anything helpful to offer | 21:53 |
anteaya | have you spent any time in the cinder channel before? | 21:53 |
anteaya | they are a nice group | 21:53 |
pabelanger | I have not | 21:53 |
anteaya | might be your excuse to introduce yourself | 21:53 |
fungi | pabelanger: it normally isn't time-consuming. maybe try halting the instance? | 21:53 |
anteaya | <-- going for a walk, back later | 21:53 |
fungi | i wonder if it's waiting for activity to "stop" on the volume (i've not seen that happen before afaik) | 21:54 |
pabelanger | fungi: Ya, I am not sure. pvs and lvs return empty now | 21:55 |
pabelanger | my knowledge is lacking atm | 21:55 |
*** baoli has quit IRC | 21:59 | |
*** baoli has joined #openstack-sprint | 21:59 | |
pabelanger | fungi: jeblair: I have to step away for a few minutes. Still waiting for volumes to detach (detaching is current status). Will need some guidance on the reattach process, once openstack is happy again | 21:59 |
pabelanger | in the mean time, graphite.o.o is still down | 21:59 |
jeblair | pabelanger: wow, that wasn't instantaneous? | 21:59 |
fungi | may be necessary to open a ticket with fanatical support | 22:00 |
jeblair | huh, i don't see an option for online chat support | 22:08 |
jeblair | submitted ticket 160524-dfw-0003689 | 22:10 |
pabelanger | back | 22:14 |
pabelanger | jeblair: nope :( | 22:14 |
pabelanger | perhaps because I did both volumes back to back? | 22:15 |
pabelanger | okay, stepping away for a bit. Going to update status log with current outage | 22:28 |
*** SotK has quit IRC | 22:41 | |
*** SotK has joined #openstack-sprint | 22:43 | |
*** baoli has quit IRC | 22:55 | |
anteaya | I am much less attractive to the bugs than others are this season, I wonder if it is the garlic I have been taking as immune support | 23:33 |
anteaya | if so, double happiness | 23:33 |
*** yuikotakadamori has left #openstack-sprint | 23:34 | |
fungi | the other trick locals were fond of in the mountains where i grew up was to eat the wild onions that proliferated there | 23:35 |
anteaya | ah yes | 23:35 |
anteaya | sulfur | 23:35 |
fungi | but yeah i've heard that lots of garlic can have a similar effect | 23:35 |
anteaya | nice and smelly | 23:35 |
anteaya | I have not skimped on the onions this week either | 23:36 |
anteaya | though they are the tame variety, not nearly as pungent as the wild | 23:36 |
jhesketh | o/ | 23:39 |
fungi | i have the storyboard replacement booted again. will try switching to it shortly | 23:40 |
anteaya | jhesketh: great thank you | 23:45 |
anteaya | jhesketh: so here is the etherpad: https://etherpad.openstack.org/p/newton-infra-distro-upgrade-plans | 23:46 |
anteaya | fungi: yay | 23:46 |
anteaya | jhesketh: and line 64 is the apps entry | 23:46 |
anteaya | and docado is the person who is the ptl for the apps service | 23:46 |
anteaya | and fungi and I talked to him about what needs to be done to switch over | 23:47 |
anteaya | I'll get the url for that conversation | 23:47 |
anteaya | jhesketh: http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2016-05-24.log.html#t2016-05-24T20:07:42 | 23:47 |
anteaya | and unfortunately both he and I are horrible shades of yellow in that rendering | 23:48 |
anteaya | I'm going with the log version but at least you have the timestamp | 23:48 |
*** ianw has joined #openstack-sprint | 23:49 | |
* jhesketh gets up to speed | 23:50 | |
anteaya | jhesketh: take your time I'm between dinner courses | 23:50 |
anteaya | I'll be back in a bit for questions | 23:50 |
anteaya | not that I can answer them, but I can listen | 23:50 |
jhesketh | :-) | 23:51 |
anteaya | :) | 23:51 |