clarkb | it was an early start for me today again and I think I've begun to run out of steam. See everyone tomorrow. I'm hoping to maybe start doing elasticsearch nodes | 00:04 |
pabelanger | clarkb: ack | 00:06 |
fungi | looks like puppet is okay parsing 527532 so if that merges i'll give subunit-worker01 another try during tc office hours here in a bit | 00:12 |
pabelanger | fungi: ianw: mind a review on https://review.openstack.org/527447/ | 00:16 |
pabelanger | might try one more launch this evening | 00:16 |
fungi | sure thing | 00:17 |
jeblair | yay the paste change landed just in time for me to eod | 00:52 |
pabelanger | still having issues launching eavesdrop01.o.o, running with --keep now and debug in the morning | 01:45 |
*** baoli has joined #openstack-sprint | 01:48 | |
*** baoli_ has joined #openstack-sprint | 01:49 | |
*** baoli has quit IRC | 01:53 | |
*** baoli_ has quit IRC | 03:53 | |
*** baoli has joined #openstack-sprint | 03:54 | |
*** baoli has quit IRC | 04:02 | |
*** skramaja has joined #openstack-sprint | 05:14 | |
AJaeger | clarkb, jeblair, regarding 526946, see changes 1 and 3 of that series; they do what you say and that's the documented way to retire a repo | 06:10 |
frickler | ianw: for ethercalc I removed the npm_package_ensure because that package didn't exist in the 6.x repo, seems the command is provided by the nodejs package instead (re: https://review.openstack.org/#/c/527302/3/manifests/frontend.pp) | 08:35 |
ianw | frickler: oh, ok, so it's all bundled together? | 09:48 |
ianw | do you think that's the problem in https://review.openstack.org/527302 ? i'm going to have to dig into that tomorrow, i'm not sure what's going on | 09:50 |
ianw | i think that's basically holding up the ethercalc & status.o.o (that uses nodejs too) transitions | 09:50 |
frickler | ethercalc was working fine for me with the latest version, I think we only need to look into the data migration there | 10:04 |
frickler | ianw: I'll propose a version of 527302 without it and see how things go | 10:06 |
*** jkilpatr has quit IRC | 11:23 | |
*** jkilpatr has joined #openstack-sprint | 11:54 | |
*** skramaja_ has joined #openstack-sprint | 12:18 | |
*** skramaja has quit IRC | 12:18 | |
*** skramaja_ has quit IRC | 12:23 | |
*** skramaja has joined #openstack-sprint | 12:23 | |
*** skramaja has quit IRC | 13:08 | |
fungi | i want to say i ran into that rebuilding another server and put in a patch to deal with it... looking for it now | 14:12 |
pabelanger | morning! I'll be diving into eavesdrop01.o.o this morning again | 14:14 |
pabelanger | ah, think I see it. | 14:18 |
pabelanger | upstart script | 14:18 |
*** baoli has joined #openstack-sprint | 14:26 | |
*** openstackstatus has quit IRC | 14:36 | |
*** openstack has quit IRC | 14:39 | |
*** openstack has joined #openstack-sprint | 14:41 | |
*** ChanServ sets mode: +o openstack | 14:41 | |
fungi | frickler: ianw: found it. i attempted to work around it in puppet-openstack_health with https://review.openstack.org/508564 (and if memory serves it solved the problem) | 14:42 |
*** openstack has quit IRC | 14:43 | |
*** openstack has joined #openstack-sprint | 14:46 | |
*** ChanServ sets mode: +o openstack | 14:46 | |
frickler | fungi: yes, that description matches what I saw when testing ethercalc yesterday. updated https://review.openstack.org/527302 now accordingly | 14:49 |
fungi | ahh, i guess it's puppet-openstack_health you're working on, so maybe my patch wasn't a total fix | 14:50 |
frickler | fungi: seems a bit more needs to be done for xenial, yes. but ianw was implicitly reverting your patch and that made things worse | 14:51 |
frickler | clarkb: if you don't have other plans, I'd like to start doing the elasticsearch nodes with you, of course others can join if they like, but I'm assuming that it'll be one at a time anyway | 14:55 |
pabelanger | remote: https://review.openstack.org/527707 Add support for systemd init scripts | 15:03 |
pabelanger | when people have some time | 15:03 |
*** ianychoi has quit IRC | 15:04 | |
pabelanger | okay, I'll be deleting files01.o.o here shortly, unless somebody objects | 15:16 |
pabelanger | files02.o.o has been working for last 12 hours | 15:16 |
frickler | pabelanger: the init-type selection looks nicer than what ianw and me have been doing for ethercalc, minor issue though I think in 527707 | 15:17 |
pabelanger | looking | 15:17 |
pabelanger | ah right systemd-reload | 15:19 |
pabelanger | puppet doesn't do that by default | 15:19 |
pabelanger | let me see how ethercalc did it | 15:19 |
* fungi has his fingers crossed that subunit-worker01 will boot into a fully operational condition this time around | 15:21 | |
frickler | pabelanger: I just copied what I found in some other module here https://review.openstack.org/#/c/527144/13/manifests/init.pp | 15:22 |
pabelanger | yah | 15:23 |
pabelanger | remote: https://review.openstack.org/527707 Add support for systemd init scripts | 15:24 |
pabelanger | new patch up | 15:24 |
pabelanger | also copypasta the systemd hack | 15:25 |
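[Editor's note: the "systemd hack" being copied around here is the usual refreshonly-exec pattern for running `systemctl daemon-reload` when a unit file changes, since Puppet's service type doesn't do that on its own. A minimal sketch; the resource and file names below are illustrative, not necessarily what 527707 uses:]

```puppet
# Run "systemctl daemon-reload" only when notified, because puppet
# does not reload systemd automatically after writing a unit file.
exec { 'meetbot-systemd-reload':
  command     => '/bin/systemctl daemon-reload',
  refreshonly => true,
}

file { '/etc/systemd/system/meetbot.service':
  ensure => present,
  source => 'puppet:///modules/meetbot/meetbot.service',
  notify => Exec['meetbot-systemd-reload'],
}
```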
fungi | yay! subunit-worker01 came up with the daemon working, except that i need to revert the removal patch and restart iptables on logstash.o.o now | 15:34 |
fungi | Revert "Remove subunit-worker01.openstack.org" https://review.openstack.org/527720 | 15:43 |
*** ianychoi has joined #openstack-sprint | 16:06 | |
jeblair | paste01 seems to work except that i think it's starting too early and the database hostname doesn't resolve. i'm sticking in an "After=network.target" to the unit file to see if that does it. | 16:07 |
jeblair | yep, that's got it. | 16:07 |
pabelanger | nice | 16:08 |
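[Editor's note: the ordering fix jeblair describes can also be done without editing the packaged unit, via a systemd drop-in. A sketch; the service name `lodgeit` is a placeholder, since the actual unit name for paste01 isn't shown in the log:]

```shell
# Add an ordering dependency so the service starts only after basic
# networking is up and the database hostname can resolve.
mkdir -p /etc/systemd/system/lodgeit.service.d
cat > /etc/systemd/system/lodgeit.service.d/10-after-network.conf <<'EOF'
[Unit]
After=network.target
EOF
systemctl daemon-reload
```

Drop-ins can only add dependencies, not remove them, so this works for adding `After=` but not for undoing an existing `Before=` (which comes up later in this log with netfilter-persistent).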
clarkb | frickler: ya I dont mind doing that but having a slower start today | 16:22 |
clarkb | frickler: not sure if you are still interested | 16:22 |
frickler | clarkb: interested yes, but a bit off the tracks currently, too. maybe later or tomorrow I guess | 16:25 |
clarkb | frickler: ok we will likely do them in a rolling fashion so should be nodes around tomorrow | 16:26 |
pabelanger | running into town for errands, waiting for 527707 to pass tests | 16:49 |
pabelanger | reviews would be also be helpful | 16:49 |
*** jkilpatr has quit IRC | 17:07 | |
clarkb | thinking about the elasticsearch servers and how we might want to do their upgrades. The easy way is just boot a new instance with a new 1TB backing volume and let cluster sync the data over. That will probably be fairly slow. Another way would be to stop shard allocations, stop es on host, remove cinder volume from host, put volume on new host and start es there to use the cinder volume to move the bulk of the data | 17:15 |
clarkb | for the second method, what is the process of removing a volume like that? I guess remove it from lvm entirely first or is it safe to just have cinder remove it under lvm? | 17:15 |
fungi | the big concern is keeping in mind which one is acting as the "main" api endpoint for logstash, right? | 17:16 |
fungi | and i guess if you save it for last (or at least not first) you can adjust configuration for ls to repoint to an upgraded one first | 17:16 |
clarkb | fungi: yup | 17:16 |
fungi | important to remember that even though it's a six-way cluster, we have a spof where ls is connecting | 17:17 |
clarkb | I'm leaning towards using the second method myself, it is more complicated but should go much quicker overall | 17:17 |
fungi | for the second method there, yes you'd want to unmount, deactivate the vg, then cinder detach | 17:18 |
fungi | unmounting just to make sure it's flushed properly and because lvm would make you use some nasty options to deactivate the vg otherwise | 17:19 |
fungi | deactivating the vg because nova will protest if you ask it to remove a volume it believes is still in use by the instance | 17:20 |
clarkb | ok once I have properly caffeinated I will start writing a process down on an etherpad | 17:20 |
fungi | sometimes it still doesn't figure it out, in which case i end up halting the os on it | 17:20 |
clarkb | fungi: oh wow | 17:21 |
fungi | if you need to go even farther, you can delete the instance and then nova will usually get the hint ;) | 17:21 |
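[Editor's note: pulling together the steps from this exchange (plus the ownership fix clarkb discovers later in the log), the second upgrade method looks roughly like this. Host, volume, VG, and mount-point names are illustrative, and the allocation-setting syntax varies across elasticsearch versions:]

```shell
# 1. Stop shard allocation so the cluster doesn't start re-replicating
#    the moment the node goes away.
curl -XPUT http://localhost:9200/_cluster/settings -d '
  {"transient": {"cluster.routing.allocation.enable": "none"}}'

# 2. On the old host: stop the service, then cleanly release the volume.
service elasticsearch stop
umount /var/lib/elasticsearch     # make sure writes are flushed
vgchange -an main                 # deactivate the VG so nova won't
                                  # see the volume as still in use
openstack server remove volume elasticsearch07 es07-data

# 3. Attach the volume to the replacement host and reactivate.
openstack server add volume elasticsearch07-new es07-data
vgchange -ay main
mount /dev/main/elasticsearch /var/lib/elasticsearch

# 4. uids/gids are not stable across installs, so fix ownership,
#    then start es and re-enable allocation.
chown -R elasticsearch:elasticsearch /var/lib/elasticsearch
service elasticsearch start
curl -XPUT http://localhost:9200/_cluster/settings -d '
  {"transient": {"cluster.routing.allocation.enable": "all"}}'
```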
*** jkilpatr has joined #openstack-sprint | 17:23 | |
jeblair | i'll change paste to a cname for paste01 now | 17:29 |
jeblair | done | 17:33 |
clarkb | I got caught in the period of time where dns resolves no A record and no CNAME record for paste.openstack.org | 17:33 |
*** jkilpatr has quit IRC | 17:38 | |
AJaeger | clarkb, jeblair, https://review.openstack.org/526946 is ready now to merge - the noop job merged as planned | 17:40 |
AJaeger | (retirement of puppet-apps_site) | 17:40 |
jeblair | clarkb: i believe the negative ttl is 5m, so... should be over now? | 17:45 |
clarkb | jeblair: yup resolves now /me tests | 17:46 |
clarkb | http://paste.openstack.org/show/628884/ works | 17:46 |
fungi | if https://review.openstack.org/527720 gets approved, then i'll be able to take subunit-worker02 offline and rebuild it safely to get us back to two workers again | 17:47 |
clarkb | fungi: done | 17:48 |
AJaeger | thanks, now we can retire apps-site completely: https://review.openstack.org/526945 and https://review.openstack.org/526943 | 17:51 |
*** jkilpatr has joined #openstack-sprint | 17:52 | |
*** baoli has quit IRC | 17:59 | |
*** baoli has joined #openstack-sprint | 18:00 | |
fungi | AJaeger: thanks! chucked those into the gate now too | 18:08 |
AJaeger | thanks, fungi | 18:08 |
clarkb | I've got the elasticsearch upgrade process braindump in https://etherpad.openstack.org/p/elasticsearch-xenial-upgrade. Now to fill in the outline with actual commands | 18:15 |
pabelanger | and back | 19:01 |
pabelanger | I've just deleted files01.o.o | 19:03 |
pabelanger | clarkb: jeblair: mind if I get a review: https://review.openstack.org/527707/ system scripts for xenial on meetbot | 19:04 |
clarkb | pabelanger: ya I'll take a look | 19:12 |
clarkb | fungi: after reading much documentation I think I got the vg migration stuff correct in https://etherpad.openstack.org/p/elasticsearch-xenial-upgrade do you mind taking a look? | 19:13 |
clarkb | pabelanger: reviewed | 19:16 |
clarkb | pabelanger: left a comment on a thing | 19:16 |
fungi | clarkb: in my experience, the new system will have the lv known to devmapper within moments of the nova volume attach happening, but then again i wasn't doing vgexport on the old system either so ymmv | 19:18 |
clarkb | fungi: gotcha thanks | 19:18 |
clarkb | fungi: I can run pvs/vgs on the new side to see what state it is in before attempting the import | 19:19 |
pabelanger | clarkb: looking | 19:19 |
pabelanger | clarkb: yes, you are right. Fixing | 19:20 |
clarkb | I'll wait to rereview pabelanger's thing then its brunch and start on elasticsaerch after | 19:20 |
pabelanger | remote: https://review.openstack.org/527707 Add support for systemd init scripts | 19:21 |
clarkb | pabelanger: +2 | 19:24 |
clarkb | I'm in search of breakfast tacos now, back in a bit | 19:24 |
*** openstackstatus has quit IRC | 19:27 | |
pabelanger | thanks | 19:29 |
fungi | what, are you in austin or something?!? | 19:38 |
fungi | i got the impression no other city was allowed to have breakfast tacos | 19:39 |
pabelanger | Sadly no breakfast tacos near me | 19:40 |
pabelanger | fungi: mind a review again on 527707 | 19:40 |
fungi | you must be psychic, i had just pulled it up | 19:40 |
fungi | pabelanger: okay, lgtm. seems a little indirect having a separate name for the file resource than its path in that case, but not especially incorrect | 19:45 |
pabelanger | fungi: yah, syntax is valid, but didn't know how else to do it without creating 2 different entries | 19:48 |
fungi | sure, it's fine | 19:49 |
fungi | just took me a bit of staring to realize the file was being referred to by something other than its path, when we already knew the path and it was parameterized | 19:49 |
pabelanger | yah, maybe I should have done large if / else block over variables to make it easier to read | 19:51 |
clarkb | fungi made them myself, nowhere near as good as what you get in austin though | 19:52 |
fungi | ahh | 19:52 |
*** jkilpatr has quit IRC | 20:03 | |
fungi | subunit-worker01 is actively taking work from gearman now, so i'm shutting down/deleting 02 and removing it from dns, then will launch its replacement | 20:18 |
pabelanger | okay, launching eavesdrop01.o.o again | 20:21 |
fungi | okay, old subunit-worker02 is gone and the replacement is being booted | 20:25 |
fungi | gonna go grab late lunch/early dinner and check on this when i get back | 20:25 |
pabelanger | hey, it worked | 20:26 |
ianw | if i could get a couple of eyes on the codesearch changes https://review.openstack.org/527544 and https://review.openstack.org/527557 that would be great. i'd like it get it back to regular puppeting and then i'll upgrade it | 20:27 |
ianw | i actually tested that "live" on codesearch so it's all updated now too | 20:27 |
ianw | then i'll check out frickler's nodejs stuff and see where we are with those various bits | 20:28 |
ianw | first i need to deal with a cupcake emergency | 20:28 |
*** openstackstatus has joined #openstack-sprint | 20:28 | |
*** ChanServ sets mode: +v openstackstatus | 20:28 | |
pabelanger | okay, I am going to get a coffee then start the process to migrate volume from eavesdrop.o.o to eavesdrop01.o.o | 20:30 |
pabelanger | which will result in an outage of IRC logs | 20:31 |
pabelanger | should maybe first confirm we don't have any meetings going | 20:31 |
clarkb | pabelanger: ya we should do that | 20:34 |
* clarkb reviews ianw changes then will be attempting es07 upgrade | 20:35 | |
pabelanger | yah, we have a few more hours of meetings | 20:38 |
pabelanger | I'll hold off until a little later to migrate the volumes | 20:39 |
*** jkilpatr has joined #openstack-sprint | 20:39 | |
jeblair | deleted the old paste.openstack.org | 20:48 |
pabelanger | ack | 20:49 |
pabelanger | I'm going to take a stab at cacti02.o.o | 20:50 |
jeblair | pabelanger: lemme know if you have any questions | 20:51 |
pabelanger | will do | 20:52 |
clarkb | ianw: changes lgtm, left a comment but not worth a new patchset I don't think | 20:52 |
clarkb | es07 replacement is launching now | 20:59 |
pabelanger | Hmm, php5 isn't supported on xenial, only php7 it seems. will have to see how that plays with cacti | 21:03 |
pabelanger | as we try to load some php5 apache mods | 21:03 |
pabelanger | but for now, heading outside to play in the snow :D | 21:04 |
mtreinish | pabelanger: it looks like it should work: https://bugs.launchpad.net/ubuntu/+source/cacti/+bug/1571432 (assuming we use the packaged version) | 21:11 |
openstack | Launchpad bug 1571432 in cacti (Ubuntu) "Cacti package is incompatible with PHP7 on Xenial" [High,Fix released] - Assigned to Nish Aravamudan (nacc) | 21:11 |
clarkb | fungi: looks like launch failed for me because subunit-worker02 does not have dns records anymore which broke the firewall. I've confirmed that the firewalls on logstash workers are unhappy as well. However subunit workers have nothing to do with logstash or logstash workers so I'm now going to take a detour to figure out why our firewall rules are a bit too greedy | 21:15 |
mtreinish | clarkb: they use the same gearman server so they need to have the hole open to get jobs from the server | 21:21 |
mtreinish | clarkb: unless things have changed there? | 21:21 |
clarkb | mtreinish: ya they need a hole to logstash.o.o over port 4730 | 21:21 |
clarkb | but as is we are allowing them all to talk on the elasticsearch ports too which isn't necessary | 21:22 |
mtreinish | ah, ok | 21:22 |
mtreinish | things are likely too greedy because I screwed up when I added the rules and just copy and pasted everything from the logstash workers for the subunit side in site.pp | 21:23 |
pabelanger | mtreinish: yah, I think we just need to fix the apache mod we enable | 21:25 |
clarkb | mtreinish: fungi https://review.openstack.org/527787 I think that should do it | 21:28 |
mtreinish | clarkb: +1 | 21:37 |
clarkb | fungi: also I think this means we can't remove things from dns without removing it from firewall rules first | 21:37 |
clarkb | fungi: because unfortunately the resulting behavior seems to be "be wide open" | 21:38 |
clarkb | though it didn't affect the trusty nodes? | 21:38 |
clarkb | oh! I know what it was | 21:38 |
clarkb | it was the adding of 01 back in but leaving 02 as well | 21:38 |
clarkb | that caused netfilter-persistent to reload rules | 21:39 |
clarkb | fungi: what I don't understand is why the rule got flushed | 21:41 |
clarkb | testing on logstash-worker01 if I remove subunit-worker02 rules load, then if I add it back again the old rules remain | 21:41 |
clarkb | and reading scripts that seems to be the intended effect | 21:41 |
pabelanger | remote: https://review.openstack.org/527793 Bump puppetlabs-apache to 1.11.1 | 21:46 |
pabelanger | jeblair: clarkb: figured out php7 issue for cacti^. Seems it is our only manifest right now using puppetlabs-apache, so bumps the dependency to latest 1.x release | 21:46 |
jeblair | pabelanger: know why we aren't using puppet-httpd? | 21:47 |
*** baoli has quit IRC | 21:48 | |
jeblair | maybe now would be a good time to switch? | 21:48 |
*** baoli has joined #openstack-sprint | 21:48 | |
pabelanger | jeblair: seems like an oversight when we did the conversion, but can dig into it. However, we do have a long standing issue to migrate back to puppetlabs-apache, since we merged our forked code upstream. We just never found the time to migrate back | 21:48 |
jeblair | pabelanger: oh ok. guess it doesn't matter then :) | 21:49 |
pabelanger | yah, if we are going to keep puppet around for next round of LTS, I think migrating to puppetlabs-apache is the right move. I did have a few patches up to start on some modules, but never pushed hard on them | 21:49 |
fungi | okay, back now... looks like stuff is going on? | 22:11 |
ianw | frickler: so "Error: /Package[gulp]: Provider npm is not functional on this host" is the main error @ http://logs.openstack.org/02/527302/5/check/legacy-puppet-beaker-rspec-infra/39955b9/job-output.txt.gz | 22:13 |
fungi | i have an 02 and am adding it back to dns now | 22:13 |
clarkb | fungi: thanks, I'll redo es07 boot shortly | 22:13 |
fungi | clarkb: there is once again dns for 02 | 22:13 |
fungi | but the negative ttl will come into play for a few more minutes probably | 22:14 |
clarkb | ya I'll watch it | 22:14 |
fungi | may just be that we can't replace some systems while we're in the middle of replacing some other systems | 22:15 |
fungi | i need to at least restart {iptables,netfilter}-persistent on logstash.o.o | 22:16 |
clarkb | ya and all the workers as well | 22:17 |
clarkb | to pick up the rules | 22:17 |
fungi | that seems to have fixed the ability of the replacement subunit-worker02 to pull jobs again | 22:18 |
clarkb | I can help with that once I get es07 going | 22:18 |
fungi | all what workers? | 22:18 |
clarkb | logstash and subunit | 22:19 |
ianw | frickler: "When using the NodeSource repository, the Node.js package includes npm" so this is where i guess it's all getting confused | 22:19 |
clarkb | depending on if their iptables rules are loaded properly | 22:19 |
pabelanger | I have 2 patches up for review https://review.openstack.org/527793/ and https://review.openstack.org/527796/ related to cacti02.o.o. reviews most welcome | 22:19 |
fungi | the logstash workers have iptables rules which involve the subunit workers? | 22:20 |
clarkb | fungi: yes, see https://review.openstack.org/#/c/527787/ | 22:20 |
clarkb | it is a bug, but I've attempted to address it in ^ | 22:20 |
fungi | ahh | 22:21 |
ianw | fungi: i think npm gets installed to /usr/local/bin ... this has shades of your pip issue? | 22:27 |
fungi | color me unsurprised | 22:27 |
ianw | although, maybe not, yours was calling with full path | 22:28 |
ianw | did we determine if /usr/local/bin was actually in the path correctly? | 22:28 |
fungi | nope, i have generally assumed that puppet has a default shell path of "" | 22:28 |
fungi | though if it's documented somewhere, i'm happy to start making new assumptions | 22:29 |
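[Editor's note: this one is documented — Puppet's `exec` resource inherits no PATH at all; the command must be fully qualified, or the resource must set `path` explicitly. A sketch with illustrative resource names:]

```puppet
exec { 'install-gulp':
  command => 'npm install -g gulp',
  # exec has no default search path, so list one explicitly; include
  # /usr/local/bin for tools that npm installs there.
  path    => ['/usr/local/bin', '/usr/bin', '/bin'],
  unless  => 'npm ls -g gulp',
}
```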
ianw | frickler: ok, my next attempt is to use nodejs 6.x ... which appears to be the oldest LTS release. 0.12 is EOL already, so maybe this is part of the failure. also, i don't understand how nodejs versioning works :) | 22:31 |
clarkb | fungi: lvm steps as described seem to work and cinder had no trouble detaching it | 22:36 |
fungi | cool | 22:36 |
clarkb | rebooting now to make sure fstab is happyness | 22:36 |
clarkb | aha found a missing step :) need to chown everything as uids/gids are not stable across installs | 22:38 |
fungi | oh! yeah usually that doesn't change | 22:38 |
clarkb | I'm editing DNS now but think I can bring new elasticsearch07 into existing cluster shortly | 22:41 |
ianw | frickler: yay, that worked! now, the question is if openstack_health is 100% ok with the later nodejs version. i think we just have to assume it is and fix anything as it occurs | 22:49 |
pabelanger | okay, stepping out for a few hours, but when I return I'll attempt to move eavesdrop volume to new server | 22:53 |
clarkb | alright reboots seem to be what are clearing out the firewall rules | 22:53 |
clarkb | fungi: ^ fyi | 22:53 |
clarkb | I've not put es07 in use yet as I am trying to figure out why reboots seem to make iptables unhappy | 22:53 |
*** harlowja has quit IRC | 22:54 | |
clarkb | Before=network.target <- really? | 22:55 |
clarkb | infra-root so ^ is an issue. Do you think we should install our own unit that starts it again after network.target? | 22:55 |
ianw | oh firewall rules and systemd ... i feel like we've discussed this before. maybe it was dib ... | 22:56 |
clarkb | that's a fairly major regression in ubuntu, but maybe it doesn't count because the service has a new name | 22:57 |
clarkb | seems like there are ways to shadow system supplied units? but I may be misremembering. Worst case I think if we just had infra-netfilter-persistent with a different dependency it would work? | 22:58 |
ianw | https://review.openstack.org/#/c/293826/ is what i'm thinking of. "network-pre.target is a target that may be used to order services before any network interface is configured. It's primary purpose is for usage with firewall services that want to establish a firewall before any network interface is up." | 22:59 |
clarkb | problem here is we can't configure our firewall until networking is up | 23:00 |
clarkb | because dns resolution | 23:00 |
ianw | ahh | 23:00 |
fungi | well, my classic network security training says don't use dns names in firewall rules. that probably doesn't help us much | 23:01 |
clarkb | ya we could convert over to ip addresses everywhere | 23:02 |
clarkb | it makes managing it in config management more painful though | 23:02 |
fungi | yeah, mostly based on the idea that unless you hardcode ip addresses into firewall rules, your firewalls are only as secure as your nameservers | 23:02 |
clarkb | and ubuntu will refuse to install any rules if you use dns with default service :) | 23:03 |
ianw | and then does it actually re-check the name resolution at any point? or do you have to reload the rules anyway if the dns changes? | 23:03 |
* fungi notes that he mostly managed firewalls which protected nameservers, so priorities may have been mildly askew | 23:03 | |
clarkb | ianw: you have to reload if dns changes | 23:03 |
ianw | so it's not so much dns as ns :) | 23:04 |
fungi | this is a fair point | 23:04 |
fungi | nondynamic name service | 23:04 |
fungi | the internet has changed a lot, and become a freespirited party kind of place | 23:05 |
*** baoli has quit IRC | 23:05 | |
*** baoli has joined #openstack-sprint | 23:06 | |
clarkb | zuulv3.o.o uses names as well but has rules installed | 23:06 |
clarkb | (it is xenial) | 23:06 |
clarkb | jeblair: pabelanger ^ any idea if that was handled for zuul? | 23:06 |
ianw | i remember pabelanger doing a swizzle where we removed the rules, brought up hosts and then re-added them | 23:07 |
clarkb | fwiw I'm particularly worried about open ES on the internet because they are known for being abused, logstash-workers also exhibiting this but less difficult to abuse them | 23:08 |
jeblair | catching up | 23:10 |
fungi | yeah, having our elasticsearch api sockets default to "go away" seems like a good thing | 23:11 |
fungi | already dealt with a wiki server getting compromised within minutes of running with an exposed elasticsearch api listener | 23:11 |
fungi | that's plenty for me | 23:12 |
jeblair | clarkb: well, also if we ever run zuul with my repl patch installed and it answers connections from outside localhost, we, erm, would need to rotate all our secrets. | 23:12 |
clarkb | jeblair: ya, I checked it and it seems happy now | 23:12 |
clarkb | er it being the firewall at least | 23:12 |
jeblair | but i checked before using it and firewall seemed ok. i don't know why that is if there's a structural problem | 23:12 |
*** baoli has quit IRC | 23:12 | |
clarkb | I'm wondering if it was manually started to reload the rules after last reboot | 23:13 |
clarkb | (I also seem to recall checking the firewall when the repl was installed a while back as well) | 23:13 |
clarkb | Thinking about ways to address this, installing our own unit to run the same commands with different dependencies is probably safe and reliable | 23:14 |
clarkb | however | 23:15 |
clarkb | you'd want to order it such that it was done before "real" system services start but after networking | 23:15 |
clarkb | and I think after networking is sort of the common "system is up, start everything else" so that may be tricky | 23:15 |
ianw | fungi: you've looked at some "[" quoting issues ... http://logs.openstack.org/44/527144/13/check/legacy-puppet-beaker-rspec-infra/1ba8fea/job-output.txt.gz#_2017-12-13_00_11_35_740142 ring any bells | 23:17 |
ianw | i don't think the ethercalc rspec test is actually testing anything :/ | 23:17 |
clarkb | jeblair: fungi ianw trivial fix is to replace everything with ip addrs | 23:17 |
clarkb | it will add a lot of boilerplate around upgrades because ip addrs will be changing constantly | 23:17 |
fungi | yeah, i'll admit even though my knee-jerk security wonk reaction is to hardcode ip addresses, it's not at all convenient | 23:18 |
fungi | ianw: "No examples found" is certainly a new error for me | 23:19 |
fungi | almost seems like a mistranslation | 23:19 |
clarkb | I feel like this problem may deserve a walk | 23:19 |
clarkb | I'm not coming up with any good ideas right now | 23:20 |
jeblair | the unit file is called netfilters-persistent? | 23:20 |
clarkb | jeblair: let me check | 23:20 |
fungi | yes | 23:20 |
clarkb | systemctl show netfilter-persistent | 23:20 |
clarkb | drop the s in netfilters | 23:20 |
jeblair | welp, on zuulv3 it's a sysvinit script | 23:20 |
clarkb | oh it might be here too I just asked systemctl for info /me looks | 23:20 |
jeblair | and that's provided by the package | 23:21 |
clarkb | ya it is here too | 23:21 |
fungi | the service on trusty was iptables-persistent and on xenial is netfilter-persistent but very well may be via sysvinit compat | 23:21 |
clarkb | so how does that sysv script decide to be before networking | 23:21 |
clarkb | # Required-Start: mountkernfs $remote_fs <- it must be parsing the lsb info? | 23:22 |
clarkb | $remote_fs should require networking though I think | 23:22 |
jeblair | does it look at rcS.d? | 23:23 |
jeblair | lrwxrwxrwx 1 root root 30 Jun 2 2017 S05netfilter-persistent -> ../init.d/netfilter-persistent | 23:23 |
jeblair | lrwxrwxrwx 1 root root 20 May 17 2017 S12networking -> ../init.d/networking | 23:23 |
clarkb | https://www.turnkeylinux.org/blog/debugging-systemd-sysv-init-compat doesn't say but does say where we can look for the generated compat stuff | 23:25 |
clarkb | jeblair: /lib/systemd/system/netfilter-persistent.service | 23:27 |
clarkb | my xenial node has that unit | 23:27 |
jeblair | clarkb: it's entirely possible zuulv3 doesn't work, and we just fixed it manually a while ago | 23:27 |
clarkb | I think that supercedes any compat layer because it won't look for compat scripts if there is a unit with the same name | 23:27 |
clarkb | that file is on zuulv3 too | 23:28 |
clarkb | so I think the sysv init script is just noise? | 23:28 |
jeblair | oh of course /lib/systemd. silly me for thinking this would be in etc. | 23:28 |
fungi | fhs to the rescue! or... nope | 23:29 |
clarkb | I bet if we delete the unit file the sys v init script would work though and run late | 23:29 |
clarkb | however I don't know if it would run before or after elasticsearch | 23:29 |
jeblair | clarkb: well, the symlinks have it running early too | 23:29 |
clarkb | (or zuul) | 23:29 |
jeblair | before networking | 23:29 |
clarkb | ah | 23:29 |
clarkb | oh ya 05 vs 12 | 23:29 |
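[Editor's note: a quick way to confirm which of the two files systemd is actually using — the native unit in /lib or /etc always shadows the sysv-generator wrapper for the same name:]

```shell
# systemctl cat prints the path of the loaded unit file as its first
# line; a path under /etc or /lib means the init.d script is ignored.
systemctl cat netfilter-persistent.service | head -n 1

# Or query the property directly:
systemctl show -p FragmentPath netfilter-persistent.service
```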
jeblair | i also like the idea of using ip addresses, for extra robustness | 23:30 |
jeblair | i wonder if we could get puppet to do the substitution for us? | 23:30 |
clarkb | jeblair: I think so, we just edit the firewall rules that puppet writes and it will replace them | 23:31 |
jeblair | i mean, still have hostname in config management, but have puppet do a dns query and convert them to ip addresses for us | 23:31 |
clarkb | oh | 23:31 |
clarkb | I bet we could do something in the template at least | 23:32 |
jeblair | (that would be better than our current system since then our firewalls would automatically adjust as we changed dns) | 23:32 |
clarkb | ya | 23:32 |
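[Editor's note: a sketch of that idea as a small helper — not existing infra tooling, and `emit_rules` is a hypothetical name. Each hostname is resolved once at rule-generation time, so loading the rules at boot no longer depends on working DNS:]

```shell
emit_rules() {
  # emit_rules PORT HOST... -> one iptables ACCEPT rule per host,
  # with the hostname already resolved to a literal address
  local port=$1; shift
  local host ip
  for host in "$@"; do
    ip=$(getent hosts "$host" | awk '{ print $1; exit }') || return 1
    echo "-A INPUT -s ${ip}/32 -p tcp --dport ${port} -j ACCEPT"
  done
}
```

Re-running the generator (e.g. from config management) picks up DNS changes, which is what makes this friendlier than hand-maintained literal addresses.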
jeblair | it doesn't make anything more secure than the dns servers, of course | 23:32 |
jeblair | though, i would like to make our dns servers more secure. :) | 23:33 |
fungi | now that we're considering running dns servers (aside from local caches), yes for sure | 23:34 |
fungi | dnssec ftw | 23:35 |
fungi | the proverbial "don't use dns in firewall rules" predated dnssec | 23:35 |
fungi | unfortunately, so did rackspace's dns service, unfortunately | 23:36 |
* fungi adds another unfortunately or two for good measure | 23:36 | |
clarkb | https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/system_administrators_guide/sect-Managing_Services_with_systemd-Unit_Files#brid-Managing_Services_with_systemd-Extending_Unit_Config | 23:37 |
clarkb | I'm going to test ^ with a new After config | 23:37 |
ianw | EmilienM: https://git.openstack.org/cgit/openstack-infra/openstack-zuul-jobs/tree/playbooks/legacy/puppet-beaker-rspec/run.yaml#n127 <- this code doesn't work, which i can fix ... but is the intention, if on a xenial node, to run puppet 4? | 23:41 |
*** harlowja has joined #openstack-sprint | 23:42 | |
EmilienM | ianw: ouch, do we use this role somewhere? | 23:42 |
EmilienM | le tme check | 23:42 |
ianw | EmilienM : the testing for https://review.openstack.org/#/c/527144/ , say | 23:42 |
EmilienM | oh that's infra modules ok | 23:43 |
ianw | EmilienM: http://logs.openstack.org/44/527144/13/check/legacy-puppet-beaker-rspec-infra/1ba8fea/job-output.txt.gz#_2017-12-13_00_11_35_740142 | 23:43 |
EmilienM | it's weird we have the playbooks in puppet-openstack-integration | 23:43 |
ianw | that's easy enough to fix, i mean we can just cut out the whole line in the new job | 23:43 |
ianw | but a) it doesn't seem to do anything and b) we'd want to test it with puppet3 for infra? | 23:44 |
EmilienM | I'm not sure which version of puppet infra deploys | 23:44 |
EmilienM | I hope puppet4 | 23:44 |
ianw | keep hoping :) | 23:44 |
fungi | i wouldn't be so sure | 23:44 |
ianw | puppet3 is what ships in xenial | 23:44 |
clarkb | I can't seem to get that rhel docs method for updating these things to work on ubuntu | 23:47 |
clarkb | DropInPaths=/etc/systemd/system/netfilter-persistent.service.d/after-network.conf <- it seems to know the config is there | 23:49 |
clarkb | but the contents of that don't seem to override what is in the root unit | 23:49 |
clarkb | "Note that dependencies (After=, etc.) cannot be reset to an empty list, so dependencies can only be added in drop-ins. If you want to remove dependencies, you have to override the entire unit." | 23:51 |
clarkb | well then | 23:51 |
clarkb | I'm testing a complete override now | 23:56 |
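[Editor's note: since a drop-in can only add dependencies, not remove the existing `Before=network.target`, the complete override being tested is roughly the following — a same-named unit in /etc/systemd/system takes precedence over the one in /lib/systemd/system. The sed edit is an assumption about what the eventual fix looked like:]

```shell
# Copy the packaged unit so our version fully shadows the original.
cp /lib/systemd/system/netfilter-persistent.service \
   /etc/systemd/system/netfilter-persistent.service
# In the copy, flip the ordering so the rules load after networking
# (and therefore after the hostnames in the rules can resolve).
sed -i 's/^Before=network.target$/After=network.target/' \
    /etc/systemd/system/netfilter-persistent.service
systemctl daemon-reload
```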
ianw | EmilienM: proposed -> https://review.openstack.org/527811 puppet-beaker tests: don't use puppet 4 ... perhaps we need separate 4 tests? | 23:57 |
EmilienM | ianw: lgtm, I'll check with alex | 23:58 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!