*** hichihara has joined #openstack-infra | 00:00 | |
anteaya | well we never fail to mark feature freeze in new and exciting ways | 00:01 |
---|---|---|
mordred | clarkb: do we make nova quota queries in nodepool? | 00:01 |
mordred | clarkb: I thought we just hard-coded all of that in our config file | 00:02 |
clarkb | mordred: I dunno, can grep | 00:02 |
mordred | me too | 00:02 |
clarkb | git grep says no | 00:02 |
mordred | I agree | 00:02 |
clarkb | and that the config determines it | 00:02 |
*** VijayTripathi has quit IRC | 00:02 | |
*** dannywilson has quit IRC | 00:03 | |
cinerama | pleia2: so it looks like the openstackid bit is working with my patch. zanata does seem to have sprouted an additional login button though which is a bit odd | 00:03 |
*** camunoz is now known as camunoz_mtg | 00:04 | |
pleia2 | cinerama: I think we'll need to play around with that a bit anyway, since we need to make sure the *only* option for logging in is via openstackid | 00:04 |
pleia2 | cinerama: but yay! | 00:04 |
*** ZZelle_ has quit IRC | 00:04 | |
cinerama | pleia2: this is of course testing against a local install of openstackid so i don't have any accounts on there as yet but it does seem to be redirecting correctly | 00:04 |
cinerama | pleia2: oh, both buttons go to openstackid... | 00:05 |
pleia2 | cinerama: so it's more than theoretical! | 00:05 |
cinerama | pleia2: as i said, it's a bit odd :) | 00:05 |
pleia2 | cinerama: hah, fun | 00:05 |
*** achanda has quit IRC | 00:05 | |
fungi | this feature freeze we took down a major service provider... wondering how we can top that next cycle ;) | 00:05 |
*** baoli has quit IRC | 00:06 | |
jeblair | mordred: i think we're going to continue to try to delete those servers | 00:06 |
* fungi knows we probably didn't, but that's what history will record | 00:07 | |
fungi | should we stop nodepool, delete those rows and then clean up aliens later? | 00:07 |
*** baoli_ has joined #openstack-infra | 00:07 | |
clarkb | can we safely delete those rows out from under nodepool? | 00:07 |
clarkb | probably not | 00:07 |
clarkb | so shutdown is required | 00:07 |
clarkb | (just thinking out loud here) | 00:08 |
pleia2 | fungi: I'm going to go with that story | 00:08 |
*** baoli_ has quit IRC | 00:08 | |
*** asselin_ has joined #openstack-infra | 00:08 | |
*** garyh has quit IRC | 00:09 | |
jeblair | fungi, clarkb: i think row locks will prevent us from deleting while running. stopping, delete, then alien cleanup later is probably best. | 00:09 |
mordred | jeblair: oh - that's a good point | 00:09 |
fungi | i can give that a shot now unless someone else is already doing it | 00:10 |
jeblair | fungi: go for it | 00:10 |
*** oomichi has joined #openstack-infra | 00:11 | |
*** garyh has joined #openstack-infra | 00:11 | |
*** emagana has quit IRC | 00:11 | |
anteaya | yay the nova patch ttx needed merged | 00:13 |
reed | woot | 00:13 |
fungi | okay, it's running again and the hpcloud nodes are all gone from the db | 00:13 |
anteaya | just trove and cinder to go | 00:13 |
anteaya | cool | 00:13 |
jeblair | fungi: the log seems to not be mentioning hpcloud which is good | 00:14 |
*** YorikSar has quit IRC | 00:14 | |
SlickNik | anteaya: trove patch is in the gate as well. *fingers crossed* | 00:14 |
reed | meanwhile I'm patting myself on the back for having understood how mediawiki templates work (at a very high level) Check this out https://wiki.openstack.org/wiki/Template:InternshipIdea | 00:14 |
* reed proud of the useless (almost) knowledge accumulated | 00:14 | |
anteaya | SlickNik: yes I see that, and I have my reservations about that patch and shared my thoughts with ttx as well | 00:14 |
fungi | reed: cool--we have something similar for the third-party ci systems pages too | 00:15 |
anteaya | SlickNik: the fact that patchset 2 failed jenkins 5 times disconcerts me, but it is your project | 00:15 |
anteaya | and patchset 5 failed 3 times, etc. etc. | 00:15 |
reed | fungi, neat-o | 00:15 |
reed | fungi, now all those need are a category :) | 00:15 |
*** yamamoto has joined #openstack-infra | 00:15 | |
*** sputnik13 has quit IRC | 00:16 | |
fungi | reed: https://wiki.openstack.org/wiki/Template:ThirdPartySystemInfo | 00:16 |
SlickNik | anteaya: I looked into that. The earlier failures we were seeing were caused by a flaky assert in one of the unit tests that made it past the gate. | 00:16 |
anteaya | SlickNik: devs rechecking a patch rather than fixing it | 00:16 |
SlickNik | anteaya: I have a different patch that's going through check now to fix that issue. | 00:16 |
anteaya | SlickNik: yes | 00:16 |
anteaya | SlickNik: okay | 00:16 |
reed | fungi, do you mind if I add a category: to that template? | 00:17 |
fungi | reed: an additional category? i'm sure that's fine | 00:17 |
reed | fungi, or, in other words, how do you collect all the pages created with that template? | 00:17 |
anteaya | reed: what category are you thinking? | 00:17 |
reed | oh, I see it | 00:17 |
*** tiswanso_ has quit IRC | 00:17 | |
*** koolhead17 has quit IRC | 00:17 | |
anteaya | reed: https://wiki.openstack.org/wiki/ThirdPartySystems | 00:18 |
reed | anteaya, fungi, my bad, I didn't see [[Category:ThirdPartySystems]] | 00:18 |
fungi | reed: yeah, they all wind up in https://wiki.openstack.org/wiki/Category:ThirdPartySystems | 00:18 |
reed | supercool | 00:18 |
anteaya | thanks | 00:18 |
anteaya | didn't know you didn't know | 00:18 |
reed | that part makes me hate mediawiki less | 00:18 |
reed | a little less | 00:18 |
SlickNik | referring to https://review.openstack.org/#/c/165995/ | 00:18 |
anteaya | well that's good | 00:18 |
*** sdake_ has joined #openstack-infra | 00:19 | |
openstackgerrit | Dan Prince proposed openstack-infra/system-config: Re-order tripleo Zuul images (to see if it helps) https://review.openstack.org/166055 | 00:19 |
SlickNik | anteaya: Was talking to fungi about exactly that yesterday. How to get folks to move away from the "recheck" habit. | 00:19 |
anteaya | SlickNik: what did you come away with as an understanding? | 00:20 |
anteaya | shutting off a test that exposes a race doesn't feel right to me, btw | 00:20 |
anteaya | but I'm curious to hear your take away from your conversation with fugi | 00:21 |
SlickNik | anteaya: It's a race in the test, not in the code. | 00:21 |
anteaya | fungi | 00:21 |
anteaya | okay | 00:21 |
SlickNik | For starters: Taking into consideration the number of rechecks a patch has gone through when reviewing the patchset. | 00:21 |
*** sdake__ has joined #openstack-infra | 00:21 | |
*** sdake has quit IRC | 00:22 | |
anteaya | that is a good place to begin, I agree | 00:23 |
*** VijayTripathi has joined #openstack-infra | 00:23 | |
*** stevemar has joined #openstack-infra | 00:23 | |
SlickNik | fungi mentioned that for the current patchset that number is now visible at the top of the review as well — which is super cool. | 00:23 |
anteaya | very helpful | 00:24 |
greghaynes | A nice thing we have in tripleo land is http://goodsquishy.com/downloads/s_tripleo-jobs.html which gives us pass rates, having stats on per-job pass rates could be pretty enlightening | 00:24 |
greghaynes | not sure if zuul has that already... | 00:24 |
clarkb | greghaynes: http://graphite.openstack.org | 00:25 |
*** sdake_ has quit IRC | 00:25 | |
greghaynes | since generally recheck is a side effect of a test that doesnt pass very often | 00:25 |
clarkb | jogo has a set of graphs that he uses | 00:25 |
clarkb | built from that data | 00:25 |
greghaynes | nice (although graphite) | 00:25 |
anteaya | SlickNik: sounds like you have a good place to begin | 00:26 |
greghaynes | Its neat because even just the % pass rate is a super useful stat | 00:26 |
anteaya | SlickNik: well done | 00:26 |
*** dprince has joined #openstack-infra | 00:26 | |
dprince | Still not seeing any Fedora 20 jobs in the TripleO rack. | 00:28 |
greghaynes | dprince: clarkb says its your keystone | 00:28 |
SlickNik | Sorry I was looking at the graphite metrics. | 00:28 |
greghaynes | dprince: apparently its configured to not use a routable address | 00:28 |
anteaya | SlickNik: yup | 00:28 |
*** tkelsey has joined #openstack-infra | 00:28 | |
clarkb | greghaynes: dprince ya just run nova --debug list | 00:28 |
dprince | greghaynes: so why are Ubuntu nodes running fine | 00:28 |
clarkb | you get back 10.1.8.37 as something to talk to | 00:29 |
clarkb | dprince: I do not know but in my investigating I ran into ^ | 00:29 |
SlickNik | anteaya: thanks! it's still WIP but we hope to get more disciplined about it. | 00:29 |
greghaynes | dprince: the other question id have for that is, did anything change? | 00:29 |
SlickNik | And any data we can gather around how we're doing is super useful. :) | 00:29 |
*** tjones1 has joined #openstack-infra | 00:29 | |
greghaynes | (on the tripleo ci cloud end) | 00:29 |
*** garyh has quit IRC | 00:29 | |
clarkb | dprince: "links": [{"href": "http://10.1.8.37:5000/v2.0/", "rel": "self"} | 00:30 |
dprince | clarkb: you may have found a different issue, but by my local testing Fedora nodes fire up fine, and I can get a floating IP too | 00:30 |
dprince | clarkb: that is the local IP, weird | 00:30 |
dprince | clarkb: let me check some other things | 00:30 |
clarkb | dprince: until ^ is fixed I don't think there is much I can do from this end | 00:30 |
anteaya | SlickNik: sure | 00:31 |
dprince | clarkb: okay. looking into it | 00:31 |
anteaya | SlickNik: the biggest thing I have seen is response to a test failure | 00:31 |
jeblair | clarkb, fungi, mordred: we have an okay to start performing alien cleanup | 00:31 |
openstackgerrit | Ian Wienand proposed openstack-infra/nodepool: Ignore stderr for documentation program output https://review.openstack.org/166057 | 00:31 |
anteaya | SlickNik: the more the devs think oh perhaps something is wrong with my patch | 00:31 |
fungi | jeblair: i'll fire that off now | 00:31 |
anteaya | SlickNik: the better the quality of patches | 00:31 |
jeblair | clarkb, fungi, mordred: i think they're going to ask us to turn things on again to verify the issue | 00:32 |
clarkb | jeblair: while we have their attention maybe they can unlock that one node for us? | 00:32 |
jeblair | (after we cleanup) | 00:32 |
anteaya | SlickNik: when they believe that a bad structure is preventing their great code from merging, that is when problems arise | 00:32 |
jroll | anteaya: relatedly, I've found that when I review my own patches, they tend to improve | 00:32 |
fungi | jeblair: i'm not sure i like the sound of that ;) | 00:32 |
anteaya | jroll: good point | 00:32 |
jeblair | clarkb: good idea -- let's just delete everything and we'll give them a list of what we can't | 00:32 |
jeblair | fungi: they aren't sure they do either :) | 00:32 |
clarkb | jeblair: +1 | 00:32 |
*** tkelsey has quit IRC | 00:32 | |
jroll | anteaya: which is probably where that comes from, they re-read the patch to try to find the bug | 00:33 |
fungi | jeblair: clarkb: yep, i'll make an instance uuid list of whatever's left after deletes finish | 00:33 |
jeblair | fungi: let me/us know if you can use another hand on cleanup | 00:33 |
anteaya | jroll: exactly | 00:33 |
SlickNik | anteaya / jroll: ++ | 00:33 |
anteaya | SlickNik: so thanks for being proactive | 00:33 |
SlickNik | it's def a mindset thing. | 00:33 |
anteaya | SlickNik: that helps | 00:33 |
anteaya | SlickNik: it is | 00:34 |
fungi | 499 alien nodes | 00:34 |
mordred | fungi: heh | 00:34 |
fungi | in hpcloud anyway | 00:34 |
jroll | whoa, we broke hp cloud? | 00:34 |
jroll | lol. | 00:34 |
* jroll is somewhat surprised hp fell over first | 00:35 | |
*** ddieterly has joined #openstack-infra | 00:35 | |
jeblair | fungi: i figure it's probably safe to run deletes in series across N parallel processes, where 6<N<12 | 00:35 |
*** markvoelker has quit IRC | 00:37 | |
ianw | dprince: i have no idea what's going on because scrollback is huge; but i see fedora 20 and just yesterday i fixed an issue with the latest f20 kernel update that creates a broken extlinux.conf and hence it no boot. | 00:38 |
clarkb | ianw: this is for the tripleo f20 nodes so we didn't think that was related | 00:39 |
ianw | ok, i figured there was hours of context i'm missing | 00:40 |
clarkb | ianw: and in attempts to investigate I ran into the 10.1.8.37 address coming back from keystone so talking to the cloud was cut short | 00:40 |
fungi | i have 10 parallel loops going over equal slices of the ~500 nodes | 00:40 |
fungi | hopefully this goes fairly quickly | 00:40 |
*** adalbas has joined #openstack-infra | 00:41 | |
reed | wow! I didn't know you could include inside a mediawiki page the content of another page by simply adding {{:name_of_the_page}} | 00:41 |
fungi | reed: you can also transclude subsections | 00:41 |
reed | they call it 'transclusion' http://www.mediawiki.org/wiki/Transclusion | 00:41 |
fungi | yep | 00:41 |
dprince | ianw: could be related | 00:42 |
jeblair | fungi: have an idea how long each delete is taking? | 00:42 |
fungi | jeblair: i can probably wall clock time one. just a moment | 00:42 |
dprince | ianw: we haven't had Fedora 20 tripleo jobs running all day | 00:42 |
dprince | ianw: for TripleO that is (not speaking in general) | 00:43 |
fungi | jeblair: the bad news is, ~30sec each | 00:43 |
jeblair | fungi: so could take 25 mins | 00:43 |
fungi | jeblair: yep | 00:43 |
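The approach discussed above (serial deletes inside each of N parallel loops over equal slices of the alien list) can be sketched roughly as follows. This is only an illustration: `delete_one` is a hypothetical callable standing in for whatever actually issues the delete, and nothing here is taken from the commands fungi ran. The estimate helper reproduces the arithmetic behind the "25 mins" figure (about 500 nodes, about 30 seconds each, 10 loops).

```python
from concurrent.futures import ThreadPoolExecutor


def slices(items, n):
    """Split items into n contiguous slices whose sizes differ by at most one."""
    size, extra = divmod(len(items), n)
    out, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        out.append(items[start:end])
        start = end
    return out


def delete_serially(server_ids, delete_one):
    """Delete one slice in series; return the ids whose delete raised."""
    failed = []
    for sid in server_ids:
        try:
            delete_one(sid)  # hypothetical: issues one delete request
        except Exception:
            failed.append(sid)
    return failed


def delete_in_parallel(server_ids, delete_one, workers=10):
    """Run one serial loop per slice, with `workers` loops in parallel."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(
            lambda chunk: delete_serially(chunk, delete_one),
            slices(server_ids, workers),
        )
        # Flatten the per-slice failure lists for a second pass.
        return [sid for failed in results for sid in failed]


def estimated_minutes(count, seconds_each, workers):
    """Rough wall-clock estimate if every delete takes seconds_each."""
    return count * seconds_each / workers / 60
```

The returned list of failed ids is what a second pass would retry, and `estimated_minutes(500, 30, 10)` gives the 25 minute figure quoted in the channel.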
*** adalbas has quit IRC | 00:44 | |
fungi | unless any "stuck" nodes cause some deletes to take an extra long time | 00:44 |
*** asettle has quit IRC | 00:44 | |
fungi | okay, i'm also seeing some go as quickly as 5 seconds, so i think it's extremely variable | 00:44 |
fungi | because you know, it's the cloud | 00:44 |
fungi | who needs consistency really? | 00:44 |
clarkb | fungi: its eventually consistent | 00:45 |
clarkb | ianw: mordred 165792 | 00:45 |
clarkb | ianw: I think the whole point is to not raise the exception so that we can list the aliens that we do know about instead of dying early and writing no prettytable | 00:46 |
greghaynes | I should really make a test for that | 00:46 |
greghaynes | since its been that kind of week | 00:46 |
mordred | ianw: yes, what clarkb said | 00:46 |
*** ddieterly has quit IRC | 00:46 | |
*** dalgaaf has quit IRC | 00:47 | |
*** tjones1 has quit IRC | 00:49 | |
clarkb | greghaynes: comment for you on 165682 | 00:50 |
*** ddieterly has joined #openstack-infra | 00:51 | |
ianw | mordred: ok, well it's still the only thing that writes to stderr like that ... maybe it should log.error . or ignore me, that's fine too :) | 00:51 |
*** bknudson has joined #openstack-infra | 00:51 | |
greghaynes | clarkb: yea, so I was a bit confused and not wanting to read all of nova client code - why is novaclient.images a list of nodes you say? | 00:52 |
jeblair | clarkb, fungi, mordred: i have to run now, i will check back in after dinner | 00:52 |
mordred | kk | 00:52 |
clarkb | greghaynes: I am pretty sure its listing nodes there | 00:52 |
clarkb | greghaynes: because of the ip addrs | 00:52 |
fungi | thanks jeblair | 00:52 |
greghaynes | clarkb: That list is instantiated as FakeClient.images | 00:52 |
greghaynes | oh wait | 00:53 |
greghaynes | its just our abstract list thing | 00:53 |
clarkb | no create_image is a different method there | 00:53 |
greghaynes | ugh, so this needs some more work | 00:53 |
greghaynes | the problem is its used in more than one place | 00:53 |
ianw | mordred: also, image_list doesn't want the same thing? | 00:53 |
clarkb | greghaynes: I don't think it needs major work | 00:53 |
clarkb | greghaynes: just a more generic fake object so the code doesn't read funny | 00:53 |
greghaynes | clarkb: well, I also need to use it for glance images. I think the right thing to do is just get rid of FakeGlanceImage and add the method I need to dummy | 00:54 |
clarkb | FakeCloudResource or something | 00:54 |
clarkb | greghaynes: ya or that | 00:54 |
greghaynes | Yea, thats essentially equivalent | 00:54 |
mordred | ianw: it may? | 00:55 |
greghaynes | ianw: Ive been fixing those bugs as I go when im adding fail tests, theres a lot of them | 00:55 |
greghaynes | so its likely other commands need it | 00:55 |
clarkb | greghaynes: so you're good with that -1? | 00:56 |
greghaynes | clarkb: yes | 00:56 |
greghaynes | I mean, its correct ;) | 00:56 |
greghaynes | clarkb: another note about that patch - it leaves fake dib images around in the test dir | 00:58 |
greghaynes | not sure how much we care about that | 00:58 |
clarkb | greghaynes: isn't the test dir a tmpdir fixture? | 00:59 |
clarkb | greghaynes: if thats the case it should be fine since the fixture should clean it up | 00:59 |
*** mfink_ has joined #openstack-infra | 01:00 | |
greghaynes | oh, good point, we must not be passing a path to the fixture dir for the dib image output dest | 01:00 |
*** corvusphone has joined #openstack-infra | 01:00 | |
greghaynes | ill mess with that too | 01:02 |
*** Sukhdev has joined #openstack-infra | 01:02 | |
clarkb | oh btw one suggestion on the jenkins bug I filed is that we use a jenkins cloud slave plugin instead | 01:03 |
*** Ryan_Lane has quit IRC | 01:03 | |
dprince | clarkb: try now | 01:03 |
dprince | clarkb: nova --debug lists | 01:03 |
dprince | list | 01:03 |
clarkb | dprince: yup that worked (well I did nova floating ip list) | 01:03 |
clarkb | dprince: I think at least part of the problem is we have leaked floating IPs | 01:04 |
clarkb | so will clean that up now | 01:04 |
dprince | clarkb: I'm not aware of any changes we put into place on our side, so I'll check on how this keystone setting could have been altered. | 01:04 |
dprince | clarkb: otherwise I'm not sure how this ever worked | 01:05 |
clarkb | {"message": "Unknown auth strategy", "code": 500, "created": "2014-11-23T20:17:29Z"} errors like that for a precise node | 01:05 |
dprince | clarkb: Yeah, I saw those too. Possibly related to this keystone change? | 01:05 |
clarkb | dprince: maybe? | 01:06 |
*** mfink_ has quit IRC | 01:06 | |
dprince | clarkb: I was able to create a Fedora node 45 minutes ago | 01:06 |
clarkb | anyways let me clean up the floating ips and see if that makes it a healthier cloud | 01:06 |
dprince | clarkb: and a floating ip too | 01:06 |
dprince | clarkb: yeah, step at a time. Cleanup and lets see :) | 01:06 |
*** LinuxJed_ has quit IRC | 01:07 | |
*** mayurig has joined #openstack-infra | 01:08 | |
*** dims has joined #openstack-infra | 01:08 | |
*** asettle has joined #openstack-infra | 01:10 | |
*** tnovacik has quit IRC | 01:10 | |
openstackgerrit | Darragh Bailey proposed openstack-infra/jenkins-job-builder: Treat non-existant output files as empty files https://review.openstack.org/166062 | 01:11 |
*** ghostpl_ has joined #openstack-infra | 01:11 | |
*** ddieterl_ has joined #openstack-infra | 01:11 | |
*** ddieterly has quit IRC | 01:12 | |
fungi | deletes are wrapping up now | 01:12 |
dprince | ianw: while I'm waiting is there a ticket open for the Fedora boot error you mentioned? | 01:12 |
corvusphone | clarkb: speaking of which, we should check ports and ips on hpcloud | 01:12 |
*** mfink_ has joined #openstack-infra | 01:12 | |
ianw | dprince: see the comments in https://review.openstack.org/#/c/165681/1/install_puppet.sh | 01:12 |
clarkb | corvusphone: I can do that now | 01:12 |
dprince | ianw: thanks | 01:12 |
ianw | dprince: my desire to debug grubby on f20 was/is quite low, especially when it works with a later version of it | 01:13 |
clarkb | ya floating IPs definitely leaked there | 01:13 |
dprince | ianw: sound fine to me | 01:13 |
clarkb | starting a round of deletions for hpcloud FIPs | 01:13 |
mordred | corvusphone: WOAH! when did you start corvusphoning? | 01:13 |
*** ivar-lazzaro has quit IRC | 01:13 | |
mordred | clarkb: harvard is going toe to toe with unc | 01:14 |
clarkb | mordred: I have my television on downstairs with no one watching it like a good MURICAN | 01:14 |
fungi | last of the deletes just finished but there are 52 which didn't delete, so i'm starting a second pass with just those | 01:14 |
anteaya | mordred: when we break hpcloud it appears | 01:14 |
clarkb | fungi: ok, I just started floating ip deletion | 01:15 |
anteaya | clarkb: ha ha ha | 01:15 |
clarkb | and I just got rate limited | 01:15 |
openstackgerrit | Darragh Bailey proposed openstack-infra/jenkins-job-builder: Convert all inline publisher examples to tests https://review.openstack.org/166064 | 01:15 |
corvusphone | mordred: its just the webchat. I need to put a proper setup on my phone. | 01:15 |
*** LinuxJedi has joined #openstack-infra | 01:15 | |
*** ghostpl_ has quit IRC | 01:16 | |
clarkb | I have restarted floating ip deletes serially | 01:16 |
clarkb | can I just say that independently managing 3 different resources in order to get one working VM is really not fun | 01:17 |
*** markvoelker has joined #openstack-infra | 01:17 | |
clarkb | especially when I get rate limited doing it when reality is I need one api call to get one node (or maybe more than one node) | 01:18 |
*** asettle has quit IRC | 01:19 | |
*** asettle has joined #openstack-infra | 01:19 | |
lifeless | ratelimiting is a PITA | 01:21 |
*** Sukhdev has quit IRC | 01:21 | |
*** prad has quit IRC | 01:21 | |
mordred | clarkb: HARVARD JUST TOOK THE LEAD | 01:22 |
*** markvoelker has quit IRC | 01:22 | |
mordred | clarkb: 3-point shot and the foul | 01:22 |
mordred | clarkb: 1:15 to go | 01:22 |
*** otter768 has joined #openstack-infra | 01:23 | |
clarkb | ha I just turned it off because I realized I didn't need it on | 01:23 |
clarkb | but maybe I should go back and watch | 01:23 |
clarkb | lifeless: yes xargs needs a rate limit flag | 01:23 |
clarkb | I could put a sleep in the commands I suppose | 01:24 |
lifeless | yes | 01:24 |
lifeless | and cry into your sleep | 01:24 |
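The "put a sleep in the commands" idea amounts to spacing out serial API calls so they stay under the provider's rate limit. A minimal sketch, assuming a hypothetical `action` callable per item and a caller-chosen `min_interval` in seconds (neither comes from the log):

```python
import time


def run_rate_limited(items, action, min_interval,
                     sleep=time.sleep, clock=time.monotonic):
    """Call action(item) serially, starting calls at least min_interval apart."""
    last_start = None
    for item in items:
        if last_start is not None:
            # Sleep only for whatever part of the interval has not elapsed yet.
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        action(item)
```

The `sleep` and `clock` parameters exist only so the pacing can be exercised without real waiting; in normal use the defaults apply.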
*** claudiub has quit IRC | 01:24 | |
*** corvusphone has quit IRC | 01:24 | |
openstackgerrit | Darragh Bailey proposed openstack-infra/jenkins-job-builder: Only query jenkins plugins if config provided https://review.openstack.org/158826 | 01:25 |
*** corvusphone has joined #openstack-infra | 01:25 | |
*** dmorita has quit IRC | 01:26 | |
mordred | clarkb: dude. that was almost CRAZY | 01:26 |
clarkb | dprince: {"message": "Failed to terminate process 16378 with SIGKILL: Device or resource busy", "code": 500, "created": "2015-03-20T01:04:29Z"} is the error I see on a random f20 node | 01:27 |
*** otter768 has quit IRC | 01:27 | |
corvusphone | mordred: who won? | 01:27 |
clarkb | dprince: also floating IPs should be cleaned up now | 01:27 |
mordred | corvusphone: unc | 01:28 |
mordred | corvusphone: at. the. end | 01:28 |
* mordred using weechat android ... | 01:29 | |
clarkb | I am going to run the delete port script on hpcloud now | 01:29 |
mordred | clarkb: cool | 01:29 |
mordred | fungi: we ready to start ramping up again yet? | 01:30 |
*** garyh has joined #openstack-infra | 01:30 | |
clarkb | floating IPs were cleaned up so its just the ports left though port list was small so if we have leaked there its minimal | 01:31 |
fungi | each pass through retrying to delete i manage to knock down a few more, but there are still 45 which haven't succeeded yet | 01:31 |
mordred | that's so special | 01:31 |
openstackgerrit | Darragh Bailey proposed openstack-infra/jenkins-job-builder: Convert all inline publisher examples to tests https://review.openstack.org/166064 | 01:31 |
fungi | should we just give that list of uuids to hpcloud and fire back up? | 01:31 |
fungi | we were able to delete >90% anyway | 01:32 |
clarkb | I am being dragged to dinner | 01:32 |
clarkb | will check in later | 01:32 |
fungi | oh, the requests must still be getting processed because it's dropped to 38 now | 01:33 |
mordred | kk | 01:33 |
*** tiswanso has joined #openstack-infra | 01:33 | |
mordred | fungi: so - maybe we should just turn back on and see how it goes? | 01:34 |
fungi | i guess that's what "Request to delete server X has been accepted." means | 01:34 |
fungi | mordred: yeah, it's probably safe to hand-revert 166043 now | 01:34 |
mordred | fungi: well, it's hand applied :) | 01:34 |
mordred | fungi: I'll do that now | 01:35 |
fungi | indeed | 01:35 |
mordred | fungi: rate: 4.0 -- is that the thing I should adjust to adjust api rate limit? | 01:36 |
fungi | yeah | 01:36 |
mordred | rackspace is set to 1.0 | 01:36 |
mordred | maybe I should set hp to that too to be nice? | 01:36 |
fungi | mordred: i think it's inversely named and is actually a frequency? | 01:38 |
fungi | as in the lower the number the faster we poll (value indicating fraction of a second delay between polls) | 01:38 |
*** baoli has joined #openstack-infra | 01:39 | |
corvusphone | Rackspace is much lower (faster) than 1.0 | 01:40 |
*** garyh has quit IRC | 01:40 | |
corvusphone | 4.0 is one request every 0.8 secs (4.0/5) | 01:40 |
corvusphone | (5 HP providers) | 01:41 |
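Taking the reading given in the channel (the `rate` value is a delay in seconds between requests per provider, and the five HP providers hit the same cloud), the combined request spacing works out as below. The function names are illustrative only and are not nodepool's.

```python
def combined_interval(rate, providers):
    """Seconds between requests as seen by the cloud when `providers`
    entries each wait `rate` seconds between their own requests."""
    return rate / providers


def requests_per_minute(rate, providers):
    """Aggregate request frequency implied by the same settings."""
    return 60.0 / combined_interval(rate, providers)
```

With the values from the log, `combined_interval(4.0, 5)` is 0.8 seconds, matching the "(4.0/5)" figure; raising the setting makes requests less frequent, not more.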
*** virmitio has quit IRC | 01:42 | |
mordred | corvusphone: kk. cool | 01:43 |
openstackgerrit | Stephanie Miller proposed openstack-infra/puppet-zanata: Add OpenID login provider support to Zanata config https://review.openstack.org/166073 | 01:43 |
*** ddieterl_ has quit IRC | 01:45 | |
*** baoli_ has joined #openstack-infra | 01:45 | |
*** ddieterly has joined #openstack-infra | 01:45 | |
*** harlowja_ is now known as harlowja_away | 01:46 | |
*** otter768 has joined #openstack-infra | 01:46 | |
mordred | you know - when I get the shade patch done | 01:47 |
*** baoli has quit IRC | 01:48 | |
mordred | some of the metadata caching support may be nice - like flavors | 01:48 |
*** garyh has joined #openstack-infra | 01:49 | |
mordred | or - it's possible that "tail -f debug.log | grep hpcloud" is only showing me FlavorListTask ... | 01:49 |
fungi | i'm winding down here, but will try to keep an eye on irc for a little while longer | 01:49 |
*** tsg has quit IRC | 01:50 | |
anteaya | the trove patch ttx has been waiting on is in the top of the gate | 01:51 |
cinerama | pleia2, StevenK: did we ever work out the deal with the zanata client? | 01:51 |
anteaya | and these two cinder patches: https://review.openstack.org/#/q/status:open+topic:cinder-driver-removals,n,z | 01:52 |
anteaya | and I do believe that is all ttx needs | 01:52 |
*** camunoz_mtg has quit IRC | 01:52 | |
StevenK | cinerama: Packaging it is hard. | 01:54 |
StevenK | cinerama: We don't have support for building and updating the packaging anyway. | 01:54 |
StevenK | cinerama: I have a WIP patch to change system-config to install the cli client on the proposal slave | 01:54 |
cinerama | StevenK: sounds like we should just install like we do with other nonpackaged stuff in puppet | 01:55 |
StevenK | cinerama: Yes, which is what my WIP patch does | 01:55 |
cinerama | StevenK: oh cool. is it up anywhere yet? | 01:56 |
StevenK | cinerama: I want to test it before pushing it up | 01:56 |
cinerama | StevenK: bo-ring :) | 01:56 |
StevenK | Hah | 01:56 |
mordred | fungi, clarkb, jeblair: it FEELS like nothing is happening, other than hp listing flavors | 01:56 |
cinerama | insert dos equis man hipchat emoji here | 01:57 |
clarkb | mordred are there deficit calculations and allocations? | 01:57 |
clarkb | mordred thats what I would look for in the log | 01:57 |
*** tqtran has quit IRC | 01:58 | |
*** asettle has quit IRC | 01:58 | |
*** tiswanso has quit IRC | 01:58 | |
*** garyh has quit IRC | 02:00 | |
mordred | clarkb: 2015-03-20 01:57:16,230 DEBUG nodepool.NodePool: Deficit: bare-trusty: 0 (start: 232 min-ready: 8 ready: 240 capacity: 75) | 02:00 |
*** woodster_ has quit IRC | 02:00 | |
clarkb | ok so it wants to start 232 thats good | 02:00 |
clarkb | below that should be the allocations any to hpcloud? | 02:00 |
clarkb | oh! do we still have images? | 02:01 |
clarkb | maybe those are building? | 02:01 |
*** tiswanso has joined #openstack-infra | 02:01 | |
mordred | clarkb: welll... a) I don't see allocations - but I did just get a 500 error | 02:01 |
*** asettle has joined #openstack-infra | 02:02 | |
mordred | clarkb: just went to hell | 02:03 |
mordred | clarkb: quotas back to 0 | 02:03 |
clarkb | oh? | 02:03 |
mordred | yah | 02:03 |
mordred | same thing as before | 02:03 |
*** unicell1 has quit IRC | 02:04 | |
*** asselin_ has quit IRC | 02:04 | |
mordred | I show 381 nodes in nodepool in some sort of state | 02:05 |
mordred | clarkb: does our rate limit apply to database as well? | 02:05 |
mordred | gah | 02:06 |
mordred | to delete | 02:06 |
clarkb | mordred ya | 02:06 |
*** camunoz_mtg has joined #openstack-infra | 02:06 | |
clarkb | so maybe bump it up with quota 0 | 02:06 |
mordred | ok - I just set the rate to 16 | 02:06 |
mordred | with quota 0 | 02:06 |
*** patrickeast has quit IRC | 02:08 | |
*** yamahata has quit IRC | 02:11 | |
clarkb | did that help? | 02:11 |
jroll | mordred: are these the times where you wish you had access to HP control plane? | 02:11 |
mordred | jroll: NO | 02:11 |
jroll | lol | 02:11 |
mordred | clarkb: the api plane seems to be recovering | 02:11 |
dprince | clarkb: are all the F20 nodes failing that way? | 02:11 |
jroll | interesting, I would want to figure out what's wrong | 02:11 |
anteaya | are there times you wish you had access to HP control plane, mordred? | 02:12 |
dprince | clarkb: any successful ones? | 02:12 |
*** nelsnels_ has joined #openstack-infra | 02:12 | |
mordred | jroll: well, I mean - I sort of would - but I don't like the semblance of culpability for a system I don't own | 02:12 |
dprince | clarkb: nm, I can see those as well | 02:13 |
clarkb | dprince I cant check right now, getting foods | 02:13 |
*** sigmavirus24 is now known as sigmavirus24_awa | 02:14 | |
jroll | mordred: I guess I get that; kind of like how y'all think I have magic rackspace powers | 02:14 |
jroll | :P | 02:14 |
*** jyuso has quit IRC | 02:14 | |
*** nelsnelson has quit IRC | 02:14 | |
*** tiswanso has quit IRC | 02:15 | |
mordred | clarkb: 16 wasn't enough for them to recover - I've set it to 64 | 02:16 |
clarkb | mordred ok | 02:16 |
anteaya | jroll: you don't have magic rackspace powers? | 02:16 |
* anteaya realizes another illusion is blown | 02:16 | |
jroll | lol | 02:18 |
jroll | anteaya: I have rackspace internal irc and ldap | 02:18 |
jroll | turns out those are pretty useful | 02:18 |
*** markvoelker has joined #openstack-infra | 02:18 | |
anteaya | you do have magic powers | 02:18 |
anteaya | whew | 02:18 |
jroll | :P | 02:18 |
*** markvoelker has quit IRC | 02:22 | |
*** mayurig has quit IRC | 02:23 | |
openstackgerrit | greghaynes proposed openstack-infra/nodepool: Monkeypatch Fake Clients for tests https://review.openstack.org/165682 | 02:23 |
*** sdake__ has quit IRC | 02:25 | |
*** sdake has joined #openstack-infra | 02:25 | |
*** bhunter71 has joined #openstack-infra | 02:27 | |
greghaynes | clarkb: ^ I think that addresses your comments | 02:28 |
mordred | clarkb: oh god | 02:29 |
mordred | I just looked at nova source code | 02:29 |
greghaynes | mordred: some things cant be unseen? | 02:30 |
mordred | # NOTE(johannes): The quota code uses SQL locking to ensure races don't | 02:30 |
mordred | # cause under or over counting of resources. To avoid deadlocks, this | 02:30 |
mordred | # code always acquires the lock on quota_usages before acquiring the lock | 02:30 |
mordred | # on reservations. | 02:30 |
mordred | *STABSTABSTAB* | 02:30 |
*** kaisers1 has joined #openstack-infra | 02:31 | |
mordred | clearly written by someone who knows nothing about databases | 02:31 |
mordred | and should stop writing database code | 02:31 |
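The quoted comment describes a general technique: every code path takes the two locks in the same fixed order, so no two paths can each hold one lock while waiting for the other. A toy sketch of that ordering rule, using in-process `threading` locks as stand-ins for the SQL row locks (the names and counters are illustrative, not nova's code):

```python
import threading

quota_usages_lock = threading.Lock()  # stand-in for the quota_usages lock
reservations_lock = threading.Lock()  # stand-in for the reservations lock
state = {"in_use": 0, "reserved": 0}


def reserve(amount):
    # Always quota_usages first, then reservations.
    with quota_usages_lock:
        with reservations_lock:
            state["reserved"] += amount


def commit(amount):
    # Same order here; taking them in the opposite order could deadlock
    # against a concurrent reserve().
    with quota_usages_lock:
        with reservations_lock:
            state["reserved"] -= amount
            state["in_use"] += amount


def worker(rounds):
    for _ in range(rounds):
        reserve(1)
        commit(1)


def run(threads=8, rounds=1000):
    """Run several workers concurrently and return the final counters."""
    pool = [threading.Thread(target=worker, args=(rounds,)) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return dict(state)
```

This only shows why a consistent order avoids deadlock; it says nothing about whether serializing quota updates this way is a good fit for a database, which is the point under dispute in the channel.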
anteaya | so talking with thingee, he is way more forgiving of ci account ops than I am, but it is his call and I am supporting him | 02:31 |
anteaya | he might be coming in here and asking for a ci account to be disabled | 02:31 |
mordred | okie | 02:31 |
anteaya | I've given him a paste with all the info like the gerrit id if he decides to go ahead with it | 02:31 |
anteaya | ttx's nova and trove patches are in | 02:32 |
anteaya | and I'm going to bed | 02:32 |
anteaya | night | 02:32 |
*** jamielennox is now known as jamielennox|lunc | 02:32 | |
*** jamielennox|lunc is now known as jamielennox|food | 02:32 | |
*** kaisers has quit IRC | 02:32 | |
mordred | corvusphone: I'm pumpkin-ing - I'm lurking, but need to stop doing active things | 02:32 |
mordred | corvusphone: the current status is that we're backed off on creates, and deletes for hp in nodepool are throttled to 64 | 02:33 |
mordred | I think we can find an appropriate throttle number - but I also think there is an issue that should be solved internally too | 02:33 |
mordred | so I'm not particularly interested in chasing the tip of a thundering herd | 02:33 |
mordred | which is what we're doing right now | 02:33 |
*** unicell has joined #openstack-infra | 02:38 | |
*** bhunter71 has quit IRC | 02:39 | |
*** woodster_ has joined #openstack-infra | 02:39 | |
openstackgerrit | Jerry Zhao proposed openstack-infra/nodepool: add option to use ipv6 for image update and node launching https://review.openstack.org/156178 | 02:40 |
*** bhunter71 has joined #openstack-infra | 02:40 | |
tchaypo | dstufft: around, perchance? | 02:41 |
*** sputnik13 has joined #openstack-infra | 02:45 | |
*** achanda has joined #openstack-infra | 02:45 | |
*** ujuc has joined #openstack-infra | 02:45 | |
*** unicell has quit IRC | 02:45 | |
*** mfink__ has joined #openstack-infra | 02:48 | |
*** unicell has joined #openstack-infra | 02:48 | |
*** mfink_ has quit IRC | 02:48 | |
mordred | clarkb, corvusphone: actually, I'm trying one more thing - I'm trying turning creates back on with the crazy-low rate limit | 02:50 |
*** weshay has quit IRC | 02:51 | |
*** weshay has joined #openstack-infra | 02:52 | |
*** amotoki has joined #openstack-infra | 02:53 | |
*** tsg has joined #openstack-infra | 02:55 | |
*** asettle has quit IRC | 02:55 | |
*** greghaynes has quit IRC | 02:58 | |
clarkb | mordred how is that going? | 02:59 |
*** ghostpl_ has joined #openstack-infra | 02:59 | |
*** garyh has joined #openstack-infra | 03:00 | |
mordred | clarkb: so far so good | 03:01 |
mordred | clarkb: last time it took a while before stuff started dying | 03:01 |
*** jamielennox|food is now known as jamielennox | 03:01 | |
mordred | clarkb: but I'm 95% convinced that deletes are the problem | 03:01 |
mordred | so we don't see the problem until we start trying to delete things | 03:01 |
clarkb | huh | 03:02 |
*** asettle has joined #openstack-infra | 03:02 | |
*** mwagner_lap has joined #openstack-infra | 03:03 | |
*** ddieterly has quit IRC | 03:04 | |
mordred | clarkb: I believe it's a thundering herd caused by the quota code doing table locks, combined with delete using soft deletes and bad queries, so that the delete query's quota updating is tying up the create quota calculation in the table lock | 03:04 |
mordred | so if delete performance is slow, it causes everything to stack up | 03:04 |
*** greghaynes has joined #openstack-infra | 03:05 | |
*** sdake_ has joined #openstack-infra | 03:05 | |
*** subscope_ has joined #openstack-infra | 03:05 | |
*** xyang1 has joined #openstack-infra | 03:08 | |
*** ghostpl_ has quit IRC | 03:09 | |
*** radez is now known as radez_g0n3 | 03:09 | |
*** sdake has quit IRC | 03:10 | |
mordred | clarkb: it seems to not be falling over | 03:10 |
*** garyh has quit IRC | 03:10 | |
*** sdake has joined #openstack-infra | 03:10 | |
clarkb | that's good | 03:11 |
clarkb | did they deploy new nova quota code recently? | 03:11 |
*** amotoki has quit IRC | 03:11 | |
mordred | don't think so | 03:12 |
mordred | I think it was our scheduling fix earlier that triggered this particular interaction | 03:12 |
corvusphone | Mordred we should probably revert my nodepool patch | 03:12 |
corvusphone | It will serialize deletes | 03:13 |
clarkb | except things have done exceptionally poorly there the last few weeks | 03:13 |
clarkb | but maybe this just makes it worse | 03:13 |
corvusphone | Super slow but will not overwhelm them | 03:13 |
clarkb | ya | 03:13 |
*** sdake_ has quit IRC | 03:14 | |
corvusphone | Yes. I mean we should do that now while they fix it. Because frankly this is the world's easiest DoS | 03:14 |
mordred | corvusphone: well, the current rate limiting is holding steady | 03:15 |
mordred | corvusphone: I have not checked to see if we're winding up with any nodes | 03:15 |
*** coolsvap has joined #openstack-infra | 03:16 | |
mordred | corvusphone: also - the cloud noc folks are very motivated to figure out root cause on the delete thing | 03:17 |
*** nelsnels_ has quit IRC | 03:18 | |
corvusphone | at 12s per request we're probably performing worse than before | 03:18 |
mordred | yeah | 03:19 |
*** markvoelker has joined #openstack-infra | 03:19 | |
corvusphone | But I guess it won't hurt to keep it like this | 03:19 |
corvusphone | And it's easy to ramp up if the NOC asks us to | 03:19 |
*** jyuso1 has joined #openstack-infra | 03:19 | |
mordred | corvusphone: well, also - we can try reverting your patch in the morning when we're all awake | 03:20 |
*** dims has quit IRC | 03:20 | |
*** bhunter71 has quit IRC | 03:20 | |
corvusphone | Ok you don't have to convince me :) | 03:21 |
*** markvoelker has quit IRC | 03:23 | |
*** coolsvap|afk has joined #openstack-infra | 03:23 | |
openstackgerrit | Merged openstack-infra/nodepool: Move nodepool creation in tests to common method https://review.openstack.org/165581 | 03:25 |
*** coolsvap has quit IRC | 03:25 | |
*** coolsvap|afk is now known as coolsvap | 03:26 | |
*** coolsvap is now known as coolsvap|afk | 03:26 | |
*** coolsvap|afk is now known as coolsvap | 03:27 | |
*** otter768 has quit IRC | 03:27 | |
*** emagana has joined #openstack-infra | 03:28 | |
*** otter768 has joined #openstack-infra | 03:29 | |
*** spzala has quit IRC | 03:30 | |
*** corvusphone has quit IRC | 03:31 | |
*** sputnik13 has quit IRC | 03:33 | |
*** sputnik13 has joined #openstack-infra | 03:37 | |
*** otter768 has quit IRC | 03:39 | |
*** gyee has quit IRC | 03:41 | |
*** sputnik13 has quit IRC | 03:44 | |
*** achanda has quit IRC | 03:49 | |
*** ujuc has quit IRC | 03:51 | |
*** asettle has quit IRC | 03:52 | |
*** ujuc has joined #openstack-infra | 03:54 | |
*** sputnik13 has joined #openstack-infra | 03:55 | |
*** armax has quit IRC | 03:55 | |
*** asettle has joined #openstack-infra | 03:56 | |
*** asettle has quit IRC | 03:56 | |
*** asettle has joined #openstack-infra | 03:57 | |
*** achanda has joined #openstack-infra | 03:59 | |
*** achanda has quit IRC | 04:01 | |
*** mayurig has joined #openstack-infra | 04:01 | |
*** dannywilson has joined #openstack-infra | 04:02 | |
*** ddieterly has joined #openstack-infra | 04:05 | |
*** dannywilson has quit IRC | 04:05 | |
*** dannywilson has joined #openstack-infra | 04:06 | |
zaro | clarkb: any interest going to NW linux fest this year? | 04:07 |
*** sabeen has joined #openstack-infra | 04:08 | |
*** Sukhdev has joined #openstack-infra | 04:08 | |
clarkb | I thought about it but probably won't make it | 04:09 |
*** amotoki has joined #openstack-infra | 04:09 | |
*** ddieterly has quit IRC | 04:09 | |
*** achanda has joined #openstack-infra | 04:12 | |
*** sputnik13 has quit IRC | 04:15 | |
*** camunoz_mtg has quit IRC | 04:15 | |
*** achanda has quit IRC | 04:15 | |
*** VijayTripathi has quit IRC | 04:16 | |
*** sputnik13 has joined #openstack-infra | 04:16 | |
*** rlucio has quit IRC | 04:17 | |
*** Somay has joined #openstack-infra | 04:19 | |
*** markvoelker has joined #openstack-infra | 04:19 | |
*** sushilkm has joined #openstack-infra | 04:20 | |
*** sushilkm has left #openstack-infra | 04:20 | |
*** mmedvede has joined #openstack-infra | 04:21 | |
*** mayurig has quit IRC | 04:21 | |
zaro | i think i'll be there, 1st time. any tip on where to stay? | 04:23 |
*** markvoelker has quit IRC | 04:24 | |
*** dims has joined #openstack-infra | 04:25 | |
*** wuhg has joined #openstack-infra | 04:26 | |
*** sputnik13 has quit IRC | 04:27 | |
*** camunoz_mtg has joined #openstack-infra | 04:27 | |
clarkb | not really, the only time I went I stayed in a bad hotel off the freeway | 04:28 |
wuhg | how can i add search by subject keyword to https://review.openstack.org/#/q/status:open+project:openstack-dev/devstack,n,0033de7e000285e9 | 04:28 |
clarkb | I would try downtown or water front areas if possible | 04:28 |
*** Qiming_ has joined #openstack-infra | 04:29 | |
clarkb | wuhg message:"some message" | 04:30 |
Qiming_ | hello, infra | 04:31 |
thingee | waiting on something that's blocking the tag for Cinder in k3...it has one job stuck in queued for some time. | 04:31 |
Qiming_ | another review is appreciated: https://review.openstack.org/#/c/164963/ | 04:31 |
thingee | 166003 review | 04:31 |
wuhg | clarkb: thanks ,it works | 04:32 |
*** dims has quit IRC | 04:32 | |
*** rkukura has quit IRC | 04:33 | |
*** asettle has quit IRC | 04:35 | |
*** sputnik13 has joined #openstack-infra | 04:36 | |
*** tkelsey has joined #openstack-infra | 04:36 | |
*** tkelsey has quit IRC | 04:41 | |
*** sputnik13 has quit IRC | 04:44 | |
*** sputnik13 has joined #openstack-infra | 04:46 | |
*** Sukhdev has quit IRC | 04:53 | |
*** ghostpl_ has joined #openstack-infra | 04:54 | |
*** rkukura has joined #openstack-infra | 04:54 | |
*** baoli_ has quit IRC | 04:57 | |
*** yamahata has joined #openstack-infra | 04:58 | |
*** sputnik13 has quit IRC | 04:59 | |
*** sputnik13 has joined #openstack-infra | 05:00 | |
*** amotoki_ has joined #openstack-infra | 05:00 | |
*** ghostpl_ has quit IRC | 05:01 | |
*** sigmavirus24_awa is now known as sigmavirus24 | 05:02 | |
*** sputnik13 has quit IRC | 05:02 | |
*** chlong has quit IRC | 05:02 | |
*** sputnik13 has joined #openstack-infra | 05:03 | |
*** ddieterly has joined #openstack-infra | 05:06 | |
*** achanda has joined #openstack-infra | 05:08 | |
*** VijayTripathi has joined #openstack-infra | 05:08 | |
*** ddieterly has quit IRC | 05:10 | |
*** mriedem_away has quit IRC | 05:18 | |
*** mriedem has joined #openstack-infra | 05:18 | |
*** mriedem has quit IRC | 05:18 | |
*** mriedem has joined #openstack-infra | 05:18 | |
*** chlong has joined #openstack-infra | 05:19 | |
*** markvoelker has joined #openstack-infra | 05:20 | |
*** garyh has joined #openstack-infra | 05:21 | |
*** markvoelker has quit IRC | 05:25 | |
*** coolsvap is now known as coolsvap|afk | 05:28 | |
*** jyuso1 has quit IRC | 05:31 | |
*** garyh has quit IRC | 05:31 | |
*** alexpilotti has quit IRC | 05:35 | |
*** tsg has quit IRC | 05:35 | |
*** coolsvap|afk is now known as coolsvap | 05:36 | |
*** sputnik13 has quit IRC | 05:46 | |
*** sputnik13 has joined #openstack-infra | 05:47 | |
*** hdd has joined #openstack-infra | 05:48 | |
*** Somay has quit IRC | 05:52 | |
*** sputnik13 has quit IRC | 05:54 | |
*** reed has quit IRC | 05:55 | |
*** dannywilson has quit IRC | 05:58 | |
*** xyang1 has quit IRC | 06:00 | |
*** chlong has quit IRC | 06:01 | |
*** hdd has quit IRC | 06:03 | |
*** sdake_ has joined #openstack-infra | 06:04 | |
*** VijayTripathi has quit IRC | 06:08 | |
*** sdake has quit IRC | 06:08 | |
*** sputnik13 has joined #openstack-infra | 06:11 | |
*** BharatK has quit IRC | 06:11 | |
*** BharatK has joined #openstack-infra | 06:12 | |
*** sputnik13 has quit IRC | 06:12 | |
*** chlong has joined #openstack-infra | 06:13 | |
*** [HeOS] has quit IRC | 06:16 | |
*** dims has joined #openstack-infra | 06:18 | |
*** nilasae has joined #openstack-infra | 06:18 | |
*** emagana has quit IRC | 06:18 | |
*** sdake has joined #openstack-infra | 06:18 | |
*** markvoelker has joined #openstack-infra | 06:21 | |
*** sdake_ has quit IRC | 06:22 | |
*** dims has quit IRC | 06:24 | |
*** markvoelker has quit IRC | 06:26 | |
*** mrda is now known as mrda-afk | 06:31 | |
*** garyh has joined #openstack-infra | 06:32 | |
*** fifieldt has joined #openstack-infra | 06:32 | |
openstackgerrit | Steve Kowalik proposed openstack-infra/system-config: Add zanata-cli utility to proposal slave https://review.openstack.org/166109 | 06:32 |
*** jamielennox is now known as jamielennox|away | 06:35 | |
StevenK | pleia2, cinerama: ^ | 06:35 |
*** macjack has joined #openstack-infra | 06:36 | |
*** deepakcs has joined #openstack-infra | 06:40 | |
*** macjack has quit IRC | 06:40 | |
thingee | did the gate queue just restart? | 06:40 |
thingee | I had two jobs pending to cut cinder that were five mins from being done...and now they're back to an hour | 06:41 |
* thingee wants sleep | 06:41 | |
*** macjack has joined #openstack-infra | 06:41 | |
*** garyh has quit IRC | 06:42 | |
*** subscope_ has quit IRC | 06:42 | |
*** teran has quit IRC | 06:43 | |
*** jyuso1 has joined #openstack-infra | 06:44 | |
*** sigmavirus24 is now known as sigmavirus24_awa | 06:44 | |
*** ghostpl_ has joined #openstack-infra | 06:44 | |
thingee | and gate just restarted all my builds again | 06:48 |
*** emagana has joined #openstack-infra | 06:49 | |
*** juggler_ is now known as juggler | 06:50 | |
*** macjack has quit IRC | 06:50 | |
*** mrunge has joined #openstack-infra | 06:52 | |
openstackgerrit | yolanda.robla proposed openstack-infra/project-config: Add stackforge/puppet-nscld https://review.openstack.org/165922 | 06:53 |
*** yolanda has joined #openstack-infra | 06:54 | |
*** emagana has quit IRC | 06:54 | |
*** ghostpl_ has quit IRC | 06:55 | |
*** yamahata has quit IRC | 06:58 | |
*** yamahata has joined #openstack-infra | 06:58 | |
*** Bsony has quit IRC | 07:01 | |
*** fandi has joined #openstack-infra | 07:05 | |
*** fandi has quit IRC | 07:06 | |
*** fandi has joined #openstack-infra | 07:07 | |
*** ddieterly has joined #openstack-infra | 07:08 | |
*** coolsvap is now known as coolsvap_ | 07:09 | |
*** fandi has quit IRC | 07:10 | |
*** scheuran has joined #openstack-infra | 07:10 | |
*** fandi has joined #openstack-infra | 07:10 | |
*** ddieterly has quit IRC | 07:12 | |
*** fandi has quit IRC | 07:13 | |
*** fandi has joined #openstack-infra | 07:14 | |
*** emagana has joined #openstack-infra | 07:15 | |
*** fandi has quit IRC | 07:17 | |
*** achanda has quit IRC | 07:17 | |
*** fandi has joined #openstack-infra | 07:17 | |
*** emagana_ has joined #openstack-infra | 07:18 | |
*** achuprin has quit IRC | 07:19 | |
*** emagana has quit IRC | 07:20 | |
*** sabeen has quit IRC | 07:20 | |
*** fandi has quit IRC | 07:21 | |
*** fandi has joined #openstack-infra | 07:22 | |
*** markvoelker has joined #openstack-infra | 07:22 | |
*** emagana_ has quit IRC | 07:23 | |
*** fandi has quit IRC | 07:25 | |
*** fandi has joined #openstack-infra | 07:26 | |
*** markvoelker has quit IRC | 07:27 | |
*** achanda has joined #openstack-infra | 07:29 | |
*** fandi has quit IRC | 07:29 | |
*** fandi has joined #openstack-infra | 07:30 | |
openstackgerrit | Jan Provaznik proposed openstack-infra/project-config: Create os-cloud-management project on Stackforge https://review.openstack.org/165433 | 07:31 |
*** achuprin has joined #openstack-infra | 07:32 | |
*** fandi has quit IRC | 07:33 | |
*** fandi has joined #openstack-infra | 07:33 | |
*** yfried|afk is now known as yfried | 07:33 | |
openstackgerrit | greghaynes proposed openstack-infra/nodepool: Monkeypatch Fake Clients for tests https://review.openstack.org/165682 | 07:36 |
*** fandi has quit IRC | 07:37 | |
*** fandi has joined #openstack-infra | 07:37 | |
*** camunoz_mtg has quit IRC | 07:38 | |
*** fandi has quit IRC | 07:39 | |
*** shardy has joined #openstack-infra | 07:40 | |
*** garyh has joined #openstack-infra | 07:42 | |
*** Bsony has joined #openstack-infra | 07:43 | |
GheRivero | morning | 07:43 |
*** chlong has quit IRC | 07:45 | |
*** ildikov has quit IRC | 07:48 | |
openstackgerrit | Merged openstack-infra/project-config: Update puppet-setproxy to belong to Gozer group https://review.openstack.org/164820 | 07:48 |
*** arxcruz has joined #openstack-infra | 07:48 | |
yolanda | hi AJaeger, thx for the approval. How can we manage to get some people added to the gozer gerrit group? | 07:49 |
yolanda | morning GheRivero | 07:49 |
*** e0ne has joined #openstack-infra | 07:51 | |
*** yfried is now known as yfried|afk | 07:51 | |
*** garyh has quit IRC | 07:53 | |
*** ibiris_away is now known as ibiris | 07:55 | |
*** markus_z has joined #openstack-infra | 07:56 | |
*** Somay has joined #openstack-infra | 07:57 | |
*** jistr has joined #openstack-infra | 07:58 | |
*** asselin_ has joined #openstack-infra | 08:00 | |
*** jcoufal has joined #openstack-infra | 08:01 | |
*** dtantsur|afk is now known as dtantsur | 08:02 | |
*** Somay has quit IRC | 08:04 | |
*** asselin_ has quit IRC | 08:05 | |
*** Somay has joined #openstack-infra | 08:05 | |
*** __mimir has joined #openstack-infra | 08:08 | |
*** ddieterly has joined #openstack-infra | 08:08 | |
*** __mimir has quit IRC | 08:09 | |
*** dims has joined #openstack-infra | 08:09 | |
*** __mimir has joined #openstack-infra | 08:09 | |
*** Somay has quit IRC | 08:11 | |
*** ghostpl_ has joined #openstack-infra | 08:11 | |
*** oomichi has quit IRC | 08:12 | |
*** tnovacik has joined #openstack-infra | 08:12 | |
*** emagana has joined #openstack-infra | 08:12 | |
*** Longgeek has joined #openstack-infra | 08:12 | |
*** ddieterly has quit IRC | 08:13 | |
*** ominakov has joined #openstack-infra | 08:13 | |
openstackgerrit | greghaynes proposed openstack-infra/nodepool: Don't die while doing alien list https://review.openstack.org/165792 | 08:14 |
*** mpavone has joined #openstack-infra | 08:15 | |
*** emagana has quit IRC | 08:17 | |
*** ghostpl_ has quit IRC | 08:17 | |
*** dims has quit IRC | 08:18 | |
*** _nadya_ has joined #openstack-infra | 08:19 | |
*** e0ne has quit IRC | 08:19 | |
*** openstackgerrit has quit IRC | 08:22 | |
*** openstackgerrit has joined #openstack-infra | 08:22 | |
openstackgerrit | greghaynes proposed openstack-infra/nodepool: Dont die on alien-image-list failure https://review.openstack.org/166132 | 08:22 |
*** markvoelker has joined #openstack-infra | 08:22 | |
*** achanda has quit IRC | 08:23 | |
*** achanda has joined #openstack-infra | 08:27 | |
*** ildikov has joined #openstack-infra | 08:27 | |
*** markvoelker has quit IRC | 08:27 | |
*** dboik_ has quit IRC | 08:28 | |
*** deepakcs has quit IRC | 08:30 | |
*** boris-42 has quit IRC | 08:32 | |
*** hashar has joined #openstack-infra | 08:36 | |
*** Bsony_ has joined #openstack-infra | 08:39 | |
*** Bsony has quit IRC | 08:40 | |
*** dtantsur is now known as dtantsur|bbl | 08:41 | |
AJaeger | yolanda, wait for one of the infra roots to add you to the gozer gerrit group. Let's ask fungi or clarkb to do it during the US morning. | 08:46 |
*** achanda has quit IRC | 08:50 | |
*** marun has quit IRC | 08:51 | |
*** garyh has joined #openstack-infra | 08:53 | |
*** stevemar has quit IRC | 08:54 | |
*** Somay has joined #openstack-infra | 08:56 | |
*** andreykurilin_ has joined #openstack-infra | 08:58 | |
*** dannywilson has joined #openstack-infra | 08:59 | |
*** skolekonov has joined #openstack-infra | 09:00 | |
*** andreykurilin_ has quit IRC | 09:03 | |
*** dannywilson has quit IRC | 09:03 | |
*** garyh has quit IRC | 09:04 | |
*** andreykurilin_ has joined #openstack-infra | 09:04 | |
*** ildikov has quit IRC | 09:06 | |
*** emagana has joined #openstack-infra | 09:07 | |
*** Somay has quit IRC | 09:07 | |
*** Ala has joined #openstack-infra | 09:07 | |
*** andreykurilin__ has joined #openstack-infra | 09:08 | |
*** yamahata has quit IRC | 09:09 | |
*** andreykurilin_ has quit IRC | 09:09 | |
*** ___mimir has joined #openstack-infra | 09:10 | |
*** emagana has quit IRC | 09:11 | |
*** Somay has joined #openstack-infra | 09:12 | |
*** __mimir has quit IRC | 09:13 | |
*** ghostpl_ has joined #openstack-infra | 09:13 | |
*** jamielennox|away is now known as jamielennox | 09:14 | |
*** tkelsey has joined #openstack-infra | 09:19 | |
*** markvoelker has joined #openstack-infra | 09:23 | |
*** tkelsey has quit IRC | 09:24 | |
*** ghostpl_ has quit IRC | 09:24 | |
*** tkelsey has joined #openstack-infra | 09:24 | |
*** Longgeek has quit IRC | 09:24 | |
*** Longgeek has joined #openstack-infra | 09:25 | |
*** zz_johnthetubagu is now known as johnthetubaguy | 09:25 | |
*** andreykurilin__ has quit IRC | 09:27 | |
*** andreykurilin_ has joined #openstack-infra | 09:27 | |
*** derekh has joined #openstack-infra | 09:28 | |
*** markvoelker has quit IRC | 09:28 | |
*** Longgeek has quit IRC | 09:30 | |
*** dizquierdo has joined #openstack-infra | 09:32 | |
*** mtreinish has quit IRC | 09:35 | |
*** mtreinish has joined #openstack-infra | 09:36 | |
*** amotoki has quit IRC | 09:37 | |
*** _nadya_ has joined #openstack-infra | 09:40 | |
*** Qiming__ has joined #openstack-infra | 09:41 | |
*** yfried|afk is now known as yfried | 09:43 | |
*** andreykurilin_ has quit IRC | 09:43 | |
*** ZZelle has quit IRC | 09:43 | |
*** ZZelle has joined #openstack-infra | 09:44 | |
*** Qiming_ has quit IRC | 09:44 | |
*** ominakov has quit IRC | 09:49 | |
*** ominakov has joined #openstack-infra | 09:50 | |
yolanda | hi AJaeger, ok | 09:52 |
*** yfried is now known as yfried|afk | 09:54 | |
*** hichihara has quit IRC | 09:56 | |
*** BobBall_AWOL is now known as BobBall | 09:58 | |
*** yfried|afk is now known as yfried | 09:59 | |
*** emagana has joined #openstack-infra | 10:01 | |
*** ssam2 has joined #openstack-infra | 10:02 | |
*** yamamoto has quit IRC | 10:02 | |
*** garyh has joined #openstack-infra | 10:04 | |
*** emagana has quit IRC | 10:05 | |
*** yamamoto has joined #openstack-infra | 10:05 | |
*** sileht has quit IRC | 10:07 | |
*** e0ne has joined #openstack-infra | 10:08 | |
*** yfried is now known as yfried|afk | 10:10 | |
*** ddieterly has joined #openstack-infra | 10:10 | |
*** dimsum__ has joined #openstack-infra | 10:11 | |
*** yfried|afk is now known as yfried | 10:14 | |
*** ddieterly has quit IRC | 10:14 | |
*** garyh has quit IRC | 10:15 | |
*** pblaho__ is now known as pblaho | 10:15 | |
*** mfink__ has quit IRC | 10:19 | |
*** hashar has quit IRC | 10:21 | |
*** sileht has joined #openstack-infra | 10:22 | |
*** sushilkm has joined #openstack-infra | 10:23 | |
*** markvoelker has joined #openstack-infra | 10:24 | |
*** yamamoto has quit IRC | 10:25 | |
*** Longgeek has joined #openstack-infra | 10:26 | |
*** yamamoto has joined #openstack-infra | 10:27 | |
*** yamamoto has quit IRC | 10:28 | |
*** markvoelker has quit IRC | 10:29 | |
*** yfried is now known as yfried|afk | 10:30 | |
*** Longgeek has quit IRC | 10:31 | |
*** ___mimir has quit IRC | 10:33 | |
*** rlandy has joined #openstack-infra | 10:36 | |
*** pc_m has joined #openstack-infra | 10:40 | |
*** erlon has joined #openstack-infra | 10:42 | |
*** YorikSar has joined #openstack-infra | 10:47 | |
openstackgerrit | Valeriy Ponomaryov proposed openstack/requirements: Bumg ddt to min version 0.7.0 https://review.openstack.org/166162 | 10:49 |
*** sushilkm has left #openstack-infra | 10:49 | |
*** yfried|afk is now known as yfried | 10:50 | |
*** ___mimir has joined #openstack-infra | 10:51 | |
openstackgerrit | Valeriy Ponomaryov proposed openstack/requirements: Bump ddt to min version 0.7.0 https://review.openstack.org/166162 | 10:51 |
*** yamamoto has joined #openstack-infra | 10:52 | |
*** BharatK has quit IRC | 10:52 | |
*** enikanorov has quit IRC | 10:54 | |
*** emagana has joined #openstack-infra | 10:55 | |
*** e0ne is now known as e0ne_ | 10:55 | |
*** enikanorov has joined #openstack-infra | 10:55 | |
*** tkelsey has quit IRC | 10:56 | |
*** tkelsey has joined #openstack-infra | 10:56 | |
*** e0ne_ is now known as e0ne | 10:57 | |
*** emagana has quit IRC | 11:00 | |
*** Longgeek has joined #openstack-infra | 11:01 | |
*** tnovacik has quit IRC | 11:03 | |
*** enikanorov has quit IRC | 11:04 | |
*** enikanorov has joined #openstack-infra | 11:05 | |
*** Somay has quit IRC | 11:05 | |
*** enikanorov has quit IRC | 11:06 | |
*** ghostpl_ has joined #openstack-infra | 11:06 | |
*** enikanorov has joined #openstack-infra | 11:07 | |
*** yfried is now known as yfried|afk | 11:07 | |
*** Somay has joined #openstack-infra | 11:08 | |
*** _nadya_ has quit IRC | 11:09 | |
*** mpaolino has joined #openstack-infra | 11:09 | |
*** ddieterly has joined #openstack-infra | 11:11 | |
*** Qiming_ has joined #openstack-infra | 11:11 | |
*** cdent has joined #openstack-infra | 11:12 | |
*** baoli has joined #openstack-infra | 11:13 | |
*** baoli has quit IRC | 11:13 | |
*** Qiming__ has quit IRC | 11:15 | |
*** ddieterly has quit IRC | 11:15 | |
*** garyh has joined #openstack-infra | 11:16 | |
*** jcoufal has quit IRC | 11:16 | |
*** enikanorov has quit IRC | 11:26 | |
*** garyh has quit IRC | 11:26 | |
*** enikanorov has joined #openstack-infra | 11:27 | |
*** yfried|afk is now known as yfried | 11:27 | |
openstackgerrit | Chris Dent proposed openstack/requirements: Update gabbi to 0.12.0 https://review.openstack.org/156253 | 11:29 |
*** enikanorov has quit IRC | 11:30 | |
*** enikanorov has joined #openstack-infra | 11:31 | |
*** ldnunes has joined #openstack-infra | 11:31 | |
*** yfried is now known as yfried|afk | 11:37 | |
*** enikanorov has quit IRC | 11:39 | |
*** enikanorov has joined #openstack-infra | 11:40 | |
*** dtantsur|bbl is now known as dtantsur | 11:41 | |
*** otter768 has joined #openstack-infra | 11:42 | |
*** jlanoux has joined #openstack-infra | 11:43 | |
*** dprince has quit IRC | 11:43 | |
*** claudiub has joined #openstack-infra | 11:45 | |
*** mestery is now known as mestery_afk | 11:45 | |
*** ominakov has quit IRC | 11:47 | |
*** otter768 has quit IRC | 11:47 | |
*** emagana has joined #openstack-infra | 11:49 | |
*** e0ne is now known as e0ne_ | 11:49 | |
openstackgerrit | Merged openstack/requirements: Remove failing project nova-docker https://review.openstack.org/156260 | 11:54 |
*** emagana has quit IRC | 11:54 | |
*** fbo has joined #openstack-infra | 11:54 | |
*** fifieldt has quit IRC | 11:54 | |
*** fifieldt_ has joined #openstack-infra | 11:55 | |
*** dizquierdo has quit IRC | 11:55 | |
*** sdake has quit IRC | 11:57 | |
*** fifieldt__ has joined #openstack-infra | 11:58 | |
*** pelix has joined #openstack-infra | 11:59 | |
*** dizquierdo has joined #openstack-infra | 11:59 | |
*** e0ne_ is now known as e0ne | 11:59 | |
*** fifieldt_ has quit IRC | 12:00 | |
openstackgerrit | Dmitry Tantsur proposed openstack/requirements: Add ironic-discoverd to projects.txt https://review.openstack.org/156270 | 12:01 |
*** yfried|afk is now known as yfried | 12:01 | |
openstackgerrit | Sean Dague proposed openstack-infra/os-loganalyze: extract static methods https://review.openstack.org/165850 | 12:02 |
openstackgerrit | Sean Dague proposed openstack-infra/os-loganalyze: unwind test class multiple inheritance https://review.openstack.org/165851 | 12:02 |
openstackgerrit | Sean Dague proposed openstack-infra/os-loganalyze: let tests be run from test file location https://review.openstack.org/165799 | 12:02 |
*** markvoelker has joined #openstack-infra | 12:03 | |
*** ghostpl_ has quit IRC | 12:03 | |
*** Qiming__ has joined #openstack-infra | 12:06 | |
*** eharney has quit IRC | 12:07 | |
*** Qiming_ has quit IRC | 12:07 | |
*** Somay has quit IRC | 12:10 | |
*** rfolco has joined #openstack-infra | 12:10 | |
TheJulia | good morning | 12:10 |
*** ihrachyshka has joined #openstack-infra | 12:11 | |
*** dprince has joined #openstack-infra | 12:11 | |
*** ibiris is now known as ibiris_away | 12:11 | |
*** ddieterly has joined #openstack-infra | 12:11 | |
*** yfried is now known as yfried|afk | 12:11 | |
*** jlanoux has quit IRC | 12:14 | |
*** radez_g0n3 is now known as radez | 12:16 | |
*** ddieterly has quit IRC | 12:16 | |
*** radez_g0n3 has joined #openstack-infra | 12:16 | |
*** radez_g0n3 is now known as radez | 12:16 | |
*** ghostpl_ has joined #openstack-infra | 12:17 | |
*** dkliban_afk is now known as dkliban | 12:18 | |
*** anthonyper has quit IRC | 12:18 | |
*** anthonyper has joined #openstack-infra | 12:18 | |
*** aysyd has joined #openstack-infra | 12:21 | |
*** ibiris_away is now known as ibiris | 12:22 | |
openstackgerrit | Merged openstack/requirements: Bump novaclient version https://review.openstack.org/162492 | 12:22 |
*** jaypipes has joined #openstack-infra | 12:22 | |
Kiall | any requirements core besides sean about? https://review.openstack.org/#/c/158287/ :) | 12:25 |
*** garyh has joined #openstack-infra | 12:27 | |
*** Longgeek has quit IRC | 12:27 | |
AJaeger | sdague, https://review.openstack.org/#/c/164077/ is needed to fix important bugs in our documentation toolchain, please reconsider your -2 | 12:27 |
*** gordc has joined #openstack-infra | 12:27 | |
*** baoli has joined #openstack-infra | 12:29 | |
*** bknudson has quit IRC | 12:29 | |
sdague | AJaeger: can we just remove the docs from g-r | 12:29 |
sdague | because honestly, there is no reason for the doc repos to be in there | 12:30 |
sdague | especially as your freeze windows are different | 12:30 |
*** sdake has joined #openstack-infra | 12:30 | |
AJaeger | sdague, we had this discussion already ;) We really like the syncing of requirements and somebody should implement this in a different way... | 12:31 |
sdague | yep, so then you have to live with freeze restrictions | 12:31 |
sdague | you can't have it both ways | 12:31 |
AJaeger | sdague, but I just had one idea: We already sync from openstack-manuals the glossary, we could sync requirements, let me investigate | 12:31 |
sdague | my patience on this point is pretty limited | 12:31 |
*** yfried|afk is now known as yfried | 12:32 | |
AJaeger | sdague, it was submitted a week ago - wasn't that before the feature freeze? | 12:32 |
sdague | and, honestly, openstack-manuals is such a small number of projects, it's way easier for you folks to sync your projects directly and not do these g-r round trips | 12:32 |
sdague | AJaeger: doesn't matter when it's submitted | 12:32 |
sdague | it didn't land | 12:32 |
AJaeger | sdague, when do you unfreeze? Is that before Kilo is released? | 12:33 |
sdague | after all integrated projects have stable branches | 12:33 |
*** kgiusti has joined #openstack-infra | 12:34 | |
*** baoli has quit IRC | 12:34 | |
*** bswartz has quit IRC | 12:34 | |
sdague | I *literally* have no idea why you think g-r makes any sense for documentation team | 12:35 |
sdague | if the projects weren't in projects.txt you would have already landed these changes in your repos | 12:35 |
AJaeger | sdague, I have to leave for a meeting now - I understand your arguments but need a different solution. | 12:36 |
sdague | I don't know why | 12:36 |
AJaeger | Once we have one, I happily do the changes on the documentation side. | 12:36 |
sdague | no, seriously, you have what, 6 repos? | 12:36 |
AJaeger | sdague, when we did this, we had 10+ | 12:36 |
*** adalbas has joined #openstack-infra | 12:36 | |
sdague | right, but you don't now | 12:37 |
*** garyh has quit IRC | 12:37 | |
AJaeger | ;) | 12:37 |
sdague | and, even then, it would have been so much faster for you to local sync all those than going through the g-r process | 12:37 |
sdague | it makes 0 sense that you keep insisting on that | 12:38 |
*** hodos has joined #openstack-infra | 12:40 | |
*** adalbas has quit IRC | 12:41 | |
*** e0ne is now known as e0ne_ | 12:41 | |
*** Longgeek has joined #openstack-infra | 12:42 | |
*** unicell1 has joined #openstack-infra | 12:43 | |
*** emagana has joined #openstack-infra | 12:43 | |
*** e0ne_ is now known as e0ne | 12:44 | |
*** unicell has quit IRC | 12:44 | |
*** markus_z has quit IRC | 12:46 | |
sdague | fungi: can you land - https://review.openstack.org/#/c/165542/ - I think it will fix some of the es indexing | 12:47 |
*** emagana has quit IRC | 12:48 | |
*** markus_z has joined #openstack-infra | 12:49 | |
*** ddieterly has joined #openstack-infra | 12:50 | |
*** sdake_ has joined #openstack-infra | 12:52 | |
*** adalbas has joined #openstack-infra | 12:53 | |
*** ddieterly has quit IRC | 12:53 | |
*** pelix has quit IRC | 12:53 | |
*** bknudson has joined #openstack-infra | 12:54 | |
*** baoli has joined #openstack-infra | 12:54 | |
*** sdake has quit IRC | 12:55 | |
openstackgerrit | Rafael Folco proposed openstack-infra/system-config: Updates to running-your-own CI docs: Changes required https://review.openstack.org/162268 | 12:55 |
*** baoli has quit IRC | 12:58 | |
*** baoli has joined #openstack-infra | 12:59 | |
*** ChuckC_ has joined #openstack-infra | 13:00 | |
*** ChuckC has quit IRC | 13:01 | |
*** bradjones has joined #openstack-infra | 13:01 | |
*** ChuckC_ has quit IRC | 13:05 | |
*** enikanorov has quit IRC | 13:09 | |
*** mattfarina has joined #openstack-infra | 13:10 | |
*** enikanorov has joined #openstack-infra | 13:11 | |
*** ihrachyshka has quit IRC | 13:14 | |
*** yfried is now known as yfried|afk | 13:15 | |
*** eharney has joined #openstack-infra | 13:15 | |
*** xyang1 has joined #openstack-infra | 13:16 | |
*** ChuckC_ has joined #openstack-infra | 13:19 | |
*** bswartz has joined #openstack-infra | 13:19 | |
*** dustins has joined #openstack-infra | 13:22 | |
*** ildikov has joined #openstack-infra | 13:23 | |
openstackgerrit | Paul Belanger proposed stackforge/gertty: Add missing requirement for six https://review.openstack.org/166218 | 13:24 |
*** JoshNang has quit IRC | 13:24 | |
*** eharney has quit IRC | 13:25 | |
*** zz_dimtruck is now known as dimtruck | 13:25 | |
openstackgerrit | Paul Belanger proposed stackforge/gertty: Add missing requirement for six https://review.openstack.org/166218 | 13:25 |
*** JoshNang has joined #openstack-infra | 13:26 | |
*** eharney has joined #openstack-infra | 13:26 | |
*** dimsum__ has quit IRC | 13:27 | |
dprince | If I manually clear out the TripleO RH1 cloud instances will nodepool discover them missing on its next cycle and recreate them? | 13:32 |
*** ffrog has joined #openstack-infra | 13:33 | |
*** nilasae is now known as nilasae|afk | 13:33 | |
*** eharney has quit IRC | 13:33 | |
*** Longgeek has quit IRC | 13:34 | |
*** amotoki_ has quit IRC | 13:35 | |
mordred | dprince: if not, it's super simple to delete them from nodepool's database | 13:35 |
*** yfried|afk is now known as yfried | 13:35 | |
*** amotoki has joined #openstack-infra | 13:35 | |
*** cdent has quit IRC | 13:36 | |
dprince | mordred: could you delete them for me? | 13:36 |
mordred | dprince: sure | 13:36 |
dprince | mordred: the TripleO RH1 zone nodes. | 13:36 |
*** jamielennox is now known as jamielennox|away | 13:36 | |
*** peristeri has joined #openstack-infra | 13:37 | |
*** emagana has joined #openstack-infra | 13:37 | |
*** garyh has joined #openstack-infra | 13:37 | |
mordred | dprince: you're in luck - nodepool already doesn't think it has any nodes there | 13:38 |
dprince | mordred: great. any idea how long before I see new ones spawning? | 13:39 |
mordred | let me look at the logs real quick ... | 13:39 |
*** wuhg has quit IRC | 13:39 | |
*** gaelL_ has quit IRC | 13:39 | |
mordred | dprince: should be soon - nodepool shows a demand | 13:41 |
mordred | dprince: 2015-03-20 13:39:44,584 DEBUG nodepool.NodePool: Deficit: tripleo-f20: 31 (start: 31 min-ready: 8 ready: 0 capacity: 0) | 13:41 |
mordred | 2015-03-20 13:39:44,603 DEBUG nodepool.NodePool: Deficit: tripleo-precise: 51 (start: 51 min-ready: 8 ready: 0 capacity: 0) | 13:41 |
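The two "Deficit" lines mordred pastes above follow a regular shape, so they can be pulled apart mechanically when scanning a log for stuck labels. The following is a minimal sketch that assumes only the line format visible in this log; `parse_deficit` and the regular expression are illustrative names, not part of nodepool. A `capacity` of 0 alongside a non-zero deficit is consistent with the provider being switched off, which is what mordred notices shortly afterwards.

```python
import re

# Pattern inferred from the two log lines quoted in the channel; the real
# nodepool log format may differ between versions (assumption).
DEFICIT_RE = re.compile(
    r"Deficit: (?P<label>[\w.-]+): (?P<deficit>-?\d+) "
    r"\(start: (?P<start>-?\d+) min-ready: (?P<min_ready>-?\d+) "
    r"ready: (?P<ready>-?\d+) capacity: (?P<capacity>-?\d+)\)"
)


def parse_deficit(line):
    """Return the fields of a 'Deficit' log line as a dict, or None."""
    match = DEFICIT_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    label = fields.pop("label")
    # Every remaining field is an integer counter.
    return {"label": label, **{key: int(value) for key, value in fields.items()}}


def starved_labels(lines):
    """Labels that want nodes but have no capacity to build them."""
    parsed = (parse_deficit(line) for line in lines)
    return [p["label"] for p in parsed if p and p["deficit"] > 0 and p["capacity"] == 0]
```

Feeding both quoted lines to `starved_labels` would flag `tripleo-f20` and `tripleo-precise`, since each reports demand with zero capacity.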
*** rhe00 has quit IRC | 13:41 | |
*** gaelL has joined #openstack-infra | 13:41 | |
openstackgerrit | Merged openstack/requirements: Import cap.py tool to cap explicit dependencies https://review.openstack.org/155454 | 13:41 |
openstackgerrit | Merged openstack/requirements: Up pymongo version to avoid memory leak https://review.openstack.org/123995 | 13:42 |
dprince | mordred: cool, thanks | 13:42 |
*** rhe00 has joined #openstack-infra | 13:42 | |
*** emagana has quit IRC | 13:42 | |
openstackgerrit | Merged openstack/requirements: Block eventlet 0.17.0 https://review.openstack.org/158287 | 13:42 |
mordred | dprince: oh! no, we have you turned off ... | 13:42 |
*** mpavone has quit IRC | 13:42 | |
mordred | dprince: one sec - let me see what your quota setting should be | 13:42 |
openstackgerrit | Paul Belanger proposed stackforge/gertty: Add support for tox -epep8 https://review.openstack.org/166229 | 13:43 |
*** sushilkm has joined #openstack-infra | 13:43 | |
*** sushilkm has left #openstack-infra | 13:43 | |
*** otter768 has joined #openstack-infra | 13:43 | |
mordred | dprince: k. NOW you should start seeing nodes build | 13:43 |
*** ddieterly has joined #openstack-infra | 13:43 | |
dprince | mordred: okay, thanks. Will watch these closely | 13:44 |
*** Qiming__ is now known as Qiming | 13:45 | |
*** yfried is now known as yfried|afk | 13:45 | |
Qiming | hello, openstack-infra, another review of this new project proposal is appreciated: https://review.openstack.org/#/c/164963/ | 13:46 |
Qiming | thanks | 13:46 |
dprince | mordred: I see them going ACTIVE, and floatingips too | 13:47 |
mordred | dprince: woot! | 13:47 |
dprince | mordred: 1 major outage in a year isn't too bad. Thinking the root cause was a MySQL issue of sorts | 13:47 |
*** otter768 has quit IRC | 13:48 | |
*** garyh has quit IRC | 13:48 | |
dprince | mordred: still looking into the logs but simply bouncing MySQL and clearing out some things made it happy again (we think) | 13:48 |
*** ildikov has quit IRC | 13:48 | |
mordred | cool! yeah - that's actually pretty solid I think | 13:48 |
mordred | I mean, we had issues with hp public cloud yesterday that were also mysql related ... so it's fair :) | 13:48 |
*** eharney has joined #openstack-infra | 13:49 | |
*** mtanino has joined #openstack-infra | 13:50 | |
*** ihrachyshka has joined #openstack-infra | 13:51 | |
*** tkelsey has quit IRC | 13:51 | |
*** dimsum__ has joined #openstack-infra | 13:51 | |
*** raginbajin has quit IRC | 13:53 | |
*** hdd has joined #openstack-infra | 13:54 | |
*** amitgandhinz has joined #openstack-infra | 13:54 | |
*** raginbajin has joined #openstack-infra | 13:55 | |
*** dboik has joined #openstack-infra | 13:55 | |
*** alexpilotti has joined #openstack-infra | 13:56 | |
openstackgerrit | Merged openstack-infra/project-config: puppet-openstack update https://review.openstack.org/163333 | 13:57 |
openstackgerrit | Merged openstack-infra/project-config: Custom OVERRIDE_ENABLED_SERVICES for heat-dsvm-functional https://review.openstack.org/162487 | 13:57 |
openstackgerrit | Merged openstack-infra/project-config: Run ironicclient functional tests as STACK_USER https://review.openstack.org/163552 | 13:57 |
*** hdd has quit IRC | 13:59 | |
openstackgerrit | Paul Belanger proposed openstack-infra/project-config: Add pep8 / py27 gates for gertty https://review.openstack.org/166234 | 13:59 |
sdague | mordred: speaking of hpcloud, is that recovered yet? | 13:59 |
mordred | sdague: kinda - we've found a rate limit that seems to be working out ok and not causing death | 14:00 |
*** notnownikki has joined #openstack-infra | 14:00 | |
*** tqtran has joined #openstack-infra | 14:00 | |
sdague | ok, we still have like 500 nodes in building | 14:00 |
mordred | but we haven't poked further to see if we can increase it | 14:00 |
*** tqtran has quit IRC | 14:01 | |
*** _nadya_ has joined #openstack-infra | 14:01 | |
*** yfried|afk is now known as yfried | 14:02 | |
*** dansmith is now known as superdan | 14:03 | |
mordred | sdague: the underlying problem seems to be a thundering herd issue with an interaction between slow deletes and quota interactions | 14:03 |
mordred | sdague: in that something in there takes enough time that if our API rate hits above a certain point, Mysql can't service faster than it's getting new queries | 14:04 |
sdague | interesting, would be nice if we could get a more direct link into the ops to figure out what that hot spot is, and if it's fixable in the code side | 14:06 |
*** esker has joined #openstack-infra | 14:06 | |
mordred | sdague: so you see TONS of things in _refresh_quota_usages | 14:06 |
openstackgerrit | Merged openstack-infra/project-config: Add experimental job for Manila scenario tests https://review.openstack.org/164102 | 14:06 |
*** cdent has joined #openstack-infra | 14:06 | |
openstackgerrit | Merged openstack-infra/project-config: Change project description text https://review.openstack.org/164501 | 14:07 |
mordred | sdague: I'm certain I could set that up- I mean, I started looking at nova source code last night, then started backing away quietly | 14:07 |
openstackgerrit | Merged openstack-infra/project-config: Change node param for ec2api rally job. https://review.openstack.org/164717 | 14:07 |
sdague | yeh, the quotas code is ... problematic | 14:07 |
mordred | the select for update is just not a good idea :) | 14:07 |
sdague | yeh, most of that is getting unwound | 14:08 |
openstackgerrit | Monty Taylor proposed openstack-infra/system-config: Turn HP back on with lower rate limit https://review.openstack.org/166239 | 14:10 |
mordred | I was also thinking - in addition to adding rebuild support to nodepool | 14:10 |
mordred | we have knowledge of what our desired amount of nodes is at any given point in time - we should look in to sending multi-node requests | 14:11 |
mordred | so rather than saying "nova boot" 100 times, we should say "nova boot --count=100" - perhaps | 14:11 |
clarkb | you still need 100 fip attaches, this was sort of my point yesterday about why this is :( | 14:13 |
mordred | yah. that part is still :( | 14:13 |
clarkb | sure we optimize one bit but still we are O(n) because cloud | 14:13 |
dprince | mordred: are there public logs I could view to gain insight into why the Fedora jobs are still queued? | 14:14 |
mordred | yah - but one thing at a time | 14:14 |
dprince | mordred: Seeing active instances getting deleted now. Makes me think something is failing with regards to setting up the Fedora slaves | 14:14 |
mordred | clarkb: are we public-ing the nodepool logs? ^^ | 14:14 |
dprince | mordred: FWIW the Ubuntu jobs seem to be running fine we think | 14:14 |
clarkb | no because openstack leaks private data to logs | 14:15 |
*** esker has quit IRC | 14:15 | |
*** hodos has quit IRC | 14:15 | |
dprince | mordred: not fine actually, but at least trying to run... | 14:15 |
*** esker has joined #openstack-infra | 14:15 | |
clarkb | dprince: I gave you the error from a fedora node yesterday | 14:15 |
dprince | clarkb: right, we think we solved that one. | 14:16 |
dprince | clarkb: nodepool got turned off yesterday for TripleO | 14:16 |
dprince | clarkb: now it is back on again so we are checking some things | 14:16 |
*** sputnik13 has joined #openstack-infra | 14:19 | |
mordred | sdague: while we're on the subject - why does a delete api call take a long time? is it blocking on something rather than just plopping a delete request on a queue? | 14:19 |
sdague | it's an async call as far as I know | 14:19 |
*** mestery_afk has quit IRC | 14:19 | |
fungi | yeah, if you manually nova delete, you'll see it says that it accepted the request, but it doesn't actually disappear from nova list for a while | 14:20 |
*** prad has joined #openstack-infra | 14:21 | |
*** Qiming_ has joined #openstack-infra | 14:21 | |
fungi | so are we still wanting to revert the nodepool patch from yesterday? | 14:22 |
sdague | mordred: nope, I'm wrong, it's sync | 14:23 |
mordred | fungi: _I_ don't | 14:23 |
fungi | i assume from the looks of the graph that hpcloud is still in a bad way | 14:23 |
mordred | sdague: I'd suggest making it async - since deleting doesn't actually happen at that point anyway | 14:23 |
sdague | yeh, easier said than done | 14:23 |
mordred | sdague: like, we block on the API call for 30 seconds, and still have to wait for hours for the node to get deleted | 14:24 |
mordred | sdague: :) | 14:24 |
sdague | I'm looking through that code path | 14:24 |
clarkb | dprince: BadRequest: Error. Unable to associate floating ip (HTTP 400) (Request-ID: req-cedfd88b-ba1e-4a4e-aa19-e6d131fd8db7) | 14:24 |
*** tqtran has joined #openstack-infra | 14:24 | |
mordred | sdague: I'm in a chat with folks about the issues - do you want me to bring up that you think it might be interesting to dig in? | 14:24 |
*** Qiming has quit IRC | 14:24 | |
sdague | mordred: sure, though honestly it probably won't be today | 14:25 |
*** che-arne has joined #openstack-infra | 14:25 | |
mordred | sdague: k. I'll bring that up in a different email then | 14:25 |
dprince | clarkb: sigh, so the same error again? | 14:25 |
*** timcline has joined #openstack-infra | 14:25 | |
*** ominakov has joined #openstack-infra | 14:26 | |
clarkb | dprince: yes | 14:26 |
*** bhunter71 has joined #openstack-infra | 14:26 | |
dprince | clarkb: thanks | 14:26 |
clarkb | at least from nodepool's perspective that is what is happening | 14:26 |
dprince | clarkb: are those public? | 14:26 |
*** e0ne is now known as e0ne_ | 14:26 | |
clarkb | dprince: no we can't make these logs public because openstack clients refuse to sanitize their logging | 14:26 |
mordred | clarkb: have we checked that recently? it's possible it's been fixed | 14:26 |
dprince | clarkb: also, do you see a lot of these or just a few? Using clients myself I'm seeing floatingips get assigned just fine | 14:27 |
clarkb | mordred: I haven't checked it since last summit but also not sure we have upgraded any clients since last summit either | 14:27 |
*** e0ne_ is now known as e0ne | 14:27 | |
dprince | clarkb: well, at least I did for the first round of instances | 14:27 |
fungi | i think they (some?) still do it in debug, so if we set debug on a client lib using service and it applies transitively, credentials in logs | 14:27 |
*** peristeri has quit IRC | 14:27 | |
sdague | mordred: so my guess, honestly, is that all the quota calculations are the delete cost | 14:27 |
clarkb | dprince: 65 since the log was last rotated | 14:27 |
openstackgerrit | Merged openstack/requirements: Remove hardware-specific proliantutils module https://review.openstack.org/158000 | 14:27 |
openstackgerrit | Merged openstack/requirements: Do not break on projects without setup.cfg https://review.openstack.org/156220 | 14:28 |
clarkb | dprince: looks like it rotated ~6 hours ago | 14:28 |
openstackgerrit | Merged openstack/requirements: Add a script to find cruft global requirements https://review.openstack.org/148071 | 14:28 |
fungi | where "it" is print full copies of what's being sent in the api calls, which includes credentials | 14:28 |
*** tsg_ has joined #openstack-infra | 14:28 | |
dprince | clarkb: I see many instances with floatingips actually. This one default-net=10.2.8.125, 66.187.229.119; tripleo-bm-test=192.168.1.79. The 66. address is the floatingip | 14:29 |
*** peristeri has joined #openstack-infra | 14:29 | |
clarkb | dprince: what is the instance uuid? | 14:29 |
dprince | clarkb: daf50f7d-d73c-460b-9634-274462c6e6c4 | 14:30 |
*** yfried is now known as yfried|afk | 14:30 | |
clarkb | dprince: Exception: Timeout waiting for ssh access is the error from that node | 14:31 |
clarkb | which may be related to the issue ianw discovered on rax f20 nodes (prevented node from booting properly so ssh would fail) | 14:31 |
dprince | clarkb: right, I was thinking similar | 14:32 |
jd__ | ok sorry to ask here but I'm dumb; why is https://review.openstack.org/#/c/164182/ not merging? what do I miss? | 14:32 |
mordred | sdague: well, that would fit with my napkin theory | 14:33 |
dprince | clarkb: could just be slowness though, still trying some things. | 14:33 |
*** wenlock has joined #openstack-infra | 14:33 | |
mordred | sdague: since it was the quota code that was killing the db - and it was mostly happening when we were not ratelimiting the deletes - so if they're long and sync ... that'll pile up easily | 14:33 |
clarkb | mordred: sdague if this affects kilo nova hopefully it is treated as a critical bug and we can look at it before we release | 14:34 |
*** scheuran has quit IRC | 14:34 | |
clarkb | jd__: I am not sure at first glance, let me poke around | 14:34 |
fungi | dprince: are you able to see the virtual console for it? when we ran into that, if it's what we ran into, the console was looping with the bootloader failing to find a config | 14:35 |
*** prad has quit IRC | 14:35 | |
openstackgerrit | Merged openstack-infra/project-config: Add job for network based elastic-recheck queries test https://review.openstack.org/164869 | 14:35 |
mordred | clarkb: I agree - it's effectively a DDOS in a box right now | 14:35 |
dprince | fungi: I can get at it I think. Will involve some tunnelling trickery so give me a bit. | 14:35 |
sdague | clarkb: this is a super long standing issue that requires substantial architecture changes | 14:35 |
mordred | awesome | 14:35 |
*** prad has joined #openstack-infra | 14:36 | |
openstackgerrit | Merged openstack-infra/project-config: Add gate check skip for rst/doc files os-ansible-deployment repository https://review.openstack.org/164271 | 14:36 |
clarkb | sdague: huh I guess we never tripped it before because we serialized deletes | 14:37 |
fungi | jd__: clarkb: is there a dependency loop there? some of the changes depending on that one also have depends-on commit message headers set to other changes in the same project which i think might be also in that git dependency chain | 14:37 |
fungi | i'm trying to map out the dependencies behind it but they're a little complex | 14:37 |
clarkb | sdague: which is why we had such a large delete backlog in the node graphs for so long | 14:37 |
clarkb | fungi: I am looking at zuul logs | 14:38 |
clarkb | fungi: hopefully between the two we get an answer | 14:38 |
mordred | clarkb: I just had a VERY evil thought | 14:38 |
sdague | right | 14:38 |
mordred | clarkb: what if we stopped deleting full-stop | 14:38 |
mordred | clarkb: and just replaced our delete calls with rebuild calls | 14:38 |
mordred | since delete is broken | 14:38 |
jd__ | fungi: ah good hint let me check | 14:38 |
mordred | it means our consumption would never decrease | 14:38 |
mordred | but actually our load against the clouds would be much less | 14:38 |
clarkb | 2015-03-20 10:23:37,390 DEBUG zuul.IndependentPipelineManager: Change <Change 0x7fef948f42d0 164182,6> does not match pipeline requirement <ChangeishFilter required_approvals: [{'username': 'jenkins', 'verified': [1, 2]}]> is interesting, we probably want a verified of 0 to be valid for merge check | 14:39 |
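The log line clarkb quotes shows a pipeline requirement of the form `[{'username': 'jenkins', 'verified': [1, 2]}]` rejecting the change. As a rough illustration of why a change with a missing or 0 Verified vote from jenkins would fail such a requirement, here is a minimal sketch. It is not Zuul's implementation; the shape of each approval dict and the function names are assumptions made for the example.

```python
def approval_matches(approval, requirement):
    """True if one approval satisfies every key of one requirement."""
    for key, wanted in requirement.items():
        # A requirement value may be a single value or a list of accepted values.
        allowed = wanted if isinstance(wanted, list) else [wanted]
        if approval.get(key) not in allowed:
            return False
    return True


def meets_required_approvals(approvals, required_approvals):
    """True if every requirement is satisfied by at least one approval."""
    return all(
        any(approval_matches(approval, requirement) for approval in approvals)
        for requirement in required_approvals
    )


# Requirement copied from the quoted log line.
REQUIRED = [{"username": "jenkins", "verified": [1, 2]}]
```

Under this reading, a jenkins vote of 0 (or no jenkins vote at all) never matches, which is the behaviour clarkb suggests relaxing for the merge check.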
clarkb | mordred: we need to test it with rackspace | 14:39 |
mordred | clarkb: yah | 14:39 |
mordred | clarkb: test that rebuild works you mean? | 14:39 |
clarkb | mordred: iirc they were the cloud that said don't use rebuild | 14:39 |
clarkb | ya | 14:40 |
*** tonytan4ever has joined #openstack-infra | 14:40 | |
clarkb | because I am pretty sure rax's feedback a while back was rebuild is :( | 14:40 |
clarkb | so we didn't keep looking into it with much priority | 14:40 |
openstackgerrit | Julien Danjou proposed openstack-infra/project-config: Move Gnocchi from Stackforge to OpenStack https://review.openstack.org/162146 | 14:40 |
openstackgerrit | Julien Danjou proposed openstack-infra/project-config: Remove some tests to Gnocchi https://review.openstack.org/164211 | 14:40 |
mordred | my god. so HP actively wants us to rebuild and RAX actively wants us to not | 14:40 |
mordred | that's so great | 14:40 |
clarkb | mordred: well that may have changed | 14:40 |
sdague | I wonder if that's because rebuild on xen is hokey? | 14:41 |
mordred | clarkb: it would take _slightly_ more work in nodepool than just a quick hack, btw | 14:41 |
jeblair | clarkb: i don't remember negative feedback from rax about rebuild | 14:41 |
*** armax has joined #openstack-infra | 14:41 | |
*** asselin_ has joined #openstack-infra | 14:41 | |
*** dustins_ has joined #openstack-infra | 14:42 | |
jd__ | fungi: there was a loop with another repo but at a later point in the branch, not sure that's the issue | 14:42 |
clarkb | jeblair: the feedback was it will perform worse in our cloud so please do what you are doing now iirc | 14:42 |
fungi | jd__: looks like you found it. you had a child of a child of that change which was I5ddb00a depending on I56f1988 which was in turn depending on I5ddb00a | 14:42 |
jeblair | clarkb: where was that feedback? | 14:42 |
clarkb | jeblair: here in irc when mordred asked them about it | 14:42 |
jd__ | fungi: ok if that's it cool :) | 14:42 |
jeblair | clarkb: i remember jogo saying he might want to make some things more efficient, but that's it | 14:42 |
fungi | jd__: i _think_ zuul tries to build up the entire dependency set including children and parents of the given change and if it finds a loop anywhere in there it aborts | 14:42 |
jd__ | ack :) | 14:43 |
fungi | jeblair: ^ yes? | 14:43 |
jeblair | fungi: that should be the case yes | 14:43 |
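The behaviour fungi describes, and jeblair confirms, amounts to refusing to act on a change when any loop exists in the set of related changes. A minimal sketch of that check as a depth-first search follows. It is an illustration only, not Zuul's code; the mapping layout and the middle change name `child` are hypothetical, while `I5ddb00a`, `I56f1988` and `164182` are the identifiers mentioned in the channel.

```python
def find_cycle(depends_on):
    """Return one dependency loop as a list of change ids, or None.

    depends_on maps each change id to the ids it depends on (git parents
    and Depends-On headers treated alike, which is an assumption).
    """
    path, on_path, done = [], set(), set()

    def visit(change):
        if change in on_path:
            # Reached a change already on the current path: that is a loop.
            return path[path.index(change):] + [change]
        if change in done:
            return None
        on_path.add(change)
        path.append(change)
        for dep in depends_on.get(change, ()):
            cycle = visit(dep)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(change)
        done.add(change)
        return None

    for change in depends_on:
        cycle = visit(change)
        if cycle:
            return cycle
    return None


# Approximation of the situation described above: the stuck change has a
# grandchild that takes part in a two-change loop.
RELATED = {
    "164182": [],
    "child": ["164182"],
    "I5ddb00a": ["child", "I56f1988"],
    "I56f1988": ["I5ddb00a"],
}
```

Calling `find_cycle(RELATED)` reports the loop between `I5ddb00a` and `I56f1988` even though `164182` itself is not part of it, which matches the observation that a loop further down the branch was enough to hold the change back.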
jd__ | fungi: jeblair: sounds like it, my recheck has been picked! :) | 14:43 |
jd__ | thanks guys <3 | 14:43 |
fungi | awesome | 14:43 |
jeblair | clarkb: who provided that feedback? | 14:44 |
clarkb | jeblair: I do not remember the specific individual | 14:44 |
*** dustins has quit IRC | 14:44 | |
*** sushilkm has joined #openstack-infra | 14:45 | |
*** marun has joined #openstack-infra | 14:45 | |
*** sushilkm has left #openstack-infra | 14:45 | |
*** sputnik13 has quit IRC | 14:45 | |
jeblair | clarkb: well, who shall we ask again then? | 14:45 |
*** rlandy has quit IRC | 14:46 | |
clarkb | I am reading logs... | 14:46 |
sdague | so, I think all the time is probably in this - https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py#L3463 . Optimizing there would probably be the place to do it, however stuff like that has enough ripple effects that it's definitely not a post freeze issue | 14:47 |
*** enikanorov has quit IRC | 14:47 | |
*** claudiub has quit IRC | 14:48 | |
*** garyh has joined #openstack-infra | 14:48 | |
*** enikanorov has joined #openstack-infra | 14:48 | |
jeblair | mordred: anyway, please let's not invest time in making this version of nodepool use rebuild. it would be a huge change to the algorithm that we will then throw away with zuulv3. if hpcloud can't improve, let's switch back to the old task algorithm and save rebuild for zuulv3. | 14:49 |
openstackgerrit | yolanda.robla proposed openstack-infra/system-config: Don't hardcode pip.conf values https://review.openstack.org/166252 | 14:49 |
mordred | jeblair: that's not what I was talking about | 14:50 |
clarkb | looks like phschwartz thought it was a good idea in http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2014-06-12.log and was going to work on a patch. So at least according to that log they were ok with it | 14:50 |
clarkb | now to see if my memory is based on results from writing that change? | 14:50 |
*** yfried|afk is now known as yfried | 14:50 | |
mordred | jeblair: I was talking about not changing the algorithm - and just literally never making a delete call on an existing node in nodepool, but instead changing the body of deleteNode to call rebuild | 14:51 |
mordred | jeblair: which is why I said it was an evil idea - since it would essentially keep us at max utilization constantly | 14:51 |
jeblair | mordred: i understood that. that's still a major change to the algorithm. you have to decide what to rebuild into, etc. | 14:51 |
mordred | jeblair: fair nuff | 14:52 |
jeblair | biab | 14:52 |
*** ___mimir has quit IRC | 14:53 | |
*** masayukig_ has joined #openstack-infra | 14:53 | |
mordred | clarkb: we don't seem to mark subnodes with deleted state in the db | 14:55 |
mordred | clarkb: we just call cleanupNode on them and then call node.delete() | 14:55 |
yolanda | ah, clarkb, i need you, or fungi | 14:56 |
*** aysyd has quit IRC | 14:56 | |
yolanda | a new gozer group was created | 14:56 |
yolanda | for stackforge projects | 14:56 |
yolanda | and i need someone to add people there | 14:56 |
*** mrunge has quit IRC | 14:56 | |
clarkb | yolanda: who should be the first group member (they can add the remaining members) | 14:56 |
yolanda | you can add myself | 14:57 |
clarkb | yolanda: you have two accounts, can you give me the account id number for the one you are using? | 14:57 |
yolanda | ah, ok | 14:57 |
clarkb | mordred: that doesn't update the subnodes table? | 14:57 |
yolanda | it's still that legacy thing | 14:57 |
yolanda | let me check | 14:57 |
mordred | clarkb: don't think so | 14:57 |
mordred | clarkb: I could be wrong though | 14:57 |
mordred | clarkb: I'm not REALLY looking at that - mainly said it here to remind me to look further | 14:58 |
clarkb | there is a state column and on the running DB they have different states | 14:58 |
fungi | we supposedly have 388 alien nodes in hpcloud right now | 14:58 |
fungi | should we be playing whack-a-mole with these still? | 14:58 |
yolanda | ah, mordred, jeblair, for the rebuild, i told Tim that should be better to create a spec | 14:58 |
mordred | fungi: I'm VERY confused as to how we keep growing alien nodes there | 14:59 |
yolanda | as it involves changes on nodepool logic, for the capacity algorithm | 14:59 |
*** garyh has quit IRC | 14:59 | |
clarkb | fwiw I am not finding any followup to the above conversation in the logs so I may have misremembered something that was said | 14:59 |
yolanda | clarkb href="https://login.launchpad.net/+id/yMkMBPe" | 14:59 |
mordred | yolanda: yes - it's definitely spec worthy - but I agree with jim, it's more likely something we'll want to do as part of zuulv3 | 14:59 |
clarkb | phschwartz: any idea where you got with nodepool using rebuild? | 14:59 |
*** aysyd has joined #openstack-infra | 14:59 | |
fungi | mordred: i can grab an example uuid and put together a boot/delete/whatever timeline if someone in hpcloud noc wants to trace the corresponding api calls | 14:59 |
clarkb | yolanda: can you give me the gerrit account id? https://review.openstack.org/#/settings/ is where you can find it | 14:59 |
yolanda | clarkb, yolanda.robla | 15:00 |
mordred | fungi: of one of our aliens? yeah - let's try that | 15:00 |
clarkb | yolanda: done | 15:00 |
anteaya | the weather has been good so I have to start boiling down sap today, will take me a few hours to get set up, back later | 15:00 |
yolanda | mordred, yes, concern i had is that you only need a rebuild if you still have demand of these types of nodes | 15:00 |
mordred | yolanda: you can rebuild a node to a different type | 15:00 |
mordred | yolanda: you don't have to be that fancy | 15:01 |
*** dimsum__ has quit IRC | 15:01 | |
*** AJaeger has quit IRC | 15:01 | |
yolanda | ah, nice, didn't know that it was possible | 15:01 |
yolanda | but how about flavor ? will that work for different flavours? | 15:01 |
yolanda | clarkb, thx | 15:02 |
clarkb | no, I think rebuild basically takes an arbitrary image, writes it over an existing VMs disk, then reboots the VM | 15:02 |
clarkb | so flavor needs to be constant | 15:02 |
*** stevemar has joined #openstack-infra | 15:02 | |
fungi | mordred: cool, seeing what i can put together from our logs for a sample | 15:02 |
yolanda | and we have different ones for bare, devstack, right? | 15:02 |
clarkb | yolanda: we do not | 15:02 |
clarkb | but nodepool will probably need to solve that generally | 15:02 |
clarkb | since others may | 15:02 |
yolanda | we have it downstream, less memory for bare | 15:03 |
yolanda | so it should pick a different flavour | 15:03 |
fungi | also as we start using nodepool for more varied tasks, we may want it for ourselves too | 15:03 |
fungi | so we would go from having per-label demand to per-flavor demand, i guess | 15:04 |
mordred | yes - all of those things are true | 15:04 |
clarkb | it would be both, because per-label would determine what to boot into | 15:05 |
mordred | however - at the moment - none of those things are real _current_ requirements | 15:05 |
fungi | or probably two tiers there since we would still possibly want to pre-boot and attach the workers before the jobs want to run things on them | 15:05 |
fungi | yeah, that | 15:05 |
mordred | they are requirements we should do - and should take in to account when we design the thing | 15:05 |
*** reed has joined #openstack-infra | 15:05 | |
fungi | hrm... the math on that model is going to get fun | 15:06 |
openstackgerrit | Merged openstack/requirements: Add ironic-lib to project.txt https://review.openstack.org/161603 | 15:06 |
fungi | but i need to ponder other more immediate concerns right now, so will revisit later | 15:06 |
clarkb | we are essentially taking over nova's scheduling problem | 15:06 |
clarkb | which as a user is not what I would like to be spending my time doing | 15:06 |
clarkb | bhunter71: can you see my comment on 161994? curious to hear what you think about that | 15:07 |
*** ociuhandu has quit IRC | 15:08 | |
sdague | hmmmm jeblair / clarkb - https://review.openstack.org/#/c/165851/ another one of those zuul merge incorrect errors | 15:09 |
openstackgerrit | Merged openstack-infra/devstack-gate: Add ironic-lib to devstack-vm-gate-wrap.sh https://review.openstack.org/161600 | 15:10 |
openstackgerrit | Merged openstack-infra/devstack-gate: Remove configurable testr artifact processing https://review.openstack.org/161422 | 15:11 |
*** ffrog has quit IRC | 15:11 | |
*** mjturek1 has joined #openstack-infra | 15:11 | |
*** armax has quit IRC | 15:11 | |
bhunter71 | clarkb: thanks, I think that helps. I wanted to change the again, anyway. | 15:11 |
bhunter71 | sorry, I wanted to change the format anyway. | 15:12 |
clarkb | bhunter71: as long as it sorts well I think its fine | 15:12 |
clarkb | sdague: jeblair looked into that yesterday and found gerrit doesn't always show the dependency when you push a series and query it immediately | 15:13 |
*** yamahata has joined #openstack-infra | 15:13 | |
*** dimsum__ has joined #openstack-infra | 15:14 | |
openstackgerrit | Merged openstack-infra/devstack-gate: XenAPI: Highlight that eth4 does not exist outside the Citrix environment https://review.openstack.org/165607 | 15:14 |
openstackgerrit | Merged openstack/requirements: Bump tempest-lib min version https://review.openstack.org/166044 | 15:14 |
sdague | clarkb: ah, right | 15:14 |
*** dannywilson has joined #openstack-infra | 15:14 | |
*** mestery has joined #openstack-infra | 15:15 | |
*** sputnik13 has joined #openstack-infra | 15:15 | |
clarkb | mordred: looking at graphs the hpcloud error rate is almost 100% | 15:15 |
mordred | clarkb: awesome | 15:15 |
clarkb | mordred: so while we may not be making nova fall over, it isn't doing us any good | 15:15 |
*** radez is now known as radez_g0n3 | 15:15 | |
*** nelsnelson has joined #openstack-infra | 15:16 | |
*** dimsum__ has quit IRC | 15:16 | |
mordred | clarkb: well, time to dive back in to figuring out what's failing with node boots | 15:16 |
*** rkukura_ has joined #openstack-infra | 15:16 | |
*** yfried is now known as yfried|afk | 15:17 | |
*** sushilkm has joined #openstack-infra | 15:17 | |
*** sushilkm has left #openstack-infra | 15:17 | |
*** emagana has joined #openstack-infra | 15:17 | |
*** nelsnelson has quit IRC | 15:17 | |
*** rkukura has quit IRC | 15:17 | |
*** rkukura_ is now known as rkukura | 15:17 | |
*** yamahata has quit IRC | 15:17 | |
*** sputnik13 has quit IRC | 15:18 | |
*** nelsnelson has joined #openstack-infra | 15:18 | |
*** ddieterly has quit IRC | 15:18 | |
*** radez_g0n3 is now known as radez | 15:18 | |
*** sdake has joined #openstack-infra | 15:19 | |
*** ddieterly has joined #openstack-infra | 15:19 | |
*** ajmiller has joined #openstack-infra | 15:19 | |
jogo | jeblair: yeah, johnthetubaguy said there is some nova / xen side work to do | 15:20 |
clarkb | oh cool I am not crazy | 15:20 |
jogo | jeblair: to make rebuild more efficient | 15:20 |
jogo | I think it involved making sure they don't delete the image and redownload it during a rebuild | 15:20 |
*** sdake_ has quit IRC | 15:20 | |
johnthetubaguy | jogo: ah, yeah, it's more the idea we could cache every image that's in use, not just base images | 15:21 |
*** sputnik13 has joined #openstack-infra | 15:21 | |
*** sdake_ has joined #openstack-infra | 15:21 | |
*** jogo is now known as flashgordon | 15:21 | |
*** openstackgerrit has quit IRC | 15:21 | |
johnthetubaguy | jogo: xenapi doesn't use the image cache stuff, this work has just dropped on my plate actually, although there are some more urgent things before I will get to this, it should happen | 15:21 |
*** openstackgerrit has joined #openstack-infra | 15:22 | |
clarkb | jenkins02 appears to be spiralling into thread leak terribleness | 15:22 |
clarkb | I am going to put it in shutdown mode so that I can get data for my upstream bug | 15:22 |
flashgordon | johnthetubaguy: :/ | 15:22 |
*** sigmavirus24_awa is now known as sigmavirus24 | 15:22 | |
flashgordon | even without that would rebuild be faster or slower or the same as boot delete cycles | 15:22 |
johnthetubaguy | flashgordon: it should be a little faster/more reliable due to the lack of IP stuff, and scheduling etc, but that's going to be saving 5-10 seconds I would guess | 15:23 |
*** sdake__ has joined #openstack-infra | 15:23 | |
*** sdake has quit IRC | 15:23 | |
*** hdd has joined #openstack-infra | 15:24 | |
mordred | yah. so it wouldn't kill rax - and it would be a huge win for hp | 15:24 |
johnthetubaguy | flashgordon: with the above change, it should save 190-200 seconds | 15:24 |
BobBall | So there is no reason to favour delete/rebuild in RAX? | 15:24 |
johnthetubaguy | BobBall: there is a reason to favour rebuild, it saves quite a few error conditions from being possible | 15:24 |
johnthetubaguy | BobBall: it's marginal though | 15:25 |
openstackgerrit | Joe Gordon proposed openstack-infra/project-config: Add new keystone tempest job to only run keystone tests https://review.openstack.org/164314 | 15:25 |
johnthetubaguy | flashgordon: of course you can rebuild to a new image now too, so no need to delete at the end of the day, technically | 15:25 |
*** sdake_ has quit IRC | 15:26 | |
flashgordon | nice! that is pretty neat | 15:27 |
flashgordon | mordred hopefully that gave you the information you needed, any other rax/rebuild questions for johnthetubaguy ? | 15:28 |
mordred | flashgordon: nope. that's awesome. thanks johnthetubaguy ! | 15:28 |
*** sputnik13 has quit IRC | 15:28 | |
johnthetubaguy | mordred: do let me know if anything crops up | 15:28 |
mordred | johnthetubaguy: also, we now know that you're the new pvo in terms of us bugging someone about rax nova questions | 15:28 |
*** sputnik13 has joined #openstack-infra | 15:28 | |
mordred | johnthetubaguy: hope you're ok with that | 15:28 |
mordred | :) | 15:28 |
*** sputnik13 has quit IRC | 15:28 | |
johnthetubaguy | mordred: I noticed a few errors seem to happen just after you switch the images over | 15:28 |
johnthetubaguy | mordred: lol, sure | 15:29 |
*** ayoung has quit IRC | 15:30 | |
clarkb | ok I have a thread dump for jenkins02, is that a safe thing to upload to jenkins' jira? | 15:31 |
clarkb | I will put a copy of it in my homedir on jenkins02 | 15:31 |
clarkb | johnthetubaguy: what sort of errors? because that may make it not useable for us | 15:31 |
fungi | johnthetubaguy: actually one of the reasons it's attractive to us is that we often end up waiting up to an hour for rax to assign an ip address on a nova boot request at peak activity, and rebuild was seen as a potential workaround for that | 15:32 |
*** carl_baldwin has joined #openstack-infra | 15:32 | |
johnthetubaguy | clarkb: I mean the existing method is hitting some image not found issues, like you deleted the image during the build, but not totally sure | 15:32 |
clarkb | johnthetubaguy: I see, not really something that is a problem generally, but a few corner issues | 15:33 |
johnthetubaguy | fungi: hmm, never seen that take an hour, but yes, it would totally side step that | 15:33 |
fungi | so that 10-15 second performance gain from reusing network configuration is maybe more like 3600 seconds | 15:33 |
clarkb | we can likely live with that :) | 15:33 |
fungi | johnthetubaguy: at times we've been told that the regions we're booting in simply had no available ip addresses and so nova was waiting on some to free up | 15:34 |
johnthetubaguy | fungi: not according to the reporting on my side, not seen any of the builds take quite that long, would love to dig into that if it comes up again | 15:34 |
*** yfried|afk is now known as yfried | 15:34 | |
johnthetubaguy | fungi: oh yeah, we worked around that now, only takes 15 mins to go back in the pool | 15:34 |
fungi | johnthetubaguy: aha, so old info then. that's awesome | 15:34 |
*** yfried has quit IRC | 15:34 | |
johnthetubaguy | fungi: I think the switch configs had to get updated, etc | 15:34 |
mordred | that would be one of the benefits on the other side too - floating ips stay with a node that is rebuilt | 15:34 |
*** asselin_ has quit IRC | 15:35 | |
johnthetubaguy | mordred: yeah, so we don't support those yes, sigh, but rebuild will do the trick for now | 15:35 |
fungi | though we do still recycle instances quickly enough that 15 minutes to return them to the pool is potentially a lot of waste from rackspace's perspective still | 15:35 |
johnthetubaguy | s/yes/yet/ | 15:35 |
mordred | johnthetubaguy: we LOVE that you don't have floating ips | 15:35 |
mordred | johnthetubaguy: we HATE floating ips | 15:35 |
*** ociuhandu has joined #openstack-infra | 15:35 | |
mordred | johnthetubaguy: I really hope that you don't stop supporting servers having real ips | 15:35 |
mordred | johnthetubaguy: because I would consider that a regression from the thing you do now which is very awesome | 15:36 |
johnthetubaguy | mordred: they are going to be an addition, AFAIK | 15:36 |
mordred | yay! | 15:36 |
* mordred hugs johnthetubaguy | 15:36 | |
* johnthetubaguy sends hug to brad mcconnall | 15:36 | |
fungi | 301 hug redirect | 15:37 |
*** dustins_ has quit IRC | 15:37 | |
*** dimsum__ has joined #openstack-infra | 15:37 | |
johnthetubaguy | :) | 15:37 |
mordred | now, if only I could convince johnthetubaguy to start using dhcp my life would be complete ... | 15:37 |
*** Qiming__ has joined #openstack-infra | 15:37 | |
johnthetubaguy | mordred: yeah, I am kinda requesting that, but it's not on a roadmap I have seen right now | 15:37 |
*** jaypipes is now known as leakypipes | 15:37 | |
*** thedodd has joined #openstack-infra | 15:37 | |
clarkb | mordred: fungi: I have skimmed the thread dump from jenkins02, the only thing it seems to expose are the server's name, its ip address, some slave names, and some job names running on those slaves | 15:38 |
*** Qiming_ has quit IRC | 15:38 | |
clarkb | mordred: fungi do either of you want to double check it isn't leaking anything dangerous? | 15:38 |
johnthetubaguy | mordred: so config drive and cloud-init might do all that for you, not had chance to test it, so you can kill the agent in your image, if you want (needs an extra image prop xenapi_use_agent=False I think) | 15:38 |
fungi | clarkb: where did you save it? in your homedir? | 15:38 |
mordred | johnthetubaguy: yes, that is correct | 15:38 |
clarkb | fungi: yup on jenkins02 | 15:38 |
mordred | johnthetubaguy: except you need patched config drive | 15:38 |
johnthetubaguy | mordred: patched config drive? | 15:38 |
mordred | johnthetubaguy: but that's fine - we have a workaround/know how to deal with it | 15:38 |
dprince | clarkb,mordred: so to disable the RH1 TripleO (temporarily) while we try some things do we need to push a patch? | 15:39 |
johnthetubaguy | mordred: OK, if its working thats cool | 15:39 |
mordred | johnthetubaguy: yes - upstream config drive does not yet support reading the passthrough network info | 15:39 |
mordred | johnthetubaguy: but yeah - it's a thing we have a plan for | 15:39 |
clarkb | dprince: ya, you want to update the nodepool.yaml.erb file to set your region's max-servers to 0 | 15:39 |
BobBall | Quick config drive question... does rax support volume IDs for config drive? | 15:39 |
dprince | clarkb,mordred: I mean I know we can update the nodepool conf... or would it be okay if we just temporarily block the public API port? | 15:39 |
mordred | johnthetubaguy: but if you used dhcp - we wouldn't need to work around anything | 15:39 |
johnthetubaguy | mordred: ack | 15:39 |
clarkb | dprince: I think temporarily blocking the API port is also fine | 15:39 |
*** ChuckC_ is now known as ChuckC | 15:39 | |
dprince | clarkb: okay, we might do that just so we don't have to bother you as much :), thanks | 15:40 |
johnthetubaguy | BobBall: volume IDs? we just do what the upstream code does, I would have to check | 15:40 |
mordred | BobBall: yes | 15:40 |
johnthetubaguy | ah, there you go | 15:40 |
mordred | BobBall: rax allows you to mount config-2 | 15:40 |
jeblair | dprince: yeah, an icmp reject would be best i think (don't just drop it) | 15:41 |
*** weshay has quit IRC | 15:41 | |
dprince | jeblair: okay, will try that | 15:41 |
BobBall | I don't _think_ I mean disk-by-label - I mean specifying config_drive=<volume_id> when creating the server. Personally never done it so don't know what the use case is for that though :) | 15:41 |
*** ominakov has quit IRC | 15:41 | |
BobBall | jhesketh mentioned it on https://review.openstack.org/#/c/155770/ | 15:42 |
jeblair | clarkb, mordred, yolanda: can you dump the information we just gathered about rebuild into a comment on https://review.openstack.org/#/c/164371/ ? | 15:42 |
jeblair | clarkb, mordred, yolanda: we can put a paragraph about rebuild into the next iteration | 15:43 |
mordred | jeblair: yes. I can do that | 15:43 |
*** ominakov has joined #openstack-infra | 15:43 | |
*** otter768 has joined #openstack-infra | 15:44 | |
*** weshay has joined #openstack-infra | 15:44 | |
yolanda | ah, i need to read that spec | 15:45 |
jroll | johnthetubaguy: by 'patched configdrive' mordred means this thing that's deployed in our cloud: https://review.openstack.org/#/c/153097/ | 15:45 |
jroll | (in case that wasn't clear) | 15:46 |
*** claudiub has joined #openstack-infra | 15:46 | |
* jroll wonders if we can get that merged within a year from the original patch | 15:46 | |
mordred | jeblair: I think I captured everything | 15:46 |
johnthetubaguy | jroll: yeah, it should also have the regular network info in there too in the XenServer based VMs, I think, so regular cloud-init should have picked it up, in theory | 15:47 |
clarkb | mordred: comment on flavor? | 15:47 |
jroll | johnthetubaguy: yeah, I don't think infra uses cloud-init though | 15:48 |
clarkb | jroll: we do | 15:48 |
jroll | orly | 15:48 |
jroll | do you use patched cloud-init or? | 15:48 |
clarkb | jroll: at least as of yesterday, things sort of changed yesterday afternoon | 15:48 |
jroll | heh | 15:48 |
clarkb | mordred: speaking of, did all images get rebuilt this morning without cloud init? | 15:48 |
*** otter768 has quit IRC | 15:49 | |
pabelanger | jeblair, nice spec. I was thinking about multi-nodes this morning. After looking into the current subnodes setup, I was having some trouble getting subnodes to use a different image than the parent. | 15:50 |
*** baoli has quit IRC | 15:50 | |
jeblair | pabelanger: yeah, it's about 1/4 of the spec-writing necessary for the zuulv3 work i outlined in an email a while ago. i'm hoping to write up another chunk today. | 15:51 |
mordred | clarkb: no - because we didn't ever move past hpcloud-b5 yesterday because it all went to hell | 15:51 |
clarkb | mordred: except that nodepool rebuilds images every day | 15:51 |
mordred | clarkb: good point - then if image builds were successful ...yes | 15:52 |
jeblair | mordred, clarkb: i'll add the note about flavors | 15:52 |
*** ominakov has quit IRC | 15:52 | |
*** harlowja_at_home has joined #openstack-infra | 15:52 | |
fungi | mordred: here's an example alien node leak which hpcloud noc can maybe analyze from their side http://paste.openstack.org/show/193961/ | 15:53 |
mordred | jeblair: nod. thanks - I knew I was missing something | 15:53 |
clarkb | mordred: looks like the snapshots are all still building | 15:53 |
clarkb | mordred: and the dib images haven't been uploaded yet because some are still building | 15:54 |
clarkb | mordred: so we haven't flipped that switch yet but it is in progress | 15:54 |
mordred | fungi: did we actually submit a delete server task there? | 15:54 |
mordred | clarkb: cool | 15:54 |
clarkb | mordred: and maybe that will affect the hpcloud error rate if the metadata server is still hosed there | 15:54 |
fungi | mordred: that's a good question. nodepool says it deleted the node, but it didn't explicitly log the api call itself so... depends on how much we trust that nodepool is making those calls? | 15:55 |
fungi | mordred: i guess we could add all provider api responses to the debug log. not sure how much bloat that would add | 15:56 |
clarkb | fungi: does that uuid ever show up in the nodepool log? | 15:56 |
fungi | clarkb: nope | 15:56 |
clarkb | fungi: I have a hunch that the 502 happens before assigning a uuid, nodepool says I don't need to delete this node in the cloud because it never exists (no uuid) and simply removes it from the db | 15:56 |
clarkb | then at some point in time hpcloud says "here have a node" | 15:57 |
fungi | clarkb: great point. nodepool may be assuming that errors from a boot call are always going to be cleaned up on the provider side | 15:57 |
clarkb | jroll: so we do still use cloud init, we are currently rebuilding our images to stop using it because ec2 metadata server isn't very reliable | 15:57 |
clarkb | fungi: yes, and I am not sure it can assume much else | 15:58 |
fungi | clarkb: also i'm not entirely sure how it would ever be able to be 100% certain that it needs to clean those up | 15:58 |
fungi | yeah, agreed | 15:58 |
*** unicell1 has quit IRC | 15:58 | |
jroll | clarkb: right, thought you were moving away from it, though | 15:58 |
mordred | jroll: yup | 15:59 |
jroll | cool | 15:59 |
mordred | jroll: two different efforts - one is related to getting dib images to work on rackspace to start with | 15:59 |
jesusaurus | clarkb: fungi: yeah last night i had to clean out a bunch of nodes that nodepool listed as aliens | 15:59 |
*** garyh has joined #openstack-infra | 15:59 | |
jroll | mordred: right | 15:59 |
mordred | jroll: this one is reacting to the fact that ec2 metadata in hp is horky - so we want to stop using it there sooner | 15:59 |
jroll | heh | 16:00 |
*** harlowja_at_home has quit IRC | 16:00 | |
openstackgerrit | David Lyle proposed openstack/requirements: Raise cap for Django to allow 1.7 https://review.openstack.org/155353 | 16:00 |
mordred | I should remove "in hp" - I believe it's horky anywhere it exists | 16:00 |
openstackgerrit | Sean Dague proposed openstack/requirements: Bump sahara client version https://review.openstack.org/155428 | 16:00 |
fungi | jesusaurus: seems to me like a nova bug, if the behavior we're theorizing is actually responsible | 16:00 |
jroll | mordred: considering rackspace doesn't have a metadata service, that should make life easier | 16:00 |
*** ominakov has joined #openstack-infra | 16:00 | |
jeblair | fungi, mordred: i'm _pretty_ sure looking at the log in http://paste.openstack.org/show/193961/ that we will have issued a create server api call, gotten a 502 response from that, therefore we never received the server id, but the server was actually created | 16:00 |
mordred | jroll: you'd think | 16:00 |
clarkb | jeblair: yup that is my following too | 16:01 |
clarkb | s/following/reading/ english hard | 16:01 |
mordred | jeblair: that seems likely | 16:01 |
jroll | ha | 16:01 |
mordred | so - it's possible that a 502'd api call can still result in a booted node | 16:01 |
*** dizquierdo has quit IRC | 16:01 | |
fungi | so... if hpcloud can confirm from their side the circumstances which cause the boot call to return an error but allow the server to still be built, that needs to be filed as a bug against nova yeah? | 16:01 |
jeblair | fungi, mordred, clarkb: is there some, like, nova metadata we can stick in a create call that will show up on an inventory later so we could link that server to a "failed" api call? | 16:01 |
*** Qiming_ has joined #openstack-infra | 16:01 | |
mordred | jeblair: yes | 16:01 |
mordred | jeblair: we can put anything we want in the nova metadata | 16:02 |
fungi | i like the canary idea there | 16:02 |
jeblair | mordred: and that's included in the create call so it's synchronous? | 16:02 |
mordred | yup | 16:02 |
*** dimsum__ has quit IRC | 16:02 | |
mordred | jeblair: we can add more to my patch for that if you want | 16:02 |
clarkb | fungi: yes I think that is a nova bug | 16:02 |
jeblair | cool, so we should probably do that, but also, i do think api calls that return failures while succeeding are bad form :) | 16:02 |
mordred | jeblair: https://review.openstack.org/#/c/126621/ | 16:02 |
clarkb | fungi: if api returns error node should not be booted | 16:02 |
mordred | jeblair: indeed | 16:02 |
mordred | jeblair: so, I added a nodepool dict to that metadata - would be simple to put more things in that | 16:03 |
jeblair | mordred: cool | 16:03 |
fungi | jeblair: so are you thinking stick an identifier for the nodepoold in as metadata so that it can say "here's an instance in the list, metadata says it's one i built, but i don't have any record of it, delete now"? | 16:03 |
jeblair | fungi: yeah | 16:04 |
clarkb | fungi: jeblair I think we can do that with the node name fwiw | 16:04 |
jeblair | fungi: i think it'd have to be since we delete the node record from the db | 16:04 |
clarkb | the name is essentially metadata that we already have | 16:04 |
fungi | clarkb: not necessarily. consider multiple nodepoolds using a common tenant | 16:04 |
jeblair | fungi: and you may be surprised at this -- i don't want to keep records of every node we've ever created. I've seen where that goes ;) | 16:04 |
clarkb | fungi: oh hrm | 16:04 |
*** Qiming__ has quit IRC | 16:04 | |
mordred | I think it's cheap to add more things into the nova metadata | 16:04 |
fungi | jeblair: yeah, that's why i'm guessing we just have a reusable id of the nodepoold itself (maybe specified in its config) | 16:05 |
clarkb | fungi: ya you are right, so each nodepool would need to add metadata that uniquely identified a booted node to a nodepool instance | 16:05 |
fungi | "here stick this value in the metadata of ever instance you boot" | 16:05 |
*** masayukig_ has quit IRC | 16:05 | |
fungi | er, every | 16:05 |
clarkb | fungi: ya | 16:05 |
jeblair | wfm | 16:05 |
*** sdake has joined #openstack-infra | 16:05 | |
fungi | and if nodepoold sees its own canary there, it knows it's one it built | 16:06 |
*** masayukig_ has joined #openstack-infra | 16:06 | |
fungi | in theory we could accomplish it without metadata by namespacing the instance hostnames, but that's ugliness | 16:06 |
mordred | yeah - especially when we have a friendly metadata structure to use | 16:06 |
clarkb | flashgordon: ^ any idea if the above api returned error but nova booted a node is already a filed bug? | 16:06 |
*** armax has joined #openstack-infra | 16:07 | |
clarkb | flashgordon: and if not ideas on how hpcloud can confirm it is a nova issue? | 16:07 |
*** amotoki has quit IRC | 16:07 | |
*** amotoki has joined #openstack-infra | 16:07 | |
ttx | jeblair: fyi I wrote a new app for design summit scheduling -- one that allows PTLs to directly edit bits of info on sched.org | 16:09 |
ttx | Currently at https://github.com/ttx/summitsched | 16:09 |
*** sdake__ has quit IRC | 16:09 | |
*** garyh has quit IRC | 16:10 | |
ttx | sched.org is all-or-nothing, this will allow to delegate maintenance of parts of the schedule content to people | 16:10 |
*** ominakov has quit IRC | 16:10 | |
*** ominakov has joined #openstack-infra | 16:11 | |
ttx | also enforces all sorts of rules, like prefixing of session titles with track name | 16:11 |
jeblair | ttx: cool, you want to run it in infraland? | 16:11 |
jeblair | ttx: (also, yay prefixing titles!) | 16:11 |
*** Qiming_ has quit IRC | 16:11 | |
*** ominakov has quit IRC | 16:12 | |
ttx | jeblair: it will require infraland resources -- whether I'll be able to fully puppetize it or just request an empty box with root shell is yet tbd | 16:12 |
ttx | (depending on how much time I'll have on my hands) | 16:12 |
jeblair | ttx: we've already got a puppet model for 'install django and run the syncdb thing' | 16:13 |
jeblair | ttx: so it shouldn't be too hard | 16:13 |
jeblair | ttx: (graphite.o.o does that i think) | 16:13 |
ttx | jeblair: yeah, just need to add the initial data load (track names and lead usernames) | 16:13 |
ttx | Also allows multiple people to help for the same track | 16:14 |
ttx | rather than be PTL-reserved | 16:14 |
ttx | Currently considering the ability to tag a session with multiple types, so that it appears on multiple tracks | 16:14 |
*** ayoung has joined #openstack-infra | 16:15 | |
ttx | but the sched data model is pretty weak | 16:15 |
*** esker has quit IRC | 16:15 | |
ttx | (its API is weak too) | 16:15 |
clarkb | fungi: were you going to check that thread dump? | 16:16 |
fungi | clarkb: ahh, yep, grepping through it now | 16:16 |
fungi | funny, just noticed that the spelling check in my irc client believes "grepping" is an actual word | 16:17 |
*** baoli has joined #openstack-infra | 16:17 | |
*** sigmavirus24 is now known as sigmavirus24_awa | 16:17 | |
*** masayukig_ has quit IRC | 16:18 | |
*** thingee has quit IRC | 16:18 | |
fungi | clarkb: nothing troublesome that i can find | 16:20 |
openstackgerrit | Clark Boylan proposed openstack-infra/project-config: Disable -dibtest jobs https://review.openstack.org/166302 | 16:20 |
*** sigmavirus24_awa is now known as sigmavirus24 | 16:20 | |
openstackgerrit | Clark Boylan proposed openstack-infra/system-config: Cleanup devstack-(trusty|precise)-dib images https://review.openstack.org/158891 | 16:21 |
*** masayukig_ has joined #openstack-infra | 16:21 | |
clarkb | mordred: ^ getting those two changes in should allow you to delete the devstack-precise-dib and devstack-trusty-dib images on nodepool.o.o freeing up ~16GB of disk | 16:21 |
yolanda | mordred, i see your change for rate=64 for hpcloud? is that really needed? that's too high, going to make nodepool sloooow | 16:21 |
*** achuprin has quit IRC | 16:21 | |
yolanda | and i'm worried for the part i'm affected | 16:21 |
fungi | hrm, the test nodes graph says we have no nodes in use now | 16:21 |
mordred | yolanda: it's required for us | 16:21 |
yolanda | but this means 1 api call per 64 secs?! | 16:21 |
mordred | nope | 16:22 |
mordred | well, for us | 16:22 |
fungi | nodepool list says 14 nodes in use | 16:22 |
mordred | it's 1 every 12.8 | 16:22 |
mordred | because we have 5 hpcloud regions | 16:22 |
mordred | yolanda: I would not copy that setting if I were you | 16:22 |
mordred | yolanda: if the noc is not yelling at you, your current setting is fine | 16:22 |
yolanda | no, of course :) | 16:22 |
fungi | i think something in the image updates may have tanked rackspace node builds? | 16:22 |
clarkb | fungi: all the clouds are basically 100% error rate | 16:22 |
yolanda | but i was reviewing that change and my alerts raised | 16:22 |
mordred | fungi: oh, that's not great | 16:22 |
fungi | i'm going to grab a console for one now and see what's going on | 16:23 |
clarkb | fungi: it's possible that's the remove cloud init going into effect | 16:23 |
*** dimsum__ has joined #openstack-infra | 16:23 | |
fungi | clarkb: yeah, that was my suspicion as well | 16:23 |
clarkb | so we may need to delete today's build in rax then revert those changes | 16:23 |
mordred | sigh | 16:23 |
jeblair | mordred, yolanda: it's also only required for the un-merged change that does rate limiting on start of requests instead of end | 16:23 |
clarkb | fungi: thank you for checking the thread dump I will upload that to the bug now | 16:23 |
yolanda | ah, jeblair, i looked at that, i'm eager to see that merged | 16:23 |
yolanda | nodepool is our daily pain | 16:24 |
clarkb | yolanda: nodepool or the cloud? | 16:24 |
clarkb | if there are bugs in nodepool we should fix them | 16:24 |
mordred | clarkb: same thing for them - they only have one cloud | 16:24 |
yolanda | 75% cloud 25% nodepool? | 16:24 |
mordred | clarkb: which is, I believe, why they're more eager for the rebuild stuff | 16:24 |
rcarrillocruz | ++ | 16:25 |
rcarrillocruz | rebuild would be a killer feature for us | 16:25 |
jeblair | mordred, yolanda: i think this is no better than what we had before and actually i think it is a little worse. i think we should not merge that change and go back to our previous config | 16:25 |
yolanda | if you look at the charts, most of the nodes are in building and delete status | 16:25 |
mordred | jeblair: yah | 16:25 |
*** unicell has joined #openstack-infra | 16:25 | |
yolanda | mordred, jeblair, have you ever thought about nodepool serving docker instances? for lots of simple tests that will make things much faster | 16:27 |
yolanda | a pep8 test, an alphabetized one | 16:27 |
yolanda | these are very very simple things | 16:28 |
yolanda | why a full vm for that? | 16:28 |
jeblair | yolanda: yes, i have. that's something that we could consider doing in zuulv3 as well. but again, it would be very complicated in the current system. partially because of the nodepool allocation system, but also complex for us because docker is not secure. | 16:28 |
yolanda | i was running tests on lxc when i started and i've always had that on my mind | 16:28 |
mordred | jeblair: ++ | 16:28 |
yolanda | jeblair, but discriminating the kind of tests that could use it... it could be a real helper | 16:29 |
yolanda | you cannot run a tempest test there, but can run pep8 or unit testing | 16:29 |
fungi | clarkb: mordred: no errors on the console. they're booting up to a local login prompt, but not reachable on the ip address reported by nova list (no response even via ping) | 16:29 |
mordred | fungi: these are the rax nodes? | 16:29 |
fungi | mordred: yep | 16:29 |
jeblair | yolanda: you don't need to convince me, i understand. | 16:29 |
zaro | morning | 16:30 |
fungi | mordred: looks like just since this morning's image updates there | 16:30 |
mordred | fungi: well, I mean, that definitely points to delete images and revert | 16:30 |
yolanda | jeblair, and how could we achieve it? some spec for it, and tied to the new nodepool spec? | 16:30 |
fungi | mordred: agreed. doing so now | 16:30 |
mordred | I do NOT understand why | 16:30 |
mordred | but figuring out why is a task for later | 16:30 |
*** sabari has quit IRC | 16:31 | |
*** spzala has joined #openstack-infra | 16:31 | |
jeblair | yolanda: yes, it could be done either as part of, or after, the zuulv3 work. | 16:31 |
yolanda | that will be a killer feature for the simple tests | 16:32 |
rcarrillocruz | i think docker is cool, but besides security implications, i think docker should be kept at the nova provider layer, and not on nodepool | 16:32 |
rcarrillocruz | would be great to maybe have that new hp infra cloud with docker or something | 16:32 |
clarkb | its important to note that containers don't address the current problems because you still need somewhere to run the container. So the current issues need to be fixed first regardless | 16:32 |
yolanda | do providers allow it? | 16:32 |
rcarrillocruz | and devoting nodes for pep8 | 16:32 |
rcarrillocruz | etc | 16:32 |
jeblair | clarkb: yep | 16:32 |
*** sabari has joined #openstack-infra | 16:32 | |
jeblair | okay, so i'd like to defer this conversation for later... | 16:32 |
yolanda | the way i had it implemented, is that i had x static slaves, that were serving lxc containers | 16:33 |
jeblair | and instead breach the subject that we have no workers right now. | 16:33 |
*** skolekonov has quit IRC | 16:33 | |
mordred | jeblair: yes. | 16:33 |
jeblair | well, i mean, we have 20. | 16:33 |
mordred | jeblair: statistically, that's no workers | 16:33 |
*** dkranz has quit IRC | 16:34 | |
jeblair | we have 295 building in rax | 16:34 |
clarkb | fungi is deleting the new images we just built | 16:35 |
fungi | all images built in rackspace in the last few hours are now deleted. hopefully we see some recovery there shortly | 16:35 |
jeblair | okay cool | 16:35 |
clarkb | that should get us back to pre cloud init removal. Then we also need to revert those changes | 16:35 |
clarkb | otherwise this will regress over the weekend | 16:35 |
jeblair | i don't see anything happening on the hpcloud side | 16:35 |
jeblair | as far as the noc asking us to load test or anything | 16:35 |
*** amotoki has quit IRC | 16:36 | |
jeblair | so i'd like to just go ahead and revert back to thursday morning's config | 16:36 |
fungi | that works for me | 16:36 |
mordred | k. | 16:36 |
jeblair | mordred: if hpcloud improves something, we can check our logs for deletion times | 16:36 |
*** ayoung has quit IRC | 16:37 | |
*** ociuhandu has quit IRC | 16:37 | |
jeblair | mordred: we see 30 second delete api calls enough that we should see an improvement in that time if they manage to make an improvement | 16:37 |
mordred | jeblair: yah | 16:37 |
*** tjones1 has joined #openstack-infra | 16:37 | |
jeblair | rcarrillocruz, yolanda: ^ (if they are able to improve things, this would help you too) | 16:37 |
*** MrAboii has joined #openstack-infra | 16:38 | |
*** dkranz has joined #openstack-infra | 16:38 | |
yolanda | so jeblair, what's the issue, api response went worse than normal? | 16:38 |
jeblair | yolanda: i believe it has slowed gradually over time | 16:39 |
*** achuprin has joined #openstack-infra | 16:39 | |
openstackgerrit | James E. Blair proposed openstack-infra/system-config: Revert "Turn off HP Public Cloud" https://review.openstack.org/166308 | 16:40 |
jeblair | mordred: can you ninja that ^ | 16:40 |
*** EmilienM is now known as EmilienM|afk | 16:41 | |
mordred | jeblair: yup | 16:41 |
clarkb | jeblair: revert sounds good to me as well | 16:41 |
openstackgerrit | Merged openstack-infra/system-config: Revert "Turn off HP Public Cloud" https://review.openstack.org/166308 | 16:42 |
mordred | jeblair: don't forget, puppet is disabled on nodepool | 16:42 |
*** andreykurilin_ has joined #openstack-infra | 16:42 | |
jeblair | mordred: yep. i plan on stopping nodepool, re-installing master, running puppet apply, and starting nodepool | 16:42 |
jeblair | fungi: are you ready for me to do that ^ ? | 16:42 |
clarkb | I have made progress on https://issues.jenkins-ci.org/browse/JENKINS-27514 just by reading through this to update the bug | 16:43 |
jeblair | (er, puppet agent) | 16:43 |
fungi | jeblair: yes, go for it | 16:43 |
*** yamahata has joined #openstack-infra | 16:43 | |
*** Ala has quit IRC | 16:43 | |
openstackgerrit | Merged openstack-infra/project-config: Add new project faafo to Stackforge https://review.openstack.org/164668 | 16:44 |
*** tsg_ has quit IRC | 16:44 | |
clarkb | https://github.com/jenkinsci/ssh-slaves-plugin/blob/ssh-slaves-1.9/src/main/java/hudson/plugins/sshslaves/SSHLauncher.java#L1213 hanging coupled with a synchronized method appears to be leaking all of the threads | 16:44 |
jeblair | +if type dpkg-reconfigure >/dev/null 2>&1 && ! test -f /etc/ssh/ssh_host_rsa_key | 16:44 |
jeblair | +then | 16:44 |
jeblair | + dpkg-reconfigure openssh-server | 16:44 |
jeblair | +fi | 16:44 |
jeblair | puppet did that ^ | 16:45 |
jeblair | in case that impacts your thinking about the content of images that were built this morning | 16:45 |
fungi | that was part of the "regen ssh host keys ourself" patch | 16:45 |
jeblair | yeah, seems like it may not have been applied | 16:45 |
fungi | i don't think the rackspace images were getting that far | 16:45 |
jeblair | ok | 16:45 |
fungi | seems like they weren't actually configuring their network interfaces | 16:45 |
fungi | at least from the limited testing i was able to do | 16:46 |
*** dprince has quit IRC | 16:46 | |
*** dprince has joined #openstack-infra | 16:46 | |
fungi | they were booted to login prompts for several minutes but i got no ping response from the ip addresses reported for them by nova | 16:46 |
openstackgerrit | Monty Taylor proposed openstack-infra/project-config: Revert "Regenerate ssh host key on boot" https://review.openstack.org/166310 | 16:47 |
openstackgerrit | Monty Taylor proposed openstack-infra/project-config: Revert "Remove ssh host keys during image build" https://review.openstack.org/166311 | 16:47 |
fungi | can also check what nodepoold saw from those. if it really was an ssh host key problem we'd get connection closed. if lack of networking then connection timeout | 16:47 |
fungi | i'll go hunting in the logs | 16:47 |
openstackgerrit | Monty Taylor proposed openstack-infra/system-config: Revert cloud-init removal https://review.openstack.org/166312 | 16:50 |
*** david-lyle_ has joined #openstack-infra | 16:50 | |
mordred | I'm going to ninja those reverts above, unless there is opposition | 16:50 |
mordred | clarkb, fungi, jeblair, pleia2 ^^ | 16:50 |
jeblair | mordred: are you abandoning the effort? | 16:51 |
*** david-lyle_ has quit IRC | 16:51 | |
*** harlowja_away is now known as harlowja_ | 16:51 | |
mordred | jeblair: well, it didn't fix hp, and it broke rax - so I think regrouping and starting over is probably in order, yeah? | 16:52 |
*** Sukhdev has joined #openstack-infra | 16:52 | |
pleia2 | makes sense | 16:52 |
jeblair | mordred: (btw, if we are in a similar situation again, i believe we could mitigate the extremely slow hpcloud boot times by lowering max-servers for one of the providers) | 16:52 |
jeblair | mordred: yeah -- will we get the same effect eventually with the dib work? | 16:52 |
mordred | well - we're doing much more active and methodical testing with the dib work | 16:53 |
mordred | to understand what we need at boot time for realz | 16:53 |
*** sigmavirus24 is now known as sigmavirus24_awa | 16:53 | |
jeblair | mordred: so just roll "don't have cloud-init depend on metadata server but also make sure we get fresh host keys" into that? | 16:53 |
mordred | that removing cloud-init somehow broke rackspace which I thought didn't use it is mindboggling to me | 16:53 |
*** dustins has joined #openstack-infra | 16:54 | |
fungi | looks like what nodepoold logged was "Timeout waiting for server <UUID> in rax-xxx" so it never got as far as testing ssh | 16:54 |
mordred | jeblair: yah- I mean, I'd like to get an answer sooner - but worst case yes | 16:54 |
mordred | jeblair: because I want to understand what about that broke rackspace | 16:54 |
jeblair | mordred: okay sounds good | 16:54 |
mordred | since it defies my understanding of how the rackspace nodes work | 16:54 |
mordred | which isn't good :) | 16:54 |
jeblair | if you look at the node graph now, you can basically see what it's like when we delete all our instances at once. a steep decline from rax, and we've leveled out waiting on hpcloud | 16:55 |
*** dtantsur is now known as dtantsur|afk | 16:55 | |
jeblair | and now starting to build up in rax | 16:55 |
jeblair | is anyone deleting hpcloud aliens? if not, i'll start on that | 16:59 |
fungi | i had not started yet | 16:59 |
openstackgerrit | Merged openstack-infra/project-config: Make VPNaaS StrongSwan functional gate voting https://review.openstack.org/165392 | 16:59 |
fungi | my best guess is that something we did broke rax instances' ability to configure their network interface/routing and if the hypervisor can't ping the interface nova never reports it as ready? | 17:00 |
*** psedlak has joined #openstack-infra | 17:00 | |
rcarrillocruz | jeblair: nod, we've been plagued by those slow delete api calls... last thing we heard, the Neutron guys were looking at it | 17:00 |
clarkb | fungi: I am wondering if nova agent is tied to cloud init somehow | 17:01 |
jeblair | fungi: do you have an easy way to reconcile two alien lists? | 17:01 |
clarkb | jeblair: comm -12 | 17:01 |
*** baoli has quit IRC | 17:01 | |
clarkb | I learned this from sdague, its a neat little trick | 17:01 |
*** baoli has joined #openstack-infra | 17:01 | |
fungi | jeblair: you mean to diff them? i'll give you my script | 17:02 |
clarkb | I just use `comm -12 file1 file2` | 17:02 |
jeblair | clarkb: that's pretty cool | 17:02 |
*** baoli has quit IRC | 17:02 | |
fungi | jeblair: world's worst bash one-liner http://paste.openstack.org/show/193973/ | 17:02 |
*** wenlock has quit IRC | 17:02 | |
*** baoli has joined #openstack-infra | 17:02 | |
clarkb | jeblair: it does get cranky about unsorted inputs so I sort the files first usually | 17:02 |
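For reference, a minimal sketch of the `comm -12` approach clarkb describes, assuming each alien list is a plain text file with one server ID per line. The function name `common_lines` is made up for illustration and is not something from the channel.

```shell
# Print the lines that appear in both files.
# comm requires sorted input, so each file is sorted into a temporary copy first.
common_lines() {
    a_sorted=$(mktemp)
    b_sorted=$(mktemp)
    sort "$1" > "$a_sorted"
    sort "$2" > "$b_sorted"
    # -1 hides lines unique to the first file, -2 hides lines unique to the
    # second file, which leaves only the lines common to both.
    comm -12 "$a_sorted" "$b_sorted"
    rm -f "$a_sorted" "$b_sorted"
}
```

Called as `common_lines list1 list2`, this would print only the IDs present in both lists, which is the subset that is safe to treat as confirmed aliens.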
openstackgerrit | Merged openstack-infra/project-config: Remove check-tempest-dsvm-f20 https://review.openstack.org/165532 | 17:03 |
*** Ryan_Lane has joined #openstack-infra | 17:03 | |
fungi | clarkb: oh, neat. that would get rid of my hacky nested loops | 17:03 |
fungi | also i need to go get some lunch, but will be back shortly | 17:04 |
*** markus_z has quit IRC | 17:04 | |
clarkb | johnthetubaguy: any ideas on why purging cloud-init from our images would break our ability to have working networking on rax nodes? is nova agent piggy backing off of something cloud init does? | 17:04 |
*** sarob has joined #openstack-infra | 17:04 | |
SpamapS | jeblair: Are you aware of anybody who has successfully used gear w/ eventlet? | 17:05 |
* SpamapS needs to get back to real work.. has fallen down a gearman hole lately | 17:05 | |
openstackgerrit | Monty Taylor proposed openstack-infra/project-config: Disable metadata in cloud-init config https://review.openstack.org/166318 | 17:06 |
mordred | jeblair: ok - rather than the full remove cloud-init - I would like to modify it ^^ | 17:06 |
jeblair | SpamapS: i think there may be an unmerged patch to that effect in gear's review queue | 17:06 |
mordred | SpamapS: ^^ can you check me that that's not insane? | 17:06 |
mordred | clarkb: ^^ I think that is more inline with what you were saying yesterday | 17:06 |
*** Somay has joined #openstack-infra | 17:06 | |
SpamapS | jeblair: in my experimenting with gear as an oslo.messaging driver.. it's not working. | 17:06 |
fungi | jeblair: sdague: oh, as for yesterday's heat functional discussion, https://review.openstack.org/166030 seems to have gotten the job back to an hour consistently | 17:07 |
jeblair | fungi: yay | 17:07 |
SpamapS | And I've spent way more time than I ever should have on this, so I think it's time to WIP it and circle back later. :-/ | 17:07 |
clarkb | mordred: ya I was also suggesting we use config drive but I don't think that's necessary for the short term | 17:07 |
jeblair | SpamapS: 97533 | 17:07 |
clarkb | fungi: awesome | 17:07 |
*** ociuhandu has joined #openstack-infra | 17:07 | |
anteaya | fungi jeblair do you think we can change the timeouts for the heat job then? | 17:08 |
mordred | clarkb: yes - I don't think we need the config drive part | 17:08 |
jeblair | SpamapS: but yeah, i support you in not rat-holing on it :) | 17:08 |
clarkb | mordred: agreed | 17:08 |
openstackgerrit | Merged openstack-infra/project-config: Adds compute-hyperv in StackForge https://review.openstack.org/165611 | 17:08 |
mordred | clarkb: but I think we can satisfy the intent with that patch | 17:08 |
SpamapS | jeblair: ah yes just found that. Well that is exactly what I ran into. | 17:08 |
clarkb | mordred: but if config drive were enabled it should also just work | 17:08 |
clarkb | assuming new enough cloud-init | 17:08 |
mordred | clarkb: yes | 17:08 |
SpamapS | jeblair: oh you should have hid this from me. Now it will be calling to me from the bottom of the rat hole. ;) | 17:08 |
clarkb | so its win win | 17:08 |
mordred | it turns out rackspace IS using cloud-init for something in addition to nova-agent | 17:09 |
clarkb | mordred: do you know what that is? | 17:09 |
mordred | as evidenced by the existence of /etc/cloud/cloud.cfg.d/10_rackspace.cfg | 17:09 |
mordred | so - I think they are using it for many of the things | 17:09 |
mordred | just not the things that nova-agent is doing | 17:09 |
openstackgerrit | Merged openstack-infra/os-loganalyze: fix supports_sev matching https://review.openstack.org/165542 | 17:09 |
mordred | *boggles* | 17:09 |
*** tnovacik has joined #openstack-infra | 17:09 | |
SpamapS | mordred: sanity checked | 17:10 |
openstackgerrit | Merged openstack-infra/project-config: new-project: stackforge/python-senlinclient https://review.openstack.org/164963 | 17:10 |
clarkb | looking at this java code I think that there is zero reason to synchronize that method. I wish java devs wouldn't default to doing that its a horrible practice. Instead we need to synchronize around the connection and session objects which are not class level but object level | 17:10 |
*** garyh has joined #openstack-infra | 17:10 | |
fungi | anteaya: probably if they're waay higher than an hour | 17:10 |
openstackgerrit | Merged openstack-infra/os-loganalyze: let tests be run from test file location https://review.openstack.org/165799 | 17:10 |
fungi | also, this seems to have worked yesterday... http://lists.openstack.org/pipermail/foundation-board/2015-March/thread.html | 17:10 |
mordred | fungi: woot! | 17:11 |
clarkb | fungi: anteaya they are currently set to 2 hours, I think we should reduce to 90 minutes or so | 17:11 |
mordred | clarkb, fungi: mind if I push through the reverts and the new attempt at cloud-init and kick another hpcloud image rebuild? | 17:11 |
openstackgerrit | Merged openstack-infra/os-loganalyze: extract static methods https://review.openstack.org/165850 | 17:11 |
*** e0ne has quit IRC | 17:11 | |
clarkb | mordred: if you are around today to babysit fine by me :) | 17:12 |
*** wenlock has joined #openstack-infra | 17:12 | |
mordred | k. I'm going to run to get the bag of coffee beans right now - if there are no objections when I get back, I will do that next | 17:13 |
clarkb | mordred: you will need to free disk space again | 17:13 |
anteaya | clarkb: just found the patch so will offer something around 90 minutes noting that 60 minutes would be the ideal | 17:13 |
mordred | clarkb: that's so exciting | 17:13 |
*** psedlak is now known as psedlak^afk | 17:13 | |
clarkb | mordred: you can `sudo -H -u nodepool dib-image-delete $imageid` | 17:13 |
anteaya | fungi: k | 17:13 |
fungi | mordred: lgtm though i didn't vet the cloud-init config syntax i'm assuming SpamapS did | 17:14 |
clarkb | mordred: where imageid is the id for the older of the devstack-precise-dib devstack-trusty-dib and devstack-centos7-dib images | 17:14 |
fungi | okay, really going to lunch now. bbiaw | 17:14 |
*** Sukhdev has quit IRC | 17:15 | |
mordred | fungi: fwiw, I copied that content directly from the rackspace nodes | 17:15 |
clarkb | mordred: https://issues.jenkins-ci.org/browse/JENKINS-27514 and https://review.openstack.org/#/c/158891/ should allow us to properly remove those images until we need them for rax | 17:15 |
SpamapS | fungi: it's yaml, it can't be wrong. ;) | 17:15 |
*** achanda has joined #openstack-infra | 17:15 | |
*** notnownikki has quit IRC | 17:16 | |
*** dboik_ has joined #openstack-infra | 17:16 | |
*** psedlak^afk is now known as psedlak | 17:16 | |
clarkb | mordred: there is a comment in those files about dpkg reconfiguring | 17:16 |
clarkb | mordred: is that going to be something you need to do or something that will override your changes if it happens? | 17:16 |
*** psedlak is now known as psedlak^afk | 17:16 | |
mordred | SpamapS: ^^ ? | 17:17 |
*** psedlak^afk is now known as psedlak | 17:17 | |
mordred | clarkb: I don't know - i've never used cloud-init successfully | 17:17 |
SpamapS | ugh | 17:17 |
*** AJaeger has joined #openstack-infra | 17:17 | |
SpamapS | I think you might have to put the answer in debconf, let me check | 17:18 |
clarkb | I am pretty sure that this is the real reason people use docker | 17:18 |
mordred | yup | 17:18 |
clarkb | not the packaging or the potential security | 17:18 |
mordred | yup | 17:18 |
clarkb | but the "I just want this damn process to run" functionality | 17:18 |
mordred | yup | 17:18 |
mordred | because everything else has lost track of that being the use case people are trying to solve 95% of the time | 17:19 |
AJaeger | sdague, some of the requirements we have in openstack-manuals are unique and we could remove them. Note that trove also uses the docbook XML toolchain and thus needs openstack-doc-tools. So, what about the following: | 17:19 |
SpamapS | holllyyy crap | 17:19 |
* SpamapS did not need to see cloud-init's postinst today | 17:19 | |
SpamapS | don't look at it | 17:19 |
SpamapS | face melting | 17:19 |
clarkb | this should be simple, but after diving into upstart sysv compat on ubuntu I no longer assume anything about how running processes should be simple at boot | 17:19 |
openstackgerrit | Anita Kuno proposed openstack-infra/project-config: Reduce timeout for heat functional job https://review.openstack.org/166320 | 17:19 |
AJaeger | sdague, allow in projects.txt in requirements "soft" projects where we do not require all requirements - and set that flag for the doc projects. And then remove their unique requirements? | 17:19 |
*** jistr has quit IRC | 17:19 | |
*** dboik has quit IRC | 17:20 | |
*** pblaho has quit IRC | 17:20 | |
SpamapS | ok yeah | 17:20 |
*** psedlak has quit IRC | 17:20 | |
SpamapS | mordred: so clarkb is right in being concerned | 17:20 |
* anteaya is not a fan of swearing in channel | 17:20 | |
SpamapS | the debconf value cloud-init/datasources will be injected there | 17:21 |
clarkb | anteaya: sorry | 17:21 |
*** garyh has quit IRC | 17:21 | |
clarkb | SpamapS: and that will happen only if dpkg-reconfigure is called right? so if the package is updated? | 17:21 |
anteaya | clarkb: np, thanks | 17:21 |
clarkb | we can probably get away with the change as is | 17:21 |
clarkb | but it may also lead to weirdness down the road if we aren't carefuk | 17:21 |
*** openstackgerrit has quit IRC | 17:21 | |
clarkb | *careful | 17:21 |
SpamapS | clarkb: updates will cause it yes | 17:21 |
*** openstackgerrit has joined #openstack-infra | 17:22 | |
openstackgerrit | Somay Jain proposed openstack-infra/jenkins-job-builder: Adding more configurable options in Notifications plugin https://review.openstack.org/163137 | 17:22 |
*** kgiusti has quit IRC | 17:22 | |
SpamapS | you can make a file, 91_reallydatasources.cfg | 17:22 |
*** dmorita has joined #openstack-infra | 17:22 | |
SpamapS | cloud-init reads them in order and will do a __dict__.update() using the new one | 17:22 |
SpamapS | so that might be the safest way | 17:23 |
SpamapS | mordred: ^ | 17:23 |
clarkb | SpamapS: that sounds simple and reliable, I like it | 17:23 |
SpamapS | or echo "cloud-init cloud-init/datasources Configdrive,None" | debconf-set-selections | 17:24 |
SpamapS | but really, debconf, DIAF. :-P | 17:25 |
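A rough sketch of the drop-in file SpamapS suggests, assuming the goal is to restrict cloud-init to the config drive and `None` datasources, as in the debconf line above. `datasource_list` is the key that the packaged `90_dpkg.cfg` sets on Debian and Ubuntu; the function name `write_datasource_override` and its directory argument are illustrative only.

```shell
# Write a cloud-init drop-in that sorts after 90_dpkg.cfg, so its
# datasource_list takes precedence even if dpkg-reconfigure rewrites 90_dpkg.cfg.
write_datasource_override() {
    target_dir=$1
    mkdir -p "$target_dir"
    cat > "$target_dir/91_reallydatasources.cfg" <<'EOF'
# Assumed content: only try the config drive, then fall back to None.
datasource_list: [ ConfigDrive, None ]
EOF
}
```

On a real image this would presumably be run as root with `/etc/cloud/cloud.cfg.d` as the argument. Separately, `debconf-set-selections` normally expects a question type (likely `multiselect` here) between the question name and the value, so the one-liner above is worth checking against the package's templates before relying on it.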
*** Bsony_ has quit IRC | 17:25 | |
*** mjturek1 has quit IRC | 17:26 | |
*** gyee has joined #openstack-infra | 17:27 | |
*** gampel has joined #openstack-infra | 17:29 | |
cinerama | hey pleia2 | 17:30 |
openstackgerrit | Anita Kuno proposed openstack-infra/project-config: Reduce timeout for heat functional job https://review.openstack.org/166320 | 17:30 |
*** koolhead17 has joined #openstack-infra | 17:31 | |
*** pc_m has quit IRC | 17:31 | |
*** armax has quit IRC | 17:32 | |
morganfainberg | lbragstad, https://bugs.launchpad.net/keystone/+bug/1433311 is not wishlist, this is higher prio | 17:33 |
openstack | Launchpad bug 1433311 in Keystone "Fernet tokens current don't support token bind" [Medium,Triaged] | 17:33 |
*** tsg has joined #openstack-infra | 17:33 | |
morganfainberg | whoopse wrong channel | 17:33 |
*** pelix has joined #openstack-infra | 17:35 | |
jeblair | clarkb, fungi, mordred: hpcloud alien deletes are running | 17:36 |
*** sputnik13 has joined #openstack-infra | 17:37 | |
*** koolhead17 has quit IRC | 17:37 | |
clarkb | jeblair: cool, do you want me to kick off a floating ip cleanup too? I can also start the leaked port deletion script | 17:37 |
*** pelix has quit IRC | 17:38 | |
clarkb | jeblair: or I can give you my one liner for FIPs if you want to run it | 17:38 |
anteaya | morganfainberg: I was going to say | 17:38 |
*** ivar-lazzaro has joined #openstack-infra | 17:38 | |
jeblair | clarkb: why don't you kick it off? but i'm guessing it won't have much to do | 17:38 |
clarkb | ok | 17:38 |
jeblair | clarkb: i think most of these happened before we got to the fip state | 17:38 |
jeblair | stage | 17:38 |
morganfainberg | anteaya, yeah i know :P | 17:38 |
*** arxcruz has quit IRC | 17:38 | |
morganfainberg | anteaya, tooooooo many irc channels | 17:38 |
*** pelix has joined #openstack-infra | 17:39 | |
anteaya | I've never seen lbragstad say anything in this channel | 17:39 |
anteaya | I'd make fun of his hat if he did | 17:39 |
clarkb | jeblair: `venv/bin/neutron floatingip-list | grep -v '10\.0\.' | sed -e '1,3d' -e '$d' | cut -d'|' -f 2 | xargs -n 1 -P 1 venv/bin/neutron floatingip-delete` is the one liner fwiw | 17:39 |
anteaya | morganfainberg: I've been waiting for the opportunity | 17:39 |
clarkb | and its done, only 4 to delete | 17:39 |
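To make the one-liner easier to follow, here is a sketch of just its filtering stage run against a made-up table. The sample rows, IDs, and the helper names `sample_fip_table` and `unattached_fip_ids` are hypothetical; the real input would be the output of `neutron floatingip-list`. The `tr -d ' '` step is an addition for readability, since the original relies on `xargs` to discard the padding.

```shell
# Hypothetical stand-in for `neutron floatingip-list` output.
sample_fip_table() {
    cat <<'EOF'
+------+------------------+---------------------+---------+
| id   | fixed_ip_address | floating_ip_address | port_id |
+------+------------------+---------------------+---------+
| aaa1 | 10.0.0.5         | 15.0.0.1            | p1      |
| bbb2 |                  | 15.0.0.2            |         |
| ccc3 |                  | 15.0.0.3            |         |
+------+------------------+---------------------+---------+
EOF
}

# Read a table on stdin and print the IDs of floating IPs that have no
# 10.0.x fixed address, i.e. are not attached to an instance.
unattached_fip_ids() {
    grep -v '10\.0\.' |          # drop rows that still map to a fixed IP
        sed -e '1,3d' -e '$d' |  # drop the header block and the closing border
        cut -d'|' -f 2 |         # keep the id column
        tr -d ' '                # strip the column padding
}
```

Running `sample_fip_table | unattached_fip_ids` prints `bbb2` and `ccc3`, one per line; in the original pipeline those IDs are then handed one at a time to `neutron floatingip-delete`.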
*** ivar-lazzaro has quit IRC | 17:39 | |
morganfainberg | anteaya, ++ yes! | 17:40 |
*** ivar-lazzaro has joined #openstack-infra | 17:40 | |
*** aysyd has quit IRC | 17:42 | |
anteaya | morganfainberg: I _know_ he is wearing it | 17:42 |
morganfainberg | anteaya, i'm sure he is! | 17:42 |
*** otter768 has joined #openstack-infra | 17:45 | |
openstackgerrit | Merged openstack/requirements: Update gabbi to 0.12.0 https://review.openstack.org/156253 | 17:45 |
*** ghostpl_ has quit IRC | 17:45 | |
*** aysyd has joined #openstack-infra | 17:46 | |
*** dmorita has quit IRC | 17:47 | |
*** fandi has joined #openstack-infra | 17:47 | |
*** fandi has quit IRC | 17:47 | |
*** sabeen1 has joined #openstack-infra | 17:47 | |
*** dmorita has joined #openstack-infra | 17:48 | |
*** ayoung has joined #openstack-infra | 17:49 | |
*** otter768 has quit IRC | 17:50 | |
*** VijayTripathi has joined #openstack-infra | 17:50 | |
*** ghostpl_ has joined #openstack-infra | 17:50 | |
openstackgerrit | Merged openstack-infra/project-config: Drop ironic tempest regex, stop running all of Tempest https://review.openstack.org/161420 | 17:52 |
*** coolsvap_ is now known as coolsvap|afk | 17:52 | |
openstackgerrit | Monty Taylor proposed openstack-infra/project-config: Disable metadata in cloud-init config https://review.openstack.org/166318 | 17:53 |
mordred | SpamapS, clarkb: ^^ there - that also tells debconf | 17:53 |
*** dmorita has quit IRC | 17:54 | |
*** andreykurilin_ has quit IRC | 17:54 | |
jeblair | clarkb, fungi, mordred: with nodepool running on hpcloud, i can only do about two alien delete processes in parallel | 17:55 |
clarkb | mordred: you prefer that over SpamapS' suggestion? | 17:55 |
openstackgerrit | Merged openstack-infra/project-config: Add a python-ironicclient src job https://review.openstack.org/163632 | 17:55 |
*** mpaolino has quit IRC | 17:55 | |
jeblair | clarkb, fungi, mordred: 2 pushes create time up to 45 seconds / request, 3 occasionally pushes it over 60 seconds which is what we have the api timeout set to | 17:55 |
jeblair | (more than 3 regularly pushes it over that limit) | 17:56 |
*** patrickeast has joined #openstack-infra | 17:56 | |
jeblair | so it's going to take a really long time to run | 17:56 |
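One way to cap the number of concurrent delete processes, sketched under the assumption that the IDs arrive one per line on stdin. The log does not show the actual delete command jeblair used, so the usage below substitutes `echo`; `run_limited` is a hypothetical helper name. `-P` is supported by GNU and BSD `xargs` but is not part of POSIX.

```shell
# Run a command once per input line, with at most $1 instances at a time.
# The remaining arguments are the command; each input line is appended to it.
run_limited() {
    max_parallel=$1
    shift
    xargs -n 1 -P "$max_parallel" "$@"
}
```

For example, `printf 'a\nb\nc\n' | run_limited 2 echo would-delete` runs at most two commands at once, matching the limit of about two parallel deletes described above. Output order is not guaranteed when more than one process runs at a time.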
*** mrmartin has joined #openstack-infra | 17:56 | |
openstackgerrit | Merged openstack-infra/project-config: Turn on oslo.messaging coverage report https://review.openstack.org/164022 | 17:56 |
mordred | clarkb: that was one of his suggestions | 17:56 |
mordred | clarkb: that will ensure that if it gets re-run, the file will remain set | 17:57 |
mordred | clarkb: the other thing seems really confusing to me | 17:57 |
*** dmorita has joined #openstack-infra | 17:57 | |
mordred | jeblair: yoy | 17:57 |
clarkb | mordred: ok | 17:57 |
*** Bsony has joined #openstack-infra | 17:58 | |
openstackgerrit | Merged openstack/requirements: Bump keystonemiddleware requirement https://review.openstack.org/164573 | 17:59 |
clarkb | mordred: lgtm | 17:59 |
clarkb | I am going to pop out now for early lunch. Back in a bit | 18:00 |
*** _nadya_ has quit IRC | 18:00 | |
openstackgerrit | Merged openstack-infra/system-config: Revert cloud-init removal https://review.openstack.org/166312 | 18:01 |
openstackgerrit | Merged openstack/requirements: Bump requests-mock version https://review.openstack.org/162493 | 18:01 |
openstackgerrit | Merged openstack/requirements: Update pip and pip-missing-reqs https://review.openstack.org/159293 | 18:01 |
*** armax has joined #openstack-infra | 18:01 | |
*** shardy has quit IRC | 18:01 | |
mordred | SpamapS: dare I even ask why the file is called 90_dpkg ? | 18:02 |
mordred | SpamapS: I mean, it has nothing to do with configuring dpkg - it's a setting that configures data sources | 18:02 |
*** sdake_ has joined #openstack-infra | 18:03 | |
mordred | oh - sod it - I need to do a different patch on rh systems don't I? | 18:03 |
johnthetubaguy | mordred: I think they did something evil inside cloud init to stop it racing with the agent… | 18:03 |
mordred | johnthetubaguy: all to avoid running dhcp | 18:03 |
mordred | johnthetubaguy: the mind boggles | 18:03 |
mordred | at the amount of effort that has been expended to chat that | 18:03 |
mordred | chase | 18:04 |
mordred | not chat | 18:04 |
johnthetubaguy | mordred: basically, the agent needs to setup network, before cloud-init starts if I remember | 18:04 |
mordred | yup | 18:04 |
mordred | I've looked through the init-script hacks for that | 18:04 |
johnthetubaguy | ah, OK, thats the bit I knew about | 18:05 |
*** e0ne has joined #openstack-infra | 18:05 | |
*** Swami has joined #openstack-infra | 18:05 | |
mordred | clarkb, SpamapS: new version of that patch coming - I didn't think to test for ubuntu first | 18:05 |
johnthetubaguy | so I assume you don't have an image metadata tag telling nova not to talk to the agent on your image, but thats another thing that can stop that working | 18:05 |
mordred | johnthetubaguy: well, we deleted cloud-init earlier | 18:05 |
mordred | because we bake keys into the images | 18:06 |
johnthetubaguy | mordred: a good way to test the agents OK is to changepassword, after rebooting your VM, after removing cloud-init from it, if that works? | 18:06 |
mordred | but it turns out that breaks something else on rackspace | 18:06 |
*** Sukhdev has joined #openstack-infra | 18:06 | |
mordred | something related to networking | 18:06 |
*** sdake has quit IRC | 18:06 | |
*** mrmartin has quit IRC | 18:06 | |
mordred | which surprised me - because I was expecting ... OH! I think I know | 18:06 |
johnthetubaguy | mordred: hmm, we certainly don't use cloud-init for anything critical, I can only think of the init hack for them being linked | 18:06 |
clarkb | mordred maybe do spamaps thing as it should be distro agnostic? | 18:07 |
mordred | clarkb: sigh. ok. I want to go on record as saying it makes me angry, fwiw | 18:07 |
johnthetubaguy | I am curious, why do you need to remove cloud-init? | 18:08 |
mordred | johnthetubaguy: we don't need it | 18:09 |
johnthetubaguy | hmm, OK | 18:09 |
mordred | johnthetubaguy: but we were going for the easy way to stop hammering the hp cloud metadata service | 18:09 |
mordred | johnthetubaguy: all of our nodes boot from images we build | 18:09 |
johnthetubaguy | ah, that makes more sense, gotcha | 18:09 |
mordred | johnthetubaguy: we're currently working on a project which is "make an image that can boot on both rackspace and hp that contains neither nova-agent nor cloud-init" | 18:10 |
johnthetubaguy | mordred: eek, gotcha | 18:10 |
johnthetubaguy | a worth aim | 18:10 |
mordred | johnthetubaguy: I'd give in and use cloud-init if we didn't have to patch cloud-init to get networking info on rax | 18:10 |
johnthetubaguy | s/worth/worthy/ | 18:10 |
mordred | but we do - so it's also a pita | 18:10 |
*** derekh has quit IRC | 18:11 | |
johnthetubaguy | mordred: I didn't think you should have to do that patch though, the regular info should have been there two, sounds like a bug | 18:11 |
johnthetubaguy | s/two/too/ | 18:11 |
mordred | it's not - the patch hasn't landed upstream | 18:11 |
mordred | to pass neutron IP info through to config-drive | 18:11 |
mordred | rax is VERY THANKFULLY deploying the same info currently into a vendor extension (thank you thank you) | 18:11 |
mordred | but until it lands upstream, the patch to consume from cloud-init isn't even proposed to cloud-init | 18:12 |
mordred | and cloud-init upstream is currently doing a 2.0 rewrite anyway | 18:12 |
johnthetubaguy | mordred: yeah, I am thinking there was a way if you set flat_injected=True, I thought on XenServer we had a hack that did that injection into the old location, but I never got chance to test that in production yet | 18:12 |
johnthetubaguy | mordred: ah, interest | 18:12 |
johnthetubaguy | afraid I have to run off now | 18:13 |
johnthetubaguy | it's getting dark in the UK, and I have an extra tuba rehearsal tonight this week | 18:13 |
openstackgerrit | Monty Taylor proposed openstack-infra/project-config: Disable metadata in cloud-init config https://review.openstack.org/166318 | 18:13 |
mordred | johnthetubaguy: have fun at rehearsal! | 18:13 |
anteaya | johnthetubaguy: you are after all, the tuba guy | 18:13 |
mordred | SpamapS: ^^ can you sanity check that for me please? | 18:13 |
johnthetubaguy | :) | 18:13 |
*** pc_m has joined #openstack-infra | 18:14 | |
*** gampel has quit IRC | 18:15 | |
jeblair | mordred, fungi, clarkb: my plan today is to tend to the slow alien delete process, but otherwise avoid any nodepool changes, and take it easy this afternoon and write up more zuulv3 specs so i'm not burned out for our maint tomorrow | 18:16 |
anteaya | jeblair: is there anything I can do to help? I'm holding off reviewing/approving stuff as I don't want to tax the few workers we have | 18:17 |
jeblair | anteaya: i would not worry about that. approve at will; it'll get through it eventually. | 18:18 |
clarkb | jeblair sounds good, should mordred avoid the cloud init change then? | 18:18 |
anteaya | okay | 18:18 |
*** tkelsey has joined #openstack-infra | 18:21 | |
*** garyh has joined #openstack-infra | 18:22 | |
fungi | jeblair: that sounds like a great plan | 18:23 |
mordred | clarkb, jeblair: the cloud-init change shouldn't affect the other api stuff much | 18:23 |
*** kgiusti has joined #openstack-infra | 18:23 | |
*** dboik_ has quit IRC | 18:23 | |
*** dboik has joined #openstack-infra | 18:24 | |
fungi | jeblair: as for alien deletes, i usually just do them entirely serially unless the quantity is enormous (like the ~500 we needed to delete yesterday) | 18:25 |
*** johnthetubaguy is now known as zz_johnthetubagu | 18:25 | |
*** ghostpl_ has quit IRC | 18:25 | |
*** ghostpl_ has joined #openstack-infra | 18:27 | |
anteaya | what is z/tempest? it is in zuul/layout.yaml but I don't know what repo it corresponds to: http://git.openstack.org/cgit/openstack-infra/project-config/tree/zuul/layout.yaml#n2826 | 18:29 |
*** dboik has quit IRC | 18:30 | |
anteaya | I don't even know where else to look to find out | 18:31 |
*** dboik has joined #openstack-infra | 18:31 | |
pleia2 | cinerama: so yeah, saw StevenK's patch and updated the topic on it this morning so it shows up in reviews with all the other zanata patches, thanks for reviewing, I'll have a look in a bit | 18:31 |
* anteaya nips out to get more sap, back in a minute | 18:32 | |
*** garyh has quit IRC | 18:32 | |
*** MarkAtwood has joined #openstack-infra | 18:33 | |
cinerama | pleia2: kool | 18:33 |
*** MrAboii has quit IRC | 18:34 | |
*** arxcruz has joined #openstack-infra | 18:35 | |
*** crc32 has joined #openstack-infra | 18:35 | |
fungi | anteaya: it's a dummy project used to set up a transitive co-gating relationship between actual projects | 18:35 |
anteaya | fungi: ah ha | 18:36 |
fungi | anteaya: it's basically abusing zuul's queue sharing algorithm to establish an equivalency between multiple jobs in case a project runs one of those but not the others | 18:37 |
anteaya | fungi: I found it because I am reviewing https://review.openstack.org/#/c/165648/1 | 18:38 |
*** edwarnicke has quit IRC | 18:38 | |
anteaya | which reduces the prevalence of neutron-large-ops on projects | 18:38 |
*** sweston has quit IRC | 18:38 | |
*** ujuc has quit IRC | 18:38 | |
nibalizer | mordred: so for using 1 cert for everyone with puppet apply i think we have to set this | 18:39 |
nibalizer | https://docs.puppetlabs.com/references/latest/configuration.html#nodename | 18:39 |
anteaya | should the z/tempest neutron-large-ops job also be scaled back? | 18:39 |
nibalizer | then we can set certname to everyonecert.lol.openstack.org | 18:39 |
*** dougwig has quit IRC | 18:39 | |
*** erw has quit IRC | 18:39 | |
nibalizer | thanks to Hunner for that one | 18:39 |
mordred | nibalizer: nod | 18:39 |
Hunner | and just whitelist that cert at the puppetdb | 18:40 |
Hunner | Puppetdb won't care about the cert that is auth'd, only the contents of the payload | 18:40 |
fungi | anteaya: probably not since we likely still want to make sure projects which do continue to run that job co-gate with others which don't | 18:40 |
anteaya | very good, thanks | 18:41 |
*** Somay has quit IRC | 18:41 | |
anteaya | I understood about 20% of what you told me about that dummy project but hopefully I can get a better visualization of it at some point | 18:41 |
openstackgerrit | Jeremy Stanley proposed openstack-infra/system-config: Move security.openstack.org to HTTPS https://review.openstack.org/155099 | 18:46 |
*** dmorita has quit IRC | 18:46 | |
*** spzala has quit IRC | 18:46 | |
*** VijayTripathi1 has joined #openstack-infra | 18:48 | |
*** VijayTripathi has quit IRC | 18:51 | |
*** pradk has joined #openstack-infra | 18:55 | |
*** pradk has quit IRC | 18:55 | |
fungi | clarkb: you've flown through vancouver inbound to the usa before right? i'm looking at flying back directly and trying to figure out if i need to leave buffer time in my first layover for customs or if they really do us customs when boarding in vancouver and then treat the connections as usa domestic... | 18:55 |
*** prad has quit IRC | 18:56 | |
*** e0ne has quit IRC | 18:57 | |
*** mjturek1 has joined #openstack-infra | 18:58 | |
*** e0ne has joined #openstack-infra | 18:58 | |
*** prad has joined #openstack-infra | 18:59 | |
*** mjturek1 has left #openstack-infra | 19:00 | |
*** ssam2 has quit IRC | 19:00 | |
clarkb | they do customs in canada for us departures | 19:01 |
clarkb | so add time for that | 19:01 |
clarkb | yvr was pretty quick about it though | 19:02 |
fungi | good to know, and conversely i can scale back on my first layover since i won't need to claim and re-check my luggage | 19:02 |
*** emagana has quit IRC | 19:02 | |
*** tkelsey has quit IRC | 19:03 | |
sdague | oh, man - https://review.openstack.org/#/c/125944/ - fungi / clarkb either of you want to put the final +2 on that one? That would make for an awesome Friday | 19:03 |
*** emagana has joined #openstack-infra | 19:04 | |
fungi | as for the working-group-on-a-train idea, amtrak apparently has three coach options for the portland->vancouver run ranging from us$48-114... does it matter which coach ticket i get? | 19:04 |
openstackgerrit | Merged openstack-infra/project-config: Disable metadata in cloud-init config https://review.openstack.org/166318 | 19:04 |
openstackgerrit | Merged openstack-infra/project-config: Revert "Remove ssh host keys during image build" https://review.openstack.org/166311 | 19:04 |
openstackgerrit | Merged openstack-infra/project-config: Revert "Regenerate ssh host key on boot" https://review.openstack.org/166310 | 19:04 |
*** tqtran is now known as tqtran_afk | 19:04 | |
*** [HeOS] has joined #openstack-infra | 19:04 | |
cdent | aw, fungi, I'm jealous, the train ride from portland to vancouver is beautiful | 19:05 |
fungi | cdent: join us! do some openstacking on a train | 19:05 |
*** rlucio has joined #openstack-infra | 19:06 | |
jroll | oh, that would be cool | 19:06 |
cdent | too late, got plane tickets already, in vancouver, out seattle, taking the train south afterwards | 19:06 |
*** achanda has quit IRC | 19:07 | |
*** nilasae|afk has quit IRC | 19:07 | |
*** armax has quit IRC | 19:08 | |
*** tjones1 has quit IRC | 19:08 | |
*** e0ne has quit IRC | 19:08 | |
*** EmilienM|afk is now known as EmilienM | 19:10 | |
*** e0ne has joined #openstack-infra | 19:10 | |
*** sdake has joined #openstack-infra | 19:11 | |
fungi | sdague: as awesome as it is, browsers are going to choke on it. see comment | 19:13 |
*** dimtruck is now known as zz_dimtruck | 19:14 | |
AJaeger | sdague: did you see my question above? | 19:14 |
anteaya | flashgordon: I'm currently reviewing: https://review.openstack.org/#/c/165652/1 where do I see the list of projects grenade does test? | 19:14 |
*** ghostpl_ has quit IRC | 19:14 | |
sdague | fungi: gotcha | 19:15 |
cinerama | neat. so looks like sjc to yvr via train takes a couple days but you have to stop overnight in seattle the way it's calculating it | 19:15 |
*** sdake_ has quit IRC | 19:15 | |
clarkb | fungi amtrak assigns seats as you check in | 19:15 |
sdague | AJaeger: only barely, I'm wrapping up one last thing then calling it a week | 19:15 |
clarkb | usually its easy to at least take over the food/observation cars | 19:15 |
fungi | sdague: e.g. https://review.openstack.org won't be allowed to embed http://zuul.openstack.org json content for security reasons | 19:16 |
anteaya | flashgordon: just if it is listed here with an upgrade-<project> file? http://git.openstack.org/cgit/openstack-dev/grenade/tree/ | 19:16 |
sdague | fungi: yeh... hmmm... so I definitely had this working before | 19:16 |
fungi | clarkb: open lounge areas. got it | 19:16 |
sdague | is there a new change there? | 19:16 |
cinerama | no assigned seating on capitol corridor when i've ridden but that may be different | 19:16 |
*** ghostpl_ has joined #openstack-infra | 19:17 | |
fungi | sdague: were you doing it with overrides in the javascript console? that might make a difference | 19:17 |
AJaeger | sdague: ok, I'll followup via email - enjoy the weekend | 19:17 |
sdague | fungi: I was injecting it directly in the js console | 19:18 |
sdague | so that could be | 19:18 |
fungi | sdague: at least i know when it's come up in the past, having javascript in an https-served page call an http url to retrieve data has caused browser security warnings/errors | 19:18 |
sdague | yeh, I can believe that | 19:19 |
fungi | sdague: though if the javascript is being provided via the js debug console instead of the site, that may not be spotted | 19:19 |
*** e0ne has quit IRC | 19:19 | |
sdague | yep, good call | 19:19 |
fungi | sdague: well, not so much a call, as this use case is precisely why i added the changes to get zuul also serving its status data via https | 19:20 |
*** spzala has joined #openstack-infra | 19:20 | |
sdague | well, I will be excited when it shows up. | 19:20 |
fungi | because it came up before in discussion as a prerequisite for the status embedding we wanted | 19:21 |
anteaya | flashgordon: and grenade doesn't seem to be testing devstack-gate (which is good since I don't know what it would test) so perhaps we should remove it from there as well | 19:21 |
openstackgerrit | Merged openstack-infra/project-config: Split grenade out of integrated-gate template https://review.openstack.org/165651 | 19:21 |
*** achanda has joined #openstack-infra | 19:21 | |
*** pelix has quit IRC | 19:22 | |
anteaya | zaro: do you have all the code merged that needs to be merged for tomorrow? | 19:22 |
*** tjones1 has joined #openstack-infra | 19:25 | |
*** tjones1 has left #openstack-infra | 19:25 | |
clarkb | anteaya should be this is just moving to trusty | 19:26 |
mordred | SpamapS: back away. back slowly away | 19:26 |
*** MarkAtwood has quit IRC | 19:26 | |
mordred | SpamapS: (oh, I was scrolled back ... that was a response to a long time ago) | 19:26 |
clarkb | and I think I reviewed and got all those trusty related changes merged. good to double check though | 19:27 |
anteaya | clarkb: awesome | 19:27 |
*** MarkAtwood has joined #openstack-infra | 19:28 | |
anteaya | I recall the js minifier patch was about to look for that, that needs to be in for tomorrow, just wondering if something else got discovered that I missed | 19:28 |
anteaya | patch, I was | 19:28 |
anteaya | there it is, I have reviewed, it has yet to be approved: https://review.openstack.org/#/c/165145/ | 19:29 |
*** hashar has joined #openstack-infra | 19:31 | |
*** sarob has quit IRC | 19:31 | |
clarkb | oh thats a new one. will review after eating blts | 19:31 |
*** MarkAtwood has quit IRC | 19:32 | |
cinerama | oh pleia2 when you get a chance, the spec mentions ansible playbooks - if you have the location i wouldn't mind taking a look to see if there's stuff we missed in the modules | 19:32 |
mordred | clarkb: how many blts are you going to eat? | 19:32 |
*** garyh has joined #openstack-infra | 19:33 | |
*** zz_dimtruck is now known as dimtruck | 19:33 | |
*** hashar is now known as hasharConfcall | 19:34 | |
fungi | all teh blts | 19:36 |
*** tiswanso has joined #openstack-infra | 19:36 | |
*** yamahata has quit IRC | 19:36 | |
openstackgerrit | Merged openstack-infra/project-config: Update forge-upload job to use tags https://review.openstack.org/164016 | 19:37 |
*** yamahata has joined #openstack-infra | 19:37 | |
fungi | clarkb: when you have a moment between blts #4 and #5, another portland travel logistics question... is 2.5 hours from landing at pdx to amtrak departure from union station easy enough to accomplish via public transit? | 19:37 |
pleia2 | cinerama: I'll forward them to you (they weren't strictly open sourced, just grabbed and sanitized by Red Hat IT and shared with me) | 19:38 |
greghaynes | mordred: Made a couple fixes to your https://review.openstack.org/#/c/165792/ in case you didnt see | 19:38 |
cinerama | pleia2: oh cool thanks | 19:38 |
pleia2 | cinerama: which address to send them to? | 19:38 |
cinerama | pleia2: either hp or personal is fine | 19:38 |
mordred | greghaynes: I am a fan of fixes! | 19:38 |
pleia2 | cinerama: I don't know your personal address, so just PM me what you prefer :) | 19:39 |
clarkb | fungi ya should be, take red line downtown ~hour + catch bus/yellow/green to union station ~20 minutes | 19:39 |
clarkb | you can walk that last step too | 19:39 |
fungi | clarkb: cool. the neighborhood around union station looked marginally familiar on the map but wasn't sure what the closest stop on the red line was | 19:40 |
greghaynes | fungi: youre portlanding!? | 19:40 |
greghaynes | oh, im guessing this is for summit | 19:40 |
fungi | greghaynes: for to ride teh trainz for summit, yes | 19:41 |
*** andreykurilin_ has joined #openstack-infra | 19:41 | |
fungi | greghaynes: though i have a talk accepted at oscon so will be back ~ a month later too | 19:41 |
greghaynes | awesome, yes as clarkb said the max red line is kind of a direct airport -> amtrak | 19:41 |
*** ZZelle_ has joined #openstack-infra | 19:42 | |
*** sushilkm has joined #openstack-infra | 19:42 | |
*** sushilkm has left #openstack-infra | 19:42 | |
*** sushilkm has joined #openstack-infra | 19:42 | |
*** sushilkm has left #openstack-infra | 19:42 | |
fungi | i always travel with a hiking pack as my checked luggage, so easy for me to walk a few miles briskly with it if needed | 19:42 |
greghaynes | Nice, I actually just booked a trip to your area for july :) | 19:42 |
*** dimtruck is now known as zz_dimtruck | 19:43 | |
fungi | ooh! you should get a paper in for all things open and come to nc in october (though it's the week before tokyo, so maybe you actually shouldn't unless you're insane) | 19:43 |
*** garyh has quit IRC | 19:44 | |
*** ihrachyshka has quit IRC | 19:44 | |
greghaynes | haha, the wife would be thrilled! (not really) | 19:44 |
mordred | fungi: I need to submit for ATO | 19:44 |
mordred | fungi: except - really it's the week before tokyo? | 19:44 |
fungi | mordred: SUBMIT! | 19:44 |
* mordred sobs | 19:44 | |
fungi | mordred: it's sunday through tuesday this time though, so there's a few days buffer at least | 19:45 |
clarkb | the best part of this time of year is cadbury eggs | 19:45 |
pleia2 | ++ | 19:45 |
fungi | just in case you wanted higher-octane sugar inside your normal sugar | 19:46 |
*** otter768 has joined #openstack-infra | 19:46 | |
clarkb | fungi: yes | 19:46 |
clarkb | I got a dozen :) | 19:46 |
clarkb | ok time to review that change for gerrit | 19:46 |
clarkb | anteaya: any others you can find? | 19:47 |
anteaya | not for tomorrow | 19:47 |
anteaya | hoping to hear from zaro | 19:47 |
anteaya | do states have cadbury easter creme eggs now? | 19:47 |
anteaya | I had believed you didn't | 19:47 |
mordred | anteaya: we've had cadbury eggs for my entire life | 19:48 |
anteaya | cool | 19:48 |
mordred | anteaya: it's possible that there is an additional thing that we don't have | 19:48 |
anteaya | not sure what I'm thinking of then | 19:48 |
fungi | one of the few cadbury products we get here in the states | 19:48 |
pleia2 | cadbury in general isn't very common here | 19:48 |
pleia2 | but we get the eggs :d | 19:48 |
mordred | to us, it's the company that makes the eggs | 19:48 |
fungi | unless you go to import shops | 19:48 |
mordred | fungi: MURICA! | 19:48 |
anteaya | https://en.wikipedia.org/wiki/Cadbury_Creme_Egg | 19:49 |
*** baoli has quit IRC | 19:49 | |
fungi | i quite like the cadbury currant bars | 19:49 |
anteaya | fungi pleia2 oh okay | 19:49 |
fungi | but muricans also mostly don't know what currants are either | 19:49 |
anteaya | I don't know the currant bars | 19:49 |
fungi | or call them "tiny raisins" | 19:49 |
anteaya | fungi: well there's that | 19:49 |
anteaya | :) | 19:49 |
pleia2 | fungi: not chocolate chips | 19:49 |
clarkb | I think currants are those weird things we ate in belgium | 19:49 |
anteaya | really? | 19:49 |
anteaya | I don't consider currants belgian | 19:50 |
*** baoli_ has joined #openstack-infra | 19:50 | |
*** otter768 has quit IRC | 19:50 | |
clarkb | I think they just had them there | 19:50 |
clarkb | because ya we don't really have them in this country | 19:50 |
* krotscheck has a supplier of redcurrants in Seattle. ALL TO MYSELF. | 19:51 | |
clarkb | ok js people, why would we bother to go through the trouble of minifying jquery on trusty for ~15kb | 19:51 |
*** rfolco has quit IRC | 19:51 | |
clarkb | krotscheck: ^ see https://review.openstack.org/#/c/165145/6/modules/openstack_project/manifests/gerrit.pp | 19:51 |
mordred | krotscheck: also, if you didn't see the other day - ubuntu apparently ships jquery.min.js as a symlink to jquery.js | 19:52 |
*** emagana has quit IRC | 19:52 | |
krotscheck | clarkb: Ehn. It doesn't hurt? | 19:52 |
anteaya | clarkb: something about ensuring the toggle ci button works | 19:52 |
anteaya | clarkb: not invalidating your question though | 19:52 |
clarkb | anteaya: ya, mostly trying to figure out if this is worth the trouble | 19:52 |
krotscheck | To be honest, serving javascript up as gzip is more effective than minification. | 19:53 |
fungi | clarkb: i agree 15kb extra that your browser's going to cache anyway isn't necessarily worth the effort to puppet compressing it | 19:53 |
krotscheck | So I usually don't bother minifying. | 19:53 |
anteaya | clarkb: always worth it to ask that question | 19:53 |
krotscheck | Also, minifying makes production debugging hard. | 19:53 |
*** nilasae has joined #openstack-infra | 19:53 | |
krotscheck | "Exception thrown in line 1" -> Line 1 is 16K characters of text. | 19:54 |
clarkb | krotscheck: ya, though at least in gerrits case its all minified otherwise and impossible to debug so thats less of a concern | 19:54 |
fungi | anteaya: also it's a cadbury chocolate bar with currants and almonds. tasty, tasty stuff | 19:54 |
anteaya | fungi: I don't think I've ever seen that | 19:54 |
*** nilasae has quit IRC | 19:54 | |
anteaya | sounds very tasty indeed | 19:54 |
fungi | anteaya: i've only found it in the uk | 19:54 |
*** nilasae has joined #openstack-infra | 19:54 | |
anteaya | ah | 19:54 |
clarkb | the other question I have is what will yui-compressor do if fed an already minified version of the file? | 19:54 |
anteaya | I'll look for it next time I'm there | 19:55 |
mordred | clarkb: dude, krotscheck has convinced me we should not bother | 19:55 |
anteaya | haven't spent much time in the uk yet, mostly just passing through | 19:55 |
fungi | clarkb: we shouldn't be re-feeding the already minified file into it? | 19:55 |
* fungi re-checks that change | 19:55 | |
clarkb | fungi: oh right yup | 19:55 |
fungi | yeah, it | 19:55 |
fungi | grrr | 19:55 |
zaro | anteaya, clarkb : yo! this is needed for the trusty upgrade, https://review.openstack.org/#/c/165145/ | 19:55 |
mordred | clarkb: and, in fact, should maybe stop minifying anywhere just because it would let us delete more puppet | 19:55 |
clarkb | ok so we do need to address the broken button | 19:55 |
fungi | it's minifying the normal version not the min.js file | 19:55 |
anteaya | zaro: yes the very patch we are talking about | 19:55 |
clarkb | fungi: ya | 19:56 |
anteaya | zaro: glad you are here | 19:56 |
zaro | was out to lunch and now back | 19:56 |
fungi | i'm on board with serve readable source code from our servers | 19:56 |
fungi | because we're open | 19:56 |
anteaya | zaro: so how much do you care if we minify the js | 19:56 |
*** ajmiller_ has joined #openstack-infra | 19:56 | |
anteaya | zaro: because right now the group is leaning towards not bothering | 19:56 |
clarkb | zaro: can we not go through the trouble of minifying that file and simply have puppet do a symlink to /usr/share/javascript/jquery/jquery.js? | 19:56 |
zaro | anteaya: fungi & jeblair seems to think it's important | 19:57 |
anteaya | zaro: okay so if they come back in favour of not bothering that is okay with you? | 19:57 |
fungi | zaro: i only felt it was important to actually have a minified file if we're serving it as jquery.min.js, but if we can serve the full source and _call_ it jquery.js i'm cool with that | 19:58 |
mordred | clarkb: I'm voting for "have puppet do a symlink" | 19:58 |
clarkb | I think I am fine with the change as is at this point too | 19:58 |
zaro | uhhm, i think that's to make better performance. | 19:58 |
zaro | i'm not sure how much better though. | 19:58 |
mordred | zaro: krotscheck says it won't do that really | 19:58 |
clarkb | but for simplicity a symlink would probably be best | 19:58 |
clarkb | the file size difference is about 15kb | 19:58 |
mordred | fungi: yes - my issue with the debian package was that they called it .min.js | 19:58 |
fungi | mordred: mine too. i think that's worse than just not including the file | 19:59 |
krotscheck | The only real benefit is download speed, and that's heavily dependent on your browser's caching settings, the server's use of cache invalidation headers, and the server's use of mod_gzip | 19:59 |
fungi | javascript minification is, to some extent, an obsessive compulsive disorder some people have about squeezing every last bit of whitespace out of files they serve even if their webserver is going to turn around and gzip-encode it anyway | 19:59 |
mordred | krotscheck: all of which are going to do a better job than minification | 19:59 |
greghaynes | Yea, really the use case for gaining speed via minification isnt something I believe youall have | 19:59 |
* mordred hands krotscheck an extra box of redcurrants | 19:59 | |
clarkb | mordred: where are we with https://review.openstack.org/#/c/166318/ ? have images building in hpcloud and rax yet? | 19:59 |
fungi | it's not like we're minifying our html | 19:59 |
krotscheck | From what I remember, the gzip algorithm actually works better on things with large regular words rather than things collapsed to single-character varnames. | 20:00 |
mordred | clarkb: image just uploaded to hpcloud-b5, I kicked b4 just now | 20:00 |
greghaynes | The only time ive seen that download size make a big difference is when youre dealing with things like mobile where its more of a slowstart issue than download size issue | 20:00 |
clarkb | krotscheck: that sounds right, because it does prefixes (or suffixes, maybe both) so you need longer strings that overlap | 20:00 |
*** ajmiller has quit IRC | 20:00 | |
*** ghostpl_ has quit IRC | 20:00 | |
anteaya | zaro: so this patch was to ensure the toggle ci button works, yes? https://review.openstack.org/#/c/165145/ | 20:01 |
zaro | yes | 20:01 |
*** andreykurilin_ has quit IRC | 20:01 | |
clarkb | mordred: cool, I will keep an eye on nodes there, devstack-trusty? | 20:01 |
greghaynes | Like, either way, you might not even be talking a full packet in size difference when you gzip both versions so there is effectively no difference ;) | 20:01 |
anteaya | zaro: okay great, can we get the toggle ci button working without having to minify the js? | 20:01 |
mordred | clarkb: yah | 20:01 |
*** ociuhandu has quit IRC | 20:02 | |
zaro | yes, it works without minifying js | 20:02 |
anteaya | zaro: how would you feel if we went that way? | 20:02 |
clarkb | zaro: oh, so it works today without that change? | 20:02 |
mordred | clarkb: b4 has it | 20:02 |
zaro | clarkb: no, it will be broken on trusty | 20:02 |
clarkb | zaro: ok, so we do need a change, but it doesn't have to be that change | 20:03 |
zaro | clarkb: it works today because it's on precise. precise lib-jquery library provides min.js file | 20:03 |
clarkb | right rather than a symlink | 20:03 |
zaro | so you propose just linking to the .js file, yes that will work as well | 20:04 |
zaro | actually it has to be a copy not a link | 20:04 |
clarkb | zaro: we can do that then, just switch to using the real file not the .min.js | 20:05 |
*** ChuckC has quit IRC | 20:05 | |
zaro | clarkb: ok, maybe fungi and jeblair should chime in on that since i thought they wanted the min.js | 20:05 |
fungi | zaro: i only felt it was important to actually have a minified file if we're serving it as jquery.min.js, but if we can serve the full source and _call_ it jquery.js i'm cool with that | 20:05 |
*** Sukhdev has quit IRC | 20:06 | |
jeblair | zaro: i'm okay with the non-minified file. it is a regression since we are serving it now, but the argument that it won't actually be any worse makes sense. we can try it, and if it is, we can go with what you have. | 20:07 |
zaro | no min.js is cool with me if there's no benefit | 20:07 |
clarkb | and maybe we can file a bug with debuntu about this | 20:07 |
mordred | clarkb: it would distract them from fixing python | 20:07 |
greghaynes | Do youall do gzipping of those files when you serve them? | 20:08 |
pleia2 | clarkb: videos aren't online yet, but the slides for the "life of a logstash event" talk are up and helpfully detailed https://speakerdeck.com/elastic/life-of-a-logstash-event | 20:08 |
anteaya | yay, so we looking forward to your new patch zaro, which we hope to review and merge in the next few hours | 20:08 |
*** edwarnicke has joined #openstack-infra | 20:08 | |
clarkb | pleia2: thank you, was it a good talk? | 20:08 |
pleia2 | clarkb: it was great | 20:08 |
zaro | cool, i'll fix up. maybe try to add that bug as well. LP right? | 20:08 |
fungi | i mean, i sort of know why they did that. they can't ship jquery.min.js for certain reasons, but some other packaged web applications may be hard-coded to serve a file called jquery.min.js, so someone thought this was the most pragmatic compromise | 20:08 |
pleia2 | clarkb: I'll let you know when the video shows up :) | 20:08 |
clarkb | pleia2: awesome, I should bug you when logstash derps now :) | 20:08 |
*** hdd has quit IRC | 20:09 | |
greghaynes | my browser says youall do gzip, so \O/ | 20:09 |
clarkb | fungi: wait, I thought they can if there is a FOSS toolchain to generate the file | 20:09 |
pleia2 | clarkb: haha, I might actually be able to help! | 20:09 |
clarkb | fungi: I don't see how that is any different than say shipping a compiled gcc | 20:09 |
greghaynes | Yes, I thought jquery is mit license | 20:09 |
mordred | clarkb: ++ | 20:10 |
fungi | clarkb: yeah, though the reliability of that was potentially in question when the javascript-jquery package landed in debian in the timeframe in which trusty imported it before it froze for release | 20:10 |
*** Bsony has quit IRC | 20:10 | |
* fungi looks to see if it's still that way in testing/unstable | 20:10 | |
*** Bsony has joined #openstack-infra | 20:11 | |
jeblair | greghaynes: thanks for checking! :) | 20:11 |
jeblair | (confirming gzip) | 20:12 |
clarkb | pleia2: it is interesting that they still use that scaling architecture, I threw it out after about a day because it doesn't scale :) | 20:12 |
*** tiswanso has quit IRC | 20:12 | |
clarkb | pleia2: we run N indexers instead of funneling it down to 1 indexer | 20:12 |
* jeblair gets back to writing words | 20:12 | |
fungi | clarkb: mordred: jeblair: yeah, the libjs-jquery 1.7.2+dfsg-3.2 from jessie and sid has a separate min.js file not a symlink | 20:13 |
fungi | and it's definitely smaller by roughly the right amount | 20:13 |
*** dougwig has joined #openstack-infra | 20:13 | |
*** timcline has quit IRC | 20:14 | |
clarkb | cool | 20:14 |
*** ajmiller_ is now known as ajmiller | 20:14 | |
*** erw has joined #openstack-infra | 20:14 | |
fungi | so looks like it was restored to sanity. it was a symlink in 1.7.2+debian-2.1 because the minification relied on uglify which was not at that time destined to make it into the wheezy release | 20:15 |
greghaynes | clarkb: batch processing does a ton for scaling ;) | 20:15 |
fungi | (circa november 2012) | 20:15 |
*** mfink_ has joined #openstack-infra | 20:15 | |
clarkb | mordred: devstack-trusty-1426881119.template.openstack.org is that the image I should be looking for? | 20:15 |
clarkb | greghaynes: you can do it without batch processing either | 20:16 |
clarkb | greghaynes: every shipper could just be an indexer too | 20:16 |
mordred | clarkb: yes | 20:16 |
*** sweston has joined #openstack-infra | 20:16 | |
mordred | and it's uploaded to 2-5 now - and 1 is in progress | 20:16 |
greghaynes | clarkb: Yes, I imagine under the hood thats what youre gaining by scaling via replication though | 20:16 |
clarkb | greghaynes: replication only affects query scaling not indexing | 20:16 |
clarkb | or maybe you don't mean es replication | 20:17 |
greghaynes | oh, I did but I guess it works differently than I thought. Maybe thats a good performance improvement if we run into write scaling issues | 20:17 |
greghaynes | to somehow batch writes | 20:18 |
pleia2 | clarkb: the ELK family is interesting and *young* so even in 2 years since you first set up our system it's changed a ton, better scaling support across the board has been one of the big things | 20:18 |
pleia2 | clarkb: during one of the talks, they said they thought "logstash dropping things on the floor" was mostly an unusual bug/hardware failure/something but they came to realize it's a real thing once they started doing bigger testing | 20:19 |
clarkb | pleia2: ya supposedly the elasticsearch 1.0 release performs much better but they keep having CVEs for their groovy script support which is a bit :( | 20:19 |
greghaynes | but sounds like youre just effectively making readonly replicas? | 20:19 |
pleia2 | clarkb: ah, that is unfortunate | 20:19 |
clarkb | greghaynes: no we basically have N logstash indexers that each process a file at a time | 20:19 |
clarkb | greghaynes: they talk to local ES clients that are part of the cluster which then index the data on the data nodes | 20:20 |
clarkb | greghaynes: but aiui you only ever write to the primary shard for indexing, so replicas don't help indexing performance | 20:20 |
*** mfink_ has quit IRC | 20:20 | |
clarkb | greghaynes: they do however help reads because you can read from any shard that has the data on any node when doing queries | 20:20 |
clarkb | and if you lose a node with primary shards replicas will become primary shards so you also get ha from them | 20:21 |
*** claudiub has quit IRC | 20:21 | |
*** erlon is now known as erlon_away | 20:21 | |
mordred | clarkb: ok - all of hpcloud has the new devstack-trusty image | 20:22 |
clarkb | mordred: ok | 20:22 |
clarkb | mordred: have you also updated in rax to make sure we don't break rax tomorrow morning? | 20:22 |
*** melwitt has joined #openstack-infra | 20:22 | |
openstackgerrit | Khai Do proposed openstack-infra/system-config: Fix jquery setup on Gerrit server. https://review.openstack.org/165145 | 20:22 |
mordred | clarkb: no - I'll do that next | 20:22 |
zaro | clarkb, anteaya, fungi ^ | 20:23 |
mordred | clarkb: I'm building bare-trusty right now | 20:23 |
clarkb | mordred: ok, probably just start with one region there | 20:23 |
mordred | clarkb: yah | 20:23 |
mordred | clarkb: although I _expect_ it to be a no-op since I got the file from rax - but still | 20:23 |
greghaynes | clarkb: so, you obviously have to do some kind of write to the non-primary shards otherwise they dont have the data ;) I think you are effectively doing batch write replication though. Locally youre not but when you replicate you will which is where you tend to hit scaling issues | 20:23 |
mordred | clarkb: oh - duh. bare-trusty is not dib | 20:24 |
*** melwitt_ has joined #openstack-infra | 20:24 | |
mordred | clarkb: I got an overquota error- apparently we're sitting at 600 nodes on hp | 20:24 |
*** dkliban is now known as dkliban_afk | 20:24 | |
clarkb | zaro: see comment | 20:24 |
*** thingee has joined #openstack-infra | 20:24 | |
clarkb | greghaynes: oh I think maybe we have confused each other. The scaling issues are in logstash not es | 20:24 |
greghaynes | clarkb: ty for the explanation though, kinda want to figure out more about that setup | 20:24 |
mordred | greghaynes: we would love for someone other than clarkb to actually understand it :) | 20:25 |
clarkb | greghaynes: so the problem is scaling up cputime for logstash indexer process which means running one of those is bad for scaling | 20:25 |
clarkb | greghaynes: es is actually pretty good at scaling up, every time I have added nodes it has helped | 20:25 |
greghaynes | clarkb: ah! I should stop assuming all problems are database problems | 20:25 |
clarkb | greghaynes: but basically ruby that runs lots of regexes in the jvm is slow :) | 20:25 |
*** peristeri has quit IRC | 20:25 | |
mordred | clarkb: it's a fair assumption | 20:25 |
openstackgerrit | Khai Do proposed openstack-infra/system-config: Fix jquery setup on Gerrit server. https://review.openstack.org/165145 | 20:25 |
mordred | gha | 20:25 |
*** dprince has quit IRC | 20:26 | |
nibalizer | jeblair: we confirmed that puppet-blacksmith can create the puppetforge module when it submits the first time | 20:26 |
mordred | nibalizer: neat | 20:26 |
nibalizer | or i guess more accurately that the forge api is smart enough | 20:26 |
zaro | clarkb: argg! should be good now | 20:26 |
clarkb | zaro: one more thing | 20:26 |
clarkb | zaro: sorry should've caught that on the previous patchset | 20:26 |
*** melwitt has quit IRC | 20:27 | |
*** armax has joined #openstack-infra | 20:27 | |
zaro | don't we want to copy only on package updates? | 20:28 |
clarkb | mordred: for my sanity, the thing that we think caused hpcloud troubles was lots of deletes piling up after jeblairs change because they were no longer serialized? and we had lots of deletes because metadata was broken. To address this we reverted jeblairs change and are avoiding metadata service | 20:28 |
clarkb | zaro: the notify is independent. you need to tell gerrit to rebuild its stuff once you update the file | 20:28 |
clarkb | zaro: so ya you need both things, the package subscription and the notify to make gerrit rebuild things | 20:28 |
mordred | clarkb: yes | 20:29 |
clarkb | mordred: any idea on whether or not metadata service is being fixed in hpcloud? | 20:29 |
mordred | clarkb: well, we had a non-zero number of build failures that indicated some issue with metadata service in their logs | 20:29 |
mordred | clarkb: I do not expect that it is - I think it is merely a fundamentally broken part of openstack | 20:29 |
clarkb | where non zero is vast majority | 20:29 |
clarkb | mordred: it may be, but we have only been experiencing trouble in hpcloud for about 2-3 weeks now | 20:30 |
clarkb | mordred: basically I am trying to work backwards and see if we can attribute all of this to the same problem | 20:30 |
mordred | clarkb: I think in general the mysql there is unhappy | 20:30 |
clarkb | oh I see and metadata relies on that to get its info | 20:30 |
mordred | AIUI | 20:30 |
clarkb | gotcha so ya it may actually all be related | 20:30 |
zaro | clarkb: doesn't the GerritSiteHeader.html get reloaded on every browser page reload? | 20:31 |
clarkb | because I think we leak resources when these random failures happen that report fail to nodepool without any resource uuids | 20:31 |
clarkb | zaro: no gerrit only notices that file has changed if you touch it | 20:31 |
clarkb | which is what the exec does | 20:31 |
mordred | clarkb: yes - although it seems that we may want to more systemically account for the 502 followed by resource pattern | 20:32 |
clarkb | mordred: ya I think the metadata idea from fungi is a good one | 20:32 |
mordred | clarkb: like - it's quite possible that we will ALWAYS have registered an actual request when that happens | 20:32 |
clarkb | mordred: I am just trying to assert that this is what made hpcloud broken for us over the last two weeks | 20:33 |
mordred | yah | 20:33 |
clarkb | because if it isn't then we also have other things to debug | 20:33 |
clarkb | flashgordon: ^ around? we would like to talk about that and how this may be a nova bug | 20:33 |
*** tqtran_afk is now known as tqtran | 20:33 | |
mordred | clarkb: I kinda think that we should trap for 500 errors, and if we get them, assume that the request succeeded and that we need to poll nova for the uuid based on the hostname we requested and try to resume once we have one | 20:34 |
greghaynes | It would be helpful to know how they determine request throttling... | 20:34 |
*** andreykurilin_ has joined #openstack-infra | 20:34 | |
fungi | clarkb: mordred: i assert the metadata idea was jeblair's | 20:34 |
openstackgerrit | Khai Do proposed openstack-infra/system-config: Fix jquery setup on Gerrit server. https://review.openstack.org/165145 | 20:34 |
mordred | clarkb: because it's a frequent enough thing - it may fit the profile of "here is another way we've discovered clouds fail that we work around" | 20:34 |
*** zz_dimtruck is now known as dimtruck | 20:34 | |
greghaynes | seems like the case we're optimizing for is if we make a request, it fails, can we immediately make a second one - not can we async a bunch of requests out | 20:34 |
mordred | as in - I think we should just put more logic in our retry-timeout code there - and not delete the incomplete database record until we've reached the timeout | 20:35 |
clarkb | greghaynes: I think part of the problem here is that throttling is $request/time when $request may be one of many requests that all have different costs | 20:35 |
zaro | clarkb: ok, done. thanks | 20:35 |
greghaynes | clarkb: oh joy | 20:35 |
mordred | yup | 20:35 |
mordred | delete, for instance, is very expensive | 20:35 |
clarkb | zaro: lgtm thanks | 20:36 |
*** emagana has joined #openstack-infra | 20:37 | |
zaro | clarkb, fungi : noticed that jquery.min.js is used for other servers as well so probably should keep an eye out for those when moving to trusty. | 20:37 |
anteaya | fungi mordred pleia2 jeblair https://review.openstack.org/#/c/165145/ is up for review, be best if we had it in for tomorrow | 20:37 |
clarkb | mordred: 687d0191-2605-4a8b-a1e5-cd773366c9b5 is up console log looks mostly ok to me | 20:38 |
mordred | clarkb: woot | 20:38 |
clarkb | mordred: I think nodepool is waiting to get it a floating ip now | 20:38 |
clarkb | mordred: but hopefully it goes used/ready soon | 20:38 |
*** eharney has quit IRC | 20:38 | |
clarkb | zaro: good point, we use it for zuul status and other tools | 20:39 |
*** dustins has quit IRC | 20:39 | |
pleia2 | anteaya: that's the patch that clarkb and zaro are talking about how :) | 20:39 |
pleia2 | s/how/now | 20:39 |
openstackgerrit | Doug Wiegley proposed openstack-infra/project-config: For neutron and neutron-lbaas, skip more wasted jobs https://review.openstack.org/166035 | 20:39 |
anteaya | well clark is +2 on the patch, so my read is that he is happy as is | 20:40 |
*** tsg_ has joined #openstack-infra | 20:40 | |
pleia2 | yeah, I'm going to hold off until checks pass | 20:40 |
anteaya | fair enough | 20:40 |
*** bswartz has quit IRC | 20:40 | |
pleia2 | but it's still on my radar | 20:40 |
anteaya | great, as long as we have it merged for tomorrow | 20:40 |
anteaya | didn't want it to get lost | 20:40 |
clarkb | mordred: so, thinking about rebuilds and I know jeblair wants it to be zuul v3, but if we changed from queues for provider managers to heaps where delete had a higher cost/lower priority we could then update them to be rebuilds which have a lower cost then resort and bam now we have new node | 20:41 |
clarkb | mordred: I think that may be a relatively easy change assuming the heap implementation doesn't make us cry | 20:41 |
*** tsg has quit IRC | 20:41 | |
*** AJaeger has quit IRC | 20:41 | |
jeblair | clarkb: do you know why i want to work on that in zuulv3? | 20:41 |
clarkb | jeblair: no | 20:42 |
greghaynes | clarkb: heapq? | 20:42 |
clarkb | greghaynes: ya but needs to be thread safe | 20:42 |
clarkb | greghaynes: so likely needs a wrapper of some sort, doable I just haven't thought about implementation much | 20:42 |
jeblair | okay, so i've tried to explain this already, but i guess i haven't been successful. | 20:42 |
greghaynes | clarkb: what about starvation? | 20:42 |
jeblair | i will try again | 20:42 |
jeblair | rebuilding is a big change in logic for nodepool | 20:43 |
jeblair | currently the whole thing is built around delete/create | 20:43 |
jeblair | specifically the allocator assumes that behavior | 20:43 |
jeblair | the allocator is _incredibly_ complex at this point | 20:43 |
clarkb | jeblair: yes, so thats why I am thinking about how to make it without making it a big change and I think using a heap above can make it not a big change | 20:43 |
jeblair | no individual actually understands how it works | 20:43 |
jeblair | changing from the delete/create cycle to rebuild means altering the allocator | 20:44 |
*** dimtruck is now known as zz_dimtruck | 20:44 | |
clarkb | jeblair: I don't think it has to | 20:44 |
clarkb | when you call create it would replace a delete with a rebuild if there are any rebuilds, otherwise place a create on the heap | 20:44 |
jeblair | clarkb: okay, how can we avoid that then? | 20:44 |
*** garyh has joined #openstack-infra | 20:45 | |
clarkb | when you call delete it just adds a delete like normally happens | 20:45 |
*** sdake_ has joined #openstack-infra | 20:45 | |
clarkb | but the api for the allocator remains the same, the provider manager just returns a node back into the scheduler that may or may not have been rebuilt | 20:45 |
clarkb | greghaynes: starvation is something to worry about, BUT we are already so starved doing deletes I don't think it will be worse | 20:46 |
*** timcline has joined #openstack-infra | 20:46 | |
*** hyakuhei has joined #openstack-infra | 20:46 | |
greghaynes | hrm, I mean, I could whip out a simple PI loop for that ;) | 20:46 |
greghaynes | because thats effectively what you need here too | 20:47 |
greghaynes | but seems like different problem first | 20:47 |
clarkb | I am fairly positive that the naive implementation would work just fine at least compared to the current situation | 20:47 |
greghaynes | yea, with this kind of stuff its almost always a 'just test it' | 20:48 |
*** sdake has quit IRC | 20:49 | |
*** enikanorov has quit IRC | 20:50 | |
*** radez is now known as radez_g0n3 | 20:50 | |
pleia2 | zaro: precise apply failed on https://review.openstack.org/#/c/165145/ note inline about it | 20:50 |
jeblair | clarkb: i could see how that would work with hpcloud in its current situation. i do not believe we would end up doing very many, if any, rebuilds on rax because deletes happen so quickly there. | 20:50 |
clarkb | jeblair: ya, it likely would not work well in a situation where delete isn't very high cost | 20:50 |
jeblair | clarkb: and you would not need to change the allocator, unless you wanted to fix the rax problem | 20:51 |
jeblair | clarkb: though you would need to change quite a bit of the rest of nodepool -- the delete and create threads | 20:51 |
*** enikanorov has joined #openstack-infra | 20:51 | |
*** adalbas has quit IRC | 20:52 | |
openstackgerrit | Doug Hellmann proposed openstack/requirements: Fix oslo caps for kilo https://review.openstack.org/166377 | 20:53 |
jeblair | so here's why i want to do this in zuulv3 -- nodepool is really complicated, and hard to maintain and hard to test changes. the allocation system in v3 will be so much simpler -- node requests are just a fifo, and the allocator just needs to find spare capacity as it comes up. | 20:53 |
zaro | pleia2: uggh, the puppet file doesn't have refreshonly. it was used for exec from previous patch. will fix up | 20:53 |
jeblair | it will be really simple to build systems like this on top of it | 20:53 |
greghaynes | jeblair: is v3 just the one spec at this point? | 20:54 |
pleia2 | zaro: thanks :) | 20:54 |
fungi | i could sort of see if nodepool sees we're running at capacity so it's deferring creation, then waiting demand could get spun to rebuild calls on nodes which complete jobs instead of making delete calls for them, but that would then only kick in when you're out of capacity everywhere | 20:54 |
*** garyh has quit IRC | 20:54 | |
jeblair | whereas changing fundamental things about nodepool is really hard right now. i'd rather try to avoid spending a lot of time working on an implementation in the current complex system which i want to get rid of. | 20:55 |
fungi | which wouldn't necessarily be much of an improvement over the current situation | 20:55 |
fungi | yep, makes complete sense | 20:55 |
jeblair | greghaynes: yes. i was in the middle of writing the next part | 20:56 |
mordred | I think I have a patch almost finished to deal with the current weird hp failure, btw | 20:56 |
clarkb | ya I think the ultimate goal of zuul v3 is good. I just don't know how long we can limp along at <200 useable nodes at any time | 20:56 |
jeblair | greghaynes: (it's also an email, which is a good chunk of it) | 20:56 |
greghaynes | ah, ok. I should find that | 20:56 |
jeblair | clarkb: it's more like 300, but yeah | 20:57 |
*** nilasae is now known as nilasae|zzz | 20:57 | |
openstackgerrit | Khai Do proposed openstack-infra/system-config: Fix jquery setup on Gerrit server. https://review.openstack.org/165145 | 20:57 |
zaro | pleia2: ^ | 20:57 |
*** rkukura has quit IRC | 20:58 | |
clarkb | zaro: pleia2 oh refresh only must be an exec thing | 20:58 |
fungi | greghaynes: http://lists.openstack.org/pipermail/openstack-infra/2015-February/002471.html | 20:58 |
pleia2 | clarkb: yeah, seems so | 20:58 |
mordred | clarkb, jeblair: either one of you happen to have one of the 500 level exceptions on create tracebacks laying around handy? | 20:59 |
fungi | mordred: there was one in my paste earlier | 20:59 |
pleia2 | zaro: thanks, I'll keep an eye on tests and approve when things pass | 20:59 |
clarkb | mordred: I do not but I can probably grep one for you if that helps | 20:59 |
mordred | ah! found one | 20:59 |
jeblair | clarkb: if we need to do it, then we need to do it. honestly, for something like that i expect that one or two infra-cores will spend a week or two babysitting it and fixing things _after_ we've merged the change. we never find all the edge cases right away. | 20:59 |
greghaynes | fungi: tyty | 20:59 |
fungi | greghaynes: ywyw | 21:00 |
clarkb | jeblair: yes, it wouldn't be a low cost change to implement. But I do think we can avoid allocation complications | 21:00 |
*** tkelsey has joined #openstack-infra | 21:00 | |
*** achuprin has quit IRC | 21:00 | |
*** sarob has joined #openstack-infra | 21:00 | |
*** _nadya_ has joined #openstack-infra | 21:01 | |
jeblair | clarkb: how would you like to proceed? | 21:02 |
*** baoli_ has quit IRC | 21:02 | |
*** emagana has quit IRC | 21:02 | |
openstackgerrit | Merged openstack/requirements: Bump sahara client version https://review.openstack.org/155428 | 21:02 |
*** esker has joined #openstack-infra | 21:03 | |
clarkb | I am reading heapq docs to see how terrible a priority queue implementation might be. It doesn't look like replacing arbitrary entries is very performant or easy | 21:04 |
*** tkelsey has quit IRC | 21:04 | |
*** dmorita has joined #openstack-infra | 21:05 | |
*** emagana has joined #openstack-infra | 21:05 | |
clarkb | though can probably work around that simply by using two Queue.Queues, since really we only have two priorities. delete and everything else | 21:05 |
*** Ryan_Lane has quit IRC | 21:05 | |
clarkb | let me see how terrible updating the code to do this might be really quickly | 21:06 |
*** _nadya_ has quit IRC | 21:06 | |
greghaynes | clarkb: You can just rebuild the queue, I think we have *gasp* hundreds of nodes, right? | 21:06 |
*** hasharConfcall is now known as hashar | 21:07 | |
openstackgerrit | Monty Taylor proposed openstack-infra/nodepool: Deal with failures that succeed https://review.openstack.org/166383 | 21:07 |
clarkb | greghaynes: ya but this way I also get thread safety so I am just going to go with it | 21:07 |
mordred | clarkb, fungi, jeblair: ^^ there is a first stab at adding some smarts about waiting for the server record to show up after a 500 | 21:07 |
greghaynes | clarkb: well we have to make inserts thread safe too, IMO why not just put locks around it and call it good | 21:07 |
clarkb | greghaynes: because Queue.Queue is thread safe | 21:07 |
mordred | I would like to posit that it might get us a decent amount more nodes in hpcloud - given how many aliens we keep growing | 21:08 |
mordred | I'd like eyes on the approach before I spend too much time working on testing it | 21:08 |
jeblair | mordred: i believe most of the aliens have been related to our failed experiment from yesterday | 21:08 |
mordred | jeblair: we had them before our experiment | 21:09 |
jeblair | mordred: before then, we only had a handful every few weeks i want to say? | 21:09 |
mordred | jeblair: possibly - but I think it's a regular failure mode of hpcloud now | 21:09 |
mordred | jeblair: so handling it is appropriate | 21:09 |
clarkb | jeblair: we had ~150 a couple days ago | 21:09 |
jeblair | clarkb: okay, that's more than i recall dealing with. | 21:10 |
*** zz_dimtruck is now known as dimtruck | 21:10 | |
mordred | yah - I remember trying to get the noc folks to care | 21:10 |
*** ldnunes has quit IRC | 21:13 | |
krotscheck | clarkb, mordred: I stand corrected, minified does actually gzip even more -> http://paste.openstack.org/show/194069/ | 21:14 |
clarkb | krotscheck: meh its 5kb difference :) | 21:14 |
*** mattfarina has quit IRC | 21:15 | |
*** bswartz has joined #openstack-infra | 21:15 | |
krotscheck | clarkb: You mean 50 | 21:15 |
krotscheck | ? | 21:15 |
openstackgerrit | Merged openstack-infra/project-config: Use template for Rally py34 job https://review.openstack.org/164858 | 21:15 |
*** timcline has quit IRC | 21:15 | |
krotscheck | (well, 43) | 21:16 |
fungi | krotscheck: that's also a much newer jquery than we're talking about | 21:17 |
krotscheck | fungi: True. | 21:17 |
fungi | the difference in size is substantial from 1.7 | 21:17 |
*** emagana has quit IRC | 21:18 | |
clarkb | oh was I off by order of magnitude because they increased size by an order of magnitude? | 21:18 |
krotscheck | Well, 1.11 also has a ~40KB difference. | 21:18 |
jeblair | jquery was downloaded from review 15062 times yesterday (out of 40385 requests; most were 304 not modified) | 21:18 |
krotscheck | I actually think it's documentation. | 21:19 |
*** emagana has joined #openstack-infra | 21:19 | |
fungi | oh, wait, it's not. i was looking at the file earlier and seeing an order of magnitude difference in size | 21:19 |
clarkb | maybe all my maths are off | 21:19 |
fungi | fridaymath | 21:19 |
clarkb | mordred: that node I gave you a uuid for is still building fwiw | 21:19 |
krotscheck | Yeah, caching is definitely the thing that needs to happen. | 21:19 |
* greghaynes is curious what the thing trying to be optimized is | 21:21 | |
greghaynes | because if its either download speed or bandwidth then google jquery cdn will be the best fix | 21:21 |
greghaynes | but if its just effort then not sure it matters | 21:21 |
fungi | mostly effort as far as i'm concerned. but also i like us serving actual readable/modifiable source code | 21:22 |
krotscheck | Well, the reason I care right now is that I've had a bunch of discussions with packagers and other frontend people, and I'm trying to come up with sane JS policies to propose to the TC. | 21:22 |
*** timcline has joined #openstack-infra | 21:22 | |
krotscheck | Things like: Don't minify, learn to cache instead. | 21:22 |
*** emagana has quit IRC | 21:23 | |
fungi | but also make it possible for the deployer to decide to switch out the js for minified versions if they really want to go through the effort to be that ocd about it | 21:23 |
krotscheck | But for that I need data. | 21:23 |
clarkb | jeblair: you know where this really gets complicated? the fact that we have to delete >1 thing :( | 21:24 |
jroll | fungi: fwiw, you can serve a source map to modern browsers so that they can make the JS readable | 21:24 |
*** hyakuhei has quit IRC | 21:24 | |
mordred | krotscheck: data++ | 21:24 |
fungi | jroll: true, like shipping separate stripped binaries and symbol files | 21:25 |
jroll | yep | 21:25 |
mordred | clarkb: still building seems very lame | 21:25 |
greghaynes | actually, that brings up a good point krotscheck ^ the biggest gain of not minifying if it's software we're making is that we can actually debug errors | 21:25 |
mordred | clarkb: devstack-trusty-1426883075.template.openstack.org <-- rax-dfw rebuilt | 21:25 |
greghaynes | otherwise you have to do that source mapping hackery | 21:26 |
SpamapS | mordred: https://review.openstack.org/#/c/166383/ -1'd | 21:26 |
SpamapS | mordred: but I think the idea is sound and worthwhile. | 21:26 |
mordred | SpamapS: thanks - good feedbacks | 21:26 |
krotscheck | greghaynes: Ooooh yes. I know that pain acutely. | 21:26 |
fungi | krotscheck: jeblair: anyway, assuming that all those downloads yesterday were gzip-compressed, that's ~630mib additional data which would have been downloaded if we weren't minifying. so not enormous | 21:27 |
SpamapS | greghaynes: my experience has been that building a simple way to run in a debug mode using the query string helps with that. foo/?dontminify=1 | 21:28 |
clarkb | SpamapS: good luck getting that into gerrit | 21:28 |
SpamapS | clarkb: They don't have UI engineers with sanity requirements? :) | 21:29 |
greghaynes | SpamapS: The problem is a lot of the time you have a setup where clients send backtraces to you when they error | 21:29 |
greghaynes | (I wonder if any of our deployers do that?) | 21:29 |
greghaynes | Since you kind of want to know if the code you send them works | 21:29 |
krotscheck | greghaynes: I thought that's why we have tests. | 21:29 |
krotscheck | :D | 21:29 |
greghaynes | heh | 21:29 |
fungi | SpamapS: sanity requirements? it was initially developed at google. i think they just don't have ui engineers | 21:31 |
fungi | also, i think the current and new webuis for gerrit are further proof of that conjecture ;) | 21:31 |
*** kgiusti has left #openstack-infra | 21:32 | |
jeblair | jquery is not used by gerrit | 21:32 |
krotscheck | Well, they built Angular, which is pretty cool. | 21:33 |
krotscheck | But then they ported it to TypeScript. | 21:33 |
krotscheck | So I'm not certain what that says about them. | 21:33 |
openstackgerrit | Clark Boylan proposed openstack-infra/nodepool: Rough rough shape of what rebuilds might look like https://review.openstack.org/166387 | 21:33 |
clarkb | jeblair: ^ is the basic shape of the thing | 21:34 |
fungi | jeblair: nope, but getting switches into the gerrit request syntax to switch between serving multiple javascript files was the ui sanity question, not really jquery specifically | 21:34 |
jeblair | krotscheck: s/built/hired developer of/ | 21:34 |
clarkb | jeblair: but you are right there are some hairy bits, I have commented on them with TODOs and am curious if you think its worth figuring those bits out | 21:34 |
krotscheck | ....oh. Well then. | 21:34 |
krotscheck | Still, TypeScript. | 21:34 |
krotscheck | ick. | 21:34 |
krotscheck | (Though maybe not really) | 21:35 |
clarkb | jeblair: biggest thing is DeleteServerTask needs to become more atomic within nodepool | 21:35 |
*** teran has joined #openstack-infra | 21:35 | |
*** dboik has quit IRC | 21:35 | |
clarkb | jeblair: so that when it fires it handles all of the delete tasks otherwise rebuilt nodes would need to figure out floating ips and keypairs some other way | 21:35 |
*** spzala has quit IRC | 21:36 | |
BobBall | How can I have a gerrit reporter comment to gerrit without voting? Seems that I _must_ have a value after the "gerrit:" tag which gets translated to something that's actually sent through. Is there a no-op command I can add? | 21:36 |
jeblair | BobBall: look at the definition of our experimental pipeline | 21:37 |
BobBall | oh - {} - how obvious... Thanks. | 21:37 |
jeblair | clarkb: do you think we should defer an existing priority effort here? | 21:37 |
*** esker has quit IRC | 21:38 | |
*** VijayTripathi1 has quit IRC | 21:39 | |
clarkb | jeblair: I think getting the deletions working correctly with the various resources that need to be deleted + rebuilds will likely sink quite a bit of time and probably are not worth it | 21:39 |
openstackgerrit | Monty Taylor proposed openstack-infra/nodepool: Deal with failures that succeed https://review.openstack.org/166383 | 21:39 |
clarkb | jeblair: we already know that this stuff is hairy even without rebuilds | 21:39 |
clarkb | (we leak floating ips for example) | 21:40 |
mordred | ok. if we're doing that | 21:40 |
openstackgerrit | Monty Taylor proposed openstack-infra/nodepool: Use rebuild instead of delete https://review.openstack.org/166388 | 21:40 |
mordred | I wasn't pushing mine up because we weren't doing it and I didn't want to be a randomization function | 21:40 |
mordred | but I think it can be much much simpler | 21:40 |
clarkb | mordred: your change won't work for the same reasons I think | 21:40 |
clarkb | mordred: no yours has the same issues | 21:41 |
mordred | ok | 21:41 |
mordred | just saying | 21:41 |
clarkb | mordred: the problem here is that a VM isn't just a VM | 21:41 |
clarkb | it should be and we should give that feedback to nova and neutron | 21:41 |
clarkb | but unfortunately today it isn't | 21:41 |
mordred | clarkb: I don't know that that matters in this case | 21:42 |
clarkb | oh I see you just never call cleanupServer at all | 21:42 |
mordred | yeah | 21:42 |
mordred | that's why I say "this will just keep things at max usage" | 21:43 |
clarkb | ya, the one place that may be a problem is with snapshot builds, maybe | 21:43 |
clarkb | I do not know how "clean" a rebuild is for that | 21:43 |
mordred | I think a follow up could be done to deal with that - but my main thought experiment was "can this be done without affecting the algorithm at all" | 21:43 |
clarkb | anyways I think the deal I was describing isn't as simple as I thought because delete is really at least 3 delete operations | 21:44 |
clarkb | and as soon as we have to make that more atomic it becomes complicated | 21:44 |
mordred | yes. getting delete right is very hard | 21:44 |
mordred | because of that | 21:44 |
*** EmilienM is now known as EmilienM|PTO | 21:44 | |
mordred | I wish re-using floating-ips was less insane - but the race condition ... ZOMG - I just had an idea for that - probably a zuulv3 idea | 21:45 |
mordred | but I don't know why I didn't think of it before | 21:45 |
mordred | I've been trying to think about floating-ip reuse atomicity in pure openstack terms | 21:45 |
mordred | but we have a database | 21:45 |
mordred | which means we can deal with the multi-thread race condition issues with allocation flags in the db, rather than in the nova/neutron api | 21:46 |
jeblair | okay, i'd like us to make some kind of a decision here | 21:46 |
fungi | so basically, to summarize, 166383 will go ahead and delete (or attempt to delete) the errored instance and wait for it to no longer appear in the nova list... but are we necessarily able to track it at that point (e.g. has it provided an instance uuid so it can be identified)? | 21:46 |
jeblair | i was expecting to spend this afternoon working on zuulv3 specs | 21:46 |
jeblair | but i'm getting nowhere, because we're still talking about rebuilds | 21:46 |
jeblair | so, can we decide to either pursue this thing or not? | 21:46 |
mordred | fungi: no - 383 does the opposite | 21:46 |
jeblair | i don't see any way it's simple | 21:47 |
mordred | fungi: it continues to try to use a node even if it gets a 500 error - since we've learned that 500 errors are lies | 21:47 |
*** otter768 has joined #openstack-infra | 21:47 | |
clarkb | jeblair: I am with you now, we have not pursued it yet because it is complicated in various ways. Despite that we have limped along with create-delete we should be able to make that work while zuulv3 happens | 21:47 |
jeblair | i think it's at least one person working on it a while, and at least one core babysitting it for a week | 21:47 |
jeblair | if people think it's worth doing, let's knock something off the priority list and do it | 21:47 |
jeblair | or put it on the priority list backlog | 21:47 |
*** aysyd has quit IRC | 21:47 | |
fungi | mordred: ahh, yeah i forget that it can actually become usable even after a 5xx rather than simply hang around broken | 21:47 |
jeblair | but it's a big enough thing that i don't think we have any more marginal time for it | 21:48 |
mordred | I think it's maybe worth it - and I agree it's not going to be quick and easy | 21:48 |
*** doug-fish has left #openstack-infra | 21:48 | |
mordred | I say maybe because I think one of our clouds is not going to get any better any time soon, and I think that we've learned that delete calls are super expensive in openstack | 21:48 |
*** hashar has quit IRC | 21:48 | |
jeblair | mordred: so are create. do we know how expensive rebuild is? | 21:48 |
mordred | so the ability to avoid them may give us a much larger amount of bandwidth | 21:48 |
mordred | jeblair: I expect it to be the same | 21:49 |
mordred | which makes it 1/2 as expensive | 21:49 |
fungi | however, we're past feature freeze now... we might have quite a few months we can limp along with the current create/delete model | 21:49 |
jeblair | has anyone tried this and timed it on hpcloud? | 21:49 |
*** jamielennox|away is now known as jamielennox | 21:49 | |
mordred | I think timrc did some initial benchmarks, yes | 21:49 |
jeblair | fungi: i agree | 21:50 |
clarkb | timrc did, I don't have his numbers handy but they were an improvement | 21:50 |
mordred | fungi: that is a good point | 21:50 |
openstackgerrit | Merged openstack-infra/system-config: Fix jquery setup on Gerrit server. https://review.openstack.org/165145 | 21:50 |
*** timcline has quit IRC | 21:50 | |
fungi | hence i'd rather see those several months runway invested hard in zuul v3 rather than continuing to try to make incremental improvements to the current design | 21:50 |
mordred | I don't think it's urgent anymore because of FF - and I've been solidly on the "do it later" camp because adding it now will make the nodepool-shade patch larger | 21:51 |
fungi | which might rob us of the resources we need to get new zuul in time for the _next_ feature freeze | 21:51 |
*** otter768 has quit IRC | 21:51 | |
anteaya | zaro: so we have all we need for tomorrow? | 21:51 |
clarkb | mordred: and you don't expect hpcloud to correct whatever has ailed it over the last couple weeks? | 21:52 |
mordred | no | 21:52 |
mordred | or, rather | 21:52 |
clarkb | I am fairly positive this was not a problem a couple months ago | 21:52 |
mordred | I think we need to assume for planning purposes that it will not | 21:52 |
mordred | so that if it does, it will be a pleasant surprise | 21:53 |
*** enikanorov has quit IRC | 21:53 | |
jeblair | mordred, clarkb: let's do this. think about it a bit more, and talk to timrc if you want. do some experiments to see if we would actually gain anything, and if you want to do it, propose it as either a backlog priority effort, or propose that we bump a current priority effort for it at the next meeting. | 21:53 |
jeblair | mordred, clarkb: how's that sound? | 21:53 |
* timrc perks up | 21:53 | |
clarkb | jeblair: ya, we should definitely test it in the new hpcloud situation too | 21:53 |
mordred | jeblair: I think that's a great plan | 21:54 |
*** mriedem has quit IRC | 21:54 | |
clarkb | jeblair: not sure that timrc's testing captured what it is like now | 21:54 |
timrc | clarkb, I pastebin'ed my numbers on the review for 'rebuild' | 21:54 |
clarkb | timrc: yes, but you did so before we blew up hpcloud yesterday | 21:54 |
*** weshay has quit IRC | 21:54 | |
*** enikanorov has joined #openstack-infra | 21:54 | |
mordred | there is another thing - which is that I think one of us might need to test in the openstackjenkins2 tenant | 21:54 |
timrc | clarkb, Yeah... want me to rerun the script? | 21:54 |
mordred | timrc: so if you can provide a script | 21:54 |
timrc | There is a script... | 21:55 |
mordred | clarkb: I say that because I think delete time might be tied to tenant account too | 21:55 |
clarkb | mordred: gotcha | 21:55 |
*** garyh has joined #openstack-infra | 21:55 | |
timrc | Give me a second... I'm running off of what my carrier says is 4G on a beach in south Texas. | 21:55 |
mordred | I don't know that it is - but if there are database table issues, then our soft-delete database history could make a difference | 21:55 |
mordred | timrc: it's not urgent | 21:55 |
clarkb | but ya I think checking that performance is a good next step and from there we can decide if its worth the effort | 21:55 |
mordred | clarkb: ++ | 21:55 |
timrc | mordred, clarkb, jeblair: Numbers: http://paste.openstack.org/show/187334/ Script: http://paste.openstack.org/show/187333/ | 21:56 |
jeblair | okay, i'm going to get back to writing the zuulv3 spec so that it stops being an imaginary thing, and so we can get closer to actually working on it instead of blocking on me | 21:56 |
fungi | thanks jeblair! | 21:56 |
mordred | jeblair: woohoo! | 21:57 |
jeblair | that's region-a, which is another difference, yeah? | 21:57 |
fungi | mordred: on the "limping along" front, any updates on whether hpcloud west is something we should try? | 21:57 |
clarkb | jeblair: yes we are in region-b | 21:57 |
timrc | clarkb, So... when you say blew up hpcloud... what does that actually mean? I've been on vacation. | 21:57 |
SpamapS | bewm | 21:57 |
jeblair | kabloom | 21:57 |
SpamapS | as in, boom with a fratboy accent | 21:57 |
fungi | timrc: maybe you should avoid worrying about it while vacationing. seems like a waste of a good vacation | 21:57 |
mordred | timrc: go back to vacation - you don't want to know | 21:57 |
clarkb | timrc: https://community.hpcloud.com/status/incident/2944 | 21:58 |
timrc | fungi, Do you have kids? | 21:58 |
fungi | timrc: point taken ;) | 21:58 |
timrc | ;) | 21:58 |
*** gordc has quit IRC | 21:58 | |
SpamapS | it kind of makes sense.. rebuild does an update on the row a few times (for status, image id) but otherwise just happens all in nova-compute | 21:58 |
mordred | yah - also, we don't have to re-floating-ip | 21:58 |
mordred | so the number of API calls it takes is considerably less | 21:58 |
mordred | and if API call limit is one of our blockers | 21:59 |
SpamapS | create has to be scheduled | 21:59 |
mordred | then that might actually be more important - or at least an important factor | 21:59 |
*** carl_baldwin has quit IRC | 22:00 | |
SpamapS | yeah reducing API calls would be a win especially for hpcloud's current ailments | 22:00 |
fungi | agreed, testing rebuild performance in our tenant while hpcloud is failing to respond to most of our api calls might be an interesting performance datapoint | 22:00 |
SpamapS | also with HPCloud floundering, does this increment the priority a bit on infra cloud? | 22:00 |
*** armax has quit IRC | 22:00 | |
fungi | SpamapS: dunno. i haven't seen recent updates the people who were writing the infra-cloud bits, though i could have missed some | 22:01 |
fungi | er, from the people | 22:01 |
jeblair | SpamapS: i think infra cloud is already fairly high priority because of this | 22:01 |
jeblair | fungi: may be a misconception here... lemme splain | 22:01 |
jeblair | i've asked the new folks joining our team to pitch in on existing efforts because i don't want there to be an infra-cloud team which is separate from the infra team | 22:02 |
*** mrmartin has joined #openstack-infra | 22:02 | |
SpamapS | I've moved writing the initial docs changes up to the top of my priority, to be multi-plexed with nodepool and shade testing. | 22:02 |
fungi | oh! yes that's an extremely good thing | 22:02 |
*** tnovacik has quit IRC | 22:02 | |
jeblair | i think that's going pretty well so far | 22:02 |
jeblair | so i think that as SpamapS finishes up the doc he's writing... | 22:03 |
fungi | that explains the recent uptick in people getting more involved in general infra stuff, so i'm thrilled. seems to have worked out well so far | 22:03 |
jeblair | we can start to slot that effort into the priority list when one or more things wrap up | 22:03 |
*** sabeen1 has quit IRC | 22:04 | |
mordred | SpamapS: actually ... I was going to ask you if you'd inject work on nodepool-dib into your priority list | 22:04 |
fungi | yep, i missed that's how it was ramping up. i may have skimmed one of the meeting logs from when i was on vacation a little too lightly | 22:04 |
jeblair | and we'll all be working on it together, at least as much as we do anything else -- some people are going to focus on it more than others, but it'll operate more like the other things we've got going on | 22:04 |
clarkb | SpamapS: did the first step of homogenizing hardware get started? /me wonders where we are at | 22:04 |
mordred | SpamapS: because I think both of the main blocking tasks there you are exceptionally well suited to attack | 22:04 |
mordred | clarkb: no - we have done no tasks there | 22:04 |
*** Swami has quit IRC | 22:05 | |
*** jamielennox is now known as jamielennox|away | 22:05 | |
mordred | clarkb: I have run puppet on a node in each cloud region - so you have a login on them | 22:05 |
* greghaynes might also be able to help with nodepool-dib if SpamapS is spread too thin | 22:05 | |
*** garyh has quit IRC | 22:05 | |
mordred | clarkb: but they have not been cleaned in any way - pending what jeblair is discussing before | 22:05 |
SpamapS | mordred: by all means, push things onto my stack. :) | 22:05 |
jeblair | also, we expect to have a few more people joining soon too | 22:05 |
mordred | SpamapS, greghaynes: Ng and GheRivero are starting to look as well - but the two tricky tasks are: | 22:06 |
clarkb | mordred: I understood that was step 0 what is step -1? | 22:06 |
fungi | out of curiosity, and feel free to point me to existing descriptions/documentation, what sort of initial capacity are we expecting out of the current hardware? | 22:06 |
mordred | clarkb: current infra priority efforts | 22:06 |
jeblair | so i hope that happens in a time frame where they can also pitch in on non-infra-cloud things, but also be here when we really start on infra cloud | 22:06 |
mordred | SpamapS, greghaynes: get a working base image that can boot on both rackspace and hp for ubuntu and centos/fedora | 22:06 |
clarkb | mordred: right but aiui its just an internal hp ticket to have someone physically located in the same building as the hardware move some pcie cards around | 22:06 |
flashgordon | anteaya: pong, re: devstack-gate triggers grenade which is why we still should run grenade on it | 22:06 |
flashgordon | clarkb: pong, which bug? | 22:07 |
greghaynes | mordred: ah, and AIUI the issue there is just rax networking? | 22:07 |
mordred | clarkb: yes - but we need to also do some more design on networking vlans, which is an infra team task before we set that in motion | 22:07 |
clarkb | flashgordon: `nova boot` returns 502 error from api server, but then nova boots the node anyways | 22:07 |
mordred | greghaynes, SpamapS: we have a script that can handle rax networking | 22:07 |
mordred | current issue is making sure that script runs at the right time during boot | 22:07 |
SpamapS | mordred: Ah see here I thought you had that well in hand and it was about done. | 22:07 |
flashgordon | clarkb: what is the full response? | 22:07 |
clarkb | mordred: gotcha | 22:08 |
mordred | greghaynes, SpamapS: Ng has been looking at it some, but is battling rackspace london | 22:08 |
flashgordon | clarkb: very odd | 22:08 |
*** dboik has joined #openstack-infra | 22:08 | |
mordred | greghaynes, SpamapS: but, in any case, you guys know a lot about dib things too :) | 22:08 |
clarkb | mordred: have that nova 500 exception handy? | 22:08 |
*** marun has quit IRC | 22:08 | |
mordred | clarkb: one sec ... | 22:08 |
anteaya | flashgordon: okay where do I find a list of projects grenade runs on? | 22:08 |
mordred | clarkb: it's a ClientException "unknown error" fwiw | 22:09 |
mordred | SpamapS, greghaynes: the second thing is the "port nodepool to shade" task - which yolanda started and GheRivero started looking at | 22:09 |
flashgordon | anteaya: two answers, 'git grep project-name' in grenade | 22:09 |
mordred | but it's going to be a not-small patch | 22:09 |
mordred | so collaboration is likely important | 22:09 |
clarkb | flashgordon: ClientException: Unknown Error (HTTP 500) | 22:09 |
mordred | it's going to involve porting smarts from nodepool into shade in a few places | 22:10 |
flashgordon | anteaya: and http://git.openstack.org/cgit/openstack-dev/grenade/tree/ check for upgrade-* as you mentioned above | 22:10 |
fungi | clarkb: flashgordon http://paste.openstack.org/show/193961/ | 22:10 |
clarkb | we apparently don't have any 502's in the log so I was wrong about specific error | 22:10 |
fungi | 502 error actually | 22:10 |
SpamapS | mordred: writing more tests will end up being a good parallel effort to maintain while that is ongoing. | 22:10 |
anteaya | flashgordon: horizon isn't there, devstack-gate isn't there | 22:10 |
clarkb | oh two different types of Unknown Error. The best kind of Unknown | 22:10 |
anteaya | flashgordon: neutron isn't there | 22:10 |
openstackgerrit | melanie witt proposed openstack-infra/project-config: Adjust regression exceptions for Nova Cells V1 job https://review.openstack.org/166396 | 22:10 |
mordred | SpamapS: yes indeed - but it would be great to shove a facehead into the bucket of that - I'm betting it will expose some specific things that need specific testing | 22:11 |
flashgordon | upgrade-neutron | 22:11 |
greghaynes | mordred: so for the first task - is the state of things that we're all good for images in hpcloud and we've yet to get booting images in rax, or are there hpcloud issues as well? | 22:11 |
SpamapS | I keep forgetting that infra uses topics so nicely in gerrit. | 22:11 |
clarkb | greghaynes: hpcloud is good | 22:11 |
* SpamapS finds all the revies | 22:11 | |
SpamapS | reviews even | 22:11 |
flashgordon | anteaya: https://github.com/openstack-dev/grenade | 22:11 |
flashgordon | anteaya: http://paste.openstack.org/show/194083 is what I see | 22:12 |
clarkb | flashgordon: but remember I was complaining that we were leaking nodes? this is how | 22:12 |
mordred | greghaynes: it's actually mostly that centos/fedora aka systemd is weird | 22:12 |
flashgordon | devstack-gate calls grenade so the relationship is the other way around | 22:12 |
flashgordon | clarkb: ahh | 22:12 |
mordred | greghaynes: we MAY be really close to being awesome everywhere | 22:12 |
flashgordon | clarkb: that log isn't really useful hmmm | 22:12 |
greghaynes | mordred: yep, that's the whole "we need our script to run at the right time WRT networking and cloud-init" thing, yes? | 22:13 |
*** amitgandhinz has quit IRC | 22:13 | |
mordred | greghaynes: but we need to do empirical testing of the axes of (ubuntu, debian, centos, fedora) * (hpcloud, rackspace) | 22:13 |
anteaya | flashgordon: what is upgrade-infra? | 22:13 |
clarkb | flashgordon: yes well we only have nova to blame for that :) but I think the key bit is that even after a 5XX error it is possible for nova to continue scheduling a node happily | 22:13 |
greghaynes | ok | 22:13 |
mordred | greghaynes: yes - except no cloud-init | 22:13 |
SpamapS | mordred: maybe we need to make our script a little more fuzzy | 22:13 |
fungi | flashgordon: not really useful, but also all that novaclient tells us | 22:13 |
greghaynes | oh, we changed that again | 22:13 |
clarkb | mordred: wait | 22:13 |
SpamapS | as in, run it backgrounded and keep trying, as long as we block bad things from happening. | 22:13 |
clarkb | mordred: lets back up on that, we just tried no cloud-init and it failed spectacularly | 22:13 |
clarkb | mordred: I think we should use cloud-init for this reason | 22:13 |
clarkb | mordred: at least as a first stab | 22:14 |
mordred | nonononoo | 22:14 |
mordred | nononononononononononono | 22:14 |
mordred | nononononon | 22:14 |
fungi | clarkb: but remember that means installing our own non-distro-packaged cloud-init too | 22:14 |
mordred | that is totally different | 22:14 |
mordred | please don't confuse the issues | 22:14 |
greghaynes | I think hes stuck in a loop | 22:14 |
clarkb | mordred: I don't think I am, you just said no cloud init | 22:14 |
mordred | no | 22:14 |
SpamapS | reboot him | 22:14 |
clarkb | mordred: we just tried that, it broke | 22:14 |
mordred | hangon | 22:14 |
mordred | no | 22:14 |
* timrc gets popcorn | 22:15 | |
*** esker has joined #openstack-infra | 22:15 | |
mordred | it broke because we assumed that rax wasn't using cloud-init in their images | 22:15 |
mordred | they are | 22:15 |
mordred | we are not using their images | 22:15 |
flashgordon | anteaya: upgrade infra is some random stuff AFAIK | 22:15 |
mordred | in order to use cloud-init in our own images | 22:15 |
* SpamapS sends a SIGHUP | 22:15 | |
mordred | we MUST use a patched version of cloud-init | 22:15 |
anteaya | flashgordon: sigh | 22:15 |
mordred | and it all gets very complex | 22:15 |
mordred | I promise - we do not need to start over from scratch on this effort | 22:15 |
anteaya | flashgordon: okay I have to go get more sap before I burn what is on the stove | 22:15 |
SpamapS | right I believe nibalizer was working on cloud-init-in-a-virtualenv for that purpose? | 22:15 |
anteaya | flashgordon: I'll look at it again later, thanks | 22:15 |
mordred | we are an initscript away from being done | 22:15 |
mordred | noe | 22:15 |
mordred | NO seriously | 22:16 |
mordred | can we not start over from scratch | 22:16 |
SpamapS | ok for some other weird purpose. :) | 22:16 |
mordred | I don't care why he was working on it | 22:16 |
*** ddieterly has quit IRC | 22:16 | |
mordred | I don't want to keep having this argument | 22:16 |
SpamapS | I'm up for not starting over | 22:16 |
mordred | we are almost done with this | 22:16 |
SpamapS | mordred: is there somewhere I can look at the result of things not working right? | 22:16 |
mordred | it works - we're fine - we need to test it and make sure we've covered the combinations | 22:16 |
clarkb | mordred: so, can you clarify what rax does use cloud init for and why we don't need to use it for that? | 22:16 |
SpamapS | mordred: oh so it's truly at a point of needing to be reasoned about and landed, not smoke tested? | 22:17 |
flashgordon | clarkb: can you run this with novaclient debug logs on? | 22:17 |
mordred | SpamapS: yes | 22:17 |
flashgordon | otherwise I don't have enough to go on | 22:17 |
*** abramley has quit IRC | 22:17 | |
mordred | clarkb: they have designed images that depend on cloud-init - I have not dug in to why | 22:17 |
mordred | but it's irrelevant | 22:17 |
greghaynes | so, somewhat related question - mordred is there any sane way to get creds to boot some rax vms | 22:17 |
mordred | we have booted appropriate images in rackspace that do not have cloud-init and they work fine | 22:17 |
SpamapS | mordred: ok, please point me at a starting point.. just the element in project-config ? Active review? | 22:17 |
flashgordon | anteaya: no worries. so devstack has a lib/infra section | 22:18 |
mordred | greghaynes: yes - use your amex - I will approve it :) | 22:18 |
flashgordon | upgrade-infra calls that | 22:18 |
greghaynes | mordred: easy enough | 22:18 |
mordred | greghaynes: just make sure they don't put you in london | 22:18 |
mordred | and you'll have to request glance being turned on | 22:18 |
mordred | SpamapS: one sec - I'm looming | 22:18 |
mordred | looking | 22:18 |
*** esker has quit IRC | 22:19 | |
mordred | SpamapS, greghaynes: https://review.openstack.org/#/c/154132/ | 22:19 |
pleia2 | fungi: just saw your superuser.o.o interview, very nice! | 22:19 |
fungi | greghaynes: i would say sign up for the iopenedthecloud.com promotion, but they stopped offering it and took the form offline | 22:19 |
flashgordon | clarkb: not nova -- 10.5.3 502 Bad Gateway | 22:19 |
mordred | needs to be finished - and we'll want to import that repo into gerrit (the one in source-repositories) | 22:19 |
fungi | pleia2: you're welcome! ;) | 22:19 |
flashgordon | http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html | 22:19 |
mordred | but I figured just leaving it there until we're good is fine | 22:19 |
greghaynes | fungi: Yea, missed that boat :( | 22:19 |
*** esker has joined #openstack-infra | 22:19 | |
SpamapS | ahh base-elements as a topic | 22:20 |
*** abramley has joined #openstack-infra | 22:20 | |
*** arxcruz has quit IRC | 22:20 | |
clarkb | flashgordon: it happens with 500 errors too iirc | 22:20 |
clarkb | fungi: ^ did we narrow it down only to the 502? | 22:20 |
flashgordon | clarkb: and nova doesn't raise any bad gateways | 22:20 |
fungi | clarkb: i didn't see the 500 examples but i'll look for one | 22:21 |
flashgordon | clarkb: do you have a 500 error log as well? | 22:21 |
clarkb | flashgordon: its the same just s/502/500/ | 22:21 |
flashgordon | clarkb: hopefully if it's a nova bug it will have an error message | 22:21 |
clarkb | flashgordon: Unknown Error | 22:21 |
fungi | clarkb: flashgordon: i wouldn't be surprised if this is some sort of network device sitting in front of the api endpoint getting overwhelmed | 22:21 |
flashgordon | no message | 22:21 |
clarkb | fungi: ya | 22:21 |
flashgordon | fungi: yeah that is my bet too | 22:21 |
mordred | SpamapS, greghaynes: I imagine that you and Ng can probably knock it out in like, 30 minutes | 22:21 |
flashgordon | which cloud is this on ? | 22:21 |
clarkb | it's entirely possible that nova is doing what it is told while a frontend device derps | 22:21 |
SpamapS | mordred: I wonder if it would be simpler to land that script in the review, and then once we know it works, publish it as its own repo? | 22:22 |
clarkb | flashgordon: hpcloud | 22:22 |
flashgordon | clarkb:ah | 22:22 |
mordred | SpamapS: I'm fine with whatever works best for your brain | 22:22 |
greghaynes | mordred: yea, I suspect 90% of the effort is going to be just getting setup with rax properly | 22:22 |
mordred | greghaynes: yup | 22:22 |
mordred | but once you are - it'll make future testing things easier | 22:22 |
mordred | because making sure shade works in both places is important too | 22:22 |
greghaynes | yep, good point | 22:22 |
SpamapS | mordred: oh its like, a thing, with setup.cfg and stuff | 22:23 |
clarkb | fungi: iopenedthecloud ends next month too for those of us that have it iirc | 22:23 |
mordred | SpamapS: we have this cookiecutter thing ... | 22:23 |
clarkb | fungi: I need to find hosting that doesn't charge a $50 base fee | 22:23 |
SpamapS | mordred: its more about it being best for velocity. Less moving parts in the beginning. | 22:23 |
SpamapS | lurve me some single purpose well tested repos, but that this is not. ;) | 22:23 |
mordred | SpamapS: yes - TOTALLY - I say do it - we can put it back later | 22:23 |
fungi | clarkb: yeah i was never able to get them to correct my account to add that promotion so i've just been paying for about a year | 22:23 |
SpamapS | mordred: ok | 22:23 |
* mordred must run away ... | 22:24 | |
harlowja_ | clarkb hey, do u know if that virtualenv change ever happened so that https://review.openstack.org/#/c/164836/ can get rechecked? | 22:24 |
SpamapS | greghaynes: so, I suggest you start trying to use that review, and I will whip it into shape to be landed | 22:24 |
SpamapS | mordred: sounds good, we got this | 22:24 |
clarkb | harlowja_: no one has written it yet | 22:24 |
clarkb | harlowja_: feel free to | 22:24 |
harlowja_ | k | 22:24 |
fungi | clarkb: so far the 500 errors i'm finding are tripleo | 22:24 |
greghaynes | SpamapS: Yep | 22:24 |
greghaynes | mordred: Yes, all your base-elements are belong to us | 22:25 |
clarkb | harlowja_: we have been battling the cloud exploded fires | 22:25 |
harlowja_ | np | 22:25 |
*** prad has quit IRC | 22:25 | |
clarkb | harlowja_: but it should be as simple as updating the line I linked with the latest version | 22:25 |
harlowja_ | ya | 22:25 |
clarkb | harlowja_: or replacing the version specifier with ensure => latest | 22:25 |
harlowja_ | will get a review up | 22:25 |
clarkb | harlowja_: ^ is likely the change we really want since we don't care about aging virtualenv/pip/setuptools | 22:26 |
harlowja_ | ya | 22:26 |
fungi | clarkb: oh, found some http 500 errors in hpcloud but they're all for delete calls so far | 22:26 |
clarkb | fungi: so its possible that only 502s caused the leaks | 22:26 |
fungi | ooh! ClientException: Unknown Error (HTTP 503) | 22:27 |
fungi | there's another to hunt down | 22:27 |
fungi | that was also on a deletion | 22:27 |
flashgordon | clarkb: any idea of who I can switch to after the rax thing ends | 22:27 |
flashgordon | mikal: I can has free cloud? | 22:28 |
clarkb | flashgordon: no, I haven't really looked. I would stick with rax if the $50 base charge wasn't a thing | 22:28 |
flashgordon | $50 base whaaat | 22:28 |
fungi | clarkb: flashgordon: maybe https://www.runabove.com/ | 22:28 |
clarkb | https://www.arpnetworks.com/ are supposed to be good but not openstack | 22:28 |
fungi | they have pretty low rates and are supposedly basic openstack services | 22:28 |
zaro | anteaya: yep, looks like everything is in place. | 22:28 |
openstackgerrit | Joshua Harlow proposed openstack-infra/system-config: Always try to use the latest virtualenv https://review.openstack.org/166404 | 22:28 |
harlowja_ | clarkb ^ ok, let's see how that goes | 22:29 |
*** ociuhandu has joined #openstack-infra | 22:29 | |
* flashgordon wants free | 22:29 | |
clarkb | fungi: their prices are pretty good and its openstack | 22:30 |
*** YogeeBear has joined #openstack-infra | 22:31 | |
greghaynes | highly recommend arpnetworks | 22:32 |
clarkb | yes but not openstack | 22:33 |
greghaynes | :p | 22:33 |
greghaynes | yea, if you want to actually do dev and need a cloud then :( | 22:33 |
greghaynes | I need to deploy an openstack in my rack so I can do this | 22:34 |
fungi | i gave up having a small datacenter in my house when i moved to the beach | 22:34 |
fungi | so remote virtual machines are now pretty necessary for me | 22:34 |
greghaynes | fungi: I actually colo | 22:35 |
greghaynes | but yea | 22:35 |
greghaynes | there's upsides and downsides | 22:35 |
fungi | a colo near where i live would be way more expensive than what i'm doing now | 22:35 |
openstackgerrit | Joe Gordon proposed openstack-infra/project-config: Don't run neutron-large-ops on neutron advanced services https://review.openstack.org/165648 | 22:35 |
reed | fungi, when you have time, remember to pull the list of new ATCs... you can do it on Monday | 22:35 |
fungi | and "near" would be ~2-3 hours drive | 22:35 |
greghaynes | eek | 22:35 |
fungi | reed: yep, on my list for this weekend | 22:35 |
fungi | though might end up being monday | 22:36 |
greghaynes | We end up having pretty cheap colo here in pdx (although its not the best DC) and I actually just do it because my home became way too hot otherwise | 22:36 |
reed | no work over weekend, fungi | 22:36 |
clarkb | greghaynes: by not the best DC I think you mean its basically someones garage | 22:36 |
*** bknudson has quit IRC | 22:36 | |
clarkb | greghaynes: because lol shelf servers | 22:36 |
fungi | reed: what work? i'm happily enjoying retired life | 22:36 |
reed | LOL | 22:36 |
greghaynes | clarkb: haha, its actually impressive as the building and infra goes, but yea they are pretty low key on how they manage it | 22:37 |
greghaynes | clarkb: tata was going to move their HQ there then the bubble burst and the dc was just kinda overbuilt and underused | 22:37 |
fungi | greghaynes: my home datacenter had a separate air conditioning unit for exactly that reason | 22:37 |
fungi | and it was in my basement, so in the winter i'd just open the door to the rest of the house and use the computers as auxiliary heating | 22:38 |
*** YogeeBear has left #openstack-infra | 22:38 | |
greghaynes | nice! | 22:38 |
fungi | i had relay racks bolted to heavy duty shipping pallets with swivel casters underneath as my poor-man's raised floor, so i could move them around in the room as needed | 22:40 |
fungi | 3x3 grid of 500lb-rated swivel casters underneath each | 22:40 |
clarkb | runabove won't let me try their free tier without supplying a credit card | 22:40 |
clarkb | I get it but :( | 22:41 |
*** erlon_away has quit IRC | 22:41 | |
openstackgerrit | Clint 'SpamapS' Byrum proposed openstack-infra/project-config: Add config-drive element https://review.openstack.org/154132 | 22:41 |
openstackgerrit | Clint 'SpamapS' Byrum proposed openstack-infra/project-config: Add elements for Infra servers https://review.openstack.org/140840 | 22:41 |
SpamapS | greghaynes: ^ | 22:41 |
SpamapS | removed the gitorious dependency until we can get that code into its own proper stackforge (openstack??) project. | 22:42 |
fungi | clarkb: yeah, and also a friend of mine tried to sign up with it for the very small start-up he was working for, and they insisted he provide paperwork proving the company existed. it's a french parent company, so "shut up and take my money" doesn't work with them like it does with, say, amazon | 22:42 |
SpamapS | greghaynes: also what happened to your devuser patch to dib? | 22:42 |
fungi | SpamapS: gitorious? you mean gitlab right? ;) | 22:42 |
greghaynes | SpamapS: not merged yet | 22:42 |
SpamapS | fungi: sourceforge ftw? | 22:43 |
fungi | SpamapS: apparently | 22:43 |
greghaynes | SpamapS: oh, looks like they added a linter for some dib stuff that it needs to be updated for https://review.openstack.org/#/c/153439/ | 22:43 |
greghaynes | SpamapS: You going to use it? | 22:43 |
*** sdake has joined #openstack-infra | 22:43 | |
clarkb | wow and I can't boot any instances because they are reserved for paying customers currently | 22:44 |
SpamapS | greghaynes: I had a moment of wanting to just run 'kvm foo.qcow2' to test .. and no user to login as. ;) | 22:44 |
*** mriedem has joined #openstack-infra | 22:44 | |
clarkb | I wanted to see what their sandbox instances are. container vs vm etc | 22:44 |
clarkb | I am in a waiting list though so I shall wait | 22:44 |
SpamapS | clarkb: just setsockopt O_NONBLOCK .. 60% of the time it works, every time. | 22:45 |
greghaynes | SpamapS: Also, not sure if you saw: https://review.openstack.org/#/c/156433/ | 22:45 |
fungi | clarkb: so far the 503 errors are also deletions, but in rackspace | 22:45 |
clarkb | weird | 22:45 |
greghaynes | SpamapS: Im unsure if the fact that people have tested in rax means that is verified as working or if people have been building images by hand, but it would be good to test | 22:45 |
nibalizer | what was I supposedly doing? | 22:46 |
greghaynes | nibalizer: fixing everything | 22:46 |
nibalizer | well i can confirm i am not doing that | 22:46 |
nibalizer | SpamapS: ? | 22:46 |
fungi | clarkb: flashgordon: oh, here's a ClientException: Unknown Error (HTTP 500) on create in hpcloud, so we do see some with that too | 22:47 |
*** baoli has joined #openstack-infra | 22:47 | |
greghaynes | nibalizer: we were talking about why you were messing with cloud-init in venv | 22:47 |
*** sdake_ has quit IRC | 22:47 | |
greghaynes | but it's actually not relevant since we're decidedly not trying to cloud-init ATM | 22:47 |
clarkb | I think I may have discovered that if I use horizon with runabove I can boot an instance <_< | 22:47 |
clarkb | we will see if it actually successfully boots | 22:48 |
nibalizer | okay coool, yea burn cloud-init in the firepit plz | 22:48 |
nibalizer | hopefully its author isn't in this room | 22:48 |
fungi | oh, he is | 22:49 |
*** baoli has quit IRC | 22:49 | |
* nibalizer apologizes | 22:50 | |
clarkb | woot I got a node, total hacks | 22:50 |
clarkb | it looks like their $2.50/month node is a kvm vm | 22:50 |
clarkb | which is winning | 22:50 |
fungi | what specs for ram/disk? | 22:51 |
clarkb | 2GB ram 20GB disk | 22:51 |
flashgordon | fungi: do you have the full trace? | 22:51 |
clarkb | but its oversubscribed, the ~$10/month is supposedly not oversubscribed | 22:52 |
clarkb | also you have to login as admin | 22:52 |
fungi | flashgordon: sure, but the files/lines are the same as from the 502 | 22:52 |
flashgordon | fungi: :/ was hoping for some other data | 22:52 |
flashgordon | fungi: once again w/o the debug logs from novaclient ... | 22:53 |
clarkb | also they give you a real ip addr | 22:53 |
clarkb | but no ipv6 | 22:53 |
fungi | flashgordon: yep, it's exactly the same point in the client, just a (slightly) different http error code | 22:53 |
flashgordon | so 500's are generally something went really wrong | 22:54 |
flashgordon | so not too surprised that case is leaking things | 22:54 |
*** enikanorov has quit IRC | 22:55 | |
clarkb | fungi: my node seems to be in europe too | 22:55 |
fungi | clarkb: yeah, i think they have several datacenters in europe and also at least one in quebec | 22:55 |
clarkb | ya I wasn't given an option via horizon to choose a region, I should dig into that more | 22:56 |
*** enikanorov has joined #openstack-infra | 22:56 | |
flashgordon | fungi: my hunch is its a layer on top of OpenStack as well | 22:57 |
flashgordon | fungi: there may be a way to better detect what nodes nodepool started versus others | 22:57 |
*** dimsum__ has quit IRC | 22:58 | |
flashgordon | to make it easier to detect zombies | 22:58 |
*** hdd has joined #openstack-infra | 22:58 | |
clarkb | flashgordon: there is, we can write metadata on each node that states which nodepool booted the node | 22:58 |
clarkb | then if that nodepool doesn't know about that node it can delete it | 22:58 |
clarkb | or adopt it I guess | 22:58 |
anteaya | zaro: great well I'm going offline then so I can be coherent tomorrow, see those who will be there at 1500 | 22:58 |
*** andreykurilin_ has quit IRC | 22:58 | |
flashgordon | clarkb: there may be a better way | 22:59 |
*** dimtruck is now known as zz_dimtruck | 22:59 | |
*** dimsum__ has joined #openstack-infra | 22:59 | |
flashgordon | clarkb: yeah nova boot --meta | 22:59 |
flashgordon | that kind of metadata | 22:59 |
*** dimsum__ is now known as dims | 23:00 | |
*** tsg_ has quit IRC | 23:00 | |
fungi | flashgordon: yep, we definitely have an idea of how we might do that (or just ignore the 5xx errors per mordred's proposed patch) but regardless if it was likely to be a bug in nova we wanted to figure out whether we had enough details to make a useful bug report or identify if it's an already known issue | 23:01 |
clarkb | mordred: that node I was following in hpcloud went ready and was then used | 23:01 |
clarkb | mordred: and is now in the delete queue | 23:01 |
clarkb | mordred: so I think if your change works in rax (where are we on that) then we are good | 23:02 |
flashgordon | fungi: in the 5xx case do you get a instance id? | 23:02 |
flashgordon | fungi: ahh | 23:02 |
*** boris-42 has joined #openstack-infra | 23:02 | |
fungi | flashgordon: not in the api response i don't think | 23:02 |
zaro | anteaya: thanks for reminding the crew, see you tomorrow. | 23:03 |
clarkb | fungi: correct we get the 50X not a uuid | 23:03 |
clarkb | which then leads to leaking the node, mordred's hackaround is to query based on the name we told it to boot | 23:03 |
fungi | zaro: looking forward to it | 23:03 |
fungi | in good news, zuul has about finished chipping away at its waiting jobs | 23:04 |
*** thedodd has quit IRC | 23:05 | |
*** garyh has joined #openstack-infra | 23:06 | |
fungi | speaking of alien nodes, 217 at the moment | 23:06 |
fungi | jeblair: when you said slowly deleting those did you manually do so or have you been doing it continuously in a loop? if the former, i'll go ahead and do another cleanup pass while i'm thinking about it | 23:07 |
*** mtanino has quit IRC | 23:08 | |
*** sarob has quit IRC | 23:08 | |
*** ghostpl_ has joined #openstack-infra | 23:08 | |
flashgordon | clarkb: ahh that is the workaround | 23:09 |
flashgordon | fungi: so if no req-id its not a nova bug | 23:09 |
flashgordon | as a general rule of thumb | 23:09 |
clarkb | flashgordon: oh you want reqid | 23:10 |
clarkb | flashgordon: I was talking instance uuid | 23:10 |
*** wenlock has quit IRC | 23:10 | |
fungi | i went ahead and started deleting the current alien nodes | 23:11 |
flashgordon | clarkb: err I meant instance uuid | 23:14 |
flashgordon | well really either | 23:14 |
*** ddieterly has joined #openstack-infra | 23:14 | |
flashgordon | if you get neither it isn't a nova thing | 23:14 |
jeblair | fungi: i had 2 processes going through them; they are finished now (sorry i don't know the completion time, but it was probably within the last hour) | 23:15 |
*** esker has quit IRC | 23:15 | |
*** garyh has quit IRC | 23:15 | |
fungi | jeblair: cool, well i've got a serialized pass going now | 23:16 |
jeblair | fungi: good, that should reduce the load and decrease the chance of timeouts on our side (which cause more alien nodes) | 23:17 |
fungi | so since you started your pass, we accumulated more than 200 additional | 23:17 |
*** sarob has joined #openstack-infra | 23:17 | |
*** Bsony has quit IRC | 23:18 | |
fungi | but demand is now down to the point where i don't think we're going to continue accumulating many from here through the weekend | 23:18 |
jeblair | fungi: yeah. my gut is we can ascribe some of them to my additional activity (especially since i started with 10 of them in parallel before i realized the effect). but not all of them. perhaps i would discount that by 50-100. so still a serious problem. | 23:18 |
*** esker has joined #openstack-infra | 23:18 | |
fungi | it's also about time to wind down here and do some friday night things, but i'll keep an eye on irc in case something goes horribly, horribly wrong | 23:19 |
*** ibiris is now known as ibiris_away | 23:21 | |
pleia2 | fungi: enjoy, see you in the morning | 23:21 |
fungi | absolutely | 23:22 |
*** enikanorov has quit IRC | 23:22 | |
*** enikanorov has joined #openstack-infra | 23:23 | |
*** achanda has quit IRC | 23:24 | |
SpamapS | greghaynes: reviewed your compress_and_save thing | 23:24 |
*** ociuhandu has quit IRC | 23:25 | |
*** achanda has joined #openstack-infra | 23:26 | |
*** mrmartin has quit IRC | 23:27 | |
greghaynes | hrm? | 23:29 |
*** unicell has quit IRC | 23:29 | |
* greghaynes has too many things lying around | 23:29 | |
*** unicell has joined #openstack-infra | 23:29 | |
greghaynes | the VHD one? | 23:29 |
*** tkelsey has joined #openstack-infra | 23:31 | |
SpamapS | greghaynes: hah sorry I meant VHD | 23:32 |
SpamapS | greghaynes: but I said compress_and_save because thats what I made a comment on | 23:32 |
*** tsg has joined #openstack-infra | 23:34 | |
pleia2 | doh, I think the toggle ci button broke on production review.o.o | 23:35 |
*** tkelsey has quit IRC | 23:35 | |
SpamapS | hopefully it broke and always shows CI because I hate that CI is hidden by default now. ;) | 23:36 |
pleia2 | yeah, and jenkins results aren't up at the top with our votes | 23:36 |
SpamapS | oh well I like that. ;) | 23:37 |
openstackgerrit | James E. Blair proposed openstack-infra/infra-specs: WIP: Add Zuul v3 spec. https://review.openstack.org/164371 | 23:37 |
SpamapS | (the results at the top) :-p | 23:37 |
*** unicell has quit IRC | 23:37 | |
jeblair | (still not done, but a little more defined) | 23:37 |
*** unicell has joined #openstack-infra | 23:37 | |
*** tonytan4ever has quit IRC | 23:37 | |
jeblair | pleia2: did merging that change result in a broken symlink? | 23:38 |
*** esker has quit IRC | 23:38 | |
pleia2 | jeblair: oddly not, we've still got jquery.js and jquery.min.js with old timestamps and living out their lives as separate files | 23:39 |
pleia2 | oh, it's the static one from /home I should be looking at | 23:39 |
jeblair | SpamapS: well, the idea is that only the latest ci is shown. so it should never be hidden, but there's no need to wade through 100 auto generated messages | 23:39 |
jeblair | pleia2: ah, one idea is that gerrit may need to be restarted -- it's got something weird going on with the hash of the file that i'm not sure we fully understand | 23:40 |
SpamapS | jeblair: I think I would like that better if I worked on some of the projects with 100's of auto generated CI results. :) | 23:40 |
pleia2 | -rw-r--r-- 1 root root 243K Mar 20 22:14 jquery.js | 23:40 |
pleia2 | so that changed | 23:40 |
jeblair | pleia2: want to restart gerrit and see if it fixes it? | 23:41 |
pleia2 | it's also broken on review-dev and our new server | 23:41 |
*** harlowja_ has quit IRC | 23:41 | |
openstackgerrit | Clint 'SpamapS' Byrum proposed openstack-infra/project-config: Add config-drive element https://review.openstack.org/154132 | 23:41 |
jeblair | pleia2: do we have gerrit running on our new server? | 23:41 |
*** markvoelker has quit IRC | 23:42 | |
pleia2 | oh, no, it just redirects | 23:42 |
*** harlowja has joined #openstack-infra | 23:42 | |
pleia2 | jeblair: we can try a restart, I'll need some help there though | 23:43 |
pleia2 | maybe on review-dev first? but I don't know if review-dev has any special weirdness | 23:44 |
*** che-arne has quit IRC | 23:45 | |
*** emagana has joined #openstack-infra | 23:45 | |
pleia2 | note to self: don't find these things at 16:30 on friday | 23:46 |
jeblair | :) | 23:46 |
jeblair | yeah, why don't you try review-dev first | 23:46 |
jeblair | pleia2: nothing tricky about gerrit restarts; /etc/init.d/gerrit restart should do it | 23:46 |
*** ajmiller has quit IRC | 23:46 | |
jeblair | i'm here as backup | 23:46 |
pleia2 | do we use init.d or service gerrit restart? | 23:47 |
jeblair | pleia2: i suppose service is more correct; i think init.d works tho | 23:47 |
pleia2 | I'll do init.d today, here goes on review-dev | 23:47 |
SpamapS | jeblair: do you only wear your grey beard when you're on IRC, or sometimes at the market too? ;-) | 23:47 |
* SpamapS secretly wishes upstart and systemd had never been invented and /etc/init.d/ was still "the way" | 23:48 | |
jeblair | SpamapS: i stop people at the market and tell them what i think about systemd | 23:48 |
pleia2 | still broken :( | 23:48 |
jeblair | SpamapS: which basically means i blend in perfectly in berkeley | 23:48 |
SpamapS | "Hello, do you have a few minutes to discuss pid 1?" | 23:48 |
pleia2 | no wait, I think it's ok! | 23:49 |
pleia2 | https://review-dev.openstack.org/#/c/5270/ | 23:49 |
SpamapS | "Have you considered what will happen to your zombie processes when you die?" | 23:49 |
jeblair | pleia2: lgtm! | 23:49 |
jeblair | pleia2: so the only advice i'd give about a prod gerrit restart is don't do it right before zuul is about to merge a change | 23:50 |
pleia2 | so, restarting real gerrit, anything special to do re: telling people or anything? | 23:50 |
pleia2 | ah | 23:50 |
jeblair | pleia2: current top of the queue is 11 mins out so you should be fine | 23:50 |
pleia2 | ok, I'm going to do it now then | 23:50 |
jeblair | sounds good | 23:50 |
*** pc_m has quit IRC | 23:51 | |
SpamapS | hah, distributed systems are hard. | 23:51 |
jeblair | if we were to do this in the middle of the day, i might consider a statusbot notice, but no one is around except SpamapS so. :) | 23:51 |
pleia2 | hehe | 23:52 |
SpamapS | jeblair: that guy wouldn't know what to do with statusbot notices anyway | 23:52 |
pleia2 | alright, back up, let's see | 23:52 |
pleia2 | all better! | 23:52 |
pleia2 | thanks jeblair | 23:52 |
jeblair | yay! | 23:52 |
jeblair | as a bonus, gerrit will be nice and speedy until we shut it down again tomorrow. :) | 23:52 |
pleia2 | so javascript changes require a gerrit restart | 23:52 |
pleia2 | makes sense (what) | 23:52 |
jeblair | pleia2: right? :) | 23:52 |
jeblair | i thought that touching the site include file was supposed to avoid that, but i'm not certain we fully understand what's going on | 23:53 |
* pleia2 nods | 23:53 | |
jeblair | and restarting is faster than figuring it out. | 23:53 |
SpamapS | pleia2: re: "sense" http://www.quickmeme.com/img/a5/a5fd9f50473ea78ab4a5668771803996dfaebe931facffc060a9c530337dc7e7.jpg | 23:53 |
pleia2 | SpamapS: ++ | 23:54 |
jeblair | what a nice way to end the day | 23:54 |
SpamapS | some day.. I'll figure out why downloading an image from cloud-images.ubuntu.com on my home connection tops out at 1Mbit | 23:54 |
SpamapS | but if I download it to an hpcloud instance, and then to my home box, 40Mbit all the way :-P | 23:54 |
SpamapS | which at least is effectively 20x faster but 100x more annoying. | 23:55 |
*** dannywilson has quit IRC | 23:56 | |
*** gyee has quit IRC | 23:59 |
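
A minimal sketch of the restart-timing advice jeblair gives above (don't restart production gerrit right before zuul is about to merge a change). The function name, the two-minute downtime estimate, and the safety margin are all assumptions for illustration; the downtime figure is only a rough reading of the timestamps in this log, and nothing here queries zuul or gerrit.

```python
from datetime import timedelta


def safe_to_restart(time_to_next_merge,
                    expected_downtime=timedelta(minutes=2),
                    margin=timedelta(minutes=3)):
    """Return True if a restart should finish well before the next merge.

    time_to_next_merge: operator's estimate of when the change at the top
    of the queue will merge (hypothetical input, read off the status page).
    expected_downtime and margin are assumed values, not measured ones.
    """
    return time_to_next_merge > expected_downtime + margin


# The situation in the log: top of the queue is about 11 minutes out.
print(safe_to_restart(timedelta(minutes=11)))
```

With the assumed defaults, an 11-minute window passes the check, while a window of five minutes or less would not.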