*** sweston has quit IRC | 00:04 | |
*** lcostantino has quit IRC | 00:05 | |
*** julim has quit IRC | 00:05 | |
openstackgerrit | A change was merged to openstack-dev/hacking: Fixed warning H302 when used with six.moves https://review.openstack.org/99467 | 00:05 |
---|---|---|
*** mmaglana has quit IRC | 00:05 | |
*** oomichi_sleeping is now known as oomichi | 00:09 | |
*** dims__ has quit IRC | 00:11 | |
*** dims has joined #openstack-infra | 00:11 | |
*** hemna is now known as hemna_ | 00:12 | |
openstackgerrit | Michael Krotscheck proposed a change to openstack-infra/storyboard-webclient: Error message & notification handling https://review.openstack.org/99515 | 00:13 |
ianw | sdague: sorry, another go at https://review.openstack.org/#/c/99047/ ... i got reports dbus restart was killing gnome. i tested and restarting just firewalld is sufficient on rackspace images | 00:16 |
sdague | why in gods name are people running gnome on their devstack envs? | 00:17 |
ianw | sdague: yeah... maybe you can forgive me for not noticing that failure case :) | 00:18 |
*** dkehn_ has joined #openstack-infra | 00:18 | |
*** dkehn_ is now known as dkehnx | 00:19 | |
openstackgerrit | Michael Krotscheck proposed a change to openstack-infra/storyboard-webclient: Error message & notification handling https://review.openstack.org/99515 | 00:20 |
mordred | sdague: because, you know .. | 00:20 |
mordred | sdague: X11aaS | 00:20 |
mordred | also known as ... X11 | 00:20 |
*** dims has quit IRC | 00:21 | |
*** _nadya_ has quit IRC | 00:25 | |
*** zhiyan_ is now known as zhiyan | 00:27 | |
*** homeless has quit IRC | 00:27 | |
*** matsuhashi has joined #openstack-infra | 00:27 | |
*** dims has joined #openstack-infra | 00:27 | |
*** praneshp has quit IRC | 00:28 | |
*** nosnos has joined #openstack-infra | 00:32 | |
*** HenryG_ has joined #openstack-infra | 00:33 | |
*** yamahata has joined #openstack-infra | 00:33 | |
SpamapS | sdague: didn't you mean "why in gods name are people running gnome" ? | 00:33 |
*** thuc has joined #openstack-infra | 00:34 | |
*** thuc has quit IRC | 00:36 | |
*** thuc has joined #openstack-infra | 00:37 | |
*** jhesketh has quit IRC | 00:37 | |
*** thuc has quit IRC | 00:37 | |
*** thuc has joined #openstack-infra | 00:38 | |
*** thuc has quit IRC | 00:38 | |
*** thuc has joined #openstack-infra | 00:39 | |
*** richm has left #openstack-infra | 00:39 | |
*** thuc_ has joined #openstack-infra | 00:40 | |
*** HenryG_ has quit IRC | 00:42 | |
*** thuc has quit IRC | 00:44 | |
*** asettle has quit IRC | 00:45 | |
*** jhesketh has joined #openstack-infra | 00:46 | |
*** ildikov_ has joined #openstack-infra | 00:48 | |
fungi | cde with motif should be enough for anybody | 00:49 |
*** ildikov has quit IRC | 00:51 | |
*** mfer has joined #openstack-infra | 00:52 | |
*** _nadya_ has joined #openstack-infra | 00:52 | |
*** CaptTofu_ has quit IRC | 01:01 | |
*** CaptTofu_ has joined #openstack-infra | 01:02 | |
*** yaguang has joined #openstack-infra | 01:02 | |
*** chianingwang has quit IRC | 01:02 | |
mordred | SpamapS: I'm running gnome:old-stable myself | 01:03 |
mordred | aka mate | 01:04 |
mordred | clock-applet FTW! | 01:04 |
*** chianingwang has joined #openstack-infra | 01:04 | |
jogo | sdague: re INFO logs too bad, because that would drop the data in ES way down. | 01:04 |
jogo | sdague: re heat test, do you have a good scenario to test? | 01:04 |
jogo | stevebaker fungi: can we prioritize https://review.openstack.org/#/c/99517/ in the gate | 01:05 |
jogo | mordred: ^ | 01:05 |
*** CaptTofu_ has quit IRC | 01:06 | |
jogo | Alex_Gaynor: want hacking 0.9.2 now? or think you can find another bug first? | 01:07 |
Alex_Gaynor | jogo: heh, a release would be awesome, I could delete a few crufty noqa's | 01:07 |
fungi | jogo: looks like it's got ~20 minutes until it passes check tests, but if it's contributing significantly to gate slowness i can promote it to the front once it finishes check jobs | 01:08 |
jogo | fungi: its the top of the http://status.openstack.org/elastic-recheck/ | 01:08 |
jogo | that would be great | 01:09 |
jogo | Alex_Gaynor: done, its working its way through release queue. When its done want the honors of sending out an email on the hacking 0.9 thread about the fix | 01:12 |
Alex_Gaynor | jogo: not especially :-) | 01:12 |
jogo | Alex_Gaynor: I'll send one out | 01:14 |
Alex_Gaynor | jogo: thanks | 01:14 |
*** gokrokve has joined #openstack-infra | 01:14 | |
sdague | jogo: look at the existing tests that are failing which keep the heat job from voting, that would be a good start | 01:17 |
*** alkari has quit IRC | 01:17 | |
sdague | I think we need to be able to consistently create and delete a single stack before going the large ops route | 01:17 |
sdague | jogo: if we didn't log API requests, how would you *ever* figure out what's going on :) | 01:18 |
sdague | ianw: +2 | 01:18 |
*** thuc_ has quit IRC | 01:20 | |
sdague | mordred: you do realize that the calendar-indicator in Ubuntu does exactly that timezone switch thing. Though you'd have to be running unity. | 01:20 |
* sdague ducsk | 01:20 | |
sdague | or ducks even | 01:20 |
*** thuc has joined #openstack-infra | 01:20 | |
morganfainberg | oh.. i just realized... | 01:21 |
morganfainberg | exchange or whatever can do UTC meetings... can someone explain to me why gmail doesn't let you set a meeting for UTC timezone? :P | 01:21 |
ianw | sdague: thanks! | 01:22 |
tchaypo | morganfainberg: you really want me to launch into my rant about gmail's handling of timezones? | 01:22 |
tchaypo | how about I just rant about the fact that it says "Sydney, Melbourne (GMT+10)" | 01:23 |
morganfainberg | tchaypo, lol | 01:23 |
tchaypo | but Sydney and Melbourne are GMT+11 for half the year, so does that mean the appointment is in Sydney time or in GMT+10? | 01:23 |
mordred | sdague: the calendar-indicator in Ubuntu does not do the same things | 01:24 |
*** thuc has quit IRC | 01:24 | |
tchaypo | FWIW they have a timezone that's "(GMT+00:00) GMT (no daylight saving)" | 01:24 |
tchaypo | I *think* that's the same thing as UTC, but it makes no sense, as GMT doesn't ever have daylight saving | 01:25 |
SpamapS | mordred: you're still clinging to your clock applet? lulz | 01:25 |
morganfainberg | tchaypo, they do? huh, i thought the GMT one they had did DST | 01:25 |
morganfainberg | tchaypo, oh the iceland one thats right, GMT no DST, but thats a hack | 01:25 |
tchaypo | There's a "GMT+0 London", which will presumably have DST | 01:25 |
*** sarob has quit IRC | 01:25 | |
morganfainberg | right | 01:26 |
mordred | SpamapS: clinging to - hell, I just got it BACK after being without it for 2 years because UI got taken over by people who clearly do not actually use linux | 01:26 |
morganfainberg | i don't want DST, IRC meetings for OS | 01:26 |
*** sarob has joined #openstack-infra | 01:26 | |
tchaypo | In my list I've got Reykjavik, St Helena, GMT (No daylight saving), Dublin, Lisbon | 01:26 |
*** esker has joined #openstack-infra | 01:26 | |
SpamapS | mordred: I've grown accustomed to the unity clock applet.. which is lacking only the map. :-P | 01:26 |
mordred | and the weather/temperature. and the timezone indicators. and the easy timezone switching between favorites | 01:27 |
*** gyee has quit IRC | 01:28 | |
mordred | SpamapS: http://imgur.com/5vgSM3P | 01:29 |
*** gokrokve has quit IRC | 01:30 | |
*** nati_ueno has quit IRC | 01:30 | |
mordred | SpamapS: there is a lovely and useful pile of information in that display. now, is it all necessarily related to _time_? No. but sometimes purity of interface needs to DIAF in favor of shit that works well | 01:30 |
*** sarob_ has joined #openstack-infra | 01:30 | |
*** gokrokve has joined #openstack-infra | 01:30 | |
*** sarob has quit IRC | 01:30 | |
*** zehicle_in_sfo has quit IRC | 01:30 | |
morganfainberg | mordred, lol | 01:31 |
mordred | in unity/gnome3 I can also add a weather applet, but then if I change locations, I need to change locations in two places to get it to show me things in the menu bar, which is silly | 01:31 |
mordred | also, none of those things give me the quick graphical indicator of "are people asleep in that location" - which is helpful | 01:32 |
mordred | (I can go on about this for a while if anyone wants more) | 01:32 |
sdague | :) | 01:33 |
* mordred needs to figure out how to keep the Mate guys funded... | 01:34 | |
sdague | damn, now you are nearly inspiring me to write it, mostly for the map visualization | 01:34 |
openstackgerrit | Joshua Hesketh proposed a change to openstack-infra/config: Fetch graphitejs zuul dependency https://review.openstack.org/98018 | 01:34 |
openstackgerrit | Joshua Hesketh proposed a change to openstack-infra/config: Use the latest jquery on zuul https://review.openstack.org/98029 | 01:34 |
*** gokrokve has quit IRC | 01:35 | |
sdague | night all | 01:35 |
*** yjiang has joined #openstack-infra | 01:36 | |
mordred | night sdague | 01:36 |
openstackgerrit | Alex Gaynor proposed a change to openstack-dev/hacking: Mark hacking as being a universal wheel https://review.openstack.org/99528 | 01:36 |
*** trinaths has joined #openstack-infra | 01:38 | |
dims | jogo, Didn't you mean "0.9.2 has just been released"? | 01:38 |
jogo | dims: I reused the old thread, but yes | 01:39 |
dims | ack. thx | 01:39 |
*** zns has quit IRC | 01:39 | |
fungi | reset second from the front of the gate, so i'll promote 99517,3 to the front as soon as it reports | 01:42 |
*** rwsu has quit IRC | 01:42 | |
*** chianingwang_ has joined #openstack-infra | 01:44 | |
*** Ryan_Lane1 has quit IRC | 01:44 | |
jogo | fungi: thanks | 01:47 |
*** _nadya_ has quit IRC | 01:50 | |
*** trinaths has quit IRC | 01:51 | |
openstackgerrit | A change was merged to openstack-infra/jenkins-job-builder: re-arrange docs for clarity https://review.openstack.org/98467 | 01:52 |
*** alexandra_ has joined #openstack-infra | 01:53 | |
*** alexandra_ is now known as asettle | 01:53 | |
tchaypo | so I'm trying to write an email to the tripleo-team asking people to spend more time reviewing as our backlog is getting bigger | 01:53 |
tchaypo | someone suggested yesterday that another email with tips like "if it just needs a rebase, please rebase it" went to openstack-dev, but I can't find it | 01:54 |
tchaypo | does anyone here have any memory of such an email? | 01:54 |
morganfainberg | tchaypo, well i can tell you keystone has success with that type of stuff | 01:55 |
*** thomasbiege1 has joined #openstack-infra | 01:55 | |
morganfainberg | tchaypo, often someone will help with rebases, if there is a minor nit (otherwise good) upload a fix.. mostly "help get things through" | 01:55 |
mordred | yah. around here we'll also single-approve trivial things too | 01:55 |
mordred | you know - spelling error in a README? single approve | 01:55 |
mordred | change in zuul's internal algorithms? double approve and probably lots of prayer | 01:55 |
morganfainberg | mordred, oh that is a good idea. | 01:56 |
mordred | it's a thing to be careful with - and if there is even the slightest shred of doubt, I default back to double | 01:56 |
morganfainberg | mordred, yeah. | 01:57 |
morganfainberg | mordred, true - i have a hard enough time convincing people to single approve translation imports - because we've had 2 approvers ground in too much | 01:57 |
*** thomasbiege has quit IRC | 01:58 | |
morganfainberg | i'd rather have that issue than the inverse though. | 01:58 |
mordred | yup | 01:58 |
*** sarob_ has quit IRC | 02:02 | |
*** sarob has joined #openstack-infra | 02:03 | |
*** sweston has joined #openstack-infra | 02:03 | |
*** sarob has quit IRC | 02:07 | |
*** masayukig has quit IRC | 02:10 | |
*** masayukig has joined #openstack-infra | 02:12 | |
*** _nadya_ has joined #openstack-infra | 02:17 | |
*** chianingwang_ has quit IRC | 02:18 | |
*** cp16net has joined #openstack-infra | 02:19 | |
*** otter768 has joined #openstack-infra | 02:21 | |
*** CaptTofu_ has joined #openstack-infra | 02:23 | |
*** penguinRaider has joined #openstack-infra | 02:24 | |
*** sarob has joined #openstack-infra | 02:25 | |
*** CaptTofu_ has quit IRC | 02:28 | |
*** mmaglana has joined #openstack-infra | 02:28 | |
*** arnaud__ has quit IRC | 02:29 | |
*** thuc has joined #openstack-infra | 02:31 | |
*** markmcclain has quit IRC | 02:31 | |
*** penguinRaider has quit IRC | 02:34 | |
*** alkari has joined #openstack-infra | 02:34 | |
*** mrodden1 has joined #openstack-infra | 02:35 | |
*** mrodden has quit IRC | 02:35 | |
*** thuc has quit IRC | 02:36 | |
*** CaptTofu_ has joined #openstack-infra | 02:38 | |
*** mfer has quit IRC | 02:40 | |
*** dims has quit IRC | 02:40 | |
*** fanhe has joined #openstack-infra | 02:51 | |
*** sarob has quit IRC | 02:57 | |
*** chianingwang_ has joined #openstack-infra | 03:01 | |
*** sarob has joined #openstack-infra | 03:02 | |
openstackgerrit | Joshua Harlow proposed a change to openstack/requirements: Bump up to the six 1.7.x series https://review.openstack.org/99556 | 03:03 |
*** gokrokve has joined #openstack-infra | 03:05 | |
*** dims_ has joined #openstack-infra | 03:07 | |
*** otter768 has quit IRC | 03:10 | |
*** dims_ has quit IRC | 03:12 | |
*** praneshp has joined #openstack-infra | 03:18 | |
*** Longgeek has joined #openstack-infra | 03:21 | |
*** chianingwang_ has quit IRC | 03:21 | |
*** sarob has quit IRC | 03:26 | |
*** sarob has joined #openstack-infra | 03:26 | |
*** CaptTofu_ has quit IRC | 03:26 | |
*** CaptTofu_ has joined #openstack-infra | 03:27 | |
*** chianingwang has quit IRC | 03:27 | |
*** zhiyan is now known as zhiyan_ | 03:30 | |
lifeless | clarkb: ping me if/when you want more dib kibbitzing | 03:30 |
lifeless | clarkb: I'm keen to get you a good answer to the problem | 03:30 |
*** sarob has quit IRC | 03:30 | |
*** CaptTofu_ has quit IRC | 03:31 | |
*** talluri has joined #openstack-infra | 03:36 | |
*** arnaud__ has joined #openstack-infra | 03:36 | |
*** praneshp_ has joined #openstack-infra | 03:39 | |
*** pcrews has quit IRC | 03:40 | |
*** zhiyan_ is now known as zhiyan | 03:41 | |
*** praneshp has quit IRC | 03:41 | |
*** praneshp_ is now known as praneshp | 03:41 | |
*** nosnos has quit IRC | 03:44 | |
*** thuc has joined #openstack-infra | 03:45 | |
*** thuc has quit IRC | 03:49 | |
*** thuc has joined #openstack-infra | 03:50 | |
*** zns has joined #openstack-infra | 03:51 | |
*** thuc_ has joined #openstack-infra | 03:51 | |
lifeless | hmm, gertty also needs to linewrap the text lines not just comments :( | 03:52 |
lifeless | clarkb: nuts, the expand-next-N lines stuff doesn't show up in the commit messages atm; not sure why :( | 03:54 |
lifeless | clarkb: or... its a bug with end-of-file diffs in gertty. Actually that seems more likely. | 03:55 |
*** thuc has quit IRC | 03:55 | |
*** amcrn has joined #openstack-infra | 03:58 | |
*** zns has quit IRC | 03:59 | |
*** zns has joined #openstack-infra | 04:00 | |
*** yfried has quit IRC | 04:08 | |
*** matsuhashi has quit IRC | 04:11 | |
*** arnaud__ has quit IRC | 04:12 | |
openstackgerrit | lifeless proposed a change to stackforge/gertty: Don't crash on comments on unchanged files https://review.openstack.org/99563 | 04:16 |
*** harlowja is now known as harlowja_away | 04:16 | |
*** rohitk has joined #openstack-infra | 04:19 | |
*** thuc_ has quit IRC | 04:20 | |
*** thuc has joined #openstack-infra | 04:21 | |
*** arnaud__ has joined #openstack-infra | 04:21 | |
*** rohitk has quit IRC | 04:25 | |
*** thuc has quit IRC | 04:26 | |
*** sarob has joined #openstack-infra | 04:27 | |
jogo | so I am submitting a talk on hacking and looking for a good title | 04:28 |
jogo | was thinking 'Bikeshedding OpenStack, or why Style Guides Matter' | 04:29 |
kashyap | Maybe just leave the "Bikeshedding" aspect and just stick to the why :-) | 04:30 |
jogo | kashyap: that is the why | 04:30 |
jogo | well part of it | 04:30 |
*** sarob has quit IRC | 04:31 | |
*** CaptTofu_ has joined #openstack-infra | 04:31 | |
jogo | lifeless: I know you always have opinions ^ | 04:31 |
lifeless | OPINIONS | 04:31 |
lifeless | jogo: Making a project 500 bugs in one small release? | 04:32 |
lifeless | jogo: as a subtitle | 04:32 |
lifeless | so actually | 04:32 |
jogo | haha | 04:33 |
lifeless | my opinion here is that style guides are inferior to automation | 04:33 |
lifeless | pep8 is automating whinging | 04:33 |
mikal | jogo: you never came back to chat to me | 04:33 |
mikal | jogo: so lame | 04:33 |
jogo | so I don't disagree with that statement, but step one is have a style guide step 2 is automate fixing it | 04:33 |
lifeless | it does some, but fairly little, towards improving product quality and velocity. [I know different folk have different opinions here]. | 04:33 |
* jogo hides from mikal | 04:33 | |
lifeless | jogo: I think step one is to have an automated but nearly empty style guide, and step 2 is to increase it. | 04:34 |
Alex_Gaynor | lifeless: go fmt is a million times better than flake8 (apologies to all) | 04:34 |
lifeless | jogo: what we have is an ever increasing step 1 and no step 2. | 04:34 |
lifeless | Alex_Gaynor: exactly my point. | 04:34 |
lifeless | Alex_Gaynor: I experimented with one of the pep8 autoformatters on LP | 04:34 |
lifeless | Alex_Gaynor: I think it was a couple weeks work away from being usable. | 04:34 |
jogo | Alex_Gaynor: yeah go fmt is awesome | 04:35 |
lifeless | jogo: the more hacking adds the *higher* the effort to get to 'go fmt'. | 04:35 |
Alex_Gaynor | lifeless: the problem is there are still lots of python people who believe "A foolish consistency is the hobgoblin of little minds", the automation doesn't work if not /every/ file participates | 04:35 |
tchaypo | thanks mordred, morganfainberg | 04:35 |
StevenK | lifeless: I actually miss utilities/format-new-and-modified-imports from LP | 04:35 |
*** SumitNaiksatam has quit IRC | 04:35 | |
jogo | Alex_Gaynor: good point | 04:35 |
lifeless | Alex_Gaynor: right. I can't buy into 'make me work harder on trivia'. I can totally buy into 'I am not allowed to care about $aesthetic because the daemon will rewrite it for me' | 04:35 |
*** CaptTofu_ has quit IRC | 04:35 | |
jogo | lifeless: you ever use autopep8? | 04:36 |
lifeless | let me dig up my post on this | 04:36 |
*** SumitNaiksatam has joined #openstack-infra | 04:36 | |
*** zns has quit IRC | 04:36 | |
*** sarob has joined #openstack-infra | 04:39 | |
jogo | Alex_Gaynor: you make a good point about the a foolish consistency | 04:39 |
Alex_Gaynor | jogo: I believe the opposite to be clear, I think a lot of people are hung up on that | 04:39 |
jogo | Alex_Gaynor: I guess a big part of hacking is we are going for a foolish consistency | 04:39 |
*** zns has joined #openstack-infra | 04:39 | |
Alex_Gaynor | +1 :-) | 04:39 |
jogo | so maybe the title can be: "Style Guides: Why Foolish Consistency Matters" | 04:40 |
fungi | jogo: robotic hobgoblins | 04:41 |
jogo | fungi: how would you work that into the title? maybe I can use that in the abstract | 04:42 |
fungi | no idea. the image was merely compelling | 04:43 |
*** yfried has joined #openstack-infra | 04:43 | |
*** sarob has quit IRC | 04:43 | |
lifeless | jogo: https://lists.launchpad.net/launchpad-dev/msg07330.html | 04:44 |
lifeless | jogo: https://lists.launchpad.net/launchpad-dev/msg07338.html | 04:44 |
lifeless | now I was sure I did an experiment | 04:45 |
*** matsuhashi has joined #openstack-infra | 04:48 | |
lifeless | https://lists.launchpad.net/launchpad-dev/msg07340.html ... | 04:48 |
* StevenK stabs paste.o.o for giving ISEs | 04:48 | |
lifeless | jogo: https://lists.launchpad.net/launchpad-dev/msg07681.html - seems to be it | 04:48 |
lifeless | jogo: so tl;dr - PythonTidy was decent and hackable | 04:48 |
*** nosnos has joined #openstack-infra | 04:48 | |
lifeless | jogo: I would deeply deeply deeply love to see *that* hacked on and hacking frozen | 04:49 |
lifeless | jogo: because PythonTidy makes everyones life better, not more painful. | 04:49 |
*** trinaths has joined #openstack-infra | 04:50 | |
jogo | PythonTidy is very old but good to know | 04:50 |
*** markmcclain has joined #openstack-infra | 04:50 | |
jogo | lifeless: FWIW I generally am against any additions to hacking.rst at this point | 04:51 |
openstackgerrit | A change was merged to openstack/requirements: Updated taskflow now that 0.3.x is released https://review.openstack.org/99188 | 04:51 |
*** gokrokve has quit IRC | 04:53 | |
lifeless | jogo: hacking the project I meant, not the rst file :) | 04:53 |
jogo | lifeless: well the rst file is in the project but I know what you mean | 04:54 |
*** esker has quit IRC | 04:54 | |
*** michchap_ has quit IRC | 04:55 | |
*** michchap has joined #openstack-infra | 04:55 | |
openstackgerrit | Joe Gordon proposed a change to openstack-infra/config: Don't notify nova's IRC room on patch creation https://review.openstack.org/99566 | 04:56 |
*** lcheng has joined #openstack-infra | 04:57 | |
*** mmaglana has quit IRC | 04:58 | |
*** mmaglana has joined #openstack-infra | 04:59 | |
*** yfried_ has joined #openstack-infra | 05:01 | |
*** yfried has quit IRC | 05:02 | |
tchaypo | Do we have an openstack link shortener? | 05:02 |
tchaypo | using bit.ly makes me feel dirty | 05:02 |
*** mmaglana has quit IRC | 05:03 | |
lifeless | jogo: so anyhow, I hope my opinions were useful | 05:03 |
lifeless | jogo: I realise I sidetracked you -sorry | 05:04 |
jogo | lifeless: they were useful a little side tracked but useful | 05:06 |
lifeless | jogo: cool | 05:07 |
*** lcheng has quit IRC | 05:11 | |
*** zns has quit IRC | 05:16 | |
*** zns has joined #openstack-infra | 05:16 | |
trinaths | fungi: hello | 05:20 |
*** Longgeek_ has joined #openstack-infra | 05:21 | |
*** salv-orlando has joined #openstack-infra | 05:23 | |
*** Longgeek has quit IRC | 05:24 | |
*** markwash has quit IRC | 05:27 | |
*** thuc has joined #openstack-infra | 05:31 | |
*** markwash has joined #openstack-infra | 05:31 | |
*** salv-orlando has quit IRC | 05:33 | |
*** thuc has quit IRC | 05:36 | |
*** sarob has joined #openstack-infra | 05:39 | |
*** zehicle_at_dell has joined #openstack-infra | 05:41 | |
*** rdopieralski has joined #openstack-infra | 05:42 | |
*** sarob has quit IRC | 05:43 | |
*** mmaglana has joined #openstack-infra | 05:49 | |
*** _nadya_ has quit IRC | 05:51 | |
*** enikanorov has joined #openstack-infra | 05:58 | |
lifeless | devananda: could you abandon 97646 ? alembic is in requirements.txt | 05:58 |
*** oomichi has quit IRC | 05:58 | |
*** yfried_ is now known as yfried | 06:03 | |
*** basha has joined #openstack-infra | 06:08 | |
*** melwitt has quit IRC | 06:12 | |
*** thomasbiege1 has left #openstack-infra | 06:14 | |
*** markmcclain has quit IRC | 06:15 | |
*** praneshp has quit IRC | 06:16 | |
*** _nadya_ has joined #openstack-infra | 06:17 | |
*** markmcclain has joined #openstack-infra | 06:18 | |
*** markwash has quit IRC | 06:19 | |
*** CaptTofu_ has joined #openstack-infra | 06:19 | |
*** ildikov_ has quit IRC | 06:20 | |
basha | anybody there? | 06:21 |
*** denis_makogon has joined #openstack-infra | 06:22 | |
nibalizer | basha: best to just ask your question | 06:22 |
*** arnaud__ has quit IRC | 06:22 | |
nibalizer | if someone knows the answer they will help you | 06:22 |
basha | nibalizer: ive a patch which has been failing jenkins since yesterday. | 06:22 |
basha | for pretty random reasons. | 06:22 |
nibalizer | link? | 06:22 |
basha | nibalizer: https://review.openstack.org/#/c/78269/ | 06:23 |
basha | just wanted to know if its just the gate acting up and whether its worth retriggering it now? | 06:23 |
*** CaptTofu_ has quit IRC | 06:24 | |
clarkb | basha: we started voting on python33 with glanceclient recently | 06:25 |
clarkb | that change doesn't appear to be python33 clean | 06:25 |
nibalizer | thats beyond my ability to debug | 06:25 |
nibalizer | well if you look its failing to find subunit_log.txt, where does that file get created in the pipeline? | 06:26 |
basha | clarkb: the python33 log says its unable to find subunit_log | 06:26 |
nibalizer | doesn't really look to me like the code doesn't work under python33 | 06:26 |
basha | nibalizer: yes | 06:26 |
clarkb | basha: nibalizer http://logs.openstack.org/69/78269/6/check/gate-python-glanceclient-python33/ded5f8d/console.html#_2014-06-11_14_17_08_220 the error is there | 06:26 |
clarkb | no subunit log because that failed | 06:26 |
basha | it doesn't seem to be my change | 06:26 |
basha | thats pretty random too :( | 06:28 |
clarkb | its not random... | 06:28 |
clarkb | we made the change recently to start gating on python 3 | 06:28 |
basha | clarkb: so it runs the nosetests on python 3 as well, is that ryt? | 06:29 |
clarkb | and the import errors are in files modified by that change | 06:29 |
clarkb | basha: it runs testr under python3 yes | 06:29 |
clarkb | basha: if you look in that block I linked there is a fairly hard to read section that says `Ad\x17text/plain;charset=utf8\rimport | 06:30 |
basha | clarkb: but I dont see any import | 06:30 |
clarkb | errorsA3tests.test_exc\ntests.test_http\ntests.test_progressbar\ntests.te and so on | 06:30 |
clarkb | tests.test_exc, tests.test_http, tests.test_progressbar and so forth failed to import under python33 | 06:31 |
basha | clarkb: but the thing is I was able to get jenkins green the prev patch set. And the only diff is an extra log line | 06:31 |
basha | Which obviously doesnt do a new import | 06:31 |
clarkb | the test is new | 06:32 |
clarkb | well newly voting | 06:32 |
clarkb | patchset 1 failed with the same issues http://logs.openstack.org/69/78269/1/check/gate-python-glanceclient-python33/1e198c1/console.html.gz#_2014-03-05_15_55_10_475 | 06:33 |
basha | clarkb: oh you mean python3 started voting just recently? | 06:33 |
clarkb | but it wasn't voting at the time | 06:33 |
clarkb | basha: yes | 06:33 |
basha | ok gotcha. hmmm | 06:34 |
clarkb | glance was able to get it to work on master so we went ahead and made it voting to prevent new regressions | 06:34 |
basha | let me try running tests on python3 on my box then | 06:34 |
*** lcheng has joined #openstack-infra | 06:34 | |
*** ildikov has joined #openstack-infra | 06:39 | |
*** sarob has joined #openstack-infra | 06:39 | |
*** fanhe has quit IRC | 06:40 | |
*** sarob_ has joined #openstack-infra | 06:41 | |
*** alkari1 has joined #openstack-infra | 06:42 | |
*** alkari has quit IRC | 06:43 | |
*** sarob has quit IRC | 06:43 | |
*** sarob_ has quit IRC | 06:45 | |
*** achuprin_ has quit IRC | 06:51 | |
*** _nadya_ has quit IRC | 06:51 | |
*** _nadya_ has joined #openstack-infra | 06:52 | |
*** zehicle_at_dell has quit IRC | 06:53 | |
*** sarob has joined #openstack-infra | 06:53 | |
*** cody-somerville has quit IRC | 06:53 | |
*** zehicle_at_dell has joined #openstack-infra | 06:53 | |
*** jlibosva has joined #openstack-infra | 06:54 | |
*** sarob has quit IRC | 06:57 | |
*** srenatus has quit IRC | 06:58 | |
*** doude has joined #openstack-infra | 06:59 | |
*** srenatus has joined #openstack-infra | 06:59 | |
*** alkari1 has quit IRC | 07:01 | |
*** zns_ has joined #openstack-infra | 07:02 | |
*** zns has quit IRC | 07:03 | |
*** oomichi has joined #openstack-infra | 07:04 | |
*** alkari has joined #openstack-infra | 07:04 | |
*** cody-somerville has joined #openstack-infra | 07:05 | |
*** _nadya_ has quit IRC | 07:05 | |
*** ildikov has quit IRC | 07:07 | |
*** achuprin_ has joined #openstack-infra | 07:07 | |
*** Clabbe has quit IRC | 07:09 | |
*** markmcclain has quit IRC | 07:11 | |
*** pblaho has joined #openstack-infra | 07:12 | |
*** yfried has quit IRC | 07:12 | |
*** yfried has joined #openstack-infra | 07:12 | |
*** jcoufal has joined #openstack-infra | 07:13 | |
*** achuprin_ has quit IRC | 07:15 | |
*** yfried_ has joined #openstack-infra | 07:18 | |
*** yfried has quit IRC | 07:19 | |
*** yfried_ is now known as yfried | 07:19 | |
*** trinaths has quit IRC | 07:19 | |
*** flaper87|afk is now known as flaper87 | 07:22 | |
*** ildikov has joined #openstack-infra | 07:24 | |
*** e0ne has joined #openstack-infra | 07:24 | |
*** zehicle_at_dell has quit IRC | 07:25 | |
*** tkelsey has joined #openstack-infra | 07:26 | |
*** zehicle_at_dell has joined #openstack-infra | 07:27 | |
*** achuprin_ has joined #openstack-infra | 07:27 | |
*** sarob has joined #openstack-infra | 07:29 | |
*** zns_ has quit IRC | 07:29 | |
*** talluri has quit IRC | 07:32 | |
*** talluri has joined #openstack-infra | 07:32 | |
mattoliverau | I'm calling it a day, night all. | 07:33 |
*** jcoufal has quit IRC | 07:33 | |
*** sarob has quit IRC | 07:34 | |
*** zehicle_at_dell has quit IRC | 07:36 | |
*** mugsie has quit IRC | 07:36 | |
*** cody-somerville has quit IRC | 07:36 | |
*** talluri has quit IRC | 07:36 | |
*** _nadya_ has joined #openstack-infra | 07:38 | |
*** sarob has joined #openstack-infra | 07:39 | |
*** mrda is now known as mrda-away | 07:40 | |
*** jcoufal has joined #openstack-infra | 07:43 | |
*** sarob has quit IRC | 07:44 | |
*** e0ne has quit IRC | 07:44 | |
*** e0ne has joined #openstack-infra | 07:45 | |
*** mmaglana has quit IRC | 07:48 | |
*** e0ne has quit IRC | 07:49 | |
*** hashar has joined #openstack-infra | 07:49 | |
*** StevenK has quit IRC | 07:50 | |
*** cody-somerville has joined #openstack-infra | 07:50 | |
openstackgerrit | lifeless proposed a change to stackforge/gertty: Don't crash on comments on unchanged files https://review.openstack.org/99563 | 07:50 |
openstackgerrit | lifeless proposed a change to stackforge/gertty: Hide fully reviewed projects by default https://review.openstack.org/99591 | 07:50 |
lifeless | clarkb: ^ I think you'll like this shiny | 07:50 |
*** dizquierdo has joined #openstack-infra | 07:51 | |
*** jcoufal has quit IRC | 07:51 | |
*** StevenK has joined #openstack-infra | 07:51 | |
*** achuprin_ has quit IRC | 07:55 | |
*** ihrachyshka has joined #openstack-infra | 07:55 | |
*** rcarrill` has joined #openstack-infra | 07:58 | |
*** freyes has joined #openstack-infra | 07:58 | |
*** rcarrillocruz has quit IRC | 08:00 | |
*** jcoufal has joined #openstack-infra | 08:00 | |
*** ihrachyshka has quit IRC | 08:02 | |
*** jistr has joined #openstack-infra | 08:02 | |
*** ihrachyshka has joined #openstack-infra | 08:02 | |
*** _nadya_ has quit IRC | 08:03 | |
*** oomichi has quit IRC | 08:05 | |
*** andreykurilin_ has joined #openstack-infra | 08:06 | |
*** achuprin_ has joined #openstack-infra | 08:10 | |
*** derekh_ has joined #openstack-infra | 08:11 | |
*** enikanorov__ has quit IRC | 08:12 | |
*** fbo_away is now known as fbo | 08:14 | |
*** jcoufal has quit IRC | 08:14 | |
*** plars has quit IRC | 08:15 | |
*** plomakin_ has quit IRC | 08:16 | |
*** Hal_ has joined #openstack-infra | 08:16 | |
*** skraynev has quit IRC | 08:16 | |
*** tnurlygayanov has quit IRC | 08:17 | |
*** ilyashakhat has quit IRC | 08:17 | |
*** markmc has joined #openstack-infra | 08:19 | |
*** locke105 has quit IRC | 08:19 | |
*** srenatus has quit IRC | 08:24 | |
*** srenatus has joined #openstack-infra | 08:25 | |
*** pelix has joined #openstack-infra | 08:26 | |
*** andreykurilin_ has quit IRC | 08:28 | |
*** talluri has joined #openstack-infra | 08:33 | |
*** andreykurilin_ has joined #openstack-infra | 08:34 | |
lifeless | SergeyLukjanov: hey | 08:35 |
lifeless | SergeyLukjanov: have a look on review 92749 | 08:35 |
*** talluri has quit IRC | 08:35 | |
lifeless | (or jhesketh) ^ | 08:35 |
lifeless | Brocade OSS CI is commenting with links to status.o.o/zuul - I think they're spinning up a new 3rd-party system, badly. | 08:36 |
lifeless | might want to pull their access | 08:36 |
lifeless | before they mess everyone up :) | 08:36 |
jhesketh | lifeless: might be worth reaching out to them and letting them know before pulling access out from under their feet | 08:38 |
*** cody-somerville has quit IRC | 08:39 | |
*** sarob has joined #openstack-infra | 08:39 | |
*** afazekas has joined #openstack-infra | 08:42 | |
lifeless | jhesketh: I have no idea how to do that | 08:42 |
jhesketh | the email on the account is DL-GRP-VYATTA-OSS@Brocade.com | 08:43 |
*** rlandy has joined #openstack-infra | 08:43 | |
*** jcoufal has joined #openstack-infra | 08:43 | |
*** sarob has quit IRC | 08:43 | |
isviridov | SergeyLukjanov, could you take a look at https://review.openstack.org/#/c/99039/ and https://review.openstack.org/#/c/91050/ | 08:44 |
*** rcarrillocruz has joined #openstack-infra | 08:45 | |
lifeless | jhesketh: they've commented on looks like hundreds of reviews in the last few hours | 08:46 |
lifeless | hmmm, no gmail fail on screen size | 08:46 |
lifeless | 30ish | 08:46 |
lifeless | I will mail them cc infra | 08:46 |
jhesketh | okay | 08:46 |
jhesketh | thanks | 08:46 |
jhesketh | might be worth removing their access but let's try email first | 08:46 |
jhesketh | SergeyLukjanov might disagree and just revoke it anyway ;-) | 08:47 |
*** rcarrill` has quit IRC | 08:47 | |
openstackgerrit | A change was merged to openstack-infra/config: ceilometer: enable gate-grenade-dsvm-forward https://review.openstack.org/97430 | 08:48 |
lifeless | email sent | 08:48 |
*** dims_ has joined #openstack-infra | 08:49 | |
*** andreykurilin_ has quit IRC | 08:51 | |
*** cody-somerville has joined #openstack-infra | 08:52 | |
sweston | jhesketh: lifeless: hello | 08:52 |
sweston | that is my system, apologies | 08:53 |
lifeless | sweston: hi | 08:53 |
sweston | I stopped the zuul service | 08:53 |
*** Alexei_987 has quit IRC | 08:53 | |
lifeless | thank you | 08:53 |
sweston | can you verify that we are no longer posting back? | 08:53 |
*** e0ne has joined #openstack-infra | 08:53 | |
sweston | sorry, this is my first attempt at this | 08:54 |
*** dims_ has quit IRC | 08:54 | |
sweston | should I just take everything out of my projects.yaml file except for the sandbox and try again? | 08:55 |
sweston | guess that would be my layout.yaml file in /etc/zuul | 08:56 |
jhesketh | sweston: you should modify your zuul's layout.yaml to not report to gerrit | 08:56 |
jhesketh | instead try setting up an smtp reporter | 08:56 |
jhesketh | so you can email yourself results while you set it up | 08:56 |
*** mrmartin has joined #openstack-infra | 08:57 | |
mrmartin | re | 08:57 |
* sweston is looking up zuul docs | 08:57 | |
*** ominakov has joined #openstack-infra | 09:01 | |
*** achuprin_ has quit IRC | 09:03 | |
sweston | jhesketh: so I should change the success and failure parameters in each of the pipeline definitions? | 09:03 |
sweston | jhesketh: lifeless: nevermind, I have the correct layout file now. apologies, again for the disturbance to your systems | 09:13 |
lifeless | sweston: np, thanks for taking prompt action | 09:14 |
*** _nadya_ has joined #openstack-infra | 09:14 | |
sweston | lifeless: sure thing, the least I could do :-) | 09:14 |
*** habib has joined #openstack-infra | 09:16 | |
*** achuprin_ has joined #openstack-infra | 09:16 | |
*** chianingwang has joined #openstack-infra | 09:18 | |
*** _nadya_ has quit IRC | 09:19 | |
*** chianingwang has quit IRC | 09:23 | |
*** zhiyan is now known as zhiyan_ | 09:23 | |
*** amcrn has quit IRC | 09:24 | |
*** andreykurilin_ has joined #openstack-infra | 09:24 | |
yjiang | Hi folks, I use "Gerrit trigger" (not zuul) to trigger OpenStack 3rd party tests. I've set the Verify values "Successful 1", "Failed -1". My CI test is triggered as normal, but the "Verified +/-1" is not shown on the review page after the CI test is done. Why does this happen? Could anyone help answer my question? Any suggestion is appreciated. Thx a lot! | 09:32 |
*** zehicle_at_dell has joined #openstack-infra | 09:33 | |
*** sarob has joined #openstack-infra | 09:38 | |
*** andreykurilin_ has quit IRC | 09:38 | |
*** sarob_ has joined #openstack-infra | 09:39 | |
Kiall | yjiang: your 3rd party testing account needs special permissions to give a Verified vote - that's rarely given out, and when it is, the 3rd party testing system needs a track record of producing useful and accurate tests etc etc | 09:41 |
*** sarob has quit IRC | 09:43 | |
*** denis_makogon has quit IRC | 09:43 | |
*** sarob_ has quit IRC | 09:44 | |
*** e0ne_ has joined #openstack-infra | 09:44 | |
*** e0ne has quit IRC | 09:48 | |
SergeyLukjanov | lifeless, jhesketh, is brocade ci still sending incorrect links? | 09:52 |
SergeyLukjanov | oh, I see the dialog with sweston | 09:53 |
sweston | SergeyLukjanov: yes, sorry about that | 09:53 |
SergeyLukjanov | sweston, np, it's not like it was spamming all projects ;) | 09:54 |
*** oomichi has joined #openstack-infra | 09:54 | |
SergeyLukjanov | fungi, jeblair, clarkb, jhesketh, mordred, there are four holiday days in russia, so I'll have limited availability till Monday | 09:54 |
sweston | SergeyLukjanov: yup, understood. Won't be making that mistake again :-) | 09:55 |
openstackgerrit | Kiall Mac Innes proposed a change to openstack-infra/config: Move unbound DNS recursor instance to 127.0.2.1 https://review.openstack.org/99611 | 09:57 |
*** zehicle_at_dell has quit IRC | 10:02 | |
*** zehicle_at_dell has joined #openstack-infra | 10:03 | |
*** ominakov has quit IRC | 10:05 | |
yjiang | Kiall: OK, thanks for the reply. Another question: if I want to apply for this "Verified vote" right, what specific target should I meet? For example, finding a certain number of bugs with my test, or maybe something else. | 10:05 |
Kiall | Honestly, not sure :) I just know it's handed out sparingly after proving reliable :) | 10:06 |
yjiang | Kiall: OK,thx! | 10:07 |
*** ominakov has joined #openstack-infra | 10:10 | |
*** ihrachyshka has quit IRC | 10:11 | |
openstackgerrit | lifeless proposed a change to openstack-infra/infra-specs: Make use of IP per slave optional. https://review.openstack.org/95625 | 10:13 |
*** skraynev has joined #openstack-infra | 10:14 | |
*** talluri has joined #openstack-infra | 10:16 | |
*** talluri has quit IRC | 10:20 | |
*** matsuhashi has quit IRC | 10:20 | |
openstackgerrit | lifeless proposed a change to stackforge/gertty: Don't crash on comments on unchanged files https://review.openstack.org/99563 | 10:20 |
*** matsuhashi has joined #openstack-infra | 10:20 | |
*** jerryz has quit IRC | 10:21 | |
*** yamahata has quit IRC | 10:21 | |
*** dims_ has joined #openstack-infra | 10:23 | |
*** amcrn has joined #openstack-infra | 10:24 | |
*** matsuhashi has quit IRC | 10:25 | |
*** Hal_ has quit IRC | 10:26 | |
*** yaguang has quit IRC | 10:30 | |
cgoncalves | Hi. When re-submitting a patchset to Gerrit can one change the topic? I'm concerned about Gerrit possibly re-generating a new Change-Id. | 10:33 |
gilliard | IIRC you can't _just_ change the topic because the git commit hash will be the same. | 10:35 |
*** e0ne_ has quit IRC | 10:37 | |
cgoncalves | gilliard: I wouldn't only be changing the topic, so based on what you just said I believe it should be fine to change the topic. thanks | 10:37 |
*** talluri has joined #openstack-infra | 10:37 | |
gilliard | But if you change the commit msg or some code it should keep the same change-id | 10:37 |
gilliard | (just testing that) | 10:37 |
*** e0ne has joined #openstack-infra | 10:37 | |
gilliard | confirmed/ | 10:37 |
cgoncalves | gilliard: much appreciated :-) | 10:38 |
*** kmartin has quit IRC | 10:38 | |
*** sarob has joined #openstack-infra | 10:39 | |
*** e0ne has quit IRC | 10:41 | |
*** thomasbiege has joined #openstack-infra | 10:42 | |
*** sarob has quit IRC | 10:43 | |
*** rcarrillocruz has quit IRC | 10:48 | |
*** rcarrillocruz has joined #openstack-infra | 10:52 | |
*** _nadya_ has joined #openstack-infra | 10:56 | |
*** srenatus has quit IRC | 10:59 | |
*** srenatus has joined #openstack-infra | 10:59 | |
*** atiwari has quit IRC | 11:04 | |
*** _nadya_ has quit IRC | 11:05 | |
openstackgerrit | Bob Ball proposed a change to openstack-infra/nodepool: Support install phase with nodepool https://review.openstack.org/97787 | 11:06 |
openstackgerrit | Bob Ball proposed a change to openstack-infra/nodepool: Support nodes with launch condition https://review.openstack.org/97798 | 11:06 |
*** nosnos has quit IRC | 11:13 | |
*** e0ne has joined #openstack-infra | 11:15 | |
*** e0ne has quit IRC | 11:17 | |
*** e0ne has joined #openstack-infra | 11:17 | |
* fungi is going to try and get a few openstack things done today, but is getting mired in final moving details and constant interruptions by agents trying to show the current house, so probably not around for a lot of general support | 11:19 | |
fungi | clarkb: mordred: ^ | 11:19 |
*** ihrachyshka has joined #openstack-infra | 11:20 | |
sdague | fungi: with that being said, could I get you to promote 99396, as working on readability for grenade (which should help sort some of these bugs) is somewhat limited until that lands | 11:21 |
*** trinaths has joined #openstack-infra | 11:22 | |
*** e0ne has quit IRC | 11:22 | |
fungi | sdague: sure | 11:22 |
*** srenatus has quit IRC | 11:23 | |
sdague | I also think we need to kick all of ironic out of the gate | 11:23 |
*** andreykurilin_ has joined #openstack-infra | 11:23 | |
*** srenatus has joined #openstack-infra | 11:24 | |
*** tkelsey has quit IRC | 11:24 | |
*** [1]trinaths has joined #openstack-infra | 11:24 | |
*** basha has quit IRC | 11:26 | |
sdague | yeh, gate-tempest-dsvm-virtual-ironic has a 33% fail rate in the gate | 11:26 |
fungi | ouch | 11:26 |
sdague | I think we need to make that test non voting | 11:26 |
sdague | basically it's a terrible configuration | 11:26 |
sdague | relies massively more on the network than the rest of the world | 11:27 |
*** trinaths has quit IRC | 11:27 | |
*** [1]trinaths is now known as trinaths | 11:27 | |
*** yjiang has quit IRC | 11:28 | |
*** plomakin has joined #openstack-infra | 11:29 | |
sdague | fungi: if I push a config for that, can you fast path it? | 11:29 |
sdague | 30 gate fails in the last 48 hrs | 11:30 |
openstackgerrit | Sean Dague proposed a change to openstack-infra/config: disable voting on gate-tempest-dsvm-virtual-ironic https://review.openstack.org/99630 | 11:34 |
*** basha has joined #openstack-infra | 11:38 | |
*** andreykurilin_ has quit IRC | 11:39 | |
*** sarob has joined #openstack-infra | 11:39 | |
*** mrmartin has quit IRC | 11:42 | |
*** basha has quit IRC | 11:42 | |
*** sarob has quit IRC | 11:43 | |
*** lcheng has quit IRC | 11:44 | |
*** CaptTofu_ has joined #openstack-infra | 11:44 | |
*** e0ne has joined #openstack-infra | 11:45 | |
*** CaptTofu_ has quit IRC | 11:45 | |
*** CaptTofu_ has joined #openstack-infra | 11:45 | |
openstackgerrit | Sean Dague proposed a change to openstack-infra/elastic-recheck: gate-tempest-dsvm-virtual-ironic is in our gate https://review.openstack.org/99635 | 11:46 |
sdague | fungi: so... the non-voting? | 11:50 |
*** basha has joined #openstack-infra | 11:53 | |
*** yfried_ has joined #openstack-infra | 11:53 | |
openstackgerrit | A change was merged to openstack-infra/elastic-recheck: gate-tempest-dsvm-virtual-ironic is in our gate https://review.openstack.org/99635 | 11:54 |
*** thomasbiege has quit IRC | 11:55 | |
*** yfried has quit IRC | 11:56 | |
*** mrmartin has joined #openstack-infra | 11:56 | |
*** lbragstad has quit IRC | 11:59 | |
*** thomasbiege has joined #openstack-infra | 12:00 | |
*** dprince has joined #openstack-infra | 12:03 | |
*** mugsie has joined #openstack-infra | 12:07 | |
fungi | sdague: back now, and approved | 12:08 |
sdague | fungi: thanks | 12:08 |
sdague | I sent an email to the list describing the reasons as well | 12:08 |
fungi | excellent. if the ironic devs are worried about that letting breakage slip through... well... | 12:09 |
fungi | i guess the alternative is to go back to not running any shared jobs | 12:09 |
fungi | so that it can break in its own queue without slowing down the main integrated queue | 12:10 |
sdague | yeh, it only turns off the vote in gate | 12:11 |
*** yamahata has joined #openstack-infra | 12:11 | |
fungi | oh, that too | 12:11 |
fungi | a very good point | 12:11 |
sdague | I did try to do the minimum impact thing here :) | 12:12 |
openstackgerrit | A change was merged to openstack-infra/config: disable voting on gate-tempest-dsvm-virtual-ironic https://review.openstack.org/99630 | 12:13 |
openstackgerrit | Maxime Vidori proposed a change to openstack-infra/storyboard-webclient: Remove boostrap.js https://review.openstack.org/99638 | 12:15 |
*** dkranz has joined #openstack-infra | 12:19 | |
*** smarcet has joined #openstack-infra | 12:20 | |
*** adalbas has joined #openstack-infra | 12:20 | |
*** dims_ has quit IRC | 12:22 | |
*** dims_ has joined #openstack-infra | 12:23 | |
*** aysyd has joined #openstack-infra | 12:23 | |
sdague | fungi: oh ffs, new fails | 12:23 |
*** hashar has quit IRC | 12:24 | |
*** e0ne_ has joined #openstack-infra | 12:24 | |
*** basha has quit IRC | 12:24 | |
*** weshay has joined #openstack-infra | 12:25 | |
*** dims_ is now known as dims | 12:25 | |
fungi | it is feature breeding season after all | 12:27 |
*** e0ne has quit IRC | 12:28 | |
*** ArxCruz has joined #openstack-infra | 12:30 | |
*** amotoki has quit IRC | 12:31 | |
*** talluri has quit IRC | 12:33 | |
*** talluri has joined #openstack-infra | 12:33 | |
*** dkliban_afk is now known as dkliban | 12:36 | |
*** fanhe has joined #openstack-infra | 12:37 | |
*** talluri has quit IRC | 12:38 | |
openstackgerrit | Denis M. proposed a change to openstack-infra/config: Added experimental job for trove mongodb functional tests https://review.openstack.org/99644 | 12:38 |
*** denis_makogon has joined #openstack-infra | 12:38 | |
*** rfolco has joined #openstack-infra | 12:39 | |
*** sarob has joined #openstack-infra | 12:39 | |
*** maxbit has joined #openstack-infra | 12:39 | |
*** Hal_ has joined #openstack-infra | 12:40 | |
*** chandan_kumar has joined #openstack-infra | 12:41 | |
*** thomasbiege has left #openstack-infra | 12:43 | |
*** crc32 has quit IRC | 12:43 | |
*** sarob has quit IRC | 12:44 | |
*** chandan_kumar has quit IRC | 12:44 | |
openstackgerrit | A change was merged to openstack-infra/config: Don't deny visibility of ICLA group to its members https://review.openstack.org/99223 | 12:45 |
*** amcrn has quit IRC | 12:46 | |
*** bradm has quit IRC | 12:46 | |
*** openstackgerrit has quit IRC | 12:46 | |
*** yfried_ has quit IRC | 12:46 | |
*** bradm has joined #openstack-infra | 12:46 | |
*** openstackgerrit has joined #openstack-infra | 12:48 | |
*** oomichi has quit IRC | 12:49 | |
*** eharney has joined #openstack-infra | 12:50 | |
*** chandankumar has joined #openstack-infra | 12:52 | |
*** dims_ has joined #openstack-infra | 12:55 | |
*** prad has joined #openstack-infra | 12:56 | |
*** dims has quit IRC | 12:56 | |
*** lbragstad has joined #openstack-infra | 12:57 | |
andreaf | clarkb: ping | 13:00 |
*** julim has joined #openstack-infra | 13:01 | |
openstackgerrit | Denis M. proposed a change to openstack-infra/config: Added experimental job for trove mongodb functional tests https://review.openstack.org/99644 | 13:02 |
*** yamahata has quit IRC | 13:03 | |
*** yamahata has joined #openstack-infra | 13:03 | |
*** ihrachyshka has quit IRC | 13:05 | |
*** ihrachyshka has joined #openstack-infra | 13:07 | |
*** sweston has quit IRC | 13:09 | |
*** mriedem has joined #openstack-infra | 13:12 | |
*** hashar has joined #openstack-infra | 13:12 | |
*** dkranz has quit IRC | 13:14 | |
*** mrmartin has quit IRC | 13:14 | |
*** CaptTofu_ has quit IRC | 13:15 | |
*** CaptTofu_ has joined #openstack-infra | 13:16 | |
*** thuc_ has joined #openstack-infra | 13:18 | |
*** mbacchi has joined #openstack-infra | 13:20 | |
*** CaptTofu_ has quit IRC | 13:20 | |
*** oomichi has joined #openstack-infra | 13:22 | |
openstackgerrit | Maxime Vidori proposed a change to openstack-infra/storyboard-webclient: Remove boostrap.js https://review.openstack.org/99638 | 13:27 |
openstackgerrit | Maxime Vidori proposed a change to openstack-infra/storyboard-webclient: Removal of jquery https://review.openstack.org/99660 | 13:27 |
*** mrmartin has joined #openstack-infra | 13:31 | |
*** jergerber has joined #openstack-infra | 13:32 | |
*** talluri has joined #openstack-infra | 13:34 | |
*** CaptTofu_ has joined #openstack-infra | 13:34 | |
*** tkelsey has joined #openstack-infra | 13:35 | |
*** zehicle_at_dell has quit IRC | 13:36 | |
*** mwagner_lap has quit IRC | 13:36 | |
*** talluri has quit IRC | 13:38 | |
*** sarob has joined #openstack-infra | 13:39 | |
*** crc32 has joined #openstack-infra | 13:40 | |
*** crc32 has quit IRC | 13:41 | |
*** sileht has quit IRC | 13:43 | |
*** trinaths has quit IRC | 13:43 | |
*** sarob has quit IRC | 13:43 | |
BobBall | Anyone know how the yaml python package gets into the VMs? | 13:44 |
BobBall | devstack-gate/test-matrix.py needs yaml but it's not installed on the xs CI vms initially, but it _does_ get installed later by something (system package) - but I can't figure out where and it's bugging me! | 13:44 |
fungi | BobBall: perhaps it's installed by devstack? | 13:45 |
BobBall | not that I could see :/ | 13:45 |
BobBall | a bunch of puppet stuff depends on yaml - but the ones that do don't appear to be depended on by a node... | 13:46 |
*** crc32 has joined #openstack-infra | 13:47 | |
*** mfer has joined #openstack-infra | 13:47 | |
*** bknudson has joined #openstack-infra | 13:50 | |
fungi | yeah, this is perplexing... still looking | 13:50 |
*** ihrachyshka has quit IRC | 13:51 | |
fungi | looking at a random devstack-precise node, python-yaml 3.10-2 is installed from a deb | 13:52 |
*** homeless has joined #openstack-infra | 13:53 | |
*** yolanda has joined #openstack-infra | 13:54 | |
BobBall | hmmm... | 13:54 |
BobBall | lemme check something... | 13:54 |
*** beekneemech is now known as bnemec | 13:54 | |
openstackgerrit | A change was merged to openstack-infra/storyboard: Removed PostGres from code and documentation https://review.openstack.org/98870 | 13:54 |
*** yfried_ has joined #openstack-infra | 13:55 | |
*** Longgeek has joined #openstack-infra | 13:55 | |
*** ihrachyshka has joined #openstack-infra | 13:55 | |
BobBall | it's installed as a dependency | 13:55 |
BobBall | http://paste.openstack.org/show/83806/ | 13:56 |
*** basha has joined #openstack-infra | 13:56 | |
*** yfried_ has quit IRC | 13:56 | |
*** reaper has joined #openstack-infra | 13:56 | |
fungi | ahh, yep | 13:56 |
*** yfried_ has joined #openstack-infra | 13:56 | |
*** zz_gondoi is now known as gondoi | 13:57 | |
BobBall | not sure what the dependency path is ... but that's why it's there after devstack has run | 13:57 |
*** Longgeek_ has quit IRC | 13:57 | |
*** jistr has quit IRC | 14:00 | |
*** timrc-afk is now known as timrc | 14:01 | |
*** malini1 has joined #openstack-infra | 14:01 | |
*** sileht has joined #openstack-infra | 14:02 | |
fungi | BobBall: dpkg -l | grep -e $(echo `apt-cache rdepends python-yaml|grep '^ '`|sed 's/ / -e /g') | 14:02 |
fungi | looks like cloud-init, cloud-utils and python-kombu may be responsible (recurse as needed) | 14:02 |
*** jistr has joined #openstack-infra | 14:02 | |
malini1 | hello! We are running into a heat timeout failure at the gate, which is blocking our patches. I see that there is already a query associated with this one https://github.com/openstack-infra/elastic-recheck/blob/master/queries/1306029.yaml | 14:03 |
BobBall | great - thanks - just what I wanted fungi! | 14:03 |
malini1 | What else do I need to make this part of elastic recheck? | 14:03 |
fungi | malini1: http://docs.openstack.org/infra/elastic-recheck/readme.html#adding-bug-signatures | 14:04 |
*** mugsie has quit IRC | 14:04 | |
*** mugsie has joined #openstack-infra | 14:05 | |
malini1 | fungi: it seems like Step 4 in that link is already in place. There is already a query in the repo https://github.com/openstack-infra/elastic-recheck/blob/master/queries/1306029.yaml | 14:06 |
fungi | malini1: also see the logstash job queue graph at the bottom of http://status.openstack.org/zuul/ indicating that we're currently running on a perpetual log processing backlog, so i think right now elastic-recheck is giving up most/all of the time waiting for logs of a failed job to get indexed (it only waits up to 15 minutes before deciding it's been too long) | 14:07 |
fungi | malini1: i think clarkb was planning to dig back into possible causes/remediation once he's awake | 14:08 |
malini1 | Thanks fungi! I could buy clarkb some caffeine ;) | 14:09 |
malini1 | fungi: So for now, my only option is to wait, right? | 14:09 |
fungi | malini1: and in the meantime look at the failure logs yourself and identify the bug that way, when possible | 14:10 |
*** pcrews has joined #openstack-infra | 14:10 | |
malini1 | fungi: can I still use the 'recheck bug #', if elastic recheck doesn't find it? | 14:11 |
fungi | malini1: yes | 14:11 |
malini1 | fungi: cool! I didn't know that | 14:11 |
malini1 | It solves our problem | 14:11 |
malini1 | thanks again | 14:13 |
*** shayneburgess has joined #openstack-infra | 14:15 | |
*** shayneburgess has quit IRC | 14:15 | |
*** sweston has joined #openstack-infra | 14:15 | |
*** wenlock_ has joined #openstack-infra | 14:17 | |
*** atiwari has joined #openstack-infra | 14:18 | |
*** otherwiseguy has joined #openstack-infra | 14:19 | |
*** mrmartin has quit IRC | 14:19 | |
*** basha has quit IRC | 14:21 | |
*** UtahDave has joined #openstack-infra | 14:22 | |
*** malini1 has quit IRC | 14:25 | |
*** malini1 has joined #openstack-infra | 14:25 | |
*** malini1 has quit IRC | 14:26 | |
openstackgerrit | A change was merged to openstack-infra/elastic-recheck: Add query for ceilometer test_notify_alarm bug 1321826 https://review.openstack.org/98149 | 14:27 |
uvirtbot | Launchpad bug 1321826 in ceilometer "periodic notifier unit test failure" [Medium,In progress] https://launchpad.net/bugs/1321826 | 14:27 |
*** malini1 has joined #openstack-infra | 14:27 | |
*** trinaths has joined #openstack-infra | 14:27 | |
*** zehicle_at_dell has joined #openstack-infra | 14:27 | |
*** jcoufal has quit IRC | 14:27 | |
*** otherwiseguy has quit IRC | 14:31 | |
*** basha has joined #openstack-infra | 14:31 | |
openstackgerrit | A change was merged to openstack-infra/reviewday: Prettified all HTML files https://review.openstack.org/98655 | 14:31 |
*** shayneburgess has joined #openstack-infra | 14:34 | |
Kiall | fungi / clarkb: I put this review together in an attempt to fix that unbound issue mentioned during the Designate meet yesterday, approach seem OK? https://review.openstack.org/#/c/99611/ | 14:34 |
*** radez_g0n3 is now known as radez | 14:34 | |
*** otherwiseguy has joined #openstack-infra | 14:35 | |
*** shayneburgess has quit IRC | 14:38 | |
*** rdopieralski has quit IRC | 14:38 | |
*** sarob has joined #openstack-infra | 14:39 | |
*** timrc is now known as timrc-afk | 14:40 | |
*** yamahata has quit IRC | 14:41 | |
phschwartz | What are the requirements for CLA for the puppet-* acl configs for gerrit? Currently only one of the 6 in the repo require cla and I currently have a -1 from push back for not having it in my file I am adding. | 14:42 |
*** basha has quit IRC | 14:42 | |
*** sarob has quit IRC | 14:43 | |
openstackgerrit | A change was merged to openstack-infra/storyboard-webclient: Refresh token support https://review.openstack.org/95478 | 14:43 |
phschwartz | fungi: mordred: anteaya: clarkb: jeblair: ^ any of you have a weigh in? | 14:44 |
*** basha has joined #openstack-infra | 14:44 | |
*** mkerrin has quit IRC | 14:44 | |
mordred_phone | phschwartz: I believe we decided that they don't need it | 14:45 |
phschwartz | awesome, makes that easy. Now to bug you core reviewers for an approval ;) | 14:46 |
fungi | phschwartz: the status of cla as a requirement for things not distributed as a component of an openstack cloud is currently somewhat muddy. at a recent infra team meeting we resolved for now that if it derives in any way from another project which isn't currently enforcing the cla then it's not necessary | 14:46 |
fungi | phschwartz: so if you're copy/pasting some bits from openstack-infra/config for example, that repo already doesn't have a cla requrement | 14:46 |
fungi | requirement | 14:46 |
*** shayneburgess has joined #openstack-infra | 14:47 | |
* fungi digs up a link to use for reference | 14:47 | |
mordred | dstufft: just read an article about DNF which is the new replacement for yum for fedora 22 | 14:48 |
mordred | dstufft: it apparently uses "new dep solver technology which is faster and uses less memory" | 14:48 |
mordred | dstufft: since it's in python - maybe that dep solver can be re-used? | 14:48 |
*** timrc-afk is now known as timrc | 14:49 | |
fungi | phschwartz: http://eavesdrop.openstack.org/meetings/infra/2014/infra.2014-06-03-19.02.log.html#l-275 (you can skip down to around 19:50:15) | 14:50 |
mordred | dstufft: nevermind. its much less in python and much more in C | 14:50 |
*** ArxCruz has quit IRC | 14:51 | |
*** wenlock_ has quit IRC | 14:52 | |
*** thedodd has joined #openstack-infra | 14:52 | |
phschwartz | so if any core wants to take a look at https://review.openstack.org/#/c/93953/ it would be appreciated as the -1 should now be a moot point. | 14:54 |
phschwartz | jhesketh: ping, you around? | 14:54 |
*** kmartin has joined #openstack-infra | 14:55 | |
phschwartz | fungi: ty, that was a good read as to what you guys are thinking and want to use as the current model to follow. | 14:56 |
*** jistr has quit IRC | 14:56 | |
*** jistr has joined #openstack-infra | 14:56 | |
*** ArxCruz has joined #openstack-infra | 14:56 | |
fungi | phschwartz: jhesketh probably won't be awake for a while... it's pretty dark in au right about now | 14:56 |
*** sarob has joined #openstack-infra | 14:57 | |
phschwartz | ah, didn't realize he was in AU. | 14:57 |
phschwartz | lol | 14:57 |
openstackgerrit | Nikhil Manchanda proposed a change to openstack-infra/config: Added new experimental job for trove functional tests https://review.openstack.org/98517 | 14:57 |
openstackgerrit | Nikhil Manchanda proposed a change to openstack-infra/config: Use job-template for gate-trove-buildimage jobs https://review.openstack.org/99680 | 14:57 |
*** james_li has joined #openstack-infra | 14:58 | |
*** oomichi has quit IRC | 14:58 | |
*** marun has joined #openstack-infra | 14:58 | |
*** flaper87 is now known as flaper87|afk | 15:00 | |
openstackgerrit | yolanda.robla proposed a change to openstack-infra/storyboard-webclient: Display dates in timeago format https://review.openstack.org/96713 | 15:00 |
*** jlibosva has quit IRC | 15:01 | |
*** sarob has quit IRC | 15:01 | |
fungi | Kiall: since designate is officially incubated now, i suppose accommodating its typical configuration with our job workers is reasonable even if it means running the local resolver daemon on a less typical address | 15:02 |
*** jlibosva has joined #openstack-infra | 15:02 | |
*** sarob has joined #openstack-infra | 15:03 | |
mordred | fungi: I agree | 15:03 |
yolanda | mordred, cboylan, doing some tests on nodepool in dib, but i receive this error: | 15:03 |
yolanda | Stderr: "qemu-img: 'image' uses a qcow2 feature which is not supported by this qemu version: QCOW version 3\nqemu-img: Could not open '/var/lib/nova/instances/_base/94d0200a5bb90968e0e40f682f9e187025d84276.part': Operation not supported\n" | 15:03 |
Kiall | Well, I was more aiming to put unbound on an address no sane service/config would normally do - It move it once and be done with it | 15:03 |
Kiall | i.e. move it once* | 15:03 |
yolanda | have you seen that before? looks like something related to dib | 15:03 |
fungi | Kiall: yes, it seems like a reasonable approach | 15:04 |
*** lcostantino has joined #openstack-infra | 15:04 | |
*** reed has joined #openstack-infra | 15:05 | |
fungi | Kiall: out of curiosity, though, why did designate care about the loopback address? wouldn't it be a service running inside an instance rather than in the devstack host context? | 15:05 |
*** dprince has quit IRC | 15:06 | |
fungi | Kiall: or is it not a service vm in the same vein as, say, trove? | 15:06 |
Kiall | Designate can run right next to nova etc, or inside nova VMs.. The DevStack gate makes running it alongside nova easier | 15:06 |
Kiall | Not quite the same - but there are plans in that direction for certain use cases | 15:06 |
anteaya | sweston: shivharis was looking for me yesterday and I was away, please let me know if there is anything you need from me | 15:06 |
fungi | Kiall: okay, fair enough | 15:06 |
anteaya | sweston: talking to one person is easier for me than doing cross-purposes with two | 15:07 |
sweston | anteaya: hello | 15:07 |
*** sarob has quit IRC | 15:07 | |
anteaya | sweston: hi | 15:07 |
anteaya | sweston: anything you need from me? | 15:08 |
mordred | yolanda: I haven't seen that - may want to ask the tripleo folks - i agree, it seems like a dib issue | 15:08 |
sweston | anteaya: yes, I understand. We had a meeting yesterday to coordinate everything across our three business units working with OpenStack plugins. | 15:08 |
*** vhoward has left #openstack-infra | 15:09 | |
anteaya | sweston: yay | 15:09 |
anteaya | sweston: how did it go? | 15:09 |
yolanda | mordred, looks like we need to call qemu-img with --compat=1.0 , at least for the environment i'm using for testing, but there is no option like that in dib | 15:10 |
sweston | yes, I do need to speak with you, may we continue our conversation out of band? | 15:10 |
*** basha has quit IRC | 15:10 | |
mordred | yolanda: yah. that'll definitely be a question for #tripleo | 15:11 |
yolanda | sure, heading there | 15:11 |
sweston | anteaya: it went very well. | 15:11 |
anteaya | sweston: let's go to -dev then, I want to ensure we are in a logged channel | 15:11 |
*** ihrachyshka_ has joined #openstack-infra | 15:12 | |
sweston | anteaya: ok, we can continue here as well. | 15:12 |
*** ihrachyshka_ has quit IRC | 15:12 | |
anteaya | okay | 15:12 |
* anteaya listens | 15:12 | |
sweston | anteaya: so, I now have zuul posting back to the openstack-dev sandbox project | 15:13 |
anteaya | sweston: let's back up | 15:13 |
anteaya | how many gerrit accounts are you tracking with brocade? | 15:13 |
*** ramashri has joined #openstack-infra | 15:13 | |
sweston | anteaya: ok, sure. Right now we have three | 15:14 |
anteaya | what are they called? | 15:15 |
sweston | anteaya: is that consistent with what you understand? | 15:15 |
phschwartz | Is there a reason why we don't use rebuild to reuse instances from nodepool, but delete them completely and recreate? | 15:15 |
anteaya | there is a system called Brocade CI that I don't have on my lists | 15:15 |
phschwartz | With some changes on the nova side for us, we could cut the instance creation times by a lot. | 15:15 |
*** ihrachyshka has quit IRC | 15:15 | |
anteaya | https://etherpad.openstack.org/p/automated-gerrit-account-naming-format | 15:15 |
*** doug-fish has joined #openstack-infra | 15:16 | |
anteaya | line 24, 25 and 26 | 15:16 |
*** andreaf has quit IRC | 15:16 | |
sweston | anteaya: yes, Brocade CI is one of ours, another one should be named Brocade OSS CI, and I need to look up the third | 15:16 |
anteaya | I don't have a Brocade CI | 15:16 |
*** shayneburgess has quit IRC | 15:16 | |
anteaya | I have Brocade ADX CI, Brocade OSS CI, and Brocade Tempest | 15:17 |
sweston | that should somehow be linked to Pattabi Ayyasami | 15:17 |
*** ramashri has quit IRC | 15:17 | |
*** _nadya_ has joined #openstack-infra | 15:17 | |
anteaya | I have Brocade ADX CI linked to that name | 15:17 |
*** otherwiseguy has quit IRC | 15:17 | |
sweston | anteaya: ok, then Brocade Tempest must be Shiv Haris's? | 15:18 |
*** shayneburgess has joined #openstack-infra | 15:18 | |
anteaya | fungi: can you take a peek at the gerrit db and see if there is a Brocade CI that isn't a part of the third party gerrit group? | 15:18 |
anteaya | sweston: Brocade CI is the name | 15:18 |
sweston | anteaya: I suspect that is linked to pattabi somehow | 15:19 |
sweston | anteaya: let me give him a call and ask | 15:19 |
*** denis_makogon has quit IRC | 15:19 | |
openstackgerrit | Ben Nemec proposed a change to openstack-infra/reviewstats: Oslo project updates https://review.openstack.org/99184 | 15:19 |
anteaya | sweston: thanks | 15:19 |
sweston | anteaya: sure, give me a few minutes ... :-) | 15:20 |
*** otherwiseguy has joined #openstack-infra | 15:21 | |
anteaya | sweston: also please respond to http://lists.openstack.org/pipermail/openstack-infra/2014-June/001336.html | 15:21 |
anteaya | sweston: from what I can tell this is Brocade OSS CI | 15:21 |
mordred | phschwartz: what does rebuild do? | 15:21 |
sweston | anteaya: yes, I followed up in infra irc immediately | 15:21 |
*** _nadya_ has quit IRC | 15:21 | |
mordred | phschwartz: do we get an instance that looks like it had just been booted? | 15:22 |
anteaya | I haven't read backscroll yet | 15:22 |
phschwartz | mordred: Rebuilds an instance and brings it back to initial startup state | 15:22 |
*** markmcclain has joined #openstack-infra | 15:22 | |
phschwartz | mordred: If we did move to that with changes on our nova side we could pre-cache the image which would remove the redownload of the image from glance cutting down on the time it takes. | 15:22 |
sweston | anteaya: so that's what it looks like when you run zuul with the layout.yaml file unchanged | 15:22 |
mordred | phschwartz: k. no, I do not believe there is a conscious reason - except it would potentially make the elastic logic need to be rethought | 15:22 |
phschwartz | mordred: would only hit the glance download when the image is refreshed. | 15:23 |
*** annegent_ has joined #openstack-infra | 15:23 | |
*** timrc is now known as timrc-afk | 15:23 | |
*** dkranz has joined #openstack-infra | 15:23 | |
mordred | because delete and create are handled independently - so there isn't really anywhere in the system that knows "I'm done with this node, but when I delete it, I'll still need one, so let me re-build instead of delete" | 15:24 |
annegent_ | ttx: governance question when you have a moment (and I should know this but want to ask) | 15:24 |
mordred | phschwartz: so it could work out - but it might not be the _easiest_ patch to write | 15:24 |
*** dprince has joined #openstack-infra | 15:25 | |
ttx | annegent_: ask | 15:25 |
mordred | might not be TOO terrible though | 15:25 |
*** CaptTofu_ has quit IRC | 15:25 | |
phschwartz | mordred: I have been looking at the code as I work on the throttling on error and I will see what it might take to do it. | 15:25 |
annegent_ | ttx: should the ptl be elected prior to incubation? | 15:25 |
annegent_ | the/a | 15:25 |
mordred | since nodepool marks a node for delete in its db | 15:25 |
NobodyCam | fungi: happen to be around? | 15:25 |
mordred | the create code _could_ just say "I need to create a node, are there any nodes in DELETE state in the db, if so, let me nova rebuild them and set their state to BUILDING" | 15:26 |
ttx | annegent_: http://git.openstack.org/cgit/openstack/governance/tree/reference/new-programs-requirements.rst | 15:26 |
mordred | phschwartz: ^^ | 15:26 |
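Editorial sketch of the idea mordred describes above: when a node is needed, first look for a node already marked for deletion with a matching image and rebuild it, otherwise launch a new one. This is a minimal illustration, not nodepool's code. The `Node` record, the state strings, and the `rebuild`/`launch` callables are all hypothetical stand-ins for whatever nodepool and the nova client actually provide.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical state names; nodepool's real schema is not shown in this log.
BUILDING = "building"
DELETE = "delete"


@dataclass
class Node:
    id: int
    image: str
    state: str


def acquire_node(
    nodes: List[Node],
    image: str,
    rebuild: Callable[[Node], None],
    launch: Callable[[str], Node],
) -> Node:
    """Reuse a DELETE-state node with a matching image, else launch a new one."""
    for node in nodes:
        if node.state == DELETE and node.image == image:
            # Flip the state first so the node no longer looks deletable,
            # then ask the cloud to rebuild it (assumed wrapper around
            # a "nova rebuild"-style call).
            node.state = BUILDING
            rebuild(node)
            return node
    # Nothing reusable: fall back to a normal launch.
    node = launch(image)
    nodes.append(node)
    return node
```

The image match mirrors the later refinement in the discussion ("any nodes with a matching image in the DELETE state"); the state flip on its own does not address the delete-thread race that clarkb raises further down.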
ttx | "Team should have a lead, selected by the team contributors" | 15:26 |
annegent_ | ttx: thanks! I was looking at http://git.openstack.org/cgit/openstack/governance/tree/reference/incubation-integration-requirements.rst | 15:26 |
sweston | anteaya: I cannot reach him at the moment, I will take an action item to find out and get back to you as soon as I can | 15:26 |
ttx | that's only if the project requires a new program | 15:26 |
phschwartz | mordred: That might be the easiest way | 15:26 |
annegent_ | ttx: so that's even prior to the next hop | 15:26 |
ttx | annegent_: or at least concurrent | 15:26 |
*** alexpilotti has joined #openstack-infra | 15:26 | |
annegent_ | ttx: got it, thanks. Election isn't a requirement, teams can do their own selection process. | 15:27 |
ttx | annegent_: note that the wording gives some room for maneuvering | 15:27 |
openstackgerrit | A change was merged to openstack-infra/config: Use WatchedFileHandler to avoid copytruncate. https://review.openstack.org/95935 | 15:27 |
ttx | yes | 15:27 |
mordred | phschwartz: one of those times when having deletes be async makes logic _easier_ | 15:27 |
annegent_ | ttx: thanks! | 15:27 |
annegent_ | ttx: in the case of a project incubating within a program, is there just one ptl? | 15:27 |
fungi | anteaya: https://review.openstack.org/#/q/owner:%22Brocade+CI+%253Copenstack_gerrit%2540brocade.com%253E%22,n,z | 15:27 |
mordred | ttx: someone said something yesterday that made me think you should be involved in something | 15:28 |
*** talluri has joined #openstack-infra | 15:28 | |
phschwartz | mordred: exactly. Even build async might be nice. That way it rechecks a started instance instead of deleting when it first sees error in case it corrects itself (yeah, I know, rax issue) | 15:28 |
mordred | ttx: oh! | 15:28 |
annegent_ | ttx: considering the training incubation within docs program | 15:28 |
fungi | anteaya: er, i meant https://review.openstack.org/#/q/reviewer:%22Brocade+CI+%253Copenstack_gerrit%2540brocade.com%253E%22,n,z | 15:28 |
annegent_ | mordred: you crack me up | 15:28 |
mordred | ttx: https://api.launchpad.net/devel.html#specification <--- lifeless pointed out that there _is_ a blueprints API | 15:29 |
mordred | ttx: I still like the 'just f-ing use storyboard' plan though | 15:29 |
*** timrc-afk is now known as timrc | 15:29 | |
mordred | phschwartz: well, the same logic would hit there -because if it hits a failure it just marks it with delete state | 15:29 |
mordred | phschwartz: so we'd see the benefits there from the same patch | 15:30 |
*** gyee has joined #openstack-infra | 15:30 | |
ttx | mordred: There is one for sure. I use it all the time. I wrote 25% of it. except it does not let you create a new spec | 15:30 |
fungi | NobodyCam: only barely around--what's up? | 15:30 |
ttx | mordred: there is no createSpec method at project level | 15:30 |
ttx | mordred: but if lifeless knows how to workaround that, I'll take it | 15:31 |
mordred | ttx: that sounds like potentially a launchpadlib thing rather than an API thing? | 15:31 |
ttx | mordred: no, the API doesn't have the method | 15:31 |
mordred | ttx: still though - seriously - I'm passing it on in the name of completeness - I like the other plan better | 15:31 |
anteaya | fungi: okay thanks, I don't have that account listed as a member of the third party group | 15:31 |
mordred | ttx: wouldn't it just be a PUT with a spec body? | 15:31 |
phschwartz | mordred: would you mind if I took this patch on along with the changes for the throttling? It would be during the week next week that I get to it as I am away this weekend. | 15:31 |
NobodyCam | hey hey fungi :) I was looking at this slightly old bug, (https://bugs.launchpad.net/openstack-ci/+bug/1300208) and saw you commented on it. I'm hitting it here: http://logs.openstack.org/02/96902/13/check/check-tempest-dsvm-ironic/27f0c0a/logs/devstack-gate-setup-workspace-new.txt.gz#_2014-06-12_13_35_19_360 | 15:32 |
uvirtbot | Launchpad bug 1300208 in openstack-ci "ERROR: the main setup script run by this job " [Undecided,New] | 15:32 |
anteaya | fungi: https://etherpad.openstack.org/p/automated-gerrit-account-naming-format line 24, 25, 26 are my Brocade accounts | 15:32 |
*** rfolco has quit IRC | 15:32 | |
mordred | phschwartz: please do! | 15:32 |
NobodyCam | and wanted to check if you thought it was still valid | 15:32 |
mordred | phschwartz: I'm excited about both patches | 15:32 |
fungi | anteaya: it's entirely likely it was misbehaving back when we didn't have a separate non-voting group and so was simply taken out of the group and never revisited later | 15:32 |
anteaya | fungi: ah okay | 15:32 |
anteaya | can we wiggle that onto a todo list? | 15:32 |
*** thuc_ has quit IRC | 15:32 | |
anteaya | fungi: I think One Convergence is in the same boat | 15:32 |
*** doude has quit IRC | 15:33 | |
*** thuc has joined #openstack-infra | 15:33 | |
ttx | mordred: that's not how you create bugs or anything else. it's far from REST | 15:33 |
mordred | ttx: nod | 15:34 |
anteaya | fungi: https://review.openstack.org/#/q/reviewer:oc-neutron-test%2540oneconvergence.com+status:open,n,z | 15:34 |
anteaya | fungi: they aren't on my list either | 15:34 |
fungi | anteaya: likely so | 15:34 |
anteaya | thanks | 15:34 |
fungi | NobodyCam: looking | 15:34 |
ttx | mordred: but then maybe I miss something. i'll gladly accept example code that shows me how to create a blueprint in Launchpad from the API. | 15:34 |
anteaya | fungi: I'm still catching up, is there something I can do of high priority to help? | 15:34 |
fungi | NobodyCam: the log line you linked to is a normal attempt by devstack-gate to discover whether there are any specific git refs calculated by zuul for a given project. git doesn't have a way to test for a remote ref without just trying to retrieve it and handling the error it returns when there isn't one to be had | 15:36 |
rainya | mordred, was germany right smack dab over my birthday your idea for midcycle?! | 15:36 |
fungi | rainya: bierhaus birthday blowout | 15:37 |
NobodyCam | fungi: Ack TY | 15:37 |
*** thuc has quit IRC | 15:38 | |
fungi | anteaya: aside from helping me spackle and sand drywall, probably not ;) | 15:38 |
rainya | fungi :) fwiw, not sure i'm going to be able to swing getting any of my team out there in person, which causes me much sadness! families for some silly reason expect us to be HOME for summer vacation | 15:38 |
fungi | anteaya: looked like it's been relatively quiet in here today thankfully | 15:38 |
* anteaya finds safety glasses | 15:38 | |
anteaya | fungi: yay | 15:39 |
rainya | phschwartz mentioned remote participation, so that will at least be a consolation prize for those of us that have spouses who would kill us if we changed vacation plans | 15:39 |
anteaya | okay I'm catching up on email and backscroll | 15:39 |
fungi | anteaya: probably just plugging at the gate job failures mostly, from an importance standpoint | 15:39 |
*** sarob has joined #openstack-infra | 15:39 | |
anteaya | fungi: do ping if I can spin any plates | 15:39 |
anteaya | fungi: kk, I'll put that on my list after I am caught up | 15:39 |
mordred | rainya: get less families | 15:40 |
rainya | mordred, not helpful advice, but thanks | 15:40 |
mordred | rainya: :) | 15:41 |
rainya | mordred, was going to do NYC for my birthday this year (by myself without families!) | 15:41 |
*** CaptTofu_ has joined #openstack-infra | 15:41 | |
fungi | living at the beach, i'm going to be glad to have a week away from fighting off the hordes of summer vacationers | 15:42 |
*** annegent_ has quit IRC | 15:42 | |
devananda | fungi: hi! have a minute to talk about gate-tempest-dsvm-virtual-ironic and its now non-voting nature? | 15:42 |
devananda | fungi: that is the main job that ironic uses in our own gate | 15:43 |
phschwartz | fungi: Where do you live? | 15:43 |
anteaya | fungi: I live in a summer vacation haven | 15:43 |
phschwartz | I am near the beaches in FL and I never go to them. lol | 15:43 |
anteaya | fungi: Sunday nights are the best | 15:43 |
anteaya | fungi: never never go to the grocery store on the weekend | 15:43 |
fungi | devananda: sure. sdague says it accesses the network a lot, which is causing it to fail over random network connectivity problems far more often than other jobs, something like 33% of the time now | 15:43 |
*** sarob has quit IRC | 15:43 | |
fungi | phschwartz: as of wednesday i'll be living in the north carolina outer banks | 15:44 |
devananda | fungi: we landed https://review.openstack.org/#/c/98886/ to address that issue | 15:44 |
phschwartz | Ah, very nice. At least it is a place with all 4 seasons instead of 4 versions of the same season | 15:44 |
fungi | phschwartz: a sandbar miles off shore, water within a few hundred feet in either direction | 15:45 |
fungi | hard to avoid the beach there | 15:45 |
devananda | fungi: i think sdague is just proposing that we move the caching of u-c-a keyring into nodepool, so it isn't part of the d-g job prep | 15:45 |
phschwartz | fungi: very hard. Here I at least live about 7 miles from the sand. lol | 15:45 |
devananda | fungi: which i think is grand (though i'm not sure how to do that yet) | 15:45 |
fungi | devananda: nodepool caches things which devstack has in its package lists | 15:45 |
devananda | fungi: can nodepool prerun apt-add-repository? | 15:46 |
isviridov | Hello infra, just 2 patches for magnetodb https://review.openstack.org/#/c/91050/ and https://review.openstack.org/#/c/99039/ | 15:46 |
fungi | devananda: the trick is that we need to un-add it too, because we don't want other tests besides ironic's subjected to uca versions of packages | 15:46 |
devananda | fungi: ironic's virtual tests can't run on precise w/o backports of certain things. I can dig up the bugs if needed | 15:47 |
*** ramashri has joined #openstack-infra | 15:47 | |
*** timrc is now known as timrc-afk | 15:47 | |
devananda | fungi: or can we pin ironic to 14.04 nodes? | 15:47 |
fungi | devananda: so possibly nodepool could grow a routine to add the repository, update package lists, retrieve the keyring package, remove the repository, update package lists again | 15:47 |
fungi | devananda: though i'm sure we can pin ironic to 14.04 nodes once we have some | 15:48 |
*** eharney_ has joined #openstack-infra | 15:48 | |
*** eharney has quit IRC | 15:48 | |
*** eharney_ is now known as eharney | 15:49 | |
*** zns has joined #openstack-infra | 15:49 | |
ttx | devananda: you're wanted in #openstack-relmgr-office | 15:49 |
fungi | devananda: another option might be a specific devstack-precise-uca node type, but that would probably be nearly as much work as getting trusty implemented | 15:49 |
*** mugsie has quit IRC | 15:52 | |
fungi | anteaya: yeah, the local wisdom seems to be dine at restaurants on weekends because vacationers are busy with check-in/check-out, and do your grocery shopping weekdays first thing in the morning or very late at night. and also if it's raining don't go to stores, restaurants, movie theaters, bowling alleys, arcades... they'll be packed like sardine cans | 15:52 |
anteaya | you got it | 15:52 |
anteaya | that is _exactly_ what we do | 15:53 |
anteaya | and wait for October | 15:53 |
anteaya | they all leave by then | 15:53 |
fungi | yup | 15:53 |
anteaya | I did chat with the fire chief and word got around so I don't have fireworks every week anymore | 15:54 |
*** gokrokve has joined #openstack-infra | 15:54 | |
*** habib has quit IRC | 15:54 | |
anteaya | just on holiday weekends which is about 6 weeks out of the summer, so that is okay | 15:54 |
sweston | anteaya: ok, I responded to the ML, and I am ready to move on when you are :-) | 15:55 |
*** cp16net_ has joined #openstack-infra | 15:55 | |
*** bogdando has quit IRC | 15:55 | |
*** bogdando has joined #openstack-infra | 15:56 | |
*** cp16net_ has quit IRC | 15:56 | |
*** markmcclain has quit IRC | 15:57 | |
*** markmcclain has joined #openstack-infra | 15:57 | |
*** ihrachyshka has joined #openstack-infra | 15:57 | |
*** markmcclain has quit IRC | 15:57 | |
*** markmcclain has joined #openstack-infra | 15:58 | |
*** marcoemorais has quit IRC | 15:58 | |
anteaya | sweston: okay so it appears that Brocade CI is not in the third-party CI gerrit group, so when fungi has a moment for housekeeping he will fix that so that it is | 15:59 |
anteaya | sweston: so you actually have 4 Brocade accounts | 15:59 |
*** e0ne_ has quit IRC | 15:59 | |
*** cp16net_ has joined #openstack-infra | 15:59 | |
*** hashar has quit IRC | 16:00 | |
*** e0ne has joined #openstack-infra | 16:00 | |
fungi | anteaya: added | 16:00 |
fungi | anteaya: there was another you thought should be added too? | 16:00 |
Shrews | fungi: hope your move goes smoothly. will suck to not have you nearby now | 16:00 |
anteaya | One Convergence | 16:00 |
anteaya | https://review.openstack.org/#/q/reviewer:oc-neutron-test%2540oneconvergence.com+status:open,n,z | 16:00 |
openstackgerrit | Maxime Vidori proposed a change to openstack-infra/storyboard-webclient: Remove boostrap.js https://review.openstack.org/99638 | 16:00 |
fungi | Shrews: not really--you'll just have to come to the beach for work-beer | 16:00 |
openstackgerrit | Maxime Vidori proposed a change to openstack-infra/storyboard-webclient: Removal of jquery https://review.openstack.org/99660 | 16:00 |
anteaya | fungi: they show up as a third party ci system | 16:01 |
fungi | anteaya: done | 16:01 |
anteaya | fungi: thank you | 16:01 |
*** cp16net_ has quit IRC | 16:02 | |
*** cp16net_ has joined #openstack-infra | 16:03 | |
sweston | anteaya: ok, I see you updated the etherpad. | 16:03 |
*** mestery has quit IRC | 16:03 | |
*** pblaho has quit IRC | 16:03 | |
*** mestery has joined #openstack-infra | 16:04 | |
anteaya | yeah | 16:04 |
*** cp16net has quit IRC | 16:04 | |
anteaya | early on we didn't have the two groups voting and non-voting | 16:04 |
*** cp16net_ is now known as cp16net | 16:04 | |
*** e0ne has quit IRC | 16:04 | |
anteaya | so if a system went wild and we removed their voting rights there was no group for them | 16:04 |
anteaya | we have most corralled back again | 16:05 |
anteaya | since we have a non-voting group now, which is the default group | 16:05 |
*** pblaho has joined #openstack-infra | 16:05 | |
*** mwagner_lap has joined #openstack-infra | 16:07 | |
*** yfried_ has quit IRC | 16:07 | |
*** shayneburgess has quit IRC | 16:07 | |
sweston | anteaya: I understand. | 16:08 |
anteaya | sweston: great thanks | 16:09 |
anteaya | now that we have the emergencies addressed | 16:10 |
anteaya | what can I do to help? | 16:10 |
*** pblaho has quit IRC | 16:10 | |
*** yfried_ has joined #openstack-infra | 16:11 | |
*** ociuhandu has joined #openstack-infra | 16:11 | |
*** chuckC has quit IRC | 16:11 | |
sweston | anteaya: ok, sorry trying to balance another conversation .. | 16:11 |
*** locke105 has joined #openstack-infra | 16:12 | |
devananda | fungi: I'm looking at a recent failure of gate-tempest-dsvm-virtual-ironic and wondering why it didn't seem to run the fix we landed several days ago | 16:12 |
*** mrodden1 has quit IRC | 16:12 | |
devananda | http://logs.openstack.org/14/96114/4/gate/gate-tempest-dsvm-virtual-ironic/7f8433c/console.html#_2014-06-12_06_42_32_827 | 16:12 |
devananda | vs https://review.openstack.org/#/c/98886/1/modules/openstack_project/files/jenkins_job_builder/config/devstack-gate.yaml | 16:13 |
anteaya | sweston: I hear that | 16:13 |
devananda | there's no log of running apt-get update prior to a-a-r uca | 16:13 |
*** chuckC has joined #openstack-infra | 16:13 | |
openstackgerrit | Julien Vey proposed a change to openstack-infra/config: Add solum-guestagent repo to stackforge https://review.openstack.org/99705 | 16:13 |
fungi | devananda: seems to be using a devstack-precise node in hpcloud-b3. what was the url for the fix again? | 16:14 |
devananda | fungi: just pasted it :) | 16:14 |
fungi | devananda: i'll see if it's a stale nodepool image issue | 16:14 |
*** mbacchi has quit IRC | 16:14 | |
fungi | oh, that was it | 16:14 |
devananda | fungi: if that fix isn't fixing the problem, that's one thing. but if stale nodepool is causing all the gate failures that sdague is pointing out, that's another | 16:15 |
fungi | seems to have merged 2014-06-10 at 18:08 | 16:15 |
sdague | devananda: regardless, 65% pass rate is terrible | 16:15 |
devananda | sdague: i agree | 16:15 |
sdague | the ironic team has to get that up | 16:15 |
fungi | devananda: and yeah, it was a job config change, so wouldn't be impacted by stale node images... | 16:15 |
sdague | and has to be the people monitoring that | 16:15 |
sdague | landing a gate job also incurs the responsibility of ensuring it's at a high success rate, and digging on it when it's not | 16:16 |
devananda | sdague: and we are | 16:16 |
devananda | sdague: our gate was completely blocked for ~10 days so we couldn't do much to even see when it would/wouldn't work | 16:16 |
devananda | then nova landed the revert to unblock us | 16:16 |
devananda | we landed ~15 bug fixes | 16:16 |
devananda | including one that we thought addressed that very issue | 16:17 |
devananda | (in infra) | 16:17 |
sdague | well you also need to realize there is no more "our gate" | 16:17 |
sdague | you are in the main integration gate now | 16:17 |
fungi | devananda: another possibility is that jenkins02's job configuration is stale for some reason... checking that next | 16:17 |
devananda | sorry -- our check queue was blocked | 16:17 |
*** annegent_ has joined #openstack-infra | 16:17 | |
sdague | so the impact goes way up | 16:17 |
devananda | sdague: but you're right. that's another problem | 16:18 |
sweston | anteaya: I am finding out how to rename the Brocade CI account. | 16:18 |
anteaya | sweston: you can't | 16:18 |
devananda | sdague: ironic afaik shouldn't be in the integrated gate right now | 16:18 |
anteaya | sweston: we have to | 16:18 |
sweston | anteaya: yes, i mean, what to rename it to | 16:18 |
anteaya | sweston: all you have to do is tell me what name you want it to be | 16:18 |
sdague | yeh, this actually kind of leads back to whether or not we can take the risk on oslotest | 16:18 |
anteaya | sweston: ah yes, that I do need from you | 16:18 |
devananda | sdague: whether our tempest jobs vote on ironic or not shouldn't impact the integrated gate. I recognize that it does -- how can we detangle that? | 16:18 |
sdague | because I'm actually kind of concerned of the "join the world" that oslotest is causing | 16:18 |
anteaya | sweston: thanks | 16:18 |
sweston | anteaya: yup yup | 16:18 |
devananda | sdague: without disabling our main test's ability to vote on ironic changes | 16:18 |
sdague | devananda: oslotest | 16:19 |
sdague | is dhellmann about | 16:19 |
*** jistr has quit IRC | 16:19 | |
sdague | this is the problem, but the way the oslotest jobs are set up, everything that uses oslotest is now merged into one flow | 16:19 |
sdague | and from a theory perspective, I get why | 16:20 |
devananda | i really want to see that job continue to vote on ironic -- even failing 33% of the time -- because that forces us to clean things up. making it non-voting doesn't encourage ironic devs to fix it. | 16:20 |
sdague | but from a practice perspective, I think we're kind of boned | 16:20 |
sdague | devananda: it's still voting in check | 16:20 |
sdague | and you have to have clean check to get to the gate | 16:21 |
*** Hal_ has quit IRC | 16:22 | |
devananda | sdague: maybe i'm missing something in how oslotest tied everything together | 16:22 |
devananda | sdague: how does a job that votes on ironic's gate cause patches to /other/ projects to fail to merge? | 16:22 |
fungi | devananda: yeah, so that was it... the jenkins master associated with that particular slave still has the previous version of the job config. digging now to see if/why jjb is failing to update it there | 16:23 |
sdague | devananda: ok. stop using the term "ironic's gate" | 16:23 |
devananda | sdague: or is the problem that the integrated gate is now serialized, so if there is an ironic change in the merge queue, and it fails, it slows things down? | 16:23 |
sdague | because no project has their own gate | 16:23 |
*** markmcclain has quit IRC | 16:23 | |
sdague | gates are constructed by computing projects that have overlapping jobs | 16:24 |
devananda | sorry, i'll rephrase | 16:24 |
sweston | anteaya: I am not getting a response right now, I will need to follow up with you when I have more info. | 16:24 |
anteaya | sweston: very good | 16:24 |
sdague | and the issue is a gate reset, because of failing in the gate, adds about 1 hr delay to redo all the jobs | 16:24 |
anteaya | sweston: anything else for the moment? | 16:24 |
sdague | ironic had 30 resets in the last 48 hrs in the gate queue on that one job | 16:25 |
sweston | anteaya: ok, so .. moving on. I have Zuul reporting back to the openstack-dev sandbox now. | 16:25 |
devananda | sdague: gotcha | 16:25 |
sdague | which means generating 30 hrs of delay (back of the envelope) | 16:25 |
sdague | that's huge | 16:25 |
*** mugsie has joined #openstack-infra | 16:25 | |
sdague | and a piece of the puzzle for why we have a 24 hour gate pipeline right now | 16:25 |
sweston | anteaya: can we verify that is happening correctly? | 16:25 |
anteaya | sweston: may I have a url | 16:25 |
devananda | sdague: right. now I understand. | 16:26 |
devananda | sdague: no predictive parallel testing, so one failure means re-testing everything "behind" that patch | 16:26 |
sdague | devananda: we are predictive | 16:26 |
sdague | but it's a speculation | 16:26 |
sdague | so if you fail in the gate, we have to unwind and redo our speculation | 16:27 |
sweston | anteaya: yes https://review.openstack.org/#/c/99656/ | 16:27 |
*** cgoncalves has quit IRC | 16:27 | |
*** cgoncalves has joined #openstack-infra | 16:27 | |
sdague | otherwise you can land code that never was tested in that combination | 16:27 |
fungi | optimistically predictive to avoid wasting even more resources on jobs than we already do | 16:27 |
fungi | as opposed to pessimistically predicting several failures deep in case a change fails | 16:27 |
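Editorial sketch of the gate-reset cost sdague and fungi describe: changes are tested optimistically on the assumption that everything ahead of them passes, so each failure in the gate forces the jobs behind it to be redone. The model below is deliberately crude. It only counts failures as resets and charges a flat delay per reset (sdague's "about 1 hr"); it is not how Zuul schedules jobs, and `run_gate` and its parameters are hypothetical names.

```python
from typing import Dict, List, Tuple


def run_gate(
    queue: List[str],
    passes: Dict[str, bool],
    hours_per_reset: float = 1.0,
) -> Tuple[List[str], int, float]:
    """Return (merged changes, number of resets, estimated delay in hours)."""
    merged: List[str] = []
    resets = 0
    for change in queue:
        if passes[change]:
            merged.append(change)
        else:
            # The failing change is ejected; everything behind it was tested
            # on top of it, so that speculation is unwound and redone.
            resets += 1
    return merged, resets, resets * hours_per_reset
```

With the figures quoted in the log (30 resets at roughly 1 hour each), this back-of-the-envelope model gives the same 30 hours of delay sdague mentions.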
sdague | yeh, with infinite resources you could assume things would fail and just grind | 16:27 |
anteaya | sweston: okay so third-party ci systems can't post "Starting check jobs." please disable | 16:28 |
anteaya | sweston: it is too much noise | 16:28 |
sdague | but take the current resources and multiply | 16:28 |
clarkb | o/ | 16:28 |
sdague | to get what you'd need for that | 16:28 |
sweston | anteaya: ok | 16:28 |
anteaya | sweston: and firefox can't find your logs: http://logs.ci.vyatta.net/56/99656/1/check/noop-check-communication/288afb5738384c4ebea7dd6ecdc83722 | 16:28 |
clarkb | anteaya: we actually need to turn that off in our zuul too | 16:28 |
sweston | anteaya: yes, I need to fix that | 16:29 |
*** Longgeek has quit IRC | 16:29 | |
anteaya | clarkb: awesome | 16:29 |
anteaya | sweston: great, that is my feedback | 16:29 |
clarkb | anteaya: gerrit 2.8 should fix that. I can confirm over on review-dev | 16:29 |
fungi | clarkb: though in actuality now gerrit adds a comment anyway when zuul un-sets its vote, so... | 16:29 |
anteaya | sweston: go forth, do great things | 16:29 |
clarkb | oh | 16:29 |
anteaya | clarkb: awesome, thank you | 16:29 |
clarkb | silly gerrit | 16:30 |
fungi | clarkb: it'll just be a comment with "patchset X: -Verify" as the only content | 16:30 |
sweston | anteaya: ok, so nothing should be reported when the system starts the check job? | 16:30 |
anteaya | sweston: correct | 16:30 |
fungi | clarkb: at least i think that's the new vote+0 behavior | 16:30 |
anteaya | sweston: with 20 systems reporting on a patch, that is too much activity for too little information | 16:30 |
*** mrodden has joined #openstack-infra | 16:31 | |
devananda | sdague: so. non-integrated projects which are now, by virtue of oslotest, in the integrated gate. that includes incubated projects, tripleo projects, etc | 16:31 |
sdague | yep | 16:31 |
clarkb | fungi: ya I think you are correct. I can poke at review-dev to see | 16:31 |
devananda | sdague: is that going to change? or the new status quo? | 16:31 |
sdague | that's some fallout that was not entirely figured out | 16:31 |
clarkb | fungi: it may end up being clearer that way anyways? | 16:31 |
sdague | I think we need to rethink that | 16:31 |
*** _nadya_ has joined #openstack-infra | 16:32 | |
devananda | ++ | 16:32 |
fungi | clarkb: it may end up being less code in zuul, at least | 16:32 |
sdague | and just take the risk of oslotest breaking | 16:32 |
*** gokrokve has quit IRC | 16:33 | |
sdague | but I need dhellmann around before I propose that | 16:33 |
mordred | clarkb: phschwartz and I were talking in the scrollback about a patch to nodepool to use nova rebuild in some cases | 16:33 |
clarkb | mordred: what does nova rebuild do? | 16:34 |
mordred | clarkb: it reuses the vm but splats the contents of the original image on it - so the end result is like having booted a new instance, but it's quicker/cheaper | 16:34 |
sweston | anteaya: gotcha. I have a response on our earlier query, could we change Brocade CI to Brocade BNA CI, and change Brocade OSS CI to Brocade Vyatta CI | 16:34 |
anteaya | yes | 16:35 |
anteaya | I will make a note of those | 16:35 |
sdague | mordred: when it works :) | 16:35 |
mordred | clarkb: the short theory is to have the node launching logic say "I need more nodes, are there any nodes with a matching image in the DELETE state, if so, rebuild, if not, launch a new one" | 16:35 |
clarkb | mordred: lifeless: also the more I think of the dib service issue the more I come back to have puppet install mysql and postgres then manually start postgres and mysql in post-install.d and add the user and db. Then once we are switched delete that code from puppet to prevent branching too hard | 16:35 |
anteaya | we don't have a time yet for the naming changes to take effect, but when we rename those are the names they will be | 16:35 |
sweston | anteaya: Yay!! And I have a new ssh key we would like associated with Brocade BNA CI, how should I post that to you? | 16:35 |
mordred | clarkb: ++ | 16:35 |
mordred | clarkb: I swear I'm going to write you a patch in that direction | 16:36 |
clarkb | ok | 16:36 |
mordred | clarkb: you know, if I can get the heck out of vegas | 16:36 |
clarkb | mordred: are you ready for world cup? | 16:36 |
clarkb | I am hanging out at home today so that I can watch today's game | 16:36 |
mordred | sdague, clarkb: for the nodepool patch, I think we'd also want to put a flag or a counter on a node in the db so that if we have rebuild failures, we can mark it as "please don't try to rebuild me" | 16:36 |
mordred | clarkb: I am - although we're going to be travelling for this game :( | 16:37 |
clarkb | mordred: phschwartz sdague once a node goes into delete that means we are trying to delete it | 16:37 |
mordred | clarkb: right | 16:37 |
sdague | mordred: it would honestly be nice if nodepool ended image build with trying to run tempest-dsvm-full | 16:37 |
sdague | to know if the thing worked | 16:37 |
clarkb | how do we preempt that deletion on the nova side? | 16:37 |
mordred | sdague: yes. I want to get to that point | 16:37 |
anteaya | sweston: please post the ssh key to the infra email list | 16:37 |
clarkb | sdague: I want to make that part of the dib cycle | 16:37 |
sdague | especially with the UPDATE_REPOS=False, to know that we cached | 16:37 |
mordred | clarkb: don't we mark it for delete in the db and then have a reaper thread which goes through and does the deletes? Or am I high? | 16:37 |
sdague | because, right now we don't | 16:38 |
anteaya | sweston: please check the etherpad to ensure all the Brocade account names are correct | 16:38 |
sdague | in a lot of the images | 16:38 |
clarkb | sdague: create dib image, upload to glance as image-beta/test/derp then have a periodic job that runs once a day that only runs on that image flavor which if it passes triggers a thing to rename the image | 16:38 |
anteaya | sweston: and note that the usernames will not be automatically renamed, we are going to make a list of accounts that are willing to volunteer for that | 16:38 |
*** openstackgerrit has quit IRC | 16:38 | |
clarkb | mordred: yes that is how it works | 16:38 |
clarkb | mordred: so there is a race between making delete api request and rebuilding | 16:38 |
sweston | anteaya: I am going to set up mailing lists for all of the accounts, so that we can add and remove people from the list as necessary, but I do not have the server up yet. Will that be okay? | 16:38 |
phschwartz | clarkb: Correct, but if all we have to do is look for an instance that is marked as delete, but not deleted yet. (not sure if the db currently denotes if the instance has been grabbed by the delete thread and is in the process of calling the api to delete. | 16:39 |
clarkb | mordred: and currently I don't think you can win that race because it's a fifo queue | 16:39 |
mordred | clarkb: right. so once it's marked in the db, if the delete thread gets to it, great. but if it doesn't and the rebuild thread gets to it, neat | 16:39 |
anteaya | sweston: since the timing of that renaming needs to be co-ordinated since once we change the name, you can't use the system until you rename your system | 16:39 |
mordred | clarkb: ah - so it might need a little more logic work then | 16:39 |
clarkb | yes | 16:39 |
*** sarob has joined #openstack-infra | 16:39 | |
mordred | kk. | 16:39 |
mordred | phschwartz: have fun! | 16:39 |
clarkb | I am sure you can make it happen, but you will have to be careful to not mark a rebuilt node as ready again | 16:39 |
anteaya | sweston: as long as I can talk to you about anything Brocade, and you can turn systems off if they go wild | 16:39 |
clarkb | then have it deleted in 20 minutes | 16:39 |
anteaya | sweston: you do anything you need to do from your end to make it work | 16:40 |
sdague | clarkb: cool. though right now, I mostly need a working ES cluster :) | 16:40 |
clarkb | sdague: yes I am booting this morning | 16:40 |
sdague | woot | 16:40 |
devananda | fungi: any further info on the stale image? | 16:40 |
phschwartz | mordred: I think there is a way around it. If we can grab it before it is deleted, then we can change the status so the delete thread never tries to delete it. (drop it from the queue, might have to move from being a fifo for that) | 16:40 |
clarkb | sdague: I think next step is to start tailing all the logs and make sense of what appears to be a cyclic process | 16:40 |
sweston | anteaya: ok, awesome :-) what does that mean, then. we won't be able to use the accounts from when to when? | 16:41 |
*** marcoemorais has joined #openstack-infra | 16:41 | |
devananda | sdague, fungi: it looks like, in the last 48 hours, only one failure in gate-tempest-dsvm-virtual-ironic was from something /other/ than UCA failing (which I think is just due to stale images at this point, since we landed a fix on 6-10) | 16:41 |
clarkb | wait wasn't the fix in the job definitions not the slaves? | 16:41 |
sweston | anteaya: yes, I will be available 24/7 in the event of problems | 16:41 |
sdague | devananda: sure, but until that's actually working and fixed, it's still a problem | 16:42 |
*** marcoemorais has quit IRC | 16:42 | |
devananda | sdague: ack | 16:42 |
*** dizquierdo has quit IRC | 16:42 | |
sdague | I don't actually care why it's failing, the fact that it's failing disqualifies that from voting | 16:42 |
*** marcoemorais has joined #openstack-infra | 16:42 | |
phschwartz | There is another benefit of getting the rebuild working also. It removes a race where one of our regions might be out of capacity and when you do the delete it hits a soft-delete while it is waiting for the resources to be released, basically leaving it as an unusable slot for a short period of time. If we do a rebuild instead, the slot never leaves the usage from infra so no competing for a new slot that might not be available at that time. | 16:42 |
clarkb | phschwartz: yup I think it will be a good thing to try | 16:43 |
anteaya | sweston: if, and only if, you volunteer to have your accounts renamed, you I and fungi pick a time for it to happen, we change gerrit, you change your 4 systems, and we are all good | 16:43 |
anteaya | sweston: if you choose to do it, it should take about 15 minutes start to finish | 16:43 |
*** sarob has quit IRC | 16:43 | |
anteaya | sweston: the trick is to pick a time when fungi has the 15 minutes | 16:44 |
phschwartz | clarkb: I am going to start working on it monday morning while I am flying to SAT. I want to get a WIP patch out for you guys to start looking at as soon as I can. | 16:44 |
*** markmcclain has joined #openstack-infra | 16:44 | |
clarkb | phschwartz: mordred: or even preempt the entire marking of DELETION if the current needed nodes is non zero | 16:45 |
clarkb | this doesn't fix it quite so properly but may be simpler | 16:45 |
anteaya | sweston: I am afk for a bit, I will let you know when I am back | 16:45 |
sweston | anteaya: that sounds great. I will let you know when we are ready to proceed with the name changes, send the new key out to the mailing list, and ping you when the log and status servers are up. | 16:45 |
*** zehicle_at_dell has quit IRC | 16:45 | |
sdague | ok, going to drop off for a bit, need to get some lunch and relocate back to home. | 16:45 |
sweston | anteaya: oh ok, looks like our messages crossed paths, I'll wait :-) | 16:45 |
phschwartz | clarkb: hmm, I think I have an idea based on that logic. Give me a second to think before I respond. | 16:45 |
*** isviridov is now known as isviridov|away | 16:47 | |
*** rfolco has joined #openstack-infra | 16:47 | |
*** amcrn has joined #openstack-infra | 16:47 | |
phschwartz | clarkb: ok, I know this would complicate the process a bit. But what if when we are done with an instance or at a time when we would mark it for deletion, we mark it for rebuild. We would need 1 more thread for the rebuilder that if it decides it cannot rebuild an instance, it would flag it for deletion. This would make it so there is no need to modify delete process at all. | 16:48 |
phschwartz | The rebuild thread would put it into a state that the create thread can then use. | 16:49 |
*** fanhe has quit IRC | 16:49 | |
devananda | fungi: fwiw, all the failures seem to be from jobs started by jenkins02 | 16:49 |
*** _nadya_ has quit IRC | 16:50 | |
devananda | except for one java.io.IOException from jenkins04 | 16:50 |
*** sweston has quit IRC | 16:51 | |
*** maxbit has quit IRC | 16:51 | |
*** sweston has joined #openstack-infra | 16:51 | |
*** cp16net has quit IRC | 16:52 | |
*** chianingwang has joined #openstack-infra | 16:53 | |
clarkb | phschwartz: yeah you could slip something in between like that | 16:53 |
clarkb | phschwartz: which should make it a bit easier to reason about races as there shouldn't be any | 16:53 |
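The rebuild-before-delete idea phschwartz describes can be sketched as a small state machine. This is only an illustration under assumptions: the state names, the `Node` class, and the `try_rebuild` callback are hypothetical and are not taken from nodepool's actual code.

```python
from enum import Enum


class State(Enum):
    IN_USE = "in-use"
    REBUILD = "rebuild"
    READY = "ready"
    DELETE = "delete"


class Node:
    """Hypothetical stand-in for a nodepool node record."""

    def __init__(self, name):
        self.name = name
        self.state = State.IN_USE


def release(node):
    """A job finished: queue the node for rebuild instead of deletion."""
    node.state = State.REBUILD


def rebuild_pass(nodes, try_rebuild):
    """One pass of the (hypothetical) rebuild worker.

    try_rebuild(node) returns True when the cloud rebuilt the instance.
    Only the rebuild worker moves a node to DELETE, so the delete
    worker never competes for a node that is still being rebuilt.
    """
    for node in nodes:
        if node.state is not State.REBUILD:
            continue
        node.state = State.READY if try_rebuild(node) else State.DELETE


def delete_pass(nodes):
    """Unmodified delete worker: drops DELETE nodes, returns the rest."""
    return [n for n in nodes if n.state is not State.DELETE]


a, b = Node("a"), Node("b")
release(a)
release(b)
# Pretend the rebuild succeeds for "a" and fails for "b".
rebuild_pass([a, b], lambda n: n.name == "a")
remaining = delete_pass([a, b])
```

Because a node is in exactly one state and only one worker acts on each state, the delete path needs no changes, which is the property clarkb points out.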
rcarrillocruz | hey guys, i'm deploying review.pp in a cloud instance. If I access the server with http, I get "The requested URL / was not found on this server.", if I access it with https I get "SSL Connection error". | 16:53 |
phschwartz | I am going to think more on it while I head out with the wife to get lunch. | 16:53 |
*** trinaths has quit IRC | 16:53 | |
clarkb | devananda: right I don't think it is a slave thing it is a master thing | 16:53 |
rcarrillocruz | now, if i edit the vhost and replace the VirtualHostname <hostname>:<port> for VirtualHostname *:<port> it works | 16:53 |
*** mkerrin has joined #openstack-infra | 16:53 | |
clarkb | phschwartz: sure let me know what you come up with (I think any approach is worth doing though) | 16:54 |
mordred | sdague: ++ to removing postgres, btw | 16:54 |
*** harlowja_away is now known as harlowja | 16:54 | |
clarkb | rcarrillocruz: does the name match the name you are hitting it with? | 16:54 |
clarkb | rcarrillocruz: if you try to hit localhost but the vhost says review.foo it won't work | 16:54 |
devananda | clarkb: how do we address it? that staleness is still causing ~30% of our tests to fail | 16:54 |
phschwartz | sdague: ++ from me also, I am a certified postgres admin and developer and I can't stand using it. | 16:54 |
*** markwash has joined #openstack-infra | 16:54 | |
clarkb | devananda: fungi said he was looking at it | 16:54 |
devananda | clarkb: ack | 16:55 |
rcarrillocruz | what i did was to add the host name in my laptop /etc/hosts and in the cloud instance itself | 16:55 |
devananda | just trying to be helpful | 16:55 |
rcarrillocruz | <external ip> gerrit.openstacklocal | 16:55 |
clarkb | devananda: ya I think we need to hear back from fungi | 16:55 |
clarkb | devananda: it *should* be as simple as kicking JJB to apply the new job | 16:55 |
clarkb | rcarrillocruz: and gerrit.openstacklocal is what the vhost said before the splat? | 16:55 |
rcarrillocruz | lemme paste the vhost in paste.openstack.org | 16:56 |
mordred | sdague, jogo: I know I'm a broken record on this ... but ^^ is the dumbest default behavior ever. I blame both of you because of the nova-core status | 16:56 |
mordred | Kiall: you're going to fix that with designate, right? | 16:57 |
*** zehicle_at_dell has joined #openstack-infra | 16:58 | |
*** skraynev has quit IRC | 16:58 | |
rcarrillocruz | clarkb: http://paste.openstack.org/show/83844/ | 16:58 |
*** jcoufal has joined #openstack-infra | 16:58 | |
*** skraynev has joined #openstack-infra | 16:59 | |
rcarrillocruz | and put in both laptop/gerrit server the pair <external IP> gerrit.openstacklocal gerrit under /etc/hosts | 16:59 |
rcarrillocruz | so it shouldn't resolve to localhost | 16:59 |
*** sarob has joined #openstack-infra | 16:59 | |
*** rwsu has joined #openstack-infra | 17:00 | |
*** radez is now known as radez_g0n3 | 17:00 | |
*** bogdando has quit IRC | 17:00 | |
*** jlibosva has quit IRC | 17:01 | |
*** marcoemorais has quit IRC | 17:02 | |
devananda | clarkb: in response to sdague's email, how do you feel about nodepool precaching UCA? I have yet to look into the nodepool code, but it sounds like he doesn't want to re-enable voting on that job until it no longer pulls anything directly from UCA | 17:02 |
*** jlibosva has joined #openstack-infra | 17:02 | |
mordred | why don't we just have UCA enabled from the get-go again? | 17:02 |
Kiall | mordred: was AFK.. What am I fixing? ;) | 17:03 |
mordred | Kiall: nova boot foo.bar.com ; ssh foo.bar.com ; hostname == foo.openstacklocal | 17:03 |
Kiall | Yea - We had a chat with Nova/Neutron guys at the summit to talk about fixing that ;) | 17:04 |
mordred | Kiall: I generally assume you can solve all of my problems | 17:04 |
clarkb | mordred: devananda UCA doesn't work | 17:04 |
*** zns_ has joined #openstack-infra | 17:04 | |
clarkb | or hadn't | 17:05 |
mordred | oh. well, that's a good reason | 17:05 |
clarkb | which is why we avoided it | 17:05 |
Kiall | mordred: funny that, I thought I caused more problems than I fixed ;) | 17:05 |
clarkb | libvirt was broken | 17:05 |
clarkb | mongo was broken | 17:05 |
devananda | clarkb: https://review.openstack.org/#/c/98886/1 | 17:05 |
*** dprince has quit IRC | 17:05 | |
clarkb | devananda: right so in that case its just you that are affected | 17:05 |
mordred | devananda: I believe he means "the software in UCA is broken" | 17:05 |
*** ramashri has quit IRC | 17:05 | |
devananda | mordred: ah | 17:05 |
mordred | not the mechanism | 17:05 |
devananda | so in this case, the issue sdague has is the mechanism | 17:06 |
*** andreykurilin_ has joined #openstack-infra | 17:06 | |
devananda | that we're installing UCA at run time, instead of precaching it | 17:06 |
*** nati_ueno has joined #openstack-infra | 17:06 | |
clarkb | sdague: so it looks like some of our logstash-indexers are off derping | 17:06 |
mordred | yah. hrm. I wonder ... | 17:06 |
*** ramashri has joined #openstack-infra | 17:06 | |
mordred | devananda: I have an idea | 17:06 |
devananda | mordred: ironic's tempest job is now non-voting | 17:06 |
clarkb | sdague: I am going to sweep through and see if I can figure out what they are doing but I think the issue is at the logstash indexer level | 17:07 |
mordred | devananda: we do the normal apt precaching that we do | 17:07 |
devananda | mordred: because, essentially, installing UCA fails too often | 17:07 |
clarkb | sdague: so adding new ones helped until they derped too | 17:07 |
*** zns has quit IRC | 17:07 | |
mordred | devananda: then we a-a-r uca and do another round of apt precaching ... | 17:07 |
mordred | devananda: THEN, remove the sources.list.d file | 17:07 |
mordred | so that "enabling" uca is not running a-a-r, it's adding the sources.list.d file back, and then any additional packages you'd get from uca would also be pre-cached | 17:08 |
mordred | clarkb: ^^ sanity check me on that | 17:08 |
mordred | I think that would allow us the mechanism to pre-cache/pre-download the things we need without polluting the box for non-uca runs | 17:08 |
mordred | we could even add uca without a-a-r at all | 17:09 |
mordred | after all, it's just a sources.list.d file and an apt-key command | 17:09 |
clarkb | mordred: that should work | 17:09 |
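The image-build sequence mordred outlines (precache, enable UCA, precache again, remove the sources file) could be written down as a dry-run plan. This is a sketch under assumptions, not what cache_devstack.py does: the function name, the package names, the repository spec, and the sources file path below are all hypothetical, and nothing is executed.

```python
def precache_plan(base_packages, uca_packages, repo_spec, sources_file):
    """Return the commands (as argv lists) for one image build.

    Nothing is run here; the caller decides how to execute the plan.
    """
    download = ["apt-get", "--yes", "--download-only", "install"]
    return [
        ["apt-get", "update"],
        download + list(base_packages),      # normal precache round
        ["add-apt-repository", "--yes", repo_spec],
        ["apt-get", "update"],
        download + list(uca_packages),       # second round with UCA enabled
        ["rm", "-f", sources_file],          # leave the image without UCA
        ["apt-get", "update"],               # restore the non-UCA package index
    ]


plan = precache_plan(
    base_packages=["git"],                           # placeholder package
    uca_packages=["libvirt-bin"],                    # placeholder package
    repo_spec="cloud-archive:icehouse",              # assumed repo spec
    sources_file="/etc/apt/sources.list.d/uca.list", # hypothetical path
)
```

A job that wants UCA would then only need to put the sources file back and refresh the index; the packages themselves would already be in the local apt cache.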
*** zns_ has quit IRC | 17:09 | |
mordred | we could potentially generalize it - so that the UCA repo is referenced somewhere in devstack as a repo that might get enabled | 17:10 |
*** marcoemorais has joined #openstack-infra | 17:10 | |
*** zehicle_at_dell is now known as zehicle_defcore | 17:10 | |
mordred | and we could have a generalized thing in d-g that pre-caches stuff with any additional repos that devstack lists | 17:10 |
*** zns has joined #openstack-infra | 17:10 | |
mordred | but that would be step two and may never be needed | 17:10 |
clarkb | wow dnsmasq hates us in syslog | 17:11 |
devananda | mordred: so, side note, since https://review.openstack.org/#/c/98886/1/modules/openstack_project/files/jenkins_job_builder/config/devstack-gate.yaml landed | 17:11 |
*** james_li has quit IRC | 17:11 | |
devananda | mordred: i haven't seen any more of those failures | 17:11 |
devananda | mordred: except for the possibly-stale nodes that fungi is looking into | 17:11 |
*** zzelle has joined #openstack-infra | 17:12 | |
mordred | devananda: yah - but we do have a hole where we'll be downloading from the internet rather than from pre-cache | 17:12 |
devananda | mordred: but in principle, I dont see why this really fixes it | 17:12 |
devananda | exactly | 17:12 |
mordred | devananda: and we've developed pretty good history to know that that WILL break | 17:12 |
mordred | it's just a matter of time | 17:12 |
devananda | right | 17:12 |
mordred | so the longer version above should fix it the _right_ way | 17:12 |
*** markmc has quit IRC | 17:14 | |
*** derekh_ has quit IRC | 17:14 | |
*** trinaths has joined #openstack-infra | 17:15 | |
*** annegent_ has quit IRC | 17:17 | |
fungi | devananda: not so much stale nodes and jenkins masters not getting job configs updated | 17:17 |
*** Ryan_Lane has joined #openstack-infra | 17:17 | |
fungi | gah, step away for a few minutes and so many pings | 17:17 |
fungi | devananda: i started the jjb update on jenkins02 before i stepped away, but it's still churning | 17:19 |
fungi | seems to think it needs to reconfigure lots and lots of jobs | 17:19 |
*** trinaths has quit IRC | 17:19 | |
*** cp16net has joined #openstack-infra | 17:19 | |
fungi | it's creating a bunch it was missing too | 17:19 |
devananda | fungi: ack | 17:20 |
devananda | mordred: so i haven't dug into the nodepool precaching code before. a) do you think step1 is needed before gate-tempest-dsvm-virtual-ironic can vote again? b) if so, mind pointing me in the right direction to get started on that? | 17:21 |
*** gokrokve has joined #openstack-infra | 17:22 | |
*** jlibosva has quit IRC | 17:23 | |
*** esker has joined #openstack-infra | 17:23 | |
mordred | devananda: ./modules/openstack_project/files/nodepool/scripts/cache_devstack.py in openstack-infra/config does it | 17:23 |
*** gyee has quit IRC | 17:23 | |
mordred | devananda: since we collect the list of apt packages | 17:24 |
*** cp16net has quit IRC | 17:24 | |
mordred | devananda: I think we may just want to put in something around line 142 | 17:24 |
devananda | off topic, ironic has a patch up to make the ipmi driver fail to load if ipmitool isn't installed, which seems like a sane thing | 17:24 |
*** dprince has joined #openstack-infra | 17:25 | |
devananda | but that made me realize that we're not installing ipmitool in nodepool | 17:25 |
clarkb | sdague: :timestamp=>"2014-06-12T17:22:26.537000+0000", :message=>"Failed to flush outgoing items", :outgoing_count=>512, :exception=>java.lang.OutOfMemoryError: Java heap space, may be our culprit | 17:25 |
mordred | devananda: hrm. I may need to think about a sane way to implement the above stuff | 17:25 |
*** markmcclain has quit IRC | 17:25 | |
devananda | also, we're not doing CI with ipmitool anyway | 17:25 |
mordred | devananda: that would be a devstack thing. you'd need to add ipmitool to an apts file in devstack | 17:25 |
mordred | devananda: and then nodepool will know to pre-cache it | 17:25 |
devananda | so a) we add ipmitool to devstack/xx/apts or b) we dont enable the ipmitool driver in devstack, since it's not used in CI testing | 17:26 |
mordred | devananda: but devstack's ironic config would want to be the one to actually install it | 17:26 |
devananda | mordred: right. but we're not actually going to use it for upstream CI | 17:26 |
mordred | devananda: nod | 17:26 |
mordred | devananda: I could see either thing ... devstack _is_ used for more than the gate | 17:26 |
devananda | which do ya'll prefer? a) saner devstack config for folks testing, but an extra (unused) package | 17:26 |
devananda | right | 17:26 |
mordred | devananda: so if you expect someone using devstack to be able to configure ironic and then have that cloud control things with ipmi ... then I'd go ahead and add it | 17:27 |
mordred | and the fact that we don't use it in the testing is meh | 17:27 |
devananda | ack, will do that then | 17:27 |
devananda | since it is the recommended / default / reference driver that most folks test with | 17:27 |
mordred | ++ | 17:27 |
clarkb | sdague: but this makes me want to use fluentd more | 17:28 |
*** MarkAtwood has joined #openstack-infra | 17:28 | |
clarkb | mordred: ^ | 17:28 |
mordred | devananda: if you can figure out a sensible way to implement the stuff above in cache_devstack.py go for it - if not, poke me when I get back home and I'll figure it out | 17:28 |
mordred | clarkb: I support your choices in this area 100% | 17:28 |
mordred | clarkb: if fluentd would be a better choice, then awesome | 17:28 |
*** maxbit has joined #openstack-infra | 17:29 | |
clarkb | mordred: well it isn't a better choice until we have structured data. but definitely something we can work towards | 17:29 |
*** ihrachyshka has quit IRC | 17:29 | |
*** arnaud__ has joined #openstack-infra | 17:30 | |
mordred | clarkb: it's the same choice essentially while we don't though right? | 17:30 |
mordred | clarkb: so would we be doing a fluentd+elasticsearch cluster instead of a logstash+elasticsearch cluster? | 17:30 |
*** rwsu has quit IRC | 17:30 | |
clarkb | ya | 17:31 |
mordred | cool | 17:31 |
*** praneshp has joined #openstack-infra | 17:31 | |
anteaya | sweston: back, send the key to the infra ml list anytime use both the new and old Full Name for the account | 17:31 |
clarkb | mordred: fluentd doesn't really do parsing of unstructured data | 17:31 |
clarkb | mordred: so it is supposed to be able to do much better throughput | 17:31 |
clarkb | mordred: but you have to start with good data | 17:31 |
anteaya | sweston: I'll let you know when we are ready to change the Full Name of the account, we can address changing the username after that | 17:31 |
mordred | clarkb: so what do we do until we have structured data? | 17:31 |
*** rwsu has joined #openstack-infra | 17:32 | |
clarkb | mordred: limp along on logstash | 17:32 |
mordred | ah - gotcha | 17:32 |
anteaya | sweston: and yes, let me know when you have something new for me to see in the sandbox repo comments | 17:32 |
mordred | so we need to get the ability to have structured data, then spin up fluentd? | 17:32 |
zaro | morning | 17:32 |
mordred | or spin them up side by side? | 17:32 |
*** ihrachyshka has joined #openstack-infra | 17:32 | |
sweston | anteaya: awesome! Thank you so much for your time today. | 17:33 |
*** markmcclain has joined #openstack-infra | 17:33 | |
*** markmcclain has quit IRC | 17:33 | |
anteaya | sweston: np | 17:33 |
anteaya | sweston: thanks for being the Brocade point person, saves me time | 17:33 |
fungi | mordred: devananda: i think the hard part about caching this in nodepool is going to be that the packages ironic's job needs cached are in ubuntu cloud archive, which means we need to enable it, update package lists, then retrieve the package versions it needs into the cache, then disable it, then update the package index again | 17:34 |
sweston | anteaya: you bet. always glad to do what I can to ease the burden for others :-D | 17:34 |
*** marcoemorais1 has joined #openstack-infra | 17:34 | |
mordred | fungi: yes. that is what I wrote above | 17:34 |
fungi | mordred: devananda: and so the ironic job is still going to have to re-enable uca and re-update the package list | 17:34 |
clarkb | mordred: I think we focus on structured data first | 17:34 |
mordred | fungi: although you summarized it very nicely | 17:34 |
*** marcoemorais1 has quit IRC | 17:34 | |
clarkb | mordred: as that is project side and will be potentially problematic | 17:34 |
mordred | clarkb: ++ | 17:34 |
clarkb | mordred: though if we make oslo.config import python json logging that may be all we need | 17:34 |
*** marcoemorais has quit IRC | 17:34 | |
clarkb | mordred: then we can config it to do json | 17:34 |
*** marcoemorais has joined #openstack-infra | 17:35 | |
mordred | fungi: but it should still get us much further in that they would not be pulling packages from the internets | 17:35 |
*** markmcclain has joined #openstack-infra | 17:35 | |
*** ihrachyshka has quit IRC | 17:35 | |
*** SumitNaiksatam has left #openstack-infra | 17:35 | |
fungi | mordred: true. also it *might* be possible (though sorta hacky) to save and pivor between package lists | 17:35 |
fungi | s/pivor/pivot/ | 17:36 |
*** SumitNaiksatam has joined #openstack-infra | 17:36 | |
dhellmann | clarkb: I'm probably missing some context, but there's a json logger in the oslo log code | 17:36 |
clarkb | dhellmann: oh cool | 17:36 |
fungi | rsync /var/cache/apt to a /var/cache/apt.ironic or something and swap back and forth during image creation and within the jobs | 17:36 |
clarkb | dhellmann: I didn't know so apparently we would just have to flip a switch to make that work | 17:37 |
clarkb | dhellmann: we are finding that doing post processing of log data to make it structured is expensive and we shouldn't do it | 17:37 |
clarkb | dhellmann: so starting with json is where we want to go | 17:37 |
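To illustrate what "starting with json" means for an indexer, here is a minimal sketch using only Python's standard `logging` module. It is not the oslo JSON logger dhellmann mentions; `JsonLineFormatter`, `make_json_logger`, and the field names are assumptions made for this example.

```python
import io
import json
import logging


class JsonLineFormatter(logging.Formatter):
    """Render each record as a single JSON object per line."""

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def make_json_logger(name, stream):
    """Build a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the example output on our stream only
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonLineFormatter())
    logger.addHandler(handler)
    return logger


buf = io.StringIO()
log = make_json_logger("demo.service", buf)
log.info("instance %s booted", "abc123")

# The consumer gets fields directly, with no pattern matching on free text.
event = json.loads(buf.getvalue())
```

With records emitted this way, the expensive post-processing step clarkb describes (parsing unstructured lines into fields) is no longer needed on the indexing side.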
dhellmann | clarkb: https://review.openstack.org/#/c/95929/ | 17:37 |
fungi | also possible we could play tricks with apt pinning and just have a very low preference on the uca repos, then ironic specifically requests the version/suite it needs for a given dependency | 17:37 |
dhellmann | clarkb: that makes complete sense, and we may want to make that logger smarter after we move it to oslo.log | 17:38 |
clarkb | dhellmann: awesome that is great news | 17:38 |
mordred | woot! | 17:38 |
clarkb | dhellmann: do you know if json logging is available in say nova today? | 17:38 |
clarkb | dhellmann: or any of the projects? | 17:38 |
clarkb | dhellmann: if not I can work on syncing logging | 17:38 |
dhellmann | clarkb: the class is there, I don't know if anyone uses it | 17:38 |
*** lbragstad has quit IRC | 17:39 | |
clarkb | dhellmann: mordred: ok I will do some digging and see if I can get a d-g run to spit out json logs | 17:39 |
devananda | fungi, mordred: so that discussion has indeed gone past my ability to track it // rapidly implement any of the things you're suggesting :( | 17:39 |
dhellmann | that is, the class is in the incubated version of log.py, but I don't know if nova is up to date and I don't know if any nova users have turned that on | 17:39 |
anteaya | sweston: :D | 17:39 |
mordred | clarkb: it's your next project to get you ATC status in all the projects :) | 17:39 |
clarkb | mordred: :) | 17:39 |
clarkb | dhellmann: ah ok I may end up trying to do syncs if necessary but this is great news thanks | 17:39 |
mordred | dhellmann: we're very excited by this | 17:39 |
clarkb | mordred: fungi: sdague: in the interim we can try going to 8GB perf nodes for logstash workers | 17:39 |
*** mrmartin has joined #openstack-infra | 17:40 | |
clarkb | then double the jvm heap space for logstash | 17:40 |
mordred | clarkb: gross | 17:40 |
dhellmann | clarkb, mordred : I would love to have some feedback about how useful that class actually is and how to make it better | 17:40 |
clarkb | dhellmann: noted. will try to provide it | 17:40 |
dhellmann | sdague: you had something about oslotest and gating you wanted to talk about? | 17:40 |
dhellmann | clarkb: thanks! | 17:40 |
*** chmartinez has joined #openstack-infra | 17:42 | |
chmartinez | hello! Sorry to bother. Could someone tell me what's going on with the gate jobs of this review: https://review.openstack.org/#/c/96582/? | 17:43 |
zaro | clarkb: is there some secret to allow logging in from one hpcloud vm to another? i can't seem to get passwordless ssh connection. | 17:43 |
*** sweston has quit IRC | 17:43 | |
chmartinez | at zuul, the gate is marked with red :| | 17:43 |
zaro | clarkb: actually i can't seem to do any type of connection. | 17:43 |
clarkb | zaro: you have to forward your ssh agent but generally shouldn't | 17:43 |
*** esker has quit IRC | 17:44 | |
*** ramashri has quit IRC | 17:44 | |
*** ArxCruz has quit IRC | 17:44 | |
zaro | clarkb: forget ssh, just simple login from one vm to another. does that work for you? | 17:44 |
fungi | mordred: possibly something to keep on your radar, this is why we're presently not publishing wheels for data-only projects https://bitbucket.org/pypa/wheel/issue/116 i've put up a pull request, but you've probably dug a lot deeper into that code than i have so input would be welcome | 17:45 |
clarkb | zaro: oh you mean any communication? check your security groups | 17:46 |
clarkb | zaro: I ended up going the infra route and opened my security groups wide open | 17:46 |
clarkb | then manage local firewalls | 17:46 |
*** afazekas has quit IRC | 17:47 | |
zaro | clarkb: you mean open all ports? | 17:47 |
*** mbacchi has joined #openstack-infra | 17:47 | |
clarkb | zaro: ya thats what I did | 17:48 |
clarkb | zaro: you really don't need to but I got sick of dealing with it at that level | 17:48 |
clarkb | much easier to modify iptables on a host | 17:48 |
clarkb | mordred: ^ is that maybe feedback we should give to openstack as a whole? | 17:49 |
*** lbragstad has joined #openstack-infra | 17:49 | |
*** ramashri has joined #openstack-infra | 17:51 | |
*** talluri has quit IRC | 17:52 | |
clarkb | Kiall: if you are around did you get sorted on the unbound thing? | 17:56 |
clarkb | I see there is a change | 17:57 |
mrmartin | fungi: hi, if you have some time, may I ask for a review of this patch: https://review.openstack.org/#/c/99481/ it is a larger refactoring of the community portal instance code to provide a better deployment / update path and is required to deploy some features there | 17:57 |
fungi | mrmartin: i probably won't have time this week or next. i'm fairly busy packing and moving | 17:58 |
mrmartin | ok, no prob. | 17:58 |
fungi | mrmartin: but hopefully some of our other reviewers will have a look in the meantime | 17:58 |
clarkb | Kiall: fungi pretty sure that change will break everything | 17:59 |
Kiall | clarkb: Code Review in action :D | 17:59 |
Kiall | why? | 17:59 |
clarkb | it doesn't put unbound on 127.0.2.1 for long lived servers but updates resolv.conf | 17:59 |
zaro | clarkb: opened all ports but still cannot connect. you should be able to ssh connect from VM A to VM B with ubuntu account right? | 17:59 |
clarkb | Kiall: so we will end up in a situation where nothing resolves on review.openstack.org for example | 17:59 |
clarkb | zaro: yes | 18:00 |
*** dangers_away is now known as dangers | 18:00 | |
zaro | argg! | 18:00 |
clarkb | Kiall: fungi: I am much more comfortable with the services being tested being treated special | 18:00 |
clarkb | Kiall: fungi: especially for a service like DNS | 18:00 |
Kiall | Oh - I thought unbound just went on the single use slaves? | 18:00 |
fungi | Kiall: no, it's on all our servers | 18:00 |
*** jerryz has joined #openstack-infra | 18:00 | |
fungi | and yeah, i missed that we didn't factor out the unbound configuration for the nodepool nodes separately from everything else | 18:01 |
*** markmcclain has quit IRC | 18:01 | |
*** markmcclain has joined #openstack-infra | 18:01 | |
chmartinez | hello! Sorry to bother. Could someone tell me what's going on with the gate jobs of this review: https://review.openstack.org/#/c/96582/? | 18:02 |
*** markmcclain has quit IRC | 18:02 | |
zaro | clarkb: do i need to do anything to make that happen? keep getting permission denied (public key). | 18:02 |
Kiall | fungi: Okay, I can rework it to listen only 127.0.0.1:53, rather than *:53, and at least we can work around that easily in devstack | 18:02 |
clarkb | sdague: well I kicked things to deal with the unhappy OOMers and now things appear to be worse | 18:02 |
clarkb | sdague: maybe the OOMing is self regulating :/ | 18:02 |
clarkb | zaro: yes you have to forward your key | 18:03 |
clarkb | zaro: but you shouldn't do that | 18:03 |
*** zul has quit IRC | 18:03 | |
*** pelix has quit IRC | 18:03 | |
fungi | chmartinez: it was approved at 02:17 utc, check tests were rerun on it, then it was enqueued into the gate at 03:55 utc and is waiting for its turn | 18:03 |
*** zul has joined #openstack-infra | 18:03 | |
clarkb | Kiall: does it listen on *:53 by default? | 18:03 |
Kiall | yep | 18:03 |
* clarkb looks | 18:03 | |
fungi | chmartinez: you can find its current status by searching for the change number on http://status.openstack.org/zuul/ | 18:04 |
clarkb | Kiall: netstat says it doesn't | 18:04 |
Kiall | humm - can you paste the output? | 18:04 |
clarkb | Kiall: 127.0.0.1:53 and ::1:53 for tcp and udp | 18:05 |
zaro | clarkb: i should still be able to login without forwarding key right? just type in password? but ssh doesn't even ask me for the password. | 18:05 |
clarkb | tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN | 18:05 |
*** markmcclain has joined #openstack-infra | 18:05 | |
clarkb | udp 0 0 127.0.0.1:53 0.0.0.0:* | 18:05 |
clarkb | zaro: password auth is probably disabled | 18:05 |
chmartinez | fungi: yes, I check that and I'm seeing this: openstack/ceilometer unknown 14 hr 10 min | 18:05 |
clarkb | because you shouldn't password auth either | 18:05 |
Kiall | Humm - The documentation suggested it was 0.0.0.0:53, If that's the case, we should be able to work around it | 18:05 |
*** praneshp_ has joined #openstack-infra | 18:05 | |
sdague | clarkb: bummer | 18:06 |
chmartinez | fungi: it's being enqueued for 14hs.. Is that normal? (sorry to ask, I'm new at this) | 18:06 |
fungi | chmartinez: correct, it's presently taking changes ~24 hours to get to the top of the gate given the current rate of random test failures | 18:06 |
*** praneshp has quit IRC | 18:06 | |
*** praneshp_ is now known as praneshp | 18:06 | |
clarkb | sdague: I mean in theory people have hundreds of nodes in these clusters | 18:06 |
clarkb | sdague: but maybe they have real hardware | 18:07 |
chmartinez | fungi: OK. Good to know :) Thanks!! | 18:07 |
fungi | chmartinez: the known bugs impeding testing are tracked at http://status.openstack.org/elastic-recheck/ if you're interested | 18:08 |
clarkb | sdague: oh! kicking things seems to be writing to slightly older indexes. This may be related | 18:08 |
sdague | clarkb: yeh, I wonder if we could hit up some provider for real hardware, for this one use case | 18:09 |
mtreinish | fungi, clarkb: is there an issue with gerritbot? I just pushed a patch and didn't see an irc msg | 18:10 |
sdague | dhellmann: you still around? | 18:10 |
*** tkelsey has quit IRC | 18:11 | |
fungi | mtreinish: openstackgerrit left earlier today on a netsplit and never rejoined. i'll give it a nudge | 18:11 |
clarkb | sdague: at this point I am curious to see if it reregulates on its own | 18:12 |
*** ildikov has quit IRC | 18:12 | |
clarkb | sdague: because that may be an indication of what is happening | 18:12 |
clarkb | sdague: so I think we have a few issues that we can definitely work on addressing. | 18:12 |
sdague | ok | 18:12 |
clarkb | the OOMing in jvm. the disk space situation | 18:12 |
clarkb | but even when they are happy the whole thing seems to be :? | 18:13 |
clarkb | er :/ | 18:13 |
mtreinish | fungi: ok, np. I just was curious | 18:13 |
*** openstackgerrit has joined #openstack-infra | 18:13 | |
chmartinez | fungi: OK! | 18:15 |
fungi | mtreinish: openstackgerrit is back now, btw | 18:17 |
mtreinish | fungi: cool | 18:17 |
sdague | fungi: next time we get a promote window - https://review.openstack.org/#/c/99412/ | 18:17 |
zaro | clarkb: password auth was disabled. turn it on and i'm finally able to connect. thanks. that's what i'm gonna use unless you can tell me a better way. just needed the connection to test jenkins. | 18:17 |
sdague | I added it to the etherpad | 18:17 |
sdague | I think a chunk of grenade failures are actually that | 18:17 |
clarkb | zaro: use keys | 18:17 |
sdague | but hidden in a buffering issue | 18:17 |
*** jcoufal has quit IRC | 18:17 | |
*** rfolco has quit IRC | 18:17 | |
zaro | clarkb: yeah ok. i'll try that next. | 18:18 |
clarkb | zaro: just don't use your key | 18:18 |
clarkb | create ones specifically for that | 18:18 |
zaro | yeah, i got that at least :) | 18:18 |
lifeless | yolanda: thats not a dib issue | 18:19 |
*** james_li has joined #openstack-infra | 18:19 | |
*** cp16net has joined #openstack-infra | 18:20 | |
lifeless | yolanda: that path is a nova instance path, no? | 18:20 |
lifeless | ttx: let me look | 18:20 |
*** e0ne has joined #openstack-infra | 18:21 | |
lifeless | clarkb: sounds reasonable | 18:21 |
lifeless | yolanda: oh, I think perhaps thats what the nova folk from the cloud you're testing with are reporting? I'd like to know what version of qemu they have | 18:22 |
*** markwash_ has joined #openstack-infra | 18:22 | |
lifeless | yolanda: and what version you're building with - we don't use any exotic options | 18:22 |
lifeless | yolanda: my guess - latest ubuntu (you're running utopic?) has a default that LTS can't handle, or some suc | 18:23 |
*** zns has quit IRC | 18:23 | |
*** ominakov has quit IRC | 18:23 | |
*** sweston has joined #openstack-infra | 18:23 | |
*** timrc-afk is now known as timrc | 18:24 | |
*** cp16net has quit IRC | 18:24 | |
*** markwash has quit IRC | 18:25 | |
*** markwash_ is now known as markwash | 18:25 | |
*** YorikSar has joined #openstack-infra | 18:27 | |
ttx | lifeless: AFAICT bugs are created with bugs.createBug(), releases with milestone.createRelease()... but there is nothing like createSpec() or createBlueprint() | 18:29 |
*** cp16net has joined #openstack-infra | 18:29 | |
*** zns has joined #openstack-infra | 18:30 | |
*** markmcclain has quit IRC | 18:31 | |
*** markmcclain has joined #openstack-infra | 18:31 | |
openstackgerrit | Nikhil Manchanda proposed a change to openstack-infra/config: Added new experimental job for trove functional tests https://review.openstack.org/98517 | 18:31 |
openstackgerrit | Nikhil Manchanda proposed a change to openstack-infra/config: Use job-template for gate-trove-buildimage jobs https://review.openstack.org/99680 | 18:31 |
clarkb | sdague: we are hitting iowait | 18:33 |
clarkb | I think | 18:33 |
sdague | what's the storage backends for these? | 18:33 |
sdague | local ephemeral disks? | 18:33 |
clarkb | sdague: nope local ephemeral isn't big enough | 18:33 |
clarkb | it's cinder volumes or rax equivalent | 18:34 |
sdague | single volumes, or something raided? | 18:34 |
clarkb | fungi: care to look at sar -dp 5 5 on the ES nodes and tell me what you see | 18:34 |
clarkb | sdague: single volumes | 18:34 |
lifeless | ttx: I see yes; I will track down | 18:34 |
sdague | apparently the ec2 trick is to allocate 4 volumes and stripe them | 18:35 |
sdague | I wonder if that would help here | 18:35 |
sdague | or if we're maxed on the network side | 18:35 |
fungi | clarkb: i picked a random es node and it claims sar isn't installed... should it be? | 18:35 |
clarkb | fungi: no you need to install sysstat, I am installing it as I go | 18:36 |
fungi | k | 18:36 |
clarkb | sdague: it looks like only one node may be affected | 18:36 |
ttx | lifeless: I can't say i'm surprised -- blueprints never had a full API, it was all added piecemeal by platform team when we needed to scratch itches. Like I said, I probably authored 25% of it. | 18:36 |
clarkb | sdague: which may be a rax side problem? | 18:36 |
sdague | clarkb: and one bad node hurts the others? | 18:36 |
clarkb | sdague: or that node is being crazy as compared to the others | 18:36 |
sdague | is it the node handling api requests? | 18:37 |
clarkb | sdague: yes because of replicas and searches hitting that disk | 18:37 |
clarkb | sdague: yes it is that node too | 18:37 |
clarkb | sdague: but it shouldn't be spooling any of that to disk | 18:37 |
clarkb | api requests should be memory which is why we went to much bigger nodes for more memory | 18:37 |
fungi | oh yeah, await is spiking up fairly high at points | 18:37 |
sdague | clarkb: but that will have no interaction with the local shard? | 18:38 |
lifeless | ttx: yah, I've pinged cprov and he's looking at how hard it would be to get the collection exposed | 18:38 |
sdague | I'm just wondering if we're in high load otherwise if it's impacting | 18:38 |
lifeless | ttx: if you look at bugs there is /bugs and there is the bugs type, specs only has the type exposed | 18:38 |
clarkb | sdague: it may, we may find turning off e-r makes it stop | 18:38 |
sdague | well, also when things are bad is when people are using logstash a lot to discover things | 18:39 |
*** markmcclain has quit IRC | 18:39 | |
clarkb | fungi: ya and now look at 01 | 18:39 |
fungi | yep | 18:39 |
dhellmann | sdague: I'm back | 18:39 |
mgagne | sdague: regarding the thread about capacity issues in the gate, would throwing more hardware/resources to the problem fix the problem or would it just buy us some time until a deeper unknown problem (to me) is fixed? | 18:39 |
ttx | lifeless: cool, thx. Keep me posted! | 18:39 |
fungi | pew pew lasers on the lvm pv | 18:39 |
clarkb | sdague: the quick and easy thing to try is to stop apache on logstash.o.o | 18:39 |
mestery | fungi: FYI, I just received word from the OpenDaylight folks that their CI is now functioning normally again, if you have time, can you let them vote again? Thanks! | 18:39 |
lifeless | ttx: https://bugs.launchpad.net/launchpad/+bug/1329424 | 18:39 |
clarkb | sdague: we may see everything get happy again | 18:39 |
uvirtbot | Launchpad bug 1329424 in launchpad "cannot create specification via API" [Undecided,New] | 18:39 |
sdague | clarkb: or the cron jobs | 18:40 |
ttx | lifeless: like I said my current script works around it by spawning a browser window, which is kind of unwieldy :) | 18:40 |
clarkb | sdague: they hit apache :) | 18:40 |
sdague | clarkb: oh, right :) | 18:40 |
clarkb | its like a giant valve I can turn off which is nice | 18:40 |
fungi | mestery: done | 18:40 |
sdague | clarkb: sure, want to black it out for 30 minutes | 18:40 |
sdague | and see if it impacts things | 18:40 |
ttx | lifeless: ok, subscribed myself | 18:40 |
clarkb | sdague: ya lets try that | 18:40 |
mestery | fungi: thank you sir! | 18:40 |
clarkb | I am stopping puppet and apache on logstash.o.o now | 18:40 |
sdague | mgagne: more nodes never hurts, but we've got as bad a people scaling problem on tracking down the fails | 18:41 |
fungi | sounds like a good test of the theory at any rate | 18:41 |
sdague | people are less elastic | 18:41 |
sdague | dhellmann: ok, so oslotest has had some interesting implications when it comes to zuul | 18:42 |
dhellmann | sdague: yes? | 18:42 |
sdague | because in its current job matrix it has joined the world into a single gate | 18:42 |
dhellmann | oof | 18:43 |
dhellmann | we'll have a similar issue for oslo.config, oslo.i18n, etc. | 18:43 |
sdague | because it creates a set of transitive dependencies | 18:43 |
sdague | right | 18:43 |
sdague | so I'd like to propose a risk model here | 18:43 |
dhellmann | I imagine we'll replace that matrix when we implement https://etherpad.openstack.org/p/juno-infra-library-testing | 18:44 |
sdague | in that we only run those jobs in check | 18:44 |
dhellmann | that's fair, since they are only unit tests | 18:44 |
sdague | that does mean there is a chance we'll get a wedge across some projects if just the wrong set of things go through the gate | 18:44 |
sdague | however | 18:44 |
sdague | I think that's less pain than the current setup which puts all the world into the same gate queue | 18:45 |
sdague | and made ironic's fail issues back up everything else, for instance | 18:45 |
dhellmann | yeah, let's remove those gate jobs | 18:47 |
lifeless | sdague: btw I have a little confusion about the namespace thing we discussed with pcrews | 18:47 |
lifeless | sdague: the current web ui looks like it has namespaces (all pipelines, gate pipeline, uncategorized) already | 18:48 |
dhellmann | sdague: is the comment on line 52 of https://etherpad.openstack.org/p/juno-infra-library-testing accurate? | 18:48 |
*** markwash_ has joined #openstack-infra | 18:48 | |
sdague | lifeless: we need another dimension | 18:48 |
lifeless | sdague: ok | 18:49 |
lifeless | sdague: thanks | 18:49 |
sdague | dhellmann: yes | 18:49 |
sdague | dhellmann: let me propose this as a config change, then we can discuss | 18:50 |
dhellmann | sdague: sounds good | 18:50 |
*** markwash has quit IRC | 18:50 | |
*** markwash_ is now known as markwash | 18:50 | |
openstackgerrit | A change was merged to openstack-dev/pbr: Register testr as a distutil entry point https://review.openstack.org/99277 | 18:52 |
*** mrmartin has quit IRC | 18:57 | |
*** SumitNaiksatam has quit IRC | 18:58 | |
*** markmcclain has joined #openstack-infra | 18:59 | |
*** SumitNaiksatam has joined #openstack-infra | 18:59 | |
openstackgerrit | Sean Dague proposed a change to openstack-infra/config: do not co-gate oslotest with the projects that include it https://review.openstack.org/99736 | 19:00 |
sdague | dhellmann: ok, I tried to be really verbose with that commit message | 19:00 |
sdague | if you can take a look | 19:00 |
dhellmann | sdague: looking | 19:00 |
sdague | turning off ironic voting in gate seems to have vastly increased velocity | 19:01 |
dhellmann | sdague: +1 | 19:02 |
sdague | fungi: hey, so we just got a gate reset | 19:02 |
sdague | can I get a promote on the ceilo grenade fix? | 19:02 |
*** mancdaz has quit IRC | 19:02 | |
*** johnthetubaguy has quit IRC | 19:02 | |
*** changbl has quit IRC | 19:02 | |
*** phschwartz has quit IRC | 19:02 | |
fungi | sdague: yep, promoting as soon as those two at the top report | 19:02 |
*** phschwartz_ has joined #openstack-infra | 19:03 | |
*** changbl has joined #openstack-infra | 19:03 | |
sdague | fungi: awesome, thank you sir | 19:03 |
*** ramashri has quit IRC | 19:03 | |
*** ramashri has joined #openstack-infra | 19:03 | |
fungi | sdague: is the failure which caused the reset there what 99412 is trying to address? | 19:04 |
sdague | fungi: nope | 19:04 |
sdague | but it's something that's in the grenade uncategorized fail list | 19:05 |
fungi | given it killed a ceilo change i was sort of wondering | 19:05 |
sdague | and I think we've got a buffering problem there which I'm hoping my new output filter solves | 19:05 |
sdague | yeh | 19:05 |
sdague | no, that's another thing | 19:05 |
*** mancdaz has joined #openstack-infra | 19:05 | |
sdague | there are 'so many' new bugs here | 19:05 |
*** radez_g0n3 is now known as radez | 19:05 | |
*** johnthetubaguy has joined #openstack-infra | 19:05 | |
sdague | I've rarely seen my changes get bounced twice for the same fail | 19:05 |
*** annegent_ has joined #openstack-infra | 19:06 | |
openstackgerrit | Michael Krotscheck proposed a change to openstack-infra/storyboard-webclient: Including UX Feedback on menu and nav. https://review.openstack.org/99209 | 19:06 |
sdague | fungi / clarkb / mordred / SergeyLukjanov - https://review.openstack.org/99736 should also relieve some things | 19:06 |
sdague | and dhellmann is onboard | 19:06 |
openstackgerrit | Michael Krotscheck proposed a change to openstack-infra/storyboard-webclient: Error message & notification handling https://review.openstack.org/99515 | 19:07 |
sdague | that's a config change | 19:07 |
sdague | that will split the queues out | 19:07 |
clarkb | looking | 19:07 |
fungi | sdague: ayup, saw the conversation. makes sense | 19:08 |
*** annegent_ has quit IRC | 19:08 | |
sdague | fungi: the swift change at top of gate is failing on grenade | 19:08 |
sdague | I would just promote now | 19:08 |
sdague | and give those a second go | 19:08 |
sdague | because that swift change is failing badly and slowly | 19:08 |
sdague | it's going to take a long time to report | 19:08 |
*** freyes_ has joined #openstack-infra | 19:09 | |
clarkb | sdague: so trying to grok that change, does it do enough to break the transitivity? | 19:10 |
clarkb | sdague: oh wait I think I grok, those tests are run on the other projects hence the transitive inclusion | 19:10 |
sdague | clarkb: right | 19:10 |
clarkb | but removing them from the source side breaks that | 19:10 |
sdague | in the gate queue | 19:11 |
sdague | yep | 19:11 |
*** phschwartz_ is now known as phschwartz | 19:11 | |
clarkb | ok approved | 19:11 |
clarkb | fungi: the other thing that we may want to do in the nearish future is restart zuul to pick up that fix for the swift stuff | 19:11 |
clarkb | fungi: but I am happy to focus on gate fixes | 19:11 |
fungi | sdague: done | 19:12 |
openstackgerrit | Michael Krotscheck proposed a change to openstack-infra/storyboard-webclient: Removed old template https://review.openstack.org/99738 | 19:12 |
sdague | fungi: thanks | 19:12 |
phschwartz | clarkb: So I thought about it more and more. If there are no objections I am going to go down the road of having a new thread for rebuilds and have everything that currently goes to deleted go to rebuild and let the rebuild thread determine if it should be moved to deleted. | 19:12 |
clarkb | phschwartz: sounds good to me | 19:12 |
*** e0ne has quit IRC | 19:13 | |
*** saper has quit IRC | 19:13 | |
clarkb | so we are still doing an order of magnitude more data on es01 | 19:13 |
clarkb | but stopping apache has helped | 19:13 |
*** e0ne has joined #openstack-infra | 19:14 | |
*** chmartinez has left #openstack-infra | 19:14 | |
clarkb | I have no idea what makes es01 special at this point. Maybe it is still trying to deal with requests? | 19:14 |
sdague | hmmm | 19:14 |
fungi | clarkb: is that just elasticsearch not sharding with an eye for access volume? | 19:14 |
fungi | maybe es01 got "lucky" and has the most accessed shards? | 19:14 |
clarkb | fungi: the shard allocation is random in our setup so we should see even writes across all of them | 19:15 |
fungi | huh. okay. scratch that idea | 19:15 |
clarkb | I mean there may be a bug in that | 19:15 |
sdague | clarkb: would it be worth trying to do the striped raid of volumes on that node to increase its io throughput? | 19:15 |
clarkb | sdague: possibly, I think first I need to figure out what makes that node special | 19:15 |
sdague | that's fair | 19:15 |
clarkb | sdague: the other potential thing to try is doing non data nodes | 19:16 |
clarkb | and let them deal with searches | 19:16 |
sdague | sure | 19:16 |
phschwartz | clarkb: oh, and as to your question yesterday about requesting more volume space, it would have to be discussed with Pvo, I know mordred has been in contact with him so you might want to ask him to pose the question. | 19:16 |
clarkb | they essentially act as fat caches to take strain off the indexing nodes | 19:16 |
clarkb | phschwartz: thanks | 19:16 |
*** e0ne has quit IRC | 19:16 | |
sdague | ok | 19:16 |
clarkb | sdague: but if we add a second volume to each node I definitely think we can try doing the raid approach | 19:17 |
sdague | man, our deletes really are backing up as well | 19:17 |
sdague | on overall nodes | 19:17 |
sdague | looks like > 50% are currently deleting | 19:17 |
*** _nadya_ has joined #openstack-infra | 19:17 | |
fungi | looks like most of rax-dfw is deleting | 19:19 |
fungi | and most of rax-iad | 19:19 |
*** ociuhandu has quit IRC | 19:19 | |
fungi | and about half of ord | 19:19 |
phschwartz | Let me look at our cloud monitor | 19:19 |
fungi | many have been attempting to delete for 1-2 hours | 19:21 |
clarkb | es01 is starting to get more inline with the other 5 nodes now | 19:22 |
fungi | actually more than half of the nodes nodepool wants to delete in rax regions have been in that state for more than 4 hours | 19:22 |
clarkb | I want to watch and see if it is consistent that way | 19:22 |
*** ArxCruz has joined #openstack-infra | 19:22 | |
phschwartz | fungi can you get me a list of all of the uuid's of the instances stuck in deleting across the regions. I will get ops involved. | 19:22 |
jesusaurus | did we upgrade the version of puppet? im seeing a lot of new deprecation warnings in the puppet-apply-precise test | 19:23 |
fungi | phschwartz: from what i've seen they're not so much "stuck" as getting ignored and then periodically retried | 19:23 |
fungi | phschwartz: nodepool requests a delete from nova, the client call returns, the node never disappears, nodepool throws it back into the queue and tries again in 10 minutes or so, lather, rinse, repeat | 19:24 |
jesusaurus | clarkb: did you figure out what was mucking up your es node? | 19:24 |
*** ildikov has joined #openstack-infra | 19:24 | |
*** markmcclain has quit IRC | 19:25 | |
fungi | i'm going to do a bulk parallel delete of anything nodepool's been trying to delete for at least an hour | 19:25 |
clarkb | jesusaurus: seems possibly related to hammering one of the nodes with searches | 19:25 |
clarkb | jesusaurus: the disk on that node couldn't keep up | 19:25 |
sdague | fungi: we should just write a tool that opens a ticket automatically when it takes more than an hour to delete a node :) | 19:25 |
clarkb | http://gibrown.wordpress.com/2014/01/09/scaling-elasticsearch-part-1-overview/ is interesting | 19:25 |
openstackgerrit | Adam Gandelman proposed a change to openstack-infra/config: Pre-cache UCA packages during nodepool img build https://review.openstack.org/99740 | 19:25 |
clarkb | a quick glance shows we are trying to do more with less :/ | 19:26 |
phschwartz | fungi: hold off for a couple of min if you can. I want an admin to look if they can | 19:26 |
jesusaurus | clarkb: huh | 19:26 |
clarkb | fungi: sdague: the other thing we can try is ssd volumes | 19:26 |
fungi | phschwartz: okay, will do | 19:26 |
sdague | clarkb: oh... that's an option? | 19:26 |
clarkb | so I think we have a lot of options. but may need to chat with pvo | 19:27 |
clarkb | sdague: possibly. They exist :) | 19:27 |
*** ArxCruz has quit IRC | 19:27 | |
*** denis_makogon has joined #openstack-infra | 19:27 | |
clarkb | mordred: ^ is that something you want to do? | 19:27 |
clarkb | pvo doesn't seem to be resident here anymore | 19:27 |
sdague | right, I guess it's hard to know if the bottleneck is the volume backend, or the path to the volume | 19:27 |
clarkb | right | 19:27 |
clarkb | and before we make big changes would be good to have an understanding from rax | 19:27 |
sdague | can phschwartz help us figure that one out? :) | 19:28 |
clarkb | he mentioned talking to pvo | 19:28 |
phschwartz | I will ping him, but the request coming to him from mordred would probably be better. | 19:28 |
clarkb | phschwartz: ok | 19:29 |
clarkb | thanks for the help | 19:29 |
clarkb | 01 is still higher than 02 though | 19:30 |
openstackgerrit | Antoine Musso proposed a change to openstack-infra/zuul: Make swiftclient an optional dependency https://review.openstack.org/97933 | 19:30 |
mordred_phone | clarkb: I can ping him | 19:31 |
*** thuc has joined #openstack-infra | 19:33 | |
*** smarcet has quit IRC | 19:33 | |
*** hashar has joined #openstack-infra | 19:35 | |
clarkb | the big difference seems to be es01 is much more heavy on reads than es02 | 19:35 |
clarkb | which may still be fallout from being the api endpoint node | 19:35 |
*** mrmartin has joined #openstack-infra | 19:37 | |
jesusaurus | clarkb: oh, you arent load-balancing requests across all the nodes? | 19:37 |
openstackgerrit | Philip Marc Schwartz proposed a change to openstack-infra/config: Creation of vinz project in the openstack-infra scheme. https://review.openstack.org/93953 | 19:37 |
*** thuc has quit IRC | 19:38 | |
phschwartz | mordred: fungi: anteaya: ^ that is the rebase due to the merge failure | 19:38 |
*** ominakov has joined #openstack-infra | 19:39 | |
*** andreykurilin_ has quit IRC | 19:40 | |
*** cp16net has quit IRC | 19:40 | |
anteaya | phschwartz: you are a trooper | 19:41 |
*** ominakov has quit IRC | 19:41 | |
phschwartz | Only thing I hate about long running reviews. The need for multiple rebases. lol | 19:41 |
jogo | why are so many of our cloud resources in deleting state? | 19:43 |
clarkb | jesusaurus: no because its supposed to do that for me | 19:43 |
phschwartz | fungi: before attempting to force the delete, can you open a ticket for the issue. I have pinged some people to look and one of our nova devs is looking at one of the instances now. | 19:43 |
clarkb | but apparently the chosen node is hit harder than I expected | 19:43 |
*** markmcclain has joined #openstack-infra | 19:45 | |
*** james_li has quit IRC | 19:46 | |
mrmartin | hi. what is the proper way to add compass (http://compass-style.org), a css styling tool required to compile sass files to css to the jenkins slave? The problem that the ubuntu 12.04lts package is very old, but we could use a "gem install compass" to deploy the latest stable version. | 19:46 |
*** smarcet has joined #openstack-infra | 19:46 | |
*** nati_ueno has quit IRC | 19:47 | |
openstackgerrit | Alexandre Viau proposed a change to openstack-infra/config: Added the Surveil project to gerritbot and stackforge config https://review.openstack.org/99746 | 19:47 |
mgagne | mrmartin: does your project have a Gemfile or use Bundler already? | 19:47 |
jogo | clarkb: logstash.o.o is giving me errors | 19:47 |
*** e0ne has joined #openstack-infra | 19:48 | |
mrmartin | mgagne: not yet. what I want to achieve, to remove the pre-compiled css files from github repository, and cleanup styling related patches. | 19:48 |
*** _nadya_ has quit IRC | 19:49 | |
clarkb | jogo yup we killed it | 19:49 |
*** ihrachyshka has joined #openstack-infra | 19:49 | |
mgagne | mrmartin: we (puppet modules for openstack) are already relying on Bundler to install specific gem versions of our dependencies. I guess it should be trivial to do the same with yours. | 19:49 |
jogo | clarkb: ahh | 19:49 |
clarkb | jogo es issues seem related to search volume | 19:50 |
jogo | clarkb: I take it, it will be coming back at some point in the not so distant future | 19:50 |
mrmartin | mgagne: could you show me some example in the current infra repo? | 19:50 |
clarkb | we are hitting iowait so bad | 19:50 |
jogo | clarkb: thanks | 19:50 |
jogo | ouch | 19:50 |
mgagne | mrmartin: sure, hold on | 19:50 |
jogo | moar cloud | 19:50 |
*** thuc has joined #openstack-infra | 19:51 | |
*** markmcclain has quit IRC | 19:51 | |
sweston | anteaya: Hello, again. I have just been informed by Shiv that he is using the key attached to Brocade CI. | 19:51 |
mgagne | mrmartin: one of our jobs: https://github.com/openstack-infra/config/blob/master/modules/openstack_project/files/jenkins_job_builder/config/puppet-module-jobs.yaml#L8-L13 | 19:52 |
*** praneshp has quit IRC | 19:52 | |
mrmartin | mgagne: thank you, I'll review this | 19:52 |
mgagne | mrmartin: afaik, with bundler, you are expected to wrap your call with bundle exec to have access to console scripts provided by gems | 19:53 |
sweston | anteaya: I sent a correction to my ml request. I just want to make sure I updated you with the latest information. | 19:53 |
*** cp16net has joined #openstack-infra | 19:53 | |
*** thuc has quit IRC | 19:53 | |
*** thuc has joined #openstack-infra | 19:53 | |
sweston | anteaya: never mind, I see you are already on top of it :-) | 19:54 |
mrmartin | mgagne: I had a talk with this guy today in the same topic, so compass / bundler integration seems to be working here: http://cheppers.com/blog/bundlerize-your-sassy-themes | 19:54 |
mgagne | mrmartin: might not be what you want if console scripts are executed from tox, a Makefile or whatever you use in your project. Console scripts are expected to be found in the system search path. | 19:54 |
anteaya | sweston: k | 19:54 |
*** pvo has joined #openstack-infra | 19:54 | |
anteaya | sweston: please confirm my emailed statement is correct, posting below my response is preferred | 19:55 |
sweston | anteaya: yes, your statement is correct. | 19:55 |
anteaya | sweston: thanks | 19:56 |
mrmartin | mgagne: I want to add here: https://github.com/openstack-infra/config/blob/master/modules/openstack_project/files/jenkins_job_builder/config/groups.yaml so I guess it will work | 19:56 |
*** nati_ueno has joined #openstack-infra | 19:56 | |
sweston | anteaya: apologies, the email just left my inbox ... I will post below in future correspondence with the ml. | 19:56 |
*** james_li has joined #openstack-infra | 19:56 | |
anteaya | sweston: thanks, I appreciate that | 19:56 |
*** ArxCruz has joined #openstack-infra | 19:57 | |
*** markwash has quit IRC | 19:58 | |
*** e0ne has quit IRC | 19:59 | |
*** markwash has joined #openstack-infra | 20:00 | |
*** e0ne has joined #openstack-infra | 20:00 | |
*** e0ne has quit IRC | 20:02 | |
fungi | phschwartz: i could open a ticket but i'm not sure how to characterize it (nor do i have a good description for how to trivially recreate the condition) | 20:04 |
*** ArxCruz has quit IRC | 20:04 | |
openstackgerrit | Joe Gordon proposed a change to openstack-infra/config: Don't run large-ops test on stable branches https://review.openstack.org/99750 | 20:05 |
jogo | sdague: ^ | 20:05 |
*** dprince has quit IRC | 20:05 | |
sdague | jogo: I think we only want to exclude havana | 20:05 |
fungi | phschwartz: i'm mostly afk, but if i had time i would probably dig in the nodepool logs and a thread dump to identify exactly why nodepool is unable to delete from one provider in as timely a manner as another | 20:05 |
sdague | I think we should still run it on icehouse | 20:05 |
jogo | sdague: sure | 20:05 |
* fungi is currently working from the seat of a car in a grocery store parking lot while people are touring his old residence | 20:06 | |
*** zns has quit IRC | 20:07 | |
sdague | fungi: awesome | 20:07 |
jogo | sdague: and it was the wrong job | 20:07 |
*** Alexei_987 has joined #openstack-infra | 20:07 | |
sdague | jogo: ++ | 20:07 |
*** radez is now known as radez_g0n3 | 20:07 | |
*** zns has joined #openstack-infra | 20:07 | |
*** malini1 has quit IRC | 20:08 | |
openstackgerrit | A change was merged to openstack-infra/config: do not co-gate oslotest with the projects that include it https://review.openstack.org/99736 | 20:08 |
clarkb | fungi: you should be in a bar watching world cup | 20:09 |
*** mbacchi has quit IRC | 20:10 | |
*** ramashri_ has joined #openstack-infra | 20:10 | |
openstackgerrit | Joe Gordon proposed a change to openstack-infra/config: Don't run large-ops test on stable/havana branches https://review.openstack.org/99750 | 20:10 |
jogo | sdague: take two ^ | 20:10 |
*** ramashri has quit IRC | 20:11 | |
sdague | jogo: you still have copy/paste errors | 20:11 |
sdague | look at the diff | 20:11 |
jogo | sdague: ahh | 20:12 |
*** rlandy has quit IRC | 20:12 | |
jogo | sdague: take 3 | 20:14 |
openstackgerrit | Joe Gordon proposed a change to openstack-infra/config: Don't run large-ops test on stable/havana branches https://review.openstack.org/99750 | 20:14 |
jogo | fungi: for the delete issue in nodepool, is the issue only on rax or both clouds? | 20:16 |
jogo | fungi: I wonder if we can fix something in nova to help this | 20:16 |
fungi | jogo: all our providers lag some on delete calls, but at the moment most of our rax quota is tied up in instances nodepool wanted deleted hours ago | 20:17 |
jogo | fungi: there is a force delete command, not sure if it's admin only though | 20:17 |
*** boris-42 has quit IRC | 20:18 | |
*** bookwar has quit IRC | 20:18 | |
*** morganfainberg has quit IRC | 20:18 | |
fungi | jogo: that's not so much the issue. retrying the delete has a ~% chance of either working or just being ignored from what we've seen in the past | 20:18 |
*** boris-42 has joined #openstack-infra | 20:18 | |
*** bookwar has joined #openstack-infra | 20:18 | |
*** morganfainberg has joined #openstack-infra | 20:18 | |
jogo | fungi: so it sounds like a nova bug? | 20:19 |
fungi | from any provider (parts of hpcloud 1.0 were pretty bad about that too) | 20:19 |
fungi | jogo: maybe. depends on what sort of modifications they have in place compared to vanilla nova, how it's impacted by load/performance issues/whatever on the underlying infrastructure, et cetera | 20:20 |
*** mbacchi has joined #openstack-infra | 20:20 | |
*** ramashri_ has quit IRC | 20:21 | |
*** a2hill has joined #openstack-infra | 20:21 | |
openstackgerrit | Sean Dague proposed a change to openstack-infra/elastic-recheck: remove old queries https://review.openstack.org/99756 | 20:22 |
fungi | basically nodepool issues a nova delete and then if that returns an ok response periodically looks at the nova list to watch for the instance to finally disappear. sometimes it never does, so nodepool retries that periodically until it finally works | 20:22 |
*** alkari has quit IRC | 20:22 | |
sdague | clarkb: I wonder if our query growth impacts things | 20:22 |
openstackgerrit | A change was merged to openstack-infra/config: Creation of vinz project in the openstack-infra scheme. https://review.openstack.org/93953 | 20:22 |
clarkb | sdague: its possible | 20:22 |
clarkb | sdague: also possible that we have a bunch of terrible queries | 20:22 |
sdague | clarkb: yeh, so are the logs of the batch jobs contained anywhere? | 20:23 |
sdague | I could put timers around the queries so that we can determine which ones seem most expensive | 20:23 |
clarkb | uh probably not. we just run them out of cron right? | 20:23 |
sdague | clarkb: yep | 20:23 |
clarkb | sdague: but we should be able to have them write to /var/log/es-batch-jobs or whatever | 20:24 |
jogo | jaypipes: ^^^ | 20:24 |
jogo | fungi: I wonder if we can somehow reproduce that issue | 20:24 |
fungi | clarkb: sdague: does the e-r web dashboard call to es? is it possible the additional query load coincides with when the change merged to replace the old rechecks page with it? | 20:24 |
sdague | fungi: no, it's batch generated | 20:25 |
clarkb | fungi: no it generates each of those 3 tabs twice an hour | 20:25 |
clarkb | so 6 jobs per hour | 20:25 |
jogo | fungi: the bot goes way more often | 20:25 |
sdague | so at least that's a fixed cost that we control | 20:25 |
*** esker has joined #openstack-infra | 20:25 | |
sdague | jogo: it processes in serial though | 20:25 |
fungi | okay. i just notice about a 10-second lag between when the empty page with theming comes up and when the graphs appear, so didn't know what was being queried | 20:25 |
jogo | sdague: true | 20:25 |
sdague | fungi: yeh, that's the json loading over the network for the graphs | 20:25 |
fungi | k | 20:26 |
sdague | as the graphs are client side | 20:26 |
fungi | unrelated then | 20:26 |
sdague | yep | 20:26 |
jogo | fungi: so when I try deleting a rax instance locally its quick | 20:28 |
fungi | does the query volume per job failure increase linearly with the number of classification queries e-s has at its disposal? | 20:28 |
jogo | fungi: is there anything special on these instances? volumes etc | 20:28 |
*** nati_ueno has quit IRC | 20:28 | |
sdague | fungi: yes | 20:28 |
fungi | jogo: instance booted from snapshot. it also very well may be related to the volume of instance operations we perform in our tenant | 20:29 |
*** esker has quit IRC | 20:29 | |
jogo | jaypipes: ^ | 20:29 |
sdague | jogo: that's 16 queries I think we can drop - https://review.openstack.org/99756 | 20:29 |
jogo | ahhh | 20:29 |
fungi | sdague: do we periodically de-cruft the old classified entries we aren't hitting any longer? | 20:29 |
sdague | fungi: manually | 20:29 |
fungi | okay | 20:29 |
sdague | that's what I was just doing | 20:30 |
fungi | yup, you read my mind, or something ;) | 20:30 |
jogo | sdague: have you updated the related bugs in launchpad? | 20:30 |
*** esker has joined #openstack-infra | 20:30 | |
sdague | jogo: nop | 20:30 |
jogo | sdague: want to do that ? | 20:30 |
jogo | then +W from me | 20:30 |
*** wenlock_ has joined #openstack-infra | 20:32 | |
sdague | jogo: I'm not sure what that is | 20:32 |
*** a2hill has quit IRC | 20:32 | |
*** blogan has joined #openstack-infra | 20:32 | |
sdague | jogo: you should do the thing you want to do in launchpad, I always just delete these things | 20:33 |
clarkb | sdague: I want to see if it will catch up in this state | 20:33 |
clarkb | sdague: then we can try turning apache back on | 20:33 |
sdague | clarkb: yeh, the slope looks good | 20:33 |
*** radez_g0n3 is now known as radez | 20:33 | |
clarkb | sdague: it may be that when e-r doesn't have to ask over and over and over for a change that we get better behavior | 20:33 |
sdague | clarkb: agreed | 20:34 |
phschwartz | fungi: ty, do you want the request for the groups created to be an email or here? | 20:35 |
*** julim has quit IRC | 20:35 | |
*** openstackgerrit has quit IRC | 20:35 | |
*** praneshp has joined #openstack-infra | 20:35 | |
fungi | phschwartz: in here is fine. the groups get automatically created but i have to manually add people to them | 20:36 |
*** marcoemorais has quit IRC | 20:36 | |
*** markmcclain has joined #openstack-infra | 20:36 | |
fungi | but i won't get to it for a bit still | 20:36 |
*** marcoemorais has joined #openstack-infra | 20:36 | |
*** openstackgerrit has joined #openstack-infra | 20:36 | |
*** wenlock_ has quit IRC | 20:36 | |
phschwartz | fungi: They are vinz-core and vinz-ptl | 20:38 |
phschwartz | fungi: not a problem for a wait. | 20:38 |
phschwartz | how often does puppetmaster get updated and propagate the changes anyways? I have never asked. | 20:39 |
jogo | sdague: just mark them as invalid etc | 20:39 |
sdague | jogo: the bug might not be invalid | 20:39 |
sdague | some of these bugs are still out there, we're just not matching them any more | 20:39 |
jogo | so the ones that are | 20:39 |
sdague | I don't know which ones those are | 20:39 |
sdague | the whole point when we age out is just that this query is no longer matching that bug | 20:40 |
sdague | the reasons for that might be that it's fixed, or it moved around | 20:40 |
jogo | sdague: so for example https://bugs.launchpad.net/swift/+bug/1209086 | 20:41 |
*** bknudson has left #openstack-infra | 20:41 | |
uvirtbot | Launchpad bug 1209086 in swift "grenade tests fail with error trying to create container" [Medium,Confirmed] | 20:41 |
jogo | I am just commenting | 20:41 |
*** otherwiseguy has quit IRC | 20:41 | |
jogo | to at least give folks some insight | 20:41 |
sdague | gotcha | 20:41 |
*** blogan has left #openstack-infra | 20:42 | |
sdague | so long term we should probably do a post job that comments on launchpad bugs when we add or remove queries | 20:42 |
lifeless | sdague: heh, I suggested that the other day | 20:42 |
lifeless | sdague: but I suggest you use a bug attachment | 20:42 |
*** gyee has joined #openstack-infra | 20:42 | |
sdague | lifeless: why an attachment? | 20:42 |
lifeless | sdague: then you can have the current query as a yaml file attached to the bug, and if there's no file there is no attachment | 20:42 |
lifeless | sdague: so you don't need to read through N comments to figure it out | 20:43 |
*** __afazekas is now known as afazekas | 20:43 | |
sdague | well given how often lp times out, a comment seems safer, as there won't be 100% consistency | 20:43 |
sdague | also I vaguely know how to do that with lplib :) | 20:43 |
lifeless | sdague: don't see how comments or attachments are safer, same API servers in use, same notification code | 20:44 |
lifeless | also attachments trigger notifications. up to you though | 20:44 |
sdague | yeh, but if we are just telling people something has changed in er, then they come back to er for source of truth | 20:44 |
lifeless | there's example attachment code on the api page- api.l.n/devel/ | 20:44 |
*** nati_ueno has joined #openstack-infra | 20:44 | |
sdague | I worry about pushing out the actual data to lp, because then it feels we need to be more responsible for making sure it's consistent | 20:44 |
openstackgerrit | Alexandre Viau proposed a change to openstack-infra/config: Added the Surveil project to gerritbot and stackforge config https://review.openstack.org/99746 | 20:46 |
lifeless | mordred_phone: can we please release pbr 0.8.3 to get the testr fix out? setup.py test doesn't accept options, so we do need that fix. | 20:47 |
lifeless | mordred_phone: I can tag it if you're ok with a release | 20:47 |
*** dims_ has quit IRC | 20:47 | |
openstackgerrit | lifeless proposed a change to openstack-dev/pbr: Allow examining parsing exceptions. https://review.openstack.org/80856 | 20:47 |
openstackgerrit | lifeless proposed a change to openstack-dev/pbr: Teach pbr VersionInfo about debian versions. https://review.openstack.org/81074 | 20:47 |
openstackgerrit | lifeless proposed a change to openstack-dev/pbr: Teach pbr about post versioned dev versions. https://review.openstack.org/80449 | 20:48 |
openstackgerrit | lifeless proposed a change to openstack-dev/pbr: Use the current pbr for testpackage tests. https://review.openstack.org/94107 | 20:48 |
openstackgerrit | lifeless proposed a change to openstack-dev/pbr: Add a converter to version_tuples. https://review.openstack.org/80457 | 20:48 |
openstackgerrit | lifeless proposed a change to openstack-dev/pbr: Break out a common version object from VersionInfo https://review.openstack.org/94108 | 20:48 |
jogo | sdague: done | 20:51 |
*** ramashri has joined #openstack-infra | 20:51 | |
sdague | jogo: yeh, I was racing with you through that | 20:53 |
*** chianingwang has quit IRC | 20:54 | |
*** mrmartin has quit IRC | 20:56 | |
sdague | clarkb: so it occurs to me that the bot is actually querying across all indexes for the real time queries | 20:56 |
sdague | when it probably only needs the most recent one | 20:57 |
*** e0ne has joined #openstack-infra | 20:57 | |
sdague | I wonder if a time boundary there will help | 20:57 |
clarkb | oh yes | 20:57 |
clarkb | I didn't realize it was hitting all of them | 20:57 |
sdague | well it has no bounds | 20:58 |
clarkb | ah | 20:58 |
clarkb | ya we should change that | 20:58 |
sdague | which I assume means all | 20:58 |
clarkb | yup if you don't bound it is all indexes | 20:58 |
sdague | what's the syntax for "since" | 20:58 |
clarkb | well you query a specific index | 20:58 |
clarkb | eg logstash-todaysdate | 20:58 |
sdague | so we have a race across rotation? | 20:58 |
clarkb | yes there will be | 20:59 |
sdague | I thought there was a time range | 20:59 |
clarkb | sdague: there is too but it will search all indexes for that time range | 20:59 |
sdague | can I get the last 2 indexes? | 20:59 |
clarkb | sdague: yeah you can comma delimit them | 20:59 |
*** thuc_ has joined #openstack-infra | 20:59 | |
clarkb | /index1,index2/query or whatever | 21:00 |
openstackgerrit | A change was merged to openstack/requirements: Bump pep8 from 1.5.6 to 1.5.7 https://review.openstack.org/97944 | 21:00 |
sdague | we need to change the query url? | 21:00 |
clarkb | sdague: yes | 21:00 |
clarkb | sdague: otherwise it searches all indexes | 21:00 |
sdague | so the 15 minute searches via logstash come back pretty fast | 21:00 |
sdague | it's going to be easier in the code to do the date range, I'm wondering if that's going to be good enough | 21:01 |
clarkb | for a timestamp range you do @timestamp:[2014-06-01T12:12:12Z TO 2014-06-02T12:12:12Z] but that searches all indexes for that range | 21:01 |
clarkb | sdague: because indexes aren't necessarily date bound, it's just how logstash does it | 21:01 |
sdague | oh | 21:01 |
clarkb | sdague: so kibana is being smart when you say give me last 15 minutes | 21:01 |
clarkb | I think | 21:02 |
sdague | ah, gotcha | 21:02 |
sdague | we always rotate at UTC 00:00 | 21:02 |
sdague | ? | 21:02 |
clarkb | yes | 21:02 |
*** gokrokve_ has joined #openstack-infra | 21:02 | |
sdague | ok, let me figure out if I can put the same smarts into our side | 21:02 |
clarkb | ok | 21:02 |
clarkb | let me know if you have other questions | 21:02 |
clarkb | I can dig into the kibana source but there is a config option to tell it when rollover happens because it does this magic too | 21:03 |
sdague | will do, but I probably just need to dive on this for a little bit | 21:03 |
*** thuc has quit IRC | 21:03 | |
jogo | fungi: to be clear, what is the workflow for creating and deleting an instance for you guys: 'nova boot'; 'nova image-create' to create a custom image; and boot from that image? | 21:04 |
clarkb | jogo: yes | 21:04 |
clarkb | nodepool boots off of the "base" image provided by our providers | 21:04 |
phschwartz | fungi: It looks like the instances stuck in deleting might be an issue on our side and some of them have been stuck for multiple days. (looks like possibly 5 days or more back) | 21:04 |
clarkb | then it runs scripts on that to create our image, snapshots that and deletes the node the snapshot was taken from | 21:04 |
jogo | thanks, I am trying to locally (with rax) reproduce the slow deletes | 21:04 |
clarkb | jogo: then we boot off of that snapshot for all of the slaves | 21:04 |
*** gokrokve has quit IRC | 21:05 | |
openstackgerrit | Maxime Vidori proposed a change to openstack-infra/storyboard-webclient: Remove boostrap.js https://review.openstack.org/99638 | 21:06 |
openstackgerrit | Maxime Vidori proposed a change to openstack-infra/storyboard-webclient: Removal of jquery https://review.openstack.org/99660 | 21:06 |
*** nati_ueno has quit IRC | 21:07 | |
*** thuc has joined #openstack-infra | 21:07 | |
clarkb | sdague: also the gerrit comments should come with a timestamp | 21:09 |
jogo | phschwartz: oh? | 21:09 |
clarkb | sdague: it is probably relatively straightforward to convert that into a N and N-1 index | 21:09 |
NobodyCam | oh new ironic / DIB queue... | 21:09 |
phschwartz | jogo: yeah, we are still trying to trace the issue., just was letting fungi know where we stand at the moment. | 21:09 |
*** melwitt has joined #openstack-infra | 21:09 | |
*** thuc_ has quit IRC | 21:10 | |
NobodyCam | is that new queue permanent? | 21:10 |
fungi | phschwartz: thanks. added you to those groups just now too | 21:11 |
*** jerryz_ has joined #openstack-infra | 21:11 | |
*** markmcclain has quit IRC | 21:12 | |
*** markmcclain1 has joined #openstack-infra | 21:12 | |
phschwartz | fungi: ty | 21:12 |
fungi | NobodyCam: is anything here ever permanent? it's a result of taking oslo cross-tests off those projects for the gate pipeline, which caused them to no longer have jobs in common with anything in the main integrated gate queue | 21:13 |
NobodyCam | :) ahh ok Ty fungi :) | 21:14 |
fungi | so their job failures no longer impact time to land anything in the larger queue | 21:14 |
NobodyCam | and vice versa | 21:14 |
fungi | yup | 21:14 |
jogo | wow booting from a snapshot is super duper slow | 21:14 |
*** doug-fish has left #openstack-infra | 21:15 | |
clarkb | jogo: yes it's one reason we want to move away from it (but not the most important reason) | 21:15 |
openstackgerrit | A change was merged to openstack-infra/elastic-recheck: remove old queries https://review.openstack.org/99756 | 21:17 |
*** mbacchi has quit IRC | 21:18 | |
*** fifieldt_ has quit IRC | 21:19 | |
JayF | Now up to 5 times I've been bounced from the list :( | 21:21 |
openstackgerrit | Craig Bryant proposed a change to openstack-infra/config: Add the python-monascaclient https://review.openstack.org/99767 | 21:22 |
sdague | clarkb: what's the naming convention of the indexes? | 21:22 |
sdague | also I moved to the living room so that sportsball is on | 21:22 |
clarkb | sdague: yes I have done the same >_> | 21:22 |
clarkb | sdague: one sec I will get it for you | 21:22 |
clarkb | sdague: http://logstash.net/docs/1.4.1/outputs/elasticsearch#index we use the default | 21:23 |
clarkb | so logstash-2014.06.12 for today | 21:23 |
*** e0ne has quit IRC | 21:26 | |
*** e0ne has joined #openstack-infra | 21:26 | |
*** andreykurilin_ has joined #openstack-infra | 21:28 | |
*** mmaglana has joined #openstack-infra | 21:29 | |
phschwartz | sdague: for your devstack-vagrant, what format is it looking for for the password? | 21:29 |
sdague | phschwartz: it's the hashed value in /etc/shadow | 21:30 |
lifeless | sdague: so anyhow - https://review.openstack.org/#/c/92497/ delete the whole start-output ? | 21:30 |
sdague | lifeless: yeh, I think so | 21:30 |
sdague | phschwartz: what you'd pass usermod -p | 21:31 |
zaro | clarkb: you know what's up with hpcloud and sudo? when i use sudo the command takes about 10 times longer to execute | 21:31 |
clarkb | zaro: might be doing a name lookup? | 21:31 |
*** e0ne has quit IRC | 21:31 | |
phschwartz | sdague: makes sense. I just have to remember what hash alg linux uses for that as I am on a mac. lol | 21:32 |
clarkb | zaro: those nodes are really slow too | 21:32 |
sdague | phschwartz: just spin up a linux node set the password, and snag the value | 21:32 |
sdague | that's what I do :) | 21:32 |
zaro | clarkb: yes, it's slow, but crawling when sudo-ing. | 21:33 |
zaro | mordred: ^ do you see same issue? | 21:35 |
*** fifieldt_ has joined #openstack-infra | 21:36 | |
lifeless | zaro: almost certainly hostname | 21:36 |
lifeless | zaro: check that hostname is in /etc/hosts | 21:36 |
fungi | zaro: yeah, gethostbyname() calls. you could strace the sudo process to see where it's hanging if you want to confirm | 21:37 |
*** mfer has quit IRC | 21:39 | |
*** smarcet has quit IRC | 21:40 | |
jogo | fungi: just reproduced the slow delete, very odd | 21:41 |
jogo | did it without a snapshot | 21:42 |
fungi | oh! interesting | 21:42 |
jogo | a second delete helped | 21:42 |
*** esker has quit IRC | 21:44 | |
openstackgerrit | Maxime Vidori proposed a change to openstack-infra/storyboard-webclient: Documentation improvment https://review.openstack.org/99775 | 21:45 |
*** HenryG has quit IRC | 21:45 | |
*** esker has joined #openstack-infra | 21:45 | |
fungi | jogo: yep, same for us. basically nodepool retries them periodically, then sticks them back into its delete queue if they don't disappear within the expected timeframe, and it tries again later | 21:47 |
fungi | and then eventually they're freed up | 21:48 |
*** mriedem has quit IRC | 21:48 | |
*** esker has quit IRC | 21:49 | |
*** lbragstad has quit IRC | 21:50 | |
*** masayuki_ has joined #openstack-infra | 21:50 | |
*** radez is now known as radez_g0n3 | 21:50 | |
*** mrodden has quit IRC | 21:51 | |
jogo | very odd, next step is to reproduce in devstack (doubtful) | 21:51 |
openstackgerrit | Sean Dague proposed a change to openstack-infra/elastic-recheck: have realtime engine only search recent indexes https://review.openstack.org/99776 | 21:52 |
sdague | clarkb ^^^ | 21:52 |
sdague | also jogo, and any other er people | 21:52 |
clarkb | looking | 21:52 |
sdague | gah, I missed a thing | 21:53 |
sdague | one second | 21:53 |
openstackgerrit | Sean Dague proposed a change to openstack-infra/elastic-recheck: have realtime engine only search recent indexes https://review.openstack.org/99776 | 21:53 |
sdague | it helps to actually pass the index param to search | 21:53 |
clarkb | I was just going to ask about that | 21:54 |
clarkb | sdague: is that tested? | 21:54 |
clarkb | sdague: the approach is sound to me | 21:54 |
sdague | it is not, I just finished it | 21:54 |
*** hashar has quit IRC | 21:55 | |
sdague | I did only start 50 minutes ago :) | 21:55 |
mtreinish | sdague: yeah it looks reasonable to me, but we probably should test it :) | 21:55 |
phschwartz | fungi: Looks like nova refuses all requests to delete while instances are in deleting state so all the spam of trying to delete after the first are a waste. (API shouldn't timeout as the error is seen in node-pool though) | 21:55 |
clarkb | sdague: also sportsball | 21:55 |
sdague | clarkb: yeh | 21:55 |
sdague | though I was kind of rooting against br | 21:56 |
*** marcoemorais has quit IRC | 21:56 | |
*** markmcclain1 has quit IRC | 21:56 | |
sdague | because, that would be funny | 21:56 |
jogo | ++ to it sounds very reasonable to me | 21:56 |
*** markmcclain has joined #openstack-infra | 21:56 | |
*** marcoemorais has joined #openstack-infra | 21:56 | |
*** lcostantino has quit IRC | 21:56 | |
*** dims_ has joined #openstack-infra | 21:57 | |
*** praneshp_ has joined #openstack-infra | 21:57 | |
*** andreykurilin_ has quit IRC | 21:57 | |
sdague | clarkb: do we have the api back on yet? | 21:58 |
*** jamielennox is now known as jamielennox|away | 21:58 | |
fungi | phschwartz: i'm not entirely sure that's true. we regularly see deletes requested which basically never get processed, but then clear up immediately on a subsequent delete call | 21:58 |
sdague | or are we waiting for that to fully burn down ? | 21:58 |
clarkb | sdague: not yet | 21:58 |
clarkb | I was waiting for it to burn down but it's picking up again | 21:58 |
clarkb | and load on es01 is climbing | 21:58 |
sdague | ok, any idea what else is going on? | 21:58 |
clarkb | no according to bigdesk there are no searches | 21:59 |
lifeless | huh | 21:59 |
*** praneshp has quit IRC | 21:59 | |
*** praneshp_ is now known as praneshp | 21:59 | |
lifeless | why is hudson-openstack closing bugs on merge, rather than fix-committing them ? | 21:59 |
*** thuc has quit IRC | 21:59 | |
lifeless | see https://bugs.launchpad.net/tripleo/+bug/1327090 | 21:59 |
uvirtbot | Launchpad bug 1327090 in tripleo "can't deploy ci-overclouds on Ubuntu - ensure-bridge wipes out /e/n/i" [High,Fix committed] | 21:59 |
lifeless | (I just put it to fix committed, which is the state it should have) | 22:00 |
clarkb | lifeless: it is a toggleable option | 22:00 |
fungi | lifeless: it depends on how your project is configured in review.projects.yaml | 22:00 |
*** thuc has joined #openstack-infra | 22:00 | |
lifeless | hmm | 22:00 |
clarkb | sdague: iotop shows es doing a lot of reads and sar corroborates that as the slowness | 22:00 |
fungi | lifeless: it sounds like that project is set for direct release, implying it's one which doesn't do real releases and is just used from trunk | 22:00 |
*** MarkAtwood has quit IRC | 22:00 | |
clarkb | sdague: so I don't think this is purely related to queries | 22:00 |
clarkb | I am half tempted to restart es on that node | 22:01 |
lifeless | is this new? It's wrong. | 22:01 |
clarkb | I guess I can strace | 22:01 |
fungi | lifeless: no idea. looking to see now which one it is and git-blaming the file for you | 22:01 |
*** mrda-away is now known as mrda | 22:01 | |
sdague | clarkb: yeh, I would say we should take the opportunity to try to diagnose while we've got the query side off | 22:02 |
lifeless | fungi: 0a9d800b modules/openstack_project/files/review.projects.yaml (Monty Taylor 2013-12-13 12:12:54 -0500 441) - direct-release | 22:02 |
*** nati_ueno has joined #openstack-infra | 22:03 | |
fungi | lifeless: yup | 22:03 |
fungi | lifeless: nix that one line and it will do the default thing which is to set to fix committed on merge | 22:03 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Unbreak tripleo projects https://review.openstack.org/99778 | 22:04 |
fungi | heh | 22:04 |
lifeless | fungi: obviously we're going to want that in quickly :) | 22:04 |
phschwartz | fungi: all of your current instances in deleting state have tons of requests going through our api's but when they get to the api at the cell level are kicked with this error. 2014-06-10 11:25:34.426 25720 INFO nova.compute.api [req-c314c89d-58ac-46ae-8a60-57f0edbfac54 156185 637776] [instance: 1bc3554b-1f2a-44c7-b9a7-3d8bc956f7cd] Instance is already in deleting state, ignoring this request | 22:04 |
*** thuc has quit IRC | 22:04 | |
clarkb | oh cells | 22:04 |
phschwartz | fungi: so if it makes it to deleting state from the nova-api they are ignored after that | 22:04 |
fungi | phschwartz: sounds like a different class of problem than we're used to seeing in that case | 22:05 |
sdague | yeh, probably something new. | 22:05 |
fungi | phschwartz: just to be clear, are these the handful which are undeletable or the hundreds which seem to delete if i call nova delete again on them | 22:05 |
phschwartz | fungi: I think it is and I am thinking on how it should be handled from node-pool so that it doesn't keep retrying if they are in a deleting state | 22:05 |
sdague | jaypipes did point out a cells delete fix in review, but the fact that cells is basically untested upstream I'm sure doesn't help | 22:05 |
sdague | https://review.openstack.org/#/c/93860/ | 22:06 |
fungi | phschwartz: oh, yeah if nova shows the state is deleting that's a different class of problem than the one i was thinking of | 22:06 |
phschwartz | fungi: All the ones that are undeletable because they are stuck in the deleting state, as the delete errored and it looks like nova never reset the vm_state to error. | 22:06 |
fungi | we also regularly see many which remain in active state according to nova after a delete is requested | 22:06 |
*** amcrn has quit IRC | 22:06 | |
*** esker has joined #openstack-infra | 22:06 | |
phschwartz | fungi: I have seen that before and if a retry is done it will delete. | 22:06 |
fungi | yep | 22:06 |
phschwartz | fungi: this issue is if the delete actually tried to happen | 22:07 |
fungi | that's what i was thinking we were probably hitting, but yes this sounds different and solvable (to the degree to which we have any real control over it) in nodepool | 22:07 |
sdague | phschwartz: does the linked review above look relevant here? | 22:07 |
phschwartz | sdague: no, but this one does. https://review.openstack.org/#/c/58829/ | 22:08 |
phschwartz | That removed the wrapper for reverting state on error of delete. | 22:08 |
* fungi graciously bows out of the discussion to get back to moving prep | 22:08 | |
phschwartz | sdague: basically that one leaves an instance so it can't be deleted from the api once a delete error happens. | 22:09 |
*** resker has joined #openstack-infra | 22:09 | |
sdague | phschwartz: gotcha | 22:09 |
*** pvo has quit IRC | 22:09 | |
*** resker has quit IRC | 22:10 | |
*** mrodden has joined #openstack-infra | 22:10 | |
*** resker has joined #openstack-infra | 22:10 | |
*** esker has quit IRC | 22:11 | |
*** mrodden1 has joined #openstack-infra | 22:11 | |
*** cp16net has quit IRC | 22:11 | |
*** jergerber has quit IRC | 22:12 | |
phschwartz | I understand why the change was made: reverting the task state put the instance into a bad state, as you can't revert a delete. But it ended up putting it into a worse one, as there is no way without making a db change to fix the state so you can forcefully delete it | 22:12 |
clarkb | so strace does show lots of reads; mapping file descriptors to actual throughput is a bit hard | 22:12 |
jogo | phschwartz: AFAIK this has been an issue way before that patch | 22:13 |
jogo | phschwartz: plus I can call nova delete on the same instance multiple times | 22:14 |
phschwartz | jogo: no, it wouldn't have been as the revert_task would have put it in a pure error instead of leaving it in a vm_state of deleting. | 22:14 |
*** dkliban is now known as dkliban_afk | 22:14 | |
phschwartz | Nova doesn't allow you to delete a deleting instance | 22:14 |
phschwartz | so stuck in that state a reissue of the delete does nothing | 22:15 |
*** resker has quit IRC | 22:15 | |
*** mrodden has quit IRC | 22:15 | |
*** mfer has joined #openstack-infra | 22:15 | |
jogo | phschwartz: just to be clear you are saying that patch isn't it? | 22:16 |
jogo | or it could be | 22:16 |
phschwartz | No, that patch is the issue. It removed the reverts_task_state wrapper from the delete action | 22:16 |
*** denis_makogon has quit IRC | 22:16 | |
phschwartz | So the delete that fails causes the instance to stay in deleting. | 22:17 |
*** gondoi is now known as zz_gondoi | 22:17 | |
phschwartz | subsequent delete requests are ignored by nova because of the deleting state basically locking the instance to where you need to force modify the state in the db to delete it by the api, or you have to delete it at the compute level and then hand change the table to deleted. | 22:18 |
fungi | JayF: we should revisit http://wiki.list.org/pages/viewpage.action?pageId=17891458 based on your reminder. would you care to add it to the agenda for tuesday's infra team meeting? i won't probably be around to discuss it but others on the team also have a firm grasp of the issue i think | 22:19 |
*** UtahDave has quit IRC | 22:19 | |
*** mfer has quit IRC | 22:19 | |
*** andreaf has joined #openstack-infra | 22:20 | |
clarkb | ooh there is a slow log and it looks like I can enable it on the fly | 22:20 |
*** mmaglana has quit IRC | 22:21 | |
jogo | phschwartz: so the nodepool instances that are stuck are in error state? | 22:21 |
jogo | and are failing a delete? | 22:21 |
jogo | phschwartz: "Failure to power off a VM during delete leads to it going back to Active(None)" | 22:21 |
jogo | doesn't sound like this issue | 22:22 |
jogo | directly at least | 22:22 |
phschwartz | jogo: The task_state is deleting on them so all further delete requests are ignored. | 22:22 |
jogo | phschwartz: that is the state the stuck instances are in? | 22:22 |
phschwartz | basically with that patch I linked it makes them vm_state=Error, task_state=deleting and | 22:22 |
sdague | realistically we actually hit this issue in the gate as well | 22:22 |
jogo | I found a different state | 22:22 |
phschwartz | jogo: yes | 22:22 |
sdague | clarkb: oh, coolness | 22:22 |
jogo | phschwartz: do you have access to nodepools 'nova list' | 22:23 |
phschwartz | jogo: I have better than that. I have all our logging and the nova db | 22:23 |
jogo | phschwartz: who is 'our'? RAX? | 22:23 |
*** SumitNaiksatam has quit IRC | 22:24 | |
*** dkranz has quit IRC | 22:24 | |
phschwartz | jogo: correct, I am a developer from Rax that has become a helpful resource to infra ;) | 22:24 |
fungi | phschwartz: a VERY helpful resource! you're our new pvo ;) | 22:24 |
jogo | phschwartz: wanna share the relevant logs? | 22:24 |
*** dkranz has joined #openstack-infra | 22:25 | |
jogo | fungi: do you have a nova list from nodepool? | 22:25 |
fungi | jogo: i can get you one, though it changes by the minute | 22:25 |
*** dizquierdo has joined #openstack-infra | 22:25 | |
fungi | just a sec | 22:25 |
jogo | fungi: thanks | 22:25 |
phschwartz | jogo: Unfortunately that is something I can't do at the moment as they have info for non-infra things in them and would be too hard to scrub. | 22:25 |
fungi | we can at least speak in terms of specific uuids though | 22:26 |
jogo | phschwartz: that's what I assumed but figured I would ask anyway | 22:26 |
sdague | phschwartz ftw! | 22:26 |
*** freyes_ has quit IRC | 22:26 | |
jogo | phschwartz: yeah thanks for helping out on this | 22:26 |
*** markmcclain has quit IRC | 22:27 | |
phschwartz | That is what I am here for. I am thinking of a quick fix for node-pool to not hammer our api if it gets stuck in deleting for the edge case. I am leaving now to go to a concert and will have to send an email to infra when I get back if I have come up with something good. | 22:27 |
jogo | phschwartz: why would that help? | 22:28 |
jogo | less load? | 22:28 |
*** weshay has quit IRC | 22:28 | |
phschwartz | In about 10 days there have been over 55k requests to delete the same instances that are stuck in deleting | 22:29 |
phschwartz | That is traffic that is a waste :) | 22:29 |
jogo | phschwartz: waste yes, but not sure how that would help infra | 22:29 |
*** dizquierdo has quit IRC | 22:29 | |
jogo | just not yet convinced that would make things better per se. | 22:30 |
phschwartz | That is why I have to think on the best way to do it because if something is stuck in deleting for a long time we need it to report, not just try to delete it over and over again | 22:30 |
phschwartz | well I am off, I will be back in a bit. | 22:30 |
fungi | jogo: ... | 22:30 |
sdague | phschwartz: well that was my tongue in cheek idea of having nodepool open a ticket | 22:30 |
fungi | dfw: http://paste.openstack.org/show/83880 | 22:30 |
fungi | iad: http://paste.openstack.org/show/83881 | 22:30 |
fungi | ord: http://paste.openstack.org/show/83882 | 22:30 |
sdague | but then you'd get a lot of tickets :) | 22:30 |
fungi | thanks again for the help phschwartz! enjoy the concert | 22:31 |
*** gokrokve_ has quit IRC | 22:32 | |
jogo | phschwartz: o/ | 22:32 |
clarkb | ok I have slowlog set up for both index and search | 22:32 |
clarkb | the search slowlog is empty even with trace at 200ms | 22:32 |
jogo | fungi: just as phschwartz said | 22:32 |
clarkb | so I think we can reasonably confirm that there is no searching | 22:32 |
sdague | another gate reset coming, ceilometer unit tests | 22:33 |
jogo | fungi: what does 'nova show $instance in error,deleting' say | 22:33 |
*** esker has joined #openstack-infra | 22:33 | |
jogo | fungi: hmm I think we have a reset state button somewhere in nova | 22:33 |
clarkb | this appears to be an indexing problem too | 22:33 |
clarkb | you know I wonder if we just have a cranky volume | 22:34 |
clarkb | and it acts up in intervals | 22:34 |
sdague | https://jenkins04.openstack.org/job/gate-ceilometer-python26/666/console actually kind of weird | 22:34 |
clarkb | which is why we see it as a cyclic problem | 22:34 |
clarkb | fungi: ^ | 22:34 |
sdague | clarkb: like we're getting hit by a qos issue? | 22:34 |
clarkb | sdague: maybe | 22:34 |
clarkb | we hit max number of iops for that period then get throttled | 22:34 |
sdague | are the ssd volumes in rax? | 22:35 |
clarkb | we are not using ssd volumes iirc but they have them | 22:35 |
sdague | so I'd expect they'd be configured for higher iops, no? | 22:35 |
clarkb | I would expect so | 22:35 |
jogo | fungi: reset-state isn't in rax :/ | 22:35 |
clarkb | I think we really need to get in touch with pvo | 22:35 |
clarkb | mordred_phone: any luck? | 22:35 |
clarkb | the slow log isn't growing at an insane rate so I will leave it in place for a bit and maybe we will see it drop off on the next downward portion of the cycle | 22:36 |
*** andreaf has quit IRC | 22:36 | |
jogo | fungi: so it looks like we are tickling a bug in RAX that is causing deletes to error out. | 22:37 |
jogo | perhaps when phschwartz gets back he can help tell us what that is | 22:37 |
*** dangers is now known as dangers_away | 22:37 | |
anteaya | so we have ERROR deleting and ACTIVE deleting, so ERROR deleting are the stuck nodes? | 22:38 |
jogo | anteaya: that is my understanding | 22:39 |
sdague | clarkb: is the slow log going to give us slow queries? | 22:39 |
*** thedodd has quit IRC | 22:39 | |
anteaya | and hopefully ACTIVE deleting turns into a DELETED status | 22:39 |
fungi | jogo: is 'nova show $instance in error,deleting' an actual syntax? i get help/usage output from that | 22:39 |
*** crc32 has quit IRC | 22:40 | |
jogo | fungi: nova show f160258b-46d6-4122-a217-4294cf6a9bb5 | 22:40 |
*** mrmartin has joined #openstack-infra | 22:40 | |
jogo | for rax-dfw | 22:40 |
fungi | jogo: oh, you mean show output for each of the nodes in one of those two states. i can construct a shell one-liner to get that into a paste. just a sec | 22:41 |
jogo | fungi: I just need one example | 22:41 |
jogo | not all | 22:41 |
jogo | fungi: all will take too long | 22:41 |
jogo | fungi: as a user I don't think there will be any really useful info but figured it is worth a shot | 22:42 |
mattoliverau | Morning | 22:42 |
*** HenryG has joined #openstack-infra | 22:42 | |
anteaya | morning mattoliverau | 22:43 |
jogo | lifeless: https://review.openstack.org/#/c/99743/ | 22:44 |
fungi | jogo: well, too late... http://paste.openstack.org/show/83885 | 22:45 |
lifeless | jogo: yes :) | 22:45 |
fungi | 'ConnectionFailed' object has no attribute 'status_code' | 22:46 |
*** zehicle_defcore has quit IRC | 22:46 | |
*** gokrokve has joined #openstack-infra | 22:46 | |
jogo | lifeless: I want to get rid of that rule as well | 22:46 |
clarkb | sdague: yes it will give us slow queries too | 22:46 |
jogo | fungi: ohh this was fun | 22:46 |
jogo | | fault | {u'message': u'Connection to neutron failed: Maximum attempts reached', u'code': 500, u'created': u'2014-06-05T05:45:02Z'} | | 22:46 |
clarkb | sdague: we will need to fine tune the values as right now they are not conservative at all and we will probably fill the logs up real quick | 22:47 |
*** jhesketh has quit IRC | 22:47 | |
jogo | it's neutron hahahaha | 22:47 |
jogo | sdague: you would like this http://paste.openstack.org/show/83885/ | 22:47 |
anteaya | is connection reset by peer neutron as well? | 22:47 |
jogo | anteaya: I *think* so but don't quote me on that | 22:47 |
jogo | phschwartz: ^ | 22:48 |
anteaya | and no attribute 'status_code'? | 22:48 |
fungi | jogo: heh | 22:48 |
sdague | heh | 22:48 |
jogo | anteaya: you ask too many good questions | 22:48 |
jogo | not sure | 22:48 |
anteaya | sorry to spoil the tarring and feathering party | 22:49 |
anteaya | do carry on | 22:49 |
jogo | mestery: ^ | 22:49 |
sdague | jogo: I wonder if that's a failed conversion from objects in cells | 22:50 |
jogo | sdague: ohh possibly. | 22:50 |
sdague | I know it's lagging there | 22:51 |
sdague | you should poke alaski, I think he was chasing some of that | 22:51 |
jogo | of 16 nodes in deleting | 22:51 |
clarkb | sdague: I have a few entries in the indexing slow log on the other nodes but nothing like on 01 | 22:51 |
jogo | 7 have neutron in the error message | 22:51 |
jogo | alaski: ^ poke | 22:51 |
* jogo wanders off to catch BART | 22:51 | |
clarkb | and 01 has the same number of master and replica shards as three other machines | 22:51 |
*** sarob has quit IRC | 22:52 | |
clarkb | I am becoming more suspicious of that volume (but that may be the lazy in me) | 22:52 |
sdague | clarkb: I support the lazy in you | 22:53 |
*** sarob has joined #openstack-infra | 22:53 | |
clarkb | indexing is properly load balanced. the logstash processes join the cluster and talk directly to the node that needs the data | 22:53 |
sdague | I do like the idea of trying to get higher performance volumes | 22:53 |
clarkb | I have mostly confirmed at this point that queries do not cause the issue though they may contribute | 22:53 |
clarkb | it is present in indexing but only on one node with similar use as compared to other nodes in the cluster | 22:53 |
sdague | can you do some straight out IO testing? | 22:54 |
sdague | like shut down ES and just beat on the volumes? | 22:54 |
clarkb | sdague: numbers probably won't mean anything without shutting down the cluster | 22:54 |
clarkb | yeah we could probably do something like that | 22:54 |
sdague | right | 22:54 |
fungi | bonnie++ or whatever the kids are using these days | 22:54 |
sdague | yeh | 22:54 |
sdague | that should at least give a definitive answer of volume vs. configuration | 22:54 |
sdague | also, some interesting data on the volume throughput | 22:55 |
clarkb | do we want to shut everything down for that though? | 22:55 |
fungi | clarkb: it is a cloud. entirely possible one volume is sharing a device crushed under the weight of i/o from another neighboring tenant | 22:55 |
sdague | clarkb: well, I don't know | 22:55 |
sdague | on the up side, it would give us some answers, maybe | 22:56 |
openstackgerrit | Adam Gandelman proposed a change to openstack-infra/config: Pre-cache UCA packages during nodepool img build https://review.openstack.org/99740 | 22:56 |
*** james_li has quit IRC | 22:56 | |
sdague | and we've not drained the inbound queue in days | 22:56 |
sdague | on the down side, our blindspot would get much bigger | 22:56 |
fungi | or sharing an i/o channel choked by a neighbor on the same nova compute node | 22:56 |
clarkb | fungi: right | 22:57 |
*** james_li has joined #openstack-infra | 22:57 | |
sdague | clarkb: so what if you killed es01 brought up a new compute, and allocated a fresh volume for it? | 22:57 |
clarkb | sdague: we could do something like that too | 22:57 |
clarkb | sdague: my only concern there is we are so close to the edge on available disk that the cluster may die | 22:57 |
sdague | well, actually, what about this | 22:57 |
fungi | though the shard rebuilding would be similarly traumatic to cluster performance | 22:57 |
clarkb | sdague: we don't currently have enough extra disk on nodes to lose one :/ | 22:58 |
sdague | clarkb: even after killing the indexes? | 22:58 |
clarkb | sdague: yeah | 22:58 |
bodepd | any chance I can get one more +2 on the puppet repos I am waiting for? | 22:58 |
sdague | how deep are our indexes now? | 22:58 |
*** eharney has quit IRC | 22:58 | |
sdague | can we trim down? | 22:58 |
sdague | like trim to 7 days | 22:58 |
bodepd | I'm also curious about the best way to proceed about the module decoupling from config | 22:59 |
bodepd | I assume it's blocked b/c you guys would rather switch to r10k? | 22:59 |
sdague | or can we do add then remove? | 22:59 |
sdague | so bring in the new node first | 22:59 |
sdague | reshard, and drop the old one | 22:59 |
sdague | I would say keep es01 around after just to benchmark that volume | 23:00 |
sdague | to know if it seems bad | 23:00 |
clarkb | sdague: http://paste.openstack.org/show/83890/ | 23:00 |
sdague | ok, I'm not sure what I'm looking at :) | 23:01 |
clarkb | sdague: it's the size as reported by _status for each index | 23:01 |
clarkb | sdague: that includes the replica | 23:01 |
clarkb | so divide by two for size without replica | 23:01 |
*** zzelle has quit IRC | 23:01 | |
*** dims_ has quit IRC | 23:01 | |
sdague | ok and what does it need to fit into? | 23:01 |
*** james_li has quit IRC | 23:01 | |
clarkb | 6TB with enough room that ext4 doesn't hate us | 23:01 |
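The sizing arithmetic discussed above (sizes reported by _status include one replica, the total has to fit in 6TB with some room to spare, and trimming to 7 days is one option) can be sketched roughly in Python. This is only an illustration: the index names, the byte counts, and the 10% headroom figure are assumptions made up for the example, not values taken from the paste.

```python
TB = 1024 ** 4
GB = 1024 ** 3


def primary_size(reported_bytes, replicas=1):
    """Size of a single copy, given a total that includes `replicas` replica copies."""
    return reported_bytes / (replicas + 1)


def keep_newest(sizes, days):
    """Keep only the newest `days` indexes; assumes names sort chronologically."""
    if days <= 0:
        return {}
    newest = sorted(sizes)[-days:]
    return {name: sizes[name] for name in newest}


def fits(sizes, capacity_bytes, headroom=0.10):
    """True if the total on-disk size (replicas included) leaves `headroom` free."""
    return sum(sizes.values()) <= capacity_bytes * (1 - headroom)


# Hypothetical per-index sizes as a _status-style report might list them.
reported = {
    "logstash-2014.06.10": 600 * GB,
    "logstash-2014.06.11": 620 * GB,
    "logstash-2014.06.12": 640 * GB,
}

if __name__ == "__main__":
    trimmed = keep_newest(reported, 2)
    print(sorted(trimmed), fits(trimmed, 6 * TB))
```

Halving applies only because there is exactly one replica per primary; with a different replica count the divisor changes accordingly.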
fungi | ext4 will always hate you, it just does a better job of hiding it than ext3 did | 23:02 |
clarkb | fungi: what was your process for spinning up nodes and adding the volumes | 23:03 |
clarkb | fungi: such that ES doesn't come up without a volume and immediately get cranky | 23:03 |
clarkb | fungi: were you just relying on firewall rules to prevent it from joining the cluster? | 23:03 |
*** adalbas has quit IRC | 23:04 | |
clarkb | also I really wish nova + cinder had a better first boot story | 23:04 |
clarkb | mikal: jgriffith: it would be amazing if I could spin up a node with a block device preattached and formatted | 23:05 |
*** jamielennox|away is now known as jamielennox | 23:05 | |
fungi | clarkb: hummm... i think i manually added them to the cluster one at a time as they were ready, though now i can't remember | 23:06 |
clarkb | fungi: yeah I think that will work | 23:06 |
clarkb | fungi: basically first boot and puppet will use normal disk then stop ES, attach volume, format, mount, start ES | 23:07 |
jgriffith | clarkb: hmmm | 23:07 |
clarkb | and sometime after ES is stopped update firewall rules | 23:07 |
jgriffith | clarkb: I could probably make that happen | 23:07 |
clarkb | jgriffith: the use case being on first boot you tend to do all sorts of config and stuff | 23:07 |
jgriffith | clarkb: I could add an option to "cinder create" that lets you do partitioning and formatting | 23:08 |
clarkb | jgriffith: and you can either induce a failure the first time if the fs isn't there and deal with it later | 23:08 |
jgriffith | clarkb: yeah, I get that for sure | 23:08 |
clarkb | jgriffith: or you run into masking of stuff | 23:08 |
clarkb | neither of which is great and has different trade offs | 23:08 |
jgriffith | clarkb: that's why I use BFV for everything :) | 23:08 |
jgriffith | clarkb: I'll write a spec | 23:08 |
clarkb | fungi: what do you think? | 23:09 |
*** bookwar has quit IRC | 23:09 | |
clarkb | fungi: is it worth going through the rebalance terror to see if we get lucky? | 23:09 |
clarkb | fungi: or should we maybe poke rax harder? | 23:09 |
*** dkranz is now known as dkranz_afk | 23:09 | |
*** sarob has quit IRC | 23:09 | |
clarkb | fungi: sdague: I think the "luck" portion of this is what bothers me most | 23:09 |
clarkb | jgriffith: so you would create block device first and format it, then nova boot with it attached? | 23:10 |
clarkb | I think that would work | 23:10 |
jgriffith | clarkb: yeah | 23:10 |
jgriffith | clarkb: so cinder could have options at create to do all that | 23:10 |
*** sarob has joined #openstack-infra | 23:10 | |
jgriffith | clarkb: trick is getting nova to mount it | 23:10 |
clarkb | ++ though I have no idea what that means for you guys | 23:11 |
*** thuc has joined #openstack-infra | 23:11 | |
fungi | clarkb: thinking back, i *believe* what i did was puppet them without any elasticsearch in the global site manifest, add the volumes, then add es in the manifest one patch at a time | 23:11 |
clarkb | fungi: so you used a dev env with custom site.pp? | 23:11 |
jgriffith | clarkb: not sure how to make that work in our current nova without just relying on cloud init or something | 23:11 |
fungi | clarkb: i think i approved patches for them individually | 23:11 |
clarkb | fungi: oh right | 23:11 |
fungi | to add and remove cluster members | 23:12 |
*** esker has quit IRC | 23:12 | |
fungi | it went on over the course of days because we didn't want to strain the existing overloaded cluster by doing too many nodes at once | 23:12 |
clarkb | yeah | 23:13 |
*** jhesketh has joined #openstack-infra | 23:13 | |
clarkb | now we have node /^elasticsearch\d+\.openstack\.org$/ so it will puppet as es server | 23:13 |
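To illustrate why that node regex matters here: any freshly booted host whose name matches it is puppeted as an ES server straight away, before a volume is attached. A minimal Python sketch of the difference between the broad pattern quoted above and a narrower one follows. The narrower pattern and the helper name are hypothetical examples, not the content of any actual change.

```python
import re

# The broad pattern quoted in the log: any numbered elasticsearch host matches.
BROAD = re.compile(r"^elasticsearch\d+\.openstack\.org$")

# Hypothetical narrower pattern that only names a fixed set of existing nodes.
SPECIFIC = re.compile(r"^elasticsearch0[1-6]\.openstack\.org$")


def puppetable(hostname, pattern):
    """Return True if a host with this name would be configured as an ES server."""
    return pattern.match(hostname) is not None


if __name__ == "__main__":
    for host in ("elasticsearch02.openstack.org", "elasticsearch07.openstack.org"):
        print(host, puppetable(host, BROAD), puppetable(host, SPECIFIC))
```

With the narrower form, a new node can be booted and prepared first, then opted in to the ES configuration by a later change.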
jhesketh | Morning | 23:14 |
jhesketh | phschwartz: pong | 23:14 |
*** thuc_ has joined #openstack-infra | 23:14 | |
clarkb | jhesketh: o/ | 23:14 |
anteaya | morning jhesketh | 23:14 |
anteaya | phschwartz is at a concert | 23:14 |
anteaya | he will be back later | 23:14 |
clarkb | fungi: I am going to modify that regex to give us the option of doing this the other way | 23:15 |
*** thuc_ has quit IRC | 23:15 | |
*** thuc has quit IRC | 23:15 | |
*** rcarrillocruz has quit IRC | 23:15 | |
clarkb | but I am still a bit worried it is a lot of work for nothing as it's a luck of the draw sort of thing | 23:15 |
*** thuc has joined #openstack-infra | 23:16 | |
anteaya | anyone have any reason why paste.o.o 500'd on me a minute or so ago? | 23:16 |
anteaya | and why is it using so much cache memory? http://cacti.openstack.org/cacti/graph_view.php?action=tree&tree_id=1&leaf_id=14&page=2 | 23:16 |
anteaya | I couldn't see anything else funny on paste from cacti | 23:16 |
clarkb | anteaya: cache memory is a linux thing | 23:16 |
clarkb | anteaya: linux says: memory is there, if it isn't used for anything specific I will use it for cache | 23:16 |
anteaya | doesn't it seem like paste is using quite a bit? | 23:16 |
anteaya | oh okay | 23:17 |
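As a small illustration of the point about Linux page cache: memory reported as cached is handed back when applications need it, so a large cache figure on a graph is not by itself a sign of pressure. The sketch below parses /proc/meminfo-style text in Python. The sample numbers are invented, and summing MemFree, Buffers and Cached is only a rough, commonly used approximation of reclaimable memory.

```python
# Invented sample in /proc/meminfo format; not data from the paste server.
SAMPLE = """\
MemTotal:        2048000 kB
MemFree:          102400 kB
Buffers:           51200 kB
Cached:          1536000 kB
"""


def parse_meminfo(text):
    """Parse 'Key:   value kB' lines into a dict of integer kB values."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if rest.strip():
            info[key.strip()] = int(rest.split()[0])
    return info


def roughly_available_kb(info):
    """Free memory plus buffers and page cache, which the kernel can reclaim."""
    return info["MemFree"] + info["Buffers"] + info["Cached"]


if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print(roughly_available_kb(info), "kB of", info["MemTotal"], "kB")
```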
anteaya | so back to why did it 500 on me | 23:17 |
*** esker has joined #openstack-infra | 23:17 | |
anteaya | cacti doesn't point to anything specific | 23:17 |
clarkb | for the 500 I don't have a specific answer, but guessing it's related to trove | 23:17 |
anteaya | oh | 23:17 |
anteaya | is that a thing or is it just "oh trove" | 23:17 |
anteaya | like do we have a bug report for it, or do we want one? | 23:18 |
clarkb | anteaya: we probably need to properly debug it first | 23:18 |
anteaya | ah okay | 23:18 |
*** markmcclain has joined #openstack-infra | 23:18 | |
anteaya | I'm guessing I don't have much debugging power from this side, having no access to servers | 23:18 |
*** rcarrillocruz has joined #openstack-infra | 23:19 | |
sdague | paste.o.o 500s quite a bit actually | 23:19 |
*** esker has quit IRC | 23:19 | |
sdague | clarkb: are we recording apache logs for that? | 23:19 |
sdague | I'd say 1 time in 20 I get a 500 | 23:19 |
*** ramashri has quit IRC | 23:20 | |
clarkb | sdague: we should be. the puppet apache module (or maybe just ubuntu apache) seems to do the right thing | 23:20 |
*** thuc has quit IRC | 23:20 | |
clarkb | I can hop on there in a moment | 23:20 |
jogo | fungi: sounds like we have the bug: nova cannot delete an instance in error,deleting | 23:20 |
jogo | 1bc3554b-1f2a-44c7-b9a7-3d8bc956f7cd] Instance is already in deleting state, ignoring this request | 23:21 |
anteaya | jogo: where did you find that error message? | 23:21 |
fungi | clarkb: i too remain unconvinced it's worth replacing the node to test this theory. perhaps move the query api endpoint to another node instead and see if the problem stays behind or follows the action? | 23:22 |
jogo | anteaya: from phschwartz | 23:22 |
anteaya | jogo: helpful chap | 23:22 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/config: Be specific about which ES nodes are puppetable https://review.openstack.org/99794 | 23:23 |
jogo | b3c9cc504903eccbc68c441a81b0a727a83117fa I2f97f93bd714e0ea3b6d4fa3ac457ab43eed00e1 | 23:23 |
clarkb | fungi: ^ something like that | 23:23 |
*** ramashri has joined #openstack-infra | 23:23 | |
clarkb | fungi: we still have queries disabled and it happens when indexing | 23:23 |
*** amcrn has joined #openstack-infra | 23:23 | |
clarkb | fungi: I am reasonably confident this isn't induced by the queries | 23:23 |
*** ihrachyshka has quit IRC | 23:23 | |
fungi | got it | 23:24 |
sdague | fungi: though we haven't been querying it for the last couple of hours | 23:24 |
clarkb | fungi: but ya the buckshot approach to cloud is :? | 23:24 |
clarkb | er :/ | 23:25 |
fungi | scattershot | 23:25 |
fungi | git yer scatter gun | 23:26 |
fungi | as we say 'round these parts | 23:26 |
*** atiwari has quit IRC | 23:27 | |
fungi | completely unrelated to anything going on here (other than mentioning the open-source movement and a nod to zero wing), i loved seeing http://www.teslamotors.com/blog/all-our-patent-are-belong-you | 23:29 |
jesusaurus | clarkb: with an approach like 99749 how would you replace an existing node? if es02 went sideways and a new one needed to be spun up, would the replacement have to be es07 and there would be no es02 afterwards? | 23:29 |
clarkb | jesusaurus: correct | 23:29 |
anteaya | fungi: I shouted with joy as I read that | 23:30 |
clarkb | jesusaurus: in a cattle world that doesn't bother me a whole lot | 23:30 |
clarkb | jesusaurus: but it could potentially be confusing | 23:30 |
jogo | phschwartz fungi: https://review.openstack.org/99796 | 23:30 |
jogo | I think ^ should fix things for us | 23:30 |
jogo | gotta actually test that, etc but thats the idea | 23:30 |
jogo | sdague: ^ | 23:30 |
clarkb | jesusaurus: one potential way around that is to make all of the ES stuff run on internal addresses | 23:31 |
jesusaurus | clarkb: i would be confused/annoyed by a set of non-sequential host numbers, but that might just be me | 23:31 |
fungi | clarkb: jesusaurus: gaps in sequential server numbering don't bother me in the least, and i'm probably close to clinically ocd | 23:31 |
clarkb | jesusaurus: but at one point I had considered multi az clustering | 23:31 |
clarkb | jesusaurus: that never happened | 23:31 |
*** mmaglana has joined #openstack-infra | 23:31 | |
* fungi has learnt to use his ocd for good, not evil | 23:32 | |
fungi | pragmatic choices are an acceptable loss of consistency | 23:32 |
anteaya | jogo: I'm confused | 23:33 |
alaski | jogo: just saw your poke about object conversion and cells | 23:33 |
anteaya | jogo: I thought the == vm_states.ERROR was the state we are seeing that needs to be addressed | 23:34 |
*** praneshp_ has joined #openstack-infra | 23:34 | |
*** mmaglana has quit IRC | 23:36 | |
clarkb | oh boo, lodgeit is being proxied to, not mod_wsgi'd | 23:36 |
clarkb | and now I can't find logs /me hunts for them | 23:36 |
*** esker has joined #openstack-infra | 23:37 | |
*** praneshp has quit IRC | 23:37 | |
*** praneshp_ is now known as praneshp | 23:37 | |
clarkb | upstart to the rescue | 23:37 |
*** esker has quit IRC | 23:37 | |
clarkb | OperationalError: (OperationalError) (2006, 'MySQL server has gone away') | 23:38 |
clarkb | anteaya: fungi sdague ^ | 23:38 |
clarkb | 110 occurrences of that over the last 18 hours or so | 23:39 |
anteaya | where does it go? | 23:39 |
anteaya | when it goes away | 23:39 |
*** gokrokve has quit IRC | 23:39 | |
anteaya | now on the east coast of Canada "away" is Quebec, Ontario or the West | 23:40 |
anteaya | maybe MySQL went there | 23:40 |
anteaya | I wonder what it does there | 23:40 |
jogo | alaski: we are digging into why rax is erroring out on so many deletes http://paste.openstack.org/show/83885/ | 23:41 |
anteaya | probably swats flies like the rest of us | 23:41 |
*** sarob has quit IRC | 23:41 | |
jogo | alaski: and someone thought | fault | {u'message': u"'ConnectionFailed' object has no attribute 'status_code'", u'code': 500, u'created': u'2014-06-09T12:41:33Z'} | | 23:41 |
*** sarob_ has joined #openstack-infra | 23:41 | |
jogo | could be objects | 23:41 |
anteaya | jogo: so I don't understand your patch | 23:41 |
anteaya | since I thought the vm_state was ERROR | 23:41 |
jogo | anteaya: yeah, so wrote that on the train so let me see if it makes any sense | 23:41 |
mattoliverau | maybe the max_connections needs to be raised in the trove instance | 23:41 |
anteaya | but your patch seems to say not ERROR | 23:41 |
*** reed has quit IRC | 23:42 | |
anteaya | jogo: okay well I don't understand it yet | 23:42 |
jogo | so the issue right now is this: you try to delete an instance, and previously if that failed it went back to active state | 23:43 |
alaski | jogo: a very good guess, but cells/objects failures are usually "'dict' object has no attribute..." | 23:43 |
jogo | alaski: also surprised at how many neutron errors you have ;) | 23:43 |
jogo | anteaya: so https://review.openstack.org/#/c/58829/ changed that behavior | 23:44 |
clarkb | mattoliverau: possibly | 23:44 |
alaski | jogo: yeah :( | 23:44 |
jogo | anteaya: so now instances go into deleting,error | 23:45 |
jogo | anteaya: which makes more sense. | 23:45 |
jogo | anteaya: note I am ignoring *why* the delete fails | 23:45 |
jogo | anteaya: and there was another patch https://review.openstack.org/#/c/55444/ | 23:45 |
jogo | that says if you send a second delete to an instance that is deleting, ignore it, as it's already in deleting state | 23:45 |
jogo | anteaya: does that sound right so far? this is helping me sanity check this, so thank you for that | 23:45 |
jogo | alaski: also https://review.openstack.org/#/c/99796/ | 23:46 |
jogo | alaski: I think that should fix a lot of our issues with having instances stuck in error,deleting | 23:46 |
anteaya | always happy to share in a sanity check | 23:47 |
*** markwash has quit IRC | 23:47 | |
anteaya | and yes I am following the bouncing ball so far | 23:47 |
jogo | anteaya: so my patch should change the "ignore the second delete if already deleting" logic to: | 23:47 |
jogo | ignore if already deleting and not in error state | 23:47 |
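The rule jogo describes can be sketched in a few lines of Python. This is an illustration of the stated logic only, not nova's actual code: the constant values and the function name are hypothetical.

```python
# Hypothetical state names standing in for nova's vm_states / task_states.
VM_ACTIVE = "active"
VM_ERROR = "error"
TASK_DELETING = "deleting"


def should_ignore_delete(vm_state, task_state):
    """Return True when a repeated delete request can be dropped.

    A second delete is ignored only if a delete is already in progress AND
    the instance is not in the error state; an instance stuck in
    error,deleting must accept a new delete so it can be cleaned up.
    """
    already_deleting = task_state == TASK_DELETING
    return already_deleting and vm_state != VM_ERROR


if __name__ == "__main__":
    print(should_ignore_delete(VM_ACTIVE, TASK_DELETING))  # in-progress delete
    print(should_ignore_delete(VM_ERROR, TASK_DELETING))   # stuck instance
```

Under the earlier behavior described above, both cases would be ignored, which is why a manual delete of an instance in error,deleting has no effect.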
anteaya | oh | 23:48 |
anteaya | ummmm how does that fix what we are seeing then? | 23:48 |
fungi | clarkb: mattoliverau: that sounds like a very, very plausible explanation for what we're seeing with paste.o.o | 23:48 |
anteaya | or at least any part of what we are seeing? | 23:49 |
jogo | anteaya: it will fix the part where we keep trying to delete an instance and it doesn't delete | 23:49 |
jogo | anteaya: if fungi tries to manually delete one of the instances in error,deleting it won't work | 23:49 |
anteaya | okay that was part of what phschwartz wanted | 23:49 |
jogo | with my patch it should be deletable | 23:50 |
anteaya | to reduce using resources to no effect | 23:50 |
anteaya | oh | 23:50 |
jogo | well he wanted us to give up on those instances | 23:50 |
anteaya | any way to test it? | 23:50 |
anteaya | he did | 23:50 |
jogo | anteaya: unit testing? | 23:50 |
anteaya | but we still need to get rid of them | 23:50 |
jogo | anteaya: yup, this patch should do that (if it works) | 23:50 |
anteaya | well yes, but I was thinking put it in action and get rid of one of those instances | 23:50 |
anteaya | since we have a few to try to kill | 23:51 |
jogo | anteaya: ohh well if we could get into rax and change code | 23:51 |
jogo | (maybe someone can) | 23:51 |
anteaya | guess that is pvo then | 23:51 |
*** thuc_ has joined #openstack-infra | 23:52 | |
anteaya | jogo: in the commit message can I get some urls to the two patches you linked me to: https://review.openstack.org/#/c/99796/1//COMMIT_MSG | 23:53 |
*** thuc_ has quit IRC | 23:53 | |
anteaya | with a small sample of the chain of events you just described for me? | 23:53 |
jogo | anteaya: yup, I am transcribing what I told you into it | 23:54 |
jogo | anteaya: one step ahead of me | 23:54 |
anteaya | awesome thank you | 23:54 |
anteaya | :D | 23:54 |
*** thuc_ has joined #openstack-infra | 23:54 | |
anteaya | composing commit messages is harder than just typing stuff into irc | 23:54 |
anteaya | you don't need punctuation or anything | 23:54 |
anteaya | unless you _want_ to put a full stop at the end of a sentence. | 23:55 |
anteaya | then you can, in irc. | 23:55 |
anteaya | long live the full stop. | 23:55 |
fungi | it's like typing in telegram. | 23:55 |
anteaya | ha ha ha | 23:55 |
clarkb | I am still trying to sort out lodgeit's concurrency model | 23:55 |
fungi | clarkb: it *has* a concurrency model? | 23:56 |
clarkb | fungi: mattoliverau ^ it uses werkzeug script.make_runserver | 23:56 |
fungi | clarkb: in my head that translates to "someone should write a paste server" | 23:56 |
clarkb | https://github.com/mitsuhiko/werkzeug/blob/master/werkzeug/script.py#L288 | 23:57 |
clarkb | that line is great. we are using a development server... | 23:57 |
clarkb | it looks like by default it uses one process without threads | 23:57 |
clarkb | so we shouldn't have connection issues right? | 23:58 |
*** rcarrillocruz has quit IRC | 23:58 | |
fungi | bwahaha | 23:58 |
jogo | comstud: can you help reset task states for now? | 23:58 |
*** rcarrillocruz has joined #openstack-infra | 23:59 | |
fungi | clarkb: yeah, unless it's not a concurrency issue but rather a rate issue? | 23:59 |
*** ramashri has quit IRC | 23:59 | |
clarkb | fungi: could be | 23:59 |
clarkb | poor server can't deal with it | 23:59 |
*** jerryz_ has quit IRC | 23:59 |