clarkb | jlk: our jenkins slaves are good at DDoSing our git server | 00:01 |
---|---|---|
clarkb | jlk: particularly when we point them at git-daemon | 00:01 |
jlk | strange. | 00:01 |
jlk | but your repos are significantly larger than Fedora's were | 00:01 |
jlk | Fedora was thousands of small repos | 00:01 |
jlk | Our hits were probably more distributed as well, distributed over time and network capabilities. RHT infrastructure had networking gear in between our servers and the Internet, I don't know what they did for throttling or whatnot | 00:03 |
jeblair | jlk: did you use xinetd or run git-daemon itself? | 00:04 |
jlk | good question! I believe I used whatever was packaged in EPEL | 00:04 |
jlk | would have been rhel6 era | 00:04 |
jeblair | jlk: that's pretty much what we're doing, which ends up being xinetd. so no particular tuning? | 00:08 |
fungi | clarkb: any good reason not to pass --events on mysqldump runs? currently cronspamming us about skipping the mysql.event table on each server | 00:08 |
jlk | jeblair: not that I remember. | 00:09 |
jlk | I think I looked at one time at doing git export to just get the latest bits instead of doing a full clone, or doing shallow clones, on our build server | 00:09 |
jlk | because it didn't need any history, just needed the bits | 00:09 |
mordred | jeblair: ^^ that's a little bit what I was afraid of - we tend to absolutely slam the cloning infrastructure | 00:10 |
jlk | ah, apparently they do use xinetd to throttle it a lot now | 00:10 |
jlk | where "it" == anonymous clones | 00:10 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/config: Proxy git-daemon with haproxy. https://review.openstack.org/42784 | 00:11 |
jeblair | mordred: not really, we almost never clone | 00:11 |
clarkb | I can't help myself | 00:11 |
ttx | mordred: are you done merging back swift m-p tags to master, or should I keep the m-p branch alive for some more time ? | 00:12 |
clarkb | that is completely untested but in theory made easy with the puppetlabs module | 00:12 |
mordred | ttx: done with it | 00:12 |
ttx | mordred: I can delete it now ? | 00:12 |
mordred | ttx: also wrote a patch to potentially do it | 00:12 |
mordred | ttx: yup | 00:12 |
clarkb | jlk: aren't all git-daemon clones anonymous? | 00:12 |
ttx | mordred: ok, on my way to final cleanup | 00:12 |
mordred | ttx: https://review.openstack.org/#/c/41927/ | 00:12 |
jlk | well, yes, I'm not sure why I added that bit of data. | 00:12 |
jlk | "data" | 00:12 |
jeblair | clarkb, mordred: http://paste.openstack.org/show/44553/ | 00:13 |
jeblair | clarkb, mordred: that thread is just sitting there. best i can tell, it's not waiting on a lock. but it is holding one which is blocking everyone else. | 00:13 |
jeblair | that should be the jjb update that changes the git url. it applied fine on jenkins01 | 00:13 |
mordred | jeblair: wow. that's stellar | 00:14 |
jeblair | i'm leaning towards "try to manually kill that thread". any other ideas before i do that? | 00:14 |
clarkb | jeblair: is it possibly waiting on a locked file? | 00:15 |
*** pcrews has quit IRC | 00:16 | |
*** ^demon has joined #openstack-infra | 00:16 | |
ttx | mordred: there is a corner case in the merge-tags thing | 00:16 |
jeblair | clarkb: it looks like a runaway regex | 00:16 |
mordred | jeblair: I'd chalk that up to "java sucks sometimes" | 00:17 |
mordred | ttx: yeah? | 00:17 |
clarkb | fungi: uh I don't know | 00:17 |
ttx | mordred: for stable/* I'm not sure you actually want to merge tags back... do you ? | 00:17 |
* clarkb reads more manpages | 00:17 | |
*** nati_ueno has quit IRC | 00:17 | |
clarkb | pleia2: if you are really adventurous I think it would be cool to apply 42784 to your test server if it is still up | 00:17 |
jeblair | mordred: 'cept gearman-plugin is a few rungs down the stacktrace | 00:18 |
jeblair | mordred: so it's our fault | 00:18 |
mordred | ttx: branch: ^(milestone-proposed).*$ | 00:18 |
ttx | mordred: i.e. when we tag 2013.1.3 on stable/grizzly, do we really want to merge the tags back to havana master ? | 00:18 |
mordred | ttx: the job is configured to only run on milestone-proposed | 00:18 |
mordred | since that's the only time we ever want to do this | 00:18 |
ttx | mordred: at release time we use milestone-proposed too, and turn that into stable/* | 00:18 |
*** ^d has quit IRC | 00:18 | |
clarkb | mordred: thoughts on fungi's --events mysqldump option? | 00:19 |
mordred | ttx: but it's milestone-proposed when you make the tag, right? | 00:19 |
clarkb | mordred: is that table useful or just noise? | 00:19 |
ttx | mordred: so we push like, havana-rc2 tags to milestone-proposed while master switched to icehouse | 00:19 |
mordred | clarkb: noise. we don't use it | 00:19 |
mordred | ttx: yup. that's fine | 00:19 |
clarkb | mordred: so better to redirect that warning message to /dev/null than to dump the table? | 00:20 |
ttx | mordred: ok, just doublechecking | 00:20 |
mordred | ttx: we _do_ want the final tag from havana milestone-proposed to be in master, so that the in-flight versions look "sensible" | 00:20 |
mordred | but I agree, the following tags that are made on stable/* do not want to be merged to master | 00:20 |
ttx | mordred: can that job generate a conflict ? Or is it always successful ? | 00:21 |
mordred | ttx: and we're making it always a null-merge, so the merge will never bring changes from m-p to master | 00:21 |
ttx | ok, guess that answers my question | 00:21 |
mordred | ttx: it's always successful. it's using the merge strategy which says "just keep my version" | 00:21 |
ttx | ack | 00:21 |
ttx | +1ed | 00:22 |
Alex_Gaynor | Is there anything I could be doing to help with the "ddosing ourselves with git" issue? | 00:22 |
clarkb | Alex_Gaynor: right now we are switching to using https instead of git:// as apache deals with ddosing ourselves better | 00:23 |
jeblair | clarkb, mordred: uh, wow, ok, it got unstuck. | 00:23 |
mordred | jeblair: wow | 00:23 |
Alex_Gaynor | clarkb: "apache deals with ddosing ourselves better", I feel like this encapsulates everything I feel about computering (for better and for worse) :) | 00:23 |
clarkb | Alex_Gaynor: https://review.openstack.org/42784 is one potential way of moving back to using git:// but it needs testing and probably input from someone that knows haproxy better than me | 00:23 |
Alex_Gaynor | clarkb: I can probably ping some HA proxy friends | 00:23 |
clarkb | Alex_Gaynor: I am semi hoping we can abuse pleia2's test box if it is still around | 00:24 |
jlk | seems really strange to make use of https to make things faster... | 00:24 |
jlk | IIRC git:// isn't doing any encryption, which /should/ make it an easier process to handle. | 00:24 |
jeblair | Alex_Gaynor, jlk: basically, git under xinetd has no socket queueing, so you're either under the 50 process limit, or over, in which case you get your connection dropped | 00:24 |
jlk | interesting | 00:24 |
jeblair | Alex_Gaynor, jlk: apache at least will let you separately tune how many things you run, vs how many things you queue | 00:24 |
clarkb | and if we increase the connection limit we end up hitting cpu and disk hard | 00:24 |
jlk | nod | 00:25 |
Alex_Gaynor | Is there anything we can point at github? | 00:25 |
jeblair | so we can set a reasonable number of processes to run at once, and a larger queue | 00:25 |
Alex_Gaynor | let them deal with the problem | 00:25 |
mordred | Alex_Gaynor: hehehe | 00:25 |
mordred | Alex_Gaynor: that's funny | 00:25 |
jeblair | Alex_Gaynor: that's been our strategy up to this point | 00:25 |
jlk | they appear to be moving away from git:// as much as they can | 00:25 |
BobBall_Away | mordred: Now the only failure with VIRTUAL_ENV is grenade... not sure how to fix it though, since we're explicitly trying to perform an upgrade it sounds like it might be more difficult than I'd hope... | 00:25 |
jlk | but that might just be because they can stick all sorts of tracking around http usage that they can't w/ git:// | 00:25 |
mordred | BobBall_Away: I think we just may need to do similar work there | 00:26 |
jeblair | Alex_Gaynor: github still fails quite often, enough for our automagic to notice | 00:26 |
mordred | BobBall_Away: or backport some of the changes to devstack stable/grizzly | 00:26 |
mordred | BobBall_Away: but that's thrilling! | 00:26 |
BobBall_Away | mordred: effectively the error seems to be it's running in the venv but things (such as pip) haven't been installed in it | 00:26 |
jeblair | Alex_Gaynor: (i should say partial strategy -- we haven't used github in tests for a long time, but we still use it for cronjobs, etc) | 00:26 |
mordred | BobBall_Away: I'm going to run out for a second, I'll look at grenade when I get back | 00:26 |
BobBall_Away | very thrilling | 00:26 |
BobBall_Away | I'm going to bed now | 00:26 |
jlk | I think Fedora infrastructure also has multiple front ends for git | 00:26 |
jlk | that use a shared FS | 00:26 |
mordred | BobBall_Away: thanks for your help! | 00:26 |
BobBall_Away | it's 1:30am and I've had enough :D | 00:27 |
dstufft | use a CDN | 00:27 |
dstufft | ! | 00:27 |
jlk | not positive though | 00:27 |
Alex_Gaynor | dstufft: doing invalidation on a CDN'd git repo sounds awful | 00:27 |
jlk | yikes | 00:27 |
* mordred has a hunch multiple servers is going to wind up being in the cards eventually | 00:27 | |
dstufft | Alex_Gaynor: I dunno sounds like it wouldn't be that bad actually | 00:27 |
lifeless | dstufft: I'm not aware of any git CDN's | 00:27 |
Alex_Gaynor | lifeless: if you run git over HTTP(S) you can just use any HTTP pass-through CDN | 00:28 |
clarkb | lifeless: the http stuff should CDN just fine | 00:28 |
jeblair | mordred: yep. i just want it to be multiple good servers. | 00:28 |
lifeless | Alex_Gaynor: clarkb: yeouch. No. Thanks. | 00:28 |
jlk | multiple servers seems easy for read-only support. it's the read/write that's hard with a load balancer | 00:28 |
Alex_Gaynor | master/slave git | 00:28 |
mordred | jlk: we don't need read/write | 00:28 |
mordred | we have a single write master | 00:28 |
mordred | which is gerrit | 00:28 |
jlk | and I really didn't want there to be two vastly different URLs for read-only clone vs write clone | 00:28 |
jeblair | jlk: we are in the fortunate position of only needing to consider read-only mirrors here | 00:29 |
mordred | which replicates to things | 00:29 |
*** nati_ueno has joined #openstack-infra | 00:29 | |
jlk | mordred: oh right, that makes things a lot easier for you | 00:29 |
mordred | yup | 00:29 |
lifeless | Alex_Gaynor: clarkb: I presume you are aware of the way plain HTTP with git (and basically all VCS's) works, right ? | 00:29 |
lifeless | Alex_Gaynor: clarkb: or perhaps I should say, I presume you *aren't* aware, or you wouldn't suggest a CDN be a good fit. | 00:29 |
dstufft | pretend network latency doesn't exist and just fetch some files ? :V | 00:29 |
lifeless | dstufft: thats part A of the terror. part B is to either do readv's, or to sporadically download the entire repo all over again, due to the rebalancing of 'pack' operations | 00:30 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/config: Make mysql backup crons quiet. https://review.openstack.org/42785 | 00:30 |
clarkb | fungi: mordred ^ that should make mysqldump cronspam less annoying | 00:30 |
dstufft | you can probably run multiple git slaves and just front it with haproxy proxying streams around, the only hard part would be determining if an incoming stream is read or write, if there's something obvious in the connect that lets you know if something is authenticated you can just shove all authenticated at the master and anonymous at the read slaves | 00:31 |
clarkb | lifeless: if the repo hasn't changed the packs stay the same | 00:31 |
jeblair | dstufft: all streams are read. :) | 00:31 |
clarkb | lifeless: and iirc for large repos like nova you end up with several static packs as git leaves old stuff alone | 00:31 |
jeblair | for us | 00:31 |
jlk | dstufft: I don't think we have to worry about writes, everything is a read | 00:32 |
jlk | dstufft: only gerrit has write access | 00:32 |
dstufft | if everything is read then that's even easier | 00:32 |
dstufft | just use haproxy as a TCP load balancer | 00:32 |
dstufft | use whatever protocol you want, http, git, ssh, doesn't matter | 00:33 |
clarkb | dstufft: https://review.openstack.org/42784 | 00:33 |
mordred | dstufft: that's what clarkb was looking in to earlier | 00:33 |
dstufft | wtf is a pp file | 00:33 |
mordred | dstufft: puppet | 00:33 |
dstufft | oh | 00:33 |
jeblair | dstufft, jlk, Alex_Gaynor: so here's the thing -- we spun up a 30g, 8vcpu cloud server for this, and ddosed it with jenkins (it's arguable whether it performed better or worse than the http setup we have on review.o.o) | 00:34 |
jlk | that seems really bizarre, unless you're working with huge repos | 00:34 |
dstufft | you mean the haproxy solution? | 00:34 |
mordred | we have a LOT of activity :) | 00:34 |
clarkb | dstufft: mordred that is a first stab at using haproxy to do queueing but it can be grown to handle multiple servers | 00:34 |
jeblair | dstufft, jlk, Alex_Gaynor: before we spin up an army of maxsize(rackspacecloudservers) for this, i figure a little thought and testing of the tuning of one server might be in order. | 00:34 |
mordred | clarkb: ports => '29418', ? | 00:34 |
Alex_Gaynor | jeblair: so, suggest from a friend of mine "instances=32" | 00:35 |
Alex_Gaynor | jeblair: for xinetd | 00:35 |
dstufft | oh you were just shoving a bigger server at it | 00:35 |
Alex_Gaynor | I assume this forks 32 processes to handle requests | 00:35 |
lifeless | clarkb: it tries to accommodate things yes, which makes the behaviour worse, because you get sporadic 'wtf is it doing' when it has to suck down the entire history again. | 00:35 |
clarkb | dstufft: mordred or maybe we use lbaas to handle multiple services and keep the local haproxy for queueing | 00:35 |
jlk | mordred: does all that activity require a full clone of the repo? | 00:35 |
dstufft | what does rackspace have for HD's | 00:35 |
*** dina_belova has joined #openstack-infra | 00:35 | |
*** rfolco has joined #openstack-infra | 00:35 | |
jeblair | Alex_Gaynor: we currently have the default of 50. | 00:35 |
dstufft | spinning up more processes won't help if you're IO bound | 00:35 |
clarkb | mordred: haproxy will listen on 9418 so I stuck gitdaemon on the alternate that gerrit uses | 00:35 |
mordred | clarkb: ahhhh | 00:36 |
mordred | clarkb: I agree with jeblair - let's see what a local haproxy queue will do to it | 00:36 |
mordred | before we start adding in multi-machine lbaas | 00:36 |
clarkb | mordred: definitely | 00:36 |
mordred | but potentially yes | 00:36 |
jeblair | i think we ought to do some real performance testing too | 00:36 |
dstufft | where was the bottleneck? | 00:36 |
*** coderanger has joined #openstack-infra | 00:36 | |
jeblair | where we figure out where the bottleneck actually is :) | 00:36 |
coderanger | Alex_Gaynor: Fine :P | 00:37 |
Alex_Gaynor | coderanger knows how haproxy works and junk | 00:37 |
jeblair | and what kind of throughput we can get under different configurations | 00:37 |
*** mriedem has joined #openstack-infra | 00:37 | |
Alex_Gaynor | coderanger: tl;dr; too many things trying to get stuff from git == ddosing ourselves | 00:37 |
jlk | yeah, curious where the bottleneck is. Disk, or CPU, or network | 00:37 |
clarkb | coderanger: Alex_Gaynor https://review.openstack.org/#/c/42784/1/modules/cgit/manifests/init.pp is the important file | 00:37 |
dstufft | I think before you go changing your configs around you should figure out the bottleneck | 00:37 |
coderanger | So cranking down maxconns won't buffer connections like it says in the review comment, it will just leave the socket in the listen queue | 00:37 |
dstufft | because that's going to influence what the solution is a lot :V | 00:38 |
coderanger | So if you are getting backed up, you are just going to end up with the kernel refusing conns | 00:38 |
clarkb | coderanger: "anything behind that will queue" is what the commit message says. Is that completely wrong? | 00:38 |
clarkb | ah | 00:38 |
clarkb | well that doesn't help | 00:38 |
coderanger | I mean it can smooth out spikes | 00:39 |
*** michchap has joined #openstack-infra | 00:39 | |
coderanger | Up to whatever your max fds is | 00:39 |
clarkb | coderanger: spikes are the current issue. Our jenkins slaves are a thundering herd | 00:39 |
coderanger | Do you know the magnitude? | 00:39 |
clarkb | coderanger: we need a semi deterministic way of making them wait in line if necessary | 00:39 |
jeblair | #status ok | 00:40 |
*** ChanServ changes topic to "Discussion of OpenStack Developer Infrastructure | docs http://ci.openstack.org | bugs https://launchpad.net/openstack-ci/+milestone/grizzly | https://github.com/openstack-infra/config" | 00:40 | |
*** dina_belova has quit IRC | 00:40 | |
coderanger | clarkb: If thats the way you want to go, make sure you set the backlog param in haproxy too :) | 00:40 |
clarkb | coderanger: absolute worst case is something like ~300 connections all at once based on the number of slaves we have | 00:40 |
clarkb | + some fudge for random people using it too | 00:41 |
coderanger | Ahh okay, for 300 conns thats fine as long as you know you can clear them | 00:41 |
coderanger | Do the slaves retry on failure? | 00:41 |
clarkb | coderanger: they do not, and that may help a little but not fix the problem | 00:41 |
coderanger | If so, you can also just set the xinetd instances=32 | 00:41 |
coderanger | or probably do that anyway just for safety :) | 00:41 |
coderanger | Any reason to not use Jenkins' "hash" support in the scm config? | 00:42 |
coderanger | Thats been the default for a while now for exactly this reason | 00:42 |
fungi | coderanger: we don't really use the scm plugin for this | 00:42 |
clarkb | coderanger: because it has been useless for a long time. I believe mordred helped make it better but we tried switching to it and didn't for some other reason | 00:43 |
clarkb | mordred: jeblair do you remember why we stuck with g-g-p? | 00:43 |
coderanger | Ahh, manual build kickoff times every slave trying to pull down code? | 00:43 |
jeblair | clarkb: because it has a nice echo statement | 00:43 |
mordred | less work for jenkins to attempt to do | 00:43 |
jeblair | coderanger: yeah, we 'manually' run 400-600 jobs per hour | 00:44 |
jeblair | coderanger: obviously it's not manual, but that's the way jenkins sees it; they're triggered by a project gating system hooked up to our code review | 00:44 |
coderanger | Yahr | 00:44 |
coderanger | And to be clear, this is on recent-ish linux, right? :) | 00:44 |
clarkb | coderanger: haproxy or jenkins? | 00:45 |
mordred | well, the git server is running on centos6 | 00:45 |
coderanger | haproxy | 00:45 |
coderanger | (this would do truly bad things on Windows) | 00:45 |
mordred | we don't do windows | 00:45 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/config: Proxy git-daemon with haproxy. https://review.openstack.org/42784 | 00:45 |
fungi | using windows would be truly bad things | 00:45 |
clarkb | ^^ now with backlog | 00:45 |
uvirtbot | clarkb: Error: "^" is not a valid command. | 00:45 |
clarkb | uvirtbot: sssshhh | 00:45 |
uvirtbot | clarkb: Error: "sssshhh" is not a valid command. | 00:45 |
mordred | clarkb: yes. that looks good | 00:46 |
coderanger | clarkb: Other thing to check is that no hooks on the git server are using the remote IP for anything (access control, logging?) | 00:47 |
coderanger | Other than that, sounds like it will do what you want :) | 00:47 |
clarkb | coderanger: we don't have server side hooks so we should be fine | 00:47 |
jlk | I hadn't thought about hooks on a git-daemon pull | 00:48 |
clarkb | coderanger: cool thanks | 00:48 |
* jeblair runs again | 00:49 | |
clarkb | coderanger: what does the hash option to jenkins scm plugin do? | 00:51 |
* fungi assumes it's hash-based load distribution | 00:51 | |
* Alex_Gaynor assumes it reuses the same clone but just fetches that hash | 00:51 | |
fungi | ooh, you're probably right | 00:52 |
coderanger | Yeah, the scm plugin uses a cron-style config | 00:52 |
coderanger | the hash flag just lets you do <hash based lb>/N | 00:52 |
coderanger | Spreads out the thundering herd, but that only helps balance against multiple jobs | 00:52 |
coderanger | not multiple slaves on the same job | 00:52 |
clarkb | coderanger: if you want to see shiny graphs and current tests http://status.openstack.org/zuul/ | 00:52 |
* fungi guessed right | 00:53 | |
Alex_Gaynor | jeez, 600+ outstanding events | 00:53 |
Alex_Gaynor | s/events/results/ | 00:53 |
clarkb | Alex_Gaynor: this is what happens before milestone 3 every single time | 00:53 |
clarkb | Alex_Gaynor: for grizzly it was particularly painful | 00:53 |
Alex_Gaynor | clarkb: ahaha, this is my first milestone I guess | 00:54 |
clarkb | Alex_Gaynor: if we had the grizzly load today we would've been fine, but you guys keep writing more code :) | 00:54 |
Alex_Gaynor | clarkb: sorry? | 00:54 |
Alex_Gaynor | :D | 00:54 |
Alex_Gaynor | clarkb: these events/results are all bottlenecked on git? | 00:57 |
*** anteaya has quit IRC | 00:58 | |
lifeless | mordred: is the expectation that doing 'pip install -r requirements.txt' will grab everything a service needs? | 00:58 |
lifeless | mordred: pyudev which neutron wants is not listed in its requirements.txt. I suspect it's a transitive dependency :( | 00:58 |
clarkb | Alex_Gaynor: events definitely are. I don't think results are so it is weird to see results so high | 00:59 |
clarkb | Alex_Gaynor: actually I take that back. results end up merging code in gerrit which would be bottlenecked too | 00:59 |
clarkb | Alex_Gaynor: events is gerrit events input into zuul. Things like new patchset or new comment. results are results from jenkins | 01:00 |
Alex_Gaynor | clarkb: I assume results are serialized, so it's really a head of the line problem? | 01:01 |
clarkb | Alex_Gaynor: correct | 01:02 |
*** lbragstad has joined #openstack-infra | 01:02 | |
clarkb | comparing cacti graphs for zuul and review.o.o this really seems to be a zuul problem | 01:02 |
clarkb | mordred: jeblair fungi I think we should merge the change to point d-g at git.o.o | 01:02 |
clarkb | jeblair: and I wonder if we shouldn't artificially throttle zuul, or at least have the option to | 01:04 |
clarkb | I feel better when things are slow but under control :) | 01:04 |
jeblair | clarkb: what? | 01:05 |
clarkb | jeblair: see the queue lengths on the zuul status page | 01:05 |
bodepd | was mgagne in here asking about redirects? | 01:06 |
*** beagles has quit IRC | 01:06 | |
* bodepd searches logs... | 01:06 | |
clarkb | bodepd: he was at some point last week iirc | 01:07 |
bodepd | clarkb: what was the verdict? | 01:08 |
bodepd | clarkb: should I open a ticket? | 01:08 |
bodepd | clarkb: we've got a lot of changes that need to happen, and a decision to make based on if that happens | 01:09 |
clarkb | bodepd: I want to say he made the change and it merged | 01:09 |
clarkb | bodepd: check in the git log for openstack/config | 01:09 |
bodepd | the repo does not exist | 01:09 |
clarkb | er openstack-infra/config | 01:09 |
*** pabelanger has quit IRC | 01:10 | |
bodepd | no, I meant stackforge/puppet-quantum | 01:10 |
clarkb | oh renames | 01:10 |
clarkb | he wanted puppet lint file redirects. I thought that is what you were talking about | 01:11 |
clarkb | mordred: ^ rename question | 01:11 |
bodepd | sorry for the lack of context | 01:11 |
jeblair | i believe that repo has been renamed | 01:11 |
bodepd | mordred: basically, a github redirect stackforge/puppet-quantum -> stackforge/puppet-neutron | 01:11 |
bodepd | would be awesome | 01:11 |
bodepd | I know it's possible to do if you are admin of an account | 01:12 |
jeblair | bodepd: i'm opposed to that. | 01:12 |
bodepd | jeblair: ok. | 01:12 |
bodepd | jeblair: that is what I need to know. (if it is going to happen or not) | 01:12 |
bodepd | b/c we have lots of code that needs to be updated otherwise | 01:12 |
bodepd | jeblair: what is the reason against it? | 01:13 |
jeblair | bodepd: sorry, it's an extremely busy time, we're even shorter staffed than normal, and we need to focus on keeping openstack running | 01:13 |
clarkb | jeblair: the last log item for processing result events is from 00:25 | 01:13 |
*** xchu has joined #openstack-infra | 01:13 | |
*** pabelanger has joined #openstack-infra | 01:13 | |
jeblair | clarkb: yeah, i'm trying to figure out what it's doing | 01:14 |
jeblair | clarkb: oh really, i thought this was the last | 01:15 |
jeblair | 2013-08-20 00:09:35,360 DEBUG zuul.Scheduler: Processing result event <Build 3133095c056a4d7ab064e05a01c7b310 of gate-tempest-devstack-vm-postgres-full> | 01:15 |
pleia2 | clarkb: am away from my laptop for a few hours, can do some tests later (my test server is still up) | 01:16 |
clarkb | pleia2: awesome. That would be helpful as it seems like I am doing 2 other things at the moment | 01:16 |
clarkb | pleia2: and I think it can wait for tomorrow | 01:16 |
jeblair | oh you're right | 01:17 |
jeblair | 2013-08-20 00:25:24,949 DEBUG zuul.Scheduler: Processing result event <Build 339f8f6144644de8b354d56303879d7b of gate-cinder-pep8> | 01:17 |
clarkb | jeblair: which is interesting because it is a result that shouldn't end up merging code or anything like that | 01:20 |
*** lcestari has quit IRC | 01:21 | |
clarkb | jeblair: but that would trigger pipeline.manager.onBuildCompleted(build) | 01:23 |
clarkb | jeblair: 42726,2 is in the check queue | 01:25 |
jeblair | clarkb: any completion event triggers the pipeline processor | 01:25 |
clarkb | jeblair: it does look like the gate queue is still being processed though? | 01:26 |
jeblair | it does? | 01:27 |
fungi | bodepd: per github redirects, i got the impression from the article on their site that it happens automagically when a repo is moved/renamed. but maybe not | 01:28 |
clarkb | jeblair: well the existing changes are getting some updates. I think anything going through the global event loop is stuck | 01:28 |
Alex_Gaynor | fungi: yes, when a repo is renamed the redirects should be automatic | 01:28 |
clarkb | jeblair: though it looks like that is happening for check changes too. So status on the changish/eventqueueobject is being updated but the big while true loop is stuck so we don't update much more than that | 01:28 |
clarkb | jeblair: are we stuck in the while self.processQueue loop in the pipeline manager? | 01:30 |
clarkb | jeblair: https://git.openstack.org/cgit/openstack-infra/zuul/tree/zuul/scheduler.py#n1036 | 01:31 |
*** coderanger has left #openstack-infra | 01:32 | |
*** Ryan_Lane has quit IRC | 01:32 | |
*** mriedem has quit IRC | 01:33 | |
clarkb | jeblair: http://paste.openstack.org/show/44559/ is the last time I see that log message | 01:34 |
jeblair | clarkb: it recently logged it again 2013-08-20 01:27:07,488 DEBUG zuul.IndependentPipelineManager: Starting queue processor: check | 01:34 |
clarkb | jeblair: yeah my version of the debug log was out of date | 01:35 |
jeblair | clarkb: did it move? | 01:35 |
jeblair | clarkb: istr top of check had no running jobs | 01:35 |
clarkb | jeblair: yeah looking at the log it seems to have moved | 01:36 |
jeblair | clarkb: 2013-08-20 01:27:07,148 DEBUG zuul.Scheduler: Run handler sleeping | 01:36 |
jeblair | 2013-08-20 01:27:07,148 DEBUG zuul.Scheduler: Run handler awake | 01:36 |
*** dina_belova has joined #openstack-infra | 01:36 | |
jeblair | clarkb: so basically it just spent 1 hour in one iteration of that loop | 01:36 |
clarkb | jeblair: http://paste.openstack.org/show/44560/ | 01:36 |
clarkb | jeblair: yes | 01:36 |
Alex_Gaynor | it looks like the queue started to move again? | 01:37 |
Alex_Gaynor | at least a little | 01:37 |
clarkb | Alex_Gaynor: yeah a little | 01:37 |
clarkb | I need to head home or food will be cold. But I will check back in from there | 01:37 |
clarkb | jeblair: tail -f /var/log/zuul/debug.log | grep 'zuul.*PipelineManager' is what I am running now to see it move | 01:38 |
fungi | is the gerrit-overloaded-slowing-merges-and-result-posting theory still being batted around? with load average ~300 there and cpu pegged flat out, it seems reasonable for that to crawl | 01:40 |
fungi | er, ~200 i guess | 01:41 |
*** dina_belova has quit IRC | 01:41 | |
Alex_Gaynor | everything broke together is a pretty reasonable explanation it seems | 01:41 |
jeblair | fungi: it's possible; but we didn't see this earlier when we were busier | 01:41 |
fungi | mmm, point | 01:42 |
Alex_Gaynor | so what changed such that things started moving again? | 01:42 |
Alex_Gaynor | (there's still a ton of outstanding events/results) | 01:43 |
HenryG | Trying to figure out what went wrong in gate-grenade-devstack-vm here: https://review.openstack.org/35085 | 01:44 |
HenryG | Help? | 01:45 |
fungi | HenryG: could this be the client backwards compat issue which was causing problems earlier today? have you asked in #openstack-qa? | 01:47 |
*** pcrews has joined #openstack-infra | 01:47 | |
mordred | yes it is | 01:47 |
*** ftcjeff has joined #openstack-infra | 01:47 | |
mordred | HenryG: known issue from earlier. should be fixed now | 01:47 |
HenryG | mordred: fungi: thanks. recheck bug #? | 01:47 |
jeblair | HenryG: it's at the top of the page here: http://status.openstack.org/rechecks/ | 01:48 |
fungi | HenryG: yeah, looking at the console log for that change it looks the same | 01:48 |
Alex_Gaynor | so I'm starting to think those queue counts can't possibly be right | 01:49 |
jeblair | Alex_Gaynor: why? it's been stuck/slow for over an hour | 01:49 |
Alex_Gaynor | jeblair: well, there are ~50 patches in there right now, how can there be 965 results (is that queue entirely jenkins results?) | 01:50 |
jeblair | Alex_Gaynor: those are start and stop events for jenkins; something like more than 700 have arrived since the start of the slowness | 01:53 |
Alex_Gaynor | so 50 * (say 6 tests per) * 2 still doesn't account for 900? | 01:53 |
fungi | and yeah, it does seem from the cacti graphs that cpu/load have fallen dramatically on zuul in the past couple hours | 01:53 |
Alex_Gaynor | Random other point: the SCP step for the logs seems to be slower today | 01:54 |
jeblair | Alex_Gaynor: it's more than 6 jobs per change | 01:55 |
jeblair | Alex_Gaynor: nova runs 13 | 01:55 |
jeblair | in the check queue | 01:55 |
Alex_Gaynor | gah, good point, I guess it does add up | 01:55 |
Alex_Gaynor | 1k events :( | 01:56 |
*** nati_ueno has quit IRC | 01:57 | |
jeblair | i have attached a debugger. | 01:58 |
jeblair | i need to get a stack trace, but the last time i tried that with gdb, the old trick i used to use didn't work | 01:59 |
clarkb | gdb or pdb? | 02:00 |
jeblair | gdb | 02:00 |
jeblair | can you attach pdb to a running process? | 02:00 |
Alex_Gaynor | attach a gdb, acquire the GIL, use pdb :) | 02:00 |
jeblair | Alex_Gaynor: do you have instructions for that? | 02:01 |
dstufft | you'll have to teach me how to do that someday Alex_Gaynor | 02:01 |
Alex_Gaynor | if it's a recent gcc there's actually a python embedded that lets you do stuff | 02:01 |
Alex_Gaynor | gdb&\ | 02:01 |
Alex_Gaynor | gdb*\ | 02:01 |
Alex_Gaynor | http://wiki.python.org/moin/DebuggingWithGdb has some details | 02:02 |
jeblair | Alex_Gaynor: afaict, the 'py-bt' thing is a fedora-ism | 02:02 |
jeblair | https://fedoraproject.org/wiki/Features/EasierPythonDebugging#New_gdb_commands | 02:02 |
Alex_Gaynor | jeblair: it was originally developed by a redhat person for fedora, but it's upstream now | 02:02 |
jeblair | oh. this is on precise | 02:03 |
Alex_Gaynor | maybe debian/friends don't compile with the needed flags or something :( | 02:03 |
jeblair | Alex_Gaynor: i think those are extra gdb commands | 02:03 |
*** rfolco has quit IRC | 02:04 | |
jeblair | ah, they are also in the precise python dbg package | 02:04 |
fungi | load average on review.o.o has collapsed too now | 02:04 |
* fungi needs to head out to a dinner reservation. bbl | 02:05 | |
Alex_Gaynor | I need to head home from the office because at some point it became 7PM, I'll be around more when I'm home | 02:05 |
*** rfolco has joined #openstack-infra | 02:05 | |
clarkb | jeblair: anything else I can be doing now to help? | 02:09 |
jeblair | clarkb: i'm still unable to get a stacktrace. 'py-bt' just says (unable to read python frame information) for every frame | 02:10 |
jeblair | clarkb: figuring out how to get a stacktrace from a running python on ubuntu precise is what i'm working on now. any help there would be appreciated | 02:11 |
*** yaguang has joined #openstack-infra | 02:11 | |
clarkb | jeblair: ok | 02:12 |
jeblair | clarkb: apparently those macros expect to be run with python-dbg, which of course is not how we started zuul | 02:13 |
clarkb | jeblair: http://www.python.org/~jeremy/weblog/031003.html not quite a stack trace but possibly useful | 02:13 |
*** xBsd has joined #openstack-infra | 02:16 | |
clarkb | jeblair: also http://svn.python.org/projects/python/trunk/Misc/gdbinit | 02:17 |
jeblair | clarkb: i think the objects have changed since then | 02:17 |
clarkb | jeblair: that gdbinit comes with a pystack function | 02:18 |
*** ^demon has quit IRC | 02:20 | |
jeblair | clarkb: No symbol "co" in current context. | 02:20 |
jeblair | clarkb: these all seem to be obsolete. | 02:20 |
clarkb | :( yeah they are fairly old | 02:20 |
* clarkb finds python2.7 branch | 02:21 | |
*** lbragstad has quit IRC | 02:22 | |
jeblair | clarkb: i think it's due to gcc optimizations | 02:23 |
clarkb | jeblair: http://hg.python.org/cpython/file/c048b211f634/Misc/gdbinit doesn't seem different but I haven't actually diffed them | 02:23 |
clarkb | jeblair: ah so the symbols just don't exist because gcc | 02:23 |
jeblair | i wonder if we could even do Alex_Gaynor's pdb trick with the current level of symbol mangling | 02:25 |
Alex_Gaynor | jeblair: if you can grab the GIL and use C to execute a simple string it should be possible | 02:26 |
jeblair | Alex_Gaynor: that sounds easy but i have no idea how to go about that | 02:26 |
Alex_Gaynor | When I'm at a computer and not my phone I'll try to find a reference | 02:27 |
mordred | jeblair: I'm here - I do not know what I can do to be helpful to you | 02:30 |
clarkb | mordred: we need a stacktrace from running zuul | 02:30 |
mordred | http://www.jmcneil.net/2012/04/debugging-your-python-with-gdb-ftw/ | 02:31 |
mordred | reading this now | 02:31 |
jeblair | mordred: my understanding of that is that it does not work because of gcc optimizations | 02:32 |
mordred | jeblair: yeah. I believe you are correct | 02:32 |
mordred | btw - symbol stripping, which debian is obsessed with, has no real noticeable benefit most times | 02:33 |
mordred | and screws you in times like this | 02:33 |
mordred | jeblair: have you installed python-dbg? sometimes deb packages extract the symbols and put them into external files | 02:33 |
jeblair | thanks debian! | 02:33 |
jeblair | mordred: yes i have | 02:33 |
mordred | and gdb can be told to load them as symbol maps | 02:33 |
mordred | let me see if i can get some info on that | 02:34 |
jeblair | mordred: that made the backtraces look like this: #33 0x0000000000466a42 in PyEval_EvalFrameEx () | 02:34 |
jeblair | mordred: but still no understanding of arguments or local variables | 02:34 |
*** eharney has joined #openstack-infra | 02:34 | |
mordred | so "p *co" does nothing | 02:35 |
jeblair | No symbol "co" in current context. | 02:35 |
mordred | awesome | 02:35 |
jeblair | so, we could call this a wash | 02:36 |
*** dina_belova has joined #openstack-infra | 02:36 | |
jeblair | and restart zuul using the 'python-dbg' binary | 02:36 |
mordred | oh - wait | 02:36 |
mordred | there's a thing dhellman tweeted about the other day | 02:37 |
*** jfriedly has quit IRC | 02:37 | |
clarkb | this must be why people gentoo | 02:37 |
jeblair | and if it happens again, we'd be in a better place (no idea what that would do to performance though, since i think it is doing refcount debugging as well) | 02:37 |
jeblair | mordred: that's exciting; i'm holding for your tweet | 02:37 |
jeblair | (i'll be really excited if the actual method is less than 140 characters) | 02:37 |
mordred | ok. I don't think this is it, but, while I'm looking, look at: https://github.com/albertz/pydbattach | 02:38 |
*** rfolco has quit IRC | 02:38 | |
*** dina_belova has quit IRC | 02:38 | |
jeblair | mordred: wilco | 02:38 |
*** mriedem has joined #openstack-infra | 02:40 | |
jeblair | mordred: neat, but it's complicated, and i don't really want to audit it or compile/run it on our server right now | 02:41 |
mordred | jeblair: ok. that's the closest thing I can find right now | 02:41 |
mordred | I think calling it a wash and restarting zuul with python-dbg is our best bet | 02:42 |
clarkb | wfm | 02:42 |
clarkb | not elegant, but if it keeps things moving... | 02:42 |
*** bingbu has joined #openstack-infra | 02:43 | |
jeblair | okay that's clearly more complicated than it seems | 02:46 |
jeblair | ImportError: /usr/local/lib/python2.7/dist-packages/Crypto/Util/_counter.so: undefined symbol: Py_InitModule4_64 | 02:46 |
jeblair | ok, so i can just restart it as normal, and add some more debug lines to it i guess. | 02:47 |
jeblair | maybe add a jenkins style "threadDump" command. won't that just be the best? | 02:48 |
jeblair | zuul has been restarted. it has no queue. | 02:48 |
*** mriedem has quit IRC | 02:49 | |
*** pcrews has quit IRC | 02:49 | |
clarkb | jeblair: that will work too | 02:49 |
Alex_Gaynor | well that doesn't sound good | 02:49 |
mordred | jeblair: sigh. I believe, now that you mention it, to use python-dbg, you will need -dbg versions of all of the c-based python libraries you might have installed | 02:49 |
mordred | in addition to the -dbg versions of the c libraries they depend on | 02:49 |
jeblair | mordred: let's move all our servers to rhel | 02:50 |
mordred | jeblair: ok | 02:50 |
clarkb | jeblair: or gentoo | 02:50 |
mordred | jeblair: or gentoo - and we can compile from source ourselves | 02:50 |
jeblair | mordred: https://bugs.launchpad.net/nova/+bug/937554/comments/13 | 02:51 |
uvirtbot | Launchpad bug 937554 in nova "Lots of problems with deleting a server immediately after create (dup-of: 934575)" [High,Fix committed] | 02:51 |
uvirtbot | Launchpad bug 934575 in nova "notifier endless loops in is_primitive" [Medium,Fix released] | 02:51 |
*** eharney has quit IRC | 02:51 | |
* mordred is looking at the debian packaging and cannot figure out why stack information is missing in the normal python | 02:51 | |
*** melwitt has quit IRC | 02:51 | |
mordred | they aren't passing stupid optimizer flags | 02:51 |
jeblair | handy instructions for building your own python, in a nova bug report no less! | 02:51 |
jeblair | mordred: " | 02:52 |
jeblair | #Recompiling python with make "CFLAGS=-g -fno-inline -fno-strict-aliasing" solves this problem. | 02:52 |
jeblair | mordred: ^ from that bug report; that help? | 02:52 |
mordred | ahhhh | 02:52 |
mordred | yes | 02:52 |
mordred | -fno-inline | 02:52 |
mordred | I forgot - python actually has a bunch of stuff defined in header files | 02:52 |
mordred | so -O2 is going to wind up inlining the shit out of it | 02:52 |
mordred | -O2 includes -finline-small-functions | 02:55 |
mordred | -O0, which python-dbg is compiled with, does not | 02:55 |
mordred | they're all compiled with -g but then dh_strip puts the symbols into python-dbg | 02:55 |
mordred | none of that is helpful here | 02:56 |
*** afazekas_zz is now known as __afazekas_zz | 03:02 | |
jeblair | i have reverified all the changes that were approved and did not have a vrfy-2 | 03:04 |
*** rcleere has joined #openstack-infra | 03:04 | |
*** markmcclain has quit IRC | 03:05 | |
jeblair | i have had a very long day and am not useful. tomorrow i intend to work on nodepool. if anyone wants to add some more debugging or a threadDump feature to zuul, that would be great; otherwise, i'll get to it later this week | 03:06 |
jeblair | also, i'm thinking we should have the gearman-plugin stop seding work status packets. | 03:06 |
jeblair | sending | 03:06 |
*** Ryan_Lane has joined #openstack-infra | 03:07 | |
Alex_Gaynor | so, are no builds happening right now? | 03:07 |
clarkb | I can look into zuul threaddumps | 03:07 |
clarkb | after I propose a change to add mysql backups (that should be quick) | 03:07 |
jeblair | Alex_Gaynor: i restarted zuul, should be running now | 03:07 |
Alex_Gaynor | jeblair: there doesn't appear to be anything on http://status.openstack.org/zuul/ | 03:07 |
clarkb | jeblair: are work status packets causing problems? | 03:07 |
clarkb | Alex_Gaynor: refresh? there is stuff for me | 03:07 |
*** pcrews has joined #openstack-infra | 03:08 | |
jeblair | Alex_Gaynor: you may need to reload it? | 03:08 |
Alex_Gaynor | I don't even know. I hate browsers. | 03:08 |
jeblair | clarkb: no, but we ignore them. just busy work. | 03:08 |
mordred | jeblair: oh, for some reason I thought we were using them for status bars - I agree with anything you say | 03:12 |
clarkb | mordred: that is what I thought they were for too | 03:12 |
clarkb | and isn't zuul LOST status the result of not getting a status from gearman? | 03:13 |
*** erfanian has quit IRC | 03:13 | |
jeblair | mordred: we could. what we actually do is grab the estimated time from the first one and then calculate it ourselves. | 03:13 |
mordred | jeblair: ah. nice | 03:13 |
jeblair | clarkb: no, it polls gearman to see if the job is still in the queue. that would be a reasonable thing to do though... | 03:14 |
jeblair | clarkb: it would have helped with the jobs that got stuck in the jenkins queue and never ran | 03:14 |
jeblair | clarkb: maybe we should keep it and just reduce the logging. | 03:14 |
clarkb | ++ | 03:14 |
jeblair | i've seen several jobs lost because of errors like this: https://jenkins02.openstack.org/job/gate-grenade-devstack-vm/2370/console | 03:15 |
jeblair | i have no idea what's going on there. perhaps a dead slave (nodepool does not have a periodic job to recheck ssh access) | 03:15 |
jeblair | but it seems to happen a lot for that. | 03:15 |
Alex_Gaynor | For all the jobs that were lost when zuul was restarted, are the patch authors notified so they can recheck/reverify? | 03:16 |
clarkb | Alex_Gaynor: no, but I think jeblair indicated he did it for them | 03:16 |
jeblair | Alex_Gaynor: i reverified the ones that were approved; | 03:16 |
Alex_Gaynor | Oh, that's nice of you! | 03:17 |
jeblair | I have not done rechecks. | 03:17 |
jeblair | it's hard to get a gerrit query for that. | 03:17 |
Alex_Gaynor | Things that don't have a current status from jenkins | 03:17 |
Alex_Gaynor | gerrit doesn't have an easy way to do that? :( | 03:17 |
mordred | -label:Verified<=2 will get you the ones that are completely new - but it's hard to get the ones that may have had a new patchset uploaded since the last time they were check verified | 03:19 |
mordred | because we don't clear the verified status on the start of a new check job like we do for the gate | 03:19 |
mordred | actually, you'd want -label:Verified<=2 -label:Approved for the first one, to make sure that you're not catching a thing that the gate has cleared the verified vote | 03:20 |
mordred | but still, you're still missing a ton there | 03:21 |
*** HenryG has quit IRC | 03:25 | |
*** zul has quit IRC | 03:31 | |
*** cthulhup has joined #openstack-infra | 03:33 | |
*** cthulhup has quit IRC | 03:37 | |
*** dina_belova has joined #openstack-infra | 03:37 | |
*** dina_belova has quit IRC | 03:42 | |
*** afazekas has joined #openstack-infra | 03:42 | |
*** boris-42 has joined #openstack-infra | 03:49 | |
bodepd | fungi: I just went through the following process: https://gist.github.com/bodepd/6276932 | 03:52 |
bodepd | fungi: and my redirects worked as expected. I did, however, use github's GUI, and I am not sure what process was used by your team | 03:53 |
*** xBsd has quit IRC | 03:53 | |
*** jfriedly has joined #openstack-infra | 03:55 | |
*** wenlock has joined #openstack-infra | 03:56 | |
*** mberwanger has joined #openstack-infra | 03:59 | |
*** yaguang has quit IRC | 03:59 | |
*** vogxn has joined #openstack-infra | 04:01 | |
*** michchap_ has joined #openstack-infra | 04:04 | |
*** michchap has quit IRC | 04:08 | |
*** yaguang has joined #openstack-infra | 04:12 | |
*** ftcjeff has quit IRC | 04:23 | |
*** wenlock has quit IRC | 04:24 | |
*** dims has quit IRC | 04:25 | |
*** cthulhup has joined #openstack-infra | 04:27 | |
*** cthulhup has quit IRC | 04:31 | |
*** dina_belova has joined #openstack-infra | 04:38 | |
*** mberwanger has quit IRC | 04:38 | |
*** dina_belova has quit IRC | 04:42 | |
*** xBsd has joined #openstack-infra | 04:47 | |
*** reed has quit IRC | 04:53 | |
*** yaguang has quit IRC | 04:59 | |
*** rcleere has quit IRC | 05:03 | |
fungi | bodepd: yeah, mordred did the stackforge/puppet-{quantum,neutron} move, but not sure what he did in github land for it. our http://ci.openstack.org/gerrit.html#renaming-a-project recipe suggests "12. Rename the project in GitHub..." so i would assume that's what he did | 05:07 |
*** dmakogon_ has joined #openstack-infra | 05:08 | |
*** yaguang has joined #openstack-infra | 05:12 | |
*** cthulhup has joined #openstack-infra | 05:21 | |
*** SergeyLukjanov has joined #openstack-infra | 05:24 | |
*** cthulhup has quit IRC | 05:25 | |
*** nicedice_ has quit IRC | 05:29 | |
mordred | fungi, bodepd I'm pretty sure I just deleted the old project and let the new project be created by manage_projects | 05:34 |
*** dina_belova has joined #openstack-infra | 05:38 | |
*** dina_belova has quit IRC | 05:43 | |
*** thomasbiege has joined #openstack-infra | 05:48 | |
*** DennyZhang has joined #openstack-infra | 05:55 | |
*** mikal has quit IRC | 05:55 | |
*** thomasbiege1 has joined #openstack-infra | 05:59 | |
*** thomasbiege has quit IRC | 06:02 | |
*** thomasbiege1 has quit IRC | 06:13 | |
*** cthulhup has joined #openstack-infra | 06:15 | |
*** thomasbiege has joined #openstack-infra | 06:17 | |
*** cthulhup has quit IRC | 06:20 | |
*** dina_belova has joined #openstack-infra | 06:39 | |
*** dina_belova has quit IRC | 06:43 | |
*** tian has quit IRC | 06:44 | |
*** nayward has joined #openstack-infra | 06:47 | |
*** fbo is now known as fbo_away | 06:49 | |
*** SergeyLukjanov has quit IRC | 06:50 | |
*** jfriedly has quit IRC | 06:52 | |
bodepd | mordred: :( . I'm trying to reach out to some folks at github to see if they can help us set up those redirects | 06:57 |
bodepd | mordred: I may need someone with actual credentials to approve it once I get a hold of the right person | 06:58 |
*** michchap has joined #openstack-infra | 07:00 | |
*** xchu has quit IRC | 07:00 | |
*** michchap_ has quit IRC | 07:02 | |
*** cthulhup has joined #openstack-infra | 07:09 | |
*** SergeyLukjanov has joined #openstack-infra | 07:11 | |
*** xchu has joined #openstack-infra | 07:12 | |
*** cthulhup has quit IRC | 07:14 | |
*** SergeyLukjanov has quit IRC | 07:14 | |
*** ruhe has joined #openstack-infra | 07:26 | |
*** pblaho has joined #openstack-infra | 07:29 | |
*** boris-42 has quit IRC | 07:34 | |
*** SergeyLukjanov has joined #openstack-infra | 07:38 | |
*** dina_belova has joined #openstack-infra | 07:39 | |
*** michchap has quit IRC | 07:39 | |
*** michchap has joined #openstack-infra | 07:39 | |
odyi | bodepd: Simply contacting GitHub support had really good turnaround on the redirects from puppetlabs/puppetlabs-* to stackforge/puppet-*. | 07:41 |
odyi | They manually put them in long before I actually deleted the repositories. | 07:42 |
*** odyssey4me3 has joined #openstack-infra | 07:47 | |
odyi | The "Approved" label that seems to be a part of each Gerrit project. What is it used for? Gerrit docs don't make mention of it so I assume it is a custom label. | 07:48 |
* odyi also couldn't find it mentioned in any of the OpenStack/Gerrit workflow docs. | 07:50 | |
*** michchap has quit IRC | 07:52 | |
*** morganfainberg is now known as morganfainberg|a | 07:55 | |
*** DennyZhang has quit IRC | 07:56 | |
*** SergeyLukjanov has quit IRC | 08:00 | |
*** vogxn has quit IRC | 08:03 | |
*** cthulhup has joined #openstack-infra | 08:03 | |
*** jpich has joined #openstack-infra | 08:04 | |
*** derekh has joined #openstack-infra | 08:06 | |
*** fbo_away is now known as fbo | 08:07 | |
*** cthulhup has quit IRC | 08:08 | |
*** xchu has quit IRC | 08:09 | |
*** alex_dolby has joined #openstack-infra | 08:15 | |
*** jhesketh has quit IRC | 08:16 | |
alex_dolby | hi guys.. i am running tox -epy26 in the python-novaclient component and getting an error about pbr versions | 08:17 |
alex_dolby | pbr in setup.py and requirements.txt has different versions.. | 08:18 |
alex_dolby | any pointers? | 08:18 |
*** mkerrin has quit IRC | 08:20 | |
*** dina_belova has quit IRC | 08:21 | |
*** ladquin has quit IRC | 08:24 | |
*** xchu has joined #openstack-infra | 08:26 | |
*** psedlak has joined #openstack-infra | 08:27 | |
*** SergeyLukjanov has joined #openstack-infra | 08:27 | |
*** boris-42 has joined #openstack-infra | 08:40 | |
*** cthulhup has joined #openstack-infra | 08:57 | |
*** vogxn has joined #openstack-infra | 09:02 | |
*** cthulhup has quit IRC | 09:02 | |
*** arezadr has quit IRC | 09:12 | |
*** dina_belova has joined #openstack-infra | 09:22 | |
*** dina_belova has quit IRC | 09:26 | |
*** bingbu has quit IRC | 09:27 | |
*** SergeyLukjanov has quit IRC | 09:32 | |
*** dina_belova has joined #openstack-infra | 09:32 | |
*** dina_belova has quit IRC | 09:34 | |
*** dina_belova has joined #openstack-infra | 09:35 | |
*** yaguang has quit IRC | 09:43 | |
*** odyssey4me3 has quit IRC | 09:54 | |
*** xchu has quit IRC | 10:03 | |
*** odyssey4me3 has joined #openstack-infra | 10:05 | |
*** dina_belova has quit IRC | 10:09 | |
*** LinuxJedi has quit IRC | 10:09 | |
*** ruhe has quit IRC | 10:12 | |
*** alexpilotti has joined #openstack-infra | 10:12 | |
*** odyssey4me3 has quit IRC | 10:17 | |
*** LinuxJedi has joined #openstack-infra | 10:20 | |
*** ruhe has joined #openstack-infra | 10:21 | |
*** odyssey4me3 has joined #openstack-infra | 10:24 | |
*** thomasbiege has quit IRC | 10:28 | |
*** SergeyLukjanov has joined #openstack-infra | 10:37 | |
*** nayward has quit IRC | 10:45 | |
dhellmann | mordred, jeblair : were you looking for https://github.com/dhellmann/smiley/ last night? | 10:49 |
*** mkerrin has joined #openstack-infra | 10:52 | |
*** nayward has joined #openstack-infra | 10:52 | |
*** markmc has joined #openstack-infra | 10:56 | |
*** thomasbiege has joined #openstack-infra | 11:02 | |
*** dina_belova has joined #openstack-infra | 11:09 | |
jswarren | After the glanceclient fix yesterday, I reviewed three changes with "recheck no bug" about 12 hours ago. Jenkins has not re-reviewed them yet. Anything else I need to do? | 11:09 |
jswarren | For example: https://review.openstack.org/#/c/40232/ | 11:11 |
*** SergeyLukjanov has quit IRC | 11:12 | |
*** dina_belova has quit IRC | 11:14 | |
*** vogxn has quit IRC | 11:16 | |
*** lcestari has joined #openstack-infra | 11:17 | |
*** vogxn has joined #openstack-infra | 11:18 | |
*** zul has joined #openstack-infra | 11:19 | |
*** dina_belova has joined #openstack-infra | 11:19 | |
*** dims has joined #openstack-infra | 11:20 | |
*** dina_belova has quit IRC | 11:24 | |
*** nayward has quit IRC | 11:29 | |
*** weshay has joined #openstack-infra | 11:31 | |
*** vogxn has quit IRC | 11:31 | |
*** ruhe has quit IRC | 11:32 | |
*** SergeyLukjanov has joined #openstack-infra | 11:39 | |
*** nayward has joined #openstack-infra | 11:41 | |
*** SergeyLukjanov has quit IRC | 11:44 | |
*** zul has quit IRC | 11:46 | |
*** pcm_ has joined #openstack-infra | 11:46 | |
*** vogxn has joined #openstack-infra | 11:46 | |
*** HenryG has joined #openstack-infra | 11:49 | |
openstackgerrit | Julien Danjou proposed a change to openstack/requirements: Add gevent https://review.openstack.org/42871 | 11:50 |
*** jjmb1 has quit IRC | 11:58 | |
*** afazekas is now known as afazekas_no_irq | 11:59 | |
*** yaguang has joined #openstack-infra | 12:02 | |
*** ruhe has joined #openstack-infra | 12:06 | |
*** rfolco has joined #openstack-infra | 12:07 | |
*** alex_dolby has quit IRC | 12:09 | |
*** vogxn has quit IRC | 12:12 | |
*** apcruz has joined #openstack-infra | 12:18 | |
*** mriedem has joined #openstack-infra | 12:19 | |
*** dina_belova has joined #openstack-infra | 12:20 | |
*** sandywalsh has quit IRC | 12:22 | |
*** sandywalsh has joined #openstack-infra | 12:24 | |
*** dina_belova has quit IRC | 12:25 | |
*** anteaya has joined #openstack-infra | 12:27 | |
*** SergeyLukjanov has joined #openstack-infra | 12:35 | |
*** ruhe has quit IRC | 12:36 | |
*** zul has joined #openstack-infra | 12:37 | |
*** dims has quit IRC | 12:38 | |
*** dprince has joined #openstack-infra | 12:39 | |
*** dkranz has joined #openstack-infra | 12:39 | |
*** dims has joined #openstack-infra | 12:40 | |
*** dina_belova has joined #openstack-infra | 12:43 | |
zul | so im curious why jenkins hasnt been triggered for https://review.openstack.org/#/c/41093/ and https://review.openstack.org/#/c/42789/ | 12:44 |
*** ruhe has joined #openstack-infra | 12:47 | |
markmc | zul, you know, I think I'm seeing this too with my nova reviews | 12:47 |
* markmc looks | 12:47 | |
*** dina_belova has quit IRC | 12:47 | |
*** SergeyLukjanov has quit IRC | 12:47 | |
markmc | zul, ok, not seeing it now - but think I saw zuul missing some submissions yesterday | 12:48 |
zul | hmmm | 12:48 |
zul | is there a way to kick them off again? | 12:49 |
markmc | looks like recheck doesn't work, I don't know of another way | 12:51 |
markmc | just change the commit message of the first patch and re-submit | 12:51 |
zul | ok | 12:55 |
*** dkranz has quit IRC | 12:55 | |
anteaya | markmc zul there were issues yesterday with jenkins. The best I understand is that jenkins was ddosing our git server and there was much work to bring about a resolution. Reading the logs, I can not definitively point to a solution that was found. What you are seeing _may_ be related. | 13:00 |
markmc | ok, thanks | 13:01 |
*** jog0 is now known as jog0-away | 13:01 | |
zul | anteaya: cool thanks | 13:01 |
anteaya | np | 13:01 |
*** mberwanger has joined #openstack-infra | 13:01 | |
*** adalbas has quit IRC | 13:03 | |
*** kiall has quit IRC | 13:08 | |
*** dkliban has quit IRC | 13:11 | |
*** changbl has quit IRC | 13:12 | |
*** whayutin_ has joined #openstack-infra | 13:14 | |
*** weshay has quit IRC | 13:16 | |
*** xchu has joined #openstack-infra | 13:20 | |
*** w_ has quit IRC | 13:23 | |
*** sgviking has quit IRC | 13:25 | |
*** sgviking has joined #openstack-infra | 13:25 | |
*** sgviking has quit IRC | 13:26 | |
*** sgviking has joined #openstack-infra | 13:26 | |
*** lbragstad has joined #openstack-infra | 13:27 | |
*** HenryG has quit IRC | 13:27 | |
*** michchap has joined #openstack-infra | 13:30 | |
*** mberwanger has quit IRC | 13:35 | |
*** prad_ has joined #openstack-infra | 13:37 | |
*** burt has joined #openstack-infra | 13:42 | |
*** thomasbiege2 has joined #openstack-infra | 13:43 | |
mordred | dhellmann: yes | 13:44 |
*** cppcabrera has joined #openstack-infra | 13:45 | |
*** thomasbiege has quit IRC | 13:46 | |
jd__ | ttx, mordred, dhellmann, whoever, I'd need https://review.openstack.org/#/c/42871/ quickly to unblock Ceilometer CI failing | 13:46 |
jd__ | zul: ^ | 13:46 |
mordred | jd__: can you point me to the failing thing? | 13:46 |
ttx | jd__: looks like I don't have +2 on requirements | 13:47 |
* mordred wants to understand why our mirror builder isn't picking it up | 13:47 | |
ttx | mordred: I thought I had, but meh | 13:47 |
jd__ | mordred: http://logs.openstack.org/46/42846/1/check/gate-ceilometer-python27/caaca73/console.html.gz | 13:47 |
mordred | thank you | 13:47 |
jd__ | mordred: Pymongo does not specify the dependency… | 13:48 |
ttx | I can certainly spare the effort | 13:48 |
mordred | jd__: o m g | 13:48 |
mordred | jd__: SERIOUSLY? | 13:48 |
mordred | I hate people | 13:48 |
*** dina_belova has joined #openstack-infra | 13:48 | |
jd__ | I couldn't agree more | 13:48 |
jd__ | I've opened a ticket upstream https://jira.mongodb.org/browse/PYTHON-558 | 13:48 |
mordred | aprvd | 13:48 |
ttx | mordred: was I supposed to have +2 on requirements ? I forget what we originally said (discovered recently I wasn't subscribed to it) | 13:49 |
mordred | ttx: I'm happy to give you +2 on them - makes sense for you to have it | 13:49 |
*** whayutin_ has quit IRC | 13:49 | |
jd__ | +1 :) | 13:50 |
ttx | can't remember if I signed up for it or not | 13:51 |
*** dina_belova has quit IRC | 13:52 | |
mordred | jd__: ok- there is feedback on that bug... | 13:53 |
ttx | mordred: let me watch the reviews for some time to see if I actually care enough | 13:53 |
jd__ | mordred: just saw, I'm responding | 13:53 |
mordred | jd__: I did too | 13:53 |
jd__ | ah | 13:53 |
* jd__ lags | 13:53 | |
mordred | jd__: "gevent doesn't support python 3 or pypy" -- is there an internal feature of pymongo that you're using that's going to get us in trouble with python 3 and pypy support? | 13:54 |
jd__ | mordred: no, we use nothing fancy | 13:54 |
mordred | k. cool | 13:54 |
mordred | I'll be interested to see what's going on here | 13:54 |
jd__ | that's why I'm surprised we see errors about gevent now that we pull pymongo 2.6 | 13:54 |
*** michchap has quit IRC | 13:57 | |
*** dina_belova has joined #openstack-infra | 13:58 | |
*** ftcjeff has joined #openstack-infra | 13:59 | |
openstackgerrit | Russell Bryant proposed a change to openstack-infra/config: Disable tempest in the cells job temporarily https://review.openstack.org/42898 | 14:00 |
*** weshay has joined #openstack-infra | 14:01 | |
*** vogxn has joined #openstack-infra | 14:01 | |
jd__ | ah now that talks about greenlet and I'm going to be lost in that again | 14:02 |
* jd__ runs | 14:02 | |
*** dina_belova has quit IRC | 14:03 | |
openstackgerrit | Russell Bryant proposed a change to openstack-infra/config: Disable tempest in the cells job https://review.openstack.org/42898 | 14:06 |
*** dkliban has joined #openstack-infra | 14:08 | |
*** xBsd has quit IRC | 14:10 | |
*** xBsd has joined #openstack-infra | 14:10 | |
*** xBsd has quit IRC | 14:12 | |
mordred | jd__: I think we can remove use_greenlets | 14:15 |
mordred | jd__: "If you need to use standard Python threads in the same process as Gevent and greenlets" | 14:16 |
jd__ | indeed, we don't use threads so that should be ok I guess | 14:16 |
markmc | are you sure? | 14:16 |
markmc | libraries have been known to spawn random threads :) | 14:17 |
*** pabelanger has quit IRC | 14:17 | |
jd__ | now I'm unsure and scared | 14:17 |
mordred | markmc: well, let's solve that problem when we come to it for real - keeping the option means we're adding another python3 incompatibility | 14:17 |
mriedem | dhellmann: ping | 14:17 |
mordred | jd__: can we try a patch to ceilometer that removes the option? | 14:18 |
markmc | jd__, cooperative coroutines mumble mumble ... oh, look over there! | 14:18 |
jd__ | mordred: sure, it'll take me a sec' | 14:18 |
mordred | jd__: sending patch in... | 14:18 |
jd__ | mordred: cool | 14:18 |
mordred | jd__: https://review.openstack.org/42906 | 14:20 |
*** changbl has joined #openstack-infra | 14:20 | |
jd__ | mordred: ack, approving, if Jenkins' happy, we'll be too | 14:21 |
mordred | great! | 14:21 |
jd__ | and we'll be able to revert gevent fortunately | 14:21 |
*** xBsd has joined #openstack-infra | 14:21 | |
mordred | I already blocked that from merging | 14:21 |
mordred | and https://jira.mongodb.org/browse/PYTHON-558?focusedCommentId=407277#comment-407277 for anyone who wants to play along | 14:22 |
mriedem | dhellmann: nevermind | 14:22 |
*** odyssey4me3 has quit IRC | 14:22 | |
ttx | nice turnaround on that bug report | 14:22 |
mordred | dhellmann: I'm reading the mailing list as being in approval of giving Alex_Gaynor +2 on requirements... | 14:23 |
mordred | dhellmann: shall we make that happen? | 14:23 |
*** beagles has joined #openstack-infra | 14:28 | |
*** thomasbiege2 has quit IRC | 14:28 | |
*** rcleere has joined #openstack-infra | 14:32 | |
*** mrmartin has joined #openstack-infra | 14:33 | |
*** gordc has joined #openstack-infra | 14:35 | |
*** markmcclain has joined #openstack-infra | 14:37 | |
*** ruhe has quit IRC | 14:38 | |
gordc | hi folks, would anyone happen to know when the cron job runs to update CI mirror? i just made a release for a lib and was wondering when jenkins would pick it up ... or if i could force it to get picked up. | 14:38 |
*** datsun180b has joined #openstack-infra | 14:40 | |
*** yaguang has quit IRC | 14:40 | |
*** __afazekas_zz has quit IRC | 14:41 | |
mordred | gordc: it runs after we land requirements changes - which lib? is it a thing that we should raise the min in openstack/requirements for? | 14:47 |
*** odyssey4me4 has joined #openstack-infra | 14:47 | |
*** senk has joined #openstack-infra | 14:47 | |
*** derekh has quit IRC | 14:50 | |
gordc | mordred: it's for pycadf library (a new lib for audit data) -- i did not include a min since some changes were still being made around the time it was added | 14:50 |
*** SergeyLukjanov has joined #openstack-infra | 14:51 | |
*** dina_belova has joined #openstack-infra | 14:57 | |
*** david-lyle has quit IRC | 14:57 | |
*** cthulhup has joined #openstack-infra | 14:58 | |
*** sandywalsh has quit IRC | 15:01 | |
*** ryanpetrello has joined #openstack-infra | 15:03 | |
*** wu_wenxiang has joined #openstack-infra | 15:04 | |
mriedem | gordc: hey, i noticed that this didn't automatically change the status/assignee of the bug in launchpad: https://review.openstack.org/#/c/42904/ | 15:04 |
mriedem | was going to ask dhellmann if the pycadf project is hooked up to launchpad via gerrit for status changes | 15:05 |
gordc | mriedem: it probably isn't hooked up correctly. i created the launchpad project so good chance i mucked it up :) | 15:06 |
wu_wenxiang | I found some commits that haven't started check for a long time, for example: https://review.openstack.org/#/c/38963/ and https://review.openstack.org/#/c/42794/ | 15:06 |
*** ruhe has joined #openstack-infra | 15:08 | |
*** sridevi has joined #openstack-infra | 15:08 | |
*** xchu has quit IRC | 15:08 | |
jeblair | wu_wenxiang: leave a comment with "recheck no bug"; we had to restart zuul yesterday and it lost its queue | 15:09 |
sridevi | Hi could someone help me with https://review.openstack.org/#/c/34801/ | 15:09 |
sridevi | I see "ERROR:root:Could not find any typelib for GnomeKeyring" failures | 15:10 |
*** ^d has joined #openstack-infra | 15:12 | |
*** xBsd has quit IRC | 15:12 | |
wu_wenxiang | jeblair: Thanks | 15:12 |
*** SlickNik has quit IRC | 15:13 | |
*** vogxn has quit IRC | 15:13 | |
*** SlickNik has joined #openstack-infra | 15:14 | |
*** pabelanger has joined #openstack-infra | 15:15 | |
*** wu_wenxiang has quit IRC | 15:16 | |
*** david-lyle has joined #openstack-infra | 15:17 | |
*** sandywalsh has joined #openstack-infra | 15:17 | |
ryanpetrello | jeblair: Can I bug you to take a peek at this review? https://review.openstack.org/#/c/42685/2 | 15:17 |
*** UtahDave has joined #openstack-infra | 15:19 | |
ryanpetrello | or clarkb for that matter | 15:19 |
*** ruhe has quit IRC | 15:20 | |
jeblair | ryanpetrello: i'm hacking on a fix for a production problem we've been having right now, but i will make it a point to review it today if the rest of the team hasn't taken care of it | 15:21 |
ryanpetrello | thanks | 15:21 |
ryanpetrello | this obviously takes a back seat :) | 15:21 |
*** ruhe has joined #openstack-infra | 15:22 | |
openstackgerrit | gordon chung proposed a change to openstack/requirements: assign a min version to pycadf https://review.openstack.org/42923 | 15:23 |
*** reed has joined #openstack-infra | 15:23 | |
*** dina_belova has quit IRC | 15:24 | |
*** sridevi has quit IRC | 15:24 | |
mordred | ryanpetrello: done | 15:30 |
ryanpetrello | jeblair: Monty approved it, thanks | 15:30 |
mordred | jeblair: anything I can do to help you? | 15:30 |
ryanpetrello | (thanks) | 15:30 |
openstackgerrit | A change was merged to openstack-infra/config: Add WSME to StackForge. https://review.openstack.org/42685 | 15:36 |
*** nayward has quit IRC | 15:39 | |
*** afazekas_no_irq is now known as afazekas | 15:42 | |
*** thomasbiege has joined #openstack-infra | 15:42 | |
*** vogxn has joined #openstack-infra | 15:43 | |
NobodyCam | jeblair: seems we have no core members on stackforge/pyghmi we did before the rename | 15:45 |
*** rnirmal has joined #openstack-infra | 15:46 | |
*** zehicle_at_dell has joined #openstack-infra | 15:47 | |
mordred | NobodyCam: looking | 15:49 |
NobodyCam | thank you mordred :) | 15:50 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Rename python-impi acl file to pyghmi https://review.openstack.org/42932 | 15:51 |
NobodyCam | w00t | 15:51 |
mordred | NobodyCam: should be fixed soon | 15:51 |
*** changbl has quit IRC | 15:51 | |
NobodyCam | :) TY | 15:51 |
NobodyCam | mordred: shouldn't you be burning things about now? | 15:52 |
mordred | NobodyCam: soon | 15:52 |
NobodyCam | :) | 15:52 |
*** mrodden has quit IRC | 15:53 | |
*** davidhadas has quit IRC | 15:54 | |
*** ruhe has quit IRC | 15:55 | |
openstackgerrit | A change was merged to openstack-infra/config: Rename python-impi acl file to pyghmi https://review.openstack.org/42932 | 15:58 |
*** xBsd has joined #openstack-infra | 15:59 | |
clarkb | morning | 16:00 |
NobodyCam | good morning clarkb | 16:01 |
clarkb | mordred jeblair: which production issue? | 16:01 |
mordred | clarkb: I'm assuming the thing from yesterday | 16:01 |
*** sridevi has joined #openstack-infra | 16:02 | |
mordred | clarkb: if you have a second, a ton of these: https://review.openstack.org/#/q/watchedby:mordred%2540inaugust.com+-label:CodeReview%253C%253D-1+-label:Verified%253C%253D-1+-label:Approved%253E%253D1++-status:workinprogress+-status:draft+-is:starred+-owner:mordred%2540inaugust.com,n,z | 16:02 |
clarkb | mordred: which one :) it was like a horrible train wreck | 16:02 |
mordred | clarkb: could use a second +2 and are trivial changes | 16:02 |
*** sridevi has quit IRC | 16:03 | |
clarkb | mordred ok I have a couple things I want to fix while I am thinking of them but can look at those after | 16:03 |
mordred | clarkb: k. they're not important, but most of them are simple enough to be 'while drinking first cup of coffee' fodder | 16:03 |
clarkb | mordred jeblair what do you think of adding something like celery.contrib.rdb to zuul for stack traces and remote pdb | 16:03 |
mordred | oy | 16:04 |
mordred | something about using celery in a project that uses gear seems weird | 16:04 |
clarkb | I would simplify and vendor it | 16:04 |
*** mrodden has joined #openstack-infra | 16:04 | |
clarkb | mordred forget it is celery :) but their contrib.rdb module seems relatively decent and they have tests for it | 16:04 |
mordred | neat | 16:05 |
*** gyee has joined #openstack-infra | 16:05 | |
mordred | then why not just require celery? | 16:05 |
clarkb | we could do that too... seems heavy for something like a contrib module. I could go either way vendor or require | 16:06 |
NobodyCam | mordred: should that merge have fixed us? | 16:06 |
mordred | NobodyCam: it'll take a minute | 16:07 |
NobodyCam | ahh ok :) TY | 16:07 |
mordred | NobodyCam: we have to wait for the git pull cron followed by the puppet agent - so it could be as long as 30 minutes from merge | 16:07 |
*** jfriedly has joined #openstack-infra | 16:08 | |
*** gordc has left #openstack-infra | 16:08 | |
mordred | clarkb: also, your haproxy patch has 3 +2's : https://review.openstack.org/#/c/42784/ so I think whenever you want to land that and ride shotgun, you know, whatever | 16:09 |
*** odyssey4me4 has quit IRC | 16:11 | |
jeblair | mordred, clarkb: i am reworking nodepool (as i mentioned yesterday) | 16:11 |
*** pabelanger has quit IRC | 16:12 | |
jeblair | clarkb: the celery thing is heavyweight. i don't think we need a full remote debugger, we just need better logging, and the ability to get a stacktrace if something is stuck... | 16:13 |
clarkb | jeblair: It needs an update. because the proxy is a single source we need to bump xinetd limits... i will propose that shortly | 16:13 |
*** thomasbiege has quit IRC | 16:13 | |
pleia2 | testing 42784 here now | 16:13 |
jeblair | clarkb: and that's just for a desperate situation -- in reality we should always be able to figure out what's going on from logs. this is perhaps the first time we've been unable to do that with zuul. :( | 16:13 |
clarkb | jeblair: ok, I figured remote debugger would give us that and more, but can just log stacktraces as a start | 16:14 |
pleia2 | clarkb: there are some errors for 42784, investigating and drafting up comment now | 16:16 |
clarkb | pleia2 thanks. /me -> office | 16:16 |
openstackgerrit | Russell Bryant proposed a change to openstack-infra/config: Disable tempest in the cells job https://review.openstack.org/42898 | 16:22 |
*** boris-42 has quit IRC | 16:22 | |
*** dina_belova has joined #openstack-infra | 16:24 | |
ryanpetrello | mordred: how long does it generally take for merged openstack-infra/config projects to show up in github.com/stackforge ? | 16:26 |
mordred | ryanpetrello: usually quicker than this - let me look | 16:26 |
ryanpetrello | thx | 16:26 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Make the gitweb links in gerrit point to git.o.o https://review.openstack.org/42694 | 16:27 |
*** pabelanger has joined #openstack-infra | 16:27 | |
*** markmc has quit IRC | 16:29 | |
clarkb | pleia2 try stopping xinetd first. It has port 9418 | 16:32 |
clarkb | or rather kick it to pick up the new config | 16:33 |
*** nicedice_ has joined #openstack-infra | 16:34 | |
pleia2 | clarkb: ah, yeah! it didn't pick up the new config, restarting it then starting haproxy is fine | 16:34 |
*** xBsd has quit IRC | 16:36 | |
*** psedlak has quit IRC | 16:36 | |
clarkb | cool I will encode into puppet | 16:37 |
*** adalbas has joined #openstack-infra | 16:38 | |
*** dina_belova has quit IRC | 16:41 | |
*** pycabrera has joined #openstack-infra | 16:42 | |
*** nati_ueno has joined #openstack-infra | 16:42 | |
*** kgriffs has joined #openstack-infra | 16:42 | |
*** nati_ueno has joined #openstack-infra | 16:43 | |
*** pblaho has quit IRC | 16:43 | |
pleia2 | having some trouble getting it to clone with haproxy enabled, browsing logs | 16:43 |
kgriffs | hey guys, Kurt here from the Marconi team. We'd like to enable logging and/or meetbot for #openstack-marconi - what's the recommended way to do this? | 16:43 |
kgriffs | host it ourselves, or is there a shared bot? | 16:43 |
*** cppcabrera has quit IRC | 16:43 | |
*** pycabrera is now known as cppcabrera | 16:43 | |
pleia2 | kgriffs: there is a shared bot, hang on, I'll grab a recent review as an example | 16:44 |
annegentle | modules/gerritbot/files/gerritbot_channel_config.yaml | 16:44 |
annegentle | kgriffs: I think that's it. ^^ | 16:44 |
*** alexpilotti has quit IRC | 16:44 | |
pleia2 | kgriffs: https://review.openstack.org/#/c/41512/ | 16:44 |
annegentle | pleia2: mine's not so recent, but https://review.openstack.org/#/c/21696/ | 16:44 |
annegentle | heh | 16:44 |
pleia2 | for logging it's modules/openstack_project/manifests/eavesdrop.pp | 16:44 |
pleia2 | not gerritbot | 16:44 |
kgriffs | cool, thanks! | 16:45 |
pleia2 | gerritbot is the one that tells you updates in reviews merges and things :) | 16:45 |
kgriffs | actually, I think we are in gerritbot | 16:45 |
cppcabrera | yup, we have gerritbot running as of yesterday. :D | 16:45 |
ryanpetrello | mordred: that seemed to work, thx :) | 16:45 |
pleia2 | kgriffs: once it's in eavesdrop you get logs up on http://eavesdrop.openstack.org/ | 16:45 |
ryanpetrello | I noticed, however that one of the groups was created - https://review.openstack.org/#/admin/groups/202,members - while the other, wsme-ptl, wasn't | 16:46 |
dhellmann | mordred: I am, too. I was going to wait the number of days specified in https://wiki.openstack.org/wiki/Governance/Approved/CoreDevProcess but I don't have | 16:46 |
kgriffs | pleia2: excellent | 16:46 |
dhellmann | mordred: added Alex_Gaynor to requirements-core group in gerrit | 16:48 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/config: Proxy git-daemon with haproxy. https://review.openstack.org/42784 | 16:48 |
clarkb | pleia2: ^ slightly updated. You may want to try those settings as the xinetd ACLs are slightly relaxed to be more friendly to haproxy | 16:48 |
pleia2 | clarkb: great, thanks | 16:49 |
openstackgerrit | Monty Taylor proposed a change to openstack-dev/pbr: Rework run_shell_command https://review.openstack.org/42337 | 16:49 |
openstackgerrit | Monty Taylor proposed a change to openstack-dev/pbr: Use wheel by default https://review.openstack.org/41255 | 16:49 |
mordred | ryanpetrello: you are now in wsme-core, so you should be able to add other people | 16:51 |
mordred | as you see fix | 16:51 |
mordred | fit | 16:51 |
mordred | ryanpetrello: poking wsme-ptl | 16:51 |
*** SlickNik has quit IRC | 16:52 | |
ryanpetrello | awesome, and *thank you* | 16:52 |
mordred | NobodyCam: you should be set | 16:52 |
*** SlickNik has joined #openstack-infra | 16:52 | |
mordred | ryanpetrello: I'm excited to have wsme moved in! | 16:52 |
*** alexpilotti has joined #openstack-infra | 16:52 | |
dhellmann | mordred: cdevienne is looking forward to having more contributors :-) | 16:53 |
NobodyCam | mordred: Thank you !! | 16:53 |
mordred | dhellmann: :) | 16:53 |
*** kgriffs has left #openstack-infra | 16:53 | |
*** afazekas has quit IRC | 16:54 | |
mordred | dhellmann: while you're here, could I get a second +2 on https://review.openstack.org/#/c/42515/ ? I have another patch that's wanting it and I'm trying to clear as much of my outstanding niggly stuff before I am out today | 16:54 |
dhellmann | mordred: sure, looking now | 16:54 |
mordred | dhellmann: (there's two other in requirements that could probably use love as well) | 16:54 |
dhellmann | mordred: I've got a standup in 3 minutes, but after that can look at anything you'd like reviewed | 16:56 |
clarkb | pleia2: anything I can do to help testing/debug git-daemon? | 16:57 |
openstackgerrit | Alejandro Cabrera proposed a change to openstack-infra/config: feat: add marconi channel to eavesdrop https://review.openstack.org/42956 | 16:57 |
pleia2 | clarkb: the patch helps us stop losing the puppet lottery (xinetd should have to look at the file it's subscribed to first before haproxy stuff happens) but still unable to clone from git:// with it enabled, looking for haproxy related logs now | 17:00 |
*** ladquin has joined #openstack-infra | 17:00 | |
*** fbo is now known as fbo_away | 17:01 | |
*** jerryz has joined #openstack-infra | 17:04 | |
*** morganfainberg|a is now known as morganfainberg | 17:04 | |
pleia2 | gosh, looking for issues with git specifically is a fun google-fu problem | 17:08 |
*** dprince has quit IRC | 17:08 | |
morganfainberg | Alex_Gaynor: ping | 17:09 |
clarkb | pleia2: is it like googling for Go? | 17:14 |
pleia2 | yeah, and screen(1) :) | 17:14 |
pleia2 | might be an issue with my test instance though, it doesn't have a fqdn for one | 17:14 |
fungi | i cannot, for the life of me, figure out how to adjust bugtask metadata for git-review on bug 1179008 (trying to set it to fix-committed for example). tried repeatedly over the past few days and every time i get a launchpad "timeout error..." ideas? | 17:15 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/zuul: SIGUSR2 logs stack traces for active threads. https://review.openstack.org/42959 | 17:15 |
uvirtbot | Launchpad bug 1179008 in python-neutronclient "rename requires files to standard names" [Medium,In progress] https://launchpad.net/bugs/1179008 | 17:15 |
pleia2 | and ipaddress might show up weird on hpcloud (the local address the machine thinks it has in `ip addr` is not the public address) | 17:15 |
* pleia2 manual tweaks | 17:16 | |
*** vipul is now known as vipul-away | 17:16 | |
koolhead17 | pleia2: hi there | 17:17 |
clarkb | jeblair: ^ 42959 is a bit of a WIP but I figured I would get that out sooner than later. I am still working on testing it (is the easiest way to do that to write a unittest?) | 17:17 |
clarkb | fungi: it times out for me too. Maybe we attached too many projects to that bug? | 17:18 |
pleia2 | koolhead17: hey, hope you're enjoying your stay in SF :) | 17:19 |
*** vipul-away is now known as vipul | 17:22 | |
koolhead17 | pleia2: i am. :) | 17:23 |
jeblair | clarkb: lgtm; you might need to actually run it in order to test it. also, i think there is something messed up with signals and using the internal gear server. | 17:23 |
koolhead17 | lets catch up sometime over weekend | 17:23 |
mordred | jeblair: oh lovely | 17:23 |
* koolhead17 waves jeblair mordred clarkb & everyone :D | 17:23 | |
mordred | hey koolhead17 - enjoyin SF? | 17:23 |
koolhead17 | yes sir. its great | 17:24 |
koolhead17 | :) | 17:24 |
koolhead17 | i might be in seattle for a day | 17:25 |
clarkb | koolhead17: one day is not enough for seattle :P | 17:25 |
Alex_Gaynor | morganfainberg: pong | 17:25 |
koolhead17 | clarkb: i know :( | 17:26 |
clarkb | jeblair: is there still a dev zuul that I can use to test within a running system? | 17:26 |
koolhead17 | clarkb: won`t mind coming to portland for beer for few hr though. :D | 17:26 |
koolhead17 | Alex_Gaynor: hi there | 17:26 |
morganfainberg | Alex_Gaynor: hey, wanted to follow up with you regarding: https://review.openstack.org/#/c/42455/ (since you, in theory could bump up to a +2 now, btw, gratz on core for requirements) | 17:27 |
pleia2 | clarkb: so netstat tells me git daemon isn't even running when not on the default port, so trying to fix that now | 17:27 |
clarkb | jeblair: I have at least one small update to that. I realized that a reconfigure will also reconfigure logging so I am just going to get the logger each time I need to dump stack traces | 17:27 |
morganfainberg | Alex_Gaynor: see if there were any outstanding concerns, since that's the next blocker for my caching stuff in keystone. | 17:27 |
clarkb | pleia2: doesn't xinetd fork git-daemon's on demand as connections come in? | 17:28 |
*** vogxn has quit IRC | 17:28 | |
fungi | however xinetd should be listening on that port | 17:29 |
Alex_Gaynor | morganfainberg: I don't think there are any outstanding concerns, but I'll have to give it a once over before +2ing :) I'll come around in a few minutes to it | 17:29 |
*** pcm_ has quit IRC | 17:29 | |
pleia2 | clarkb: yeah, but it still should have: :::9418 :::* LISTEN 10606/xinetd | 17:29 |
pleia2 | as fungi says :) | 17:29 |
morganfainberg | Alex_Gaynor: thanks! i appreciate it :) | 17:29 |
clarkb | pleia2: haproxy will be 9418, xinetd on 29418 | 17:29 |
pleia2 | right, haproxy shows up on 9418 and no xinetd at all | 17:30 |
pleia2 | can't get it to listen on 29418 | 17:30 |
clarkb | weird | 17:31 |
* pleia2 confirms it's not selinux | 17:31 | |
*** pcm_ has joined #openstack-infra | 17:32 | |
*** SergeyLukjanov has quit IRC | 17:34 | |
clarkb | jeblair: woot, I wrote a small script that sits in a while loop with that signal handler configured and it seems to work | 17:35 |
clarkb | jeblair: much easier testing that way than getting a complete zuul running | 17:35 |
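
A minimal sketch of the kind of stand-alone script described above: a SIGUSR2 handler that logs a stack trace for every active thread. This is not the code from change 42959; the function names and the logger name are illustrative assumptions. The handler fetches the logger on each call, following the note that a reconfigure also reconfigures logging.

```python
import logging
import signal
import sys
import threading
import traceback


def format_stack_traces():
    """Return one string holding the current stack of every live thread."""
    # Map thread idents to names so the dump is readable.
    names = {t.ident: t.name for t in threading.enumerate()}
    chunks = []
    for thread_id, frame in sys._current_frames().items():
        name = names.get(thread_id, "unknown")
        stack = "".join(traceback.format_stack(frame))
        chunks.append("Thread: %s (%s)\n%s" % (name, thread_id, stack))
    return "\n".join(chunks)


def stack_dump_handler(signum, frame):
    # Hypothetical logger name; looked up on every call so the handler
    # keeps working after logging has been reconfigured.
    log = logging.getLogger("example.stack_dump")
    log.debug(format_stack_traces())


def install_handler():
    # SIGUSR2 exists on Unix-like systems only.
    signal.signal(signal.SIGUSR2, stack_dump_handler)
```

To try it by hand, call `logging.basicConfig(level=logging.DEBUG)` and `install_handler()`, keep the process alive in a sleep loop, and send it the signal with `kill -USR2 <pid>`.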
*** cthulhup has quit IRC | 17:35 | |
pleia2 | Aug 20 17:36:39 git-vanilla xinetd[10709]: Service git expects port 9418, not 29418 | 17:36 |
pleia2 | heh | 17:36 |
pleia2 | dear xinetd, do it anyway | 17:37 |
*** mgagne has joined #openstack-infra | 17:37 | |
*** mgagne has quit IRC | 17:37 | |
*** mgagne has joined #openstack-infra | 17:37 | |
openstackgerrit | Anita Kuno proposed a change to openstack-dev/hacking: Testing how .html files are rendered by cgit. https://review.openstack.org/42961 | 17:42 |
*** zul has quit IRC | 17:42 | |
morganfainberg | Alex_Gaynor: looks like dhellmann got to it before you. thanks :) | 17:46 |
Alex_Gaynor | morganfainberg: okey doke, sorry bout that, I'm writing some scripts to setup swift for some benchmarking :) | 17:46 |
morganfainberg | Alex_Gaynor: not a problem man, was just following up with people today about it. thanks again! | 17:47 |
openstackgerrit | Anita Kuno proposed a change to openstack-dev/hacking: Testing how .html files are rendered by cgit https://review.openstack.org/42961 | 17:48 |
clarkb | pleia2: maybe we should consider running it as a stand alone daemon? | 17:48 |
clarkb | pleia2: and rely on haproxy to do the DDoS protection | 17:48 |
pleia2 | clarkb: so it looks like xinetd uses /etc/services to determine where it should bind stuff, by patching /etc/services I got it to work, but this seems sub-optimal | 17:50 |
pleia2 | (commented out the 9418 git lines, added ones for 29418) | 17:51 |
*** dina_belova has joined #openstack-infra | 17:52 | |
clarkb | pleia2: so cloning works now? its a start :) | 17:52 |
pleia2 | yeah! This is with haproxy running: git clone git://15.185.127.146/openstack-infra/config.git | 17:53 |
pleia2 | browsing git-daemon docs, the /etc/services thing may actually be more git daemon and less xinetd | 17:54 |
pleia2 | so maybe we do need to change /etc/services | 17:55 |
*** dina_belova has quit IRC | 17:57 | |
clarkb | ok | 17:58 |
anteaya | pleia2: to add to your list of things to do, here is a patch consisting of an .html file I generated with rst2html: https://review.openstack.org/#/c/42961/ | 17:58 |
anteaya | let me know how it looks | 17:58 |
clarkb | pleia2: that seems hacky though | 17:59 |
*** cppcabrera is now known as cppcabrera_afk | 17:59 | |
pleia2 | clarkb: yeah, so if we run it stand alone without --inetd we should be able to specify an alternate --port | 18:00 |
clarkb | pleia2: I like that better | 18:01 |
pleia2 | I am not sure of the best way to do this, as "the centos way" is using xinetd to run services that don't have specific init scripts, git is just a command line "git daemon..." | 18:03 |
clarkb | pleia2: ubuntu's git daemon package comes with an init script. we could vendor it for centos | 18:03 |
clarkb | I am sure that the red hat folk in the channel want to beat me after saying that | 18:03 |
pleia2 | hehe | 18:04 |
pleia2 | so we'd just drop it in /etc/init.d/ ? I am really unfamiliar with rh init system stuff | 18:04 |
pleia2 | (well, after tweaking it to work properly, of course) | 18:05 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/zuul: SIGUSR2 logs stack traces for active threads. https://review.openstack.org/42959 | 18:07 |
clarkb | jeblair: ^ that comes with a test. Let me know what you think | 18:07 |
clarkb | pleia2: yes, dropping it in /etc/init.d/ and having puppet ensure the service is enabled should be sufficient | 18:07 |
clarkb | assuming that the debian/ubuntu script doesn't have a bunch of debianisms in it that centos won't like | 18:08 |
*** changbl has joined #openstack-infra | 18:09 | |
pleia2 | clarkb: looking now, it does - hard coded paths, /etc/default references, might actually be worth rewriting | 18:09 |
pleia2 | there are useful things I can pull from it though, hacking away | 18:10 |
pleia2 | anteaya: ok, I'll have a look in a little bit | 18:12 |
pleia2 | or use one someone already wrote http://robescriva.com/blog/2009/01/13/git-daemon-init-scripts-on-centos-52/ | 18:13 |
anteaya | k thanks | 18:13 |
* pleia2 frowns at no license | 18:14 | |
pleia2 | ah, easy enough to write own | 18:14 |
clarkb | pleia2: let me know if there is anything I can do to help | 18:16 |
clarkb | I half feel like I threw my crazy haproxy idea over the wall >_> | 18:17 |
*** cthulhup has joined #openstack-infra | 18:17 | |
clarkb | was not my intention :) | 18:17 |
*** zul has joined #openstack-infra | 18:17 | |
pleia2 | no worries, it mostly worked, certainly didn't anticipate it being so cranky about non-standard ports, it shouldn't be like this :) | 18:18 |
*** xBsd has joined #openstack-infra | 18:23 | |
*** melwitt has joined #openstack-infra | 18:23 | |
*** cthulhup has quit IRC | 18:24 | |
clarkb | pleia2: jeblair: mordred: Worth noting that the g-g-p times with https://git.o.o seem to be better than when against review.o.o on centos unittest slaves | 18:25 |
clarkb | so maybe we should stop worrying too much about git:// | 18:25 |
pleia2 | hmm, maybe there is a way I can edit the server_args line to support port | 18:27 |
*** vipul is now known as vipul-away | 18:30 | |
*** vipul-away is now known as vipul | 18:30 | |
pleia2 | not so much | 18:31 |
reed | need a staging server for activity.openstack.org | 18:32 |
pleia2 | clarkb: maybe, seems unlikely that if we point everything at https that there will be enough load on git:// to cause problems | 18:34 |
*** arezadr has joined #openstack-infra | 18:35 | |
jeblair | reed: do you want to write the puppet (we can point you to some docs), or do you want someone else to do it? | 18:39 |
reed | jeblair, send me the puppet stuff, I'd like to learn | 18:39 |
reed | (is that a good answer or what?) | 18:39 |
*** markmcclain has quit IRC | 18:40 | |
pleia2 | reed: http://ci.openstack.org/sysadmin.html#adding-a-new-server is a good start :) | 18:40 |
anteaya | are we waiting for anything specific for this patch: https://review.openstack.org/#/c/38177/ Use cgit server instead of github for everything There is quite the lineup of green +'s on it | 18:40 |
jeblair | reed: it's the most perfect answer ever. :) | 18:40 |
* reed admires his most perfect answer ever, sipping coffee | 18:41 | |
jeblair | reed: http://ci.openstack.org/sysadmin.html#adding-a-new-server | 18:41 |
jeblair | reed: you should actually start at the top of that doc | 18:41 |
pleia2 | anteaya: still working to tune the git server before we throw everything at it | 18:41 |
jeblair | reed: it has background info, and also instructions on how to test | 18:41 |
anteaya | pleia2: ah, okay | 18:41 |
jeblair | reed: but the section i pointed to has the actual steps | 18:41 |
jeblair | reed: and somewhere, there's mrmartin's change to add his staging server | 18:42 |
jeblair | looking | 18:42 |
reed | jeblair, oh, right... I can copy that too | 18:42 |
jeblair | reed: https://review.openstack.org/#/c/42608/ | 18:42 |
reed | sweet | 18:42 |
*** SergeyLukjanov has joined #openstack-infra | 18:43 | |
jeblair | reed, mrmartin: and sorry i haven't reviewed that yet. it is a high priority, after we get some of the operational issues we've been dealing with under control | 18:43 |
*** SergeyLukjanov has quit IRC | 18:43 | |
jeblair | (this week is very busy due to a feature freeze deadline) | 18:43 |
reed | np, mrmartin is on vacation today anyway | 18:44 |
mordred | damn feature freeze | 18:44 |
mordred | clarkb, pleia2: git-daemon wants us to edit /etc/services to run it on another port? | 18:45 |
*** vipul is now known as vipul-away | 18:46 | |
pleia2 | mordred: well, inetd does | 18:46 |
pleia2 | if running it from xinetd or using --inetd on the command line, you can't specify --port because it just does an /etc/services lookup and will only use what's in that file | 18:47 |
pleia2 | I vote that this is broken :) | 18:47 |
pleia2 | but it is what it is | 18:47 |
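
A rough illustration of why editing /etc/services changed the port: a service started by name gets its port from that file, so the name `git` resolves to whatever the file says. The parser below is a simplified stand-in written for this example, not what xinetd or git-daemon actually run, and the two sample file snippets are assumptions modelled on the change described above (9418 lines commented out, 29418 added).

```python
def parse_services(text):
    """Parse /etc/services-style text into {(name, protocol): port}."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue
        port, proto = fields[1].split("/", 1)
        # The first field is the service name; later fields are aliases.
        for name in [fields[0]] + fields[2:]:
            table.setdefault((name, proto), int(port))
    return table


# Assumed sample contents, before and after the manual patch.
STOCK = "git 9418/tcp  # git pack transfer service\ngit 9418/udp\n"
PATCHED = "#git 9418/tcp\n#git 9418/udp\ngit 29418/tcp\ngit 29418/udp\n"

stock_port = parse_services(STOCK)[("git", "tcp")]
patched_port = parse_services(PATCHED)[("git", "tcp")]
```

On a real host the same lookup is available through `socket.getservbyname("git", "tcp")`, which reads the system's services database.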
*** openstack` has joined #openstack-infra | 18:51 | |
*** openstack has quit IRC | 18:51 | |
*** pabelanger has quit IRC | 18:52 | |
*** openstack` is now known as openstack | 18:52 | |
*** boris-42 has joined #openstack-infra | 18:52 | |
*** afazekas has joined #openstack-infra | 18:53 | |
mordred | pleia2: it seems like a very poor design | 18:55 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: WIP: provider manager https://review.openstack.org/42973 | 18:56 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/zuul: SIGUSR2 logs stack traces for active threads. https://review.openstack.org/42959 | 18:57 |
jeblair | mordred, clarkb: ^ that is my solution to the problems with rate limits we saw yesterday ^. i also think it's a bit cleaner and more reliable. | 18:58 |
clarkb | jeblair: I will review after the meeting | 18:58 |
jeblair | mordred, clarkb: it needs a little more work, and testing with a real provider instead of my fake one, but it's mostly there and worth a general review | 18:58 |
clarkb | jeblair: the zuul change should be ready for review as well | 18:59 |
jeblair | clarkb: thanks | 18:59 |
jeblair | meeting time! | 18:59 |
jeblair | almost | 18:59 |
*** AJaeger has joined #openstack-infra | 19:01 | |
*** pabelanger has joined #openstack-infra | 19:01 | |
*** mriedem1 has joined #openstack-infra | 19:02 | |
*** cppcabrera_afk is now known as cppcabrera | 19:03 | |
*** mriedem has quit IRC | 19:03 | |
AJaeger | Hi infra team, I'd like to have some guidance and help on getting the Basic Install guide build now also for openSUSE - and thus on the docs.openstack.org | 19:05 |
AJaeger | annegentle guided me in https://review.openstack.org/#/c/41777/ to you. | 19:06 |
*** thomasbiege1 has joined #openstack-infra | 19:06 | |
clarkb | AJaeger: we are in our weekly meeting currently, so we may be a bit slow to answer, but will catch up after the meeting | 19:06 |
AJaeger | clarkb, sorry, didn't know. Ok, I'll stay around and let you finish your meeting. Thanks for the quick heads-up. | 19:06 |
*** gyee has quit IRC | 19:07 | |
*** vipul-away is now known as vipul | 19:07 | |
*** vipul is now known as vipul-away | 19:07 | |
*** vipul-away is now known as vipul | 19:07 | |
AJaeger | clarkb, btw if I should send an email or use other means, just tell me | 19:07 |
clarkb | AJaeger: IRC is probably easiest, it will just be maybe an hour before we can really answer your questions | 19:08 |
AJaeger | clarkb, ok, thanks | 19:08 |
*** markmcclain has joined #openstack-infra | 19:15 | |
Alex_Gaynor | So the amount of time between when a job finishes on jenkins, and when zuul records it as done seems way too large. Are there any known bottlenecks there, and what can be done to improve that? | 19:15 |
*** fbo_away is now known as fbo | 19:15 | |
nati_ueno | The Jenkins review on Gerrit got really readable! Nice | 19:17 |
*** dprince has joined #openstack-infra | 19:18 | |
reed | jeblair, pleia2: since the activity-staging server needs to have apache and mysql should I draw inspiration from static.pp for the include::apache and various mods?? | 19:19 |
*** kiall_ has joined #openstack-infra | 19:20 | |
pleia2 | reed: yes | 19:21 |
reed | cool | 19:21 |
jeblair | Alex_Gaynor: link to an example change? | 19:22 |
mordred | nati_ueno: thanks! (jeblair did it) | 19:22 |
*** vipul is now known as vipul-away | 19:22 | |
nati_ueno | jeblair: Thanks! | 19:23 |
*** nati_ueno has quit IRC | 19:26 | |
*** jerryz has quit IRC | 19:26 | |
*** HenryG has joined #openstack-infra | 19:26 | |
*** cthulhup has joined #openstack-infra | 19:26 | |
Alex_Gaynor | jeblair: just random ones I'm noticing as they happen | 19:27 |
*** nati_ueno has joined #openstack-infra | 19:30 | |
*** gordc has joined #openstack-infra | 19:34 | |
*** thomasbiege1 has quit IRC | 19:34 | |
*** cthulhup has quit IRC | 19:37 | |
*** cthulhup has joined #openstack-infra | 19:41 | |
*** xBsd has quit IRC | 19:42 | |
jeblair | Alex_Gaynor: don't forget about severed heads; | 19:43 |
*** vipul-away is now known as vipul | 19:43 | |
jeblair | Alex_Gaynor: the head of the queue was just severed because it failed a test, but it's still running its tests and won't report until they are done | 19:43 |
jeblair | Alex_Gaynor: (scroll to the bottom of the gate queue to see it) | 19:43 |
Alex_Gaynor | jeblair: so the case I was looking at was the top item in the gate queue | 19:44 |
Alex_Gaynor | s/queue/pipeline | 19:44 |
russellb | btw, i put up this change earlier today to help free up some jenkins resources over the next couple weeks: https://review.openstack.org/#/c/42898/ | 19:45 |
*** zul has quit IRC | 19:47 | |
SlickNik | hey guys. | 19:49 |
SlickNik | just wanted to report in that review.openstack.org is being much slower than usual. | 19:50 |
fungi | SlickNik: yes, it's being used much more than usual | 19:51 |
clarkb | SlickNik: yup, it is getting bogged down by all of the testing to test all of your code :) I think we just agreed to merge a change that will hopefully alleviate some of this | 19:51 |
clarkb | jeblair: do you want to force merge that change or should I just go ahead and do it? | 19:51 |
jeblair | clarkb: i'll do it | 19:51 |
fungi | SlickNik: with the havana feature freeze looming, lots of people are trying to submit/review/merge much more code volume than usual | 19:51 |
openstackgerrit | A change was merged to openstack-infra/devstack-gate: Use git.openstack.org as origin https://review.openstack.org/42693 | 19:52 |
clarkb | jeblair: thanks | 19:52 |
SlickNik | Cool, thanks! Understandable with the FF looming. | 19:52 |
SlickNik | And thanks for being on top of it (as usual). | 19:52 |
SlickNik | Chers. | 19:53 |
SlickNik | Cheers* | 19:53 |
clarkb | SlickNik: in the meantime you will probably find git review -d and the gerrit ssh interface to be a little more responsive | 19:53 |
clarkb | and do your reviews locally (not sure if you can do inline comments this way, but otherwise it should work) | 19:53 |
openstackgerrit | Anne Gentle proposed a change to openstack-infra/config: Ensure that the release.path.name is set for the Block Storage https://review.openstack.org/42984 | 19:54 |
*** afazekas has quit IRC | 19:54 | |
ryanpetrello | anybody know if there's a generalized sphinx upload hook for pythonhosted.org ? | 19:54 |
pleia2 | clarkb: I'm heading out to lunch in a couple minutes (might run a bit long), will finish up init script upon my return! | 19:54 |
ryanpetrello | that does e.g., http://pythonhosted.org/an_example_pypi_project/buildanduploadsphinx.html | 19:54 |
ryanpetrello | similar to what the rtfd hook does, but uploads directly to pythonhosted.org? | 19:55 |
ryanpetrello | if not, I'd be glad to experiment in writing one, just wanting to make sure it doesn't already exist... | 19:55 |
*** markmc has joined #openstack-infra | 19:55 | |
mordred | ryanpetrello: we have not made one | 19:55 |
ryanpetrello | I wonder if doc_upload has the same permissions as how maintainer roles work | 19:56 |
mordred | at some point, I'd love to get a good general design/direction around rtfd/pythonhosted/docs.o.o | 19:56 |
ryanpetrello | i.e., if you're a maintainer, you can upload docs | 19:56 |
mordred | dhellmann, annegentle ^^ | 19:56 |
mordred | ryanpetrello: also, look at how we do pypi-upload | 19:56 |
clarkb | ryanpetrello: note we don't use setup.py to upload stuff to pypi because ugh. Instead we have a wrapper around curl to do it so that we don't have to run arbitrary code | 19:56 |
annegentle | mordred: I've met with Todd Morey in the last couple weeks to try to synch with www for design | 19:56 |
jeblair | ryanpetrello: i looked into it briefly | 19:56 |
mordred | ryanpetrello: it's probably more directly related to how we'd need to upload docs to pypi | 19:56 |
annegentle | mordred: Sphinx does work well for dev docs | 19:56 |
jeblair | ryanpetrello: it can be done by uploading a zipfile | 19:56 |
clarkb | AJaeger: still around? | 19:56 |
AJaeger | clarkb, Yes. | 19:57 |
jeblair | ryanpetrello: so basically, it would be like the pypi-upload job | 19:57 |
mordred | annegentle: main question is - which of the three available locations should we automatically upload to? | 19:57 |
mordred | annegentle: or - should we upload to all of them? | 19:57 |
annegentle | mordred: ah | 19:57 |
clarkb | AJaeger: ok, give me a quick minute to settle back into doing stuff and I will do my best to answer your questions about new doc jobs | 19:57 |
annegentle | mordred: one place. | 19:57 |
dhellmann | mordred: we're looking at pythonhosted for wsme because that's one of the places it is already using | 19:57 |
*** dina_belova has joined #openstack-infra | 19:57 | |
ryanpetrello | why not as many as you specify via hooks? | 19:57 |
*** SergeyLukjanov has joined #openstack-infra | 19:58 | |
dhellmann | my preference is for rtfd.org, because that's what most people are doing for new projects | 19:58 |
ryanpetrello | if they elect pythonhosted vs rtfd | 19:58 |
annegentle | ryanpetrello: why clutter the internet? :) | 19:58 |
ryanpetrello | the submission process for those is quite different | 19:58 |
dhellmann | annegentle: +1 | 19:58 |
ryanpetrello | no, I agree | 19:58 |
annegentle | dhellmann: my issue with rtfd is we need the GA info to make good decisions about docs | 19:58 |
jeblair | openstack projects should have their docs uploaded to docs.openstack.org | 19:58 |
ryanpetrello | just saying we should give folks the flexibility to choose | 19:58 |
dhellmann | for openstack stuff, I think we should just host it ourselves | 19:58 |
annegentle | jeblair: yes | 19:58 |
jeblair | stackforge projects can do whatever they want | 19:58 |
mordred | sure | 19:58 |
jeblair | and we do give them the flexibility to do that right now. | 19:58 |
dhellmann | annegentle: right, this would just be for third-party or stackforge stuff | 19:58 |
annegentle | jeblair: sure | 19:58 |
ryanpetrello | right, Doug and I are mostly referring to stackforge in this context | 19:58 |
annegentle | dhellmann: ok | 19:58 |
*** ^demon has joined #openstack-infra | 19:59 | |
ryanpetrello | just suggesting that stackforge folks may find an "auto-upload to pythonhosted.org on release" useful | 19:59 |
ryanpetrello | they currently have this for rtfd | 19:59 |
ryanpetrello | just considering another option | 19:59 |
dhellmann | yep | 19:59 |
dhellmann | I think we should allow pythonhosted, but encourage rtfd where possible | 19:59 |
annegentle | ryanpetrello: ok. nice that it happens on upload | 20:00 |
ryanpetrello | +1 | 20:00 |
mordred | ++ | 20:00 |
annegentle | ryanpetrello: but there are good reasons to ci docs | 20:00 |
annegentle | I'd probably encourage continuous publishing | 20:00 |
clarkb | AJaeger: we configure all of our jenkins jobs using the Jenkins Job Builder, http://ci.openstack.org/jjb.html | 20:00 |
ryanpetrello | sure, s/on release/whenever is applicable | 20:00 |
dhellmann | annegentle: good point | 20:00 |
ryanpetrello | continuous, if it's right for your project/preference | 20:01 |
*** lcestari has quit IRC | 20:01 | |
clarkb | AJaeger: that page is a good starting point for learning how JJB works. With the help of that page you should be able to grab an existing doc job that does something similar to what you want and copy pasta as needed without losing too much understanding of what is going on | 20:01 |
*** ^d has quit IRC | 20:01 | |
clarkb | AJaeger: then the second thing you need to do is tell zuul to run that jenkins job when you need it to be run | 20:02 |
*** mikal has joined #openstack-infra | 20:02 | |
clarkb | AJaeger: https://github.com/openstack-infra/config/blob/master/modules/openstack_project/files/zuul/layout.yaml is where you do that. http://ci.openstack.org/zuul.html has a brief zuul intro and links to more in-depth docs | 20:03 |
clarkb | AJaeger: so from a super high level your change will have two parts. 1. add job to jenkins with JJB and 2. tell zuul to run new job in layout.yaml | 20:03 |
AJaeger | clarkb: Thanks, I'll check how the current guides are built and see whether I need to duplicate that setup or can somehow hook into it... | 20:04 |
*** zehicle_at_dell has quit IRC | 20:05 | |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/config: Make mysql backup crons quiet. https://review.openstack.org/42785 | 20:06 |
clarkb | jeblair: mordred fungi ^ | 20:06 |
clarkb | and now time for reviews | 20:06 |
*** mikal has quit IRC | 20:07 | |
fungi | clarkb: lgtm. i'm popping out for lunch and then i'll try to review a few changes before my next meeting | 20:09 |
clarkb | AJaeger: feel free to ask questions as they arise. I know I gave the high level info dump and wasn't very specific | 20:10 |
AJaeger | clarkb, that helped a lot - I got the right pointer. I'll propose a change in a few minutes for you to review that I didn't miss anything... | 20:12 |
*** mikal has joined #openstack-infra | 20:13 | |
openstackgerrit | Andreas Jaeger proposed a change to openstack-infra/config: Build Basic Install Guide for openSUSE https://review.openstack.org/42988 | 20:15 |
*** dmakogon_ has quit IRC | 20:16 | |
AJaeger | clarkb, my feeling is just that I'm missing something. That was too easy ;) | 20:16 |
vipul | you guys aware of review.o.o being slow today? | 20:17 |
clarkb | vipul: yes, we are DDoSing it with the jenkins slaves | 20:17 |
vipul | ooh fun! | 20:18 |
clarkb | we recently merged a devstack gate change that will point more tests to git.openstack.org which will hopefully alleviate the pressure on review.o.o but we need the currently running tests to flip over before we see | 20:18 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Add ProviderManager https://review.openstack.org/42973 | 20:19 |
clarkb | vipul: this is the typical pre feature freeze rush that never fails to break something | 20:21 |
clarkb | vipul: tl;dr you need to write more code during H1 :) | 20:21 |
*** pabelanger has quit IRC | 20:21 | |
vipul | clarkb: h1 is for recovering from all the hangovers at the summit :D | 20:22 |
*** pcm_ has quit IRC | 20:22 | |
*** HenryG has quit IRC | 20:23 | |
*** mikal has quit IRC | 20:24 | |
jeblair | i think jenkins02 is experiencing a similar slowness as before; i've got jstack trying to get a thread dump; it is responding, but very slowly, and it has a bunch of offline nodes sitting around. | 20:27 |
jeblair | clarkb, fungi: ^ i uploaded a polished version of the providermanager change; i'm about to start live-testing it | 20:27 |
clarkb | jeblair: ok, it is next up in my queue. | 20:28 |
jeblair | clarkb, fungi: i think i will also do something similar to serialize jenkins access, and try to deploy both of those together. | 20:28 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/devstack-gate: Replace review.o.o with git.o.o. https://review.openstack.org/42989 | 20:28 |
clarkb | jeblair: ^ I noticed that needed doing | 20:28 |
jeblair | clarkb: no it doesn't, we don't use those anymore | 20:29 |
clarkb | jeblair: well it needs doing at least for the README | 20:29 |
clarkb | jeblair: the image building is elsewhere, maybe there should be a clean up d-g commit then do the git stuff on top of it | 20:29 |
jeblair | clarkb: ok, sure, we can change the readme. i'm pretty sure the image building, whether run manually or nightly, is not causing current performance problems, so i deferred it | 20:30 |
jeblair | similarly, i have deferred removing those things until there's a replacement | 20:31 |
jeblair | (for manually running) | 20:31 |
jeblair | clarkb: but can we at least avoid adding that to the gate queue until it's not busy? | 20:31 |
clarkb | jeblair: ya | 20:31 |
clarkb | I will WIP it | 20:32 |
*** pabelanger has joined #openstack-infra | 20:32 | |
openstackgerrit | Jim Branen proposed a change to openstack/requirements: Allow use of hp3parclient 1.1.0. https://review.openstack.org/42991 | 20:32 |
clarkb | russellb: https://jenkins01.openstack.org/job/gate-nova-python26/1366/console seems to be a fairly frequent test failure | 20:32 |
clarkb | jeblair: FYI ^ I think that has semi broken the gate (only nova runs that test so only nova is affected) | 20:33 |
russellb | boris-42: ^^^ | 20:33 |
russellb | boris-42: can you help dig into that? since you (and your team) have been working most in that area | 20:34 |
mordred | anteaya: when you get a moment, would you look at the scrollback in the meeting channel | 20:35 |
mordred | anteaya: and the discussion of setting up a repo that we'll use for voting for TC motions? | 20:36 |
markmc | russellb, I think someone from his team submitted a patch | 20:36 |
* markmc digs it up | 20:36 | |
anteaya | mordred: I was following some of that | 20:36 |
*** kiall_ is now known as Kiall | 20:36 | |
boris-42 | russellb I am here | 20:36 |
anteaya | mordred: am I the resource volunteered for duty? | 20:36 |
markmc | russellb, it was victor, https://review.openstack.org/#/c/42649/ | 20:36 |
anteaya | :D | 20:36 |
mordred | anteaya: yup | 20:37 |
russellb | markmc: nicedice_ | 20:37 |
anteaya | okey dokey smokey | 20:37 |
russellb | err, nice. | 20:37 |
boris-42 | russellb yeah this is already solved | 20:37 |
mordred | anteaya: you know, if you want :) | 20:37 |
anteaya | yeah yeah yeah | 20:37 |
russellb | clarkb: looks like we have a patch up for that ... need to get it reviewed/merged though | 20:37 |
anteaya | so the way I understand it, I go back through the TC meeting logs and pull out past decisions | 20:37 |
clarkb | russellb: markmc: great. Note that any nova changes approved before that one probably won't merge | 20:37 |
*** mikal has joined #openstack-infra | 20:38 | |
anteaya | and offer them up as patches to the repo | 20:38 |
anteaya | that I am about to create | 20:38 |
anteaya | to gather the history | 20:38 |
markmc | clarkb, it only happens like 1 in every 5 times from what I've seen | 20:38 |
anteaya | is that one of the tasks, apart from creating the repo itself | 20:38 |
anteaya | ttx: what do we want to call this TC decision repo? | 20:39 |
anteaya | at the very least, I will learn a lot about the history of the TC | 20:39 |
russellb | clarkb: that change is approved now | 20:42 |
boris-42 | russellb nice | 20:42 |
boris-42 | russellb thanks | 20:42 |
boris-42 | =) | 20:42 |
*** SergeyLukjanov has quit IRC | 20:42 | |
russellb | boris-42: yep, np | 20:42 |
jeblair | clarkb, mordred: i think jstack is stuck in its deadlock detection. | 20:43 |
mordred | jeblair: wow | 20:44 |
*** dina_belova has quit IRC | 20:44 | |
*** cthulhup has quit IRC | 20:45 | |
*** cthulhup has joined #openstack-infra | 20:45 | |
clarkb | load on git.o.o is ~18 and under 1 on review.o.o | 20:46 |
clarkb | jeblair: that is an impressive feat | 20:46 |
openstackgerrit | Russell Bryant proposed a change to openstack-infra/config: Disable tempest in the cells job https://review.openstack.org/42898 | 20:46 |
jeblair | i'm attaching the debugger and will try that way | 20:46 |
mordred | clarkb: woot! | 20:46 |
clarkb | jeblair: I am working my way through the nodepool client manager change right now | 20:46 |
*** cppcabrera is now known as cppcabrera_afk | 20:47 | |
notmyname | mordred: tags for getting pbr with swift...got a few minutes? | 20:48 |
lifeless | Alex_Gaynor: do you know, is there a way to get a unicode string directly from a memoryview, rather than copy to bytestring, then decode to unicode string? | 20:48 |
jeblair | clarkb, mordred: i think it's slow because there are so many nodes still attached to it (which is true because it is slow) | 20:48 |
jeblair | mordred: got a few mins? | 20:49 |
Alex_Gaynor | lifeless: apparently! codecs.utf_8_codecs(memoryview) seems to work (for example) | 20:49 |
*** cthulhup has quit IRC | 20:50 | |
lifeless | Alex_Gaynor: ahha, thanks! | 20:50 |
jeblair | mordred: i think jenkins02 needs to be stopped, and have all the nodes removed from its config.xml; all related nodes deleted from nova, and then started again. | 20:50 |
clarkb | jeblair: that is no good. What do you think about an artificial throttle in zuul or nodepool, so that we can at least prevent it from overrunning itself | 20:50 |
lifeless | Alex_Gaynor: though 2.7's codecs module has no utf_8_codecs attribute | 20:51 |
jeblair | clarkb: i mentioned that i wanted to serialize access to jenkins, do you want something else? | 20:51 |
Alex_Gaynor | lifeless: codecs.utf_8_decode(m) | 20:52 |
dprince | jeblair: question on Gerrit comment syntax. I noticed recently that 'SUCCESS' is green.... and 'FAILED' is red. Is that HTML formatting that does that? or some sort of magic gerrit syntax you'd need to use? | 20:52 |
lifeless | Alex_Gaynor: ahha! cool. | 20:52 |
clarkb | jeblair: I think serializing access to jenkins is part of the answer, doing more to add a configurable queue length so that anything going over some limit blocks | 20:52 |
jeblair | clarkb: if we wanted the whole system to be slow, we could have done nothing. it was self limiting earlier. | 20:53 |
jeblair | clarkb: and still is | 20:53 |
jeblair | clarkb: the point is to actually be able to run all of the tests we need to run | 20:53 |
jeblair | clarkb: that's why we're scaling jenkins horizontally and adding more masters | 20:53 |
clarkb | jeblair: I am not suggesting to make it slow, you can still make the limit arbitrarily high | 20:53 |
jeblair | clarkb: what are you suggesting then? | 20:54 |
clarkb | jeblair: but in cases like this we would be much better off putting a limit on how fast it can be | 20:54 |
jeblair | clarkb: how fast what? | 20:54 |
clarkb | jeblair: jobs per hour | 20:54 |
jeblair | clarkb: are you talking about zuul? | 20:54 |
clarkb | jeblair: or nodepool concurrent nodes | 20:54 |
clarkb | jeblair: I am thinking of zuul and/or nodepool. They can both be throttled to take some of the pressure off of jenkins and gerrit | 20:55 |
jeblair | clarkb: okay, so we just merged a change that will cause tests to not touch gerrit | 20:55 |
*** mikal has quit IRC | 20:56 | |
jeblair | clarkb: zuul accesses gerrit serially when creating its changes | 20:56 |
jeblair | ideally, we have just done quite a lot to take the pressure off of gerrit | 20:56 |
jeblair | clarkb: so what pressure on gerrit do you want to relieve? | 20:56 |
ttx | anteaya: openstack/governance ideally, though it's a bit overreaching | 20:56 |
clarkb | jeblair: our major problem today and yesterday appears to be a thundering herd. If we can let them thunder at a tunable pace we should be able to rein in when jenkins runs faster than its shoes can move | 20:56 |
jeblair | clarkb: i think you are over-generalizing | 20:56 |
ttx | but openstack/tech-governance is a mouthful | 20:56 |
anteaya | ttx: I'm fine with openstack/governance | 20:57 |
clarkb | jeblair: I am trying to be generic, because next milestone it will be some other DDoS | 20:57 |
ttx | and it's not as if we never renamed any project in the past | 20:57 |
anteaya | do we want it in the openstack/ namespace or the openstack-infra/ namespace do you think, ttx? | 20:57 |
ttx | well if one thing is openstack/, that would be it | 20:57 |
anteaya | very good | 20:57 |
clarkb | jeblair: and a generic pace enforcement will help us at least keep moving rather than needing emergency fixes to keep going | 20:57 |
* anteaya goes back to looking up docs for creating a new git repo | 20:58 | |
jeblair | clarkb: overgeneralizing a problem does not help provide a solution. how do you write a patch to "don't cause problems"? | 20:58 |
jeblair | clarkb: your second point | 20:58 |
jeblair | clarkb: pressure on jenkins | 20:58 |
ttx | mordred: your cookiecutter thing looks good -- looks like an automated mordred-goes-to-fix-your-project merge | 20:59 |
jeblair | clarkb: we have seen that jenkins can run a lot of jobs, and have a lot of slaves | 20:59 |
anteaya | ttx: I don't have any expectation of any gate or check tests for openstack/governance | 20:59 |
jeblair | clarkb: but right now, we've seen issues with slaves not being removed from jenkins | 20:59 |
ttx | anteaya: we could enforce some common template | 20:59 |
jeblair | clarkb: i don't know why that is. there may be a bug in the gearman-plugin. the 'thundering herd' of deleted nodes may just be too much contention for that kind of operation. | 21:00 |
ttx | anteaya: but not yet maybe | 21:00 |
anteaya | ttx: got one in mind? | 21:00 |
anteaya | ttx: very good | 21:00 |
jeblair | clarkb: and as you observed earlier, jenkins does not do well if you do lots of things at once | 21:00 |
clarkb | jeblair: ya | 21:00 |
jeblair | clarkb: so serializing access to adding and removing nodes from jenkins may help with that | 21:00 |
jeblair | clarkb: at least, we might get a better idea of what is going on | 21:00 |
clarkb | jeblair: I am all for fixing the specific bottlenecks because I want to be able to do as many operations as possible. But I also think having some way to pull back so that everything doesn't shut down is useful | 21:01 |
jeblair | clarkb: anyway, you've had some good suggestions, and i'm trying to implement solutions for the problems we've seen based on them | 21:01 |
jeblair | clarkb: that sounds great. i have no idea what you're talking about though. | 21:01 |
anteaya | ttx: who do you want as core for openstack/governance? | 21:02 |
clarkb | jeblair: I am not sure where we would want the control to go (probably in zuul) but being able to tell it to launch at most 300 jobs per hour or some number of jobs per minute/second etc will be useful so that in cases like now we can continue to run jenkins jobs without making the problem worse. | 21:03 |
jeblair | clarkb: why would we want to do that? what problem does that solve? | 21:03 |
ttx | anteaya: that's where it gets tricky. You want +2/-2 for TC members. And APRV for the chair (me) | 21:03 |
* ttx is in a meeting | 21:03 | |
clarkb | jeblair: I also see that as being useful so that it can be tied to a PID loop (or similar) where it automatically increases the limit and decreases it based on job throughput or some other metric | 21:03 |
anteaya | ttx: okay, sorry more questions later | 21:04 |
clarkb | jeblair: right now it would potentially give jenkins a chance to catch back up on its own | 21:04 |
jeblair | clarkb: catch up with what? | 21:04 |
clarkb | jeblair: deleting nodes | 21:04 |
jeblair | clarkb: oh, i don't think that has anything to do with it | 21:04 |
clarkb | jeblair: or $otheroperation that has slowed to a crawl | 21:04 |
jeblair | clarkb: it can't delete nodes because it's deleting nodes | 21:04 |
jeblair | clarkb: not because it's running jobs | 21:04 |
jeblair | clarkb: there _are_ things we can control to tune this whole system, but we need to tune the right things. | 21:04 |
*** gyee has joined #openstack-infra | 21:05 | |
*** pblaho has joined #openstack-infra | 21:05 | |
jeblair | clarkb: if you want to rate-limit starting or stopping jobs, that can be done with zuul and gearman, in how they dispatch jobs | 21:05 |
jeblair | clarkb: but setting an arbitrary jobs-per-hour limit doesn't address an actual problem. | 21:05 |
clarkb | jeblair: right, I see it as a tool to help implement proper bottleneck fixes | 21:06 |
jeblair | clarkb: i really don't think it will help | 21:06 |
jeblair | clarkb: you're creating and tuning a parameter that has nothing to do with the systems that are actually running | 21:07 |
clarkb | but it is a parameter that influences everything | 21:07 |
jeblair | clarkb: for instance, it would do nothing to prevent mass simultaneous deletions of nodes, which is an ACTUAL problem | 21:07 |
*** nati_ueno has quit IRC | 21:07 | |
jeblair | (or at least seems to be) | 21:07 |
*** melwitt has quit IRC | 21:08 | |
*** melwitt1 has joined #openstack-infra | 21:08 | |
clarkb | just noticed that the zuul status timers don't do hours properly... | 21:08 |
*** nati_ueno has joined #openstack-infra | 21:08 | |
clarkb | jeblair: but it would reduce the number of nodes that would be deleted together | 21:08 |
jeblair | clarkb: no, the fix that i'm trying to write right now will do that | 21:09 |
jeblair | clarkb: it will delete only one node from a jenkins at a time | 21:09 |
jeblair | clarkb: why would you want to try to fix that another way? | 21:09 |
clarkb | I am not suggesting this as a fix | 21:09 |
jeblair | clarkb: what are you suggesting? | 21:09 |
clarkb | you would still want to fix that particular problem with the change you are writing | 21:09 |
clarkb | jeblair: I am suggesting that we have some way of slowing everything down to usable levels while you write that fix | 21:10 |
*** rfolco has quit IRC | 21:10 | |
clarkb | we are very spiky and the ability to smooth out really big spikes will help in fixing the fallout | 21:10 |
jeblair | clarkb: the fix i want to write will do that? why don't i just go write that instead of something else that won't fix it? | 21:11 |
clarkb | because next week or during icehouse freeze we will run into similar yet different problems | 21:11 |
*** cppcabrera_afk is now known as cppcabrera | 21:14 | |
*** fbo is now known as fbo_away | 21:16 | |
jeblair | mordred, fungi: ping | 21:17 |
mordred | jeblair: pong | 21:17 |
jeblair | mordred: can you clean up jenkins02? | 21:17 |
mordred | jeblair: yes. is there a description of the problem in the scrollback? | 21:18 |
*** vipul is now known as vipul-away | 21:18 | |
jeblair | mordred: yes | 21:18 |
mordred | jeblair: great. I will find it | 21:18 |
jeblair | mordred: thanks | 21:18 |
*** vipul-away is now known as vipul | 21:18 | |
mordred | ttx: next year, can we move the nova FF one week prior? having me be only partially here due to burningman prep is not fantastic | 21:18 |
mordred | jeblair: oh wow. ok. force stop ok yeah? | 21:19 |
jeblair | mordred: yep | 21:19 |
mordred | stopping | 21:20 |
mordred | btw - salt-master has cpu pegged on puppetmaster - I'm going to restart it | 21:20 |
jeblair | mordred: i thought we stopped all the minions? maybe stop the master too. | 21:21 |
mordred | great | 21:21 |
clarkb | we should make a second pass at cleaning up the salt stuff after featurefreeze | 21:22 |
clarkb | I believe the minions are still going crazy after the ssh thing | 21:22 |
clarkb | s/ssh/crypto/ | 21:22 |
jeblair | oh, we didn't stop them? | 21:22 |
*** thomasbiege1 has joined #openstack-infra | 21:22 | |
reed | fungi, jeblair, pleia2: let me know if you think it may work https://review.openstack.org/#/c/42998/ | 21:23 |
clarkb | jeblair: we stopped them by hand, then restarted them then ran the rekey thing in hopes it would make them sane again | 21:24 |
ttx | mordred: next year, you shall scream when I show the schedule on the screen | 21:24 |
clarkb | jeblair: but it didn't. we should probably just disable the minion service on the slaves | 21:24 |
mordred | ttx: yes, I will | 21:24 |
clarkb | ttx: I think he did | 21:25 |
mordred | clarkb: oh, you're right | 21:25 |
mordred | I did | 21:25 |
mordred | I believe I mentioned something like "there's going to be a rush and I'm not going to be much help" if the FF is that week | 21:25 |
*** thomasbiege1 has quit IRC | 21:26 | |
ttx | next year if we separate summit/conf it would happen earlier | 21:26 |
mordred | perfect | 21:26 |
lifeless | mordred: when do you leave for burning man | 21:27 |
lifeless | ? | 21:27 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Add ProviderManager https://review.openstack.org/42973 | 21:27 |
jeblair | clarkb: ^ live-tested | 21:27 |
*** prad_ has quit IRC | 21:27 | |
openstackgerrit | Anita Kuno proposed a change to openstack-infra/config: Creating/adding the openstack/governance repository https://review.openstack.org/43002 | 21:27 |
jeblair | clarkb: i'm basically just going to do the same thing for jenkins now. | 21:27 |
clarkb | jeblair: ok | 21:28 |
clarkb | jeblair: I have only found one minor issue so far | 21:28 |
clarkb | jeblair: but it won't cause any bugs | 21:28 |
anteaya | mordred ^ | 21:29 |
mordred | jeblair: I've stopped jenkins02, and am currently working on deleting devstack slaves that were attached to it | 21:29 |
anteaya | so in addition to this patch (I basically just followed the instructions for stackforge repos) what else do I have to do to create the repo? | 21:29 |
mordred | lifeless: first thing in the morning | 21:30 |
anteaya | do I just create it on my laptop and push it as an empty repo? | 21:30 |
clarkb | lifeless: too soon | 21:30 |
anteaya | giving it a .gitreview file | 21:30 |
lifeless | mordred: ack | 21:30 |
*** alexpilotti has quit IRC | 21:33 | |
mordred | jeblair: ERROR: n/a (HTTP 400) | 21:34 |
mordred | jeblair: is that ^^ a symptom of az1 rate limiting? | 21:34 |
Alex_Gaynor | so trying to access the jenkins pages for some of the running jobs on the zuul status page is resulting in 502s | 21:35 |
jeblair | mordred: not that i'm aware; i don't see current rate limiting errors from nodepool | 21:37 |
mordred | AWESOME | 21:37 |
anteaya | are there tc meeting logs prior to October 2012? this link has October 2012 through to now but not prior: http://eavesdrop.openstack.org/meetings/tc/ | 21:37 |
jeblair | Alex_Gaynor: mordred is working on that | 21:37 |
Alex_Gaynor | jeblair: okey doke (as always if I can help in some way, let me know) | 21:37 |
mordred | jeblair: I'm getting that error a lot from running nova list and nova delete | 21:37 |
mordred | btw - ERROR: n/a (HTTP 400) is a TERRIBLE error message | 21:38 |
*** dprince has quit IRC | 21:38 | |
jeblair | mordred: OverLimit: This request was rate-limited. (HTTP 413) | 21:40 |
mordred | ok | 21:40 |
jeblair | mordred: ^ that's what that looks like (and just happened) | 21:40 |
mordred | fantastic | 21:41 |
*** boris-42 has quit IRC | 21:41 | |
*** cppcabrera has left #openstack-infra | 21:42 | |
mordred | jeblair: I'm not having much luck in deleting the nodes... how important is that part of the step? | 21:44 |
jeblair | mordred: i think you can skip it, nodepool should be able to clean up | 21:47 |
jeblair | mordred: it will be slow about it, which probably isn't a bad thing | 21:47 |
mordred | jeblair: ok. then I'm going to delete the node section from config.xml and restart | 21:47 |
jeblair | mordred: just the devstack nodes | 21:48 |
*** mrmartin has quit IRC | 21:49 | |
*** prad_ has joined #openstack-infra | 21:50 | |
*** AJaeger has quit IRC | 21:51 | |
*** thomasbiege1 has joined #openstack-infra | 21:51 | |
mordred | jeblair: jenkins02 is starting | 21:55 |
mordred | jeblair: and yes - just the devstack nodes were deleted | 21:55 |
*** dina_belova has joined #openstack-infra | 21:55 | |
*** weshay has quit IRC | 21:55 | |
* fungi is caught up on scrollback from lunch and reviewing gate-performance-improving changes as a first priority | 21:57 | |
clarkb | jeblair: woo finally got through that change | 21:58 |
clarkb | jeblair: the only major concern I have is with the default timeout used by the manager code | 21:58 |
pleia2 | oh, my lunch was productive, got to talk to a redhat admin who thinks that for our use case running git daemon as a service makes more sense than xinetd anyway since we're using it so much, feel less bad about writing the init script now ;) | 21:59 |
Alex_Gaynor | So is this how it works every feature freeze? We fix the latest rounds of bottlenecks ? | 21:59 |
clarkb | Alex_Gaynor: yes | 21:59 |
*** dina_belova has quit IRC | 22:00 | |
*** thomasbiege1 has quit IRC | 22:00 | |
clarkb | pleia2: oh good | 22:00 |
mordred | Alex_Gaynor: each time, the feature freeze has been significantly larger than the previous too | 22:00 |
Alex_Gaynor | mordred: sure, that was the underlying premise of my statement, I didn't mean to imply we weren't making progress :) | 22:00 |
clarkb | Alex_Gaynor: the number of changes that go in the week before feature freeze is not only much greater than the previous feature freeze but much greater than the weeks before it | 22:00 |
*** gyee has quit IRC | 22:01 | |
*** markmc has quit IRC | 22:02 | |
notmyname | mordred: I can do the needful this afternoon for the tagging process to get pbr working | 22:02 |
*** rnirmal has quit IRC | 22:02 | |
*** mriedem1 has quit IRC | 22:03 | |
*** markmcclain has quit IRC | 22:03 | |
mordred | notmyname: ok. from my side, I believe we can do that | 22:05 |
notmyname | mordred: here's, IMO, a simple thing I think will make it all work | 22:07 |
*** burt has quit IRC | 22:08 | |
mordred | ooh. I like simple things | 22:08 |
notmyname | mordred: we tag today with 1.9.2 and consume that version number (ie we won't ever "release" a 1.9.2). This will let pbr do the right thing and create version numbers that sort properly | 22:08 |
notmyname | mordred: if we have another minor release, it will be 1.9.3 | 22:08 |
notmyname | mordred: but most likely will be 1.10.0 anyway | 22:09 |
mordred | well... we could do that ... | 22:09 |
mordred | but it will cause a 1.9.2 to be released to tarballs.o.o | 22:09 |
mordred | but I'm ok with that if you are | 22:09 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/zuul: SIGUSR2 logs stack traces for active threads. https://review.openstack.org/42959 | 22:09 |
notmyname | mordred: I don't see that as a problem, but do you have an alternate suggestion? | 22:09 |
clarkb | jeblair: ^ now with documentation | 22:09 |
jeblair | clarkb: just looked at your comment | 22:09 |
mordred | notmyname: tagging 1.9.2-dev - which will not cause a release to be cut | 22:10 |
mordred | and will map closely to your current version in tree | 22:10 |
jeblair | on cleanupServer in providerManager... | 22:10 |
notmyname | mordred: to quote from clay on the pbr patch "Rather than waiting for imminent merge, we really should get a 1.9.2 tag on the origin repo *now* so the git based versioning works in sane fashion for review. I don't really care about 1.9.2-dev which doesn't parse by distutils.version.StrictVersion *anyway*." | 22:10 |
clarkb | jeblair: about the timeout value | 22:10 |
jeblair | clarkb: yeah | 22:10 |
mordred | notmyname: ok. I'm sold by that | 22:11 |
jeblair | clarkb: so the timeout loop is a big loop that runs inside of the thread that is trying to delete the server | 22:11 |
notmyname | mordred: ya, mostly the last line | 22:11 |
notmyname | mordred: and if you haven't you should read his full comment on https://review.openstack.org/#/c/28892/ | 22:11 |
notmyname | mordred: but I think we can go forward with a 1.9.2 tag and then merge the patch | 22:12 |
jeblair | clarkb: inside of that loop, it puts a task on the queue to get the server, and waits for that to complete | 22:12 |
jeblair | clarkb: so i don't think anything about the timeout value changes | 22:12 |
jeblair | clarkb: overall, we still wait, er, an hour for the server to be deleted (in a thread that is pretty much dedicated to trying to delete the server) | 22:12 |
jeblair | clarkb: but that shouldn't affect anything else, other than every 2 seconds, that thread asks the provider thread to check on the server | 22:13 |
mordred | notmyname: reading now | 22:13 |
notmyname | mordred: so I think that leaves it here: I'll approve/merge the pbr patch when I see the 1.9.2 tag on master upstream | 22:13 |
jeblair | clarkb: (if a lot of servers are being slow to be deleted, everything else about that provider will be slow too, but i think that's desirable. mostly.) | 22:13 |
*** ^demon has quit IRC | 22:13 | |
*** gyee has joined #openstack-infra | 22:14 | |
*** ^d has joined #openstack-infra | 22:14 | |
*** pblaho has quit IRC | 22:14 | |
clarkb | jeblair: will it not prevent other tasks from running? for some reason I thought it would, but that function is called from outside the manager thread and does the poll loop there | 22:14 |
*** ^d has quit IRC | 22:14 | |
*** ^d has joined #openstack-infra | 22:14 | |
clarkb | jeblair: so I think I was concerned about nothing | 22:14 |
clarkb | The running of the delete task runs in the manager thread which is quick | 22:15 |
clarkb | jeblair: I will update my vote | 22:15 |
jeblair | clarkb: exactly, all of those methods just put a task on the manager's queue, running those tasks happens in the dedicated thread, and all the tasks should be simple 1:1 nova api calls | 22:15 |
clarkb | jeblair: done | 22:16 |
clarkb | jeblair: pleia2 http://logs.openstack.org/93/42593/4/gate/gate-grenade-devstack-vm/6de9e45/logs/devstack-gate-setup-workspace-new.txt | 22:18 |
anteaya | ttx: when you are around but not in a meeting, here is my first attempt: https://review.openstack.org/#/c/43002/ | 22:18 |
*** ^d has quit IRC | 22:19 | |
clarkb | jeblair: pleia2: I think that may be replication related | 22:19 |
clarkb | though I am not sure because I would've expected git to make that more atomic | 22:20 |
mordred | notmyname: ok. yes. I think it's a well written comment, and I appreciate the willingness to go along. | 22:20 |
*** dkliban has quit IRC | 22:20 | |
mordred | notmyname: do you want me to cut a tag? or do you want to do it? | 22:20 |
notmyname | mordred: I can't make tags for swift (unless that's changed) | 22:20 |
notmyname | mordred: if I have the perms, I'd be happy to do it | 22:21 |
jeblair | clarkb: i agree, it wfm locally | 22:21 |
clarkb | jeblair: pleia2 http://paste.openstack.org/show/44689/ is what I see in the apache log | 22:21 |
mordred | ttx: you around? | 22:21 |
jeblair | clarkb: what a strange error | 22:22 |
clarkb | jeblair: ya, file exists though and has timestamps from days in the past | 22:22 |
pleia2 | that is odd, it's just ssh that replicates so it shouldn't be doing something like deleting it first (huh, would it?) | 22:22 |
notmyname | mordred: after midnight in paris right now.. | 22:22 |
mordred | notmyname: ok. I'll just do it | 22:23 |
jeblair | notmyname, mordred: he's not in that timezone | 22:23 |
notmyname | ah, ok then :-) | 22:23 |
clarkb | pleia2: I don't expect it to and the mod time on that dir is from the 13th | 22:23 |
notmyname | mordred: ok. who has permission to push tags? with the change to pbr is that changing? | 22:23 |
notmyname | jeblair: clarkb: ^ ? | 22:23 |
mordred | notmyname: no - it should be still ttx since it's a server project | 22:23 |
notmyname | ok | 22:24 |
mordred | notmyname: the main change is that it won't need to commit to change the version anymore | 22:24 |
mordred | notmyname: so the chances of your milestone-proposed branch being any different than master are _REALLY_ low :) | 22:24 |
openstackgerrit | Jim Branen proposed a change to openstack/requirements: Allow use of hp3parclient 2.0 https://review.openstack.org/42991 | 22:25 |
mordred | notmyname: 5c6f0015d56478108a623cf65641a39ea91fc2b5 work for you? | 22:25 |
notmyname | mordred: confirm. 5c6f0015d56478108a623cf65641a39ea91fc2b5 | 22:25 |
*** changbl has quit IRC | 22:26 | |
mordred | notmyname: done | 22:26 |
notmyname | mordred: thanks | 22:27 |
notmyname | mordred: final tests on pbr branch | 22:27 |
notmyname | rd | 22:27 |
clarkb | I wonder | 22:29 |
*** lbragstad has quit IRC | 22:29 | |
clarkb | jeblair: pleia2 so apache is allowed to read the pack and idx files directly without talking to the git http thing | 22:33 |
clarkb | jeblair: pleia2 and that is what appears to have failed | 22:33 |
*** jungleboyj has joined #openstack-infra | 22:33 | |
jungleboyj | Can anyone answer questions about how the Transifex Translations are being automatically done? | 22:34 |
clarkb | pleia2: any chance selinux is involved? | 22:34 |
clarkb | jungleboyj: yes I can, whats up? | 22:34 |
jungleboyj | clarkb: Awesome. Thank you! | 22:35 |
*** jhesketh has joined #openstack-infra | 22:35 | |
pleia2 | clarkb: good question, it shouldn't since everything in /var/lib/git should have the right selinux magic to serve it up to httpd | 22:35 |
pleia2 | clarkb: but this is getting quite far out of my git expertise to understand what is happening git-wise (pack and idx files?) | 22:36 |
clarkb | pleia2: in .git/objects/pack | 22:36 |
jungleboyj | clarkb: I am working on Cinder and noticed that we had some English strings that were coming out wrong. When I look at the .po files for en_US I see that it has a msgstr defined that is either incomplete or altogether wrong. Trying to figure out the right way to fix that. I had gone through and removed all the msgstrs (msgstr="") since it doesn't make sense to translate English to English but now I see the latest | 22:37 |
mordred | jungleboyj: can you define "coming out wrong" ? | 22:37 |
clarkb | pleia2: the pack files contain a bunch of object files all compressed together, I believe the idx files tell git where to look in that compressed blob for specific objects | 22:37 |
clarkb | pleia2: that particular file has been in place since the 13th though | 22:38 |
pleia2 | clarkb: I see, so that doesn't sound to me like anything strange that selinux would have a problem with inside /var/lib/git/ | 22:38 |
clarkb | jungleboyj: can you link to a particular example in a proposed change? | 22:38 |
clarkb | jungleboyj: and I think the way i18n works it does make sense to translate English to English depending on the locale :) | 22:39 |
jungleboyj | mordred: I had the string _("Failure creating image %s. Error %s", vol_id, error) or something like that. In the .po the msgstr for that was just "Failure creating image" and that was all that was printed to the logs. | 22:39 |
lifeless | bad translator, no cookie | 22:39 |
*** apcruz has quit IRC | 22:40 | |
*** sandywalsh has quit IRC | 22:40 | |
* clarkb updates cinder repo | 22:40 | |
*** shardy is now known as shardy_afk | 22:40 | |
clarkb | pleia2: the normal permissions all look fine. I don't know why else apache would fail to see a dir | 22:41 |
*** nijaba has quit IRC | 22:42 | |
mgagne | With JJB, has anyone had the great idea to use parameterized jobs in job-group? | 22:42 |
jungleboyj | clarkb: Here is the specific example: https://review.openstack.org/#/c/40948/2/cinder/locale/en_US/LC_MESSAGES/cinder.po Line 583 | 22:42 |
pleia2 | clarkb: /var/log/audit.log is where selinux logs violations, so you can look there | 22:43 |
clarkb | pleia2: thanks | 22:43 |
jungleboyj | msgid "Failed to copy image to volume: %(reason)s" | 22:43 |
jungleboyj | msgstr "Failed to copy image to volume" | 22:43 |
clarkb | jungleboyj: we treat transifex as the source of truth for those msgstrs | 22:45 |
clarkb | jungleboyj: the old string there may have been a casualty of babel doing a fuzzy translation and not understanding the %(reason)s; I am not actually sure there | 22:46 |
jungleboyj | clarkb: Ok, well, in the case of Cinder the msgstrs are incomplete or wrong. Need to figure out how to fix it. Saw the same thing in other projects too. | 22:46 |
clarkb | jungleboyj: but for patchset 1 the removal of the msgstr would've come from transifex or the update_catalog that we run prior to updating from transifex | 22:46 |
clarkb | jungleboyj: yeah, things were wrong at one point because babel allows fuzzy translations by default, we have since disabled that. Let me get you a link to the script that proposes these changes | 22:47 |
fungi | jungleboyj: i have seen translations from the "c" source language to en get extremely stale because nobody is checking them for some projects, so eventually the source strings grow different numbers of format string parameters than the obsolete en versions which should normally be identical | 22:47 |
clarkb | jungleboyj: https://github.com/openstack-infra/config/blob/master/modules/jenkins/files/slave_scripts/propose_translation_update.sh | 22:48 |
clarkb | jungleboyj: https://github.com/openstack-infra/config/blob/master/modules/jenkins/files/slave_scripts/propose_translation_update.sh#L46-L55 is the most relevant section. I wonder if this is fallout from when we didn't prevent fuzzy matches | 22:49 |
fungi | jungleboyj: i did a fairly massive pass through nova some months back to clean up english translations (which basically resulted in me duplicating the source strings) | 22:49 |
fungi | i'm not familiar with what the impact from fuzzy matches might be though | 22:50 |
clarkb | jungleboyj: from git blame http://paste.openstack.org/show/44691/ that was long enough ago to be when fuzzy matching was allowed so I think that is the issue | 22:50 |
*** mikal has joined #openstack-infra | 22:51 | |
clarkb | fungi: jungleboyj: we may want to reseed them all with non fuzzy strings based on what is in transifex to get past the cruft that babel let through initially | 22:51 |
*** mikal has quit IRC | 22:52 | |
*** prad_ has quit IRC | 22:52 | |
fungi | i take it there's no way to identify a fuzzy vs. non-fuzzy translation of a string solely from the pofile | 22:53 |
*** sandywalsh has joined #openstack-infra | 22:53 | |
notmyname | mordred: patch merged (merging) and email sent to ML | 22:53 |
mordred | notmyname: woot! | 22:53 |
notmyname | mordred: thanks for your help on it | 22:53 |
mordred | notmyname: thanks for yours! I believe pbr is much better today than it was originally due to addressing your concerns | 22:54 |
*** nijaba has joined #openstack-infra | 22:54 | |
clarkb | fungi: there is the # fuzzy comment, but I think babel may not remove those when it has a non fuzzy translation | 22:54 |
clarkb | fungi: which makes it a little painful to work with | 22:54 |
jungleboyj | clarkb: So, let me make sure that I understand. There are some old en translations that didn't happen properly because fuzzy matching was allowed. | 22:54 |
*** ftcjeff has quit IRC | 22:55 | |
*** markmcclain has joined #openstack-infra | 22:55 | |
notmyname | mordred: in my email I said, "If you have any issues, just ask Monty. Preferably after 10pm on Tuesdays" ;-) | 22:55 |
*** michchap has joined #openstack-infra | 22:55 | |
mordred | clarkb: speaking of i18n, we should get swift on the transifex bandwagon - they already use babel and everything | 22:55 |
fungi | clarkb: right. unless we actually expect un-fuzzed translations to result in the #fuzzy comment also getting removed, no way to tell just from the translated string itself | 22:55 |
mordred | clarkb: and their translations are in top level like I sort of want everyone else's to be :) | 22:55 |
mordred | notmyname: I look forward to those questions :) | 22:56 |
clarkb | jungleboyj: correct | 22:56 |
jungleboyj | clarkb: If that is the case, how can I get fixes for those strings that got fuzzed up. | 22:56 |
clarkb | jungleboyj: you can translate them in transifex, or I think it is still possible to propose a patch that fixes them, but that may not be the case. I will have to double check that | 22:57 |
openstackgerrit | Elizabeth Krumbach Joseph proposed a change to openstack-infra/config: Swap git daemon in xinetd for service https://review.openstack.org/43012 | 22:57 |
*** mkirk_ has quit IRC | 22:58 | |
jungleboyj | clarkb: Forgive all the noob questions. How do I translate them in transifex? | 22:58 |
clarkb | jungleboyj: https://github.com/openstack-infra/config/blob/master/modules/jenkins/files/slave_scripts/upstream_translation_update.sh#L42-L53 we still push local git contents back to transifex so you can propose a fix in git if you like | 22:58 |
*** mkirk_ has joined #openstack-infra | 22:58 | |
clarkb | jungleboyj: I have actually never done it :) but I believe you log into https://transifex.com find the cinder project and then you can either update strings in your browser or use the tx tool | 22:59 |
*** gordc has left #openstack-infra | 22:59 | |
jungleboyj | clarkb: Ok. | 22:59 |
jungleboyj | clarkb: FYI, the pot file doesn't have any msgstrs defined in it. Will changing the pos make a difference? | 23:00 |
clarkb | the pot file is a template, it should not have any msgstrs in it | 23:00 |
clarkb | the .po files contain the actual translations | 23:00 |
*** rcleere has quit IRC | 23:01 | |
openstackgerrit | Elizabeth Krumbach Joseph proposed a change to openstack-infra/config: Swap git daemon in xinetd for service https://review.openstack.org/43012 | 23:01 |
jungleboyj | clarkb: That is what I thought. So, I would need to actually put the changes in the POs. | 23:01 |
*** sgviking has quit IRC | 23:02 | |
*** dkliban has joined #openstack-infra | 23:02 | |
clarkb | jeblair: pleia2 mordred https://jenkins01.openstack.org/job/gate-neutron-pep8/434/console ugh. I think centos and ubuntu must be sufficiently different that this doesn't work quite right. Or something replication related | 23:02 |
clarkb | jungleboyj: yup | 23:02 |
jungleboyj | clarkb: Once I do that, is there something I need to do to get a new transifex import to happen? | 23:03 |
*** jpich has quit IRC | 23:03 | |
clarkb | jungleboyj: using transifex's tx tool you can pull the pos and push them back to transifex if you want to use their workflow | 23:03 |
clarkb | jungleboyj: we import from transifex once a day per project | 23:03 |
clarkb | so you don't need anything special it should just happen | 23:03 |
jungleboyj | clarkb: Ok, and you don't recommend clearing out all the english msgstrs ? Just fix the ones that are wrong? | 23:04 |
clarkb | jungleboyj: right. as en_US is different than C | 23:04 |
jeblair | clarkb: yeah, three differences: replication over ssh, operating system, git version | 23:04 |
clarkb | and different than en_GB and so on | 23:04 |
jungleboyj | clarkb: Ok. Thank you so much for the help! | 23:04 |
pleia2 | clarkb: I think it's a rewrite problem! pulling that file from /cgit works, but not the direct git.openstack.org/openstack/neutron/... location | 23:05 |
clarkb | pleia2: interesting | 23:05 |
openstackgerrit | Mathieu Gagné proposed a change to openstack-infra/jenkins-job-builder: Job-specific subst. in a job group's job list https://review.openstack.org/43013 | 23:05 |
*** mrodden has quit IRC | 23:06 | |
clarkb | pleia2: /cgit will be served by cgit though right? | 23:07 |
clarkb | pleia2: so possibly completely different processes | 23:07 |
pleia2 | clarkb: right | 23:07 |
pleia2 | but at least the files do exist and are servable by apache somewhere | 23:07 |
pleia2 | might be right about git version weirdness | 23:08 |
jeblair | clarkb: maybe check if that file exists on disk? | 23:08 |
pleia2 | cgit is serving it | 23:08 |
jeblair | pleia2: could be cached | 23:09 |
pleia2 | ah | 23:09 |
jeblair | pleia2: if it exists on disk and apache does not serve it, it's as you say, a rewrite problem | 23:09 |
jeblair | pleia2: if not, we're back to where we were | 23:09 |
clarkb | jeblair: the files do exist on disk, at least the ones that I have seen | 23:09 |
clarkb | s/seen/looked at/ | 23:09 |
*** sgviking has joined #openstack-infra | 23:09 | |
jeblair | clarkb: does openstack/neutron/objects/pack/pack-de6d5d31c8684408cf90392a88fb0176b4ca8f01.idx ? | 23:10 |
clarkb | https://github.com/openstack-infra/config/blob/master/modules/cgit/templates/git.vhost.erb#L19-L30 for those following along. | 23:10 |
clarkb | jeblair: checking | 23:10 |
clarkb | jeblair: yes -r--r--r--. 1 cgit cgit 4488 Aug 20 06:18 pack-de6d5d31c8684408cf90392a88fb0176b4ca8f01.idx | 23:11 |
jeblair | pleia2: sounds like you're on to something | 23:12 |
clarkb | jeblair: pleia2 does the RewriteRule and ScriptAlias conflict? | 23:12 |
pleia2 | hmm | 23:12 |
clarkb | oh you know | 23:13 |
*** jerryz has joined #openstack-infra | 23:13 | |
clarkb | actually no that can't be it | 23:13 |
pleia2 | the regex for pack|idx seems right | 23:14 |
clarkb | pleia2: yeah that comes straight from the git http man page iirc | 23:14 |
*** dims has quit IRC | 23:15 | |
*** ken1ohmichi has joined #openstack-infra | 23:18 | |
*** ryanpetrello has quit IRC | 23:20 | |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Add JenkinsManager https://review.openstack.org/43014 | 23:21 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Add an ssh check periodic task https://review.openstack.org/43015 | 23:21 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Change credentials-id parameter in config file https://review.openstack.org/43016 | 23:21 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Reduce timeout when waiting for server deletion https://review.openstack.org/43017 | 23:21 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Add ProviderManager https://review.openstack.org/42973 | 23:21 |
mgagne | which repo should I clone to test? I was able to clone stackforge/puppet-glance and openstack/python-heatclient without problem | 23:21 |
clarkb | mgagne: neutron and nova appear to currently be failing fairly frequently according to the logs | 23:21 |
mgagne | clarkb: is it therefore an intermittent issue? | 23:22 |
pleia2 | clarkb: so can it get to some pack-de6d5d31c8684408cf90392a88fb0176b4ca8f01.idx files? | 23:23 |
clarkb | mgagne: yes, it seems to be intermittent | 23:23 |
pleia2 | er, .idx files | 23:23 |
clarkb | pleia2: I am not sure yet, actually let me try getting that file directly | 23:23 |
clarkb | pleia2: mgagne: this may in part depend on the local state of your repo | 23:23 |
mgagne | clarkb: I'm cloning from scratch, are tests fetching and checking out a specific ref instead? | 23:24 |
clarkb | mgagne: tests will clone if the repo doesn't already exist otherwise they will do a remote update to fetch what they are missing | 23:25 |
clarkb | pleia2: directly fetching one of those neutron files with wget fails. This must've been what you tested before | 23:25 |
clarkb | pleia2: for whatever reason I thought you tested with a git clone which does work | 23:25 |
pleia2 | clarkb: I just tested via web browser | 23:26 |
clarkb | pleia2: looking at the vhost cgit will serve anything not under .*/objects because ScriptAlias / /usr/libexec/git-core/git-http-backend/ will never be used as we rewrite / to /cgit | 23:27 |
clarkb | pleia2: oh but we rewrite ^/$ to /cgit so anything like /openstack/foo should go to git-http-backend right? | 23:28 |
pleia2 | clarkb: yeah, I think those rewrite things are not for cgit | 23:28 |
*** mrodden has joined #openstack-infra | 23:28 | |
pleia2 | clarkb: I think they are just for git-http-backend | 23:29 |
pleia2 | fungi added them in a change to support git-http-backend | 23:29 |
*** changbl has joined #openstack-infra | 23:29 | |
*** dims has joined #openstack-infra | 23:30 | |
*** HenryG has joined #openstack-infra | 23:31 | |
jeblair | clarkb: ^ the new stack of nodepool changes is in production | 23:32 |
fungi | yup | 23:32 |
jeblair | clarkb: (i did reduce that timeout, btw, because i think it was ridiculously large) | 23:32 |
fungi | from an hour to...? | 23:33 |
*** ken1ohmichi has quit IRC | 23:33 | |
jeblair | 10 mins | 23:33 |
* fungi nods. sounds sane | 23:33 | |
jeblair | which is just, well, large. :) | 23:33 |
fungi | s/ridiculously// | 23:33 |
pleia2 | clarkb: confirmed, I don't have any of the pack rewrite rules in my test instance and I can download packs via cgit (hi fungi!) | 23:33 |
clarkb | pleia2: I think it may be an selinux thing | 23:34 |
clarkb | pleia2: httpd itself will access the git files when they hit the AliasMatches | 23:35 |
* fungi retries to grok where the ^/$ rewrite could conflict at all with the git-http-backend cgi scriptalias | 23:35 | |
clarkb | but httpd runs under a different selinux type | 23:35 |
clarkb | I am very quickly learning about selinux types so that I can test | 23:35 |
jeblair | selinux would show that error | 23:35 |
jeblair | clarkb: look in audit.olg | 23:35 |
jeblair | log | 23:35 |
clarkb | audit.log was a pain to look at ... | 23:36 |
pleia2 | hah | 23:36 |
pleia2 | can grep for git probably | 23:36 |
clarkb | but I think I just get annoyed when there are no timestamps. I will look again | 23:36 |
fungi | clarkb: well, there are timestamps, you just need to learn to read unixtime directly ;) | 23:37 |
clarkb | I don't see any AVC messages in audit.log | 23:38 |
mgagne | clarkb: I think it has to do with the way packs are generated. Could be that they are generated on-the-fly and there are contention issues on git.o.o due to the high volume of clone, fetch, etc. | 23:38 |
mgagne | clarkb: https://www.kernel.org/pub//software/scm/git/docs/git-update-server-info.html | 23:39 |
clarkb | mgagne: it seems to know where the files are though, it just can't get them | 23:39 |
mgagne | clarkb: a curl returns the file? Could it be a caching issue? Or is it a timing issue, where by the time you test the existence of the file, it got generated? Trying to figure out what has been tried/tested. | 23:41 |
*** rfolco has joined #openstack-infra | 23:42 | |
clarkb | mgagne: wgetting the file that failed to fetch on a jenkins slave fails, but the file is on disk and has been there for at least hours | 23:42 |
clarkb | mgagne: https://jenkins01.openstack.org/job/gate-neutron-pep8/434/console has a list of things that can't be fetched | 23:42 |
clarkb | mgagne: however changing the root of the url to /cgit you are able to get the file | 23:43 |
clarkb | mgagne: so it is only when apache attempts direct access via https://github.com/openstack-infra/config/blob/master/modules/cgit/templates/git.vhost.erb#L28-L29 that it fails | 23:43 |
jeblair | further evidence the scriptalias is not working: the actual apache error log message says "File does not exist: /var/lib/git/openstack/neutron" | 23:44 |
jeblair | and that _doesn't_ exist | 23:44 |
jeblair | because it's /var/lib/git/openstack/neutron.git | 23:44 |
jeblair | so presumably the scriptalias directive to use the smart http server would normally translate that, | 23:44 |
clarkb | oh that may be it | 23:44 |
pleia2 | oh wow, right | 23:45 |
jeblair | but it's not, so apache is just trying to serve a simple file | 23:45 |
pleia2 | https://git.openstack.org/openstack/neutron.git/objects/pack/pack-8dd2daf4e48bc336b39e06bcb5612bdc2c7bec7c.idx works! | 23:46 |
pleia2 | nice one jeblair | 23:46 |
jeblair | but looking at that, i think we're trying to get apache to just serve the files | 23:46 |
jeblair | it looks like the aliasmatch directives are intended to take precedence, and then scriptalias catches the rest | 23:47 |
mrodden | any idea why i'm seeing this in my tox runs? http://paste.openstack.org/show/44692/ | 23:47 |
mrodden | cannot import setuptools | 23:47 |
clarkb | jeblair: the config comes from https://www.kernel.org/pub/software/scm/git/docs/git-http-backend.html | 23:47 |
mrodden | but it actually installs setuptools 1.0 above... | 23:47 |
jeblair | clarkb: yeah, and it's the same as on review | 23:47 |
*** mriedem has joined #openstack-infra | 23:48 | |
jeblair | clarkb: what if the git smart http server is providing the wrong urls? | 23:48 |
jeblair | (git version difference) | 23:48 |
clarkb | jeblair: could be | 23:49 |
mgagne | GIT_PROJECT_ROOT has a trailing slash | 23:49 |
mgagne | could it be? | 23:49 |
clarkb | mrodden: the uninstall of distribute that happens first is causing the problem I think | 23:49 |
mgagne | doc doesn't show/use trailing slash | 23:49 |
clarkb | mrodden: try updating tox? | 23:50 |
pleia2 | mgagne: perhaps, maybe if it has a trailing slash it does assume neutron/ and won't expand to neutron.git/ | 23:50 |
mrodden | clarkb: ok i'm on 1.4 | 23:50 |
mrodden | 1.4.2 i think | 23:50 |
clarkb | there is a trailing slash on review.o.o, but I can go ahead and update it on git.o.o and restart apache to check | 23:51 |
mrodden | wow they have 1.6.0 out now... | 23:51 |
clarkb | mrodden: there has been a lot of churn around setuptools and distribute merging | 23:51 |
clarkb | mrodden: so there are a bunch of updates from tools | 23:51 |
fungi | well, we have trailing / on GIT_PROJECT_ROOT for the gerrit servers and zuul in fact | 23:51 |
*** UtahDave has quit IRC | 23:51 | |
mrodden | crazy | 23:51 |
openstackgerrit | Joshua Hesketh proposed a change to openstack-infra/zuul: Move gerrit specific result actions under reporter https://review.openstack.org/42644 | 23:52 |
openstackgerrit | Joshua Hesketh proposed a change to openstack-infra/zuul: Add support for emailing results via SMTP https://review.openstack.org/42645 | 23:52 |
openstackgerrit | Joshua Hesketh proposed a change to openstack-infra/zuul: Separate reporters from triggers https://review.openstack.org/42643 | 23:52 |
clarkb | fungi: yeah but this is the only server with this version of git | 23:52 |
clarkb | anyways restarting apache now | 23:52 |
clarkb | didn't help | 23:53 |
pleia2 | nope :\ | 23:53 |
jeblair | uh, so there are very few references to pack files in the gerrit logs | 23:54 |
clarkb | jeblair: maybe it isn't working there either? | 23:54 |
mordred | clarkb: oh - interesting | 23:54 |
jeblair | some of them are to '.git' dirs, and they work, some omit '.git' and are 404s | 23:54 |
pleia2 | same thing here | 23:55 |
jeblair | by very few, i mean 1 client this week. | 23:55 |
clarkb | warning hack: what if we just symlink openstack/foo to openstack/foo.git on disk? | 23:55 |
clarkb | and handle both cases? | 23:55 |
pleia2 | clarkb: it hurts, but if we do we can do it in the jeepyb script | 23:56 |
jeblair | clarkb: maybe to stop the bleeding? but we really should figure out the problem. | 23:57 |
clarkb | jeblair: I agree | 23:57 |
clarkb | let me add a neutron symlink then try grabbing that idx file again | 23:57 |
clarkb | that will at least tell us if this is the only problem | 23:58 |
* pleia2 nods | 23:58 | |
jeblair | (i don't think we should add it to jeepyb, (unless we decide it's the actual solution) we'll never fix it) | 23:58 |
pleia2 | jeblair: ah, ok | 23:58 |
jeblair | mordred: i forgot a step earlier: set the nodes to deleted in nodepool | 23:59 |
jeblair | i'll do that now | 23:59 |
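
The queue-and-poll arrangement that jeblair and clarkb discuss between 22:12 and 22:15 can be sketched roughly as below. This is a minimal illustration, not nodepool's actual code: `Task`, `TaskManager`, `wait_for_deletion`, and the `get_server` callable are hypothetical names, and the provider API call is stood in for by whatever function the caller passes. The point it shows is that the caller's thread owns the poll loop and the sleeping, while the manager thread only ever runs one short task at a time, so a slow deletion does not block other tasks beyond its periodic "check on the server" request.

```python
import queue
import threading
import time


class Task:
    """One short unit of work to be run on the manager thread (hypothetical)."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args
        self.result = None
        self.error = None
        self._done = threading.Event()

    def run(self):
        # Executed on the manager thread; records the outcome for the caller.
        try:
            self.result = self.func(*self.args)
        except Exception as exc:
            self.error = exc
        finally:
            self._done.set()

    def wait(self):
        # Executed on the caller's thread; blocks until the task has run.
        self._done.wait()
        if self.error is not None:
            raise self.error
        return self.result


class TaskManager(threading.Thread):
    """Dedicated thread that runs queued tasks one at a time (hypothetical)."""

    def __init__(self):
        super().__init__(daemon=True)
        self._queue = queue.Queue()

    def run(self):
        while True:
            task = self._queue.get()
            if task is None:  # sentinel: stop the thread
                break
            task.run()

    def submit(self, func, *args):
        # Put a task on the queue and wait for its result.
        task = Task(func, *args)
        self._queue.put(task)
        return task.wait()

    def stop(self):
        self._queue.put(None)


def wait_for_deletion(manager, get_server, server_id, timeout=600, interval=2):
    """Poll from the caller's thread until get_server returns None.

    Assumes get_server(server_id) returns None once the server is gone.
    The defaults mirror the 10 minute timeout and 2 second interval
    mentioned in the discussion.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if manager.submit(get_server, server_id) is None:
            return True
        time.sleep(interval)  # sleeping happens here, not on the manager thread
    return False
```

With this shape, many callers can each be waiting on a deletion at once; the manager thread's queue only sees their individual lookups, interleaved with everyone else's tasks.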
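
The stale en_US entries jungleboyj describes (a `msgid` containing `%(reason)s` whose `msgstr` has dropped it) could be hunted down with a small check like the one below. This is only a sketch under stated assumptions: `missing_placeholders` and `iter_entries` are hypothetical helpers, the parser handles only single-line `msgid`/`msgstr` pairs (real .po files also use multi-line strings and plural forms, for which a proper library is the better choice), and only named `%(name)s`-style placeholders are compared.

```python
import re

# Named Python %-format placeholders such as %(reason)s or %(count)d.
NAMED = re.compile(r"%\((\w+)\)[-#0 +]*\d*(?:\.\d+)?[diouxXeEfFgGcrs]")


def missing_placeholders(msgid, msgstr):
    """Return placeholder names present in msgid but absent from msgstr.

    An empty msgstr means "untranslated" and is not treated as an error.
    """
    if not msgstr:
        return set()
    return set(NAMED.findall(msgid)) - set(NAMED.findall(msgstr))


def iter_entries(po_text):
    """Yield (msgid, msgstr) pairs for single-line entries only (simplified)."""
    msgid = None
    for line in po_text.splitlines():
        line = line.strip()
        if line.startswith('msgid "') and line.endswith('"'):
            msgid = line[len('msgid "'):-1]
        elif line.startswith('msgstr "') and line.endswith('"') and msgid is not None:
            yield msgid, line[len('msgstr "'):-1]
            msgid = None


if __name__ == "__main__":
    sample = (
        'msgid "Failed to copy image to volume: %(reason)s"\n'
        'msgstr "Failed to copy image to volume"\n'
    )
    for msgid, msgstr in iter_entries(sample):
        lost = missing_placeholders(msgid, msgstr)
        if lost:
            print(f"{msgid!r}: msgstr is missing {sorted(lost)}")
```

Run against the Cinder example quoted in the log, the check flags `reason` as missing, which matches the symptom of the error detail never reaching the logs.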