Wednesday, 2014-04-02

*** radsy has joined #tripleo00:00
lifelessslagle: one thing I'd like to do00:01
lifelessslagle: like source-repositories factors out 'things to download'00:01
lifelessslagle: I'd love a declarative 'things to install' in elements00:01
SpamapSlifeless: yes00:01
slaglelifeless: yes00:01
lifelessslagle: so we can scan and do one big install run00:01
SpamapSlifeless: I started working on that at one point00:01
GheRivero+100:01
slagleand a things to uninstall at the end00:01
SpamapSlifeless: we can also have two of them, dev and runtime00:01
lifelessgreghaynes: reviewed https://review.openstack.org/#/c/83675/ rev 600:01
SpamapSor rather, build / runtime00:01
SpamapSso build deps can be pulled out at the end00:02
lifelessSpamapS: three perhaps00:02
lifelessSpamapS: runtime; buildtime; testtime00:02
SpamapSwe don't really need python-dev et. al on our servers00:02
greghayneslifeless: ty00:02
SpamapSor gcc for that matter.. or.. do we..00:02
* SpamapS shakes fist at cffi00:02
lifelessSpamapS: we do because cffi and packaging fail00:03
lifelessSpamapS: but yes00:03
lifelessnote that by testtime I mean 'things where we test our code', not the CI->glance->deploy pipeline tests.00:03
lifelessthose are separate, obviously.00:03
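The declarative 'things to install' idea above (scan every element, do one big install run, pull build deps back out at the end) could be sketched roughly as follows. This is only an illustration under assumptions: the per-element dictionary layout, the function names and the package lists are hypothetical and are not taken from diskimage-builder or tripleo-image-elements.

```python
from collections import defaultdict

# The three categories suggested in the discussion.
PHASES = ("runtime", "buildtime", "testtime")


def collect_packages(elements):
    """Merge per-element package declarations into one sorted list per phase."""
    merged = defaultdict(set)
    for name, declared in elements.items():
        for phase, packages in declared.items():
            if phase not in PHASES:
                raise ValueError("element %s: unknown phase %r" % (name, phase))
            merged[phase].update(packages)
    return {phase: sorted(merged[phase]) for phase in PHASES}


def removable_after_build(merged):
    """Build-only packages: anything also needed at runtime must stay."""
    return sorted(set(merged["buildtime"]) - set(merged["runtime"]))


# Hypothetical declarations; the second element keeps python-dev at runtime,
# as a cffi-style dependency would.
elements = {
    "element-a": {"runtime": ["libvirt-bin"], "buildtime": ["gcc", "python-dev"]},
    "element-b": {"runtime": ["python-dev"], "buildtime": ["gcc"], "testtime": ["tox"]},
}
merged = collect_packages(elements)
to_remove = removable_after_build(merged)
```

With declarations like these, the installer can make a single install call per phase, and only gcc would be uninstalled at the end because python-dev is still declared as a runtime need.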
lifelessgreghaynes: 7 pushed up while I was reviewing - sorry00:04
greghaynesis np, forgot about the stackname deal00:04
greghayneser, forgot about the cluster name00:04
lifelessgreghaynes: as for what next - I think the big arc is something like this - get it working, get it in CI, where working means '3 node control planes work'00:05
lifelessStevenK: reminds me - ping - I'd love it if you broadened your hacking slightly to include heterogeneous VM descriptions00:05
lifelessStevenK: so we can make the hypervisors smaller00:05
greghaynesDoes 3 node control planes work mean with upgrading?00:06
lifelessgreghaynes: we don't test upgrade in CI yet00:06
lifelessgreghaynes: the graceful upgrade arc also needs to be pushed on00:06
lifelessgreghaynes: and there are cross-arc deps - like, hard to test graceful works properly without a cluster to graceful upgrade.00:07
lifelessgreghaynes: and - hard to have a cluster without graceful deploys00:07
greghaynestrue. Such dependencies00:07
greghaynesok, I'll play with merge.py and the other things preventing my stack from reaching CREATE_COMPLETE when deploying to controlscale > 1.00:08
lifelessyeah, CREATE_COMPLETE is the first step00:10
*** geerdest has quit IRC00:10
derekhlifeless: we got green jobs00:11
lifelessfuckyeah00:12
lifelessand one rather long backlog :)00:12
derekhyup, testenvs will be busy00:12
*** sdake_1 has quit IRC00:15
*** rpodolyaka1 has joined #tripleo00:19
slaglelifeless: so my plan for stable branches is to create an "icehouse" branch for tie, tht, t-inc, and tuskar. and document on the ReleaseManagement page that for doing releases for tie/tht/tuskar from the icehouse branch, you need to add an additional .[0-9] to the tag you create for the version00:23
slaglelifeless: does that sound ok?00:23
*** CaptTofu has joined #tripleo00:23
*** rpodolyaka1 has quit IRC00:24
lifelessslagle: huh00:24
lifelessslagle: let me dig up the thread where we discussed this00:24
lifelessslagle: versions should still be x.y.z right ?00:26
*** matsuhashi has joined #tripleo00:26
lifelessslagle: or are you proposing x.y.z.a ?00:27
slaglelifeless: x.y.z.a, or you wouldn't be able to upgrade and stay on stable00:27
slagleunless we say that you always bump .Y when releasing from master00:27
slagleand we only use .Z for releasing from the stable branch00:28
*** CaptTofu has quit IRC00:28
*** derekh has quit IRC00:28
xuhaiweilifeless: when running devtest_seed.sh, it asks for the root password many times, is that normal?00:28
lifelessxuhaiwei: no00:28
*** openstackgerrit has joined #tripleo00:30
lifelessslagle: so - I have an alternative proposal00:37
lifelessslagle: staying to x.y.z is important - see the semver docs in pbr00:37
lifelessslagle: if we make sure the next release of tie/tht/tuskar increments the y of x.y.z00:39
lifelessslagle: then the stable branch can increment z indefinitely00:39
lifelessslagle: t-inc doesn't release00:39
lifelessslagle: so it can just create a branch00:39
*** eguz has joined #tripleo00:39
slaglelifeless: ok, that works for me00:39
slagleshould be simpler too. i can doc that on the ReleaseManagement wiki00:40
slagleand yes, no release for t-inc :).00:41
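The scheme agreed above (releases from master increment y, releases from the stable branch increment z, versions stay x.y.z) can be sketched as a small helper. The function name and the branch labels are hypothetical, and resetting z to 0 on a master release is an assumption that was not stated in the discussion.

```python
def next_version(current, branch):
    """Return the next x.y.z tag for a release cut from the given branch."""
    # Unpacking raises ValueError for anything that is not exactly x.y.z.
    x, y, z = (int(part) for part in current.split("."))
    if branch == "master":
        # Assumption: z restarts at 0 when y is bumped.
        return "%d.%d.%d" % (x, y + 1, 0)
    if branch == "stable":
        return "%d.%d.%d" % (x, y, z + 1)
    raise ValueError("unknown branch: %r" % (branch,))
```

Because master never reuses a y value that a stable branch holds, the stable branch can keep incrementing z without colliding with later releases from master.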
*** eghobo has quit IRC00:43
greghaynesSpamapS: If I want to add a resource to a heat template that is 'somestring-' + Heat::RandomString is there a way to do this? Specifically I want to make a MysqlCluserName resource that mysql can use...00:43
SpamapSgreghaynes: no00:44
greghaynes:_(00:44
SpamapSgreghaynes: the name of the resource is entirely static unfortunately.00:45
SpamapSgreghaynes: you can just have a MySQLClusterUniquePart randomstring and prepend somestring- when you use it.00:45
greghaynesYep, doing that00:46
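The workaround described above (a statically named random-string resource, with the prefix joined on wherever the value is used) might look like the fragment below, built here as a Python dictionary and dumped to JSON. This is a sketch under assumptions: it assumes that Ref on the random-string resource resolves to the generated string, the length value is arbitrary, and the helper name is hypothetical.

```python
import json


def prefixed_name(prefix, random_resource):
    """Build an Fn::Join expression that evaluates to '<prefix>-<random part>'."""
    return {"Fn::Join": ["-", [prefix, {"Ref": random_resource}]]}


template = {
    "Resources": {
        # The resource name itself is static; only its value is random.
        "MySQLClusterUniquePart": {
            "Type": "OS::Heat::RandomString",
            "Properties": {"length": 10},
        },
    },
}

# The expression to paste wherever the cluster name is consumed.
cluster_name = prefixed_name("somestring", "MySQLClusterUniquePart")
rendered = json.dumps(cluster_name, sort_keys=True)
```

The join has to be repeated at every point of use, which is the readability cost that motivates the later remarks about variables or macros.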
*** matsuhashi has quit IRC00:47
*** matsuhashi has joined #tripleo00:48
*** sdake has joined #tripleo00:48
*** sdake has quit IRC00:48
*** sdake has joined #tripleo00:48
*** e0ne has joined #tripleo00:52
*** matsuhashi has quit IRC00:52
*** e0ne has quit IRC00:56
*** matsuhashi has joined #tripleo00:57
SpamapSgreghaynes: I do think Heat needs the concept of variables or at the very least macros, just to make code easier to read.00:58
lifelesswe were using params for that00:59
lifelessit broke00:59
lifelessapparently its ok to put it or something like it back in, but needs doing + tests00:59
SpamapSgreghaynes: Something like  Variables: {MySQLClusterName: {Fn::Join: ['-', 'something', {Ref: RandomMySQLPart}]}}00:59
SpamapSlifeless: Yeah, parameters seems like a violation though. I think that should be possible, but local variables would be good too01:00
SpamapSlifeless: btw I fixed this: https://review.openstack.org/#/c/83614/01:00
SpamapSlifeless: it's blocking the software-config migration01:00
greghayneshrm, that doesnt seem too hard to implement either. Would you use the same way (Ref:) to refer to them and how would you deal with colliding namespaces then01:01
greghaynesor do something like VarRef:01:01
lifelessSpamapS: huh we don't even run unittests for occ01:06
lifelessoh, its tie, nvm01:09
openstackgerritA change was merged to openstack-infra/tripleo-ci: Expose more testenv parameters  https://review.openstack.org/8285501:10
openstackgerritlifeless proposed a change to openstack/tripleo-image-elements: Fixup testenv config for interface names.  https://review.openstack.org/8432601:11
openstackgerritlifeless proposed a change to openstack/tripleo-image-elements: Fixup HP region testenv config.  https://review.openstack.org/8407501:11
openstackgerritlifeless proposed a change to openstack/tripleo-image-elements: Performance tweaks for testenv deploy script.  https://review.openstack.org/8407301:11
openstackgerritlifeless proposed a change to openstack/tripleo-image-elements: Tune deploy-ci-overcloud a little.  https://review.openstack.org/8407601:11
*** rpodolyaka1 has joined #tripleo01:19
*** eguz has quit IRC01:20
*** rpodolyaka1 has quit IRC01:21
*** ramishra has joined #tripleo01:22
lifelessthats a new one http://paste.openstack.org/show/74788/01:27
*** ramishra has quit IRC01:30
openstackgerritDan Prince proposed a change to openstack/tripleo-image-elements: A sysctl element to manage settings via sysctl.d.  https://review.openstack.org/8459901:33
openstackgerritDan Prince proposed a change to openstack/tripleo-image-elements: Update bootstack to use sysctl-set-value.  https://review.openstack.org/8460001:33
*** bauzas has quit IRC01:39
*** nosnos has joined #tripleo01:47
*** e0ne has joined #tripleo01:52
*** e0ne has quit IRC01:57
*** spzala has quit IRC01:58
*** ccrouch has left #tripleo02:07
*** CaptTofu has joined #tripleo02:07
*** sballe has joined #tripleo02:11
*** rpodolyaka1 has joined #tripleo02:20
*** rpodolyaka1 has quit IRC02:24
*** sballe has quit IRC02:26
*** newell_ has quit IRC02:29
*** yamahata has joined #tripleo02:31
*** ramishra has joined #tripleo02:36
*** rlandy has quit IRC02:37
*** ramishra_ has joined #tripleo02:37
*** ramishra has quit IRC02:41
*** giulivo has quit IRC02:46
*** e0ne has joined #tripleo02:52
*** e0ne has quit IRC02:57
*** xuhaiwei has quit IRC02:58
*** untriaged-bot has joined #tripleo03:00
untriaged-botUntriaged bugs so far:03:00
untriaged-bothttps://bugs.launchpad.net/tripleo/+bug/129048803:00
*** untriaged-bot has quit IRC03:00
uvirtbotLaunchpad bug 1290488 in tripleo "Baremetal: Invalid credentials" [Undecided,Incomplete]03:00
openstackgerritOm Kumar proposed a change to openstack/diskimage-builder: Fix Grub configurations for Fedora images built on a UEFI host.  https://review.openstack.org/8334203:09
*** CaptTofu has quit IRC03:12
StevenKlifeless: So if I'm understanding you correctly, init-heat only needs host defined, init-keystone needs them all, and init-swift doesn't need any?03:14
lifelessStevenK: yup03:17
lifelessStevenK: in more detail03:17
lifelessinit-keystone can't use the keystone API normally because the keystone API requires that you have an admin account created03:18
lifelessinit-heat and init-swift should use the normal keystone API and not require (or have access to) the admin token03:18
lifelessthey should instead use the normal keystoneclient CLI glue - the OS_USERNAME etc etc variables, which I *think* there are trivial facilities to reuse in python-keystoneclient etc03:19
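As a rough illustration of the 'OS_USERNAME etc' approach for init-heat and init-swift, the sketch below only gathers the conventional client environment variables and fails early when one is missing. It deliberately makes no python-keystoneclient calls. Only OS_USERNAME is named in the discussion; the other three variable names are the usual OpenStack client ones and are an assumption here, as is the function name.

```python
import os

# OS_USERNAME comes from the discussion; the rest are assumed conventional names.
REQUIRED = ("OS_USERNAME", "OS_PASSWORD", "OS_TENANT_NAME", "OS_AUTH_URL")


def credentials_from_env(environ=None):
    """Collect normal (non admin-token) credentials from the environment."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "username": environ["OS_USERNAME"],
        "password": environ["OS_PASSWORD"],
        "tenant_name": environ["OS_TENANT_NAME"],
        "auth_url": environ["OS_AUTH_URL"],
    }
```

A script written this way never sees the admin token, so only init-keystone needs the bootstrap path.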
killer_princelifeless: were you able to re-review https://review.openstack.org/#/c/79873/ (Refactor code to select boot kernel)03:20
*** rpodolyaka1 has joined #tripleo03:21
lifelessno, CI cloud was down last 2.75 days03:21
lifelesswe're now up and monitoring it03:21
lifelessbut that took precedence over everything03:21
*** matsuhashi has quit IRC03:21
killer_princeaha.. np... just checking..03:22
killer_princecan you re-review it now..03:22
killer_princeif you have time..03:22
*** rpodolyaka1 has quit IRC03:25
openstackgerritSteve Kowalik proposed a change to openstack/os-cloud-config: Add CLI scripts for init-{heat,keystone,swift}  https://review.openstack.org/8433003:30
*** jtomasek has quit IRC03:33
*** morganfainberg is now known as morganfainberg_Z03:34
*** nosnos has quit IRC03:37
*** ramishra_ has quit IRC03:37
*** ramishra has joined #tripleo03:37
*** xuhaiwei has joined #tripleo03:37
openstackgerritGregory Haynes proposed a change to openstack/tripleo-image-elements: Enable Galera clustering  https://review.openstack.org/8367503:41
*** eghobo has joined #tripleo03:45
openstackgerritSteve Kowalik proposed a change to openstack/os-cloud-config: Add CLI scripts for init-{heat,keystone,swift}  https://review.openstack.org/8433003:47
*** e0ne has joined #tripleo03:52
openstackgerritJames Polley proposed a change to openstack/tripleo-incubator: Standardise location of environment password/rc files.  https://review.openstack.org/8325003:53
*** killer_prince is now known as lazy_prince03:55
*** e0ne has quit IRC03:56
openstackgerritGregory Haynes proposed a change to openstack/tripleo-heat-templates: Add initial support for galera clustering  https://review.openstack.org/8388303:57
tchaypogreghaynes: i love your nitpicking03:58
greghaynes:p03:58
greghaynessorry, I wish I could find it all in one round03:58
tchaypoI wish i thought of it when i first wrote it03:59
* tchaypo dreams of denying greghaynes the pleasure of nitpicking03:59
greghaynesone day03:59
tchaypoalmost as fun will be being nitpicky on your commits :)03:59
* greghaynes hides04:00
*** rpodolyaka1 has joined #tripleo04:02
StevenKtchaypo: I find the best solution is to have stuff he doesn't notice land.04:02
tchaypoit's hard when he never seems to not be here04:02
lifelesstchaypo: ok so04:03
tchaypomaybe we should work US-EAST hours to avoid him?04:03
lifelesstchaypo: IIRC you were going to dive in on the HA stuff w/greghaynes?04:03
lifelesstchaypo: hows that going? can I help at all ?04:03
*** akuznetsov has joined #tripleo04:03
tchayposo yesterday I grabbed all of greghaynes ' patches and cherry-picked them, then tried to build myself an overcloud. it failed, but it looks like that might have been my own error, so I'm making another attempt today04:04
lifelesscool04:05
greghaynestchaypo: Want a really bad heat template?04:05
tchaypofrom what I've seen, the notcomputescale stuff is working as far as building out the right number of nodes, it just requires changes to the services that need to be HA04:06
tchaypogreghaynes: yes please04:06
greghaynesmm, well the (now named) controlscale change doesnt work for values != 104:06
tchaypoonce I have enough nodes running, I plan to look at what you had to do for percona and make some guesses based on that about what I'd have to do for rabbitmq (which should be far less, probably just telling it "you're not alone" or something, but since I know nothing about rabbit...)04:07
tchaypoin what way doesn't it work?04:07
greghaynesThe merge.py scaling operation doesnt result in a sane template04:07
greghayneshence the really bad template which I've been using rather than relying on merge.py to perform the scaling04:08
tchaypoah, right. I didn't get that far yesterday.04:08
lifelesstchaypo: one thing you can do to make this less painful is skip the undercloud04:09
lifelessregister all nodes with the seed rather than just the first one04:09
StevenKlifeless: Are you preparing a novel for a PTL mail?04:11
tchaypoI think I understand the concept, but the details of how nodes get registered is one of those areas I've been skipping past and planning to investigate when it starts blocking me. One of the things I saw yesterday is that _overcloud.sh was getting stuck waiting for nova hypervisor-stats to show a count of >104:13
tchaypoI'm assuming that count is the count of registered nodes04:13
tchaypoand I think the registration happens inside setup-baremetal, so I'm just arriving at the point of starting to dig into that to see what it does04:14
StevenKregister-nodes04:14
StevenKWhich setup-baremetal calls04:14
greghaynesnova baremetal-node-list is more directly that, hypervisor-stats is what resources are used / available04:14
greghaynesso they are related04:14
tchaypooh look, it calls a script called setup-nodes04:14
tchaypothat was fairly obvious, once i looked04:15
StevenKI don't think so, I think it's register-nodes :-)04:15
tchaypogreghaynes: but the script is calling hypervisor-stats, so that's what i care about04:15
tchaypoStevenK: dangnabbit.04:15
* StevenK is currently rewriting register-nodes, anyway04:15
greghaynesyep, I mention it because when you start seeing No Valid Host Found errors from nova youll want to know both commands04:16
tchaypothanks04:16
StevenKWell. So far I'm trying to work out how to talk to nova-bm using the python API, and not having much luck.04:16
tchaypookay, so if we're not using ironic it looks like "register a node" means "nova baremetal-interface-add"04:16
StevenKNope04:17
lifelessStevenK: can I suggest subprocess.check_call ?04:17
lifelessStevenK: iteration 004:17
StevenKA few lines up04:17
StevenKlifeless: Ew?04:17
lifelessStevenK: I know it's not great, but its better than being bottlenecked on it - once its in the reusable place many people can help fix it04:17
lifelessStevenK: right now, they can't, and tuskar can't use the functionality at all04:17
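The 'iteration 0' suggestion, shelling out with subprocess.check_call rather than waiting on a native nova-bm Python API, could be sketched like this. Everything here is an assumption made for illustration: the node dictionary keys are hypothetical, and the positional argument order for nova baremetal-node-create (service host, cpus, memory in MB, local disk in GB, provisioning MAC) should be checked against the installed client before use.

```python
import subprocess


def node_create_argv(service_host, node):
    """Build the command line for one node; 'node' uses hypothetical keys."""
    return [
        "nova", "baremetal-node-create",
        service_host,
        str(node["cpus"]),
        str(node["memory_mb"]),
        str(node["local_gb"]),
        node["mac"],
    ]


def register_node(service_host, node, runner=subprocess.check_call):
    """Register one node by running the CLI; runner is injectable for tests."""
    argv = node_create_argv(service_host, node)
    runner(argv)
    return argv
```

Keeping the argument building separate from the call means the subprocess step can later be swapped for a real API call without touching the callers.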
tchaypooh, nova baremetal-node-create ?04:17
StevenKYeah04:17
greghayneswinner04:17
* tchaypo feels slightly less clueless04:18
StevenKlifeless: Now I'm currently internally battling about the perfect is the enemy of the good04:18
greghaynestchaypo: https://gist.github.com/greghaynes/992786204:19
lifelessStevenK: of course :)04:20
tchaypo ah, i think i understand how this can save me time now. Instead of having to do all of devtest_undercloud just to register the one undercloud node, then build it out, set up everything again, just to register the overcloud nodes so i can heat stack-create overcloud04:20
StevenKYou're right. That is pretty horrible.04:20
greghaynesbe afraid04:20
tchayporegistering those nodes with the seed should let me skip directly to heat stack-create overcloud?04:20
lifelesstchaypo: well, to all the logic in devtest_overcloud.sh; yes.04:20
tchaypoto me that feels like a lot of effort for very little saving04:21
lifelesstchaypo: 15m or so04:21
lifelesstchaypo: and by lots of effort you mean one script call ?04:21
lifelesstchaypo: source seedrc; register-nodes seed  <(jq '.nodes - [.nodes[0]]' $TE_DATAFILE)04:23
tchaypoi mean that creating the undercloud is one call; to register the nodes it looks like I'd have to pick apart the nova baremetal-node-create call and figure out how to do one by hand04:24
tchaypohowever you've just made me realise i don't need to do that at all.04:24
StevenKtchaypo: Er, you call register-nodes04:24
StevenKLike lifeless just pointed out04:24
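For readers unfamiliar with the jq expression in the one-liner above, '.nodes - [.nodes[0]]' yields the node list with every entry equal to the first node removed, which leaves the first node for the seed's own use and registers the rest. A Python equivalent, assuming only that the test environment file is JSON with a top-level "nodes" array as the jq filter implies, might be the following; the function name is hypothetical.

```python
import json


def nodes_without_first(testenv):
    """Mirror jq's '.nodes - [.nodes[0]]' on an already-parsed document."""
    nodes = testenv["nodes"]
    if not nodes:
        return []
    first = nodes[0]
    # jq array subtraction drops every element equal to the first one.
    return [node for node in nodes if node != first]


def load_remaining_nodes(path):
    """Read the test environment file and return the nodes left to register."""
    with open(path) as handle:
        return nodes_without_first(json.load(handle))
```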
lifelesstchaypo: a design principle we have is to have small tools04:26
lifelesstchaypo: that are reusable04:26
tchaypoand loosely joined04:27
lifelessright04:27
lifelessso anytime you have cognitive dissonance between someone saying 'do X, its easy' and 'omg wall of stuff to do' - look for, and ask about, a tool :)04:27
lifeless(because there probably is one)04:28
tchaypoin this case i was too busy reading the guts of the tool to realise that the tool itself is what i wanted04:28
lifeless:)04:28
tchaypookay, I'm going to take a short tea-break and then register some nodes04:29
StevenKtchaypo: My blood, sweat and tears are in the guts of register-nodes04:29
tchaypoI assume that one difference in this case is that I'll want to be sourcing seedrc rather than undercloudrc prior to running _overcloud.sh04:30
*** nosnos has joined #tripleo04:31
lifelesstchaypo: yes04:31
*** killer_prince has joined #tripleo04:35
*** matsuhashi has joined #tripleo04:36
*** Rakesh5 has joined #tripleo04:49
*** matsuhashi has quit IRC04:50
*** e0ne has joined #tripleo04:52
*** radsy has quit IRC04:55
*** cwolferh has quit IRC04:56
*** cwolferh has joined #tripleo04:57
*** e0ne has quit IRC04:57
*** matsuhashi has joined #tripleo05:00
openstackgerritA change was merged to openstack/diskimage-builder: Fix dhcp-all-interfaces upstart job  https://review.openstack.org/8453905:11
*** e0ne has joined #tripleo05:24
xuhaiweiOSError: [Errno 2] No such file or directory: '/tmp/pypi/markupsafe/'  means markupsafe package download failed?05:33
xuhaiweithis package is in the nova/requirements, but can't find it in the mirror05:34
xuhaiweisorry, in the mirror I found MarkupSafe, it can't be used just because the name uses capital character ?05:38
xuhaiweilifeless: Could you please answer this question?05:40
*** e0ne has quit IRC05:45
*** e0ne has joined #tripleo05:45
*** lazy_prince has quit IRC05:46
*** killer_p- has joined #tripleo05:47
tchaypoafternoon xuhaiwei05:47
xuhaiweigood aternoon05:48
xuhaiweiafternoon, :)05:48
tchaypoit's almost 8pm for lifeless, I don't think he's going to be around05:50
lifelessxuhaiwei: I don't see markupsafe in the requirements05:50
lifelesstchaypo: 7pm atm05:50
lifelesstchaypo: and C is just watching TV so I have a minute05:50
*** e0ne has quit IRC05:50
xuhaiweiI am glad you are still here05:50
lifelessxuhaiwei: https://pypi.python.org/pypi/MarkupSafe is the official pypi page for it05:50
xuhaiweibut I see this log :Downloading/unpacking markupsafe (from Jinja2->-r /opt/stack/nova/requirements.txt (line 8))05:50
lifelessxuhaiwei: so you can see the name should be MarkupSafe05:51
lifelessxuhaiwei: so Jinja2 is a dependency05:51
lifelessxuhaiwei: markupsafe is a dependency of Jinja205:51
xuhaiweijinja2 is not in the mirror05:52
tchaypoI dug into the code a few weeks ago - it turns out there are a few weird things with case05:52
tchaypothe pypi servers do weird things to be mostly case-insensitive; and pip itself will re-try with the name all in lowercase if it fails the first time05:52
xuhaiweiJinja2 is in the mirror, why it fails to use it05:52
tchaypobut usually download fails are just transient network issues - does it consistently fail every time you try?05:52
lifelessxuhaiwei: this is what I see:05:53
lifelessDownloading/unpacking markupsafe (from Jinja2)05:53
lifeless  Real name of requirement markupsafe is MarkupSafe05:53
lifeless  http://mirror.robertcollins.net/pypi/simple/MarkupSafe/ uses an insecure transport scheme (http). Consider using https if mirror.robertcollins.net has it available05:53
lifelessdo you see the Real name line ?05:53
xuhaiweithe Real name line?05:55
xuhaiweifrom where?05:55
xuhaiweiI can't access to http://mirror.robertcollins.net/pypi/simple/MarkupSafe/05:55
lifelessthats my mirror, its a private url05:55
lifeless18:53 < lifeless>   Real name of requirement markupsafe is MarkupSafe05:55
lifeless^ that line05:55
lifelessdo you see it in your log ?05:55
xuhaiweino05:56
*** rpodolyaka1 has quit IRC05:57
lifelessok so now we need to figure out why :)05:57
StevenKHm05:57
StevenKTomorrow or Friday could be interesting -- plumber visit with a new hot water service.05:57
xuhaiweithere is only "markupsafe"05:57
lifelessxuhaiwei: can you pastebin exactly what you see please v?05:58
xuhaiweiok05:58
xuhaiweihttp://paste.openstack.org/show/74795/05:59
lifelessoh wow thats nasty06:01
lifelessI know now how pip handles case insensitivity06:01
StevenKbnemec: Found your blog article about pypi mirroring -- Ubuntu Saucy i386 is ~60GiB, but adding amd64 is probably only going to bump that by 25GiB or so.06:03
*** e0ne has joined #tripleo06:03
tchaypolifeless: case not-entirely-sensitivity might be a better name06:04
*** e0ne has quit IRC06:04
lifelesstchaypo: look at pip/index.py line 43306:04
tchaypolifeless: but the issue is compounded by the fact that the pypi official servers do extra case insensitivity server-side06:04
lifelesstchaypo: no, thats irrelevant06:05
tchaypolifeless: that's the line that calls .tolower() and then s/-/_/ right?06:05
lifelesstchaypo: my local mirror *doesn't* and it still works.06:05
lifelesstchaypo: no, its worse06:05
tchaypooh dear.06:05
StevenKHmmm. https://pypi.python.org/pypi/pep381client06:05
tchaypolifeless: for bad_ext in ... ?06:08
tchaypolines 197-214 of pip-1.2-py2.7.egg/pip/index.py is where the "Real name of requirement %s is %s" comes from06:11
lifeless              logger.notify(06:13
lifeless                    'Real name of requirement %s is %s' % (url_name, base)06:13
lifeless yeah anyhow06:13
*** cwolferh has quit IRC06:14
tchaypoyep. you must be looking at a different version if that's 433 for you06:14
tchaypoanyway - xuhaiwei - have we helped at all, or are you still stuck?06:15
xuhaiweiI am still stuck06:15
lifelesstchaypo: git :)06:15
tchaypoI'd suggest looking at /root/.pip/pip.log, which might have more info06:16
lifelesstchaypo: basically it reads the full index06:16
tchaypothat's from line 448 of your paste06:16
lifelessand then looks for a case insensitive match06:16
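The lookup being described, where the full index is read and then searched for a case-insensitive match, can be illustrated with the simplified sketch below. It is not pip's actual code: the function name is hypothetical and the real implementation works on links parsed from the index page rather than a plain list of names.

```python
def find_real_name(requested, index_names):
    """Return the index entry matching 'requested' regardless of case, or None."""
    wanted = requested.lower()
    for name in index_names:
        if name.lower() == wanted:
            return name
    return None


# Example names taken from the discussion.
real = find_real_name("markupsafe", ["Jinja2", "MarkupSafe"])
```

If the index listing cannot be fetched or parsed, there is nothing to scan, so a lowercase request such as markupsafe has no way to be mapped to a MarkupSafe entry.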
xuhaiweican I install it by hand?06:16
tchaypoyeah, my second suggestion was going to be a manual "pip install markupsafe" to see if that works06:16
tchaypoif it does I'd try uninstalling and then retry the previous step again to rule out transient errors06:17
lifelessxuhaiwei: bear with me - I'm looking at your bug, but I'm OTP right now06:17
xuhaiweiotp?06:18
tchaypoon the phone06:18
xuhaiweioh, ok06:18
xuhaiweiI am not so eager06:18
xuhaiweiI tried pip install markupsafe, but it hit a proxy problem again06:19
xuhaiweiwhat about changing the mirror file name to 'markupsafe'06:20
tchaypolifeless: yep, _find_url_name is 197-214 in the version i was looking at, but 414-436 in the git version06:20
*** rpodolyaka1 has joined #tripleo06:21
tchaypoxuhaiwei: you could, but then you're probably going to hit similar issues with the proxy later and have to solve them; I'd rather tackle the root problem so you don't hit it again later06:21
tchaypowhat proxy problem did you see?06:21
xuhaiweiCannot fetch index base URL http://pypi.python.org/simple/   Could not find any downloads that satisfy the requirement markupsafe06:21
lifelesstchaypo: I believe the problem is in pip _get_page06:21
xuhaiweiI think this is caused by the proxy06:22
tchaypoah, right.06:22
tchayposo fixing the name of markupsafe won't do anything here06:22
tchaypoas lifeless said, pip crawls through all the links in the page it retrieves from  http://pypi.python.org/simple/ and looks for something that looks like "markupsafe"06:23
lifelessxuhaiwei: your mirror is in ~/.cache/ ...  right ?06:23
xuhaiweiIf I fix the mirror file name, it wont go to the proxy, right?06:23
lifelesstchaypo: yeah and if get_page doesn't return things properly ...06:23
xuhaiweiyes06:23
tchaypoin this case it's failing to download that page, so it can't look at the index06:23
lifelessxuhaiwei: can you get me ls -lR from your cache? gzip that and put it somewhere I can download it ?06:24
tchaypoyep, if you can make it never need to look up the index it should be better.06:24
xuhaiweilifeless: I will try to do it06:25
*** rdopieralski has joined #tripleo06:30
xuhaiweican I send you a mail?06:31
*** rdopieralski has quit IRC06:31
*** rdopieralski has joined #tripleo06:31
*** shardy_afk is now known as shardy06:33
xuhaiweilifeless: can I send you a mail?06:33
*** rpodolyaka1 has quit IRC06:33
lifelessyes06:33
lifelessrobertc at robetcollins dot net06:33
tchaypoin other news, I think I know why nova hypervisor-stats is reporting just one - my nodes.js only has two nodes.06:36
greghaynesnodes.js? What are you doing!?06:37
tchaypobah.06:37
tchaypos/nodes.js/$TE_DATAFILE06:37
tchaypobrain thinks: the js file that defines the nodes. fingers type: nodes.js06:37
openstackgerritA change was merged to openstack/tuskar-ui: Removing unused testadata  https://review.openstack.org/8440606:38
xuhaiweilifeless: have you got the mail?06:39
lifelessI dont think so06:42
lifelessoh I typoed06:42
lifelessrobertc at robertcollins dot net06:42
xuhaiweiI will send it again06:43
xuhaiweiand this time?06:46
lifelessi have it06:47
*** mrunge has joined #tripleo06:48
openstackgerritJames Polley proposed a change to openstack/tripleo-incubator: Minor tweaks to docs in _testenv.sh  https://review.openstack.org/8464007:08
tchaypogreghaynes: do your worst.07:08
*** ramishra has quit IRC07:11
StevenKHaha07:14
greghaynesdone07:14
*** ramishra has joined #tripleo07:16
*** jcoufal has joined #tripleo07:17
*** bauzas has joined #tripleo07:20
*** jprovazn has joined #tripleo07:21
rpodolyakamorning07:25
GheRiveromorning all07:28
*** jtomasek has joined #tripleo07:28
lifelessxuhaiwei: looking now07:28
lifelessxuhaiwei: ok, so there is a MarkupSafe directory, now we need to see why pip isn't identifying that correctly07:29
xuhaiweiyeah07:29
*** akuznetsov has quit IRC07:29
*** giulivo has joined #tripleo07:30
ProfFalkenIf I want to run nova client against the undercloud as part of a post-install.d script, which heat variable should I use in my overcloud-source.yaml file?07:31
*** rpodolyaka1 has joined #tripleo07:33
*** akuznetsov has joined #tripleo07:36
*** rpodolyaka1 has quit IRC07:36
lifelessProfFalken: sorry, not entirely sure what you mean by that07:37
lifelessxuhaiwei: what version of pip do you have installed?07:39
lifelessxuhaiwei: trunk looks like it should work for you07:39
*** ifarkas has joined #tripleo07:40
xuhaiweipython-pip                1.0-1build107:40
ProfFalkenlifeless: don't worry, there was an error in my heat template - I was pointing at the wrong host!07:42
lifelessxuhaiwei: try this patch please http://paste.ubuntu.com/7193216/07:43
xuhaiweilifeless: it could be the version's problem?07:43
lifelessxuhaiwei: it may be07:43
ProfFalkenlxsli pointed out the error :)07:43
lifelesscool :)07:44
tchaypoi wonder when Ng is going to be online now that DST has played games07:45
StevenKDST has played games?07:46
StevenKWe don't switch until this weekend, no?07:46
lxsliGB has already switched, think IE has too07:47
xuhaiweilifeless: I am running it again07:48
tchaypoit's a great game. First the US flips, then the UK flips, then AU flips07:48
tchaypoi think we've at least got to the point where all of AU flips on the same night now07:49
*** boris-42 has quit IRC07:50
*** boris-42 has joined #tripleo07:51
lifelessStevenK: and AU flips differently to NZ07:51
lifelessStevenK: and AU flips differently to AU, even07:51
StevenKHeh07:52
lxsliQI tells me the original plan was for the UK to use 4x20min adjustments07:53
lxslilet us be thankful for small mercies07:53
*** jistr has joined #tripleo07:53
openstackgerritRyan Moore proposed a change to openstack/tripleo-image-elements: Restructure the nova.conf to match documentation  https://review.openstack.org/8382108:00
xuhaiweilifeless: I got the same error08:00
* tchaypo grovels tz source data08:00
lifelessxuhaiwei: please file a bug08:02
tchaypoas far as i can tell all states in aus have changed on the 1st sun in april since 200808:02
lifelesstchaypo: what about the territories ? also I may be remembering before then ?08:02
tchaypolooks like NZ is the same08:03
xuhaiweilifeless: This bug belongs to tripleo?08:03
*** bauzas has quit IRC08:03
*** viktors has quit IRC08:05
tchaypolifeless: i can't see any variations in any of the territories08:05
StevenKExcept for QLD, and WA08:07
lifelessxuhaiwei: yes please08:08
lifelessxuhaiwei: pad.lv/b/tripleo08:08
tchaypowhich don't do DST; neither does the NT, except if the year is 189908:08
StevenKtchaypo: WA has tried DST 3 times08:09
tchaypomost recently in ...08:09
xuhaiweipad.lv/b/tripleo???08:10
tchaypoRule AW 2007 2008 - Oct lastSun 2:00s 1:00 -08:10
lifelesshttp://pad.lv/b/tripleo08:10
xuhaiweihttps://bugs.launchpad.net/tripleo/+bug/130122008:11
uvirtbotLaunchpad bug 1301220 in tripleo "pip can't find markupsafe when running devtest_seed.sh" [Undecided,New]08:11
xuhaiweiis it ok?08:11
openstackgerritRyan Moore proposed a change to openstack/tripleo-heat-templates: Set the block_migration_flag as Heat-configurable  https://review.openstack.org/8465508:13
openstackgerritRyan Moore proposed a change to openstack/tripleo-image-elements: Read libvirt block_migration_flag from nova.conf  https://review.openstack.org/8465708:15
lifelessxuhaiwei: thank you, we'll ask some questions there shortly to gather data08:15
xuhaiweiok, I will see it08:16
*** e0ne has joined #tripleo08:16
tchaypoPatch set 9 of https://review.openstack.org/#/c/83294/ was uploaded 30 hours ago, and jenkins just got around to failing it. I guess we still have a bit of a backlog :p08:20
*** gcha has joined #tripleo08:22
*** eghobo has quit IRC08:23
*** lucasagomes has joined #tripleo08:25
lifelesstchaypo: yes, 36 hours of downtime more or less08:26
*** ramishra has quit IRC08:27
*** derekh has joined #tripleo08:28
*** jcoufal_ has joined #tripleo08:29
*** jcoufal has quit IRC08:30
*** jcoufal_ is now known as jcoufal08:30
openstackgerritjan grant proposed a change to openstack/tripleo-image-elements: Ensure the (block) loop device is available.  https://review.openstack.org/8338308:33
*** rpodolyaka1 has joined #tripleo08:34
*** rpodolyaka1 has quit IRC08:38
*** yassine has joined #tripleo08:38
derekhhmm, one of the test envs seems to be rejecting the key in its own json08:46
derekh2014-04-02 07:47:21.403 | Permission denied (publickey).08:46
derekh2014-04-02 07:47:21.441 | dd: writing to `standard output': Broken pipe08:46
derekhhttp://logs.openstack.org/09/76509/6/check-tripleo/check-tripleo-undercloud-precise/3eb81fc/console.html08:46
openstackgerritRyan Moore proposed a change to openstack/tripleo-image-elements: Allow settings for Nova quotas  https://review.openstack.org/8466608:52
lifelessderekh: \o/08:52
lifelessderekh: so we're broadly up08:52
lifelessderekh: but only because I have a while loop deleting floating-ips08:52
*** jcoufal has quit IRC08:54
derekhlifeless: ok, ya I seen the bug, it explains why I was sent on a wild goose chase yesterday trying to figure out a traceback when in fact floatingips was the problem08:57
lifelessderekh: *bugs*08:58
openstackgerritRyan Moore proposed a change to openstack/tripleo-heat-templates: Set the block_migration_flag as Heat-configurable  https://review.openstack.org/8465508:58
lifelessderekh: its been a lovely exercise in OMG we release this?08:58
derekhlifeless: yup08:58
lifelessfollowed in short order by OMG people use this :P08:58
lifelesstchaypo: I hesitate to throw yet *more* things your way08:59
tchaypomore learning opportunities!09:00
tchaypoI've already learnt so much today09:00
lifelesstchaypo: but it seems to me xuhaiwei's issue given that his mirror is correct (at least per the ls -lR) should be thoroughly reproducible09:00
*** untriaged-bot has joined #tripleo09:00
untriaged-botUntriaged bugs so far:09:00
untriaged-bothttps://bugs.launchpad.net/tripleo/+bug/130122009:00
untriaged-bothttps://bugs.launchpad.net/tripleo/+bug/129048809:00
uvirtbotLaunchpad bug 1301220 in tripleo "pip can't find markupsafe when running devtest_seed.sh" [Undecided,New]09:00
*** untriaged-bot has quit IRC09:00
uvirtbotLaunchpad bug 1290488 in tripleo "Baremetal: Invalid credentials" [Undecided,Incomplete]09:00
tchaypolifeless: oh right. assign it to me so i can look it up in the morning.09:00
lifelesstchaypo: e.g. a pypi-mirror on local disk, file:// access to it (see the pypi element's pre-install rules) installing the nova element and boom09:00
lifelesstchaypo: and I'd love to get a fix into pip to unbreak this for everyone09:01
tchaypoYep, if it's that simple I should be able to reproduce it easily09:01
tchaypoas long as xuhaiwei isn't using a weird case-mangling filesystem like the OS X default fs :p09:02
lifelesstchaypo: even that should work09:03
lifelesstchaypo: at least with trunk pip09:03
lifelesstchaypo: it may be as simple as 'run git' :P09:03
lifelesstchaypo: assignedo09:04
andreaflifeless: ping09:10
lifelessandreaf: hey, I'm just relaxing atm, will be back in ~90m09:11
openstackgerritNicholas Randon proposed a change to openstack/tripleo-incubator: Bridge physical interface to the seed.  https://review.openstack.org/8408309:11
andreaflifeless: eh enjoy ;) ttil09:12
*** jp_at_hp has joined #tripleo09:12
*** ramishra has joined #tripleo09:13
openstackgerritRyan Moore proposed a change to openstack/tripleo-image-elements: Allow setting of common configuration options  https://review.openstack.org/8467209:19
*** yamahata has quit IRC09:24
openstackgerritNicholas Randon proposed a change to openstack/tripleo-incubator: Bridge physical interface to the seed.  https://review.openstack.org/8408309:25
*** hashar has joined #tripleo09:26
*** ProfFalken has quit IRC09:27
*** proffalken has joined #tripleo09:28
openstackgerritNicholas Randon proposed a change to openstack/tripleo-incubator: Bridge physical interface to the seed.  https://review.openstack.org/8408309:31
*** Rakesh5 has quit IRC09:33
lifelessandreaf: ok, so wassup ?09:35
*** Rakesh5 has joined #tripleo09:37
openstackgerritGonéri Le Bouder proposed a change to openstack/diskimage-builder: fix grub2 installation on Debian Wheezy  https://review.openstack.org/8350609:38
*** bauzas has joined #tripleo09:39
*** pblaho has joined #tripleo09:41
lifelessdown to 12 hours behind09:42
*** Rakesh5 has quit IRC09:44
lifelessohhh I know why we might be having glitches... new image is saucy not trusty09:47
lifelessneed to upgrade the network node mellanox driver09:48
andreaflifeless: I just saw your earlier question about CI09:49
*** xuhaiwei has quit IRC09:50
*** matsuhashi has quit IRC09:50
andreaflifeless: at the moment I'm trying to get a clean baseline - a set of tempest config / test selection which runs stable against a tripleo overcloud09:50
lifelessandreaf: ok; I *think* derekh figured that out previously09:51
lifelessandreaf: so I had a thought about where tempest should run09:51
lifelessandreaf: which is that since its testing the thing, it shouldn't run *in* the thing09:51
lifelesse.g.09:51
lifelesssay we run baremetal tempest tests against a seed09:51
lifelesswe also want to be able to take the seed image and say 'this is known good'09:52
lifelessin which case we don't really want tempest in the image09:52
lifelessditto undercloud, overcloud09:52
*** ccorrigan has joined #tripleo09:53
andreaflifeless: right09:54
smulcahySo once I've run tripleo and want to kick the tyres of the overcloud, I think I need to run the following,09:54
derekhandreaf: the tempest tests as filtered by the tempest element were "known to work" a few months back, it may now be out of date but is probably a good place to start09:54
smulcahy. tripleo-incubator/scripts/tripleorc09:54
smulcahy. tripleo-incubator/scripts/devtest_variables.sh09:54
smulcahy. tripleo-incubator/scripts/tripleo-overcloud-passwords09:54
smulcahy. tripleo-incubator/overcloudrc-user09:54
smulcahyexport no_proxy=$OVERCLOUD_IP09:54
smulcahynova list09:54
smulcahy(with the nova list being an example tyre kick)09:54
andreafderekh: yes it is out of date now, I was trying to bring it back to a working state09:54
smulcahydo I really need to source all of those files for a tyre-kick? or is there a single thingie I can run to set up my environment correctly?09:55
andreaflifeless: so we should run tempest from the VM  out of nodepool09:55
lifelessthe jenkins slave yes09:56
lifelesssmulcahy: tripleorc is a snapshot of the state at the end of the run, it should never be necessary - but some folk find it useful09:56
andreaflifeless, derekh: we cannot really use the element to setup tempest on the jenkins slave, so we'll need some different type of tempest configuration, we could use devstack functions (i.e. initset) for example09:57
lifelessandreaf: exactly09:57
*** matsuhashi has joined #tripleo09:57
derekhandreaf: ok09:57
lifelessI think there may in future be a use case for a tempest we can deploy into a real cloud (e.g. for certifications), but not for CI09:57
smulcahylifeless: aha thanks, thats one down09:58
andreaflifeless: +1 yes I'd like to maintain the element alive and working for sure09:58
lifelesssmulcahy: my morning 'resume state' is to source my-rc (which sets things like TRIPLEO_ROOT) for me; devtest_variables (which sets everything up) and then the RC file for the cloud which I want to work with09:58
smulcahybut I still need the other 3 and the no_proxy09:58
smulcahycould we make the no_proxy part of one of the others?09:58
lifelesssmulcahy: you can put no_proxy in your myrc - just put the whole subnet you want excluded in09:59
lifeless192.0.2.0/24 or whatever09:59
andreaflifeless: where would you have the logic / script to deploy tempest on the slave? I think it should go in tripleo-ci?09:59
lifelessandreaf: thinking out loud09:59
lifelessandreaf: requirements are: we want to run the zuul_ref of tempest, so we can't install it at buildtime, so not nodepool10:00
lifelessandreaf: that means yes, tripleo-ci10:00
smulcahylifeless: I get that, I'm trying to put together a simple howto for someone unfamiliar with tripleo to kick the tyres on a deployment10:00
lifelessandreaf: note that devtest will be cached on local disk if you need things from it10:00
lifelesssmulcahy: ah! so - they are approaching an existing deploy, and just want to use Nova API ?10:00
andreaflifeless: ok I was about to ask that :D10:00
andreaflifeless: what about testing changes to tripleo-ci itself?10:01
lifelessandreaf: theres a cache of all of openstack git trees locally, set to zuul ref stuff10:01
lifelessandreaf: indeed, thats part of it :)10:01
andreaflifeless: I think for devstack-gate there is some logic in place to care for the case where we have a special zuul ref for devstack-gate10:02
lifelessandreaf: so we get some coverage by the fact CI works; we don't have explicit tests for large chunks though, adding some would be great.10:02
lifelessandreaf: where possible we should use regular devstack nodes rather than the limited tripleo-precise nodes which run in the much smaller tripleo test regions10:02
lifelessandreaf: there is, I think we have that in place for toci already10:02
smulcahylifeless: exactly, its a use-case that we don't cater well for atm10:02
lifelessandreaf: sufficient for what you'll be doing anyhow10:02
smulcahyI think thats what jp_at_hp was aiming at with his select-cloud stuff10:03
lifelesssmulcahy: doesn't horizon have a 'download an rc file' thing10:03
lifeless?10:03
smulcahyerr, the horizon that isn't enabled by default in the cloud?10:03
smulcahyin the tripleo overcloud even10:03
lifelesssmulcahy: thats a bug, no ?10:03
smulcahyit used to at some stage alright10:03
lifelessanyhow10:04
smulcahylifeless: I'm not sure, it might be, but not sure it should be a requirement10:04
lifelesssmulcahy: fair enough10:04
lifelesssmulcahy: so I think whats needed is a way to generate the right final RC file for someone without looking up creds dynamically10:04
lifelessyou don't care about the deployment infra in this use case10:04
lifelesse.g. you want to say 'give me the RC file for user 'demo''10:04
lifelessthat seems to me to be exactly what horizon offered10:05
smulcahyI guess I'm ok if there are a bunch of rc files and I need to decide which one to source10:05
smulcahyit doesn't need to be too fancy10:05
lifelessthe no_proxy thing is really only needed for the case where we make a new nonroutable network10:05
lifelesswhich is local machine only10:05
lifelessso (AFAICT?) really not relevant to the use case of giving someone else access to the cloud10:05
smulcahyuhm10:05
lifeless^ checking my assumptions10:06
smulcahyyou still want to no_proxy for a routable network10:06
andreaflifeless: is there a doc somewhere of the current setup of the tripleo test region?10:06
smulcahyI believe10:06
lifelessandreaf: https://wiki.openstack.org/wiki/TripleO/TripleOCloud/Regions ?10:06
smulcahyand we've seen people fall over this time and time again10:06
smulcahyif you don't get the no_proxy stuff right, your openstack clients start returning 503 errors10:07
smulcahywhich regularly lead people to the erroneous conclusion that the overcloud is not working10:07
lifelesssmulcahy: yeah, gateway errors from squid10:07
andreaflifeless: ehe thanks10:07
lifelesssmulcahy: the assumption I'm making is twofold; a) that a running cloud cannot know for a given user whether they are able to route traffic to it or not10:07
lifelesssmulcahy: e.g. proxy[or not] config is a user domain not a cloud domain problem10:08
lifelesssmulcahy: and b) that real environments will have either 1) users set up properly as part of their regular machine setup or 2) proxies setup to know about all internal services10:08
smulcahylifeless: I think you're right, I guess I'm focusing strictly on the end-user use case here and not thinking too much about the architecture of a solution so much as the desired end result10:09
lifelesssmulcahy: I may be thoroughly wrong on b)10:09
smulcahyyou are wrong on a) and b)10:09
smulcahyspeaking as someone in one such real environment10:09
smulcahyoh10:09
lifelessgreat, we've identified one mismatch :)10:09
smulcahywrong on b)1 and b2)10:09
smulcahywhere did a) go to?10:09
lifelessa) that a running cloud cannot know for a given user whether they are able to route10:09
smulcahyah, I see it now10:09
lifeless                  traffic to it or not10:10
smulcahyyes, a) is right10:10
lifelessso10:10
smulcahybut both parts of b) are wrong in the real world where idiots like me live :)10:10
*** jcoufal has joined #tripleo10:10
lifelessthe reason we do no_proxy stuff in *devtest* is that we know the network will be broken for everyone, because 192.0.2.0/24 is TEST-NET-110:10
lifelessand noone is allowed to route it :)10:10
*** tzumainn has quit IRC10:11
smulcahybut even in the case where it might work, in a typical corporate environment, the proxy could be on the other side of the continent so you end up doing a really unnecessary and slow round-trip to the cloud in the lab next door to you10:11
lifelesswould it be the case for b1 and b2 then that you can assert that an entire cloud is no_proxy ?10:11
*** tzumainn has joined #tripleo10:11
lifelesse.g. we could write that to horizon.conf to change the output rc file ?10:11
smulcahyI'd prefer to see a solution that didn't require me to login to horizon to get my rc file tbh10:12
*** markmc has joined #tripleo10:12
lifelesssmulcahy: prior to HP the corporate environments I had had proxies in every office, because damn they're useful10:12
smulcahythats not how we work on a day to day basis with openstack10:12
lifelesssmulcahy: well bear with me a minute10:12
lifelesssmulcahy: here are the constraints / goals AIUI:10:13
lifeless - make it easy for users approaching a cloud someone else deployed (or potentially the same person days later) to use api scripts etc etc10:13
lifeless - deal with corporate networks that don't know about the cloud for $various reasons, and as such will fail if you use the regular proxy configuration10:14
lifeless - not require manual configuration of the users machine [except perhaps a one time bootstrap thing to get connected at all]10:14
lifeless?10:14
smulcahysounds about right, can you elaborate on the 3rd one?10:15
lifelesssure10:15
lifelesslet me add a couple of my own though :)10:15
lifeless - be *able to be* folded into openstack core libraries - e.g. python-openstackclient / horizon / keystone10:16
smulcahyon 3, are we talking about something like "I install ubuntu 13.10 on a machine, I pip install python-novaclient, I source <magic rc file>, nova list", simples! ?10:16
jp_at_hplifeless: thanks for the config comments - they're really good.  I think I'll try and absorb and reply for your tomorrow morning10:16
lifelesssmulcahy: say you've a cloud deployed in vlads test area10:17
lifelesssmulcahy: how do you get to it over the network at all ?10:17
smulcahylifeless: I've no idea what vlad's test area looks like.10:17
lifelesssmulcahy: firewalled off, jump host only access - that sort of thing10:18
smulcahyare you saying it's a test area on a 10.x net, for example?10:18
lifelesssmulcahy: yeah10:18
lifeless3 is about me trying to stop it being unlimited scope creep10:18
smulcahyright, well, if its firewalled off, I'll need to login to some gateway machine10:18
lifelessexactly10:18
smulcahywhere I can hopefully source <magic rc file> and nova list10:18
lifelessI'm willing to ack that no_proxy setup can be very valuable10:18
lifelessI'm dubious about putting in openvpn config at this point10:19
smulcahyotoh, if its not firewalled and my entire corporation is on 10.x10:19
smulcahyI just source <magic rc file> and nova list10:19
lifelesssure10:19
lifelessso the final one I wanted to add is10:19
smulcahyand yes, I think we can draw the line at the openvpn config for sure10:19
smulcahybut not ipsec10:19
smulcahyjoke10:19
lifeless - not assume local access to the machine10:19
lifelessfrees/wan 4 ever10:20
smulcahyI think catering for the scenario where I have direct access to the machine covers 80% of real world scenarios10:20
lifelessI have a freeswan mesh config generator around somewhere that I wrote ages back10:20
* smulcahy waves hands10:20
smulcahybecause that's what I'm seeing day to day in our use of openstack clouds (both public and test)10:21
lifelesssmulcahy: so you're thinking you'd generate the rc file for user X ?10:21
lifelesssmulcahy: I certainly don't have shell access to public cloud :P10:21
smulcahywell, tripleo seems to do that already overcloudrc-user and so on10:21
smulcahyit just needs me to . a bunch of other files before I can meaningfully use that one and I think it would be goodness to collapse that sourcing requirement into a single step10:22
lifelessok so10:22
smulcahyok, for some definitions of access - I'm thinking the api port(s), not shell10:22
smulcahythats so 90s! :)10:22
lifelessthere are I think conflicting use cases here10:22
lifelessbut lets pick a real common one10:23
*** e0ne_ has joined #tripleo10:23
lifelessadmin (you) want to let user (fred) onto the cloud10:23
lifelessyou need to:10:23
lifeless - create a user10:23
lifeless - with a password10:23
lifeless - make an rc10:23
lifeless - get it to fred10:23
lifelessthe horizon thing splits that in two10:23
lifelessyou make the user with a password and tell them the uid and password, and they login and get the rest themselves10:24
*** jang1 has joined #tripleo10:24
smulcahyoh yeah, I'm not even talking about that scenario here10:24
lifelessI keep coming back to this not because I think horizon is necessarily the right answer, but because it already exists and does the job :)10:25
smulcahyI'm focusing on the "ok, I've run devtest and it seems to have worked - how do I run a nova command against the overcloud, undercloud and indeed seed"10:25
lifelesssmulcahy: if you've done that, just source Xrc10:25
lifelesssmulcahy: its all setup already10:25
smulcahyits not10:25
lifelesssmulcahy: did you close your terminal ?10:25
smulcahyI need to source 3 separate files and set a no_proxy variable10:26
smulcahythe network crapped on me briefly after running devtest so I needed to login again10:26
smulcahyit happens10:26
lifelesssmulcahy: ok10:26
*** jtomasek has quit IRC10:26
lifelessso the reason I'm giving pushback here10:26
smulcahyI think the subtext here is that this isn't something we want to address in tripleo devtest10:26
lifelessis that we want small tools that are useful indefinitely10:26
smulcahywhich is fine but I think we should call it out10:26
lifelessif there is a user story, great - lets incubate it10:27
smulcahythen I can proceed with dealing with this in my own environment in the knowledge that I'm not duplicating something that already exists in TripleO10:27
lifelesssmulcahy: I replied with one such one that I think would be great - an rc manager to the select-cloud thing10:27
*** e0ne has quit IRC10:27
lifelesssmulcahy: sure10:27
lifelessuhm10:28
lifelessif the focus is folk doing devtest10:28
lifelesslets start over10:28
lifelesswhat can we fix or unduplicate to make sourcing e.g. undercloudrc on its own work10:28
lifelessright now undercloudrc is a static file - we don't generate it10:28
smulcahyexactly10:28
lifelessour clouds don't know how to generate rc files - which is the path I went down in our previous discussion above.10:29
smulcahyso what generates it?10:29
smulcahyexport OS_PASSWORD=$(os-apply-config -m $TE_DATAFILE --type raw --key undercloud.password)10:30
smulcahyexport OS_AUTH_URL=$(os-apply-config -m $TE_DATAFILE --type raw --key undercloud.endpoint)10:30
smulcahyfrom my undercloudrc looks generated10:30
lifelesssmulcahy: we wrote it by hand10:30
lifelessgit log -p -- undercloudrc10:30
smulcahyaha, gotcha10:30
smulcahyso maybe the solution then is to make a few mods to that file10:31
lifelessits setup to make use of a LOT of assumptions about the environment10:31
lifelessthat is that there is a TE_DATAFILE10:31
smulcahyI think thats ok for now at least10:31
lifelessit knows its the undercloud10:31
jp_at_hplifeless: I really want the select-cloud script to get into the incubator.  I think right now it is a solution for a developer to very easily choose what cloud to interact with for test purposes, and it provides a place where the no_proxy can be set for the seed automagically (which by design is likely always a vm, right?) - and I think it provides a location into which work can go for allowing developers to inspect and manipulate conf10:31
lifelessjp_at_hp: the seed might not be a VM outside of a test environment, its distinguishing characteristic is that it wasn't deployed by nova+heat10:32
smulcahyso maybe we add some additional logic into undercloudrc, overcloudrc and overcloudrc-user as a starting point for this10:32
jp_at_hpfair point :D10:32
lifelesssmulcahy: I could see moving idempotent no_proxy logic into those files10:32
lifelessjp_at_hp: for instance RedHat have a seed thats on a USB key10:32
jang1if I can chip in: perhaps even expecting TE_DATAFILE to be set is a bit much. Many of our scripts expect it; but if it's not set, devtest_variables can pick a suitable spot. I'd personally like to see more of these things work given a mostly 'pristine' environment.10:33
lifelessderekh: ruhroh10:33
lifelessderekh: no jobs running10:33
*** rpodolyaka1 has joined #tripleo10:34
lifelessjang1: so, I dislike that its an environment variable, but thats kindof like gcc assuming a.out10:34
*** Rakesh5 has joined #tripleo10:35
smulcahyso what about adding a sourcing of devtest_variables and tripleo-<environment>-passwords to the <environment>rc files? and a no_proxy line - does that seem reasonably uncontentious?10:35
lifelessjang1: I think its reasonable for our development toolchain to assume that TRIPLEO_ROOT is set (where the code is) and that devtest_variables.sh has been sourced, because thats how we avoid hugely complex boilerplate leaking into every script10:35
lifelesse.g. we use PATH etc10:35
smulcahyI'm happy to submit that change if it has a chance of being approved10:35
lifelesssmulcahy: no_proxy line I think is uncontentious (for me at least)10:36
smulcahyand I think it dramatically improves the usability of tripleo devtest deployments for non-experts10:36
smulcahybut not the other two?10:36
derekhlifeless: crap, looking10:36
lifelessderekh: nova list --all-tenants10:36
lifelessERROR: HTTPSConnectionPool(host='ci-overcloud.tripleo.org', port=13000): Max retries exceeded with url: /v2.0/tokens (Caused by <class 'socket.error'>: [Errno 110] Connection timed out)10:36
lifelessderekh: it's too late for me to jump on this10:36
*** CLOUDOUTAGE has joined #tripleo10:36
CLOUDOUTAGElifeless devananda Ng SpamapS jog0 GheRivero derekh dprince slagle  -- ci-overcloud currently down https://etherpad.openstack.org/p/cloud-outage10:36
*** CLOUDOUTAGE has quit IRC10:36
derekhlifeless: no, prob will dig in10:36
lifelessderekh: but I'm going to say OHF*K and leave it to you/ng/ghe10:36
lifelesssmulcahy: the passwords file isn't needed10:37
lifelesssmulcahy: not sure why you want to source that10:37
lifelesssmulcahy: its needed if you're using the current heat commandlines10:38
lifelesssmulcahy: which is why I want to move those to use a JSON environment file10:38
lifelesssmulcahy: but its not needed for the rc files10:38
*** rpodolyaka1 has quit IRC10:38
smulcahyexport OS_PASSWORD=$OVERCLOUD_DEMO_PASSWORD10:38
smulcahythats one of the lines from overcloudrc-user10:38
lifelesssmulcahy: oh, I was focused on the use case you gave :)10:39
smulcahydoes that not mean I need the passwords file?10:39
lifelesssmulcahy: you're right, we haven't changed that one10:39
lifelessthe demo user is a bit of an odd thing really; what we need is a good user management solution10:39
lifeless(the unencrypted asserted-users stuff we do today really isn't good enough - but we needed something automated and repeatable, and blah)10:40
smulcahyand yet ironically for a new user, its the most useful one because it lets you quickly test the whole environment10:40
lifelessanyhow10:40
smulcahybut ok, that may only be necessary for the overcloudrc-user file10:41
lifelessI think that one we'd want a #FIXME against it10:41
lifelessbecause right now the path to those files can be all over the place10:41
smulcahywhich brings us to source devtest_variables.sh in the overcloudrc and undercloudrc files .. objections to that?10:41
lifelesswhy do you want to source devtest_variables? <- not trolling :)10:42
lifelessI assume its because of this line:10:42
lifelessexport TE_DATAFILE=${TE_DATAFILE:-"$TRIPLEO_ROOT/testenv.json"}10:42
smulcahyan example is easiest10:43
*** lparth has joined #tripleo10:43
smulcahy$ . tripleo-incubator/overcloudrc10:43
smulcahyos-apply-config: command not found10:43
smulcahyos-apply-config: command not found10:43
lifelessah10:43
smulcahyI haven't even gotten to the TE_DATAFILE error yet :)10:43
derekhCan't ssh to ci-overcloud-notCompute0 trying textcons10:43
lifelessso - ok, the scripts from client-tools10:43
jang1lifeless: "hugely complex boilerplate leaking into every script" can be as simple as ". $(dirname "$0")/common", can't it?10:43
lifelessjang1: which then breaks when you install the script10:44
lifelessjang1: we had unpleasant times when we started productionising things10:44
lifelessjang1: I'd rather not set us up for that again10:44
jang1well, that surely depends what's inside "common".10:44
*** e0ne_ has quit IRC10:44
lifelessanyhow,  - I would object to sourcing devtest_variables; it makes the dependency be on the specific layout rather than on os-apply-config being in the path and TE_DATAFILE being set (which is ugly in its own right, but thats a pre-existing ugly :(10:45
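The dependency lifeless describes here (os-apply-config being on the PATH and TE_DATAFILE being set) can be checked up front before sourcing an rc file. A minimal sketch, assuming a hypothetical helper named rc_preflight that is not part of tripleo-incubator; only the tool name and the TE_DATAFILE variable come from the discussion:

```shell
# Hypothetical helper (an assumption, not an existing TripleO script).
# Returns 0 only if the assumptions baked into the rc files hold
# in the current shell; prints the first failed assumption otherwise.
rc_preflight() {
    # $1: the command the rc file shells out to (defaults to os-apply-config)
    _rc_tool="${1:-os-apply-config}"
    if ! command -v "$_rc_tool" >/dev/null 2>&1; then
        echo "$_rc_tool: not found on PATH" >&2
        return 1
    fi
    if [ -z "${TE_DATAFILE:-}" ]; then
        echo "TE_DATAFILE is not set" >&2
        return 1
    fi
    if [ ! -r "$TE_DATAFILE" ]; then
        echo "TE_DATAFILE=$TE_DATAFILE is not readable" >&2
        return 1
    fi
    return 0
}

# Possible usage (paths as in smulcahy's example):
#   rc_preflight && . tripleo-incubator/overcloudrc
```

This keeps the rc files independent of any particular checkout layout, which is the objection raised against sourcing devtest_variables from inside them.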
smulcahywe've lost an hour of our lives to this, is the conclusion for now that whats in TripleO is good enough and we should implement something for our own usability for now?10:45
lifelessI suggest you add a patch to move the no_proxy logic to the rc files10:46
lifelessidempotently10:46
lifelessand a patch to the -user rc to source the passwords file IFF the variable isn't set10:46
lifelessI would like to see those10:46
lifelessthat should get you down to a) source variables, which gets you your global state setup10:47
smulcahyexcept that needs a TRIPLEO root to work, the passwords one anyway10:47
lifelessand then source the rc you want10:47
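The second suggested patch (have the -user rc source the passwords file only if the variable isn't already set) might look roughly like the following. The helper name load_overcloud_passwords is hypothetical, and the path to the passwords file is passed in explicitly as an assumption, since the discussion notes that those files can live in different places:

```shell
# Sketch only: load_overcloud_passwords is a made-up name, and the caller
# must supply the path to the passwords file.
load_overcloud_passwords() {
    # Respect a value the caller already has in the environment.
    if [ -n "${OVERCLOUD_DEMO_PASSWORD:-}" ]; then
        return 0
    fi
    _pwfile="$1"
    if [ -z "$_pwfile" ] || [ ! -r "$_pwfile" ]; then
        echo "cannot read passwords file: ${_pwfile:-<none given>}" >&2
        return 1
    fi
    # Source the file so its variable assignments land in this shell.
    . "$_pwfile"
}

# Possible usage inside overcloudrc-user (path is an assumption):
#   load_overcloud_passwords "$TRIPLEO_ROOT/tripleo-overcloud-passwords" &&
#       export OS_PASSWORD=$OVERCLOUD_DEMO_PASSWORD
```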
smulcahyright, its an improvement10:47
openstackgerritgerry-drudy proposed a change to openstack/tripleo-image-elements: Add swift-get-nodes, swift-recon and swift-recon-cron  https://review.openstack.org/8468910:47
smulcahyok10:47
lifelesssmulcahy: which variables will take care of for you10:47
jang1... so the conclusion is that now there are only two files to source?10:48
jang133% improvement?10:48
derekhgroan, [183900.294370] hpsa 0000:06:00.0: cmd_alloc returned NULL!10:48
smulcahyseems to be10:48
lifelessjang1: 5010:48
smulcahyand is something like export no_proxy=$OVERCLOUD_IP acceptable? or do you imagine something fancier?10:48
lifelessjang1: actually 5 -> 210:49
lifelesssmulcahy: It needs to edit the no_proxy rather than be hardcoded; ideally idempotently to stop it growing out of control10:49
jang1or you could do what I do and depend on the specific layout. I'd be interested to know who _doesn't_ have shell aliases or functions or what-have-you to chop this down to a single line10:49
lifelesssmulcahy: the existing code you'll be moving does the edit10:49
jang1I'm fairly sure I've seen something that tr's , to \n on $no_proxy and goes from there, already10:50
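An idempotent no_proxy edit of the kind discussed above can be sketched as follows. This is not the existing tripleo-incubator code (which, per jang1, splits $no_proxy with tr); it is an assumed alternative using a case match, and the function name no_proxy_add is hypothetical:

```shell
# Hypothetical helper: append a host to no_proxy only if it is not
# already listed, so repeated sourcing does not grow the variable.
no_proxy_add() {
    _np_host="$1"
    [ -n "$_np_host" ] || return 0
    # Wrap the list in commas so only whole entries match
    # (192.0.2.5 must not match an existing 192.0.2.50).
    case ",${no_proxy:-}," in
        *",$_np_host,"*) ;;    # already present: leave unchanged
        *) no_proxy="${no_proxy:+$no_proxy,}$_np_host" ;;
    esac
    export no_proxy
}

# Possible usage in an rc file:
#   no_proxy_add "$OVERCLOUD_IP"
```

Compared with `export no_proxy=$OVERCLOUD_IP`, this preserves whatever entries the user already had.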
lifelessderekh: yeah, new control plane node is saucy not trusty, so its the older kernel10:50
lifelessderekh: please add to the cloud page - I'm going to capture *everything* in a postmortem when we're stable and stay up for more than a few hours :)10:51
derekhlifeless: will do, btw before clear the page I hit "save revision" to mark it10:52
derekhlifeless: before you go, tcp syn/acks not getting out of the box again, we haven't come up with a way around this besides reboot have we?10:56
*** hashar has quit IRC10:56
lifelessderekh: which box ?10:59
derekhlifeless: ci-overcloud controller10:59
smulcahyok, thanks lifeless10:59
lifelessderekh: can try rmmoding the mlx_core etc10:59
*** gcha has quit IRC10:59
lifelessderekh: but yeah reboot11:00
smulcahywe may revisit but I'll take this as a step in the right direction anyway :)(11:00
derekhlifeless: ok willdo, I'll leave you alone now11:00
lifelessderekh: oh before you do11:00
lifelessderekh: /e/n/i hasn't been updated on the controller11:00
derekhlifeless: ok, will do it also11:00
*** matsuhashi has quit IRC11:00
lifelessderekh: so you'll want to do that similarly to the one on the hypervisors (but different bridge names)11:00
lifelessderekh: so adjust to taste! night all11:00
derekhlifeless: ok, night11:01
lifelessderekh: and os-collect-config --force --one after reboot of course, but you know that :)11:01
derekhyup11:01
*** nosnos has quit IRC11:04
*** akuznetsov has quit IRC11:06
*** akuznetsov has joined #tripleo11:07
*** CLOUDOUTAGE has joined #tripleo11:07
CLOUDOUTAGElifeless devananda Ng SpamapS jog0 GheRivero derekh dprince slagle  -- ci-overcloud currently down https://etherpad.openstack.org/p/cloud-outage11:07
*** CLOUDOUTAGE has quit IRC11:07
*** openstackgerrit has quit IRC11:08
*** openstackgerrit has joined #tripleo11:08
openstackgerritGonéri Le Bouder proposed a change to openstack/diskimage-builder: clean up: fix some indent not multiple of 2  https://review.openstack.org/8469311:14
*** e0ne has joined #tripleo11:16
*** matsuhashi has joined #tripleo11:16
*** julim has joined #tripleo11:16
*** lucasagomes is now known as lucas-hungry11:23
*** e0ne_ has joined #tripleo11:24
*** rlandy has joined #tripleo11:24
*** pblaho has quit IRC11:26
*** e0ne has quit IRC11:27
*** CaptTofu has joined #tripleo11:33
*** killer_p- has quit IRC11:34
*** CLOUDOUTAGE has joined #tripleo11:38
CLOUDOUTAGElifeless devananda Ng SpamapS jog0 GheRivero derekh dprince slagle  -- ci-overcloud currently down https://etherpad.openstack.org/p/cloud-outage11:38
*** CLOUDOUTAGE has quit IRC11:38
openstackgerritA change was merged to openstack/tripleo-incubator: overcloud: Look for notCompute or controller  https://review.openstack.org/8418011:48
openstackgerritAna Krivokapic proposed a change to openstack/tuskar-ui: Use num_nodes to get node count if possible  https://review.openstack.org/8470211:51
derekhabout to reboot ci-controller, anybody want to double check my edits to /etc/network/interfaces before I do ?11:55
derekhNg: SpamapS ^11:55
*** morazi has joined #tripleo11:55
derekhbrb, its at the bottom of https://etherpad.openstack.org/p/cloud-outage11:55
*** gcha has joined #tripleo11:56
*** ccrouch has joined #tripleo11:59
derekhrebooting ci-controller11:59
*** morazi has quit IRC12:00
*** hashar has joined #tripleo12:00
*** morazi has joined #tripleo12:02
jprovaznback in 60 m12:02
*** jprovazn has quit IRC12:02
*** CLOUDOUTAGE has joined #tripleo12:09
CLOUDOUTAGElifeless devananda Ng SpamapS jog0 GheRivero derekh dprince slagle  -- ci-overcloud currently down https://etherpad.openstack.org/p/cloud-outage12:09
*** CLOUDOUTAGE has quit IRC12:09
*** dprince has joined #tripleo12:12
*** rbrady has joined #tripleo12:13
*** killer_prince has quit IRC12:16
*** lblanchard has joined #tripleo12:17
openstackgerritLadislav Smola proposed a change to openstack/tuskar: Swift parameters fix  https://review.openstack.org/8470512:17
*** Matt2 has quit IRC12:17
*** jistr is now known as jistr|english12:18
*** CaptTofu has quit IRC12:20
*** morazi has quit IRC12:21
* Ng -> office to pick up his laptop at long long last \o/12:22
*** e0ne has joined #tripleo12:22
*** morazi has joined #tripleo12:22
*** jistr|mobi has joined #tripleo12:23
*** morazi has quit IRC12:24
*** e0ne_ has quit IRC12:25
*** morazi has joined #tripleo12:26
*** jdob has joined #tripleo12:28
*** weshay has joined #tripleo12:32
openstackgerritDan Prince proposed a change to openstack/tripleo-image-elements: A sysctl element to manage settings via sysctl.d.  https://review.openstack.org/8459912:40
openstackgerritDan Prince proposed a change to openstack/tripleo-image-elements: Update bootstack to use sysctl-set-value.  https://review.openstack.org/8460012:40
*** CLOUDOUTAGE has joined #tripleo12:40
CLOUDOUTAGElifeless devananda Ng SpamapS jog0 GheRivero derekh dprince slagle  -- ci-overcloud currently down https://etherpad.openstack.org/p/cloud-outage12:40
*** CLOUDOUTAGE has quit IRC12:40
*** ramishra has quit IRC12:42
*** martyntaylor has quit IRC12:46
*** martyntaylor has joined #tripleo12:48
*** lucas-hungry is now known as lucasagomes12:49
openstackgerritDan Prince proposed a change to openstack/tripleo-incubator: Don't hard code the baremetal seed IP in seedrc  https://review.openstack.org/8312612:53
openstackgerritDan Prince proposed a change to openstack/tripleo-incubator: Write network configuration into seeds config.json  https://review.openstack.org/8312512:53
openstackgerritDan Prince proposed a change to openstack/tripleo-incubator: Make the baremetal-network configurable  https://review.openstack.org/8232712:53
openstackgerritDan Prince proposed a change to openstack/tripleo-image-elements: TEST ONLY: make nova depend on common-venv  https://review.openstack.org/7998912:54
openstackgerritDan Prince proposed a change to openstack/tripleo-image-elements: Openstack-clients: don't hard code venv  https://review.openstack.org/7998812:54
openstackgerritDan Prince proposed a change to openstack/tripleo-image-elements: Wire in _EXTRA_INSTALL_OPTS...  https://review.openstack.org/7696612:54
openstackgerritDan Prince proposed a change to openstack/tripleo-image-elements: Add a new common-venv element  https://review.openstack.org/7696712:54
openstackgerritDan Prince proposed a change to openstack/tripleo-image-elements: Horizon: dynamically set config time env vars  https://review.openstack.org/8261112:54
*** jprovazn has joined #tripleo12:55
openstackgerritA change was merged to openstack/tuskar-ui: Adding overcloud keystone client  https://review.openstack.org/8437912:58
*** jistr|mobi has quit IRC13:01
*** yamahata has joined #tripleo13:02
*** matsuhashi has quit IRC13:02
*** matsuhashi has joined #tripleo13:02
openstackgerritgerry-drudy proposed a change to openstack/tripleo-image-elements: Add swift-get-nodes, swift-recon and swift-recon-cron  https://review.openstack.org/8468913:03
*** matsuhashi has quit IRC13:07
*** jcoufal has quit IRC13:08
*** petertoft has joined #tripleo13:08
*** matsuhashi has joined #tripleo13:10
*** jistr|mobi has joined #tripleo13:11
*** jcoufal has joined #tripleo13:11
*** CLOUDOUTAGE has joined #tripleo13:11
CLOUDOUTAGElifeless devananda Ng SpamapS jog0 GheRivero derekh dprince slagle  -- ci-overcloud currently down https://etherpad.openstack.org/p/cloud-outage13:11
*** CLOUDOUTAGE has quit IRC13:11
openstackgerritRadomir Dopieralski proposed a change to openstack/tuskar-ui: Overcloud initialization  https://review.openstack.org/8334013:13
*** matty_dubs|gone is now known as matty_dubs13:18
*** CaptTofu has joined #tripleo13:20
*** CLOUDOUTAGE has joined #tripleo13:42
CLOUDOUTAGElifeless devananda Ng SpamapS jog0 GheRivero derekh dprince slagle  -- ci-overcloud currently down https://etherpad.openstack.org/p/cloud-outage13:42
*** CLOUDOUTAGE has quit IRC13:42
*** jistr|english is now known as jistr13:43
*** jistr|mobi has quit IRC13:44
derekhlifeless devananda Ng SpamapS jog0 GheRivero derekh dprince slagle : controller is back up, floating IP seem to be giving trouble any body want to jump in and see if they can spot the problem ?13:45
openstackgerritA change was merged to openstack/tripleo-incubator: Load undercloud images with -d (delete duplicate)  https://review.openstack.org/8447813:47
*** akuznetsov has quit IRC13:47
*** jpeeler1 is now known as jpeeler13:50
*** jpeeler has joined #tripleo13:50
*** akuznetsov has joined #tripleo13:51
*** ramishra has joined #tripleo14:00
*** Rakesh5 has quit IRC14:07
dprincederekh: seeing a traceback in the neutron-l3-agent log file...14:10
derekh:q14:10
Gonerislagle: Hi, I don't know if you notified. I answered to your comment https://review.openstack.org/#/c/84693/14:11
derekhdprince: hmm, that was possibly when o-c-c was restarting things not sure14:12
dprincederekh: maybe, I don't usually see it, but sure14:13
derekhdprince: I did notice that there is no router namespace14:13
*** CLOUDOUTAGE has joined #tripleo14:13
CLOUDOUTAGElifeless devananda Ng SpamapS jog0 GheRivero derekh dprince slagle  -- ci-overcloud currently down https://etherpad.openstack.org/p/cloud-outage14:13
*** CLOUDOUTAGE has quit IRC14:13
dprincederekh: I bounced it14:13
derekhdprince: that traceback says14:13
derekh2014-04-02 13:12:37.083 2951 ERROR neutron.agent.l3_agent [req-09de240a-6daf-4810-9898-de030ae6f30c None] Failed synchronizing routers due to RPC error14:13
derekhso maybe relevant14:13
dprincederekh: doesn't seem to have done anything though14:13
derekhdprince: cool, I tried it a few minutes ago but worth another try14:14
dprincederekh: Now that its up can we recreate the NW again?14:14
*** jpeeler has quit IRC14:17
*** jpeeler has joined #tripleo14:18
derekhdprince: could be worth a try, you mean remove and recreate the 3 networks in neutron? It would be nice to figure out what's gone wrong but it could end up being our only course of action14:19
dprincederekh: I'm not sure what has been done, just trying to clean slate it as much as possible to get it back up.14:20
openstackgerritgerry-drudy proposed a change to openstack/tripleo-image-elements: Add swift-get-nodes, swift-recon and swift-recon-cron  https://review.openstack.org/8468914:20
derekhdprince: yup, pretty much everything I've done so far is in the etherpad https://etherpad.openstack.org/p/cloud-outage14:21
*** hashar has quit IRC14:22
*** hashar has joined #tripleo14:23
dprincederekh: The core network/DHCP on this machine is still concerning to me.14:23
*** bauzas has quit IRC14:24
derekhdprince: yup, we're running a manually edited dhcp-all-interfaces that I edited today along with a manually edited /etc/network/interfaces14:25
dprincederekh: as far as I can tell at this point if the machine is rebooting for any reason it will likely not come back up. :(14:25
derekhdprince: but I agree, it would be great if we could bring this back up with a new image and fixed dhcp-all-interfaces14:26
dprincederekh: oh well. I guess that is the case w/ TripleO in general until ensure-bridge fixes land though :(14:26
*** e0ne_ has joined #tripleo14:26
derekhdprince: yup14:26
*** rdopieralski has quit IRC14:26
derekhdprince: so the error that happened this morning is the same as the one we kept getting a few weeks ago when controllers randomly stopped accepting tcp connections14:28
derekhI'm starting to think it's more than just a hardware issue (which is what it was put down to the last time)14:28
dprincederekh: related to the underlying NW driver?14:28
derekhdprince: yup14:28
*** e0ne has quit IRC14:30
dprincederekh: So... with regards to statically assigning the IP I'm fine with that as a stop-gap but we should at least persist it in /etc/network/interfaces14:32
dprincederekh: otherwise it is game over if someone else comes in and reboots this baby14:32
* dprince really prefers using ifup/ifdown rather than ad hoc ip commands14:33
derekhdprince: yup, I was hoping somebody could look at the interfaces file I edited to see if I screwed something up14:33
derekhso anybody out there with a running devstack, remind me, where is the floating ip mapped to a private IP ? iptables?14:34
derekh*devtest14:34
dprincederekh: I think we need the qrouter namespace first.14:37
dprincederekh: Then this would do it14:37
dprincederekh: ip netns exec <namespace> iptables-save14:37
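
The command above dumps the NAT rules inside the router namespace, which is where a floating IP is mapped to an instance's private IP. As a minimal sketch, assuming the dump contains DNAT rules of the usual `-d <floating>/32 ... -j DNAT --to-destination <fixed>` shape, the mapping could be pulled out as below. The sample text, chain name, and addresses are illustrative assumptions, not output captured from the ci-overcloud.

```python
import re

# Illustrative iptables-save output. The chain name and addresses are
# assumptions made for this sketch, not captured from the real cloud.
SAMPLE = """\
*nat
-A neutron-l3-agent-PREROUTING -d 203.0.113.10/32 -j DNAT --to-destination 10.0.0.5
-A neutron-l3-agent-float-snat -s 10.0.0.5/32 -j SNAT --to-source 203.0.113.10
COMMIT
"""

# Matches a DNAT rule and captures the floating and fixed addresses.
DNAT_RULE = re.compile(
    r"-d (?P<floating>[\d.]+)(?:/32)? .*-j DNAT --to-destination (?P<fixed>[\d.]+)"
)


def floating_ip_map(iptables_save_output):
    """Return {floating_ip: fixed_ip} for every DNAT rule found in the text."""
    mapping = {}
    for line in iptables_save_output.splitlines():
        match = DNAT_RULE.search(line)
        if match:
            mapping[match.group("floating")] = match.group("fixed")
    return mapping


if __name__ == "__main__":
    print(floating_ip_map(SAMPLE))
```

An empty result for a floating IP that should exist would point at the router namespace or the l3-agent rather than at the instance itself.
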
*** jprovazn is now known as jprovazn_afk14:37
dprincederekh: which is why I wanted to recreate the NW14:37
dprincederekh: where do those scripts live? (to recreate the neutron networks)14:37
derekhdprince: ya, I was suspicious that it was missing, should it be created by the neutron-ovs-agent14:37
dprincederekh: not sure14:38
derekhdprince: yup, can we create the networks with a specific id?14:38
dprincederekh: lets bounce that too14:38
derekhdprince: ok14:38
derekhdprince: the network IDs have to match what infra has in their configs14:39
derekhwhich is why I ask about the id14:39
dprincederekh: not sure, checking. I've never done that.14:41
dprincederekh: if we recreated the overcloud from scratch just yesterday it must be possible (which is why I thought it might have been scripted)14:41
* dprince isn't sure what all has happened on the HP cloud yet...14:41
derekhdprince: updated yesterday https://review.openstack.org/#/c/84263/3/modules/openstack_project/templates/nodepool/nodepool.yaml.erb14:44
*** CLOUDOUTAGE has joined #tripleo14:44
CLOUDOUTAGElifeless devananda Ng SpamapS jog0 GheRivero derekh dprince slagle  -- ci-overcloud currently down https://etherpad.openstack.org/p/cloud-outage14:44
*** CLOUDOUTAGE has quit IRC14:44
*** mrunge has quit IRC14:44
dprincederekh: sounds like an opportunity to add a neutron feature to me :)14:45
derekhdprince: yup14:45
dprincederekh: anyway so the l3-agent should be creating this namespace so that is our problem14:45
dprincederekh: wanna enable debug and bounce the l3-agent to get more info?14:46
derekhdprince: yup, lets do that14:46
*** geerdest has joined #tripleo14:46
derekhdprince: I'm doing it now14:46
dprincederekh: okay, will let you. Presumably you are editing the os-apply-config source, and then re-running os-collect-config?14:47
funginova list|grep -c ERROR14:47
fungi5214:47
dprincefungi: we are working on it sir14:47
fungidprince: i expected you were all on top of it. thanks!14:47
dprincefungi: but let us know if that number goes up :)14:47
fungiit won't. not much anyway. nodepool thinks it's only allowed to have 55 instances in that provider14:48
dprincefungi: ack, thanks14:48
derekhdprince: only going to edit the neutron config file (I'm kind of afraid to rerun o-c-c as it does lots of things)14:48
*** akrivoka has quit IRC14:49
*** akrivoka has joined #tripleo14:49
fungithe remaining 3 in its allowance are currently in a build state according to novaclient14:49
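
The `nova list|grep -c ERROR` check above can be extended to tally every state at once, which makes the "52 in ERROR plus 3 building against a cap of 55" arithmetic easy to confirm. This is only a sketch: the table below is a simplified, made-up stand-in for `nova list` output (the real table has more columns), and `status_counts` is a hypothetical helper.

```python
from collections import Counter

# Simplified, made-up stand-in for the table printed by `nova list`.
SAMPLE = """\
+------+--------+--------+
| ID   | Name   | Status |
+------+--------+--------+
| a1   | node-1 | ERROR  |
| b2   | node-2 | BUILD  |
| c3   | node-3 | ERROR  |
+------+--------+--------+
"""


def status_counts(table_text, status_column="Status"):
    """Count rows per status in an ASCII table with a header row."""
    header = None
    counts = Counter()
    for line in table_text.splitlines():
        if not line.startswith("|"):
            continue  # skip the +----+ border lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first data-shaped line is the header
            continue
        counts[cells[header.index(status_column)]] += 1
    return counts


if __name__ == "__main__":
    print(status_counts(SAMPLE))
```
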
dprincederekh: scared to use the TripleO recommended way of doing things! Never!14:49
derekhshh14:49
* dprince tails log in anticipation of debug messages14:50
*** matsuhashi has quit IRC14:52
derekhdprince: hold on, while I pull a thought from the back of my memory14:52
morazilsmola_, slagle I'm talking a bit with jdob and d0ugal about what work is left from a tuskar api side to support for swift.  I noticed you had a bit of a patch out there in progress.  Are you driving that forward?14:52
dprincederekh: So I think the router namespace is only going to be re-added if we change/add a router14:53
dprincederekh: lets create a new one and then delete it?14:53
derekhdprince: neutron agent-list shows agents with hostnames with novalocal and some without14:53
derekhdprince: they change after the reboot14:53
lsmola_morazi: yeah it should be done14:53
dprincederekh: oh...14:53
lsmola_morazi: swift is working for me14:53
derekhdprince: I seem to remember lifeless saying something about this14:53
andrearosawendar: I have a question about one of your changes (https://review.openstack.org/81552) are u around?14:54
derekhdprince: and the fact that the route is added to the agent with the old name (or something like that)14:54
lsmola_morazi: the latest patch is just cosmetic change I forgot, so it presents the users correct heat params14:54
derekhdprince: lemme check some logs14:54
dprincederekh; The one agent is marked w/ xxx (meaning down)14:55
dprincederekh: so perhaps it got assigned this router?14:55
derekhdprince: yup, that's what I'm thinking14:55
derekhdprince: lets remove .novalocal from the hostname and restart it again14:56
dprincederekh: so perhaps this is because the machine was rebooted then?14:57
derekhdprince: yup, I'm pretty sure this has happened before14:58
dprincederekh: bingo14:58
dprincederekh: we have qrouter14:58
derekhqrouter-917ea51e-050a-4945-8af7-1040e77876a114:58
dprincederekh: I see floating IP rules too14:59
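
The diagnosis above is that the router was still assigned to an agent record registered under the pre-reboot hostname (with or without `.novalocal`), which the running agent no longer matched. A rough sketch of that comparison follows, assuming agent records have already been fetched (for example from `neutron agent-list`) into dicts; the `host` and `alive` keys and the `controller0` name are assumptions for illustration.

```python
def short_name(hostname, suffix=".novalocal"):
    """Strip the search-domain suffix so both name variants compare equal."""
    return hostname[: -len(suffix)] if hostname.endswith(suffix) else hostname


def stale_agents(agents, current_hostname):
    """Return agent records that belong to this machine but were registered
    under a different spelling of its hostname."""
    stale = []
    for agent in agents:
        same_machine = short_name(agent["host"]) == short_name(current_hostname)
        if same_machine and agent["host"] != current_hostname:
            stale.append(agent)
    return stale


if __name__ == "__main__":
    # Hypothetical records: one left over from before the reboot, one current.
    agents = [
        {"agent_type": "L3 agent", "host": "controller0", "alive": False},
        {"agent_type": "L3 agent", "host": "controller0.novalocal", "alive": True},
    ]
    for agent in stale_agents(agents, "controller0.novalocal"):
        print("stale:", agent["agent_type"], agent["host"])
```

Any record this returns is a candidate for the "router assigned to a dead agent" situation seen here.
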
derekhdprince: ok, going to create a new instance and test its floating IP14:59
dprincederekh: go, go!14:59
*** Matt2 has joined #tripleo14:59
derekhbooting, back in 5 minutes15:00
*** untriaged-bot has joined #tripleo15:01
untriaged-botUntriaged bugs so far:15:01
untriaged-bothttps://bugs.launchpad.net/tripleo/+bug/130143115:01
untriaged-bothttps://bugs.launchpad.net/tripleo/+bug/130143515:01
uvirtbotLaunchpad bug 1301431 in tripleo "Nova compute service failed to rebind to rabbit on control node after control node update("nova rebuild")" [Undecided,New]15:01
uvirtbotLaunchpad bug 1301435 in tripleo "devtest -c with --offline and specific DIB_REPOREF_<project>'s fails if cache is older than refs" [Undecided,Confirmed]15:01
untriaged-bothttps://bugs.launchpad.net/tripleo/+bug/129048815:01
uvirtbotLaunchpad bug 1290488 in tripleo "Baremetal: Invalid credentials" [Undecided,Incomplete]15:01
*** untriaged-bot has quit IRC15:01
lsmola_morazi: so it should be working for a week or so :-)15:03
slaglemorazi: i +2'd the patch. am happy to test anything once it's in an rpm build as well15:05
lsmola_d0ugal: about the endpoint, probably speak with slagle how to do it best15:05
lsmola_d0ugal: I expect you will have to add code to devtest setup-endpoints, so we can use it in a comfortable way15:06
d0ugallsmola_: right, makes sense. I'll take a look.15:06
jistri didn't approve this one yet https://review.openstack.org/#/c/84705/15:07
jistrbut i suppose i can - Tuskar is still not part of CI jobs in any way, so it cannot interfere, right?15:07
lsmola_slagle: the support for swift has been done here https://github.com/openstack/tuskar/commit/b6e1c9d0c3b1e2cca37bce8bd46626782455f02a15:07
derekhdprince: ok, I can now ssh to the floating IP for the te-broker, so we've made progress15:08
lsmola_slagle: the current patch is just fix for UI15:08
derekhdprince: but can't ssh to the new instance15:08
jistris anyone against me approving that Tuskar patch even though CI deployment jobs didn't run there yet?15:08
lsmola_jistr: yes we are not in CI15:08
dprincederekh: Hmm. Is this an MTU issue?15:08
derekhdprince: although I do see iptables rules for it15:08
lsmola_jistr: we==Tuskar15:08
jistrlsmola_: right. I'm going to approve it.15:09
dprincederekh: I'm not familiar with the network/MTU settings on the HP rack yet15:09
derekhdprince: the mtu issue usually caused slowness, I don't think it caused any problems connecting15:09
lsmola_jistr: cool15:09
derekhdprince: trying another new instance for the hell of it15:09
dprincederekh: okay, seems like it could cause both but lets go with that15:09
dprincederekh: could be a bad compute host?15:10
derekhdprince: yup15:10
*** e0ne_ has quit IRC15:11
dprincederekh: that instance landed on compute1. Looks like that compute host has a ton of q... interfaces.15:14
openstackgerritA change was merged to openstack/tuskar: Swift parameters fix  https://review.openstack.org/8470515:14
*** CLOUDOUTAGE has joined #tripleo15:15
CLOUDOUTAGElifeless devananda Ng SpamapS jog0 GheRivero derekh dprince slagle  -- ci-overcloud currently down https://etherpad.openstack.org/p/cloud-outage15:15
*** CLOUDOUTAGE has quit IRC15:15
derekhdprince: left over from when lots of instances were running?15:15
dprincederekh: Not sure. I don't think they should still be there though.15:15
derekhdprince: the one on compute1 is one I probably deleted, the newer one is on compute215:15
derekhdprince: agreed they should be gone15:15
derekhso now the floating IP's are working, dhcp has stopped working... instance didn't get its private IP15:16
*** ifarkas has quit IRC15:16
derekhdprince: this is the opposite to what I had earlier15:16
dprincederekh: right, well I wanted to check the original instance... could be related to the fact it was unpingable.15:16
dprincederekh: bounce dhcp too?15:17
dprincederekh: after switching the hostname?15:17
derekhdprince: ok, just did it, trying again15:18
dprincederekh: we should have just re-ran os-collect-config man!15:18
dprincederekh: that would bounce everything... which may be a problem in TripleO at some point but for the case we have here (switching the hostname) is probably what we want15:19
*** mtaylor is now known as mordred15:19
derekhdprince: may have been better, I had to run it 4 times when the server was rebooted tweaking things each time to get it to complete, just didn't want to go through that again15:19
*** mordred has quit IRC15:19
*** mordred has joined #tripleo15:19
*** ifarkas has joined #tripleo15:21
*** openstackgerrit has quit IRC15:21
*** openstackgerrit has joined #tripleo15:22
wendarHi andrearosa, I'm around now.15:25
*** e0ne has joined #tripleo15:26
derekhdprince: ok, that didn't work, will we do it your way ? rerun o-c-c ?15:26
dprincederekh: worth a shot, So DHCP still isn't working after restarting the neutron-dhcp-agent then (post hostname change)?15:28
derekhdprince: correct15:28
andrearosawendar: in that change you added new Parameters in the overcloud-source.yaml and nova-compute-instance.yaml, why in both? is it not enough to add Params in the overcloud-source.yaml?15:28
derekhdprince: ok, running o-c-c15:28
wendarandrearosa: Because the parameters have to be passed into the overcloud templates from the call to heat in the tripleo scripts, and then the overcloud templates pass them on to the nova-compute-instance template. You'll see the same pattern with several other parameters.15:29
*** UtahDave has joined #tripleo15:29
*** UtahDave has left #tripleo15:29
dprincederekh: is neutron-openvswitch-agent on the compute node happy as well?15:30
derekhdprince: lots of errors about failed ovs ports15:31
derekh2014-04-02 15:09:52.197 3526 WARNING neutron.agent.linux.ovs_lib [-] Found failed openvswitch port: [u'qvo3be27798-44', [u'map', [[u'attached-mac', u'fa:16:3e:81:b1:f0'], [u'iface-id', u'3be27798-4411-4f22-9cfa-169889ca50de'], [u'iface-status', u'active'], [u'vm-uuid', u'bb708bf9-aa56-4236-8e7d-8a6475c0e320']]], -1]15:31
andrearosawendar: so basically if we want to add a new Parameter for the compute and notcompute (controller) we need to declare the Parameter in overcloud-source.yaml, nova-compute-instance.yaml and notcompute.yaml?15:32
openstackgerritJon-Paul Sullivan proposed a change to openstack/diskimage-builder: Change refspec used to fetch all branches and tags  https://review.openstack.org/8476315:32
wendarandrearosa: Yes, that would be the current state of affairs.15:32
wendarIIRC, that may change, but I don't have a timeline.15:33
andrearosait's not 100% clear to me but that explains why one of my changes was not working properly! Thank you very much!15:33
andrearosawendar: ^^15:33
wendarandrearosa: Glad to help. :)15:34
openstackgerritDan Prince proposed a change to openstack/tripleo-image-elements: Make tripleo-cd's te_localrc to support controller  https://review.openstack.org/8476515:35
wendarandrearosa: You can think of each template kind of like a subroutine call. And the overcloud calls the compute and notcompute. So it first has to accept the parameters before it can pass them along.15:35
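
The subroutine analogy above can be made concrete with plain functions. This is only an illustration of the idea, not how Heat evaluates templates: the function names mirror the template names mentioned in the discussion, while the parameter names and values are hypothetical.

```python
def nova_compute_instance(*, image, reserved_host_disk_mb):
    """Stands in for nova-compute-instance.yaml: it declares both parameters."""
    return {"image": image, "reserved_host_disk_mb": reserved_host_disk_mb}


def overcloud(*, image, reserved_host_disk_mb):
    """Stands in for overcloud-source.yaml: it must declare the parameter
    itself before it can hand it on to the compute template."""
    return {
        "NovaCompute0": nova_compute_instance(
            image=image, reserved_host_disk_mb=reserved_host_disk_mb
        )
    }


def overcloud_missing_param(*, image):
    """An outer template that never declared the new parameter, so a caller
    has no way to pass it through."""
    return {"NovaCompute0": {"image": image}}


if __name__ == "__main__":
    print(overcloud(image="overcloud-compute", reserved_host_disk_mb=2048))
    try:
        overcloud_missing_param(image="overcloud-compute", reserved_host_disk_mb=2048)
    except TypeError as exc:
        # The outer level rejects a parameter it does not declare.
        print("rejected:", exc)
```
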
*** ilives has joined #tripleo15:35
andrearosanice explanation, ta15:37
derekhdprince: o-c-c completed fine, still no luck with dhcp15:38
*** spzala has joined #tripleo15:40
dprincederekh: well, then. DHCP is always a kicker isn't it. Let me see...15:41
*** hashar has quit IRC15:42
*** matty_dubs is now known as matty_dubs|lunch15:45
*** CLOUDOUTAGE has joined #tripleo15:46
CLOUDOUTAGElifeless devananda Ng SpamapS jog0 GheRivero derekh dprince slagle  -- ci-overcloud currently down https://etherpad.openstack.org/p/cloud-outage15:46
*** CLOUDOUTAGE has quit IRC15:46
SpamapSwait it's still down?15:51
*** rpodolyaka1 has joined #tripleo15:51
SpamapSderekh: sitrep?15:51
*** eghobo has joined #tripleo15:51
dprincederekh: see the etherpad...15:51
dprincederekh: wondering if we need to do the same to all the computes? What is the root cause of this hostname issue? (how did it even happen)15:52
SpamapSgah, we have to fix this hostname flipping stuff15:52
SpamapSthe hostname flipping thing has to do with bad interactions between cloud-init and openstack, IIRC15:52
SpamapSI've never gotten a good handle on it.15:52
* mordred reminds people that he thinks everything about how hostnames are handled is evil15:53
SpamapSmordred: +115:53
SpamapSor should I say, +66615:53
SpamapSso why did the SSH key for ci-overcloud change again?15:53
dprinceSpamapS: another day another cloud?15:53
dprinceSpamapS: no idea. I just roll w/ it15:54
SpamapSSSH really needs to grow some kind of PKI other than "I remember your SSH key was X"15:54
Ngit has PKI15:55
SpamapSNg: can I sign the host key somehow with a CA?15:55
SpamapSNg: or "kerberos"15:55
SpamapSdon't say kerberos15:55
SpamapSI'll spit coffee at you15:55
dprincederekh/SpamapS: what is the root cause of the hostname flipping?15:55
NgSpamapS: http://blog.habets.pp.se/2011/07/OpenSSH-certificates15:55
NgSpamapS: I would never say kerberos, I hate that crap15:56
dprincederekh: We can write a quick script to fix all the computes, and then bounce the vSwitches on them.15:56
dprincederekh: ?15:56
Ngsigned host keys *and* signed user keys. I've never deployed it, but it looks like giant epic win15:56
SpamapSdprince: I believe what happens is nova tells cloud-init to name the box '$name', and cloud-init does so. Then on reboot, the system's init scripts see that it has a searchdomain of .novalocal, and that gets tacked on.15:56
openstackgerritAndrea Rosa proposed a change to openstack/tripleo-heat-templates: Adding the reserved host disk  https://review.openstack.org/8477015:56
SpamapSNg: that's exactly what I want.15:56
SpamapSgood to know I'm still about 2 years behind smart people.15:57
NgSpamapS: I only came across it recently and I was all "wow this is a great new feature" and then realised that it's already years old ;)15:57
dprinceSpamapS: So it happens on reboot, presumably that means all of our overcloud boxes got rebooted then.15:57
SpamapSdprince: right15:58
SpamapSdprince: it's possible that we have a boot time race15:58
SpamapSdprince: os-collect-config may need to wait for cloud-config15:58
SpamapSactually I think that is likely15:58
SpamapSalso I think persisting the metadata means that on bootup we're not running os-collect-config's command anyway15:59
SpamapSoops15:59
dprinceSpamapS: we shouldn't unless there is a change15:59
SpamapSdprince: except that there is system state that we need to assert16:00
dprinceSpamapS: services should start themselves via persistent configs16:00
SpamapSdprince: yeah, that's really the failing then isn't it?16:00
SpamapSso here's a thought..16:00
SpamapScloud-config may be runnign in parallel with the persistent service startups16:00
SpamapSrunning even16:00
dprinceSpamapS: That is my perspective. os-collect-config isn't smart enough to assert a config. It just bounces everything.16:00
openstackgerritAna Krivokapic proposed a change to openstack/tuskar-ui: Use num_nodes to get node count if possible  https://review.openstack.org/8470216:01
SpamapSdprince: well it isn't asserting config, it is asserting state.16:01
SpamapSBut the services should in fact be asserting their own state too16:01
dprinceSpamapS: if we make it run on reboot now, it'll cause a double restart of all the services. (really a bad idea)16:01
SpamapSanyway, the reason os-collect-config --force --one works is because it restarts everything and they all match gethostname() to their agent records.16:02
derekhdprince: on the quick script, let me try one node and reboot an instance on it to see if it works16:02
derekhdprince: if it does we can do them all16:02
*** killer_prince has joined #tripleo16:02
SpamapSdprince: Right so we need to not start anything until the hostname is stable.16:02
dprincederekh: ack, may as well re-run os-collect-config on them all16:02
dprinceSpamapS: exactly. Having it swap out is not cool16:03
SpamapSso... let's see16:03
SpamapS/var/log/boot.log has the output of cloud-config16:03
dprinceSpamapS: I don't think it is a startup race. I think the reboot is the culprit. But we absolutely need to support that16:03
SpamapSthe reboot is a startup.. no?16:04
SpamapSthe difference is that on the _first_ startup we run os-refresh-config16:04
dprinceSpamapS: it should be logged that way16:04
*** petertoft has quit IRC16:04
derekhdprince: compute4 has novalocal in its hostname and compute2 doesn't.....16:05
derekhwell thats just weird16:05
dprincederekh: perhaps only select computes have been rebooted.16:06
dprincederekh: I added a full list to the etherpad16:06
*** yamahata has quit IRC16:06
dprincederekh: most of them are .novalocal16:07
derekhdprince: they should all have been rebooted see the script lifeless ran in /root/recovert on the undercloud16:07
SpamapSgah, one evil thing is /etc/init/hostname.conf ... we actually run the hostname command before all the filesystems are even mounted readonly.16:07
SpamapSdoesn't systemd have a specific thing just for setting the hostname?16:07
dprinceSpamapS: not sure. I think in the TripleO case cloud-init will do it as well.16:08
SpamapSso I mean the command 'hostname'16:08
SpamapSnot editing /etc/hosts16:08
*** e0ne has quit IRC16:09
SpamapScloud-init does in fact set it though16:09
*** e0ne has joined #tripleo16:09
dprinceSpamapS: so we have systemd-hostnamed.service which I think does what you are asking16:10
*** cwolferh has joined #tripleo16:11
dprincederekh: IMO at this point I'm close to saying lets just respin the whole cloud. IMO rebooting is something that is known to be broken...16:11
dprincederekh: because we don't persist our core networking configs, among other things16:11
SpamapSwait16:11
dprincederekh: my take...16:11
SpamapSso are you saying os-collect-config --force--one is not fixing them?16:12
*** e0ne has quit IRC16:12
SpamapS--force --one I mean16:12
*** ifarkas has quit IRC16:12
SpamapSbecause it has fixed this exact issue in the past16:12
dprinceSpamapS: it may16:12
dprinceSpamapS: I said I'm close... we are trying that now16:12
SpamapSok16:12
derekhdprince: ya, I'm starting to think the same,16:12
dprinceSpamapS: but it is like one thing after another.16:12
SpamapSbecause respin is more unknowns. :P16:12
SpamapSdprince: yeah, hopefully we're getting a clue as to where we are weakest though.16:13
dprinceSpamapS: Not if we use the same images (with known problems we can fix!)16:13
dprinceSpamapS: respin with new images... yes. A total roll of the dice16:13
dprincederekh: on that note I'd like to consider a way to archive our images for the CI overcloud.16:13
derekhSpamapS: dprince ok lets remove novalocal from hostnames and run o-c-c on all compute nodes16:14
derekhSpamapS: dprince if that doesn't work then maybe a redeploy (but it would be great to avoid having to send new net id's into infra's config)16:15
dprincederekh: fine, I'll do it16:16
derekhdprince: ok, go for it16:17
*** e0ne has joined #tripleo16:17
*** eghobo has quit IRC16:17
SpamapSdprince: well ideally we wouldn't delete the existing ones16:17
*** CLOUDOUTAGE has joined #tripleo16:17
CLOUDOUTAGElifeless devananda Ng SpamapS jog0 GheRivero derekh dprince slagle  -- ci-overcloud currently down https://etherpad.openstack.org/p/cloud-outage16:17
*** CLOUDOUTAGE has quit IRC16:17
SpamapSwe'd just rename them16:17
*** eghobo has joined #tripleo16:17
*** gcha has quit IRC16:17
SpamapSand use IDs for the stack parameter16:17
SpamapSthen they'll just stick around in glance.16:17
howleytAre there any tricks for getting rid of a heat stack stuck in 'DELETE_FAILED' state?16:17
SpamapSderekh: you should not need to remove anything16:17
SpamapSderekh: you shouldn't have to do anything manual except run 'os-collect-config --force --one'16:18
howleytI'd love to avoid having to re-run devtest from the beginning16:18
SpamapShowleyt: delete again16:18
SpamapShowleyt: or stack-abandon if you know the resources are in fact deleted16:18
SpamapSdprince: ^^ do not manually edit /etc/hostname or anything16:19
dprinceSpamapS: I'm writing a script man16:19
howleytSpamapS: I already tried both. Did a stack-abandon after I had deleted the nova instances, but it stays in DELETE_FAILED.16:19
howleytResource DELETE failed: NotFound: No resource data                                                                               |16:19
howleyt|                      | found16:19
SpamapSdprince: do not automatically edit /etc/hostname or anything :)16:19
SpamapSor run the 'hostname' command16:19
dprinceSpamapS: why not?16:20
SpamapSjust run os-collect-config --force --one16:20
SpamapShowleyt: you have found a bug in heat16:20
*** e0ne has quit IRC16:20
*** rpodolyaka1 has quit IRC16:20
SpamapShowleyt: please report it. I'm glad to help you debug it.16:20
*** e0ne has joined #tripleo16:21
dprinceSpamapS: running just os-collect-config doesn't fix the hostname sir.16:21
SpamapSdprince: we don't need to fix the hostname. we need to fix the running daemons.16:21
howleytSpamapS: Think there may be a bug for this already, let me check first.16:21
SpamapSdprince: as long as hostname == agent name they'll be happy16:21
SpamapSthey were likely started with the wrong hostname and now are ignoring messages for themselves.16:22
dprinceSpamapS: well, we were going for consistency as well. But whatever16:22
dprinceSpamapS: if we use this approach we'll have a mixed bag...16:22
SpamapSdprince: consistency is to accept the (broken) automatically assigned hostname unfortunately.16:22
SpamapSdprince: yes, we'll have a mixed bag, until we fix this problem, and deploy new images which don't have it.16:23
dprincederekh: the os-collect-configs are running now.16:23
dprincederekh: in serial :(. But probably fine16:23
SpamapSthat's probably better really :)16:24
SpamapSneutron will die if we hit it with all the work at once :)16:24
SpamapSdprince: so I think we might want to change the system default startup to delay runlevel 2 until cloud-config is done.16:25
SpamapSI think we may even want to question upstream why they don't do this.16:25
*** e0ne has quit IRC16:25
SpamapSdprince: the alternative is to have the os-svc-install upstart jobs changed to be 'start on runlevel [2345] and stopped cloud-config'16:25
SpamapS(cloud-config is a "task", hence the 'stopped')16:26
SpamapStask + stopped == done16:26
derekhdprince: SpamapS I'll be popping off soon, pretty much everything I did is on the etherpad, can ye keep it up to date, as lifeless wants to do a postmortem after16:26
dprincederekh: Thanks for all the work dude16:28
SpamapSderekh: sure and many many thanks16:28
SpamapSto both of ye ;)16:28
SpamapSI feel like we need full time ops on this cloud16:28
derekhno probs,16:28
SpamapSnot joking16:29
*** rpodolyaka1 has joined #tripleo16:29
SpamapSlike when we get it from "on fire" to "smoldering" we need to not then go back to writing code that breaks it. ;)16:29
derekhSpamapS: btw, the original problem "ci-controller not responding to tcp connections" is the exact same as what we had a few weeks back around the time of the sprint16:30
dprinceSpamapS: on Fedora we have os-collect-config run After=cloud-final.service16:30
dprinceSpamapS: which I think means we are good to go.16:30
*** d0ugal has quit IRC16:30
derekhSpamapS: I'm starting to think it can't be the same HW problem again and we have a net kernel module problem16:30
dprinceSpamapS: For Debian it would be nice if you could wait until after cloud-init is finished...16:30
*** ramishra has quit IRC16:31
derekhbut that's just me thinking out loud with no proof16:31
derekhoutgoing tcp was fine16:31
SpamapSderekh: Oh we have that too16:31
SpamapSderekh: if we haven't upgraded the mellanox driver then that is the problem.16:31
SpamapSderekh: I was hoping we'd be able to limp that ci-overcloud controller along until trusty, which has the newer mellanox built in.16:31
derekhSpamapS: ok cool16:32
SpamapSdprince: cloud-config != cloud-init16:32
*** d0ugal has joined #tripleo16:32
SpamapSdprince: cloud-final might be cloud-init + cloud-config .. is it?16:32
* derekh signs off16:32
*** derekh has quit IRC16:32
dprinceSpamapS: I think so16:33
*** d0ugal has quit IRC16:33
dprinceSpamapS: I don't see any dependencies on Debian though so you'll probably want something16:33
*** jistr has quit IRC16:33
*** e0ne has joined #tripleo16:33
SpamapSdprince: ok, I'll change all of our upstart jobs to be runlevel [2345] and stopped cloud-config.16:33
SpamapSthough that leaves mysql screwed16:33
SpamapSand probably rabbitmq too16:33
dprinceSpamapS: No need to change all of them16:33
SpamapSwhich is why I'm wondering if we should override the rc default16:34
*** e0ne_ has joined #tripleo16:34
SpamapSdprince: this is not just os-collect-config's problem16:34
dprinceSpamapS: I think just os-collect-config would suffice here no?16:34
SpamapSdprince: on reboot, all of the services start on their own... and they're starting before cloud-config sets the hostname to the one from ec2 metadata16:34
SpamapS_I think_16:35
* SpamapS curses race conditions16:35
*** stevehuang has joined #tripleo16:35
dprinceSpamapS: not sure about that. I think changing them all may be overkill16:35
*** e0ne has quit IRC16:35
dprinceSpamapS: if we change hostnames we may need a scrubber/cleaner for other reasons anyways16:35
howleytSpamapS: think I need to pull some changes in, probably hitting this one: https://bugs.launchpad.net/tripleo/+bug/129106016:36
uvirtbotLaunchpad bug 1291060 in tripleo "stack delete overcloud fails on Delete AccessKey "NovaCompute0Key"" [Critical,Fix released]16:36
SpamapSdprince: We're not changing the hostname if we always start for the first time in the same state.16:36
SpamapSdprince: changing the system-wide default for runlevel 2 to be after cloud-config should actually work fine.16:37
SpamapSdprince: and I could make a strong argument that it should be the default, but upstart makes that hard. ;)16:37
*** martyntaylor has left #tripleo16:41
SpamapSdprince: I'll discuss w/ cloud-init people16:41
*** blamar_ has joined #tripleo16:41
*** blamar has quit IRC16:42
*** blamar_ is now known as blamar16:42
*** ilives has quit IRC16:42
openstackgerritClint "SpamapS" Byrum proposed a change to openstack/diskimage-builder: Delay runlevel2 on Ubuntu until after cloud-config  https://review.openstack.org/8479016:43
SpamapSdprince: ^^ suggested fix. Will test here locally.16:43
openstackgerritTzu-Mainn Chen proposed a change to openstack/tuskar-ui: Adds additional overcloud deployment config params  https://review.openstack.org/8479116:43
dprinceSpamapS: isn't the definition of runlevel 2 after networking is up (and cloud-init requires networking to actually do its thing)16:44
dprinceSpamapS: Sorry, misstated that.16:44
dprinceSpamapS: runlevel 2 is without networking right?16:45
*** martyntaylor has joined #tripleo16:46
*** CaptTofu has quit IRC16:47
*** CLOUDOUTAGE has joined #tripleo16:48
CLOUDOUTAGElifeless devananda Ng SpamapS jog0 GheRivero derekh dprince slagle  -- ci-overcloud currently down https://etherpad.openstack.org/p/cloud-outage16:48
*** CLOUDOUTAGE has quit IRC16:48
dprinceSpamapS: I think pinning this to the runlevel is the sledge hammer solution. Why can't you just simply make os-collect-config run first after cloud-init finishes?16:48
jang1support-2% who -r16:48
jang1         run-level 2  2014-04-02 11:2216:48
jang1not on ubuntu, not any more.16:48
SpamapSdprince: runlevel2, in debian land, is "ready to do business"16:48
SpamapSdprince: networking, users, etc.16:49
jang1SpamapS: if that patch works, it'd be good to see it on debian-upstart, too.16:49
SpamapSjang1: good point16:49
jang1(I don't think we have a generic "upstart", do we?)16:49
SpamapSdprince: I don't see this as a sledgehammer. The system is booting and the hostname is changing right out from under daemons.16:50
SpamapSjang1: no we don't16:50
SpamapSjang1: but we could. :)16:50
jp_at_hpslagle: nice spot on the manifests change...16:50
SpamapSdprince: I'm looking into pushing this all the way up into Ubuntu.16:50
SpamapSand I guess Debian too really16:51
SpamapSdprince: I'm failing at communicating the problem.16:51
SpamapSdprince: the problem is not that os-collect-config runs in parallel with cloud-config on the first boot.16:52
jp_at_hpslagle: I think what I'm going to do is to undo my changes to the base store-build-settings file, and if those files exist, move them in the manifests element?  Either that or make the base element depend on the manifests element.  The reasoning being, if the manifests element is not included I still want something like that left in the base image...16:52
SpamapSdprince: the problem is that _everything_ runs in parallel with cloud-config on every boot. So the hostname changes at ??? moment.16:52
SpamapSdprince: I'm suggesting that we should stop that. cloud-config will have the facilities it needs, and then everything else will start after it has morphed system state16:53
dprinceSpamapS: So long as networking is up that is fine. We aren't seeing that on Fedora... What guards it is the 'After' clause for os-collect-config.16:54
SpamapSdprince: right, and then every other service starts After=os-refresh-config.service (btw is that a bug?)16:55
SpamapSyou shouldn't still have an os-refresh-config service16:55
dprinceSpamapS: so long as what you are doing works with dhcp-all-interfaces on first boot I think you'll be fine16:55
jang1I have a question. Why are we having any services start themselves at all, and not letting os-c-c do it?16:56
dprincejang1: because o-c-c doesn't always run unless there is new metadata16:56
jang1okay. So, I've heard it claimed that a power state change is a config change. Shouldn't o-c-c actually run at least once on boot?16:57
SpamapSdprince: oh interesting, there may be another race there. :)16:57
SpamapSno.. no race16:57
SpamapSstart on starting network-interface16:57
SpamapSdprince: so the chain for that is udev->device-added->starting network-interface->dhcp-all-interface runs and configures the interface->network-interface ifup's it, if they're all configured, emit static-networking-up16:58
SpamapSbut the ifup script that emits static-networking-up does a lock.. so in theory they'll all be blocked up on that.16:59
SpamapShm there may still be a race16:59
SpamapSwhen I wrote it to check all interfaces every time, thats why I did all, not a single one16:59
SpamapSbut that's flawed too17:00
dprinceSpamapS: On Fedora I wrote the dhcp-interface@.service such that it fails fast if the interface script exists (it doesn't even bother calling dhcp-all-interfaces.sh... so no need to flock it)17:00
SpamapSdprince: anyway, I'm confident the patch I did will eliminate _a_ race, but not _all_ races.17:01
SpamapSdprince: and it will have as much networking as it does today.17:02
dprinceSpamapS: so our overcloud is still busted :/17:03
SpamapSdprince: down agents?17:04
dprinceSpamapS: should be17:04
dprinceSpamapS: oh, restarted os-collect-config rather.17:04
SpamapSneutron agents are up17:04
dprinceSpamapS: which should have rekicked all the agents17:04
SpamapSnova computes are up17:05
*** jcoufal has quit IRC17:05
SpamapSrestarting os-collect-config wouldn't do anything17:05
*** lucasagomes is now known as lucas-afk17:06
SpamapSrunning --force --one would bounce everything17:06
dprinceSpamapS: dude, that is what I ran!17:06
SpamapSok just making sure17:07
dprinceSpamapS: See /root/recovert/hostname_fix.sh17:07
SpamapSso I see a working cloud. What's broken now?17:07
dprinceSpamapS: still no floating IPs17:07
dprinceSpamapS: well, they aren't pinging17:07
dprinceSpamapS: it is getting to the point where my confidence in this is little to none17:07
dprinceSpamapS: I'm inclined to say let's fix some issues and try again17:08
*** newell_ has joined #tripleo17:08
*** akuznetsov has quit IRC17:09
mordred /buffer Ng17:09
mordredgah17:09
*** eguz has joined #tripleo17:09
SpamapSdprince: Well perhaps we should debug a bit more before giving up on it. We're gaining insight into why this is broken.17:10
dprinceSpamapS: Sure. Well FWIW I already knew rebooting caused problems :(17:10
SpamapSdprince: yes, but did you know why? :)17:11
dprinceSpamapS: Well, one of them is DHCP (which is still hosed)17:11
dprinceSpamapS: the hostname thing I had forgotten though17:12
*** eghobo has quit IRC17:14
*** killer_prince has quit IRC17:16
openstackgerritRicardo Carrillo Cruz proposed a change to openstack/tripleo-incubator: Add 'Supported Platforms' section in README.md  https://review.openstack.org/8480117:18
SpamapSwow security-group-list is... useless17:18
*** morganfainberg_Z is now known as morganfainberg17:23
SpamapSdprince: I'm able to get to instances I've booted17:26
dprinceSpamapS: cool, floating too?17:27
openstackgerritMichael Tupitsyn proposed a change to openstack/tripleo-image-elements: Configurable Keystone token provider  https://review.openstack.org/8480217:28
SpamapSdprince: $ telnet 138.35.77.39 2217:28
SpamapSTrying 138.35.77.39...17:28
SpamapSConnected to 138.35.77.39.17:28
SpamapSEscape character is '^]'.17:28
SpamapSSSH-2.0-OpenSSH_6.2p2 Ubuntu-6ubuntu0.117:28
SpamapSdprince: had to add a security group rule allowing it17:28
dprinceSpamapS: Okay. I was testing one of derekh's. Not sure if he did that...17:28
SpamapSI see nodepool slaves building too17:28
dprinceSpamapS: So I suppose our os-collect-config did the trick...17:29
SpamapSwait no those might be old17:29
*** e0ne has joined #tripleo17:29
SpamapSdprince: word back from cloud-init is that hostname should not be changing after runlevel 2 btw17:29
SpamapSso my theory is bust17:29
*** matty_dubs|lunch is now known as matty_dubs17:29
openstackgerritMichael Tupitsyn proposed a change to openstack/tripleo-heat-templates: Configurable Keystone token provider  https://review.openstack.org/8480717:30
openstackgerritMichael Tupitsyn proposed a change to openstack/tripleo-incubator: Configurable Keystone token provider  https://review.openstack.org/8480817:30
*** e0ne has quit IRC17:31
openstackgerritColeman Corrigan proposed a change to openstack/tripleo-image-elements: Activate venvs in os-*-config elements source install  https://review.openstack.org/8481017:31
openstackgerritColeman Corrigan proposed a change to openstack/tripleo-image-elements: Activate venvs in os-*-config elements source install  https://review.openstack.org/8481017:33
*** jprovazn_afk is now known as jprovazn17:37
*** dividebin has joined #tripleo17:37
*** e0ne_ has quit IRC17:38
*** noslzzp has quit IRC17:38
*** tchaypo has quit IRC17:38
*** ohadlevy has quit IRC17:38
*** dividehex has quit IRC17:38
*** dividebin is now known as dividehex17:38
*** akuznetsov has joined #tripleo17:40
*** jcoufal has joined #tripleo17:41
*** ohadlevy has joined #tripleo17:43
*** ohadlevy is now known as Guest1397317:43
*** tchaypo has joined #tripleo17:44
openstackgerritSaurabh Surana proposed a change to openstack/tripleo-image-elements: base element for trove control plane elements  https://review.openstack.org/8260517:51
*** UtahDave has joined #tripleo17:57
jprovazngreghaynes, hi18:01
openstackgerritTzu-Mainn Chen proposed a change to openstack/tuskar-ui: Adds additional overcloud deployment config params  https://review.openstack.org/8479118:06
*** akuznetsov has quit IRC18:09
*** giulivo has quit IRC18:11
openstackgerritJay Dobies proposed a change to openstack/tuskar: Added keystone configuration to install guide  https://review.openstack.org/8482718:14
*** akuznetsov has joined #tripleo18:17
*** dkehn__ has joined #tripleo18:19
*** e0ne has joined #tripleo18:20
*** dkehn_ has quit IRC18:23
tchayposurely greghaynes can't be awake at the moment18:27
jprovazntchaypo, what TZ is greghaynes?18:29
greghaynesjprovazn: hey18:29
tchaypotechnically US-west, but he seems to work AUS hours18:29
greghaynespsh, always awake :)18:29
jprovazngreghaynes, :) sorry for waking you up18:30
greghaynesoh, you didnt. Just had a few real world things to do18:30
jprovazngreghaynes, I enjoyed today very "nice" time with galera cluster18:30
greghaynesuh oh18:31
greghayneswell... did it work? ;)18:32
*** noslzzp has joined #tripleo18:32
jprovazngreghaynes, the thing is that cluster init is more tricky - 1) start the first node in standalone mode (bootstrap param), then *after* the cluster is created (after more nodes join) and this first node is being restarted, it can be restarted in the common way (so it joins the cluster represented by the other nodes)18:32
jprovazngreghaynes, but if no other nodes have joined meantime and this first node is being restarted, then it should be started again as standalone18:33
greghaynesyes, are you getting at how rebooting the whole cluster is broken?18:33
jprovazngreghaynes, no - ^ this is still first boot phase :)18:34
jprovazngreghaynes, IOW current cluster status has to be considered when starting first node18:34
geerdestthe heat metadata needed to configure ntp server in overcloud…,where does that go?18:35
greghaynesWell, it's the same problem that comes up when the whole cluster is shut down and restarted - someone has to say "I have the actual on disk data"18:36
SpamapSthey call that leader election18:36
greghayneshehe yep18:36
*** rpodolyaka1 has quit IRC18:36
SpamapSand we had originally decided that node 0 is special for the bootstrap case only18:36
greghaynesso for the one cluster we can put in some special logic18:36
jprovazngreghaynes, yes, if you shutdown whole cluster, situation is same18:37
greghaynesIf there is more than one node though you cant just pick anyone - you have to figure out who has the most up to date data18:37
SpamapSonce it has bootstrapped, they'll all join that one, and then from that point on, we try really hard not to lose quorum. If we do, then we have to manually kick it.18:37
SpamapSgreghaynes: most up to date is easy because quorum18:37
SpamapSif you lost quorum, manual kick18:37
greghaynesIf they all have been shut down for some reason then they all have to propose their latest data revision and highest wins18:38
SpamapSgreghaynes: true thats still a quorum18:38
greghaynesYep, so I kind of punted on implementing that part18:38
*** hashar has joined #tripleo18:38
SpamapSgreghaynes: galera doesn't?18:38
greghayneshrm, xtradb-cluster docs indicated that needed to be done by the operator18:39
SpamapSgreghaynes: seems to me that you could say that galera should never start if it isn't the bootstrap node and there isn't quorum already.18:39
SpamapSbut yeah that sounds easier than it probably is18:39
jprovazngreghaynes, SpamapS galera keeps seqno attr for this reason (most up to date node) in grastate.dat18:39
SpamapSright18:39
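The "highest seqno wins" idea discussed above can be sketched roughly as follows. This is only an illustration, not how Galera or the patch under review resolves it: the function names are hypothetical, the input is assumed to be the text of each node's grastate.dat containing a `seqno:` line, and a negative seqno is treated here as "unknown" by assumption.

```python
import re


def parse_seqno(grastate_text):
    """Return the seqno found in grastate.dat-style text, or None if absent."""
    match = re.search(r"^seqno:\s*(-?\d+)\s*$", grastate_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def most_up_to_date(states):
    """Pick the node whose saved seqno is highest.

    `states` maps a node name to the text of its grastate.dat.
    Nodes with no seqno, or a negative one (assumed to mean "unknown"),
    are skipped. Returns None when no node has a usable seqno.
    """
    usable = {}
    for node, text in states.items():
        seqno = parse_seqno(text)
        if seqno is not None and seqno >= 0:
            usable[node] = seqno
    if not usable:
        return None
    return max(usable, key=usable.get)
```

The sketch leaves open the question raised in the channel: something still has to collect every node's state in one place before the comparison can run.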
greghaynesjprovazn: Yes, but not sure what facility to use to resolve that info across the cluster18:40
SpamapSjprovazn: so if the first two nodes that start up (0, 1) pick 1 as the most up to date, 1 will send its data to 0, right?18:40
SpamapSjprovazn: what if 2 starts up, and it is higher than 0 or 1?18:40
jprovaznSpamapS, I think so18:40
greghaynesFor a temp fix though I could just special case for a cluster size of 1 ...18:40
SpamapSgreghaynes: cluster size of 1 is special indeed.18:41
jprovaznSpamapS, I suppose it should fail to start (node 2)18:41
SpamapSjprovazn: ok, that is what worries me, especially in the datacenter-power-on scenario. So I bet Galera has a way to delay startup until all nodes are present.18:42
*** spzala has quit IRC18:42
jprovaznSpamapS, hm I don't know about such mechanism, but it deserves checking it18:43
SpamapScodership's wiki seems to have gone all 404'18:44
greghaynesYes. So the question is - is the N cold reboot something required for that patch to land18:45
*** yassine has quit IRC18:45
greghaynesI can see an argument that 1 node is, but I dont think > 1 should be as its not something we support yet18:45
jprovazngreghaynes, the issue with https://review.openstack.org/#/c/83675/8/elements/mysql/os-refresh-config/configure.d/51-mysql-init is that this starts mysql in standalone mode only on first run of os-refresh-config, then on the next run db.initialized file is created, and it will try to start in "join to cluster" mode18:46
greghaynesYep. really any time you try to restart mysql because the next time its possible that it should be the leader18:46
SpamapSthis is most disturbing18:47
jprovazngreghaynes, not exactly18:47
SpamapSgalera's manuals have gone away18:47
*** akuznetsov has quit IRC18:47
jprovazngreghaynes, by the "bootstrap" param (or by gcomm://) you're just saying "don't join to cluster", but if you restart this first node later (once the cluster was created), you want to join it to the existing cluster18:48
greghaynesdocs? who needs those18:48
SpamapSgreghaynes: I think as long as the cold reboot results in nodes -down- and not nodes inconsistent, then the patch is fine.18:48
greghaynesjprovazn: yes, its possible that another node in the cluster has more up to date info in which case the 0th node should be started without the bootstrap param18:49
greghaynesSpamapS: Agreed18:49
SpamapSright so here's a thought. bootstrap with node 0, that's easy, but then never do that again. If you encounter another situation where there is no known leader, error, exit, done.18:50
jprovazngreghaynes, SpamapS: well, it seems to me that the patch will not work: when you boot up the first node it will fail, because from the second os-refresh-config run onward it starts the node with "mysql restart", which fails because it tries to connect to other mysql nodes which are not running yet18:51
SpamapSthen we can cross the cold boot bridge later.18:51
greghaynesw00t, thats basically what happens. One enhancement that would probably be nice is an if cluster_size==1 then always reboot with bootstrap18:51
SpamapSjprovazn: will it require those other nodes, or just try to connect for the purposes of sharing?18:51
* SpamapS is actually quite hungry and will go soon18:52
jprovaznSpamapS, other nodes are required I believe - it tries nodes one by one, if none is avail, it fails18:52
jprovazngreghaynes, +1 for the enhancement18:52
SpamapSjprovazn: why not just try forever at that point?18:53
jprovazn(which is what I was playing with today :)18:53
jprovaznSpamapS, because it would be forever ;) - other nodes will not come up because these will not have other running nodes too18:53
jprovaznSpamapS, I think everybody will be waiting for first running node18:54
SpamapSyou're talking about the 1 node cluster.. that is special cased. The one with quorum expected will have quorum eventually no?18:54
jprovaznSpamapS, no, the same happens with multiple nodes18:54
jprovaznI think, I tried with 2 only18:54
SpamapSah, ok, I do not understand enough18:55
SpamapSand my belly is overriding my curiosity18:55
* SpamapS will bbiab18:55
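The startup rule worked out in the exchange above (bootstrap node 0 once, always bootstrap a single-node cluster, otherwise join, and stop with an error when there is no known leader) could be sketched like this. It is a simplified illustration under stated assumptions: the function name and its parameters are hypothetical, and how `cluster_created` and `peers_reachable` would actually be determined is exactly the open problem in the discussion.

```python
def start_mode(node_index, cluster_size, cluster_created, peers_reachable):
    """Decide how a node should start MySQL: "bootstrap" or "join".

    All four inputs are assumed to be supplied by the caller.
    Raises RuntimeError when no safe choice exists, so an operator
    can intervene instead of risking inconsistent data.
    """
    if cluster_size == 1:
        # A one-node cluster has nobody to join, so it always bootstraps.
        return "bootstrap"
    if node_index == 0 and not cluster_created:
        # First boot: node 0 is special for the bootstrap case only.
        return "bootstrap"
    if peers_reachable:
        return "join"
    # No known leader: error out rather than guess.
    raise RuntimeError("no known leader; manual intervention required")
```

For example, `start_mode(0, 3, cluster_created=True, peers_reachable=True)` returns `"join"`, which matches the case where node 0 is restarted after other nodes have joined.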
*** marun is now known as maru_afk18:57
*** rwsu has quit IRC19:01
*** ramishra has joined #tripleo19:01
openstackgerritJan Provaznik proposed a change to openstack/tripleo-image-elements: WIP: Add mysql/mariadb cluster support  https://review.openstack.org/8483819:02
jprovazngreghaynes, SpamapS: ^19:02
jprovazngreghaynes, SpamapS: this adds a check whether the first node should be started with bootstrap or joined to the cluster; as you can see, determining whether the cluster was already created is not trivial19:03
tchaypo9919:04
tchaypoc19:04
*** rwsu has joined #tripleo19:07
tchaypowell done me.19:10
greghaynesjprovazn: yes, so the issue is the second time around the node 0 could have a data version that is not most up to date and node 1 could be most up to date19:11
greghaynesjprovazn: in that case node 1 should do the bootstrap and node 0 should join19:11
greghaynesif you have node 0 bootstrap then there will be data loss19:12
tchaypohow do you determine which node has the most recent data?19:16
jprovazngreghaynes, well, that would be the situation when all your nodes have been down at the same time and you need to do bootstrap again (hope this will not happen); the really major issue which the script is trying to solve is the first cluster setup19:16
tchaypoIs it possible to have cases where node 0 and node 1 both have some data that the other hasn't seen (maybe because of a network segmentation event before they both died)?19:17
jprovazngreghaynes, it takes care of bootstrapping the first node at the right time19:17
jprovazn(as long as cluster was not created yet)19:18
*** CaptTofu has joined #tripleo19:18
greghayneshow do you know with that check the second time through that it shouldnt bootstrap again?19:18
jprovazngreghaynes, https://review.openstack.org/#/c/84838/1/elements/mysql-common/os-refresh-config/configure.d/51-init-mysql-cluster - cluster_created function - it checks if cluster was created meantime or not19:19
greghaynes:) aha19:20
greghaynesI like that idea19:20
greghaynesIll mess with it a bit19:20
jprovazngreghaynes, and yes, it's not very nice - it would be great to have solution without notify script19:21
greghaynesThat stuff should probably go in /mysql not mysql-common right?19:21
jprovazngreghaynes, and I could not find such solution during today19:21
jprovazngreghaynes, no - it's same for mysql and mariadb - mysql-common is right place unless there is percona speciality19:21
greghaynesAwesome19:24
jprovazngreghaynes, an alternative to the notify script would be to always do "service mysql start || service mysql bootstrap" for the first node (if it fails to join the existing cluster, bootstrap it), but there is a problem with timeout - cluster join might take a longer time (data sync), for this reason the init script timeout is by default 5 minutes, it could be changed but anyway a couple of minutes delay on each start is probably suboptimal19:25
greghaynesyes, that seems very scary to me and likely to cause spurious data corruption19:26
greghaynesTheres definitely a long term solution needing to be developed so it should be fine to punt on having nodes come back from a full cold reboot19:27
greghaynesand I want to get enough landed to start working on fixing the other parts we're missing - like migrations on only one node19:28
greghaynesjprovazn: What do you think about making your patch based on the galera one?19:30
greghaynesotherwise its going to be a bad set of conflicts when one merges19:30
*** rpodolyaka1 has joined #tripleo19:30
*** e0ne has quit IRC19:37
jprovazngreghaynes, sure thing19:37
*** spzala has joined #tripleo19:38
*** jp_at_hp has quit IRC19:42
*** cwolferh has quit IRC19:43
*** cwolferh has joined #tripleo19:43
*** dividehex has quit IRC19:44
*** jcoufal has quit IRC19:44
tchaypodprince: you just made my morning :)19:45
dprincetchaypo: cool. What did I do?19:46
tchaypodprince: it's more what you didn't do19:46
tchaypodprince: specifically, find nits on the standarize-location-of-passwords change19:47
tchaypoI'm easily pleased19:48
dprincetchaypo: ah, well I'm glad it worked out then :)19:48
*** e0ne has joined #tripleo19:51
*** dividehex has joined #tripleo19:51
*** ramishra has quit IRC19:51
*** e0ne has quit IRC19:52
openstackgerritMichael Tupitsyn proposed a change to openstack/tripleo-image-elements: Configure logging for keystone  https://review.openstack.org/8484720:09
*** dprince has quit IRC20:15
*** akrivoka has quit IRC20:29
*** dkehn__ is now known as dkehn_20:29
*** maru_afk is now known as marun20:31
*** jprovazn has quit IRC20:32
*** lblanchard has quit IRC20:32
*** lblanchard has joined #tripleo20:33
*** blamar_ has joined #tripleo20:34
*** blamar has quit IRC20:36
*** blamar_ is now known as blamar20:36
greghaynestchaypo: You have any luck with that heat template?20:40
tchaypogot distracted by something else, and now I've inadvertently destroyed my seed20:42
greghaynesrip20:42
lifelesswow, I slept in20:43
lifelessSpamapS: hi20:43
tchaypomorning lifeless20:43
SpamapSlifeless: howdy20:43
tchaypowelcome to the day20:43
SpamapSlifeless: so ci-overcloud looks to be working.. but nodepool isn't taking advantage just yet.. not sure why20:44
*** blamar has quit IRC20:44
*** blamar has joined #tripleo20:44
lifeless| cdf9ae2d-5b65-4911-a5e3-4ea982a6fead | tripleo-precise-tripleo-test-cloud-3392283.slave.openstack.org | BUILD  | -          | NOSTATE     |                                                                     |20:45
lifeless| 78b0f203-3d26-42f8-9ef4-c0aa2c29dd93 | tripleo-precise-tripleo-test-cloud-3392286.slave.openstack.org | BUILD  | -          | NOSTATE     |                                                                     |20:45
SpamapSlifeless: those are old20:45
lifelessok20:46
SpamapSlifeless: from right before we got things back up IIRC20:46
lifelessSpamapS: have we upgraded the kernel ?20:47
SpamapSlifeless: not that I know of20:47
*** blamar has quit IRC20:47
SpamapSlifeless: I came in after DHCP was working again20:47
SpamapSthey were about to start mucking with /etc/hostname and I said just run os-collect-config --force --one and that seemed to have worked20:48
*** marun is now known as maru_afk20:48
*** spzala has quit IRC20:49
*** spzala has joined #tripleo20:50
*** blamar has joined #tripleo20:50
* tchaypo discovers dib's "check-break" function20:53
*** CaptTofu has quit IRC20:58
*** rpodolyaka1 has quit IRC20:59
*** jdob has quit IRC21:00
*** untriaged-bot has joined #tripleo21:00
untriaged-botUntriaged bugs so far:21:00
untriaged-bothttps://bugs.launchpad.net/tripleo/+bug/129048821:00
uvirtbotLaunchpad bug 1290488 in tripleo "Baremetal: Invalid credentials" [Undecided,Incomplete]21:00
*** untriaged-bot has quit IRC21:00
*** rpodolyaka1 has joined #tripleo21:02
*** blamar has quit IRC21:02
*** lblanchard has quit IRC21:11
*** rpodolyaka1 has quit IRC21:14
*** jang1 has quit IRC21:16
*** hashar has quit IRC21:18
*** eguz has quit IRC21:25
*** eghobo has joined #tripleo21:25
openstackgerritBen Nemec proposed a change to openstack/diskimage-builder: set -u and -o pipefail everywhere  https://review.openstack.org/8486821:26
openstackgerritBen Nemec proposed a change to openstack/diskimage-builder: set -e all the things  https://review.openstack.org/8392721:26
openstackgerritBen Nemec proposed a change to openstack/diskimage-builder: Make sure all scripts are set -e  https://review.openstack.org/8163721:26
openstackgerritBen Nemec proposed a change to openstack/diskimage-builder: dib-lint does not work with set -e  https://review.openstack.org/8393021:26
openstackgerritBen Nemec proposed a change to openstack/diskimage-builder: Check for set -o pipefail  https://review.openstack.org/8392921:26
openstackgerritBen Nemec proposed a change to openstack/diskimage-builder: Ensure scripts are set -u  https://review.openstack.org/8392821:26
*** matty_dubs is now known as matty_dubs|gone21:29
*** meena has quit IRC21:29
*** meena has joined #tripleo21:30
openstackgerritMichael Tupitsyn proposed a change to openstack/tripleo-image-elements: Configurable Keystone token provider  https://review.openstack.org/8480221:31
*** markmc has quit IRC21:32
*** eguz has joined #tripleo21:38
*** eghobo has quit IRC21:41
* tchaypo wonders how StevenK enjoyed his 3am phonecall21:47
*** blamar has joined #tripleo21:51
greghaynes3am and then one coming up21:52
tchaypoI did not expect the phone conference to get all metaphysical on me21:56
tchaypo"while you wait you will hear silence"21:56
tchaypowill i? really? is that even possible?21:56
greghaynesTrying to give you something to think about while you wait21:56
*** UtahDave has quit IRC21:57
tchaypoit's a lot better than spamming muzak in my ear21:58
greghaynesmmm seems merge.py isnt correctly scaling compute with software-config22:00
tchaypocody-somerville: do we have any plans for pycon-au yet?22:00
cody-somervilleOf course. :)22:01
tchaypoah, excellent! How can I find out more about these plans?22:01
openstackgerritDerek Higgins proposed a change to openstack/tripleo-image-elements: Install bridge-utils on compute nodes  https://review.openstack.org/8487622:01
openstackgerritDerek Higgins proposed a change to openstack/tripleo-image-elements: Install bridge-utils on compute nodes  https://review.openstack.org/8487622:02
cody-somervilletchaypo: So it doesn't look like HP is currently sponsoring (we can see about changing that if we feel it important - though we're already sponsoring a number of Python events this year) but I do have money set aside for about 3-4 people to go.22:03
tchaypoDo we plan to run a miniconf again?22:04
tchaypoor rather - did we find that valuable last year?22:04
cody-somervilleI wasn't involved in our activities at PyCon-AU last year so uncertain.22:05
cody-somervilleBut CFP closes on April 25th.22:05
lifelesstchaypo: we didn't run the miniconf last time22:06
lifelesstchaypo: tristan did, and roped me in to help22:06
tchaypoah. Tristan is.. rackspace?22:06
lifelesstchaypo: what I think we should do is actively be involved and reach out to pull in talks etc22:07
lifelesstchaypo: aptira22:07
tchaypoI've seen his name on the ozstackers meetup group22:07
bnemecIf any cores are looking for an easy review, I'd love a +2 on https://review.openstack.org/#/c/78461 (assuming there are no problems, of course :-)22:24
*** TravT has joined #tripleo22:26
openstackgerritBen Nemec proposed a change to openstack/tripleo-image-elements: Make cinder-tgt/lio depend on cinder-volume  https://review.openstack.org/8488422:27
openstackgerritBen Nemec proposed a change to openstack/tripleo-image-elements: Add cinder-lio element  https://review.openstack.org/7846322:27
openstackgerritBen Nemec proposed a change to openstack/tripleo-image-elements: Factor out tgt-specific parts of cinder element  https://review.openstack.org/7846222:27
*** tchaypo has quit IRC22:40
*** tchaypo has joined #tripleo22:54
openstackgerritMichael Tupitsyn proposed a change to openstack/tripleo-incubator: Configurable Keystone token provider  https://review.openstack.org/8480822:57
*** weshay has quit IRC22:59
*** john-n-seattle2 has joined #tripleo23:01
*** yamahata has joined #tripleo23:01
*** john-n-seattle2 has left #tripleo23:01
openstackgerritMichael Tupitsyn proposed a change to openstack/tripleo-heat-templates: Configurable Keystone token provider  https://review.openstack.org/8480723:03
*** spzala has quit IRC23:04
openstackgerritMichael Tupitsyn proposed a change to openstack/tripleo-image-elements: Configurable Keystone token provider  https://review.openstack.org/8480223:07
*** yamahata has quit IRC23:11
*** lucas-afk has quit IRC23:15
*** noslzzp has quit IRC23:16
greghaynesSpamapS: Is the plan for merge.py to not scale out NovaComputeConfig in https://review.openstack.org/#/c/81666/11/nova-compute-instance.yaml ?23:33
*** xuhaiwei has joined #tripleo23:34
greghaynesAs is, it currently scales that resource into NovaCompute0Config and NovaCompute1Config, but doesn't update refs23:34
xuhaiweimorning23:35
lifelesswow the day has gone fast23:36
greghaynesI wonder if it's sane to make merge.py not apply scaling if resource doesn't match scaling_prefix + 0 ?23:37
greghayneslifeless: ^ or why doesn't it currently operate that way?23:38
lifelessso it scales the configs out today because they need to be unique. I'd be ok with making it only scale Foo0 and changing the CFN templates to match23:39
greghaynesawesome. Seems safer than assuming prefix.*23:39
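A minimal sketch of the rule agreed above: only scale resources whose name starts with exactly the prefix plus "0", and rewrite references inside each copy. This is not the real merge.py; `scale_resources`, its arguments, and the sample resource bodies are hypothetical, and the string replacement on the JSON text is a deliberately naive way to update refs.

```python
import json


def scale_resources(resources, prefix, count):
    """Copy every resource named <prefix>0... once per extra index.

    Hypothetical helper, not merge.py itself. Names that merely start with
    the prefix (for example prefix + "Config") are left unscaled.
    """
    base = prefix + "0"
    scaled = dict(resources)
    for name, body in resources.items():
        if not name.startswith(base):
            continue  # e.g. NovaComputeConfig does not match NovaCompute0
        for index in range(1, count):
            replacement = prefix + str(index)
            # Naive ref update: rewrite every mention of the base name.
            text = json.dumps(body).replace(base, replacement)
            scaled[replacement + name[len(base):]] = json.loads(text)
    return scaled


resources = {
    "NovaCompute0": {"Properties": {"config": {"Ref": "NovaCompute0Config"}}},
    "NovaCompute0Config": {"Type": "Config"},
    "NovaComputeConfig": {"Type": "Shared"},
}
scaled = scale_resources(resources, "NovaCompute", 2)
```

With this rule NovaComputeConfig stays a single shared resource, while NovaCompute0 and NovaCompute0Config each gain an index-1 copy whose refs point at the matching copy.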
tchaypofrom my cursory reading I thought it was looking for exactly prefix0, I didn't realise it was prefix.*23:41
*** andreaf2 has joined #tripleo23:41
greghaynesoh, you're reading merge.py23:41
* greghaynes hands tchaypo a pot of coffee23:41
greghaynesyou'll need this23:41
StevenKgreghaynes: Don't forget the flask of whiskey23:42
greghaynesYes, good call23:42
tchaypono no23:43
tchaypoI'm currently reading pip23:43
tchaypoand trying to find why it doesn't seem to bother reading mirror_base/index.html if mirror_base starts with file://23:43
tchaypovarious bits of code hint that it should be doing that, but strace shows that it doesn't.23:44
greghaynesstrace for great good?23:44
greghaynes:/23:44
greghaynesmuch fun23:44
tchaypoOne of these days I'm going to figure out how to make pip set loglevel.DEBUG and then I might get more of an idea what it's doing.23:45
tchaypooh, and I'm pretty sure it's one of those things where I'm going to feel stupid that I didn't figure it out sooner, too23:45
lifelesstchaypo: does it treat the directory as the page and readdir instead?23:46
lifelesstchaypo: and cast that into links23:46
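The behaviour lifeless asks about (treat a local directory as the index page and turn its listing into links) can be sketched roughly as follows. This is an illustration only, not pip's code; `directory_links` is a hypothetical name, and returning an empty list for a missing directory is an assumption made for the sketch.

```python
from pathlib import Path


def directory_links(directory):
    """Return file:// links for each entry in a local directory.

    Hypothetical helper: a missing or non-directory path yields no links
    instead of raising.
    """
    root = Path(directory).resolve()
    if not root.is_dir():
        return []
    # Each directory entry becomes a link, as an HTML index page would list it.
    return sorted(entry.as_uri() for entry in root.iterdir())
```

Called on a mirror directory such as /tmp/pypi/markupsafe, this would list whatever archives are present there.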
lifelesstchaypo: personally for this sort of thing I tend to edit the code in question and insert 'import pdb; pdb.set_trace()'23:46
tchaypothe comment from the --index-url param in the help output says that that's what it does23:47
tchaypoand of course that's why it breaks; when it tries to get the directory name that doesn't exist23:47
tchaypobut other comments say that it should be looking for mirror_base/index.html first, even if it's a file://, and that doesn't seem to be happening23:47
lifelesstchaypo: I have different help I think23:48
lifeless  -i, --index-url <url>       Base URL of Python Package Index (default https://pypi.python.org/simple/).23:48
lifeless  --extra-index-url <url>     Extra URLs of package indexes to use in addition to --index-url.23:48
lifeless  --no-index                  Ignore package index (only looking at --find-links URLs instead).23:48
lifeless  -f, --find-links <url>      If a url or path to an html file, then parse for links to archives. If a local path or file:// url that's a directory, then23:48
lifeless                              look for archives in the directory listing.23:48
lifelesswe're not using -f23:48
tchaypono, but even if we do, it stats /tmp/pypi/markupsafe and dies23:49
tchayposorry, distracted moving stuff getting ready to get it taken to storage23:49
lifelesstis ok23:49
lifelessmay I suggest that -f is a distraction23:49
lifelessthe mirror we have is correct AFAICT, even *without* an index.html23:49
tchaypoyep, I'm more interested in why the index-url isn't being hit23:50
lifelesstchaypo: so I'd put a breakpoint as above, at the start of the function where Real name of *should* be output23:50
lifelessbut that's me :)23:50
tchaypoI agree with the breakpoint idea23:52
tchaypobut the "Real name" bit happens inside the call to self._find_url_name() on line 196, and we never get that far - calls self._get_page() on line 194 and (further down the call tree) does the stat and exits23:52
tchaypoThere's another obvious target there - should that exception get caught?23:53
lifelesslet me have a look23:53
lifelesswhat's the function name in the routine - the one line 194 is in23:53
tchayposorry, was just musing, not actually intending that to be a question to you23:53
lifelesscuriosity piqued23:53
tchaypo  File "/usr/local/lib/python2.7/dist-packages/pip/index.py", line 194, in find_requirement23:53
tchaypo    page = self._get_page(main_index_url, req)23:53
lifelessyeah23:54
lifelesswhat exception is thrown ?23:54
lifeless            if scheme == 'file' and os.path.isdir(url2pathname(path)):23:55
lifelessIs that what raises?23:55
*** spzala has joined #tripleo23:55
lifelessno23:55
lifelessthat should be safe23:56
tchaypo  File "/usr/local/lib/python2.7/dist-packages/pip/download.py", line 194, in send23:56
tchaypo    stats = os.stat(pathname)23:56
tchaypoOSError: [Errno 2] No such file or directory: '/tmp/pypi/markupsafe/'23:56
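The difference between the two calls quoted above can be reproduced in isolation: `os.path.isdir` returns False for a path that does not exist, while `os.stat` raises OSError with errno 2 (ENOENT). A small sketch, where `describe_path` is a hypothetical helper and not part of pip:

```python
import errno
import os


def describe_path(pathname):
    """Classify a path without letting a missing file escape as an exception."""
    if os.path.isdir(pathname):
        # isdir() is safe on missing paths: it simply returns False.
        return "directory"
    try:
        os.stat(pathname)
    except OSError as exc:
        if exc.errno == errno.ENOENT:
            return "missing"  # same errno as in the traceback above
        raise
    return "file"
```

This matches the observation that the isdir check is safe and that the failure comes from the unguarded stat further down the call tree.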
lifelesswhat's the frame above ?23:56
lifelesserm23:56
lifelesswhat's the first frame in index.py23:56
Shrewslifeless: fyi, I've gotten much better battery life since installing pm-powersave23:56
lifelessapt-cache show pm-powersave23:57
lifelessN: Unable to locate package pm-powersave23:57
Shrewsand powertop23:57
tchaypowhat lifeless said23:57
Shrewslifeless: pm-utils23:57
lifelessShrews: I have that already :)23:57
lifelessShrews: I guess you mean running pm-powersave?23:57
Shrewshrm, maybe it was the powertop twiddling that did it23:57
tchaypoI find that when I run powertop --auto-tune my usage drops 3-5W, which is nice23:58
tchaypobut every time things change, various settings fall back to bad23:58
lifelessTIL23:58
lifelesstchaypo: so - frames ?23:58
tchaypolifeless: in http://paste.openstack.org/show/74795/ https://bugs.launchpad.net/tripleo/+bug/130122023:59
lifelessreally?23:59
