*** tetsuro has joined #openstack-placement | 00:08 | |
openstackgerrit | JiaJunsu proposed openstack/nova master: Remove args(os=False) in monkey_patch https://review.openstack.org/568999 | 01:32 |
---|---|---|
openstackgerrit | Yikun Jiang (Kero) proposed openstack/nova master: Make monkey patch work in uWSGI mode https://review.openstack.org/592285 | 01:51 |
*** lei-zh has joined #openstack-placement | 02:24 | |
*** lei-zh has quit IRC | 02:33 | |
*** lei-zh has joined #openstack-placement | 02:33 | |
*** lei-zh has quit IRC | 03:39 | |
*** lei-zh has joined #openstack-placement | 03:39 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Transform libvirt.error notification https://review.openstack.org/484851 | 03:58 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Adds view builders for keypairs controller https://review.openstack.org/347289 | 03:58 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (3) https://review.openstack.org/574104 | 03:58 |
*** takashin has joined #openstack-placement | 03:59 | |
*** lei-zh has quit IRC | 03:59 | |
*** lei-zh1 has joined #openstack-placement | 03:59 | |
*** lei-zh1 has quit IRC | 04:11 | |
*** lei-zh1 has joined #openstack-placement | 04:11 | |
*** lei-zh1 has quit IRC | 04:16 | |
openstackgerrit | Merged openstack/nova master: Make instance_list perform per-cell batching https://review.openstack.org/593131 | 04:55 |
*** tetsuro has quit IRC | 05:24 | |
*** lei-zh1 has joined #openstack-placement | 05:43 | |
*** tetsuro has joined #openstack-placement | 06:08 | |
*** tetsuro has quit IRC | 06:09 | |
*** tetsuro has joined #openstack-placement | 06:57 | |
*** tetsuro has quit IRC | 07:07 | |
*** tssurya has joined #openstack-placement | 07:08 | |
*** dims has quit IRC | 07:08 | |
*** dims has joined #openstack-placement | 07:10 | |
*** cdent has joined #openstack-placement | 07:39 | |
openstackgerrit | Tushar Patil proposed openstack/nova-specs master: Bi-directional enforcement of traits https://review.openstack.org/593475 | 08:05 |
*** tetsuro has joined #openstack-placement | 08:19 | |
openstackgerrit | Alex Xu proposed openstack/nova-specs master: Resource retrieving: add change-before filter https://review.openstack.org/591976 | 08:21 |
*** ttsiouts has joined #openstack-placement | 08:29 | |
*** takashin has left #openstack-placement | 08:32 | |
*** e0ne has joined #openstack-placement | 08:43 | |
*** ttsiouts has quit IRC | 09:05 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: conf: Use new-style choice values https://review.openstack.org/530924 | 09:21 |
openstackgerrit | Moshe Levi proposed openstack/nova master: libvirt: skip setting rx/tx queue sizes for not virto interfaces https://review.openstack.org/595592 | 09:24 |
*** ttsiouts has joined #openstack-placement | 09:30 | |
*** lei-zh1 has quit IRC | 09:33 | |
*** cdent has quit IRC | 09:45 | |
*** ttsiouts has quit IRC | 10:01 | |
*** tetsuro has quit IRC | 10:02 | |
*** cdent has joined #openstack-placement | 10:20 | |
*** nicolasbock has joined #openstack-placement | 10:20 | |
*** stephenfin has quit IRC | 10:35 | |
*** stephenfin has joined #openstack-placement | 10:36 | |
*** ttsiouts has joined #openstack-placement | 10:52 | |
cdent | efried or edleafe either of you around yet? | 11:15 |
cdent | nm, will check in later | 11:21 |
*** giblet_off is now known as gibi | 11:49 | |
gibi | jaypipes: thanks for the feedback on the any traits spec, I will respin to include efried's additional explanation | 11:53 |
jaypipes | gibi: cool :) | 11:56 |
gibi | efried: regarding evacuation and reshape. I think we need to explicitly filter for the outsider RP (e.g. the other compute RP) before we pass the input to the virt driver. I feel it is easier for the resource tracker to do this filtering than to let the virt driver do it | 12:05 |
cdent | jaypipes: with regard to generation in allocation GETs it looks like you asked for them on patchset 2 of https://review.openstack.org/#/c/366789/ | 12:32 |
jaypipes | cdent: bad on me, then. | 12:34 |
*** mriedem has joined #openstack-placement | 12:35 | |
cdent | jaypipes: it _may_ have been something to do with generation cache handling, but I don't recall the details | 12:35 |
openstackgerrit | Konstantinos Samaras-Tsakiris proposed openstack/os-traits master: Add CUDA versions 8 and 9 https://review.openstack.org/597111 | 12:42 |
cdent | gonna change locations and then gonna review specs | 12:45 |
*** cdent has quit IRC | 12:46 | |
openstackgerrit | Merged openstack/nova master: Deprecate Core/Ram/DiskFilter https://review.openstack.org/596502 | 12:47 |
jaypipes | efried: done reviewing reshaper series. | 12:48 |
efried | jaypipes: ack, thx | 12:53 |
efried | FYI, I have a doc appt in an hour. Shouldn't take long. | 12:54 |
efried | gibi, mriedem: top two reshaper patches just need one more approval. | 13:02 |
gibi | efried: ack | 13:02 |
mriedem | i'm reviewing them now | 13:04 |
efried | thanks y'all | 13:05 |
*** cdent has joined #openstack-placement | 13:29 | |
*** efried is now known as efried_doc | 13:29 | |
*** stephenfin has quit IRC | 13:33 | |
*** stephenfin has joined #openstack-placement | 13:34 | |
*** purplerbot has joined #openstack-placement | 13:34 | |
cdent | gibi: thanks for the quick responses on that email | 13:42 |
gibi | cdent: I hope I can help | 13:50 |
openstackgerrit | Jay Pipes proposed openstack/os-traits master: Add CUDA versions 8 and 9 https://review.openstack.org/597111 | 14:10 |
*** ttsiouts has quit IRC | 14:16 | |
mriedem | is a shared storage provider in an aggregate relationship with a compute node root provider considered part of the compute node provider's "tree"? | 14:22 |
cdent | mriedem: efried_doc will have the details on that, but the short answer is yet | 14:23 |
cdent | yes | 14:23 |
cdent | it's one of the reasons that aggregate mgt got generations | 14:24 |
mriedem | efried_doc: ok questions in https://review.openstack.org/#/c/585049/ and clearly i don't know how shared providers are modeled in the tree | 14:24 |
dansmith | so, the shared provider isn't actually in the tree, | 14:26 |
dansmith | because it doesn't have a parent node of the compute node, right? | 14:26 |
dansmith | but some other operations act like it _is_ under the compute node? | 14:26 |
cdent | there's the collection of the nested providers under the compute node. and then there's the content of the ProviderTree (or trees) which is being managed by the nova-compute process. As I recall, these are not quite the same thing | 14:27 |
cdent | I assumed mriedem was talking about the ProviderTree ? | 14:27 |
mriedem | https://review.openstack.org/#/c/585049/19/nova/tests/functional/test_report_client.py@1334 for context | 14:29 |
mriedem | that code walks the tree and adds a new inventory resource class to all providers in the tree, | 14:30 |
mriedem | i assumed that didn't include the shared storage provider since it's not a child of the root compute node, as dansmith said | 14:30 |
dansmith | mriedem: right, I would assume the same | 14:31 |
mriedem | looking at _Provider.get_provider_uuids() it only returns children | 14:31 |
dansmith | I would expect a client to have to follow the aggregate relationship to find the shared provider if it needs it | 14:31 |
mriedem | yeah me too | 14:31 |
dansmith | that also makes me wonder what happens if you do an in_tree with the compute node, do you magically get things out of the tree because allocation_candidates thinks you probably wanted that? | 14:32 |
mriedem | you mean on GET /resource_providers?in_tree=<cn_rp_uuid>? | 14:35 |
mriedem | we don't have GET /allocation_candidates?in_tree yet | 14:35 |
dansmith | mriedem: right, I'm just trying to think about what it would mean | 14:35 |
jaypipes | mriedem: answered your question on the review. | 14:35 |
mriedem | i've always wanted some examples that the "in_tree" param to GET /resource_providers could link to in the docs, for various scenarios | 14:36 |
mriedem | because today it just says, "A UUID of a resource provider. The returned resource providers will be in the same “provider tree” as the specified provider." | 14:36 |
dansmith | mriedem: weren't you discussing in_tree for like resize to same host? | 14:36 |
dansmith | er, in-place resize whatever | 14:36 |
*** ttsiouts has joined #openstack-placement | 14:36 | |
mriedem | dansmith: for getting allocation candidates? | 14:36 |
dansmith | maybe we wouldn't need to query a_c for that | 14:36 |
mriedem | i've talked about adding in_tree to GET /a_c for force_hosts | 14:36 |
dansmith | ah, that's it I guess | 14:37 |
jaypipes | dansmith: calling GET /resource_providers?in_tree=X only returns the root and any children (and their children) of X. it does not return sharing providers. | 14:37 |
mriedem | we don't know if we're resizing to the same host until the scheduler gives us the same host as a selection | 14:37 |
mriedem | jaypipes: ok good; would be nice to document that in the API reference :) | 14:37 |
dansmith | mriedem: I meant in-place resize | 14:37 |
mriedem | "sharing providers are in the same forest, but not the same tree" | 14:37 |
jaypipes | dansmith, mriedem: however, that said, we *do* grab sharing providers as top-level root nodes in the ProviderTree object that is used in update_provider_tree() and update_from_provider_tree(). See my comment on https://review.openstack.org/#/c/585049/19/nova/tests/functional/test_report_client.py@1334 | 14:38 |
mriedem | dansmith: in-place resize = resize to same host yeah | 14:38 |
mriedem | ? | 14:38 |
jaypipes | mriedem: you mean basically replace what's there now ("The returned resource providers will be in the same “provider tree” as the specified provider.") with what I just said above? | 14:39 |
dansmith | mriedem: there's a distinction of whether or not we got the same host while looking for a candidate during a normal schedule (which would call a_c) and an actual future in-place resize operation where we're intending _not_ to move, but like I realized above, probably no reason to call a_c for that | 14:39 |
jaypipes | mriedem: I can do that, but we don't really mention sharing providers in the placement api reference and since we don't yet officially support them, might want to hold off on that. | 14:40 |
mriedem | jaypipes: yeah good point on not calling those out in docs | 14:40 |
*** efried_doc is now known as efried | 14:41 | |
* efried is officially healthy for another year. | 14:41 | |
jaypipes | efried: good for you! | 14:41 |
efried | reading back... | 14:41 |
mriedem | dansmith: is "an actual future in-place resize operation" something that's not in nova today? maybe you're referring to live resize? | 14:41 |
efried | mriedem: The sharing provider is not part of the tree qua tree as parent/root providers go, but it *is* loaded up and passed in the ProviderTree object as a separate root provider that the virt driver gets to see and play with. | 14:42 |
dansmith | mriedem: yes. I'm talking about in-place live non-disruptive resize of a guest without moving it in the future when we have that someday | 14:42 |
dansmith | mriedem: was just thinking out loud above, my apologies | 14:42 |
mriedem | dansmith: ok, np, b/c i also have a todo to fix doubled up allocations on the same rp for same-host resize | 14:43 |
mriedem | i wasn't sure if you were talking about the same thing | 14:43 |
efried | oh, look, jaypipes already said all of that. | 14:43 |
efried | mriedem: get_provider_uuids() on the ProviderTree object will return the UUIDs of all the providers in the object, which may include multiple roots and their children, depending how the ProviderTree was constructed. | 14:46 |
efried | mriedem: Don't read too much into the test case as representing something that you would actually do in real life. It was built to try to exercise the corners, which real life usually won't do. | 14:47 |
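The distinction drawn above (an `in_tree` query follows parent/root links only, while the ProviderTree object handed to the virt driver may also carry sharing providers as extra roots) can be illustrated with a toy model. This is only a sketch under assumptions: `Provider`, `root_of`, `in_tree` and `compute_view` are hypothetical names invented for illustration, not placement or nova code, and the aggregate matching rule is a simplification of what was described in the channel.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Provider:
    """Hypothetical stand-in for a resource provider record."""
    uuid: str
    parent: Optional[str] = None          # parent uuid, None for a root
    aggregates: set = field(default_factory=set)
    sharing: bool = False                 # True for e.g. shared storage


def root_of(providers, uuid):
    """Walk parent links up to the root provider."""
    while providers[uuid].parent is not None:
        uuid = providers[uuid].parent
    return uuid


def in_tree(providers, uuid):
    """Toy version of an in_tree query: providers sharing the same root."""
    root = root_of(providers, uuid)
    return sorted(u for u in providers if root_of(providers, u) == root)


def compute_view(providers, compute_root):
    """Toy version of what the compute side loads: the compute node's own
    tree plus any sharing provider reachable through a common aggregate,
    kept as a separate root."""
    uuids = set(in_tree(providers, compute_root))
    aggs = providers[compute_root].aggregates
    for prov in providers.values():
        if prov.sharing and prov.aggregates & aggs:
            uuids.update(in_tree(providers, prov.uuid))
    return sorted(uuids)


providers = {p.uuid: p for p in [
    Provider("cn1", aggregates={"agg1"}),
    Provider("numa0", parent="cn1"),
    Provider("numa1", parent="cn1"),
    Provider("ss1", aggregates={"agg1"}, sharing=True),
]}

print(in_tree(providers, "cn1"))
print(compute_view(providers, "cn1"))
```

In this model the sharing provider `ss1` never shows up in `in_tree`, but it does appear in the compute-side view, which matches the "same forest, but not the same tree" wording used earlier.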
mriedem | i was looking at get_provider_tree_and_ensure_root and trying to figure out where the aggregates are pulled in | 14:47 |
openstackgerrit | Merged openstack/os-traits master: Add CUDA versions 8 and 9 https://review.openstack.org/597111 | 14:47 |
mriedem | oh i guess _ensure_resource_provider | 14:47 |
mriedem | which now does *way* more than the method name suggests | 14:48 |
mriedem | that calls _refresh_associations and that's what pulls in the shared aggregate providers | 14:48 |
efried | yes | 14:49 |
efried | that was like Queens stuff. | 14:49 |
mriedem | sorry | 14:50 |
mriedem | i'm old and stuck in pike land | 14:50 |
mriedem | gd millennial helper methods | 14:50 |
*** e0ne has quit IRC | 14:56 | |
*** nicolasbock has quit IRC | 14:56 | |
*** nicolasbock has joined #openstack-placement | 14:59 | |
*** ttsiouts has quit IRC | 15:04 | |
mriedem | efried: i'll wait to vote on https://review.openstack.org/#/c/585049/ until you reply | 15:05 |
*** jroll has quit IRC | 15:05 | |
efried | mriedem: ack, working on it now. | 15:06 |
*** jroll has joined #openstack-placement | 15:06 | |
*** alex_xu has quit IRC | 15:10 | |
openstackgerrit | Merged openstack/nova stable/ocata: Default embedded instance.flavor.disabled attribute https://review.openstack.org/580525 | 15:24 |
efried | mriedem: Responded | 15:34 |
openstackgerrit | Merged openstack/nova master: Make monkey patch work in uWSGI mode https://review.openstack.org/592285 | 15:38 |
openstackgerrit | Jay Pipes proposed openstack/os-traits master: clean up CUDA traits https://review.openstack.org/597170 | 15:57 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Don't use '_TransactionContextManager._async' https://review.openstack.org/597173 | 16:06 |
*** e0ne has joined #openstack-placement | 16:06 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Don't use '_TransactionContextManager._async' https://review.openstack.org/597173 | 16:14 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Revert "Don't use '_TransactionContextManager._async'" https://review.openstack.org/597174 | 16:18 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Revert "Don't use '_TransactionContextManager._async'" https://review.openstack.org/597174 | 16:19 |
gibi | efried, jaypipes: I replied in https://review.openstack.org/#/c/576236 I still think we have problems with killing the nova-compute service | 16:20 |
gibi | efried, jaypipes, cdent the rest of the reshaper series looks good to me | 16:22 |
mriedem | yeah we're +2 up through to the end | 16:26 |
gibi | there are places with 3 +2s :) | 16:27 |
gibi | I'll call it a day. See you tomorrow. | 16:28 |
efried | gibi: Thanks. | 16:29 |
openstackgerrit | Merged openstack/nova master: Make scheduler.utils.setup_instance_group query all cells https://review.openstack.org/540258 | 16:32 |
efried | jaypipes, mriedem: I may need a bit of help to resolve what gibi pointed out on https://review.openstack.org/#/c/576236/ | 16:34 |
efried | Do we try to change something to actually make it blow up the compute service, or do we amend the commit message, comments, and spec to say that we don't actually blow up on periodic? | 16:34 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: tests: Further simplification of test_numa_servers https://review.openstack.org/596832 | 16:39 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: tests: Validate huge pages https://review.openstack.org/399653 | 16:39 |
mriedem | i haven't gotten into the bowels of that change yet, but i'm assuming the issue is the periodic runs at the same time we startup and are reshaping? i don't think that's actually possible. | 16:40 |
mriedem | ComputeManager.pre_start_hook is where we would initiate the reshape | 16:42 |
mriedem | by calling update_available_resource | 16:42 |
mriedem | then we join the service group, | 16:42 |
mriedem | and then we start the periodic thread | 16:42 |
mriedem | so i guess i don't understand gibi's concern - i thought we agreed in the spec that we wouldn't allow reshape during a periodic, and only on startup, which is why the startup param is passed through | 16:45 |
mriedem | so if we can't get ReshapeNeeded during a periodic, it's a non-concern isn't it? | 16:45 |
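The startup ordering described above (pre_start_hook calls update_available_resource, then the service group is joined, then the periodic thread starts) can be sketched roughly as follows. This is a toy model, not nova code: `ToyDriver`, the `log` list and the way the `startup` flag gates the reshape are assumptions made for illustration, and `ReshapeNeeded` here is a local stand-in class rather than the real exception.

```python
class ReshapeNeeded(Exception):
    """Toy stand-in for the exception a driver raises to request a reshape."""


class ToyDriver:
    """Hypothetical driver that needs exactly one reshape."""

    def __init__(self):
        self.reshaped = False

    def update_provider_tree(self, allocations=None):
        # Assumption: without allocations the driver cannot reshape,
        # so it asks the caller for one by raising.
        if not self.reshaped:
            if allocations is None:
                raise ReshapeNeeded()
            self.reshaped = True


def update_available_resource(driver, log, startup=False):
    try:
        driver.update_provider_tree()
    except ReshapeNeeded:
        if not startup:
            # Not allowed during a periodic run: let the caller see it.
            raise
        log.append("reshape")
        driver.update_provider_tree(allocations={})
    log.append("update_available_resource")


def start_service(driver):
    # Order taken from the discussion: hook first, periodics last.
    log = ["pre_start_hook"]
    update_available_resource(driver, log, startup=True)
    log.append("join_service_group")
    log.append("start_periodic_tasks")
    return log


log = start_service(ToyDriver())
print(log)
```

With this ordering a correctly behaving driver has already reshaped before any periodic run starts, so a ReshapeNeeded seen later would point to a bug rather than a race with startup.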
efried | mriedem: I think the contention is simply that, if we attempt a reshape during the periodic (we're not supposed to) then it should raise one of the Reshape* exceptions and we *said* in the spec, and I'm asserting in the commit message and code comments, that we should react by blowing up the compute service. | 16:45 |
efried | Well, the point is "can't". | 16:46 |
openstackgerrit | Merged openstack/nova master: Record cell success/failure/timeout in CrossCellLister https://review.openstack.org/594265 | 16:46 |
efried | The virt driver has the reins at this point. | 16:46 |
efried | It can raise ReshapeNeeded due to a bug or whatever. | 16:46 |
efried | which was why we wanted to handle that condition, and handle it severely. | 16:46 |
mriedem | a failure in the periodic like that won't actually kill the service | 16:47 |
cdent | by stating "this compute node is irredeemable until you come look at it" | 16:47 |
efried | Yeah, what cdent said. | 16:47 |
mriedem | periodics are run off on their own thread | 16:47 |
efried | But mriedem right, that's what gibi is pointing out. | 16:47 |
efried | so do we reword or recode? | 16:48 |
efried | sounds like recode would be a pretty serious thing. | 16:48 |
mriedem | i don't think we kill the service | 16:48 |
mriedem | yes | 16:48 |
mriedem | the libvirt driver has known races during server delete while update_available_resource runs, blows up the periodic, but it's ok on the next run | 16:48 |
efried | So how do we make clear that "this compute node is irredeemable until you come look at it"? | 16:49 |
efried | Or do we not worry about that, and just let the next periodic happen, and hope *that* guy doesn't raise Reshape* | 16:49 |
mriedem | the latter | 16:49 |
mriedem | this shouldn't happen | 16:49 |
mriedem | if it does, well, it shouldn't, it's a bug, so report a bug and we'll figure out wtf happened | 16:49 |
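The point that a failing periodic does not take the service down, and that the next run may succeed, can be shown with a small standalone sketch. Everything here is hypothetical (`run_periodic`, `flaky_task`, the logger name); it only assumes that the periodic runner catches and logs exceptions on its own thread, which is a simplification of the behaviour described above.

```python
import logging
import threading

LOG = logging.getLogger("toy_periodic")


def run_periodic(task, runs, results):
    """Run the task several times; a failure is logged, not fatal."""
    for i in range(runs):
        try:
            results.append(task(i))
        except Exception:
            LOG.exception("periodic task failed on run %d", i)
            results.append("failed")


def flaky_task(i):
    # Simulate an unexpected error on the second run only.
    if i == 1:
        raise RuntimeError("reshape requested at the wrong time")
    return "ok"


results = []
worker = threading.Thread(target=run_periodic, args=(flaky_task, 3, results))
worker.start()
worker.join()

# The main thread (the "service") is still running at this point.
print(results)
```

The only trace of the failed run is the log entry, which is the visibility concern raised in the lines that follow.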
mriedem | we can't predict what would incorrectly introduce a bug | 16:49 |
mriedem | unless you're in that tom cruise movie | 16:49 |
efried | right, which is why we wanted to make it fail hard and fast, so you'll pay attention right away and the bug won't go undiscovered. | 16:50 |
efried | Cause as it is, the symptom will be that we don't effect any changes to the provider tree. Which may not cause anything to come crashing down. | 16:51 |
efried | So you might not notice a problem for... a while. | 16:51 |
efried | Once you do, the logs will have the failure in 'em. | 16:51 |
efried | But until then, things might appear normal. | 16:51 |
mriedem | i just replied, | 16:52 |
mriedem | presumably if inventory is f'ed on this host, we'll fail builds and the scheduler will weigh it lower b/c of our failed_builds stats stuff that dansmith added | 16:52 |
efried | we might not fail builds. | 16:52 |
mriedem | i just don't really think it's worth getting too fussy over something that shouldn't happen | 16:52 |
jaypipes | efried: unfortunately today from now for the next four hours I have sprint planning meetings :( | 16:52 |
jaypipes | efried: might be a while until I can get to it. | 16:52 |
mriedem | jaypipes: i think i've answered | 16:53 |
efried | jaypipes: ack, I think mriedem is covering it. | 16:53 |
efried | mriedem: Disable the compute service, maybe? | 16:53 |
mriedem | auto-disable? | 16:53 |
mriedem | that's an option if this ever actually becomes a problem | 16:53 |
mriedem | i don't think we need to conflate *this* change with *that* possibility though | 16:53 |
efried | Right, so I'm trying to prevent it becoming a problem that nobody notices. | 16:53 |
efried | ...until much later, kind of thing. | 16:54 |
efried | I guess I can do the reword now and we can talk about that bit later on. | 16:54 |
mriedem | i need to reshape some food into my gut | 16:55 |
cdent | remind me efried : this is only an issue if the virt driver raises the specific ReshapeNeeded exception at the wrong time? Or: several different exceptions can be a problem? | 16:55 |
cdent | (i've not been following closely, sorry) | 16:55 |
efried | cdent: ReshapeNeeded at the wrong time is the more likely. We should also trap ReshapeFailed, but that one should "never" happen. | 16:56 |
cdent | so in order for the problem case to happen, it would have to be a pretty egregious bug in the code, not an accidental oversight? | 16:57 |
cdent | (not trying to value the bug, rather the odds of it happening) | 16:57 |
efried | cdent: update_provider_tree, as implemented by the virt driver, would have to think it needs to do a reshape despite not being allowed to. | 17:00 |
efried | But upt has no way to know it's running on startup vs periodic or whatever. | 17:00 |
efried | so yeah, totally plausible bug. | 17:00 |
cdent | I wish we, as devs who build this stuff, had more experience with running thousands of compute nodes. What I'm wondering is if someone is more interested in discovering the breakage by direct monitoring of the service, or log-based processing/alarming | 17:03 |
mriedem | i also wish, as a dev that builds this stuff, had infinite time to play around with things but i don't and i feel like my list of things to do is ever expanding beyond my ability to complete them; so gotta pick what needs to be worried about at the time. | 17:21 |
efried | cdent: UUID sentinel in oslo_utils has merged FYI. Course we'll need a release etc, but in case you're following along :) | 17:32 |
efried | https://review.openstack.org/#/c/594179/ (I guess you're on the review so you probably knew that happened) | 17:32 |
cdent | efried: yeah, been getting the emails, but thanks for the heads up | 17:32 |
cdent | mriedem: you phrased that like you're wanting to disagree with me or disabuse me of some idealism, but I think at root we're in violent agreement with one another | 17:33 |
cdent | efried: I'm glad that it resolved in the way it did. having a different behavior would have been a buzzkill | 17:34 |
* cdent really likes uuidsentinel | 17:35 | |
mriedem | cdent: yes the latter "I think at root we're in violent agreement with one another" | 17:37 |
melwitt | cdent: fwiw, second best is being able to ask operators. you could ask on the openstack-operators@ list, ask people like mnaser, penick, belmiro, mgagne_, OVH people, RDO cloud people, etc | 17:38 |
cdent | melwitt: yes, I know, but in this particular case I'm trying to keep myself focussed because it is clear that efried, jaypipes, mriedem and others have this particular incident mostly in hand. My questions were trying to clarify eric's position, as it wasn't clear. And my final comment was exactly what it said: wishful thinking, exactly in line with matt's thing: "ain't nobody got time for that" | 17:40 |
mriedem | i suspect most people aren't monitoring based on log levels, because if you just look at the gate logs across nova/cinder/neutron etc there is a shitload of red | 17:41 |
cdent | I suspect you are right, I was thinking in terms of specific things being watched, not just redness. In the world of infinite time: kill all the red would be a nice thing to do | 17:43 |
melwitt | it's been awhile, but when I was at yahoo, what I recall were aggregated logs and dashboards showing certain flagged/important errors as a way to discover problems. I don't know if they're still doing it that way | 17:43 |
jroll | pretty much | 17:43 |
mriedem | sdague, mtrienish and i used to (years ago) go around the logs and open bugs for random "lots of red" errors, but it's hard to keep whacking those moles | 17:45 |
jroll | in general, if something is hard-breaking a service, I'd prefer it just go down so we can alert on it | 17:45 |
jroll | as long as it's fixable | 17:45 |
mriedem | which reminds me, we should downgrade the "unexpected network-vif-(un)plugged" warnings in the n-cpu logs | 17:47 |
melwitt | ++ | 17:47 |
mriedem | b/c they are most always expected, we just don't care to listen | 17:47 |
cdent | jroll: if something is hard-breaking a service, but not fixable, what do you want? | 17:53 |
jroll | cdent: to uninstall nova? | 17:54 |
cdent | woot! | 17:54 |
jroll | cdent: when I say fixable, I mean manually too | 17:54 |
jroll | like, if I have to go take manual action, and that service isn't going to do anything right, just kill it | 17:54 |
jaypipes | jroll: if that service has 100s of VMs actively running on it, I don't think you want to do that. | 18:09 |
jaypipes | jroll: however, I *would* think you'd want to run an external script to correct behaviour (without killing the service itself) | 18:09 |
jroll | jaypipes: if the nova-compute service is 100% broken, why not kill it? | 18:10 |
cdent | jroll: you specifically mean kill that process, not the host, right? So vms (at least in a kvm host) are still happy (but stuck) | 18:11 |
jroll | cdent: yes, the process | 18:12 |
jroll | and alert via process monitoring | 18:12 |
jroll | the data plane is sacred :P | 18:12 |
* cdent nods | 18:12 | |
* cdent hears a song | 18:12 | |
jaypipes | jroll: it isn't broken, though. the VMs are still up and running and humming along. it's just that the desired action of remodeling the resource consumption on that nova-compute service ran into snags. That doesn't and shouldn't affect the data plane though I guess. | 18:17 |
jaypipes | jroll: which is kinda the reason I brought up some gripes in the review in question about "why are we even going through the bother of adding a crap-load of messy, one-time use code into the virt drivers to do this remodeling instead of just writing some external thing that calls reshape() once and is done with it" | 18:18 |
jaypipes | jroll: this is the equivalent of a database migration being embedded in the virt driver, FWIW. | 18:19 |
jaypipes | data migration. not just schema migration | 18:19 |
jaypipes | but whatevs, like you and I said, as long as the data plane is unaffected, meh | 18:20 |
dansmith | jaypipes: it's a data migration embedded in the virt drivers because only they can do the translation | 18:20 |
jaypipes | dansmith: I think we'll find that isn't the case when we start reviewing the actual implementation of update_provider_tree() in the virt drivers. | 18:21 |
dansmith | jaypipes: how many numa nodes do I have? | 18:21 |
jaypipes | dansmith: you personally have 42. | 18:21 |
dansmith | jaypipes: wrong. | 18:21 |
jaypipes | dansmith: and I love them all. | 18:21 |
dansmith | wrong. | 18:21 |
jaypipes | :P | 18:21 |
openstackgerrit | Merged openstack/nova master: Optimize global marker re-lookup in multi_cell_list https://review.openstack.org/594577 | 18:23 |
jroll | jaypipes: right, I don't have full context, which is why I'm talking in generalizations, not about this specific problem. I don't know how unusable the nova-compute service is in this case, but it sounded to me like it was completely unusable in control plane terms | 18:25 |
jroll | or rather s/don't have full context/don't fully understand the situation/ | 18:26 |
mriedem | see -dev for some fun | 18:26 |
jroll | copyright fun or extraction fun or understanding this specific problem fun? | 18:26 |
cdent | So: we're waiting to resolve the questions on https://review.openstack.org/#/c/576236/ before efried takes his -2 off the bottom of the reshaper stack? Or is there more? | 18:27 |
mriedem | unrelated fun | 18:27 |
* jroll sees very little fun there | 18:27 | |
efried | cdent: IIUC, I'm just needing to update the commit message and comments on the top patch and we'll be good to go; the rest can be done in a fup. mriedem jaypipes agree? | 18:28 |
jaypipes | efried: yes from me. | 18:29 |
mriedem | efried: i only looked at that small slice | 18:29 |
mriedem | so feel free to reve | 18:29 |
mriedem | *rev | 18:29 |
efried | ight | 18:29 |
cdent | efried: cool. was looking forward to having the 1.30 microversion settled (acknowledging that it could well need a 31 after real life intrudes) | 18:30 |
efried | jaypipes: mriedem, cdent, melwitt: I +A'd the bottom patch (I still get to do that, I think, since it's cdent's code). Working on the top one and fups now... | 18:36 |
cdent | huzzah | 18:36 |
edleafe | cdent: efried: I ran a directory diff on nova and the result of the history filtering process: http://paste.openstack.org/show/728996/ | 18:37 |
edleafe | Those are the deleted files. Can you see any that we should keep? | 18:38 |
mriedem | efried: you were very much a co-author on https://review.openstack.org/#/c/576927/ | 18:38 |
efried | um | 18:38 |
efried | nova? | 18:38 |
edleafe | FYI - it just lists the nova directory | 18:38 |
mriedem | so approving isn't really appropriate imo | 18:38 |
edleafe | I can change that temporarily | 18:39 |
mriedem | unless it's just all rebases | 18:39 |
efried | mriedem: I think it was rebases or such minor things as to be of negligible effect. | 18:39 |
efried | mriedem: But I can remove my vote and let you do it :) | 18:39 |
cdent | edleafe: looking | 18:40 |
mriedem | efried: it's on its way to the gate at this point | 18:40 |
efried | mriedem: Done. (Though I guess it's probably already in the gate?) | 18:40 |
efried | mriedem: Okay, but for the sake of form, you wanna add your +W? | 18:40 |
mriedem | i just haven't gone through https://review.openstack.org/#/c/576236/ | 18:40 |
efried | It already had two +2s. | 18:40 |
efried | oh | 18:40 |
mriedem | i commented on the very specific thing from gibi | 18:41 |
mriedem | that was it | 18:41 |
edleafe | efried: cdent: Try this one instead: http://paste.openstack.org/show/728998/ | 18:41 |
* cdent starts over | 18:42 | |
efried | edleafe: Looks truncated. | 18:43 |
edleafe | efried: ugh -- you're right | 18:43 |
efried | but what is the goal of this exercise? | 18:43 |
efried | There's no way I'm going to be able to tell by looking at it whether you've missed deleting something. | 18:44 |
edleafe | efried: 2 fold: to minimize the number of files accidentally removed that have to be added back, and to show what's removed, since gerrit can't | 18:44 |
efried | mm | 18:45 |
cdent | efried, edleafe : I've done this enough times that I've got a memory that I can compare this list against for things that are "odd" | 18:45 |
edleafe | efried: cdent: Here is what was cut off: http://paste.openstack.org/show/729000/ | 18:45 |
cdent | not accurately, but potentially usefully while tweaking and dry-running | 18:45 |
edleafe | cdent: yeah, I was gonna scan your PR to find any others that I missed | 18:46 |
efried | With the working side repo: | 18:47 |
efried | for f in `find . -type f`; do mv $f /tmp/f; if ! tox; then echo "$f needs to stay!"; mv /tmp/f $f; fi; done | 18:47 |
efried | There, that was easy. | 18:47 |
efried | (or it means we have a hole in test coverage :) | 18:47 |
efried | (or it's going to remove files that aren't supposed to get hit in test) | 18:48 |
efried | (I'm not serious, of course) | 18:48 |
cdent | if only it was so easy | 18:48 |
cdent | edleafe: tools/flak8.sh or whatever it is called | 18:48 |
edleafe | ah, good catch | 18:49 |
cdent | as in, we want that (as far as I recall the only thing from tools/) | 18:49 |
cdent | nothing obvious I could find in the other hotspots, the remaining conf files don't include the ones I'm aware of needing to keep | 18:49 |
cdent | edleafe: the release notes I kind of have to take your word for it, especially from just a list of files. I didn't inspect them during my experiments. As a topic "docs" and "doc like things" are a blind spot for me | 18:52 |
cdent | (not for lack of interest, but lack of time) | 18:52 |
edleafe | cdent: I reviewed them to see if there was anything placement-related, and if so, added them to my filter arguments. | 18:54 |
edleafe | For reference, here is my current filter script: http://paste.openstack.org/show/729002/ | 18:54 |
openstackgerrit | Eric Fried proposed openstack/nova master: Compute: Handle reshaped provider trees https://review.openstack.org/576236 | 19:05 |
efried | mriedem, jaypipes, cdent, gibi: ^ | 19:05 |
*** e0ne has quit IRC | 19:08 | |
mriedem | efried: ok looking | 19:11 |
*** e0ne has joined #openstack-placement | 19:23 | |
cdent | efried: acking you've had a busy day, if you could review the steps in the 1.x list in http://lists.openstack.org/pipermail/openstack-dev/2018-August/133902.html when you get a chance (if you haven't already) that would be groovus | 19:25 |
efried | cdent: I did, and had no disagreement with them. | 19:25 |
cdent | keen | 19:26 |
efried | cdent: But please have only the appropriate amount of faith in the depth of my understanding of all of that. | 19:26 |
cdent | noted. | 19:26 |
cdent | as long as nothing in it makes you scream, it's good | 19:26 |
efried | cdent: Also, I'm avoiding getting too involved in the pre-work. Partly out of laziness but partly to keep myself "clean" for the "real" reviews. | 19:27 |
efried | (s/laziness/full plate-ness etc./) | 19:27 |
cdent | efried: i hear that, just trying to make sure we don't have a false start, so getting input from as many as possible before diving back in | 19:28 |
efried | yup, good things. | 19:28 |
efried | kutgw | 19:28 |
* cdent does the dishes | 19:32 | |
openstackgerrit | Eric Fried proposed openstack/nova master: Do test_reshape with an actual startup https://review.openstack.org/597218 | 19:33 |
openstackgerrit | Eric Fried proposed openstack/nova master: reshaper gabbit: Nix comments re doubled max_unit https://review.openstack.org/597220 | 19:49 |
*** tssurya has quit IRC | 20:16 | |
openstackgerrit | Eric Fried proposed openstack/nova master: Fail heal_allocations if placement is borked https://review.openstack.org/597237 | 20:55 |
*** e0ne has quit IRC | 20:58 | |
openstackgerrit | Eric Fried proposed openstack/nova master: Name arguments to _get_provider_ids_matching https://review.openstack.org/597291 | 21:45 |
efried | mriedem: In case you need an easy one, that you can do while your kid is crawling on you ^ | 21:45 |
*** takashin has joined #openstack-placement | 21:51 | |
mriedem | she just went up the street so i'm kid free for an hour, so it's MY time | 21:54 |
cdent | good night all | 22:16 |
*** cdent has quit IRC | 22:16 | |
mriedem | and of course 1 leaves and 3 come back | 22:19 |
mriedem | fml | 22:19 |
openstackgerrit | Eric Fried proposed openstack/nova master: Other host allocs may appear in gafpt during evac https://review.openstack.org/597301 | 22:24 |
mriedem | efried: your green screen proclivity is shining through in your comments and commit messages now | 22:24 |
mriedem | do i need qsecofr to gafpt | 22:25 |
efried | Just trying to follow length restrictions | 22:25 |
efried | You can blame jaypipes for loving long method names. Hard to fit those into commit headers. | 22:26 |
efried | mriedem: Whassa conclusion of https://review.openstack.org/#/c/585034/20/nova/scheduler/client/report.py@1437 ? I'm not following how the doc is wrong, or how it should be fixed? | 22:30 |
mriedem | "allocations": { | 22:31 |
mriedem | $RP_UUID: { | 22:31 |
mriedem | "resources": { $RC: $AMOUNT, ... } | 22:31 |
mriedem | technically has a 'generation' key in the rp sub-dict | 22:31 |
mriedem | but, | 22:31 |
mriedem | we don't use it, so probably don't care | 22:31 |
efried | So the delta would be to add a row to the table indicating 'generation': ... ignored ... | 22:32 |
efried | and perhaps adding it to the samples | 22:32 |
mriedem | well, we don't take it in the API i guess? http://logs.openstack.org/27/576927/35/check/build-placement-api-ref/24b1ab2/html/#id86 | 22:32 |
mriedem | it's not in the sample, not sure about the schema, | 22:33 |
mriedem | but this is modeled after PUT / POST allocations right? and as noted in there, the POST allocations api-ref is missing that 'generation' field | 22:33 |
mriedem | but it's unused someh | 22:33 |
mriedem | it is called out for PUT allocations http://logs.openstack.org/27/576927/35/check/build-placement-api-ref/24b1ab2/html/#update-allocations | 22:33 |
mriedem | as optional and not used | 22:33 |
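The payload shape mriedem describes can be illustrated with a small Python sketch. The provider UUID, the amounts, and the helper name `strip_unused_generation` are all hypothetical, chosen only for illustration; this is not nova's or placement's code. It shows a per-provider sub-dict that carries both `resources` and the unused `generation` key, and a helper that keeps only `resources`.

```python
from typing import Any, Dict


def strip_unused_generation(
    allocations: Dict[str, Dict[str, Any]]
) -> Dict[str, Dict[str, Any]]:
    """Return a copy of the allocations mapping keeping only 'resources'."""
    return {
        rp_uuid: {"resources": dict(rp_alloc["resources"])}
        for rp_uuid, rp_alloc in allocations.items()
    }


# Hypothetical provider UUID and amounts, for illustration only.
example = {
    "11111111-2222-3333-4444-555555555555": {
        "resources": {"VCPU": 2, "MEMORY_MB": 1024},
        "generation": 7,  # present in the sub-dict, but not used
    }
}

stripped = strip_unused_generation(example)
```

The helper builds a new mapping and leaves its input untouched, so `example` still contains `generation` afterwards.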
mriedem | btw, mocking around RetryDecorator blows | 22:34 |
openstackgerrit | Eric Fried proposed openstack/nova master: Mention (unused) RP generation in POST /allocs/{c} https://review.openstack.org/597304 | 22:37 |
efried | mriedem, jaypipes: cdent: ^ | 22:38 |
efried | mriedem: Yes, mocking RetryDecorator threading bs is why I wound up going with retrying. | 22:39 |
mriedem | question inline in that one | 22:43 |
efried | mriedem: Responded. You're correct. It's good as is. | 22:49 |
efried | ō/ See y'all tomorrow | 22:50 |
openstackgerrit | melanie witt proposed openstack/nova-specs master: Propose configurable maximum number of volumes to attach https://review.openstack.org/597306 | 22:54 |
openstackgerrit | Merged openstack/nova master: Don't use '_TransactionContextManager._async' https://review.openstack.org/597173 | 23:34 |
openstackgerrit | Merged openstack/nova master: Revert "Don't use '_TransactionContextManager._async'" https://review.openstack.org/597174 | 23:34 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (4) https://review.openstack.org/574106 | 23:40 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (5) https://review.openstack.org/574110 | 23:40 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (6) https://review.openstack.org/574113 | 23:40 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (7) https://review.openstack.org/574974 | 23:41 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (8) https://review.openstack.org/575311 | 23:41 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (9) https://review.openstack.org/575581 | 23:41 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (10) https://review.openstack.org/576017 | 23:43 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (11) https://review.openstack.org/576018 | 23:44 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (12) https://review.openstack.org/576019 | 23:44 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (13) https://review.openstack.org/576020 | 23:44 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (14) https://review.openstack.org/576027 | 23:45 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (15) https://review.openstack.org/576031 | 23:45 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (16) https://review.openstack.org/576299 | 23:45 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (17) https://review.openstack.org/576344 | 23:46 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (18) https://review.openstack.org/576673 | 23:46 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (19) https://review.openstack.org/576676 | 23:46 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (20) https://review.openstack.org/576689 | 23:47 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (21) https://review.openstack.org/576709 | 23:47 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (22) https://review.openstack.org/576712 | 23:48 |