*** nafiux has quit IRC | 00:02 | |
*** betherly has joined #openstack-nova | 00:06 | |
*** nafiux has joined #openstack-nova | 00:06 | |
*** betherly has quit IRC | 00:11 | |
*** nafiux has quit IRC | 00:14 | |
*** brinzhang has joined #openstack-nova | 00:17 | |
*** brinzhang_ has quit IRC | 00:20 | |
*** ivve has quit IRC | 00:24 | |
*** nafiux has joined #openstack-nova | 00:35 | |
*** threestrands has quit IRC | 00:49 | |
*** efried has quit IRC | 00:51 | |
*** bhagyashris has joined #openstack-nova | 00:51 | |
*** efried has joined #openstack-nova | 00:59 | |
*** ricolin has joined #openstack-nova | 01:04 | |
*** betherly has joined #openstack-nova | 01:08 | |
*** tetsuro has joined #openstack-nova | 01:12 | |
*** betherly has quit IRC | 01:13 | |
*** nafiux has quit IRC | 01:14 | |
*** nafiux has joined #openstack-nova | 01:21 | |
*** igordc has quit IRC | 01:31 | |
*** boxiang has joined #openstack-nova | 01:38 | |
*** nafiux has quit IRC | 01:38 | |
*** boxiang has quit IRC | 01:39 | |
*** boxiang has joined #openstack-nova | 01:39 | |
*** betherly has joined #openstack-nova | 01:40 | |
*** boxiang has quit IRC | 01:42 | |
*** betherly has quit IRC | 01:44 | |
*** gyee has quit IRC | 02:00 | |
*** betherly has joined #openstack-nova | 02:01 | |
*** betherly has quit IRC | 02:05 | |
*** slaweq has joined #openstack-nova | 02:11 | |
*** slaweq has quit IRC | 02:15 | |
*** tetsuro has quit IRC | 02:24 | |
*** betherly has joined #openstack-nova | 02:25 | |
openstackgerrit | pengyuesheng proposed openstack/os-vif master: Bump the openstackdocstheme extension to 1.20 https://review.opendev.org/672857 | 02:28 |
---|---|---|
*** betherly has quit IRC | 02:29 | |
*** BjoernT has joined #openstack-nova | 02:33 | |
*** tetsuro has joined #openstack-nova | 02:48 | |
*** threestrands has joined #openstack-nova | 03:02 | |
*** tetsuro has quit IRC | 03:21 | |
openstackgerrit | Li Liu proposed openstack/nova master: Define new exceptions related to device profiles and ARQs. https://review.opendev.org/673733 | 03:30 |
openstackgerrit | Li Liu proposed openstack/nova master: Refactor some methods for reuse by Cyborg code. https://review.opendev.org/673734 | 03:30 |
openstackgerrit | Li Liu proposed openstack/nova master: WIP: Add Cyborg device profile groups to request spec. https://review.opendev.org/631243 | 03:30 |
openstackgerrit | Li Liu proposed openstack/nova master: WIP: Create and bind Cyborg ARQs. https://review.opendev.org/631244 | 03:30 |
openstackgerrit | Li Liu proposed openstack/nova master: fixed merge conflict https://review.opendev.org/673938 | 03:30 |
openstackgerrit | Li Liu proposed openstack/nova master: added cyborg external event https://review.opendev.org/673939 | 03:31 |
openstackgerrit | Li Liu proposed openstack/nova master: WIP: Create and bind Cyborg ARQs. https://review.opendev.org/631244 | 03:37 |
*** whoami-rajat has joined #openstack-nova | 03:41 | |
*** hongbin has joined #openstack-nova | 03:45 | |
*** hongbin has quit IRC | 03:46 | |
*** udesale has joined #openstack-nova | 03:46 | |
*** psachin has joined #openstack-nova | 03:55 | |
*** slaweq has joined #openstack-nova | 04:11 | |
*** slaweq has quit IRC | 04:17 | |
*** mkrai has joined #openstack-nova | 04:23 | |
*** Luzi has joined #openstack-nova | 04:25 | |
*** tetsuro has joined #openstack-nova | 04:28 | |
*** BjoernT has quit IRC | 04:43 | |
*** bhagyashris has quit IRC | 04:52 | |
*** tetsuro has quit IRC | 05:02 | |
*** ratailor has joined #openstack-nova | 05:09 | |
openstackgerrit | Sundar Nadathur proposed openstack/nova master: ksa auth conf and client for Cyborg access https://review.opendev.org/631242 | 05:11 |
openstackgerrit | Sundar Nadathur proposed openstack/nova master: Refactor some methods for reuse by Cyborg-related code. https://review.opendev.org/673734 | 05:11 |
openstackgerrit | Sundar Nadathur proposed openstack/nova master: WIP: Add Cyborg device profile groups to request spec. https://review.opendev.org/631243 | 05:11 |
openstackgerrit | Sundar Nadathur proposed openstack/nova master: WIP: Create and bind Cyborg ARQs. https://review.opendev.org/631244 | 05:11 |
openstackgerrit | Sundar Nadathur proposed openstack/nova master: WIP: Get resolved Cyborg ARQs and add PCI BDFs to VM's domain XML. https://review.opendev.org/631245 | 05:11 |
openstackgerrit | Sundar Nadathur proposed openstack/nova master: Delete ARQs for an instance when the instance is deleted. https://review.opendev.org/673735 | 05:11 |
*** nafiux has joined #openstack-nova | 05:11 | |
*** ociuhandu has joined #openstack-nova | 05:22 | |
*** dpawlik has joined #openstack-nova | 05:26 | |
*** dansmith has quit IRC | 05:26 | |
*** ociuhandu has quit IRC | 05:27 | |
*** dansmith has joined #openstack-nova | 05:28 | |
*** tetsuro has joined #openstack-nova | 05:43 | |
*** jaosorior has quit IRC | 05:46 | |
*** tetsuro has quit IRC | 05:48 | |
*** belmoreira has joined #openstack-nova | 05:50 | |
*** belmoreira has quit IRC | 05:50 | |
*** belmoreira has joined #openstack-nova | 05:52 | |
*** ccamacho has quit IRC | 05:54 | |
*** maciejjozefczyk has joined #openstack-nova | 05:59 | |
*** bhagyashris_ has joined #openstack-nova | 06:03 | |
*** slaweq has joined #openstack-nova | 06:04 | |
*** slaweq has quit IRC | 06:09 | |
*** slaweq has joined #openstack-nova | 06:11 | |
*** takamatsu has joined #openstack-nova | 06:13 | |
*** slaweq has quit IRC | 06:16 | |
*** xek has joined #openstack-nova | 06:17 | |
*** janki has joined #openstack-nova | 06:22 | |
*** ricolin_ has joined #openstack-nova | 06:26 | |
*** xek has quit IRC | 06:27 | |
*** ricolin has quit IRC | 06:29 | |
*** aojea has joined #openstack-nova | 06:31 | |
*** belmoreira has quit IRC | 07:01 | |
*** kashyap has joined #openstack-nova | 07:02 | |
*** belmoreira has joined #openstack-nova | 07:03 | |
kashyap | stephenfin: Morning, when you can, mind having a look at this small patch (that fixes two bugs)? Pinging you because, we've discussed this in the past: https://review.opendev.org/#/c/348394/ | 07:03 |
*** slaweq has joined #openstack-nova | 07:04 | |
kashyap | stephenfin: For context, you've quoted me in the change, at that time I thought it wasn't worth it | 07:04 |
kashyap | stephenfin: But seeing that CentOS _and_ SLES are broken, this temporary solution seemed acceptable | 07:05 |
kashyap | aspiers: https://review.opendev.org/#/c/348394/ | 07:05 |
*** maciejjozefczyk_ has joined #openstack-nova | 07:05 | |
*** maciejjozefczyk has quit IRC | 07:08 | |
*** pcaruana has quit IRC | 07:12 | |
*** rpittau|afk is now known as rpittau | 07:13 | |
*** tetsuro has joined #openstack-nova | 07:13 | |
*** tetsuro has quit IRC | 07:17 | |
*** ccamacho has joined #openstack-nova | 07:18 | |
*** tesseract has joined #openstack-nova | 07:20 | |
*** nafiux has quit IRC | 07:23 | |
*** belmoreira has quit IRC | 07:28 | |
*** tssurya has joined #openstack-nova | 07:32 | |
*** belmoreira has joined #openstack-nova | 07:35 | |
*** cdent has joined #openstack-nova | 07:36 | |
*** ralonsoh has joined #openstack-nova | 07:39 | |
*** ociuhandu has joined #openstack-nova | 07:39 | |
*** boxiang has joined #openstack-nova | 07:39 | |
*** boxiang has quit IRC | 07:39 | |
*** boxiang has joined #openstack-nova | 07:40 | |
*** ralonsoh has quit IRC | 07:40 | |
*** ralonsoh has joined #openstack-nova | 07:40 | |
*** pcaruana has joined #openstack-nova | 07:42 | |
*** ivve has joined #openstack-nova | 07:51 | |
*** ociuhandu has quit IRC | 07:56 | |
*** boxiang has quit IRC | 07:59 | |
*** boxiang has joined #openstack-nova | 07:59 | |
*** belmoreira has quit IRC | 08:01 | |
*** lpetrut has joined #openstack-nova | 08:03 | |
*** lpetrut has quit IRC | 08:04 | |
*** lpetrut has joined #openstack-nova | 08:04 | |
*** belmoreira has joined #openstack-nova | 08:07 | |
*** belmoreira has quit IRC | 08:09 | |
*** belmoreira has joined #openstack-nova | 08:11 | |
kashyap | alex_xu: Hi, mind also looking at this? -- https://review.opendev.org/#/c/348394/ | 08:12 |
*** boxiang has quit IRC | 08:17 | |
*** maciejjozefczyk_ is now known as maciejjozefczyk | 08:17 | |
*** derekh has joined #openstack-nova | 08:25 | |
*** shilpasd has joined #openstack-nova | 08:25 | |
*** tkajinam has quit IRC | 08:27 | |
*** nfakhir has joined #openstack-nova | 08:27 | |
*** lpetrut has quit IRC | 08:29 | |
*** ivve has quit IRC | 08:30 | |
stephenfin | kashyap: Sure, I'll take a look now | 08:32 |
kashyap | stephenfin: Thank you. I also want to do a functional test (as this is not functionally tested in the Gate) | 08:32 |
kashyap | I'll report back my findings on the patch, too | 08:32 |
kashyap | Thank you. To save you time, see the last three comments (two from me, one from Adam) | 08:33 |
bhagyashris_ | stephenfin: Hi, | 08:35 |
stephenfin | bhagyashris_: o. | 08:35 |
stephenfin | *o/ | 08:35 |
bhagyashris_ | stephenfin: I just ran into an issue and have posted a comment on your patch https://review.opendev.org/#/c/671793/5 . | 08:35 |
bhagyashris_ | stephenfin: Just to let you know, I am currently working on the upgrade patch https://review.opendev.org/#/c/672224/1 (fixing the review comments you gave) and also testing those changes on top of your changes. Once it’s fixed I will push the patch. | 08:36 |
*** purplerbot has quit IRC | 08:36 | |
*** purplerbot has joined #openstack-nova | 08:36 | |
stephenfin | bhagyashris_: Ack. I hope to have a new revision with that fixed and the WIP flag removed today | 08:37 |
*** ivve has joined #openstack-nova | 08:40 | |
bhagyashris_ | stephenfin: ok... I will also push the upgrade patch soon on top of your changes... | 08:40 |
stephenfin | Sweet | 08:41 |
bhagyashris_ | stephenfin: and also one more thing: the new syntax "resources:(PCPU|VCPU) = <no of cpus>" is not accepted in the current patch series, so we need to take care of that too... | 08:43 |
stephenfin | bhagyashris_: Yes. I think I'm going to drop my conversion patch and use yours, if that's okay? | 08:44 |
stephenfin | (the entire commit, so will keep authorship) | 08:44 |
bhagyashris_ | stephenfin: sure No problem! | 08:45 |
*** priteau has joined #openstack-nova | 08:54 | |
*** belmoreira has quit IRC | 08:54 | |
openstackgerrit | Brin Zhang proposed openstack/nova master: nit: fix the test case of migration obj_make_compatible https://review.opendev.org/673961 | 08:57 |
*** psachin has quit IRC | 09:00 | |
*** belmoreira has joined #openstack-nova | 09:03 | |
*** janki has quit IRC | 09:08 | |
*** janki has joined #openstack-nova | 09:09 | |
kashyap | aspiers: Let me know when you're about to double-check me on a domainCapabilities parsing thing | 09:11 |
kashyap | s/"about to"/"about -- to"/ | 09:11 |
*** jchhatbar has joined #openstack-nova | 09:17 | |
*** janki has quit IRC | 09:17 | |
*** belmoreira has quit IRC | 09:22 | |
*** jchhatbar has quit IRC | 09:25 | |
*** belmoreira has joined #openstack-nova | 09:26 | |
kashyap | Unrelated...that's some commit message right there: https://github.com/dtschump/CImg/commit/47e57118bc1eb58c0 | 09:33 |
kashyap | (And many others in that repo :D) | 09:33 |
*** shilpasd has quit IRC | 09:34 | |
*** ivve has quit IRC | 09:37 | |
sean-k-mooney | bhagyashris_: stephenfin i know efried wants us to support the resources: syntax but im really not sure if that is a good idea | 09:38 |
sean-k-mooney | bhagyashris_: stephenfin if we do allow "resources:(PCPU|VCPU) = <no of cpus>" we should only allow it if cpu policy=mixed | 09:39 |
sean-k-mooney | e.g. it should be an error to have PCPUs if your policy is shared or VCPUs if the policy is dedicated | 09:39 |
stephenfin | sean-k-mooney: I was thinking defining the policy along with explicit resources doesn't make sense and should be an error in its own right | 09:42 |
*** priteau has quit IRC | 09:43 | |
sean-k-mooney | if you dont define the policy PCPUs will never be used | 09:43 |
*** mkrai has quit IRC | 09:43 | |
sean-k-mooney | we dont want to infer masks from "resources:(PCPU|VCPU) = <no of cpus>" | 09:43 |
*** priteau has joined #openstack-nova | 09:44 | |
sean-k-mooney | so i find it hard to understand why we would support this at all | 09:44 |
sean-k-mooney | given "resources:(PCPU|VCPU) = <no of cpus>" will not affect xml generation in any way | 09:44 |
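The rule sean-k-mooney proposes above can be sketched roughly as follows. This is a minimal illustration, not nova code: `check_cpu_resources` is a hypothetical helper, the flavor extra specs are assumed to arrive as a plain dict, and `hw:cpu_policy` with a `mixed` value is assumed from the discussion. stephenfin's alternative (treating any combination of an explicit policy and explicit resources as an error) is not modelled here.

```python
# Hypothetical helper, not part of nova: models the rule proposed in the chat.
CPU_RESOURCE_KEYS = ("resources:PCPU", "resources:VCPU")


def check_cpu_resources(extra_specs):
    """Raise ValueError if explicit CPU resources clash with the CPU policy."""
    policy = extra_specs.get("hw:cpu_policy")
    requested = [key for key in CPU_RESOURCE_KEYS if key in extra_specs]
    if not requested:
        # Nothing explicit was requested, so there is nothing to check.
        return
    if policy != "mixed":
        # Covers PCPUs with 'shared', VCPUs with 'dedicated', and no policy.
        raise ValueError(
            "%s given with hw:cpu_policy=%s; only 'mixed' is allowed"
            % (", ".join(requested), policy))


# A mixed-policy flavor passes; a shared-policy flavor asking for PCPUs fails.
check_cpu_resources({"hw:cpu_policy": "mixed",
                     "resources:PCPU": "2", "resources:VCPU": "2"})
```

Under this reading, a flavor with no policy at all but an explicit `resources:PCPU` is also rejected, which matches the remark that PCPUs are never used when no policy is defined.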
*** priteau has quit IRC | 09:50 | |
*** priteau has joined #openstack-nova | 09:52 | |
*** belmoreira has quit IRC | 09:53 | |
openstackgerrit | Brin Zhang proposed openstack/nova master: WIP: Add user_id and project_id colume to Migration https://review.opendev.org/673990 | 09:54 |
*** ociuhandu has joined #openstack-nova | 09:57 | |
*** shilpasd has joined #openstack-nova | 09:57 | |
*** sapd1_x has joined #openstack-nova | 09:59 | |
*** belmoreira has joined #openstack-nova | 10:00 | |
*** ociuhandu has quit IRC | 10:01 | |
*** prometheanfire has quit IRC | 10:04 | |
*** rpittau is now known as rpittau|bbl | 10:04 | |
aspiers | sean-k-mooney: do you know which part of nova-conductor this comes from? http://paste.openstack.org/show/755199/ | 10:12 |
aspiers | I'm seeing it spammed to mysqld at a rate of 2M/s | 10:12 |
aspiers | even though mysqladmin processlist shows everything as "Sleep" | 10:12 |
aspiers | I don't know wtf is going on | 10:12 |
aspiers | it's from devstack@n-super-cond.service | 10:13 |
*** bhagyashris_ has quit IRC | 10:16 | |
sean-k-mooney | 2M/s as in 2,000,000/s | 10:16 |
sean-k-mooney | or 2 messages a second | 10:17 |
sean-k-mooney | because if its the former no wonder your service died | 10:17 |
aspiers | 2Mb/s | 10:17 |
aspiers | actually no, 2MB/s | 10:18 |
aspiers | even worse | 10:18 |
sean-k-mooney | i dont know if that is higher or lower | 10:18 |
aspiers | B = bytes | 10:18 |
aspiers | b = bits | 10:18 |
sean-k-mooney | no not that | 10:18 |
sean-k-mooney | 2MB/s of a tiny record | 10:18 |
sean-k-mooney | is that more or less than 2 million queries | 10:18 |
aspiers | ? | 10:18 |
sean-k-mooney | its a lot in any case | 10:19 |
aspiers | huh | 10:19 |
aspiers | 2 megabytes / s | 10:19 |
aspiers | I'm not talking about numbers of queries | 10:19 |
sean-k-mooney | i know im trying to convert to queries | 10:19 |
aspiers | well queries range from 10s to 100s of bytes | 10:20 |
aspiers | let's say each query is 200 bytes | 10:20 |
aspiers | then that would be 100k queries per sec | 10:20 |
aspiers | no wait | 10:20 |
aspiers | 10kq/s | 10:20 |
aspiers | a lot anyway | 10:21 |
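The back-of-the-envelope conversion above can be checked with a few lines. This is only a sketch of the arithmetic: the 200-byte query size is aspiers' guess, and 2 MB is taken as 2,000,000 bytes (an assumption; the tool reporting the rate may use binary megabytes).

```python
def queries_per_second(bytes_per_second, bytes_per_query):
    """Estimate a query rate from a write rate and an average query size."""
    return bytes_per_second / bytes_per_query


# 2 MB/s of roughly 200-byte queries, as estimated in the chat.
rate = queries_per_second(2 * 1000 * 1000, 200)
print(rate)  # 10000.0, i.e. about 10k queries per second
```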
aspiers | htf do I debug this? | 10:21 |
sean-k-mooney | yep it looks like its trying to determine if a compute service exists | 10:21 |
sean-k-mooney | or rather look it up by its id | 10:22 |
aspiers | but why in an infinite loop, and why the commit and rollback? | 10:24 |
sean-k-mooney | is it always the same service id | 10:25 |
aspiers | oh good point | 10:25 |
sean-k-mooney | it could be trying to register the conductor service and failing | 10:25 |
aspiers | it's either id 1 or 2 | 10:27 |
sean-k-mooney | that log showed WHERE services.deleted = 0 AND services.id = 5 | 10:27 |
sean-k-mooney | but ok | 10:27 |
aspiers | hrm | 10:27 |
sean-k-mooney | what are service id 1 and 2 | 10:27 |
aspiers | which table do I check in which db? | 10:28 |
*** ricolin__ has joined #openstack-nova | 10:28 | |
aspiers | keystone / service? | 10:29 |
aspiers | no | 10:29 |
sean-k-mooney | no nova_cell1 | 10:29 |
aspiers | why not cell0? | 10:30 |
sean-k-mooney | nova_cell1.services | 10:30 |
aspiers | 1 is nova-conductor | 10:31 |
aspiers | 3 is nova-compute | 10:31 |
aspiers | that's it | 10:31 |
*** ricolin_ has quit IRC | 10:31 | |
sean-k-mooney | im not sure where 2 and 5 came from | 10:31 |
aspiers | nor me | 10:31 |
aspiers | so n-super-cond seems to use cell1 and n-cond-cell0 uses cell0 | 10:32 |
aspiers | why is that? | 10:32 |
aspiers | cells have 2 tiers of conductors? | 10:32 |
sean-k-mooney | yes | 10:32 |
aspiers | but in that case what if another cell got added | 10:33 |
aspiers | that would be cell2? | 10:33 |
aspiers | seems weird | 10:33 |
sean-k-mooney | the db name is up to you | 10:33 |
aspiers | surely the top tier super-cond would come first as cell0 | 10:33 |
sean-k-mooney | but cells are ways for sharding your db and messagebus | 10:33 |
aspiers | this is devstack so I didn't decide | 10:33 |
aspiers | if devstack was configured with 3 cells, would n-super-cond be cell3? | 10:34 |
aspiers | and n-cond-cell[0-2] | 10:34 |
aspiers | ? | 10:34 |
sean-k-mooney | i dont know | 10:34 |
aspiers | ok anyway | 10:34 |
aspiers | something is super weird | 10:34 |
aspiers | it's not just conductor | 10:34 |
aspiers | wtf | 10:35 |
aspiers | OK there is no nova processes running now but mysqladmin processlist still shows a nova_cell1 process in sleep | 10:36 |
aspiers | and mysqld still spikes to 2MB/s disk writes every few seconds | 10:36 |
sean-k-mooney | this kind of sounds like a periodic task | 10:36 |
sean-k-mooney | or you have duplicate host names somewhere and some other devstack is trying to use your db and causing weird things to happen | 10:37 |
sean-k-mooney | e.g. its either super weird or its likely just a background task | 10:37 |
aspiers | 15502 be/4 mysql 0.00 B/s 2.08 M/s 0.00 % 0.12 % mysqld --defaults-file=/etc/my.cnf --user=mysql | 10:38 |
sean-k-mooney | have you tried restarting mysql | 10:38 |
aspiers | background tasks from where? | 10:38 |
sean-k-mooney | a periodic task from one of the nova services | 10:38 |
aspiers | there are no nova services running! | 10:39 |
aspiers | I stopped them all | 10:39 |
aspiers | <aspiers> OK there is no nova processes running now but mysqladmin processlist still shows a nova_cell1 process in sleep | 10:39 |
aspiers | this is f'ing weird | 10:39 |
aspiers | | Id | User | Host | db | Command | Time | State | Info | Progress | | 10:40 |
sean-k-mooney | have you checked ps to see if there is a zombie python process | 10:40 |
aspiers | | 5774 | root | localhost | nova_cell1 | Sleep | 537 | | | 0.000 | | 10:40 |
aspiers | but 5774 is an internal mysqld pid I think | 10:40 |
aspiers | ohhhh no wait | 10:40 |
aspiers | UID PID PPID C STIME TTY TIME CMD | 10:40 |
aspiers | stack 5774 5329 0 03:20 ? 00:00:05 placementuWSGI worker 50 | 10:40 |
aspiers | wow | 10:40 |
aspiers | I forgot about wsgi | 10:41 |
aspiers | was only looking at systemd services | 10:41 |
sean-k-mooney | placement has a systemd service | 10:41 |
aspiers | but apparently also wsgi? | 10:41 |
sean-k-mooney | its the same thing | 10:41 |
aspiers | ProxyPass "/placement" "unix:/var/run/uwsgi/placement-api.socket|uwsgi://uwsgi-uds-placement-api/" retry=0 | 10:42 |
aspiers | I think I'm gonna unstack.sh this damn thing | 10:42 |
aspiers | and rebuild | 10:42 |
sean-k-mooney | devstack@placement-api.service | 10:43 |
*** bbowen has joined #openstack-nova | 10:44 | |
aspiers | yeah I've stopped that and 5774 is still listed in mysqld | 10:44 |
aspiers | but not a kernel process any more | 10:44 |
*** bbowen has quit IRC | 10:45 | |
sean-k-mooney | ./clean.sh && sudo reboot | 10:45 |
aspiers | hah | 10:45 |
sean-k-mooney | clean will remove mysql and your problem with it | 10:45 |
*** bbowen has joined #openstack-nova | 10:46 | |
aspiers | how is that different from unstack.sh? | 10:46 |
aspiers | # ``clean.sh`` does its best to eradicate traces of a Grenade | 10:46 |
aspiers | I'm not using Grenade | 10:46 |
sean-k-mooney | clean does more than that | 10:46 |
sean-k-mooney | it removes several packages and files created by devstack that unstack does not | 10:47 |
aspiers | OK I guess I can run both | 10:47 |
sean-k-mooney | you normally only use clean if changing branches e.g. master to stable | 10:47 |
sean-k-mooney | clean runs unstack | 10:47 |
aspiers | oh right | 10:47 |
aspiers | someone should fix the misleading comment at the top of clean then | 10:47 |
sean-k-mooney | well normally you dont need to use it but when weird stuff happens sometimes its for the best | 10:48 |
*** prometheanfire has joined #openstack-nova | 10:50 | |
sean-k-mooney | every few days i close all my firefox windows because i have just too many tabs open to go back to and i dont remember what tabs i have already dealt with | 10:51 |
*** tbachman has quit IRC | 10:51 | |
aspiers | you should use a tab limiter | 10:51 |
sean-k-mooney | well when i had 10 windows im not sure that will help | 10:52 |
sean-k-mooney | also why | 10:52 |
aspiers | mine stops me from that crazy behaviour | 10:52 |
aspiers | reminds me not to excessively multi-task | 10:52 |
aspiers | if I want to remember a URL, I add it to a file of structured notes | 10:52 |
aspiers | or to my TODO list | 10:52 |
aspiers | then I stay focused on fewer things | 10:52 |
sean-k-mooney | normally 90% of my tabs are gerrit reviews | 10:53 |
sean-k-mooney | and like 20 copies of nova's github | 10:53 |
aspiers | why are you wanting to keep them open though? | 10:53 |
aspiers | sounds like your workflow needs a bit of reevaluating | 10:53 |
aspiers | I find it hard not to multitask which is why the extension helps | 10:54 |
sean-k-mooney | often i plan to check back on the review to see if people replied or i leave it open when i join a meeting | 10:54 |
aspiers | multitasking is so damaging to sanity and productivity | 10:54 |
sean-k-mooney | i usually have 1 or two windows that i use mostly on two different monitors and the others are from my popping out bluejeans for a video meeting and all the tabs i opened during that meeting | 10:56 |
sean-k-mooney | so i just need to get in the habit of killing the meeting windows after the meeting but i leave it open if i have to go update a bug or something | 10:56 |
*** dpawlik has quit IRC | 10:58 | |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Add test coverage of existing os-services policies https://review.opendev.org/669181 | 10:58 |
*** rcernin has quit IRC | 10:58 | |
*** dpawlik has joined #openstack-nova | 11:06 | |
*** ivve has joined #openstack-nova | 11:07 | |
jangutter | sean-k-mooney: At least the debate hasn't started veering into the tabs-vs-spaces territory. | 11:12 |
aspiers | jangutter: just you wait | 11:13 |
aspiers | sean-k-mooney: clean.sh didn't remove my OVS bridges | 11:13 |
*** tbachman has joined #openstack-nova | 11:14 | |
sean-k-mooney | ya it doesnt | 11:14 |
jangutter | I have a friend that uses all 16 virtual desktops. One for each separate client engagement and/or project. They have a special ultra-reliable UPS for his workstation in-line with the building UPS. | 11:14 |
sean-k-mooney | it also does not uninstall and nuke apache config | 11:15 |
sean-k-mooney | it would be nice if it did, but it gets close enough | 11:15 |
aspiers | SSLError: HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Max retries exceeded with url: /packages/11/fa/0160cd525c62d7abd076a070ff02b2b94de589f1a9789774f17d7c54058e/pyparsing-2.4.2-py2.py3-none-any.whl (Caused by SSLError(SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:661)'),)) | 11:16 |
sean-k-mooney | jangutter: but still just one power supply in the desktop | 11:16 |
aspiers | devstack hates me today | 11:16 |
*** belmoreira has quit IRC | 11:17 | |
sean-k-mooney | jangutter: did you see https://review.opendev.org/#/c/672834/ if not can you review | 11:18 |
sean-k-mooney | stephenfin: you too ^ | 11:18 |
*** belmoreira has joined #openstack-nova | 11:19 | |
*** ociuhandu has joined #openstack-nova | 11:20 | |
*** ociuhandu has quit IRC | 11:21 | |
*** panda is now known as panda|eat | 11:25 | |
openstackgerrit | Huachang Wang proposed openstack/nova-specs master: Use PCPU and VCPU in one instance https://review.opendev.org/668656 | 11:25 |
*** dpawlik has quit IRC | 11:27 | |
aspiers | Successfully uninstalled pip-19.2.1 | 11:27 |
aspiers | Successfully installed pip-9.0.3 | 11:27 |
aspiers | thanks devstack, that's really helpful! | 11:27 |
*** Conqueror has quit IRC | 11:27 | |
aspiers | X-( | 11:27 |
kashyap | Wish me luck, I'm just about to kick off a fresh DevStack on F30 with a custom patch | 11:28 |
sean-k-mooney | well devstack does not support ver 10+ | 11:28 |
sean-k-mooney | pip broke compatibility with something and i dont think we have fixed it yet | 11:29 |
sean-k-mooney | or maybe we have and we just didnt bother to update it | 11:29 |
sean-k-mooney | we should at some point | 11:29 |
aspiers | sean-k-mooney: https://github.com/pypa/pip/issues/4459 | 11:29 |
aspiers | open for over 2 years | 11:30 |
aspiers | and people say Python doesn't suck | 11:31 |
sean-k-mooney | well you can just set your corporate proxy as trusted in your /etc/pip.conf | 11:31 |
aspiers | what does this have to do with proxies? | 11:32 |
aspiers | and why did devstack work fine before on the same machine? | 11:32 |
aspiers | I *never* had to tweak /etc/pip.conf for devstack before | 11:32 |
sean-k-mooney | "a note about my environment - i'm runing behind cntlm and the corp proxy." | 11:32 |
aspiers | I didn't write that | 11:32 |
aspiers | not sure where you saw that | 11:33 |
sean-k-mooney | its the 5th or 6th line in the bug | 11:33 |
sean-k-mooney | second line of the description | 11:33 |
aspiers | yeah but devstack worked *yesterday* on this same machine | 11:33 |
aspiers | maybe clean.sh has removed some certificates? | 11:33 |
sean-k-mooney | i doubt it | 11:33 |
aspiers | well how else do you explain the cert verification failing? | 11:33 |
sean-k-mooney | try stacking again | 11:33 |
aspiers | that's what I tried | 11:34 |
aspiers | that's where I'm seeing the error | 11:34 |
sean-k-mooney | well i was wondering if the cert expired honestly on the mirror you hit | 11:34 |
aspiers | expired in the last 12 hours? | 11:34 |
aspiers | when https://github.com/pypa/pip/issues/4459 has been open since May 1 2017? | 11:34 |
aspiers | I doubt it | 11:35 |
aspiers | this is some horrible pip breakage | 11:35 |
openstackgerrit | Brin Zhang proposed openstack/nova master: WIP: Add user_id and project_id colume to Migration https://review.opendev.org/673990 | 11:35 |
kashyap | aspiers: Hi. How's DevStack treating you today? | 11:36 |
* kashyap hopes he's not rubbing salt on wound | 11:36 | |
aspiers | kashyap: not funny | 11:36 |
kashyap | Whoops. Sorry | 11:36 |
aspiers | I'm in the middle of this ridiculous deadline and everything is conspiring against me | 11:36 |
* kashyap was about to ask a favor; not a good time | 11:37 | |
aspiers | ok, update-ca-certificates fixed it | 11:37 |
*** ivve has quit IRC | 11:37 | |
kashyap | aspiers: O, I didn't read the full scrollback, and made a poor joke; disregard me... | 11:37 |
sean-k-mooney | ya i was debating if that would be related | 11:37 |
aspiers | kashyap: it's fine | 11:37 |
aspiers | kashyap: devstack is now running finally so I have time to kill | 11:37 |
sean-k-mooney | but i dont know why it would have been unless a cert expired | 11:37 |
aspiers | I'm guessing something to do with devstack clashing with SUSE's weird cert handling | 11:38 |
kashyap | aspiers: I see. Maybe you want to tackle other bits of your deadline. I was going to ask, if you want to triple-confirm if I'm not missing anything in the 'enum' parsing bits here - https://review.opendev.org/#/c/673790/2/nova/virt/libvirt/config.py | 11:38 |
aspiers | kashyap: ok | 11:38 |
aspiers | the depressing thing is I've wasted loads of time due to broken hardware, and I strongly suspect the "fix" won't have fixed it | 11:39 |
aspiers | so most likely this reinstall is a waste of time too | 11:39 |
kashyap | sean-k-mooney: You're using 'xpath' here: https://review.opendev.org/#/c/666915/6/nova/virt/libvirt/config.py@183 | 11:39 |
kashyap | aspiers: Yeah, broken-hardware-- | 11:40 |
sean-k-mooney | kashyap: yes because i did not want to create lots of small clases for the enums | 11:40 |
kashyap | sean-k-mooney: It's the first usage in config.py. I'd say we should be consistent and use 'enum' approach? | 11:40 |
sean-k-mooney | to avoid https://review.opendev.org/#/c/673790/2/nova/virt/libvirt/config.py@214 | 11:40 |
sean-k-mooney | well i strongly prefer xpath so its going to be rather low on my priority list to change | 11:41 |
kashyap | sean-k-mooney: Hmm, I'm not "opposed" to it, though. As I'm a heavy 'xpath' user myself | 11:41 |
aspiers | ImportError: No module named enum | 11:41 |
sean-k-mooney | i dont object to other using the domb parsing either | 11:41 |
aspiers | this just keeps getting better | 11:41 |
sean-k-mooney | *dom | 11:41 |
kashyap | sean-k-mooney: BTW, it's not about our personal preferences; just using something consistently well. You don't need to create lots of classes | 11:42 |
aspiers | kashyap: we had this conversation yesterday or the day before | 11:42 |
kashyap | sean-k-mooney: You can parse the enums in the same class, no? | 11:42 |
sean-k-mooney | you can, i just really dislike how that module is written | 11:43 |
kashyap | aspiers: Did we? Then my bumblebee-like memory forgot, less than 7 seconds | 11:43 |
sean-k-mooney | to the point i considered rewriting it to use xpath and decided not to and just use it for my bit | 11:43 |
aspiers | "we" as in me and sean-k-mooney | 11:43 |
aspiers | http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2019-07-29.log.html#t2019-07-29T12:34:58 | 11:43 |
kashyap | Aah, /me clicks | 11:43 |
aspiers | http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2019-07-29.log.html#t2019-07-29T12:42:02 | 11:44 |
aspiers | personally I'm with kashyap on this but I don't have the energy to bikeshed it because the rest of my world is falling apart | 11:44 |
* kashyap clicks further | 11:44 | |
kashyap | aspiers: Yeah, go on. This is not super urgent, but I really dislike if we all codify our random preferences | 11:45 |
sean-k-mooney | aspiers dont bother updating your code right now | 11:45 |
kashyap | That would suck hard | 11:45 |
sean-k-mooney | it works and its correct | 11:45 |
stephenfin | sean-k-mooney: sure | 11:45 |
aspiers | isn't it amazing how everything magically breaks just before a deadline? | 11:45 |
aspiers | seen it so many times | 11:45 |
kashyap | aspiers: Yeah, Murphy dancing on the head, seems like :-( | 11:45 |
sean-k-mooney | kashyap: sure but i hate the fact that we are calcifying parts of the code such that we can never use a better solution to solve things | 11:46 |
*** priteau has quit IRC | 11:46 | |
kashyap | sean-k-mooney: Again, note - I like 'xpath' myself and happy to use it in the code :-) | 11:46 |
sean-k-mooney | it activly makes me want to stop working on nova sometimes | 11:46 |
kashyap | Oh, no... | 11:46 |
kashyap | sean-k-mooney: Okay, leave it as-is for now. | 11:46 |
sean-k-mooney | if the general consensus is to do it the old way i can redo my code its just lots of wasted effort | 11:47 |
kashyap | sean-k-mooney: I myself might even rewrite to use 'xpath' because ... I am not sure if it has the expected 'enum' property defined: https://review.opendev.org/#/c/673790/2/nova/virt/libvirt/config.py | 11:48 |
kashyap | Anyway. Don't want to get hung up on this :-) | 11:49 |
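For readers unfamiliar with the xpath approach debated above, here is a minimal sketch of pulling `<enum>` values out of a domainCapabilities document with a single path expression instead of one small class per enum. Several things are assumptions: the standard library's `xml.etree.ElementTree` is used here rather than whatever parser nova's `config.py` relies on, the `enum_values` helper is hypothetical, and the sample XML is a trimmed illustration rather than real host output.

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative sample; real domainCapabilities output is much larger.
SAMPLE = """<domainCapabilities>
  <os supported='yes'>
    <loader supported='yes'>
      <enum name='type'>
        <value>rom</value>
        <value>pflash</value>
      </enum>
      <enum name='readonly'>
        <value>yes</value>
        <value>no</value>
      </enum>
    </loader>
  </os>
</domainCapabilities>"""


def enum_values(xml_text, parent_path, enum_name):
    """Return the <value> texts of the named <enum> under parent_path."""
    root = ET.fromstring(xml_text)
    # ElementTree supports a limited XPath subset, including the
    # [@attribute='value'] predicate used here.
    path = "./%s/enum[@name='%s']/value" % (parent_path, enum_name)
    return [value.text for value in root.findall(path)]


print(enum_values(SAMPLE, "os/loader", "type"))  # ['rom', 'pflash']
```

The trade-off discussed in the channel still applies: this keeps the parsing short, but it bypasses the per-element class pattern used elsewhere in that module, so consistency is the cost.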
*** udesale has quit IRC | 11:49 | |
*** udesale has joined #openstack-nova | 11:50 | |
aspiers | OK so my install of enum34 was corrupted | 11:52 |
aspiers | probably a fight between python module rpms and devstack | 11:52 |
sean-k-mooney | is enum34 the python 3.4 version? | 11:53 |
aspiers | it's the backport | 11:53 |
sean-k-mooney | ah ok | 11:53 |
aspiers | BTW I tried python 3.4 | 11:53 |
aspiers | it failed horribly | 11:53 |
sean-k-mooney | ya we never ran nova with it so im not surprised | 11:53 |
aspiers | but thinking about it more, maybe that was because I was trying to partially rely on packages | 11:53 |
sean-k-mooney | some oslo libs did briefly but most projects start with 3.5 | 11:53 |
aspiers | actually no | 11:53 |
aspiers | it was trying to install the latest versions of requests and Flask and neither work with 3.4 | 11:54 |
aspiers | probably lots of modules dropped support for 3.4 | 11:54 |
sean-k-mooney | 3.5 added a lot of nice features so it was kind of the first good python 3 release | 11:55 |
sean-k-mooney | or the first one people agreed was worth porting to | 11:55 |
sean-k-mooney | its when python3 got performance parity with py 2.7 and started to pull ahead in some cases | 11:56 |
aspiers | ImportError: No module named requests | 11:56 |
aspiers | awesome! | 11:56 |
aspiers | best day ever! | 11:56 |
sean-k-mooney | i think the latest requests package got pulled from pypi by the way | 11:56 |
sean-k-mooney | the one in upper-constraints no longer exists | 11:56 |
aspiers | they fixed that last week | 11:56 |
sean-k-mooney | at least the one that was there yesterday didnt | 11:56 |
sean-k-mooney | ah ok | 11:57 |
sean-k-mooney | havent updated my requirement repo in a while | 11:57 |
aspiers | yeah you should | 11:57 |
*** dpawlik has joined #openstack-nova | 11:57 | |
aspiers | hooray it's finally starting rabbit | 11:57 |
aspiers | originally I was trying to install SOC on this node so getting python modules from packages | 11:58 |
aspiers | then I switched to devstack which is why the rpms are fighting with pip | 11:58 |
sean-k-mooney | you know when we update to a new pip in devstack maybe we should add --user | 11:59 |
aspiers | well isn't it using virtualenv? | 11:59 |
sean-k-mooney | no | 11:59 |
sean-k-mooney | not by default | 11:59 |
sean-k-mooney | it can but its not really tested anywhere | 11:59 |
aspiers | I see | 12:00 |
*** ociuhandu has joined #openstack-nova | 12:01 | |
*** dpawlik has quit IRC | 12:01 | |
*** jcosmao has joined #openstack-nova | 12:04 | |
*** ociuhandu has quit IRC | 12:05 | |
aspiers | sean-k-mooney: maybe I misremembered, it was pyparsing not requests https://review.opendev.org/#/c/672395/ | 12:08 |
aspiers | but requests seems OK now | 12:08 |
aspiers | or maybe I'm using a local branch of requirements | 12:09 |
mordred | efried, stephenfin: left a comment on https://review.opendev.org/#/c/665518 | 12:10 |
sean-k-mooney | mordred: we are not going to use the pre-commit stuff in jobs | 12:11 |
aspiers | the best bit about devstack is how if it fails near the end you have to rerun the whole thing from the beginning | 12:11 |
mordred | sean-k-mooney: ok. I'm less worried about it then | 12:11 |
aspiers | sort of guaranteed O(n^2) debug time | 12:12 |
sean-k-mooney | mordred: it will be just optional for people to use locally and tox will still be used for jobs | 12:12 |
mordred | sean-k-mooney: we got a patch to zuul-jobs suggesting using pre-commit in our tox.ini as the command tox -elinters runs - which didn't seem like a TERRIBLE idea other than the github cloning ... but having it there just as an option for people who want such a thing is even better | 12:13 |
aspiers | I also love how pip (at least devstack version) doesn't have any way to verify that modules are intact | 12:13 |
sean-k-mooney | mordred: ya i noted that we could run it via a tox env | 12:13 |
sean-k-mooney | but i at least was not planning to propose it as a job | 12:13 |
sean-k-mooney | but i guess i can see why other might | 12:14 |
mordred | sean-k-mooney: yeah. it's .. you know, I *like* the concept of a declarative file with a list of linters to run. that part is honestly not a terrible design | 12:14 |
sean-k-mooney | mordred: the removing tabs thing however is really annoying to make work with emacs and i get tired of the fact we enforce it manually | 12:14 |
mordred | sean-k-mooney: is it really? weird | 12:15 |
sean-k-mooney | to do it consistently across all modes ya | 12:15 |
*** ociuhandu has joined #openstack-nova | 12:15 | |
mordred | oh. yeah. that I could see being annoying | 12:15 |
sean-k-mooney | i have found no simple way to do it that "just works" | 12:15 |
mordred | you didn't want to write your own minor mode in elisp? | 12:16 |
sean-k-mooney | i have kind of stopped using emacs again partly because of it | 12:16 |
stephenfin | mordred: I'm not sure I get you. Like sean-k-mooney says, none of this is integrated into tox or anything. It's purposefully kept separate | 12:16 |
mordred | stephenfin: yup. I somehow missed that | 12:16 |
mordred | stephenfin: I now understand better | 12:16 |
mordred | concerns withdrawn | 12:17 |
sean-k-mooney | stephenfin: others have suggested that we could use it for jobs in other repos | 12:17 |
stephenfin | Ah, fair. I think TripleO might be doing something different but I just want this for our personal, non-gating uses | 12:17 |
mordred | yeah - if we wanted to do that - I think we'd want to think about it systemically - because I do think it would be possible to do sanely - it would just take some work | 12:17 |
sean-k-mooney | by the way we could port the remove tabs script in repo | 12:17 |
sean-k-mooney | but that one works so effort+maintenance | 12:18 |
mordred | ++ | 12:18 |
mordred | if we were to go with a job, I'd say put the remove tabs script into the hacking repo, then add hacking as a required-project and have the pre-playbook we'd need to edit the pre-commit config know how to do the pre-commit-hooks repo and the hacking repo - then we'd have a generalized solution that was workable ... but there's several pieces of work in there and I don't want to do any of them right now | 12:19 |
mordred | ... so I like your current approach | 12:20 |
stephenfin | I also like not having to do work | 12:20 |
sean-k-mooney | ya that could be a good idea going forward | 12:20 |
mordred | not doing work so far exceeds doing work in desirability | 12:20 |
stephenfin | Yes. Yes it does. | 12:21 |
sean-k-mooney | when we get support for pre-commit in nova i can see moving it to hacking as step two | 12:21 |
*** ociuhandu has quit IRC | 12:25 | |
sean-k-mooney | i suggested using that repo by the way because 1. it's MIT so it's a nice license and 2. the hook is in python | 12:26 |
*** tbachman has quit IRC | 12:26 | |
*** ociuhandu has joined #openstack-nova | 12:27 | |
mordred | sean-k-mooney: ok - you've poked at this a bit ... how do you get pre-commit run -a to load in-repo hacking checks? | 12:28 |
mordred | I just tried the config file from that patch in the openstacksdk repo so I could play with it and understand it better - but we've got an in-repo hacking check that it's barfing on not being able to find | 12:28 |
sean-k-mooney | like this https://github.com/SeanMooney/gerio/blob/58c26decf2dfcf1a7deeedf5f016257b89c24149/.pre-commit-config.yaml | 12:28 |
*** dpawlik has joined #openstack-nova | 12:29 | |
aspiers | OMG I got devstack to work | 12:30 |
sean-k-mooney | congrats you graduate to step two | 12:31 |
aspiers | I can hardly believe it | 12:31 |
* aspiers waits for the kernel oops to reappear | 12:31 | |
*** ociuhandu has quit IRC | 12:31 | |
aspiers | and for services to randomly segfault | 12:31 |
sean-k-mooney | time to get multi-node devstack to work, or to get the thing that works in devstack working outside devstack; the choice is yours | 12:31 |
mordred | well - that's a local script - but in my case just running flake8 which loads hacking finds that I have this: https://opendev.org/openstack/openstacksdk/src/branch/master/tox.ini#L46-L47 defined | 12:31 |
*** ociuhandu has joined #openstack-nova | 12:31 | |
aspiers | sean-k-mooney: I still see mysqld spinning at 2MB/s | 12:32 |
sean-k-mooney | - repo: local | 12:32 |
sean-k-mooney | hooks: | 12:32 |
mordred | sean-k-mooney: and since nova also has local-check-factory = nova.hacking.checks.factory in tox.ini ... I'm wondering how the flake8 task is working | 12:32 |
sean-k-mooney | - id: black | 12:32 |
sean-k-mooney | name: black | 12:32 |
sean-k-mooney | language: system | 12:32 |
sean-k-mooney | entry: sh -c "tools/run_black.sh" | 12:32 |
sean-k-mooney | files: '' | 12:32 |
sean-k-mooney | mordred: so you could replace entry with a call to run tox | 12:32 |
sean-k-mooney | or just the flake8 command | 12:32 |
mordred | sure ... but how is the config in that nova patch working since it calls flake8 directly as a hook? | 12:32 |
sean-k-mooney | oh i see what you meant | 12:33 |
* mordred decides to try the nova repo and see for himself ... :) | 12:33 | |
*** panda|eat is now known as panda | 12:34 | |
sean-k-mooney | mordred: well the flake8 check is not reading tox.ini | 12:35 |
sean-k-mooney | at least i dont think it is | 12:35 |
mordred | but hacking does if it exists to load its own config whether it's running in tox or not | 12:35 |
sean-k-mooney | ah ok well maybe this only works if you have run devstack on that host | 12:36 |
*** tbachman has joined #openstack-nova | 12:36 | |
sean-k-mooney | which installs all the test requirements system wide | 12:36 |
sean-k-mooney | i haven't tried it on a clean system to check | 12:37 |
*** tssurya_ has joined #openstack-nova | 12:37 | |
*** tssurya has quit IRC | 12:37 | |
*** tssurya_ is now known as tssurya | 12:37 | |
mordred | sean-k-mooney: well - inexplicably - the nova one seems to be working | 12:37 |
*** artom has quit IRC | 12:37 | |
mordred | so I'll take that to mean there is _Something_ broken _somewhere_ -- which sadly means now I have to figure out what or else I won't be able to sleep | 12:38 |
sean-k-mooney | well the flake8 hook is just https://github.com/pre-commit/pre-commit-hooks/blob/master/.pre-commit-hooks.yaml#L134-L140 | 12:39 |
sean-k-mooney | so does it work in the sdk if you just run flake8 | 12:40 |
mordred | sean-k-mooney: it has found a niggly issue! | 12:40 |
sean-k-mooney | pre-commit++ | 12:41 |
mordred | the issue is that our flake8 checks have been running in a tox env which means openstacksdk has been installed in the venv ... and since the local hacking hooks are in openstack._hacking.factory, that means openstack itself gets imported - and because of something in openstack/__init__.py it triggers an import appdirs from a subfile which is a dependency | 12:41 |
mordred | so clearly I need to fix it so that openstack/__init__.py does not trigger that import | 12:42 |
mordred | since that's just rude in general | 12:42 |
sean-k-mooney | i am a firm believer that __init__.py should always be empty unless you have a very good reason | 12:43 |
sean-k-mooney | i never think to check __init__.py unless something is broken in a weird way | 12:44 |
mordred | yeah - I'm also a believer in that. in this case it's for factory functions - so you can do "import openstack ; conn = openstack.connect('foo')" ... but the importing there can be greatly improved so as to avoid this and go back to being mostly like being empty | 12:46 |
sean-k-mooney | the one usecase i kind of get is where you want to decouple the folder structure from the structure of the modules in the public interface | 12:47 |
sean-k-mooney | so that if you rearrange files internally your public api remains the same | 12:47 |
*** ratailor has quit IRC | 12:48 | |
*** belmoreira has quit IRC | 12:48 | |
sean-k-mooney | which is ok for a library to do but non-library code really shouldn't do that | 12:48 |
mordred | yah. totally | 12:49 |
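One way to keep a package-level factory such as `openstack.connect('foo')` without importing its dependencies when the package itself is imported is to defer the import into the function body. The following is only a sketch of that idea, not the actual openstacksdk fix: the `connect` function here is a hypothetical stand-in, and the standard-library `configparser` module stands in for a third-party dependency such as appdirs.

```python
# Imagine this file is a package's __init__.py (hypothetical example).
# Nothing heavy is imported at module import time, so tools that merely
# import the package (for example to load in-repo hacking checks) do not
# need the package's runtime dependencies installed.


def connect(cloud):
    """Hypothetical factory function exposed at package level."""
    # Deferred import: the dependency is only loaded when connect() is called.
    # configparser is a stand-in for a third-party dependency (assumption).
    import configparser

    config = configparser.ConfigParser()
    config.read_dict({"cloud": {"name": cloud}})
    return config


conn = connect("foo")
print(conn["cloud"]["name"])
```

The trade-off is that an import error for the dependency surfaces at the first call rather than at package import, which is usually acceptable for a convenience factory.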
*** tbachman has quit IRC | 12:51 | |
*** tbachman has joined #openstack-nova | 12:54 | |
*** Luzi has quit IRC | 12:55 | |
openstackgerrit | Merged openstack/nova stable/stein: Avoid logging traceback when detach device not found https://review.opendev.org/672833 | 12:56 |
*** rpittau|bbl is now known as rpittau | 12:57 | |
*** jaosorior has joined #openstack-nova | 12:58 | |
openstackgerrit | Merged openstack/nova stable/stein: Fix no propagation of nova context request_id https://review.opendev.org/670694 | 13:00 |
openstackgerrit | Merged openstack/nova stable/stein: Restore RT.old_resources if ComputeNode.save() fails https://review.opendev.org/672038 | 13:06 |
openstackgerrit | Merged openstack/nova stable/stein: Fix GET /servers/detail host_status performance regression https://review.opendev.org/669958 | 13:06 |
*** belmoreira has joined #openstack-nova | 13:06 | |
kashyap | aspiers: Did I interpret your comment correctly there? - https://review.opendev.org/#/c/348394/10 | 13:09 |
*** mchlumsky has joined #openstack-nova | 13:13 | |
kashyap | sean-k-mooney: Or anyone, can you confirm if my notes are correct: to test DevStack with a random change: | 13:14 |
*** mdbooth has quit IRC | 13:14 | |
kashyap | Have these two in local.conf: | 13:14 |
kashyap | NOVA_REPO=$GIT_BASE/openstack/nova.git | 13:14 |
kashyap | NOVA_BRANCH=refs/changes/348394/10/ | 13:14 |
kashyap | No, the 'refs' are wrong. It should be: | 13:16 |
kashyap | - refs/changes/348394/10 | 13:16 |
kashyap | + refs/changes/94/348394/10 | 13:16 |
*** mdbooth has joined #openstack-nova | 13:23 | |
sean-k-mooney | yes | 13:23 |
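The corrected ref above follows Gerrit's layout for change refs: a shard directory made of the last two digits of the change number, then the change number, then the patch set. A minimal sketch of that rule, where the helper name `gerrit_change_ref` is hypothetical and not part of DevStack or Gerrit:

```python
def gerrit_change_ref(change: int, patchset: int) -> str:
    """Build refs/changes/<last two digits of change>/<change>/<patchset>."""
    # The shard is the change number modulo 100, zero-padded to two digits.
    shard = f"{change % 100:02d}"
    return f"refs/changes/{shard}/{change}/{patchset}"


# The change discussed above: 348394, patch set 10.
print(gerrit_change_ref(348394, 10))
```

The resulting string is what would go into `NOVA_BRANCH` in local.conf, without a trailing slash.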
efried | sean-k-mooney, stephenfin: I haven't read all the scrollback, but | 13:23 |
efried | if you wanted to *not* support resources:{P|V}CPU, you would have to put an explicit check in place for that. | 13:23 |
efried | Because right now we support arbitrary placement-isms for traits and resources. | 13:23 |
efried | The reason for that syntax would be to support the "overloaded" meaning of PCPU/VCPU for downstream use cases like high/normal priority. | 13:23 |
efried | But I suppose there's no reason you *need* that syntax to support that -- you could still just use the maskage. | 13:23 |
efried | So I guess I'm fine either way. | 13:23 |
sean-k-mooney | although you don't need the NOVA_REPO if you're going to set it to $GIT_BASE/openstack/nova.git, that is the default | 13:24 |
sean-k-mooney | efried: there is no need to specify it directly to do that | 13:24 |
sean-k-mooney | yep you can just use the mask | 13:25 |
sean-k-mooney | we don't use the PCPU/VCPU resource class values to generate the xml either by the way | 13:25 |
sean-k-mooney | so it would not result in them being pinned | 13:25 |
kashyap | sean-k-mooney: Also, got a latest very minimal local.conf? Can this be cut down any further? -- http://paste.openstack.org/show/755214/ | 13:25 |
sean-k-mooney | or not pinned | 13:25 |
sean-k-mooney | well ignoring the fact you are moving where it installs | 13:26 |
*** ociuhandu has quit IRC | 13:26 | |
sean-k-mooney | you dont need VIRT_DRIVER=libvirt | 13:26 |
sean-k-mooney | that is the default | 13:26 |
kashyap | sean-k-mooney: Ah, right | 13:26 |
sean-k-mooney | as is your ml2 config section | 13:26 |
kashyap | sean-k-mooney: Can nuke it? | 13:27 |
sean-k-mooney | and the neutron section | 13:27 |
kashyap | Okido; /me goes to nuke | 13:27 |
*** ociuhandu has joined #openstack-nova | 13:27 | |
*** mriedem has joined #openstack-nova | 13:27 | |
*** priteau has joined #openstack-nova | 13:28 | |
*** mchlumsky has quit IRC | 13:28 | |
kashyap | Thanks | 13:29 |
*** mchlumsky has joined #openstack-nova | 13:30 | |
sean-k-mooney | kashyap: i think you can just do http://paste.openstack.org/show/755215/ | 13:30 |
kashyap | I also need 'FORCE=yes' as I'm testing on F30... | 13:31 |
*** tbachman has quit IRC | 13:31 | |
sean-k-mooney | #MULTI_HOST=True | 13:31 |
sean-k-mooney | that is for nova networks | 13:31 |
sean-k-mooney | so you dont need that | 13:31 |
*** ociuhandu has quit IRC | 13:31 | |
kashyap | sean-k-mooney: Yeah, I thought I commented it out | 13:31 |
* kashyap should update these https://kashyapc.fedorapeople.org/virt/openstack/multi-node-configs/ | 13:31 | |
sean-k-mooney | you did | 13:33 |
kashyap | Yep, thx | 13:33 |
kashyap | sean-k-mooney: Remind me again ... do I need 'force_config_drive = False' for Live Mig -- or is that fixed? (It is, IIRC) | 13:34 |
sean-k-mooney | but there is no reason to keep it commented, we don't support nova net downstream and we shouldn't use it upstream | 13:34 |
mriedem | efried: dustinc: i'm about to add https://blueprints.launchpad.net/nova/+spec/openstacksdk-in-nova to a runway slot, is it ready? | 13:34 |
* kashyap needs to check | 13:34 | |
sean-k-mooney | kashyap: config drive is off by default | 13:34 |
kashyap | Sweet | 13:34 |
sean-k-mooney | and in theory new versions of libvirt can copy a readonly config drive so it should not be needed | 13:35 |
sean-k-mooney | generally i used to change the config drive format to vfat instead as a better workaround for the same issue | 13:35 |
kashyap | Okido, it's in progress. Let's see if it goes through | 13:35 |
efried | mriedem: I will need to let dustinc answer that. The bits I worked on - the base and the placement cutover - have already merged. | 13:35 |
efried | mriedem: dustinc is Pacific, so might be a couple hours. | 13:36 |
mriedem | ok | 13:36 |
mriedem | though you're on https://review.opendev.org/#/c/642899/ which is my current hangup | 13:36 |
efried | mriedem: oh, yeah, I introduced like the very first PoC PS of that, but dustinc has owned it since then. | 13:37 |
sean-k-mooney | kashyap: oh by the way you should make it less minimal | 13:38 |
sean-k-mooney | add USE_PYTHON3=True | 13:38 |
efried | pretty sure all my touches since the PTG have been rebases (possibly a couple of trivial manual ones in there) | 13:38 |
*** tbachman has joined #openstack-nova | 13:38 | |
sean-k-mooney | kashyap: actually on fedora 30 you might only have python 3 so it might do that by default | 13:39 |
kashyap | sean-k-mooney: Afraid, it's in progress. | 13:40 |
openstackgerrit | Merged openstack/nova stable/stein: Handle Invalid exceptions as expected in attach_interface https://review.opendev.org/672384 | 13:41 |
kashyap | sean-k-mooney: Indeed, F30 _does_ use PY3. | 13:41 |
openstackgerrit | Merged openstack/nova stable/stein: Add functional regression test for bug 1837955 https://review.opendev.org/673532 | 13:41 |
openstack | bug 1837955 in OpenStack Compute (nova) stein "MaxRetriesExceeded sometime fails with messaging exception" [Medium,In progress] https://launchpad.net/bugs/1837955 - Assigned to Matt Riedemann (mriedem) | 13:41 |
sean-k-mooney | its fine it should do the right thing | 13:41 |
openstackgerrit | Merged openstack/nova stable/stein: Cleanup when hitting MaxRetriesExceeded from no host_available https://review.opendev.org/673533 | 13:41 |
*** eharney has joined #openstack-nova | 13:46 | |
*** priteau has quit IRC | 13:53 | |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Pass RequestContext to oslo_policy https://review.opendev.org/674038 | 13:59 |
*** lbragstad has joined #openstack-nova | 13:59 | |
kashyap | sean-k-mooney: Sigh, I missed the SUBUNIT_OUTPUT=$DEST/devstack.subunit | 14:00 |
sean-k-mooney | kashyap: why do you change the dest folder by the way | 14:01 |
*** priteau has joined #openstack-nova | 14:01 | |
sean-k-mooney | /opt/stack usually works fine | 14:01 |
kashyap | sean-k-mooney: Because I want them all in one place, and that's how my muscle memory is fixed, afraid | 14:01 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Add test coverage of existing os-services policies https://review.opendev.org/669181 | 14:01 |
*** BjoernT has joined #openstack-nova | 14:01 | |
kashyap | Otherwise DevStack drops turds all over the place | 14:02 |
kashyap | (Not that I can completely escape from it...) | 14:02 |
sean-k-mooney | no devstack puts everything in /opt/stack otherwise | 14:02 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Pass RequestContext to oslo_policy https://review.opendev.org/674038 | 14:02 |
*** mdbooth has quit IRC | 14:02 | |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Add test coverage of existing os-services policies https://review.opendev.org/669181 | 14:03 |
sean-k-mooney | e.g. DEST defaults to /opt/stack so all the things that use DEST will be sub directories of it | 14:03 |
sean-k-mooney | anyway you do you | 14:03 |
kashyap | Nod :-) | 14:03 |
*** mdbooth has joined #openstack-nova | 14:05 | |
*** spatel has joined #openstack-nova | 14:06 | |
mordred | sean-k-mooney, stephenfin: you nerd-sniped me - https://review.opendev.org/#/c/674040/ . However, word of warning, using the flake8 hook results in a different version of flake8 than what is installed by hacking | 14:06 |
*** ociuhandu has joined #openstack-nova | 14:07 | |
mordred | (easy workaround - just a thing to be aware of) | 14:07 |
sean-k-mooney | it will use whatever version is on your system path right | 14:08 |
*** BjoernT_ has joined #openstack-nova | 14:09 | |
sean-k-mooney | mordred: i think the version we use is generally the same as it's the one installed by devstack | 14:10 |
*** belmoreira has quit IRC | 14:10 | |
sean-k-mooney | but ya it could be much newer i guess too depending on how it was installed | 14:10 |
*** BjoernT has quit IRC | 14:10 | |
mriedem | slaweq: i wanted to apply https://github.com/openstack/tempest/commit/eb0a2cc5f240d52efa3a58c5a1ba8821bae3147e to the nova-next job, do you think i should wait for the devstack change? otherwise it's easy for me to just throw it in the nova-next zuul yaml config | 14:10 |
mriedem | oh huzzah https://review.opendev.org/#/c/674025/ | 14:11 |
mriedem | i guess once that lands nova-next will just get it for free | 14:12 |
gmann | mriedem: yeah tempest one is reverted back. | 14:14 |
mriedem | yup i saw | 14:15 |
mriedem | the devstack patch looks good to me | 14:15 |
mriedem | https://logs.opendev.org/25/674025/1/check/devstack-multinode/48f53e2/controller/logs/etc/nova/nova_conf.txt.gz | 14:15 |
mriedem | [cache] memcache_servers = localhost:11211 backend = dogpile.cache.memcached enabled = True | 14:15 |
*** ociuhandu has quit IRC | 14:16 | |
mordred | sean-k-mooney: if you just do "pip install --user pre-commit ; pre-commit run -a" pre-commit will helpfully install flake8 for you into a virtualenv - resulting in a wildly different version: http://paste.openstack.org/show/755219/ | 14:17 |
sean-k-mooney | mordred: even if flake8 is already installed? | 14:17 |
sean-k-mooney | but good to know | 14:17 |
slaweq | mriedem: I hope this devstack change can be merged soon so You should have it "for free" in Your jobs | 14:17 |
mriedem | yup same | 14:18 |
mriedem | it only takes clarkb and gmann to merge it :) | 14:18 |
mordred | sean-k-mooney: yeah - one of the things we've tried to do is make sure the gate runs the same commands that devs do so that in addition to making sure the software is good, we're making sure that devs running commands have a decent expectation of those commands working and only breaking if their code broke something | 14:18 |
sean-k-mooney | mordred: i personally havent tried using it for flake8 yet | 14:18 |
sean-k-mooney | mordred: i was using it for basic whitespace management | 14:19 |
mordred | yeah - I ran pre-commit run -a in the nova repo with stephenfin's patch and it blew up massively | 14:19 |
mordred | I actually think basic whitespace management is a great use of it | 14:19 |
*** ociuhandu has joined #openstack-nova | 14:19 | |
mordred | like - flake8 isn't quick to run :) | 14:19 |
sean-k-mooney | apparently it is faster with pre-commit as it's meant to only run on the files you are modifying | 14:20 |
sean-k-mooney | like our fast8 tox env does | 14:20 |
sean-k-mooney | but again havent tested stephenfin patch | 14:20 |
mordred | yeah - if I run without -a it's very quick | 14:20 |
sean-k-mooney | yep that was the goal. quick test/hooks that fix the common style stuff before it hits the gate | 14:21 |
sean-k-mooney | its always frustrating to come back after a tempest run and notice a trailing whitespace | 14:22 |
mordred | right? | 14:22 |
*** jhesketh has quit IRC | 14:23 | |
*** ociuhandu has quit IRC | 14:23 | |
*** artom has joined #openstack-nova | 14:25 | |
mriedem | efried: on this func test from gibi https://review.opendev.org/#/c/667913/ it's not really a regression so i'd like to move it elsewhere. i suggested nova/tests/functional/compute/test_init_host.py which i have added here https://review.opendev.org/#/c/670393/ - i'm just wondering if it's ok to rebase gibi's change on top of mine since the only dependency would be the module name | 14:33 |
mriedem | and mine hasn't received any core review | 14:33 |
mriedem | though artom loves it | 14:33 |
artom | I do what I can | 14:34 |
efried | mriedem: what are you asking? | 14:42 |
efried | You want to rebase gibi's change and also move the test into another module? | 14:42 |
efried | and then fast approve it? | 14:43 |
*** mlavalle has joined #openstack-nova | 14:43 | |
*** dpawlik has quit IRC | 14:43 | |
mriedem | idk about fast approve | 14:44 |
mriedem | but yeah i want to move his test out of regressions since it's not a regression, it's latent behavior | 14:44 |
efried | okay, that would be the only part I'd be hesitant about, so go for it. | 14:44 |
mriedem | ack | 14:44 |
*** jaosorior has quit IRC | 14:46 | |
openstackgerrit | Monty Taylor proposed openstack/nova master: Keep pre-commit inline with hacking and fix whitespace https://review.opendev.org/674057 | 14:51 |
mordred | sean-k-mooney, stephenfin: ^^ followup that fixes those issues - feel free to squash or ignore or whatever. | 14:52 |
sean-k-mooney | we have way more tabs in the repo than i hoped for. e.g. more than 0 | 14:54 |
sean-k-mooney | we could probably exclude svg files from that but on the other hand it won't break them so i guess it's fine | 14:54 |
*** aojea has quit IRC | 14:56 | |
mordred | sean-k-mooney: want me to re-run with an exclude added for .svg? | 14:56 |
*** priteau has quit IRC | 14:56 | |
sean-k-mooney | am i don't mind but if someone generated them with a tool we probably don't want to process them | 14:56 |
sean-k-mooney | i doubt they wrote them by hand so it's probably for the best to exclude them | 14:57 |
stephenfin | mordred: I was just going to fix stuff as we went, tbh :) (i.e. not using the '-a' flag) | 14:57 |
sean-k-mooney | stephenfin: well ignoring the svg files the rest is minor | 14:58 |
stephenfin | yeah, no merge conflicts either. Easy straight up merge, IMO | 14:58 |
sean-k-mooney | you spoke too soon https://review.opendev.org/#/c/660147/8 | 14:59 |
stephenfin | sean-k-mooney: Yeah, ignore that. That patch needs massive rework as-is | 15:00 |
openstackgerrit | Monty Taylor proposed openstack/nova master: Keep pre-commit inline with hacking and fix whitespace https://review.opendev.org/674057 | 15:02 |
mordred | that's a much smaller version | 15:02 |
*** tbachman has quit IRC | 15:03 | |
sean-k-mooney | yep it looks sane | 15:04 |
stephenfin | mordred: Nice. LGTM | 15:04 |
stephenfin | Just need to find someone to hold their nose and approve the parent patch now | 15:04 |
*** ratailor has joined #openstack-nova | 15:05 | |
mordred | heh | 15:05 |
*** maciejjozefczyk has quit IRC | 15:07 | |
openstackgerrit | melanie witt proposed openstack/nova stable/rocky: Avoid logging traceback when detach device not found https://review.opendev.org/674068 | 15:07 |
*** ociuhandu has joined #openstack-nova | 15:10 | |
*** tbachman has joined #openstack-nova | 15:11 | |
*** bbowen has quit IRC | 15:15 | |
*** bbowen has joined #openstack-nova | 15:15 | |
openstackgerrit | sean mooney proposed openstack/nova master: support pci numa affinity policies in flavor and image https://review.opendev.org/674072 | 15:26 |
*** gyee has joined #openstack-nova | 15:27 | |
*** tssurya has quit IRC | 15:30 | |
*** tbachman has quit IRC | 15:30 | |
stephenfin | Can someone explain wtf is going on here? http://paste.openstack.org/show/755239/ | 15:41 |
stephenfin | does 'x = y = set([])' mean 'x' and 'y' are pointing to the same thing in memory or something? | 15:42 |
* stephenfin hasn't noticed that before, if so | 15:42 | |
*** ccamacho has quit IRC | 15:43 | |
stephenfin | TIL https://stackoverflow.com/a/16349356 | 15:44 |
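For reference, the behaviour asked about above can be reproduced in a few lines. Chained assignment evaluates the right-hand side once and binds both names to that single object, so a mutation made through one name is visible through the other. A minimal sketch; the variable names and values are illustrative and not taken from the paste:

```python
# Chained assignment: one set object, two names bound to it.
x = y = set()
x.add("host1")
print(x is y)  # identity check: both names refer to the same set
print(y)       # the element added via x shows up via y

# Separate constructor calls create two independent sets.
a, b = set(), set()
a.add("host1")
print(a is b)
print(b)
```

The same applies to any mutable object (lists, dicts, class instances); with immutable values such as ints or strings the sharing is harmless because they cannot be changed in place.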
*** ociuhandu has quit IRC | 15:47 | |
*** cdent has quit IRC | 15:48 | |
*** priteau has joined #openstack-nova | 15:50 | |
*** whoami-rajat has quit IRC | 15:51 | |
*** dpawlik has joined #openstack-nova | 15:52 | |
*** jcosmao has left #openstack-nova | 15:53 | |
*** tbachman has joined #openstack-nova | 15:55 | |
prometheanfire | sean-k-mooney: do you need a bug for https://review.opendev.org/673848 ? | 15:56 |
*** dpawlik has quit IRC | 15:59 | |
edleafe | stephenfin: wow, usually people get bit by that within their first few Python programs. | 15:59 |
stephenfin | I know. I've seen it before with objects but I didn't think sets would work like that too | 16:00 |
* artom has never seen nor felt the need for x = y = foo | 16:03 | |
sean-k-mooney | stephenfin: x and y should both be references to the same set yes | 16:03 |
* artom tries to write the least clever code possible | 16:03 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Functional reproduce for bug 1833581 https://review.opendev.org/667913 | 16:04 |
openstack | bug 1833581 in OpenStack Compute (nova) "instance stuck in BUILD state if nova-compute is restarted" [Low,In progress] https://launchpad.net/bugs/1833581 - Assigned to Balazs Gibizer (balazs-gibizer) | 16:04 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Error out interrupted builds https://review.opendev.org/666857 | 16:04 |
*** artom has quit IRC | 16:04 | |
sean-k-mooney | prometheanfire: am given i forgot about that until you pinged me we probably should file one yes | 16:05 |
melwitt | efried, mriedem: I saw the chat on IRC about RequestContext and logging, so I added a comment on https://review.opendev.org/673924 | 16:07 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Introduce scope_types in os-services https://review.opendev.org/645427 | 16:08 |
mriedem | melwitt: thanks. sounds like you should -2 that? | 16:08 |
*** ksdean has joined #openstack-nova | 16:08 | |
*** igordc has joined #openstack-nova | 16:08 | |
*** ksdean has quit IRC | 16:08 | |
prometheanfire | sean-k-mooney: :D | 16:09 |
*** yaawang has quit IRC | 16:09 | |
*** ksdean has joined #openstack-nova | 16:09 | |
*** ksdean has quit IRC | 16:10 | |
aspiers | so it turns out that on a box with 256 CPUs, devstack configures nova scheduler/conductor/metadata/api with 64 workers which spams mysqld to hell and back | 16:11 |
sean-k-mooney | aspiers: ya | 16:11 |
*** ratailor has quit IRC | 16:11 | |
sean-k-mooney | i have hit that problem several times | 16:11 |
sean-k-mooney | over the years | 16:11 |
aspiers | sean-k-mooney: bah, but you didn't mention it earlier when I asked ;-p | 16:11 |
sean-k-mooney | it used to default to 1 worker per core | 16:12 |
*** ksdean has joined #openstack-nova | 16:12 | |
sean-k-mooney | i see they tried to make it less dumb | 16:12 |
sean-k-mooney | what that it create X works per core | 16:12 |
aspiers | API_WORKERS=${API_WORKERS:=$(( ($(nproc)/4)<2 ? 2 : ($(nproc)/4) ))} | 16:12 |
sean-k-mooney | ya the ": ($(nproc)/4 " should be ":8" | 16:13 |
*** yaawang has joined #openstack-nova | 16:13 | |
melwitt | mriedem: I didn't want to be heavy-handed, it's WIP and I was thinking the approach would be changed based on the information | 16:13 |
sean-k-mooney | but i used to hit those issues the whole time at intel and just ended up hardcoding the workers when i did | 16:14 |
sean-k-mooney | i just am conditioned to assume others don't have that many cpus | 16:14 |
aspiers | sean-k-mooney: think there should be a hard cap? | 16:14 |
sean-k-mooney | and those that do know about this behavior | 16:14 |
sean-k-mooney | aspiers: yes 8 | 16:14 |
aspiers | that low? | 16:14 |
sean-k-mooney | for devstack yes | 16:14 |
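The arithmetic behind the 64 workers, and one reading of the suggested cap, can be sketched as follows. The first function mirrors the shell expression quoted above (a quarter of the CPU count, with a floor of 2). The second applies a hard cap of 8 on top of it, which is an assumption about how such a cap might be implemented and not an existing DevStack setting; both function names are hypothetical.

```python
def default_api_workers(nproc: int) -> int:
    """Mirror of the quoted expression: nproc/4 (integer division), minimum 2."""
    return max(2, nproc // 4)


def capped_api_workers(nproc: int, cap: int = 8) -> int:
    """Same default, but never more than `cap` workers (hypothetical cap)."""
    return min(cap, default_api_workers(nproc))


for cpus in (4, 16, 256):
    print(cpus, default_api_workers(cpus), capped_api_workers(cpus))
```

With 256 CPUs the uncapped formula yields 64 workers per service, matching what was observed; the capped variant would yield 8.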
aspiers | I guess | 16:15 |
aspiers | but also I wonder if the heartbeating code has a bug | 16:15 |
sean-k-mooney | if you want to set it you can use the [[post-config| path/to/file]] | 16:15 |
sean-k-mooney | syntax | 16:15 |
aspiers | it does a COMMIT then ROLLBACK after every SELECT | 16:15 |
aspiers | I don't even understand how mysqld would interpret that | 16:15 |
sean-k-mooney | well the heartbeating code reconnect every 60 seconds to rabbit | 16:16 |
aspiers | but that's what the general_log says | 16:16 |
sean-k-mooney | but that is a known issue with uwsgi and eventlet | 16:16 |
sean-k-mooney | that should only affect the api services | 16:16 |
sean-k-mooney | if you are seeing heartbeat issues on other services then that's a new issue | 16:16 |
aspiers | I see this with conductor | 16:16 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Add new default roles and mapping in policy base class https://review.opendev.org/645452 | 16:17 |
aspiers | as soon as it starts, mysqld starts writing at ~2MB/s | 16:17 |
sean-k-mooney | what are the symptoms in the conductor logs | 16:17 |
aspiers | nothing | 16:17 |
sean-k-mooney | no repeating messages | 16:17 |
sean-k-mooney | wait are you seeing spikes at fixed intervals? | 16:18 |
aspiers | no repeating messages | 16:18 |
sean-k-mooney | actually never mind, i was thinking of something else; you only have 1 compute node | 16:18 |
aspiers | no it's more or less continuous 2MB/s | 16:18 |
aspiers | sometimes up to 5MB/s | 16:19 |
aspiers | that is writes not reads | 16:19 |
prometheanfire | sean-k-mooney: https://bugs.launchpad.net/nova/+bug/1838666 | 16:19 |
openstack | Launchpad bug 1838666 in OpenStack Compute (nova) "lxml 4.4.0 causes failed tests in nova" [Undecided,New] | 16:19 |
aspiers | WTF is it writing?! | 16:19 |
sean-k-mooney | aspiers: is the file growing? | 16:19 |
aspiers | which file? | 16:19 |
sean-k-mooney | the size of the mysql data directory | 16:19 |
sean-k-mooney | e.g. is it updating stuff or writing new data | 16:19 |
aspiers | updating | 16:21 |
aspiers | stays the same size | 16:21 |
sean-k-mooney | ok and you think it's the heartbeat? | 16:22 |
aspiers | I enabled general_log | 16:22 |
aspiers | which is supposed to show *all* queries, right? | 16:22 |
aspiers | and it just shows this select from services over and over | 16:23 |
aspiers | http://paste.openstack.org/show/755199/ | 16:23 |
sean-k-mooney | and you have tracked it to the conductor | 16:23 |
aspiers | not just the conductor | 16:23 |
aspiers | several nova services | 16:23 |
aspiers | I stopped them all => mysqld goes quiet | 16:23 |
aspiers | start one => mysqld goes nuts | 16:24 |
aspiers | I think it's any which has 64 workers | 16:24 |
sean-k-mooney | right the compute agent doesn't use workers | 16:24 |
aspiers | mysqladmin processlist shows 853 entries! | 16:24 |
sean-k-mooney | the only ones that do are the metadata api, api, scheduler and conductor | 16:24 |
aspiers | yes those are the ones I saw the spam from | 16:25 |
sean-k-mooney | actually i'm not sure we use workers in the conductor | 16:25 |
aspiers | Main PID: 177005 (nova-conductor) | 16:25 |
aspiers | Tasks: 64 (limit: 51200) | 16:25 |
aspiers | it does | 16:25 |
aspiers | nova is using 513 db connections | 16:25 |
sean-k-mooney | the conductor can do db access on behalf of other services like the scheduler i think | 16:26 |
aspiers | neutron 155 | 16:26 |
aspiers | maybe but anyway why is it doing COMMIT and ROLLBACK? | 16:26 |
aspiers | select should just be reads | 16:26 |
sean-k-mooney | no idea | 16:26 |
aspiers | surely a bug | 16:26 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Add new default roles and mapping in policy base class https://review.opendev.org/645452 | 16:26 |
sean-k-mooney | it might be using a read context manager incorrectly | 16:27 |
aspiers | nova.servicegroup.drivers.db.DbDriver.is_up | 16:28 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: WIP:Introduce scope_types in servers API https://review.opendev.org/662968 | 16:28 |
openstackgerrit | Dongcan Ye proposed openstack/nova master: Docs: Fix launch an instance from a volume https://review.opendev.org/674086 | 16:29 |
*** lbragstad has quit IRC | 16:29 | |
*** artom has joined #openstack-nova | 16:29 | |
*** yaawang has quit IRC | 16:30 | |
*** yaawang has joined #openstack-nova | 16:32 | |
efried | thanks melwitt, that's helpful, ahem, context. | 16:33 |
efried | What I'm confused about, though, is: isn't a threadlocal... local... to the thread? So like a boot request will be running in a separate thread from a periodic, and thus *should* have its own context and request_id? | 16:35 |
efried | (I haven't yet gotten back to looking at what happened to my test patch, which ought to be informative.) | 16:35 |
*** gregwork has joined #openstack-nova | 16:36 | |
sean-k-mooney | thread locals are typically implemented by keeping a read-only copy of the initialization value | 16:36 |
sean-k-mooney | and then copying it when a new thread is spawned | 16:36 |
sean-k-mooney | at least that is how it works in c/c++ | 16:37 |
efried | I probably need a better understanding of the lifecycle of the various threads in a nova-compute process. | 16:38 |
sean-k-mooney | same but i think im more or less done for today | 16:39 |
*** rpittau is now known as rpittau|afk | 16:40 | |
sean-k-mooney | once i triage one bug... | 16:40 |
gregwork | has anyone seen a situation where nova fails to create the record in placement for a queens/rdo overcloud deployment on baremetal? we are getting this on a clean undercloud install for all baremetal nodes: "There was a conflict when trying to complete your request.\n\n Unable to allocate inventory: Unable to create allocation for 'CUSTOM_BAREMETAL' on resource provider 'ed51f633-972e-4d72-ac7e-6478d301c57b'. The | 16:41 |
gregwork | requested amount would exceed the capacity. ", "title": "Conflict"}]}) | 16:41 |
gregwork | the resource tags for vcpu/memory/disk on things like control are all 0 | 16:41 |
gregwork | and looking at the nova_api db shows that all bm nodes have values > 0 | 16:41 |
gregwork | for those requisites | 16:41 |
efried | gregwork: It's telling you the CUSTOM_BAREMETAL resource is unavailable, so VCPU/MEMORY_MB/DISK_GB should all be n/a. Can you pull the inventory record for that provider? | 16:43 |
efried | openstack resource provider inventory show $node_uuid CUSTOM_BAREMETAL | 16:44 |
efried | what you're looking for there is reserved==total==1 | 16:45 |
efried | if reserved==0, then look for usage: | 16:45 |
*** nafiux has joined #openstack-nova | 16:45 | |
efried | openstack resource provider usage show $node_uuid | 16:45 |
* efried lunches | 16:46 | |
gregwork | i dont have openstack resource provider | 16:46 |
gregwork | i have openstack resource member | 16:46 |
*** sapd1_x has quit IRC | 16:49 | |
sean-k-mooney | gregwork: you are missing the osc-placement plugin | 16:55 |
*** ivve has joined #openstack-nova | 16:55 | |
gregwork | im installing the package | 16:56 |
gregwork | seems to not get installed with a base rdo/queens install | 16:56 |
gregwork | +------------------+-------+ | 17:00 |
gregwork | | resource_class | usage | | 17:00 |
gregwork | +------------------+-------+ | 17:00 |
gregwork | | VCPU | 0 | | 17:00 |
gregwork | | MEMORY_MB | 0 | | 17:00 |
gregwork | | CUSTOM_BAREMETAL | 0 | | 17:00 |
gregwork | | DISK_GB | 0 | | 17:00 |
gregwork | +------------------+-------+ | 17:00 |
gregwork | all of them show that | 17:00 |
*** derekh has quit IRC | 17:01 | |
*** nafiux has quit IRC | 17:02 | |
*** igordc has quit IRC | 17:03 | |
*** kdean has joined #openstack-nova | 17:04 | |
*** priteau has quit IRC | 17:05 | |
*** ksdean has quit IRC | 17:06 | |
stephenfin | I *think* I finally have all my failing unit tests for cpu-resources fixed | 17:10 |
gregwork | efried: here was the other command | 17:11 |
gregwork | +------------------+-------+ | 17:12 |
gregwork | | Field | Value | | 17:12 |
gregwork | +------------------+-------+ | 17:12 |
gregwork | | allocation_ratio | 1.0 | | 17:12 |
gregwork | | max_unit | 1 | | 17:12 |
gregwork | | reserved | 0 | | 17:12 |
gregwork | | step_size | 1 | | 17:12 |
gregwork | | min_unit | 1 | | 17:12 |
gregwork | | total | 1 | | 17:12 |
gregwork | +------------------+-------+ | 17:12 |
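The capacity rule behind the two outputs above can be sketched in Python. This is a simplified illustration: the function names are hypothetical, and the real placement check also takes `min_unit`, `max_unit` and `step_size` into account, which are ignored here.

```python
def placement_capacity(total: int, reserved: int, allocation_ratio: float) -> int:
    """Capacity of one resource class: (total - reserved) * allocation_ratio."""
    return int((total - reserved) * allocation_ratio)


def fits(requested: int, used: int, total: int, reserved: int,
         allocation_ratio: float) -> bool:
    """True if the request still fits once current usage is counted."""
    capacity = placement_capacity(total, reserved, allocation_ratio)
    return used + requested <= capacity


# Values taken from the inventory and usage tables pasted above:
# total=1, reserved=0, allocation_ratio=1.0, usage=0.
node_can_host_one = fits(requested=1, used=0, total=1, reserved=0,
                         allocation_ratio=1.0)
# A node with reserved == total == 1 would have no capacity left.
reserved_node_can_host_one = fits(requested=1, used=0, total=1, reserved=1,
                                  allocation_ratio=1.0)
```

Under these numbers a single CUSTOM_BAREMETAL unit should fit, which is why the "would exceed the capacity" error is surprising for this provider.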
melwitt | mriedem: what do you think about backporting the --before command for archive_deleted_rows? https://review.opendev.org/556751 this has come up repeatedly downstream in clouds that run archive_deleted_rows periodically + have outages where local deletes happen. if archive runs at an inopportune time, we end up with "running deleted" guests on the hypervisor that are never cleaned up, along with placement allocations that are never | 17:13 |
melwitt | cleaned up, | 17:13 |
melwitt | being able to use --before could mitigate that | 17:14 |
*** udesale has quit IRC | 17:14 | |
*** whoami-rajat has joined #openstack-nova | 17:15 | |
stephenfin | f*** yeah | 17:18 |
stephenfin | onto functional tests | 17:18 |
*** ralonsoh has quit IRC | 17:20 | |
*** ashish2307 has joined #openstack-nova | 17:22 | |
melwitt | correction: I see now that allocations are not cleaned up as part of the periodic. they are instead deleted at local delete time. so --before would help only in the orphaned libvirt guests case | 17:26 |
*** igordc has joined #openstack-nova | 17:27 | |
*** ricolin__ is now known as ricolin | 17:30 | |
*** igordc has quit IRC | 17:30 | |
efried | gregwork: That all looks copacetic. Is the issue reproducible? | 17:40 |
*** nafiux has joined #openstack-nova | 17:41 | |
*** slaweq has quit IRC | 17:42 | |
*** igordc has joined #openstack-nova | 17:42 | |
zzzeek | mriedem: ping | 17:43 |
zzzeek | mriedem: need a +2 on https://review.opendev.org/#/c/671040/ | 17:43 |
*** ociuhandu has joined #openstack-nova | 17:44 | |
*** ociuhandu has quit IRC | 17:48 | |
*** betherly has joined #openstack-nova | 17:49 | |
*** betherly has quit IRC | 17:54 | |
*** slaweq has joined #openstack-nova | 17:57 | |
openstackgerrit | Dustin Cowles proposed openstack/nova master: Provider config file schema and loader https://review.opendev.org/673341 | 17:58 |
efried | melwitt: are you around? | 18:02 |
melwitt | efried: yes | 18:02 |
mriedem | melwitt: orphans ala https://review.opendev.org/#/c/627765/ | 18:03 |
mriedem | melwitt: idk about backporting https://review.opendev.org/#/c/556751/, it's tied to a blueprint | 18:03 |
efried | melwitt: I've got some clever ideas about how to make global_request_id a) present and b) useful in (at least) the nova-compute logs, but I need some help understanding how the processes use threads. Is that something you know about? (or mriedem) | 18:03 |
melwitt | mriedem: yes those orphans. and yeah, that's why I wasn't sure. it's useful for operations though. on the orphan patch, I'm concerned about it because afaict, it's going to be libvirt-only and I hope that's ok. because to make it work for other drivers would take a lot more change. reason being, all the driver.destroy() require instance objects and when we're dealing with db-record-less guests, we have no real objects in that case | 18:05 |
melwitt | efried: I'd say I know a little. I'm in the middle of writing another comment on the patch based on your question about periodics having their own thread. to correct my comment | 18:06 |
mriedem | zzzeek: done | 18:07 |
stephenfin | Patchset incoming (I had to rebase :() | 18:07 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Follow-up for I2936ce8cb293dc80e1a426094fdae6e675461470 https://review.opendev.org/672669 | 18:07 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: trivial: Remove unused function parameter https://review.opendev.org/671796 | 18:07 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: objects: Rename 'nova.objects.instance_numa_topology' https://review.opendev.org/671789 | 18:07 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: libvirt: Remove unnecessary try-catch around 'getCPUMap' https://review.opendev.org/671790 | 18:07 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: objects: Remove legacy '_from_dict' functions https://review.opendev.org/537414 | 18:07 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: claims: Remove useless caching https://review.opendev.org/671791 | 18:07 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Add '[compute] cpu_dedicated_set' option https://review.opendev.org/671792 | 18:07 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: libvirt: Start reporting PCPU inventory to placement https://review.opendev.org/671793 | 18:07 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: trivial: Rename exception argument https://review.opendev.org/671795 | 18:07 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Remove 'hardware.get_host_numa_usage_from_instance' https://review.opendev.org/671797 | 18:07 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Remove 'hardware.host_topology_and_format_from_host' https://review.opendev.org/671798 | 18:07 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Remove 'hardware.instance_topology_from_instance' https://review.opendev.org/671799 | 18:07 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Rework 'hardware.numa_usage_from_instances' https://review.opendev.org/672565 | 18:07 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: tests: Split NUMA object tests https://review.opendev.org/672336 | 18:07 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: libvirt: '_get_(v|p)cpu_total' to '_get_(v|p)cpu_available' https://review.opendev.org/672693 | 18:07 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: hardware: Differentiate between shared and dedicated CPUs https://review.opendev.org/671800 | 18:07 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Add support translating CPU policy extra specs, image meta https://review.opendev.org/671801 | 18:07 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: objects: Rename 'fields' import to 'obj_fields' https://review.opendev.org/674103 | 18:07 |
gregwork | efried: the issue is presently 100% reproducible it is blocking all overcloud deploys | 18:10 |
zzzeek | mriedem: thanks! | 18:14 |
*** shilpasd has quit IRC | 18:20 | |
efried | melwitt: Is there a way to get at the Service's context from within the virt driver? I was looking to fix the func failure exposed by gibi's poison patch | 18:23 |
efried | I could do it by adding a context arg to get_available_resource, but that's a nontrivial impact | 18:24 |
melwitt | efried: not that I know of. have to dig a bit but I doubt it's accessible | 18:27 |
*** betherly has joined #openstack-nova | 18:30 | |
mriedem | melwitt: just a thought on the downstream deployment problem, as a workaround those tools that run the archive cron could check if any compute service are down (that info is in the api) and not do the archive | 18:31 |
*** eharney has quit IRC | 18:32 | |
mriedem | something like, | 18:32 |
mriedem | if openstack compute service list -f value -c State | grep -q down; then echo "skipping archive"; fi | 18:33 |
melwitt | mriedem: thanks. that could mitigate some of it, the other issue is that the reap task does not run immediately during nova-compute start until https://review.opendev.org/657132 landed this cycle. which means reap will not run until 30 min have elapsed since starting nova-compute, by default. and I unfortunately did not create a bug for that change, so does that cause a problem for wanting to backport? | 18:34 |
mriedem | it's a behavior change so yeah | 18:35 |
melwitt | so prior to that change, would have to check whether it's down and also how long since it was down | 18:35 |
*** betherly has quit IRC | 18:35 | |
melwitt | yeah. that's why I was going more toward wanting to backport --before | 18:35 |
melwitt | for archive | 18:36 |
mriedem | how long since it was down isn't a thing we record | 18:36 |
mriedem | in the api anyway | 18:36 |
mriedem | we do have last_seen_up in the db | 18:36 |
*** eharney has joined #openstack-nova | 18:36 | |
melwitt | yeah, I was thinking something like that, if there's a way to see when it started or came up, could do a time delta on that | 18:37 |
mriedem | the api does have an updated_at it returns which may or may not help | 18:37 |
melwitt | efried: it looks like a virt driver could get to the compute manager (though not supposed to, I think) by self.virtapi._compute, so if the context were stored not only on service but also on the compute manager, then the driver could get it e.g. self.virtapi._compute.ctxt. I don't know whether that would be a good idea, just saying for the sake of thinking about it | 18:45 |
melwitt | the service creates the manager, so adding the context to the manager would be simple | 18:46 |
efried | melwitt: cool, thanks. I'm adding it to get_available_resource right now and it's, predictably, a minor PITA. Not sure which is worse though | 18:47 |
efried | but it's certainly not the only driver method to which we pass a context, so... | 18:48 |
*** betherly has joined #openstack-nova | 18:58 | |
nafiux | Hi Team, I’m receiving this error: “Received a sync request from an unknown host 'c1.o7k.io'. Re-created its InstanceList.” in nova-scheduler.log, and “The instance sync for host 'c1.o7k.io' did not match. Re-created its InstanceList.”, any hint? | 18:59 |
nafiux | Actually, never mind, that isn’t impacting the creation of the instance… sorry. Thanks! | 18:59 |
nafiux | Should I worry about that message at all? | 19:01 |
*** betherly has quit IRC | 19:03 | |
* aspiers wonders why devstack rolls its own ini editor instead of using crudini | 19:04 | |
openstackgerrit | Eric Fried proposed openstack/nova master: Poison context usage in periodic tasks https://review.opendev.org/542891 | 19:14 |
openstackgerrit | Eric Fried proposed openstack/nova master: WIP: Generate and log global_request_id properly https://review.opendev.org/673924 | 19:14 |
openstackgerrit | Eric Fried proposed openstack/nova master: Add context param to get_available_resource https://review.opendev.org/674112 | 19:14 |
openstackgerrit | Eric Fried proposed openstack/nova master: Pass context to _get_disk_over_committed_size_total https://review.opendev.org/674113 | 19:14 |
efried | melwitt: So I added the context param in one patch, and funneled it through to that one LibvirtDriver method in another ^ | 19:15 |
efried | really I think I'm just stalling trying to figure out the right thing to do about oslo.log's threadlocal RequestContext... | 19:15 |
mriedem | artom: fyi https://review.opendev.org/#/q/topic:story/2006302+(status:open+OR+status:merged) | 19:16 |
artom | mriedem, the osc thing I signed up for yeah :( | 19:17 |
artom | Sorry, I clearly overestimated my availability for such things | 19:17 |
melwitt | efried: ok. I'm not up to speed on the problem, so I'll try to look at the patch later to see what you're trying to do | 19:19 |
efried | melwitt: oh, the thing above is just to fix up the func test failure that gibi uncovered when writing the poison fixture to enforce the "don't get_[admin_]context from periodics" that you wrote. | 19:20 |
melwitt | oh ok | 19:20 |
efried | melwitt: The thing that led me down the rabbit hole is the fact that we're not logging the global request ID | 19:21 |
efried | at all | 19:21 |
efried | and the (non-global) request ID that we're logging isn't the one from the context we're using to set the global request ID header we're sending to placement | 19:21 |
melwitt | yeah. I guess I was thinking I dunno why logging that should involve needing to overwrite thread local context | 19:21 |
efried | so the only way we can possibly correlate what's happening in n-cpu with what's happening in placement is to log the placement (local) request_id it sends back in the response. | 19:21 |
efried | which sucks | 19:21 |
efried | yeah, to solve the general problem I'll need a better understanding of how threads are used throughout | 19:22 |
efried | but for this specific case, I think I have a solution | 19:22 |
melwitt | it would seem like you would just take the global_request_id from the thread local request context (if there is one) and send that in the placement client calls | 19:23 |
efried | bingo | 19:23 |
efried | well, there isn't a global_request_id in the thread local request context | 19:23 |
efried | which I would like to remedy as well | 19:23 |
efried | though separately | 19:23 |
*** tesseract has quit IRC | 19:23 | |
efried | but context.global_id is a @property that gives you the global_request_id or the request_id if global isn't set | 19:23 |
melwitt | I see. yeah, would want to store that too similarly. is it not one of the normal attrs on RequestContext I guess? | 19:23 |
melwitt | oh | 19:24 |
efried | so I can set it up and it'll be an improvement, and then if we can get a real global_request_id into the threadlocal guy, it'll get better without having to change it again. | 19:24 |
efried | but what I'd like to do is be logging *and* sending across the wire a global_request_id that actually makes sense for a given operation's lifecycle. | 19:24 |
melwitt | makes sense | 19:24 |
efried | which was the whole purpose of global_request_id to start with. | 19:25 |
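The fallback behavior described for `context.global_id` can be sketched with a small stand-in class. `SketchContext` is hypothetical and is not the oslo.context `RequestContext`; only the property's fallback rule is taken from the discussion, and the `req-` prefixed UUID format is an assumption for illustration.

```python
import uuid
from typing import Optional


class SketchContext:
    """Hypothetical stand-in for a request context, for illustration only."""

    def __init__(self, request_id: Optional[str] = None,
                 global_request_id: Optional[str] = None) -> None:
        # A local request ID is always present; generate one if not given.
        self.request_id = request_id or "req-" + str(uuid.uuid4())
        # The global ID stays None unless something sets it explicitly.
        self.global_request_id = global_request_id

    @property
    def global_id(self) -> str:
        """Return global_request_id if set, else fall back to request_id."""
        return self.global_request_id or self.request_id
```

Callers that always send `global_id` therefore get the local ID today and pick up the real global ID automatically once it is populated.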
aspiers | sean-k-mooney: https://bugs.launchpad.net/devstack/+bug/1838688 | 19:27 |
openstack | Launchpad bug 1838688 in devstack "API_WORKERS default too high on machines with many CPUs" [Undecided,New] | 19:27 |
efried | melwitt: What I'd like to be the case is | 19:27 |
efried | - When an operation comes into the API, we create one RequestContext for it. We should be initializing global_request_id here, and that guy should be going in the (n-api) threadlocal for logging purposes | 19:27 |
efried | - Then whenever that n-api thread does a request across the wire, we send the global_request_id along. If over REST, via the header. If over RPC, by serializing the RequestContext itself - and then when we deserialize on the other side, *that* side should overwrite *its* threadlocal so the logs consistently log it. | 19:27 |
melwitt | efried: yeah, that makes sense. being that most of it works that way today (the creation of RequestContext on API request start and the carrying over of it over RPC), the main thing is you'd have to add global_request_id to RequestContext in general. the REST thing would be new though | 19:31 |
efried | The REST thing is easy. It's the one bit I actually understand :) | 19:31 |
*** dpawlik has joined #openstack-nova | 19:32 | |
melwitt | yeah, we already have normal request_id going across services over rpc, so you should get that part for free. just have to add global_request_id as a new field and make sure it's part of what gets de/serialized | 19:34 |
melwitt | though I'm already seeing it in oslo.context, there's a global_request_id field. so maybe we're just not setting it | 19:35 |
*** nafiux has quit IRC | 19:38 | |
melwitt | https://github.com/openstack/nova/blob/master/nova/api/auth.py#L95 which will generate a new request_id (as it's not being set) https://github.com/openstack/oslo.context/blob/master/oslo_context/context.py#L426 but global_request_id is not auto-generated | 19:40 |
efried | correct | 19:40 |
melwitt | just stays none | 19:40 |
melwitt | so you're thinking grab it out of the header and set it there? | 19:41 |
melwitt | (if present) | 19:41 |
efried | If it's present in the header, set it. | 19:42 |
efried | If it's absent from the header, I'd like to generate it as early as possible in whatever request flow we're initiating. | 19:42 |
efried | What I haven't even started digging into yet is where I would do that ^ from. | 19:42 |
efried | if it's just one place or multiple | 19:42 |
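The "use the header if present, otherwise generate early" rule could look roughly like the sketch below. The function name, the header name constant and the `req-` prefixed UUID format are assumptions for illustration; this is not the nova middleware code.

```python
import uuid
from typing import Mapping

# Assumed header name, for illustration only.
ASSUMED_HEADER = "X-OpenStack-Request-ID"


def resolve_global_request_id(headers: Mapping[str, str],
                              header_name: str = ASSUMED_HEADER) -> str:
    """Return the caller-supplied global request ID, or generate a new one."""
    wanted = header_name.lower()
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == wanted and value:
            return value
    # Nothing supplied: generate one as early as possible in the request flow.
    return "req-" + str(uuid.uuid4())
```

Doing this once where the request context is created would let every later log line and outgoing call reuse the same ID.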
melwitt | it would be there only, I think | 19:42 |
efried | nice, that would be cool. | 19:43 |
efried | I'll try shoving that change on top of this one I'm working on for the report client | 19:43 |
melwitt | there's also another place in nova/api/openstack/auth.py where you could also do it for the NoAuth middleware | 19:43 |
efried | okay | 19:43 |
melwitt | and then because of https://github.com/openstack/oslo.context/blob/master/oslo_context/context.py#L333 you'll get de/serialization over rpc for free | 19:43 |
melwitt | once you do that change, you should be able to use context.get_current().global_request_id (check for None first) to get it to send in placement client call https://github.com/openstack/oslo.context/blob/master/oslo_context/context.py#L502 | 19:47 |
efried | melwitt: Thanks, this gels with what I've been thinking. | 19:48 |
melwitt | looks like there's only one other use of it I find in the codebase https://github.com/openstack/nova/blob/master/nova/utils.py#L790 | 19:49 |
openstackgerrit | Merged openstack/nova stable/rocky: Add functional recreate test for regression bug 1825537 https://review.opendev.org/669361 | 19:49 |
openstack | bug 1825537 in OpenStack Compute (nova) rocky "finish_resize failures incorrectly revert allocations" [Medium,In progress] https://launchpad.net/bugs/1825537 - Assigned to Matt Riedemann (mriedem) | 19:49 |
openstackgerrit | Merged openstack/nova stable/rocky: Perf: Use dicts for ProviderTree roots https://review.opendev.org/670182 | 19:49 |
openstackgerrit | Merged openstack/nova stable/rocky: doc: Fix a parameter of NotificationPublisher https://review.opendev.org/670225 | 19:49 |
openstackgerrit | Merged openstack/nova stable/rocky: Stabilize unshelve notification sample tests https://review.opendev.org/669118 | 19:49 |
openstackgerrit | Merged openstack/nova stable/rocky: docs: Correct issues with 'openstack quota set' commands https://review.opendev.org/670097 | 19:49 |
*** ash2307 has joined #openstack-nova | 19:54 | |
*** amodi has joined #openstack-nova | 19:55 | |
*** ash2307 has quit IRC | 19:58 | |
*** betherly has joined #openstack-nova | 20:00 | |
*** ash2307 has joined #openstack-nova | 20:00 | |
*** eharney has quit IRC | 20:03 | |
*** betherly has quit IRC | 20:04 | |
*** Luzi has joined #openstack-nova | 20:06 | |
aspiers | any SQLAlchemy experts here? I'm wondering if it's really OK that every heartbeat has a COMMIT *and* ROLLBACK after it http://paste.openstack.org/show/755199/ | 20:19 |
aspiers | I'm seeing mysqld get an endless stream of heartbeats from conductor, and it's also writing constantly at 2--5MB/s which seems insane for a devstack cloud sitting there doing literally nothing | 20:21 |
aspiers | mysqld is writing constantly, I mean | 20:21 |
aspiers | efried: does this make any sense to you? | 20:22 |
aspiers | maybe I'm just missing something | 20:22 |
mriedem | zzzeek is the sqlalchemy expert but i'm not sure he'd have context on what nova is doing | 20:24 |
aspiers | the weird thing is it only happens on this one devstack, not my others | 20:24 |
aspiers | one question is whether rollback after every commit makes any sense | 20:24 |
efried | I wouldn't think we should be doing that for normal non-error cases, no. | 20:25 |
aspiers | another is why there are so many damn selects constantly checking the services table | 20:25 |
aspiers | surely heartbeats should only happen every few seconds at most | 20:25 |
mriedem | what's your underlying mysql library? | 20:25 |
mriedem | mysqldb? | 20:25 |
mriedem | python-mysql? | 20:25 |
aspiers | checking | 20:25 |
mriedem | https://stackoverflow.com/questions/13287749/should-i-commit-after-a-single-select | 20:25 |
aspiers | yeah I found that one | 20:26 |
aspiers | it doesn't explain select -> commit -> rollback though | 20:26 |
aspiers | only select -> commit or select -> rollback | 20:26 |
mriedem | i'm not sure why it's doing a commit at all since it's a select query not changing anything | 20:27 |
aspiers | hrm, looks like this is not related to https://bugs.launchpad.net/devstack/+bug/1838688 after all | 20:27 |
openstack | Launchpad bug 1838688 in devstack "API_WORKERS default too high on machines with many CPUs" [Undecided,New] | 20:27 |
aspiers | mriedem: that's exactly what I thought | 20:27 |
mriedem | unless the query is in an engine facade transaction context manager (oslo.db thing) and it's just adding that on automatically | 20:28 |
aspiers | unless there's some kind of locking or transactional thing to avoid race conditions, but that is clutching at straws | 20:28 |
aspiers | well yeah that would seem more likely | 20:28 |
mriedem | would need to identify where the query is coming from, but it's probably the service group API heartbeat stuff | 20:28 |
aspiers | I just had a look | 20:28 |
melwitt | looks like heartbeat goes every 10 seconds | 20:28 |
aspiers | I found the Service object and the last_updated_time, and service_is_up() / is_up() | 20:28 |
aspiers | melwitt: where is that? I couldn't find it | 20:29 |
openstackgerrit | Eric Fried proposed openstack/nova master: Correct global_request_id sent to Placement https://review.opendev.org/674129 | 20:29 |
mriedem | https://github.com/openstack/nova/blob/master/nova/servicegroup/drivers/db.py | 20:29 |
efried | melwitt: let's see how that shakes out ^ | 20:29 |
mriedem | aspiers: periodic is run here v | 20:29 |
mriedem | https://github.com/openstack/nova/blob/master/nova/servicegroup/drivers/db.py#L53 | 20:29 |
aspiers | oh yeah | 20:29 |
aspiers | thanks | 20:29 |
mriedem | https://github.com/openstack/nova/blob/master/nova/servicegroup/api.py#L47 | 20:29 |
mriedem | https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.report_interval | 20:30 |
mriedem | so yeah you'll see that every 10 seconds per service | 20:30 |
mriedem | here is the select i think https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py#L550 | 20:31 |
melwitt | and every heartbeat will do a write to the database | 20:31 |
mriedem | each report interval saves the report_count increment | 20:31 |
mriedem | and that service_update does a select first | 20:31 |
aspiers | I see no writes in general_log though | 20:31 |
aspiers | only selects | 20:31 |
aspiers | like the paste above | 20:31 |
melwitt | which conductor is this? super conductor or cell conductor | 20:32 |
aspiers | both I think | 20:32 |
aspiers | let me check | 20:32 |
aspiers | originally I thought this was due to having an insane number of workers | 20:32 |
aspiers | but my other SEV boxen have lots and their mysqld is sitting there doing nothing | 20:33 |
aspiers | this one is hovering between 2MB/s and 15MB/s total disk write | 20:33 |
aspiers | and it's a devstack with 0 users :-o | 20:34 |
melwitt | fwiw, it's a well-known operational problem for computes to be hammering the database with heartbeats. that's why long ago yahoo was pushing for fixes to the zookeeper servicegroup driver, since that uses a passive "watch" concept where when the node you're watching goes down, it calls you. IIRC | 20:34 |
melwitt | but this sounds like something new and out of the ordinary? I dunno | 20:35 |
aspiers | why does it have to save the report_count increment? | 20:35 |
aspiers | and shouldn't just the parent be heartbeating, not the workers? | 20:35 |
melwitt | the main point is to update the service record, so that looking at its update time shows you whether it's "down" | 20:35 |
melwitt | something like that | 20:36 |
melwitt | I haven't looked at this in a long time | 20:36 |
aspiers | ah OK, I'm confusing two things | 20:36 |
aspiers | 1) a service heartbeating to say it's alive | 20:36 |
aspiers | 2) a service checking the heartbeat of another service | 20:37 |
aspiers | the SQL spam I'm seeing has to be 2) I think | 20:37 |
aspiers | because it's only doing SELECTs | 20:37 |
aspiers | not updates | 20:37 |
aspiers | as in http://paste.openstack.org/show/755199/ | 20:37 |
melwitt | ok. sorry I was not up to date on what you were talking about | 20:37 |
aspiers | no totally my fault since I was not clear in my head until just now | 20:38 |
aspiers | IIUC the join() mriedem pointed out is 1) not 2) | 20:38 |
melwitt | so the question is, what is checking for is_up if no requests are happening | 20:38 |
aspiers | exactly | 20:38 |
melwitt | *the question that you are asking | 20:38 |
melwitt | yeah, I think the join() is 1) | 20:38 |
aspiers | also, does DbDriver.is_up() really read the db? | 20:38 |
melwitt | yes, I think so | 20:39 |
aspiers | I only see it checking the Service object's last_seen_up | 20:39 |
melwitt | yeah... tracing that | 20:39 |
mriedem | no it doesn't | 20:39 |
mriedem | _report_state is what runs in the cron and updates the report_count, | 20:39 |
mriedem | which down in the db api changes last_seen_up | 20:39 |
mriedem | which is what is_up checks | 20:39 |
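The check described above, comparing an already-loaded `last_seen_up` timestamp against a threshold without issuing any query, can be sketched as follows. This is a simplified illustration rather than the DbDriver code; the 60-second `service_down_time` default is an assumption here.

```python
from datetime import datetime, timedelta


def is_up(last_seen_up: datetime, now: datetime,
          service_down_time: int = 60) -> bool:
    """Report a service as up if its last heartbeat is recent enough.

    Works purely on the timestamp carried by the service object, so no
    SELECT is needed at check time.
    """
    elapsed = abs((now - last_seen_up).total_seconds())
    return elapsed <= service_down_time
```

Since the check itself is in-memory, any SELECT traffic would have to come from whatever loads the service objects in the first place.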
aspiers | so where the **** are these selects coming from? | 20:39 |
melwitt | ok, my mistake | 20:40 |
*** spatel has quit IRC | 20:41 | |
mriedem | there could be something hitting a service version check | 20:41 |
mriedem | and not caching it | 20:41 |
mriedem | try turning this on? https://docs.openstack.org/nova/latest/configuration/config.html#database.connection_trace | 20:41 |
aspiers | whoa cool! | 20:42 |
melwitt | well, ok, is_up(service) tells you whether the service is "down" but you have to pass it a service object, which you most of the time got from a db query. this is again, only during requests though | 20:43 |
aspiers | oh I guess I need connection_debug too | 20:44 |
mriedem | do any of the services in any of the dbs have version=0? | 20:44 |
aspiers | mriedem: checking | 20:44 |
mriedem | because if so, i think you could hit a case where every service save checks the min version, finds 0 and decides not to cache it | 20:44 |
mriedem | and that might explain why you'd have one busted devstack and not others | 20:45 |
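
The failure mode mriedem suspects here can be illustrated with a small sketch: a minimum-version cache that refuses to store 0 will fall through to the database on every lookup for as long as any service reports version 0. The `MinVersionCache` class, its `get_minimum_version()` method and the `queries` counter are hypothetical names invented for the example, and the code is not taken from nova.

```python
class MinVersionCache:
    """Hypothetical cache of the minimum service version per binary."""

    def __init__(self, fetch_versions):
        # fetch_versions(binary) stands in for the database query and
        # returns the list of service versions for that binary.
        self._fetch_versions = fetch_versions
        self._cache = {}
        self.queries = 0  # counts simulated database reads

    def get_minimum_version(self, binary):
        if binary in self._cache:
            return self._cache[binary]
        self.queries += 1
        versions = self._fetch_versions(binary)
        version = min(versions) if versions else 0
        # Assumption for the sketch: 0 means "unknown", so it is never
        # cached and the next call queries again.
        if version:
            self._cache[binary] = version
        return version


healthy = MinVersionCache(lambda binary: [38, 38])
stale = MinVersionCache(lambda binary: [0, 38])
for _ in range(5):
    healthy.get_minimum_version("nova-compute")
    stale.get_minimum_version("nova-compute")
print(healthy.queries, stale.queries)
```

With every service at the same non-zero version, the lookup hits the database once and is then served from the cache; with a single version-0 row, every call turns into a fresh SELECT.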
aspiers | they're all version 38 AFAICS | 20:45 |
mriedem | in nova_cell1? | 20:45 |
aspiers | both cells | 20:45 |
mriedem | well then i guess you're going connection trace debugging | 20:46 |
aspiers | which is good because dumbass here didn't even know about connection_debug until just now | 20:46 |
aspiers | what's a good value to start with? | 20:46 |
aspiers | hah I guess I'll find out | 20:46 |
mriedem | ...true? | 20:46 |
aspiers | it's an int | 20:46 |
aspiers | 100=everything | 20:46 |
aspiers | 0=none | 20:46 |
mriedem | https://docs.openstack.org/nova/latest/configuration/config.html#database.connection_trace is a bool | 20:46 |
aspiers | connection_debug not _trace | 20:47 |
mriedem | sure, | 20:47 |
mriedem | but you want to know where it's coming from right? | 20:47 |
aspiers | I'm assuming I need _debug on for _trace to have effect? | 20:47 |
mriedem | idk | 20:47 |
aspiers | I'll try :) | 20:47 |
mriedem | if it turns out it's all shitty to configure, then we should update help on those options | 20:47 |
aspiers | before I'd just told mysqld set global general_log=1; | 20:47 |
aspiers | and then it dumps to /var/lib/mysql/devstack.log | 20:48 |
aspiers | but it doesn't give any context, just the internal mysqld process ids | 20:48 |
aspiers | which you then have to map back to normal pids via mysqladmin processlist and lsof -i tcp:$port | 20:48 |
aspiers | this way will be much better I guess | 20:48 |
efried | Nova meeting in 12 minutes in #openstack-meeting | 20:49 |
mriedem | i can't even keep services up and running in my devstack, with API_WORKERS=1 | 20:49 |
mriedem | things just randomly fall over | 20:49 |
aspiers | lol | 20:49 |
aspiers | oh, on that note | 20:49 |
aspiers | turns out I wasn't having the same issue as you | 20:49 |
aspiers | it was a hardware thing | 20:49 |
aspiers | random segfaults all over the place | 20:49 |
aspiers | I guess when you have 512GB RAM the chances of some of it being bad are quite high | 20:49 |
mriedem | i just get, | 20:50 |
mriedem | Aug 01 20:36:05 devstack systemd[1]: devstack@n-sch.service: Main process exited, code=dumped, status=11/SEGV | 20:50 |
mriedem | Aug 01 20:36:05 devstack systemd[1]: devstack@n-sch.service: Failed with result 'core-dump'. | 20:50 |
aspiers | yeah that's what I was getting | 20:50 |
aspiers | but it affected non-python stuff too | 20:50 |
aspiers | and weird kernel oops etc. | 20:50 |
aspiers | and random system hangs | 20:50 |
mriedem | this is an 8 VCPU, 8 GB RAM, 200 GB disk vm and i've never had this kind of persistent problem with a devstack vm that is basically configured like the gate nodes | 20:50 |
*** takashin has joined #openstack-nova | 20:50 | |
aspiers | well it could be a hw issue for you too I guess | 20:51 |
aspiers | although that would be a slightly weird coincidence | 20:51 |
mriedem | this image is a bit old (18.04 LTS from last year) so i probably should have updated the os | 20:51 |
aspiers | maybe do a memory check? | 20:51 |
aspiers | or install the python gdb extensions and do a coredumpctl gdb and py-bt | 20:51 |
mriedem | $ free -h | 20:51 |
mriedem | total used free shared buff/cache available | 20:51 |
mriedem | Mem: 7.8G 3.4G 747M 3.2M 3.7G 4.1G | 20:51 |
mriedem | Swap: 0B 0B 0B | 20:51 |
aspiers | I meant a memory hardware soak test | 20:52 |
aspiers | like those ones you can do from a tiny boot image | 20:52 |
aspiers | alternatively the python gdb extension sounds like a great way to pin it down, I was about to try that just as I realised it must be a h/w issue | 20:53 |
aspiers | OK now I'm getting useful info | 20:57 |
aspiers | it's coming from service_update() in db/api.py | 20:58 |
*** betherly has joined #openstack-nova | 20:59 | |
aspiers | but the trace is only 3 layers deep | 20:59 |
aspiers | I can add debug | 20:59 |
mriedem | yeah it's doing a query of the service on each save | 21:00 |
efried | nova meeting now | 21:01 |
mriedem | but you need to find what's calling Service.save | 21:02 |
melwitt | isn't it the heartbeats? | 21:02 |
melwitt | _report_state | 21:02 |
mriedem | he said he wasn't seeing writes/updates, | 21:02 |
mriedem | which should happen if report_count changes | 21:02 |
melwitt | ok, I thought that's what "it's coming from the service_update" meant | 21:03 |
melwitt | nevermind then | 21:03 |
*** betherly has quit IRC | 21:03 | |
mriedem | btw, watching top in devstack cinder-volume seems to be pretty cpu heavy at idle | 21:06 |
mriedem | well, not a high % but always running | 21:06 |
*** nafiux has joined #openstack-nova | 21:13 | |
*** betherly has joined #openstack-nova | 21:19 | |
*** betherly has quit IRC | 21:24 | |
openstackgerrit | Merged openstack/nova stable/rocky: Revert resize: wait for events according to hybrid plug https://review.opendev.org/670648 | 21:31 |
openstackgerrit | Merged openstack/nova stable/rocky: libvirt: move checking CONF.my_ip to init_host() https://review.opendev.org/672155 | 21:31 |
openstackgerrit | Merged openstack/nova stable/rocky: Fix type error on call to mount device https://review.opendev.org/669664 | 21:31 |
openstackgerrit | Merged openstack/nova stable/rocky: Avoid crashing while getting libvirt capabilities with unknown arch names https://review.opendev.org/672746 | 21:31 |
openstackgerrit | Igor D.C. proposed openstack/nova master: Libvirt: add nfv job https://review.opendev.org/652197 | 21:35 |
*** artom has quit IRC | 21:39 | |
*** Luzi has quit IRC | 21:46 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (13) https://review.opendev.org/576020 | 21:46 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (14) https://review.opendev.org/576027 | 21:46 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (15) https://review.opendev.org/576031 | 21:46 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (16) https://review.opendev.org/576299 | 21:46 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (17) https://review.opendev.org/576344 | 21:47 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (18) https://review.opendev.org/576673 | 21:48 |
mriedem | melwitt: efried: posted https://blueprints.launchpad.net/nova/+spec/policy-rule-for-host-status-unknown | 21:49 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (19) https://review.opendev.org/576676 | 21:49 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (20) https://review.opendev.org/576689 | 21:49 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (21) https://review.opendev.org/576709 | 21:49 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (22) https://review.opendev.org/576712 | 21:49 |
*** betherly has joined #openstack-nova | 21:50 | |
melwitt | mriedem: thank you | 21:50 |
mriedem | and with that i'm going to put on my lawn mowing outfit and hit the yard https://i.ytimg.com/vi/DOwqfxkx6mE/maxresdefault.jpg | 21:52 |
mriedem | o/ | 21:52 |
*** mriedem has quit IRC | 21:53 | |
mnaser | i left a review on that spec | 21:55 |
*** betherly has quit IRC | 21:55 | |
* mnaser has a clusterf' of reviews and gerrit emails disabled so pings welcome D: | 21:55 | |
*** slaweq has quit IRC | 22:10 | |
*** betherly has joined #openstack-nova | 22:11 | |
*** slaweq has joined #openstack-nova | 22:11 | |
openstackgerrit | Eric Fried proposed openstack/nova master: WIP: Always set a global_request_id in RequestContext https://review.opendev.org/674138 | 22:12 |
*** betherly has quit IRC | 22:15 | |
*** slaweq has quit IRC | 22:16 | |
*** BjoernT_ has quit IRC | 22:17 | |
*** jbernard_ has joined #openstack-nova | 22:29 | |
*** fyx_ has joined #openstack-nova | 22:29 | |
*** dustinc_ has joined #openstack-nova | 22:29 | |
*** mnasiadka_ has joined #openstack-nova | 22:30 | |
*** jrosser_ has joined #openstack-nova | 22:30 | |
*** kmalloc_ has joined #openstack-nova | 22:30 | |
*** betherly has joined #openstack-nova | 22:31 | |
*** Ben78 has joined #openstack-nova | 22:34 | |
Ben78 | Can a user create a VM and a new volume with a single curl command? | 22:35 |
*** betherly has quit IRC | 22:36 | |
*** fyx has quit IRC | 22:37 | |
*** jbernard has quit IRC | 22:37 | |
*** mordred has quit IRC | 22:37 | |
*** mnasiadka has quit IRC | 22:37 | |
*** kmalloc has quit IRC | 22:37 | |
*** dustinc has quit IRC | 22:37 | |
*** jrosser has quit IRC | 22:37 | |
*** mnasiadka_ is now known as mnasiadka | 22:37 | |
*** kmalloc_ is now known as kmalloc | 22:37 | |
*** fyx_ is now known as fyx | 22:37 | |
*** dustinc_ is now known as dustinc | 22:37 | |
*** jrosser_ is now known as jrosser | 22:37 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Add database schema upgrade check https://review.opendev.org/667047 | 22:39 |
*** panda has quit IRC | 22:41 | |
*** panda has joined #openstack-nova | 22:42 | |
*** mordred has joined #openstack-nova | 22:44 | |
*** mlavalle has quit IRC | 22:45 | |
*** tkajinam has joined #openstack-nova | 22:51 | |
*** whoami-rajat has quit IRC | 22:55 | |
*** nafiux has quit IRC | 23:00 | |
*** Ben78 has quit IRC | 23:02 | |
*** ivve has quit IRC | 23:05 | |
*** bbowen has quit IRC | 23:06 | |
*** nafiux has joined #openstack-nova | 23:06 | |
*** igordc has quit IRC | 23:10 | |
*** betherly has joined #openstack-nova | 23:12 | |
*** betherly has quit IRC | 23:18 | |
*** rcernin has joined #openstack-nova | 23:21 | |
*** artom has joined #openstack-nova | 23:52 | |
*** betherly has joined #openstack-nova | 23:54 | |
*** trident has quit IRC | 23:54 | |
*** betherly has quit IRC | 23:58 | |
*** trident has joined #openstack-nova | 23:59 |