| wilkmarcin | Hello Watcher Team! I'd appreciate it if someone could take a final look at this tiny patch: https://review.opendev.org/c/openstack/watcher/+/973839 | 09:15 |
|---|---|---|
| wilkmarcin | thank you in advance! | 09:15 |
| opendevreview | Merged openstack/watcher master: Add note about using long period with continuous audits https://review.opendev.org/c/openstack/watcher/+/973839 | 11:27 |
| morenod | #startmeeting Watcher | 12:00 |
| opendevmeet | Meeting started Thu Feb 5 12:00:14 2026 UTC and is due to finish in 60 minutes. The chair is morenod. Information about MeetBot at http://wiki.debian.org/MeetBot. | 12:00 |
| opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 12:00 |
| opendevmeet | The meeting name has been set to 'watcher' | 12:00 |
| morenod | welcome to the weekly meeting of watcher, who is around today? | 12:00 |
| jgilaber | o/ | 12:00 |
| amoralej_ | o/ | 12:00 |
| dviroel | o/\ | 12:01 |
| rlandy | o/ | 12:02 |
| morenod | courtesy ping list: sean-k-mooney chandankumar | 12:02 |
| sean-k-mooney | o/ | 12:02 |
| morenod | let's get started with the today's meeting agenda | 12:03 |
| morenod | #link https://etherpad.opendev.org/p/openstack-watcher-irc-meeting#L40 | 12:03 |
| morenod | #topic (dviroel) Applier to run Taskflow parallel engine with native threads (eventlet-removal) | 12:03 |
| morenod | please dviroel | 12:03 |
| dviroel | thanks morenod | 12:03 |
| dviroel | recently i have been working on investigating this issue/limitation | 12:04 |
| dviroel | with taskflow with parallel engine + native threads failing in unit tests... | 12:04 |
| dviroel | the issue mentioned in the LP, happens in a few unit tests | 12:04 |
| dviroel | (sqlite3.OperationalError) cannot start a transaction inside another transaction | 12:05 |
| dviroel | I have been looking into our db integration and how we setup our tests | 12:05 |
| dviroel | also chatting with some folks about these issues | 12:05 |
| dviroel | the transiction error in the end is happenning due to the use of in-memory sqlite database | 12:06 |
| dviroel | which works with a single connection | 12:06 |
| dviroel | #link https://www.sqlite.org/inmemorydb.html | 12:07 |
| dviroel | there is a mention about that in the docs, there is also mention to use a shared cache for multiple connections | 12:07 |
| sean-k-mooney | thats apossiblity | 12:07 |
| dviroel | i tried a bit the shared connection, but I didn't make it work yet | 12:07 |
| sean-k-mooney | sqlite does have transacation suprpot but perhaps that requried the file backend | 12:07 |
| dviroel | yeah | 12:08 |
| sean-k-mooney | that a pretty simple change to the test fixture | 12:08 |
| dviroel | next thing to try was to use a disk file instead on memory | 12:08 |
| sean-k-mooney | right nova or placment i think does that already | 12:08 |
| dviroel | it solved the transaction error and tests that were failing before start to pass | 12:08 |
| sean-k-mooney | cool | 12:08 |
| dviroel | but not all of them | 12:08 |
| dviroel | :) | 12:08 |
| dviroel | some tests started to fail due to concurrency, with a new error: | 12:08 |
| sean-k-mooney | still progress is good | 12:08 |
| dviroel | "(sqlite3.OperationalError) database is locked" | 12:09 |
| dviroel | before giving up | 12:09 |
| dviroel | I tested additional configuration like setting journal mode to WAL, and increasing timeouts, but still without success | 12:09 |
| dviroel | the concurrency with SQLite database sometimes gives database is locked | 12:09 |
| sean-k-mooney | hum do you have a patch up to show this | 12:09 |
| dviroel | correct | 12:10 |
| amoralej_ | so, the good news is that the issue is actually only in CI, not real issues, right? | 12:10 |
| dviroel | #link https://4bb852441388777a25ca-c1d45ef7eb040e2d7e76303651507120.ssl.cf1.rackcdn.com/openstack/895e46290cf94805a28d87a3f5e6042b/testr_results.html | 12:10 |
| sean-k-mooney | amoralej_: i dont think we can assume that | 12:10 |
| dviroel | amoralej_: in the end, it is about the database that we use yes | 12:10 |
| sean-k-mooney | but if we run the same test with the mysql opertunsitc test fixture and they pass | 12:10 |
| dviroel | we can still continue to investigate these issues when running with SQLite, but it seems that SQLite is really bad for concurrency in the end | 12:10 |
| sean-k-mooney | then maybe | 12:11 |
| dviroel | Mike Bayer already antecipated me about SQLite concurrency problems .. | 12:11 |
| dviroel | and then, so I tested some classes with MySQL and the issue does not reproduce, and the failing tests are now passing | 12:11 |
| sean-k-mooney | right so im not sure we want to only run them with the opertunistic test fixture | 12:12 |
| amoralej_ | would that be an option instead of sqlite ? | 12:12 |
| sean-k-mooney | but that would be an option just most folks unless they are running the unti tests in there devstack env wont have set that up | 12:12 |
| dviroel | yeah, so this also brings a discussion on what this units tests are really trying to test | 12:12 |
| amoralej_ | for the entire tests suite, i mean | 12:12 |
| sean-k-mooney | amoralej_: we already haave the ablity to use mariadb/mysql in our tests | 12:13 |
| amoralej_ | yeah, i remember we do it for specific db tests | 12:13 |
| dviroel | migration tests already use MySQL right | 12:13 |
| sean-k-mooney | amoralej_: im -2 on requireing that in general (a non sqlite backend) | 12:13 |
| jgilaber | yes, migration tests and some specific tests related to a column of the efficacy indicators | 12:13 |
| sean-k-mooney | that would be a major testing reguression in my view | 12:13 |
| jgilaber | this is the base class https://github.com/openstack/watcher/blob/3ba748f798f982f35c9fbbe154505f035220a493/watcher/tests/unit/db/base.py#L72 | 12:13 |
| sean-k-mooney | dviroel: i aggree however that reassisng what coverage the test actully provide makes sense | 12:14 |
| jgilaber | and here is the use case https://github.com/openstack/watcher/blob/master/watcher/tests/unit/db/test_efficacy_indicator.py#L414 | 12:14 |
| sean-k-mooney | most unit test shoudl not interact with the db at all | 12:14 |
| dviroel | so in the end, our unit tests are testing more things that they should | 12:14 |
| sean-k-mooney | functional test sure but unit test unless they are testign the db layer should not | 12:15 |
| amoralej_ | so, we could limit using mysql fixtures to only the tests which require proper concurrency | 12:15 |
| amoralej_ | yeah, or change our unit tests to not require it ... | 12:15 |
| sean-k-mooney | amoralej_: that or we coudl maybe use a sepetea sqlite db | 12:15 |
| sean-k-mooney | for apscheduler and taskflow vs watcher | 12:15 |
| sean-k-mooney | so the test in question are unit test under watcher.tests.unit.applier.workflow_engine.test_default_workflow_engine.TestDefaultWorkFlowEngine | 12:17 |
| sean-k-mooney | i woudl agure that they shoudl not need db access in general based on the area of code its intended to test | 12:17 |
| dviroel | correct | 12:17 |
| dviroel | so our option are: | 12:18 |
| dviroel | 1 - rewrite these tests to avoid database writes | 12:18 |
| dviroel | 2 - work more to get sqlite db file to work with more concurrent writes | 12:19 |
| dviroel | 3 - replace sqlite in the affected (or all) tests | 12:19 |
| dviroel | any other? | 12:20 |
| sean-k-mooney | possible split the connetions between difent databases in the test fixture | 12:20 |
| sean-k-mooney | we have 2 diffent db schemas we have our one and then aspschduelr has its own schmea for its internal schduling | 12:21 |
| sean-k-mooney | if the file lock is caused by those conflicting using diffent files for both and as a result diffent connection might help | 12:21 |
| dviroel | i don't think that is caused by that, but between the action in parallel trying to read/write updates about actions | 12:22 |
| sean-k-mooney | well that and we might want to lock into properly treadign the context object to all db calls | 12:23 |
| sean-k-mooney | so https://review.opendev.org/c/openstack/watcher/+/966759 | 12:23 |
| sean-k-mooney | i woudl be interested to see if that helps with the file backed sqlite | 12:24 |
| dviroel | we have a problem with that too | 12:24 |
| dviroel | the request context is currently shared by all actions | 12:24 |
| sean-k-mooney | nova uses sqlite for its functional tests as well and there we have many diffent deams running in sepeate threads | 12:24 |
| sean-k-mooney | right it shoudl not be | 12:25 |
| amoralej_ | nova uses file backed sqlite? | 12:25 |
| sean-k-mooney | that why we need to create and pass it down the call stack instead of using a module gloabl or thread local | 12:25 |
| sean-k-mooney | amoralej_: i belive so | 12:25 |
| sean-k-mooney | ill check | 12:25 |
| amoralej_ | so, if that works, i understand that'd be the easiest and less intrusive approach | 12:26 |
| dviroel | but today thread local already guarantees a different connection for each thread too | 12:26 |
| dviroel | and we still hit the issue | 12:26 |
| amoralej_ | I'd be in favor of rewriting test but I'm afraid of we may miss valid issues in future, specially until we have functional tests | 12:27 |
| sean-k-mooney | https://github.com/openstack/nova/blob/master/nova/tests/fixtures/nova.py#L697-L700 so nova is using a fixture (with teasks) form sqlachmy | 12:28 |
| sean-k-mooney | sorry form oslo.db | 12:29 |
| sean-k-mooney | so if its file based epend on the connection string we pass | 12:30 |
| sean-k-mooney | ill look that up sepratly but dviroel tested file based and it was not enough correct | 12:31 |
| sean-k-mooney | we coudl compare the delta however and see if there is soemthign we missed | 12:31 |
| dviroel | I can spend so more time investigating the sqlite | 12:32 |
| dviroel | using file based | 12:32 |
| dviroel | option 2) | 12:32 |
| dviroel | and provide more feedback next week | 12:32 |
| dviroel | we can continue discussing about this during the week, in the channel too | 12:34 |
| dviroel | agree? | 12:34 |
| amoralej_ | lgtm | 12:34 |
| jgilaber | +1 | 12:34 |
| dviroel | i think that sean-k-mooney is looking at the code right now :) | 12:35 |
| dviroel | ok, but i already consumed half of the meeting, and we can continue async later | 12:35 |
| sean-k-mooney | sure and yes | 12:35 |
| morenod | ok, lets move to the next topic | 12:35 |
| morenod | #topic (amoralej) https://blueprints.launchpad.net/watcher/+spec/osprofiler-in-watcher | 12:35 |
| morenod | please alfredo | 12:36 |
| sean-k-mooney | i just assed claude to cofnirm | 12:36 |
| sean-k-mooney | apprently we are usign in memory https://github.com/openstack/nova/blob/master/nova/tests/fixtures/conf.py#L37-L44 | 12:37 |
| sean-k-mooney | but we are also seting sqlite_synchronous to false | 12:37 |
| amoralej_ | we discussed https://blueprints.launchpad.net/watcher/+spec/osprofiler-in-watcher some weeks ago, i fixed the last sentence about how to add the wsgi middleware which i've been testing locally | 12:38 |
| amoralej_ | is it still possible to get it approved in this cycle? i have been working in implementation in parallel, but i missed that was actually not approved yet | 12:39 |
| sean-k-mooney | possible yes we could allow a spec freeze excption but i dont think we shoudl | 12:40 |
| sean-k-mooney | 2026.2 opens properly in 4? weeks | 12:40 |
| sean-k-mooney | so why woudl we not wait for the next cycle? | 12:40 |
| amoralej_ | note, it is specless blueprint | 12:40 |
| sean-k-mooney | right it has the same dealine | 12:40 |
| amoralej_ | i wanted to start sending patches already, which i guess i still can do anyway | 12:41 |
| sean-k-mooney | yep you can | 12:41 |
| dviroel | yeah, if you already implemented, just send | 12:41 |
| amoralej_ | ack, I will then | 12:41 |
| sean-k-mooney | i dont really have an issue with reviewing and approving the blueprint for the next cycle and starting on the work | 12:41 |
| dviroel | we can merge as soon 2026.2 starts then | 12:41 |
| dviroel | if ready | 12:42 |
| sean-k-mooney | yep | 12:42 |
| amoralej_ | ok, I will send the patches as soon as i have them ready then | 12:42 |
| sean-k-mooney | we are not quite at feature freeze but it would be good if we started shifting focus to completign the planned work | 12:42 |
| amoralej_ | that was part of the planned work in PTG for this cycle, that's why i wanted to have it approved in the cycle | 12:43 |
| dviroel | ack | 12:43 |
| sean-k-mooney | i created https://launchpad.net/watcher/2026.2 | 12:43 |
| sean-k-mooney | so fi we want we can target it to that | 12:43 |
| sean-k-mooney | my question woudl be are we happy to consier this a specless blueprint and are we ok with its curernt content | 12:44 |
| sean-k-mooney | i dont partically think adding osprofiler supprot is contoversial although it shoudl be off by default | 12:45 |
| sean-k-mooney | give we do not use api-paste.ini that woudl meean we woudl need a cofnig option to opt into it | 12:45 |
| sean-k-mooney | well ok | 12:46 |
| sean-k-mooney | nova has it in its defalt middleware pipeline | 12:46 |
| amoralej_ | about spec, i think that's what we discussed as not needed in the past | 12:46 |
| dviroel | i think that we agreed in the past to be a specless blueprint already, yeah | 12:46 |
| sean-k-mooney | amoralej_: do you know if os-profile has its onw opt in mechanium | 12:46 |
| amoralej_ | about the middleware, it will be added only if osprofiler is available and enabled | 12:46 |
| sean-k-mooney | amoralej_: ack so osprofiel has its own way to be enabled | 12:47 |
| sean-k-mooney | im not sure we woudl want this on in production by default for example so i jsut want to make sure if we hardcode teh includion in the middleware it does not create a perfomance regression | 12:48 |
| dviroel | amoralej_: is there also some profiling for sqlalchemy too? | 12:48 |
| amoralej_ | osprofiler has it's own parameters | 12:48 |
| amoralej_ | yes, i know there is the option, but i didn't play with it tbh | 12:48 |
| sean-k-mooney | in devstack we also enabel a db profiling sqlachcmy plugin form oslo.db i beilve | 12:49 |
| dviroel | right, i see that manila has osprofiler_sqlalchemy | 12:49 |
| amoralej_ | https://paste.centos.org/view/d2cf5ce1 that's what i've used | 12:49 |
| amoralej_ | so it's only added to the pipeline if the module is installed and the option enabled (by default is disabled, that comes from osprofiler opts) | 12:49 |
| sean-k-mooney | ack, so importing the midelwaure has teh sideffect fo registrign the config options? | 12:50 |
| amoralej_ | not exactly, but i think it's better to do technical discussions in specs or in irc, we are almost out of time :) | 12:51 |
| amoralej_ | registering config options is automatic if the module is installed | 12:51 |
| amoralej_ | i'm following similar pattern as other projects in that | 12:52 |
| sean-k-mooney | ack in that case do we want to follow up next week | 12:53 |
| sean-k-mooney | or approve https://blueprints.launchpad.net/watcher/+spec/osprofiler-in-watcher for 2026.2 | 12:53 |
| dviroel | ack, we can continue next week | 12:53 |
| sean-k-mooney | i dont think this need a spec but i do think we need a doc page for how to confirue and use it | 12:53 |
| dviroel | +1 | 12:53 |
| amoralej_ | sure | 12:53 |
| dviroel | which i expect that other projects also have? | 12:53 |
| dviroel | :) | 12:53 |
| sean-k-mooney | it should also be an optional dep meaning it shoudl not be in requriements.txt bug can be in test-requrieemtnes.txt | 12:54 |
| sean-k-mooney | dviroel: at least zaaqar has | 12:54 |
| sean-k-mooney | its the first hit on google | 12:54 |
| dviroel | ack | 12:54 |
| sean-k-mooney | https://docs.openstack.org/zaqar/latest/admin/OSprofiler.html | 12:54 |
| amoralej_ | yes, that's my plan too, being just optional | 12:54 |
| morenod | lets move to the reviews | 12:55 |
| dviroel | yes | 12:55 |
| morenod | #topic reviews | 12:55 |
| morenod | #link https://review.opendev.org/c/openstack/watcher/+/968568 | 12:55 |
| dviroel | this patch removes the dependency on eventlet.Timeout for cluster data model (CDM) collectors and implements a different timeout | 12:56 |
| dviroel | I would like to get this patch merged, but I think that we need to agree in the new timeout configuration option. | 12:56 |
| dviroel | It was raised a concern about its minimal supported value, if we should support 0 as value for instance | 12:56 |
| dviroel | in my tests, 0 means return immediately | 12:56 |
| dviroel | which means also that there is no option to disable this timeout | 12:57 |
| sean-k-mooney | i think adding a min value makes sesne jus tnot sure what the value should be | 12:57 |
| sean-k-mooney | so we could supprot -1 for disabeld | 12:58 |
| sean-k-mooney | however | 12:58 |
| sean-k-mooney | the current way to disable collectors is to not list them in the list of collectors right? | 12:58 |
| dviroel | we could indeed | 12:58 |
| dviroel | hum, i don't really remember now... | 12:59 |
| sean-k-mooney | my main issue actully is i dont think we shoudl be defining cofnig option in https://review.opendev.org/c/openstack/watcher/+/968568/6/watcher/decision_engine/model/collector/nova.py | 12:59 |
| sean-k-mooney | config option shoudl exlusively be defiend under https://github.com/openstack/watcher/tree/master/watcher/conf | 12:59 |
| dviroel | ah correct | 13:00 |
| sean-k-mooney | so it shoudl be here right https://github.com/openstack/watcher/blob/master/watcher/conf/nova.py | 13:00 |
| sean-k-mooney | or here https://github.com/openstack/watcher/blob/master/watcher/conf/collector.py | 13:00 |
| dviroel | i think that I already discussed that with amoralej_ to in the patch | 13:01 |
| sean-k-mooney | collector_plugins is the way to disabel collectors today so grouping it there might make sense | 13:01 |
| dviroel | it can, but others colletors don't use them | 13:01 |
| dviroel | i mean | 13:01 |
| dviroel | other collectors don't use the timeout config in this patch | 13:01 |
| sean-k-mooney | right hey probaly shoudl evenutlaly no? | 13:02 |
| sean-k-mooney | im fine with per collector values | 13:02 |
| dviroel | they don't start additional threads | 13:02 |
| sean-k-mooney | but in that case nova.py might make sense | 13:02 |
| dviroel | the period in under collector too | 13:03 |
| dviroel | not nova or others | 13:03 |
| dviroel | like this https://docs.openstack.org/watcher/latest/configuration/watcher.html#watcher-cluster-data-model-collectors-compute | 13:03 |
| dviroel | so it will end there too | 13:03 |
| sean-k-mooney | ya so tome this is more a code mantaicne thing | 13:04 |
| dviroel | since we are out of time, i think that we can discuss more in the channel or in the patch | 13:04 |
| sean-k-mooney | we spent a lot of work centralising config optin so that they are nto spread acoss impleaiton files | 13:04 |
| sean-k-mooney | i guess i dont midn where under watcher/conf it lives as long as it is there | 13:05 |
| dviroel | morenod: i can chair next meeting btw | 13:05 |
| morenod | thanks dviroel | 13:05 |
| morenod | we have a bug, but I think we will cover next week, as we are out of time | 13:05 |
| dviroel | +1 | 13:06 |
| jgilaber | I just wanted to mention that the series of patches to move novaclient usage to openstacksdk is up for review | 13:06 |
| jgilaber | since the patches are big, I wanted to give notice with enough time before FF which is in 3 weeks | 13:06 |
| morenod | sorry jgilaber, I missed that | 13:07 |
| morenod | #link https://review.opendev.org/c/openstack/watcher/+/972913 | 13:07 |
| jgilaber | no problem, just a quick mention is enough | 13:07 |
| morenod | lets finish here then, thank you all for joining the meetin | 13:08 |
| morenod | #endmeeting | 13:08 |
| opendevmeet | Meeting ended Thu Feb 5 13:08:18 2026 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 13:08 |
| opendevmeet | Minutes: https://meetings.opendev.org/meetings/watcher/2026/watcher.2026-02-05-12.00.html | 13:08 |
| opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/watcher/2026/watcher.2026-02-05-12.00.txt | 13:08 |
| opendevmeet | Log: https://meetings.opendev.org/meetings/watcher/2026/watcher.2026-02-05-12.00.log.html | 13:08 |
| opendevreview | Takashi Kajinami proposed openstack/watcher master: Deprecate unused [DEFAULT] bindir https://review.opendev.org/c/openstack/watcher/+/975808 | 15:06 |
| opendevreview | Takashi Kajinami proposed openstack/watcher master: devstack: Remove ineffective ssl options https://review.opendev.org/c/openstack/watcher/+/975809 | 15:18 |
| *** haleyb is now known as haleyb|out | 22:21 | |
Generated by irclog2html.py 4.0.0 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!