[tts] Cache mechanism #3057

dalgwen · 2022-08-08T15:25:20Z

Implements a cache mechanism for all TTS services.

Reason :

Online TTS services can be costly, and reducing calls to the cloud is always good.
It will also improve user experience (less latency for local services).
Amazon Polly TTS and Google TTS both implement their own mechanism, and I thought it would be worthwhile to share a common code base (for other services as well).

Functional specification :

The eviction policy is LRU.
Cache size is a voice bundle parameter (10 MB by default).
You can enable or disable this, system wide, ~~or by TTS service~~ Each TTS service has to enable this.
Default mode is cache enabled. ~~I'm wondering if it should be switched to default off, at least for the beginning, what do you think ?~~ (Activated by the TTSService, so it's opt-in for the dev)
It doesn't wait for the stream to end and can serve data as soon as a small chunk (10 kB) is available.
A side-effect functionality: this cache can serve several streams concurrently, for the same utterance, with only one call to the TTS.
Another side effect: it transparently provides, through a wrapper, the benefit of a FixedLengthAudioStream for playing on sinks that require it (such as all those based on the openHAB audio servlet).

Technical :

A new generic LRUMediaCache is added in the core.cache package.
This class uses ~~a double linked list with head and tail, and a hashmap (a rather classical LRU implementation)~~ a LinkedHashMap in LRU mode.
We use it with the get method, which takes a Supplier as an argument. The Supplier must provide an LRUMediaCacheEntry, which is an InputStream along with an arbitrary metadata object. The Supplier is of course used only on a cache miss.
The InputStream will be read on the fly and stored in a file.
The metadata will be stored using the openHAB Storage service. It must be a serializable object.
For subsequent calls with the same key, an LRUMediaCacheEntry is given to the caller, transparently using the file on disk (even if not fully completed). Under the hood, it uses an InputStreamCacheWrapper, responsible for querying the cache entry for data.
It allows several clients to request the same ~~TTSResult~~ LRUMediaCacheEntry without waiting, each getting their own InputStream.
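
To illustrate the "LinkedHashMap in LRU mode" idea mentioned above, here is a minimal sketch. It is not the code of this PR: the class name `LruSizeIndex` and its methods are hypothetical, and it only tracks entry sizes in memory, whereas the real cache also manages files on disk and metadata in the Storage service.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: an LRU index of entry sizes, bounded by a total byte budget. */
class LruSizeIndex {
    private final long maxBytes;
    private long totalBytes = 0;
    // accessOrder = true: iteration goes from least to most recently accessed entry
    private final LinkedHashMap<String, Long> sizes = new LinkedHashMap<>(16, 0.75f, true);

    LruSizeIndex(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Records an entry, then evicts least recently used entries until the budget is respected. */
    synchronized void put(String key, long sizeInBytes) {
        Long previous = sizes.put(key, sizeInBytes);
        totalBytes += sizeInBytes - (previous == null ? 0 : previous);
        Iterator<Map.Entry<String, Long>> it = sizes.entrySet().iterator();
        // keep at least the entry that was just added, even if it exceeds the budget alone
        while (totalBytes > maxBytes && sizes.size() > 1 && it.hasNext()) {
            Map.Entry<String, Long> eldest = it.next();
            totalBytes -= eldest.getValue();
            it.remove(); // a real cache would also delete the backing file here
        }
    }

    /** Marks the entry as recently used and returns true on a cache hit. */
    synchronized boolean touch(String key) {
        return sizes.get(key) != null; // get() moves the entry to the most recent position
    }

    /** Checks for presence without changing the access order. */
    synchronized boolean contains(String key) {
        return sizes.containsKey(key);
    }
}
```

With a budget of 10 bytes, adding "a" and "b" (4 bytes each), touching "a", then adding "c" (4 bytes) evicts "b", because it is the least recently used entry at that point.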

An LRU cache (TTSLRUCacheImpl) implementing a simple interface (TTSCache with a get method) uses this cache implementation, and is provided as an OSGi component.

~~TTS service can disable this by a hard coded value in the TTS service implementation. (Default method in the interface returns true=enable)~~ Each TTSService must opt in (the simplest solution is to extend the AbstractCachedTTSService, which provides the functionality transparently).

~~The TTS AudioStream result is provided by a supplier (AudioStreamSupplier, which can delay call to the TTS service, thus allowing the cache service to not wait during the cached entry creation).~~ (<-- Using a more advanced locking mechanism to lock only the part needed, and removing some fallback mechanism, allowed to get rid of this class.)

The ~~TTS Results~~ LRUMediaCacheEntry are created when calling the TTS service for the first time, or loaded from the disk at startup. ~~An "info" file is stored alongside the sound file and contains the AudioFormat information.~~ <-- No more .info file, using Storage service.

The AudioStreamFromCache object provides an AudioStream wrapper implementation which is sent to the sink responsible for playing. This wrapper overrides the read() method to call the InputStreamCacheWrapper transparently.

A fallback mechanism is implemented (if the cache mechanism failed for whatever reason, then the TTS is directly called). It has been rewritten for genericity and it is now much simpler (the cost is that a little robustness disappears : now a call after a forced file deletion is not honored. But the next will still be)

Closes #3039

Signed-off-by: Gwendal Roulleau gwendal.roulleau@gmail.com

GiviMAD · 2022-12-10T15:59:21Z

This looks great, seems also a great idea to reduce the resource usage on offline services. I have a couple of questions about the cache implementation and tts resolver, I will open a review tomorrow and give it a try.

GiviMAD · 2022-12-12T17:31:50Z

....openhab.core.voice/src/main/java/org/openhab/core/voice/internal/cache/TTSLRUCacheImpl.java

+        }
+        logger.debug("Using TTS cache folder '{}'", cacheFolder.getAbsolutePath());
+
+        cleanCacheDirectory();


Do you think it is OK to move this to a static context so that the cleanup of orphan files is only done at program startup?

The TTSLRUCacheImpl is instantiated each time the voice config is modified.
If the cleanup is done only once, at startup, it won't take into account the new cache size the user could set when modifying the voice parameters.
The user could set a smaller value. The cache has to take this new value into account and clean up again accordingly.

GiviMAD

I have added some comments.
Do you think this could be refactored into an AudioStream Cache that is not tied to a TTS?
Because I think that if you allow to provide a supplier factory into the TTSCache that generate the AudioStreamSupplier based on the properties file it has more or less no real dependency on the tts itself. Do you think this makes sense?

GiviMAD · 2022-12-12T17:49:14Z

....core.voice/src/main/java/org/openhab/core/voice/internal/cache/AudioStreamCacheWrapper.java

+                }
+                int i = 0;
+                for (; i < len && i < bytesRead.length; i++) {
+                    b[off + i] = bytesRead[i];


Is it OK for you to implement the InputStream interface in the TTSResult so that this class only handles the fallback mechanism?
To me it would look better if this code were inside the result, and I think it would allow you to write directly into the input byte[].

There is only one TTSResult, but many AudioStreams can be instantiated from it (even concurrently).
If the TTSResult implements InputStream, it wouldn't be able to be reused (because once closed it is useless)
The TTSResults are made to be stored as LRU entries, and reused at will by creating several AudioStream from it.

(But it's equally possible that Christmas took a toll on my mind and that I don't understand your proposal 😅 ?)

openhab-bot · 2022-12-12T21:23:10Z

This pull request has been mentioned on openHAB Community. There might be relevant details there:

https://community.openhab.org/t/mimic-text-to-speech/137040/10

dalgwen · 2022-12-30T17:45:30Z

Thanks @GiviMAD for your propositions and review !

> Do you think this could be refactored into an AudioStream Cache that is not tied to a TTS?

What do you mean by tied to a TTS ? Do you mean an AudioStreamLRUCache class with an implementation with no reference to the TTSService class ? If so, then yes, it could be.
Do you think such cache could have some usage beside the one here ? if you provide me some use case I can do it.

J-N-K

Thanks. I have left some comments, but we'll probably need a second round.

Please first have a look at my comment regarding the use of LinkedHashMap as that will probably reduce code size quite a lot.

J-N-K · 2022-12-30T17:49:35Z

....core.io.rest.voice/src/main/java/org/openhab/core/io/rest/voice/internal/VoiceResource.java

+        if (volume != null && !volume.isEmpty()) {
+            volumePercent = new PercentType(volume);
+        }
+        voiceManager.say(text, voiceId, sinkId, volumePercent, enableCache);


I think adding an enable cache parameter here makes it more complicated for the user. If the cache is enabled for the TTS, use it. If it is not enabled, don't use it. What would be the use case for bypassing the cache if it is enabled? (also below)

@lolodomo provided me with a use case for bypassing the cache:
#3039 (comment)
But simplifying the code and parameters here is also a good objective. I have to say I was not fond of adding the numerous methods for this parameter 😅
What should I do?

The contribution guidelines say:

We're trying very hard to keep openHAB lean and focused. We don't want it to do everything for everybody.

I don't think we should blow up API to cover every use-case. At maximum this will add one cache-miss (because the oldest entry is removed and would be a miss on next call). Of course the rate increases the more one-time calls you make, but on the other hand: if the majority of your calls is not to be cached, then using the cache makes no sense anyway.

OK, I'm convinced. And the added complexity of a new parameter is a strong argument to my lazy part.
@lolodomo do you have a counter argument, or is it also OK for you ?

....media/src/main/java/org/openhab/core/automation/module/media/internal/SayActionHandler.java

J-N-K · 2022-12-30T17:54:11Z

bundles/org.openhab.core.voice/src/main/java/org/openhab/core/voice/TTSService.java

+     *
+     * @return
+     */
+    public default boolean isCacheEnabled() {


Methods in interface should not use public, they are public by default. Please also clean-up the other methods in this TTSService. Why should the service hard-disable the cache? Wouldn't it be better to have a configurable service and let the user decide?

One example : the TTS service already uses its own cache, better suited, and doesn't want the global one ? Or there is a compatibility issue with it ? In this case, if the cache is defaulted to true, a user installing the TTS service and not aware of this subtlety will complain that it is not working.

But, let's be clear: I added this parameter because I wanted to leave as many options as possible for others (devs and users), and I didn't want the cache system to be forced on other TTS services. But in fact I'm totally in favor of simplifying the code here and keeping only one global parameter, if it is deemed solid enough.

It can default to false, that's not my point. I would prefer a user-configurable option for that, not something in the interface.

Done, with a twist I propose : the TTSService is now responsible for using the cache or not.
(see below)

J-N-K · 2022-12-30T18:03:16Z

bundles/org.openhab.core.voice/src/main/java/org/openhab/core/voice/TTSService.java

+     * @return A likely unique key identifying the combination of parameters and/or internal state,
+     *         as a string suitable to be part of a filename. This will be used in the cache system to store the result.
+     */
+    public default String getCacheKey(String text, Voice voice, AudioFormat requestedFormat) {


I don't think we should implement such things in default methods.

From the "Oracle Java Tutorial":

Default methods enable you to add new functionality to the interfaces of your libraries and ensure binary compatibility with code written for older versions of those interfaces.

We don't need binary compatibility for OH4, so we should not use such "hacks" here. Are there other common methods that can be re-used for different TTS services? I would prefer a AbstractTTSService then that implements TTSService and can be extended by the implementations.

OK. I see two options here :

"can be" extended: if TTS services are not compelled to extend the AbstractTTSService, then I have to check with instanceof and use an explicit call to get "getCacheKey" (or "isCacheEnabled" if we keep it)

"must be" extended: no alternative method needed, but every TTS service has to be changed accordingly

WDYT?

You could also add an interface CachedTTSService which extends TTSService. This is probably an even better idea, because it can separate the cache methods from the TTSService methods. TTS services that don't want to (or can't) use the cache implement TTSService, others CachedTTSService (similar to PersistenceService and ModifiablePersistenceService). This also removes the need to programmatically disable the cache.

Done, with a proposal of separation of concerns :
The TTSCache is now an OSGI service (with TTSLRUCacheImpl implementing it). It respects the global parameter to enable the cache or not.
VoiceManagerImpl is left totally untouched, not tied to the cache anymore.
Each TTSService can choose to use the TTSCache or not.
To do so, they can extend a helper Class (AbstractCachedTTSService, which implements the TTSCachedService extending the TTSService). The TTSCache is injected inside and used transparently.
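
The separation of concerns described above could look roughly like the sketch below. This is a simplified illustration, not the code of the PR: audio is represented by a `String` instead of an `AudioStream`, the `Voice` and `AudioFormat` parameters are left out, and `MapTTSCache`, `synthesizeForCache` and the hash-based default key are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-ins: audio is a String here instead of an AudioStream,
// and the Voice and AudioFormat parameters are left out.
interface TTSService {
    String synthesize(String text);
}

interface TTSCachedService extends TTSService {
    String synthesizeForCache(String text);

    String getCacheKey(String text);
}

interface TTSCache {
    String get(TTSCachedService tts, String text);
}

/** Toy cache backed by a map: no eviction and no disk storage. */
class MapTTSCache implements TTSCache {
    private final Map<String, String> entries = new HashMap<>();

    @Override
    public String get(TTSCachedService tts, String text) {
        String key = tts.getClass().getSimpleName() + "_" + tts.getCacheKey(text);
        String cached = entries.get(key);
        if (cached == null) {
            cached = tts.synthesizeForCache(text); // only reached on a cache miss
            entries.put(key, cached);
        }
        return cached;
    }
}

/** Helper base class: services that extend it opt in to the cache. */
abstract class AbstractCachedTTSService implements TTSCachedService {
    private final TTSCache cache;

    protected AbstractCachedTTSService(TTSCache cache) {
        this.cache = cache;
    }

    @Override
    public final String synthesize(String text) {
        // callers keep using the plain TTSService method; the cache is used transparently
        return cache.get(this, text);
    }

    @Override
    public String getCacheKey(String text) {
        // illustrative default only: a hash can collide, so a real key needs more care
        return Integer.toHexString(text.hashCode());
    }
}
```

A service that extends `AbstractCachedTTSService` only implements `synthesizeForCache`; two calls to `synthesize` with the same text then reach the underlying engine once. A service that does not want the cache simply implements `TTSService` directly.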

...s/org.openhab.core.voice/src/main/java/org/openhab/core/voice/internal/VoiceManagerImpl.java

J-N-K · 2022-12-30T19:18:31Z

...es/org.openhab.core.voice/src/main/java/org/openhab/core/voice/internal/cache/TTSResult.java

+            String containerS = properties.getProperty(PROPERTY_FORMAT_CONTAINER);
+            String bigEndianS = properties.getProperty(PROPERTY_FORMAT_BIGENDIAN);
+            String bitDepthS = properties.getProperty(PROPERTY_FORMAT_BITDEPTH);
+            String bitRateS = properties.getProperty(PROPERTY_FORMAT_BITRATE);
+            String channelsS = properties.getProperty(PROPERTY_FORMAT_CHANNELS);
+            String codecS = properties.getProperty(PROPERTY_FORMAT_CODEC);
+            String frequencyS = properties.getProperty(PROPERTY_FORMAT_FREQUENCY);
+            String textS = properties.getProperty(PROPERTY_TEXT);


Wouldn't it make sense to create a record or class holding this information and leave the serialization/deserialization to Gson? Probably performance is not so much an issue here and you could simplify code to about two or three lines here.

Interesting !
Sadly, deserializing record needs Gson 2.10, and we have Gson 2.9.1
Do you think it could be pushed for openHAB 4 ?

I will use a standard class for the moment.

Probably when we switch to Karaf 4.4. A class is fine for now.

J-N-K · 2022-12-30T19:19:20Z

...es/org.openhab.core.voice/src/main/java/org/openhab/core/voice/internal/cache/TTSResult.java

+            if (ttsAudioStreamFinal != null && ttsAudioStreamFinal instanceof FixedLengthAudioStream) {
+                return ((FixedLengthAudioStream) ttsAudioStreamFinal).length();
+            }


Use pattern matching (see above)
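
For reference, a minimal sketch of what pattern matching for `instanceof` (Java 16 and later) would look like on a check of this kind. The two interfaces are hypothetical stand-ins so that the example compiles on its own; they are not the openHAB types, and `LengthHelper` is an invented name.

```java
// Hypothetical stand-ins for the openHAB audio types, only so the sketch is self-contained.
interface AudioStream {
}

interface FixedLengthAudioStream extends AudioStream {
    long length();
}

class LengthHelper {
    /** Returns the stream length, or -1 when it is unknown. */
    static long lengthOf(AudioStream stream) {
        // instanceof is already false for null, so no separate null check is needed,
        // and the pattern variable removes the explicit cast
        if (stream instanceof FixedLengthAudioStream fixedLengthStream) {
            return fixedLengthStream.length();
        }
        return -1;
    }
}
```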

J-N-K · 2022-12-30T19:25:42Z

....openhab.core.voice/src/main/java/org/openhab/core/voice/internal/cache/TTSLRUCacheImpl.java

+            }
+
+            ttsResultMap = ttsResultOrderedList.stream()
+                    .collect(Collectors.toMap(TTSResult::getKey, Function.identity()));


Suggested change

.collect(Collectors.toMap(TTSResult::getKey, Function.identity()));

.collect(Collectors.toMap(TTSResult::getKey, v -> v));

Strangely enough, if I do this, I have a compilation error about a Null type mismatch ?
(EDIT : But the refactor removes the need for such a Collector)

J-N-K · 2022-12-30T19:28:34Z

...e.voice/src/test/java/org/openhab/core/voice/internal/cache/AudioStreamCacheWrapperTest.java

+        TTSResult mockedTTSResult = Mockito.mock(TTSResult.class);
+        AudioStreamSupplier mockedAudioStreamSupplier = Mockito.mock(AudioStreamSupplier.class);


You can make these mocks members of the class:

private @Mock @NonNullByDefault({}) TTSResult ttsResultMock;

and they get mocked automatically if you annotate the class with

@ExtendWith(MockitoExtension.class)

Done, thanks

J-N-K · 2022-12-30T19:29:07Z

...nhab.core.voice/src/test/java/org/openhab/core/voice/internal/cache/TTSLRUCacheImplTest.java

+    @NonNullByDefault({})
+    private @Mock Voice voiceMock;


inline null-annotations

GiviMAD · 2023-01-01T12:59:40Z

> Thanks @GiviMAD for your propositions and review !
>
> Do you think this could be refactored into an AudioStream Cache that is not tied to a TTS?
>
> What do you mean by tied to a TTS ? Do you mean an AudioStreamLRUCache class with an implementation with no reference to the TTSService class ? If so, then yes, it could be. Do you think such cache could have some usage beside the one here ? if you provide me some use case I can do it.

I think that the cache mechanism you have added is really complete and could be useful to cache any kind of data that is tied to a format. I don't have specific examples other than caching the custom sounds of the dialog. But I think that having a media cache is something valuable.

J-N-K

In general looks much better now, and many thanks for your speedy response. I have left some comments.

One general question: We already have an infrastructure for storing data, the StorageService, which already takes care of serialization/de-serialization and has a nice interface for handling key-value-pairs with an arbitrary unique String as key. With that your "consistency check" should be very much improved: You can use Storage.getKeys and check if the corresponding voice-files exist and Storage.containsKey to check if there is metadata for the voice-file in the cache folder. As a plus you wouldn't need any file handling for the metadata, just call .get/.put on the storage.

J-N-K · 2023-01-03T15:41:18Z

...es/org.openhab.core.voice/src/main/java/org/openhab/core/voice/AbstractCachedTTSService.java

@@ -0,0 +1,55 @@
+/**
+ * Copyright (c) 2010-2022 Contributors to the openHAB project


Suggested change

* Copyright (c) 2010-2022 Contributors to the openHAB project

* Copyright (c) 2010-2023 Contributors to the openHAB project

Please also update the other new files, otherwise the build will fail.

J-N-K · 2023-01-03T15:42:22Z

bundles/org.openhab.core.voice/src/main/java/org/openhab/core/voice/TTSCache.java

+     *             are not supported or another error occurs while creating an
+     *             {@link AudioStream}
+     */
+    AudioStream getOrSynthetize(TTSCachedService tts, String text, Voice voice, AudioFormat requestedFormat)


Suggested change

AudioStream getOrSynthetize(TTSCachedService tts, String text, Voice voice, AudioFormat requestedFormat)

AudioStream getOrSynthesize(TTSCachedService tts, String text, Voice voice, AudioFormat requestedFormat)

Can't this be just get?

J-N-K · 2023-01-03T15:44:15Z

....core.voice/src/main/java/org/openhab/core/voice/internal/cache/AudioStreamCacheWrapper.java

+import org.slf4j.LoggerFactory;
+
+/**
+ * Each {@link TTSResult} instance can handle several AudioStream.


Suggested change

* Each {@link TTSResult} instance can handle several AudioStream.

* Each {@link TTSResult} instance can handle several {@link AudioStream}s.

J-N-K · 2023-01-03T15:44:27Z

....core.voice/src/main/java/org/openhab/core/voice/internal/cache/AudioStreamCacheWrapper.java

+
+/**
+ * Each {@link TTSResult} instance can handle several AudioStream.
+ * This class is a wrapper for such functionality, and can


Suggested change

* This class is a wrapper for such functionality, and can

* This class is a wrapper for such functionality and can

J-N-K · 2023-01-03T15:46:29Z

...nhab.core.voice/src/main/java/org/openhab/core/voice/internal/cache/AudioStreamSupplier.java

+import org.openhab.core.voice.Voice;
+
+/**
+ * Custom supplier class to defer synthetizing with a TTS service


Suggested change

* Custom supplier class to defer synthetizing with a TTS service

* Custom supplier class to defer synthesizing with a TTS service

Done in an intermediary commit, but I deleted the file as I simplified the code...

J-N-K · 2023-01-03T15:52:47Z

....openhab.core.voice/src/main/java/org/openhab/core/voice/internal/cache/TTSLRUCacheImpl.java

+            String key = tts.getClass().getSimpleName() + "_" + tts.getCacheKey(text, voice, requestedFormat);
+            // try to get from cache
+            TTSResult ttsResult = get(key);
+            if (ttsResult == null || !ttsResult.getText().equals(text)) { // it's a cache miss or a false positive, we


I don't think it's necessary, but probably will not hurt. It's just a simple string compare.

J-N-K · 2023-01-03T15:58:18Z

...es/org.openhab.core.voice/src/main/java/org/openhab/core/voice/internal/cache/TTSResult.java

+    private long currentSize = 0;
+    private boolean completed;
+
+    @NonNullByDefault({}) // file channel could not be null when we read or write from the wrapper


This is not correct, it's @Nullable. You also ensure that when calling by creating a local variable. Please change the annotation.

Done (and the class is now a generic LRUMediaCacheEntry)

J-N-K · 2023-01-03T15:59:01Z

...es/org.openhab.core.voice/src/main/java/org/openhab/core/voice/internal/cache/TTSResult.java

+     * @param fallbackAudioStreamSupplier If something goes wrong with the cache, this supplier will provide the
+     *            AudioStream directly from the TTS service
+     *
+     * @return An @AudioStream that can be used to play sound


Suggested change

* @return An @AudioStream that can be used to play sound

* @return An {@link AudioStream} that can be used to play sound

why not name it getAudioStream?

Indeed.
In fact, it is now a getInputStream() because of the new genericity.

J-N-K · 2023-01-03T16:00:59Z

...es/org.openhab.core.voice/src/main/java/org/openhab/core/voice/internal/cache/TTSResult.java

+            } catch (TTSException e) {
+                logger.warn("Cannot get or store audio format from the TTS audio service: {}", e.getMessage());
+            }
+            audioFormatFinal = audioFormat;


If it's final, don't reuse it. Either create another object or re-name to something that is not final (e.g. local)

Indeed. Done by renaming it to "local".

J-N-K · 2023-01-03T16:03:27Z

...e.voice/src/test/java/org/openhab/core/voice/internal/cache/AudioStreamCacheWrapperTest.java

+@ExtendWith(MockitoExtension.class)
+public class AudioStreamCacheWrapperTest {
+
+    TTSResult mockedTTSResult = Mockito.mock(TTSResult.class);


Suggested change

TTSResult mockedTTSResult = Mockito.mock(TTSResult.class);

private @Mock @NonNullByDefault({}) TTSResult ttsResultMock;

Mockito can do the mocking by itself, no need to call mock. We usually name mocked objects <name>Mock in core. For consistency, please change that (also below, and inline the null-annotation).

Done, thanks for the tip.

dalgwen · 2023-02-06T17:44:36Z

Oops, I did (another) rewrite....
You can blame @GiviMAD for this. He gave me the idea to make the cache generic.
Joking aside, thanks GiviMAD for the idea. I followed your suggestion because I have a use case in mind (two sinks, pulseaudio and doorbell, use codec conversion before sending audio, and a cache could reduce the latency, which is very noticeable right now).

I also rewrote it to use the openHAB Storage service, as @J-N-K suggested.
And I did a rebase.

I will update the opening post to match the new proposal.

GiviMAD · 2023-02-06T21:34:03Z

> Oops, I did (another) rewrite.... You can blame @GiviMAD for this. He gave me the idea to make the cache generic. Joking aside, thanks GiviMAD for the idea. I followed your suggestion because I have a use case in mind (two sinks, pulseaudio and doorbell, use codec conversion before sending audio, and a cache could reduce the latency, which is very noticeable right now)

Good to know you already found a use for it.
With all the emerging AI services, I think there will be more use cases in the future. One idea that came to my mind would be to create a DALL·E binding that allows the user to ask for images through a text item; another is an interpreter that allows asking ChatGPT. I assume those services will not be cheap, so I think a cache like this will help there. Let's see, maybe I'm totally wrong.

Regards!

Implements a cache mechanism for all TTS services. Eviction policy is LRU mode. This cache can serve several streams concurrently, for the same utterance, with only one call to the TTS. It doesn't wait for the stream to end and can serve data rapidly. Cache size is a voice bundle parameter (10 mb default) Closes openhab#3039 Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>

Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>

And also the volume parameter Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>

Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>

The LRU cache use a LinkedHashMap instead of a custom implementation Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>

Using gson instead of java properties for .info files Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>

…tion of concerns Reverts (dropping) the enableCache parameter per request Removes the cache from the VoiceManager to put it in a dedicated service "TTSCache" (now declared as a separated OSGI component). This service respects the global parameter that allows TTSService to use the cache. And it is now the responsability of the TTSService to use the TTSCache or not. To do so, it can extends the AbstractCachedTTSService, which then transparently use the TTSCache injected. Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>

Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>

Instead of custom .info files Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>

Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>

J-N-K

Sorry for making you wait so long; unfortunately we don't get notified on pushes, only on comments. Thanks again for this contribution.

I have left some last comments which I believe can be solved quickly. Please ping me when you are done.

J-N-K · 2023-02-24T19:41:34Z

....core.io.rest.voice/src/main/java/org/openhab/core/io/rest/voice/internal/VoiceResource.java

+            @QueryParam("sinkid") @Parameter(description = "audio sink id") @Nullable String sinkId,
+            @QueryParam("volume") @Parameter(description = "volume level") @Nullable String volume) {
+        PercentType volumePercent = null;
+        if (volume != null && !volume.isEmpty()) {


Suggested change

if (volume != null && !volume.isEmpty()) {

if (volume != null && !volume.isBlank()) {

" " would also not work.
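
A small sketch of the difference between the two checks. `String.isBlank()` exists since Java 11; `VolumeParser` and `parseVolume` are hypothetical names, and the method returns a plain `Integer` instead of the `PercentType` used in the real code.

```java
class VolumeParser {
    /** Returns the volume as an integer, or null when the parameter is absent or only whitespace. */
    static Integer parseVolume(String volume) {
        // isEmpty() is false for " ", so a whitespace-only value would reach the parser and fail;
        // isBlank() is true for both "" and " "
        if (volume == null || volume.isBlank()) {
            return null;
        }
        return Integer.valueOf(volume.trim());
    }
}
```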

J-N-K · 2023-02-24T19:44:04Z

....openhab.core.voice/src/main/java/org/openhab/core/voice/internal/cache/AudioFormatInfo.java

+    public AudioFormatInfo(String text, @Nullable Boolean bigEndian, @Nullable Integer bitDepth,
+            @Nullable Integer bitRate, @Nullable Long frequency, @Nullable Integer channels, @Nullable String codec,
+            @Nullable String container) {
+        super();


Suggested change

super();

There is no need for a call to the super constructor if the class has no parent.

J-N-K · 2023-02-24T19:45:59Z

....openhab.core.voice/src/main/java/org/openhab/core/voice/internal/cache/TTSLRUCacheImpl.java

+ *
+ * @author Gwendal Roulleau - Initial contribution
+ */
+@Component(immediate = false, configurationPid = VoiceManagerImpl.CONFIGURATION_PID)


Suggested change

@Component(immediate = false, configurationPid = VoiceManagerImpl.CONFIGURATION_PID)

@Component(configurationPid = VoiceManagerImpl.CONFIGURATION_PID)

no need to add default values

J-N-K · 2023-02-24T19:47:39Z

....openhab.core.voice/src/main/java/org/openhab/core/voice/internal/cache/TTSLRUCacheImpl.java

+     * @throws IOException when we cannot create the cache directory or if we have not enough space (*2 security margin)
+     */
+    @Modified
+    protected void activate(Map<String, Object> config) {


Suggested change

protected void activate(Map<String, Object> config) {

protected void modified(Map<String, Object> config) {

J-N-K · 2023-02-24T19:50:50Z

....openhab.core.voice/src/main/java/org/openhab/core/voice/internal/cache/TTSLRUCacheImpl.java

+                return (AudioStream) fileAndMetadata.getInputStream();
+            }
+        } catch (IOException e) {
+            logger.warn("Cannot get audio from cache, fallback to TTS service");


Is there something the user can do about it? If I understand your code correctly, no harm is done and the user will get the requested value. At DEBUG level it might make sense to either log the full stack trace, or just the message (e.getMessage()), so we get a hint what went wrong.

Suggested change

logger.warn("Cannot get audio from cache, fallback to TTS service");

logger.debug("Cannot get audio from cache, fallback to TTS service.", e);

I thought the user would like to be aware of a failure of the cache, as he could investigate, but indeed, it is not mandatory.
Done

J-N-K · 2023-02-24T19:54:00Z

...nhab.core.voice/src/test/java/org/openhab/core/voice/internal/cache/TTSLRUCacheImplTest.java

+    /**
+     * Simulate a cache miss, then two other hits
+     * The TTS service is called only once
+     *
+     * @throws TTSException
+     * @throws IOException
+     */
+    @Test


No need for @JavaDoc on individual tests. In general the test method's name should explain what is tested (also on other test classes).

J-N-K · 2023-02-24T19:54:50Z

bundles/org.openhab.core/src/main/java/org/openhab/core/cache/lru/DataRetrievalException.java

+ *
+ */
+@NonNullByDefault
+public class DataRetrievalException extends RuntimeException {


Is there a reason why this extends RuntimeException and not Exception? In most cases it's better to have checked exceptions.

I don't like it either, but I did not find a proper solution.

I made it a runtime exception because, as you can see in the TTSLRUCacheImpl (line 114), it's the only way I can think of to handle exceptions from a Supplier that can throw one (such as the TTSService).

I can also completely drop this DataRetrievalException and let the service using the cache cope with its own exception on its own term.

Do you have a better idea ?

Eventually, I deleted the DataRetrievalException.
As a result, the code now uses a RuntimeException to wrap the checked TTSException, which must be caught inside the supplier's get function (and I just found out that the try/catch was wrongly placed, so I moved it).
I don't like it either, but it sounds better than a custom RuntimeException with no real use.
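
The general shape of this workaround can be sketched as follows. `Supplier.get()` cannot declare checked exceptions, so the checked exception is wrapped inside the lambda. Unwrapping it again on the caller side is an assumption made for this example, not something the thread confirms. `TTSException` here is a hypothetical stand-in, as are `Synthesizer`, `getFromCache` and `synthesizeThroughCache`.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the checked exception thrown by a TTS service.
class TTSException extends Exception {
    TTSException(String message) {
        super(message);
    }
}

class SupplierWrapping {
    /** Stand-in for a TTS call that may fail with a checked exception. */
    interface Synthesizer {
        String synthesize(String text) throws TTSException;
    }

    /** Stand-in for the cache's get method, which only accepts a plain Supplier. */
    static String getFromCache(Supplier<String> supplier) {
        return supplier.get();
    }

    static String synthesizeThroughCache(Synthesizer tts, String text) throws TTSException {
        try {
            return getFromCache(() -> {
                try {
                    return tts.synthesize(text);
                } catch (TTSException e) {
                    // wrap: the lambda is not allowed to throw a checked exception
                    throw new RuntimeException(e);
                }
            });
        } catch (RuntimeException e) {
            // unwrap so that callers still see the original checked exception
            if (e.getCause() instanceof TTSException) {
                throw (TTSException) e.getCause();
            }
            throw e;
        }
    }
}
```

The try/catch that wraps has to sit inside the lambda, while the one that unwraps surrounds the call that receives the supplier.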

J-N-K · 2023-02-24T19:57:48Z

bundles/org.openhab.core/src/main/java/org/openhab/core/cache/lru/LRUMediaCacheEntry.java

+        return currentSize;
+    }
+
+    protected String getKey() {


JavaDoc missing

J-N-K · 2023-02-24T19:59:38Z

...s/org.openhab.core/src/test/java/org/openhab/core/cache/lru/InputStreamCacheWrapperTest.java

+@NonNullByDefault
+@ExtendWith(MockitoExtension.class)


Suggested change

@NonNullByDefault

@ExtendWith(MockitoExtension.class)

@ExtendWith(MockitoExtension.class)

@NonNullByDefault

you chose this order for the other test classes

Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>

J-N-K

LGTM, Thanks!

lolodomo · 2023-03-01T18:53:56Z

Can we have a summary of the changes? Is the cache enabled by default? For what TTS services?
Is there a need to adjust some existing TTS services? For example, VoiceRSS has its own cache mechanism.

dalgwen · 2023-03-01T21:04:08Z

Hello @lolodomo

First, this pull request adds an LRU cache implementation for media files (it is generic and not only for TTS; I have an idea for using it elsewhere). Its main capabilities are:

- The least recently used files are evicted when the size threshold is reached.
- It can serve several threads concurrently, even if launched at the same time, with only one call to the underlying service, without waiting for the entire file to be downloaded (for TTS, it means saying something simultaneously all around the house with only one computation).

Second, an OSGI service instantiates one LRU cache for TTS files.

This OSGi service is enabled by default (10 MB; we can customize this value or disable it in the "voice" configuration), and provides cache capability but ONLY for TTS services designed for it. It means that, as of now, no TTS service is using it.

The simplest way for a TTS service to use this cache is to extend the AbstractCachedTTSService and implement its abstract method instead of the classical TTSService interface.

I will make a pull request in the next days to add this capability for the MimicTTS service and for the PicoTTS service.

I don't have an opinion about the other TTS services, especially those already using a custom cache. I would be glad if it is deemed worthy for other TTS services, but meh, "if it works, don't fix it"? If maintainers think it is worth a shot, I can make some pull requests for the others.
The main advantage should be to simplify their code, provide a thread safe implementation (not so easy) and to use an application wide voice parameter to configure / enable / disable it.
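
To make the "extend the abstract class and implement one method" idea concrete, here is a minimal sketch of that pattern. Everything in it is hypothetical: `CachedTtsSketch` and `CountingTts` are stand-ins with simplified signatures (`String` in, `byte[]` out), not the openHAB classes, and the plain `HashMap` is neither size-bounded nor thread-safe.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an abstract cached TTS base class; not the openHAB API. */
abstract class CachedTtsSketch {
    private final Map<String, byte[]> cache = new HashMap<>();

    /** Entry point for callers: looks in the cache first, then falls back to real synthesis. */
    public byte[] synthesize(String text, String voice) {
        String key = voice + "|" + text;
        return cache.computeIfAbsent(key, k -> synthesizeForCache(text, voice));
    }

    /** The only method a concrete service has to implement. */
    protected abstract byte[] synthesizeForCache(String text, String voice);
}

/** Hypothetical concrete service that counts how often real synthesis happens. */
class CountingTts extends CachedTtsSketch {
    int calls = 0;

    @Override
    protected byte[] synthesizeForCache(String text, String voice) {
        calls++;
        return (voice + ":" + text).getBytes(StandardCharsets.UTF_8);
    }
}
```

With this shape, calling `synthesize("hello", "en")` twice on a `CountingTts` runs the real synthesis once; a different voice produces a different key and a second synthesis.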

lolodomo · 2023-03-02T12:33:18Z

Ok, perfect, so nothing is broken.
In VoiceRSS, the cache was mainly meant for creating predefined TTS messages, but it is also used when requesting any TTS. I think the original cache should be kept for predefined messages, while for any other TTS message it would be better to use your new LRU cache. I will look into such a solution.

Is your new cache a single cache shared by all openHAB services?

lolodomo · 2023-03-08T19:20:41Z

@dalgwen: why did you set the synthesize method as final in AbstractCachedTTSService? Because of that, I can't override it.
I was trying to implement the new cache in VoiceRSS. It should be easy, I believe, but I need to override this method because I have to give higher priority to the cache of this voice binding (not for storing new TTS messages, but to retrieve predefined TTS messages).

dalgwen · 2023-03-09T22:47:42Z

Hello @lolodomo

Sorry for the delay.
I set the synthesize method "final" because I wanted to let the add-on developers know that they do not have to implement it.
The synthesizeForCache method is enough.
I just made a PR for a "sample" implementation for Mimic TTS:
openhab/openhab-addons#14564

dalgwen · 2023-03-09T22:50:38Z

You can look at the MimicTTSService.java file for a quick implementation. (the AutoDeleteFileAudioStream is a fix for another bug)
For your information, the getCacheKey method is sometimes not even needed.
It is only needed when the TTS service uses some "hidden" parameter that can affect the sound produced.
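
A small sketch of why a "hidden" parameter has to be part of the key: if it is left out, two requests that sound different would share one cache entry. The class `CacheKeySketch`, both method names, and the speaking-rate parameter are assumptions chosen for illustration, not the signature of the real getCacheKey.

```java
/** Illustrative only: how a hidden setting can be folded into a cache key. */
class CacheKeySketch {
    /** Key built only from the visible request: voice and text. */
    static String defaultKey(String text, String voice) {
        return voice + "|" + text;
    }

    /** Key that also includes a hidden setting (here a hypothetical speaking rate). */
    static String keyWithHiddenParam(String text, String voice, double speakingRate) {
        return defaultKey(text, voice) + "|rate=" + speakingRate;
    }
}
```

When a service has no such setting, the default key is already sufficient, which matches the remark that the method is sometimes not needed.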

lolodomo · 2023-03-09T22:52:26Z

I set the synthesize method "final" because I wanted to let the add-on developers know that they do not have to implement it.
The synthesizeForCache method is enough.

Generally, yes, but some cases require overriding it.

dalgwen · 2023-03-09T23:06:35Z

Indeed, I didn't think about a use case like yours. I wanted the effort for the basic use case to be minimal, but I now hope the implementation choice I made will not be a hindrance for your TTS service. Let me know if I can help you with something.

lolodomo · 2023-03-09T23:12:13Z

I now hope the implementation choice I made will not be a hindrance for your TTS service

Absolutely not. After removing the "final" constraint, your implementation looks very good to me.
I just have to find out how to fix my audio stream when your cache is not used, to make it usable with the OH HTTP audio server.
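
The use case discussed above, where a service consults its own predefined messages before the shared cache, could look roughly like the following once the method is no longer final. Both classes are hypothetical stand-ins with `String` results instead of audio streams; they are not the VoiceRSS or openHAB code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical base class whose synthesize method is not final. */
class LruBackedTts {
    // Stand-in for the shared LRU cache
    final Map<String, String> lruCache = new HashMap<>();

    public String synthesize(String text) {
        return lruCache.computeIfAbsent(text, t -> "synthesized:" + t);
    }
}

/** Hypothetical service that serves predefined messages before touching the LRU cache. */
class PredefinedFirstTts extends LruBackedTts {
    private final Map<String, String> predefined = Map.of("doorbell", "predefined:doorbell");

    @Override
    public String synthesize(String text) {
        String fixed = predefined.get(text);
        // Predefined messages win; everything else goes through the cached path
        return fixed != null ? fixed : super.synthesize(text);
    }
}
```

Predefined messages never enter the LRU cache in this sketch, so they cannot be evicted by it.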

[tts] Cache mechanism

Implements a cache mechanism for all TTS services. Eviction policy is LRU mode. This cache can serve several streams concurrently, for the same utterance, with only one call to the TTS. It doesn't wait for the stream to end and can serve data rapidly. Cache size is a voice bundle parameter (10 mb default)

Closes openhab#3039

Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>
GitOrigin-RevId: 5544945

dalgwen requested a review from a team as a code owner August 8, 2022 15:25

dalgwen force-pushed the tts_cache branch from 66f0ebb to f383052 Compare August 9, 2022 11:09

wborn added the enhancement An enhancement or new feature of the Core label Aug 20, 2022

wborn requested a review from GiviMAD December 10, 2022 15:13

GiviMAD reviewed Dec 12, 2022

GiviMAD suggested changes Dec 12, 2022

dalgwen force-pushed the tts_cache branch 2 times, most recently from 12772d7 to 94b5cb1 Compare December 30, 2022 17:02

J-N-K requested changes Dec 30, 2022

dalgwen force-pushed the tts_cache branch from ba5f653 to 9c1350a Compare January 3, 2023 00:00

J-N-K requested changes Jan 3, 2023

dalgwen force-pushed the tts_cache branch 2 times, most recently from f229298 to fea3693 Compare February 6, 2023 18:23

dalgwen added 11 commits February 9, 2023 14:53

[tts] make the cache deletion proof

a561a10

Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>

[tts] TTS enable cache parameter in say action in GUI

a035545

Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>

[tts] Add enable cache parameter in DSL say action

f18e4b4

Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>

[tts] Rule for enabling or disabling cache

9a950cb

Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>

[tts] Add enable cache parameter in the REST service

b7acd09

And also the volume parameter

Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>

[tts] Cache : apply minor code reviews

5f893be

Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>

[tts] Cache : refactor with LinkedHashMap

e42a68f

The LRU cache use a LinkedHashMap instead of a custom implementation

Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>

[tts] Cache : apply code review suggestion on serialization

3d82bde

Using gson instead of java properties for .info files

Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>

[tts] Cache : apply minor code review

1da21a9

Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>

dalgwen added 2 commits February 9, 2023 14:54

[tts] Cache : Use StorageService for storing metadata

a8df4b7

Instead of custom .info files

Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>

[tts] Make the LRU media cache generic

a9a80c0

Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>

dalgwen force-pushed the tts_cache branch from fea3693 to a9a80c0 Compare February 9, 2023 14:28

J-N-K requested changes Feb 24, 2023

[tts] Apply code review comments

75dcead

Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>

J-N-K approved these changes Feb 28, 2023

J-N-K added this to the 4.0 milestone Feb 28, 2023

J-N-K merged commit 5544945 into openhab:main Feb 28, 2023

lolodomo mentioned this pull request Mar 8, 2023

AbstractCachedTTSService: make synthesize method non final #3437

Merged

dalgwen mentioned this pull request Mar 10, 2023

[picotts] Add LRU cache openhab/openhab-addons#14565

Merged

wborn mentioned this pull request Mar 29, 2023

LRUMediaCacheEntryTest unstable #3507

Closed

dalgwen deleted the tts_cache branch August 21, 2023 09:46

	.collect(Collectors.toMap(TTSResult::getKey, Function.identity()));
	.collect(Collectors.toMap(TTSResult::getKey, v -> v));

		TTSResult mockedTTSResult = Mockito.mock(TTSResult.class);
		AudioStreamSupplier mockedAudioStreamSupplier = Mockito.mock(AudioStreamSupplier.class);

		@@ -0,0 +1,55 @@
		/**
		* Copyright (c) 2010-2022 Contributors to the openHAB project

	* Copyright (c) 2010-2022 Contributors to the openHAB project
	* Copyright (c) 2010-2023 Contributors to the openHAB project

	AudioStream getOrSynthetize(TTSCachedService tts, String text, Voice voice, AudioFormat requestedFormat)
	AudioStream getOrSynthesize(TTSCachedService tts, String text, Voice voice, AudioFormat requestedFormat)

	* Each {@link TTSResult} instance can handle several AudioStream.
	* Each {@link TTSResult} instance can handle several {@link AudioStream}s.

	* This class is a wrapper for such functionality, and can
	* This class is a wrapper for such functionality and can

	* Custom supplier class to defer synthetizing with a TTS service
	* Custom supplier class to defer synthesizing with a TTS service

	* @return An @AudioStream that can be used to play sound
	* @return An {@link AudioStream} that can be used to play sound

	TTSResult mockedTTSResult = Mockito.mock(TTSResult.class);
	private @Mock @NonNullByDefault({}) TTSResult ttsResultMock;

	if (volume != null && !volume.isEmpty()) {
	if (volume != null && !volume.isBlank()) {

	@Component(immediate = false, configurationPid = VoiceManagerImpl.CONFIGURATION_PID)
	@Component(configurationPid = VoiceManagerImpl.CONFIGURATION_PID)

	protected void activate(Map<String, Object> config) {
	protected void modified(Map<String, Object> config) {

	logger.warn("Cannot get audio from cache, fallback to TTS service");
	logger.debug("Cannot get audio from cache, fallback to TTS service.", e);
