CASSANDRA-21209 Rework ZSTD dictionary compression logic to create a trainer per training #4667
    }
    finally
    {
        refViewFragment.close();
I do not think what we did here was too smart (the same concept was there before we started to use selectAndReference). Training is done asynchronously, so this method returns and the finally block runs before the sampling has actually finished. We should close in the callback, as done above, or only when we catch an exception, as done here.
Or no?
trainer.trainDictionaryAsync(force).addCallback
Does this "addCallback" make the call synchronous? I do not think so. It just registers what should be done after training finishes; it is not a blocking call, I guess.
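A minimal sketch of the concern in this thread, not the code from this patch. Everything in it is hypothetical: `Resource` stands in for the referenced view, `sampling` simulates the asynchronous sampling work, and the JDK's `CompletableFuture.whenComplete` is used as an assumed analogue of the `addCallback` call quoted above. It only shows that a `finally` block runs when the submitting method returns, while a completion callback runs after the asynchronous work ends.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;

class CloseInCallbackSketch
{
    // Hypothetical stand-in for the referenced view; it only records when it is closed.
    static final class Resource implements AutoCloseable
    {
        private final List<String> events;
        Resource(List<String> events) { this.events = events; }
        @Override public void close() { events.add("closed"); }
    }

    // Simulated sampling: blocks until the gate opens, then records that it ran.
    static Runnable sampling(List<String> events, CountDownLatch gate)
    {
        return () -> {
            try { gate.await(); }
            catch (InterruptedException e)
            {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            events.add("sampled");
        };
    }

    // Problematic shape: finally runs as soon as the task is submitted and the method returns.
    static CompletableFuture<Void> closeInFinally(List<String> events, CountDownLatch gate)
    {
        Resource resource = new Resource(events);
        try
        {
            return CompletableFuture.runAsync(sampling(events, gate));
        }
        finally
        {
            resource.close();
        }
    }

    // Suggested shape: close in the completion callback, or when submission itself throws.
    static CompletableFuture<Void> closeInCallback(List<String> events, CountDownLatch gate)
    {
        Resource resource = new Resource(events);
        try
        {
            return CompletableFuture.runAsync(sampling(events, gate))
                                    .whenComplete((ignored, error) -> resource.close());
        }
        catch (RuntimeException e)
        {
            resource.close(); // no callback was registered, so close here
            throw e;
        }
    }

    // Submits the work, opens the gate only after the method has returned, then waits.
    static List<String> run(boolean useCallback)
    {
        List<String> events = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch gate = new CountDownLatch(1);
        CompletableFuture<Void> done = useCallback ? closeInCallback(events, gate)
                                                   : closeInFinally(events, gate);
        gate.countDown();
        done.join();
        return new ArrayList<>(events);
    }

    public static void main(String[] args)
    {
        System.out.println("finally:  " + run(false)); // [closed, sampled]
        System.out.println("callback: " + run(true));  // [sampled, closed]
    }
}
```

In the `finally` variant the resource is released while the sampling is still pending; in the callback variant it is released only after the sampling has run.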
ccbef18 to 596182d
    ScheduledExecutors.nonPeriodicTasks.submit(task);
    try
    {
        trainer = ICompressionDictionaryTrainer.create(keyspaceName, tableName, compressionParams);
The whole execution chain (from manager.train) does everything it can to postpone trainer creation until it is absolutely necessary and everything else has succeeded. Instantiating a trainer can be very demanding memory-wise (when the max sample size is non-trivial) because it allocates a direct ByteBuffer. We do not want to create a trainer that allocates a big buffer just to throw it away if something else goes wrong.
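A rough sketch of that ordering, under stated assumptions: `LazyTrainerSketch`, `Trainer`, `train`, and the `preconditionsHold` flag are hypothetical names for illustration and are not the classes in this patch. The only real API used is `ByteBuffer.allocateDirect`. The idea is that the caller passes a factory rather than an already constructed trainer, so the direct buffer is allocated only after the cheap checks have passed.

```java
import java.nio.ByteBuffer;
import java.util.function.Supplier;

class LazyTrainerSketch
{
    // Counts how many times the expensive buffer was allocated (for demonstration only).
    static int allocations = 0;

    // Hypothetical trainer: construction allocates a direct buffer sized for the samples.
    static final class Trainer
    {
        final ByteBuffer samples;

        Trainer(int maxSampleBytes)
        {
            allocations++;
            samples = ByteBuffer.allocateDirect(maxSampleBytes);
        }
    }

    // Runs the cheap checks first; the factory is invoked only when they all pass.
    static Trainer train(boolean preconditionsHold, Supplier<Trainer> trainerFactory)
    {
        if (!preconditionsHold)
            throw new IllegalStateException("training preconditions not met");
        return trainerFactory.get();
    }

    public static void main(String[] args)
    {
        try
        {
            train(false, () -> new Trainer(1024));
        }
        catch (IllegalStateException e)
        {
            System.out.println("rejected, allocations so far: " + allocations);
        }
        Trainer trainer = train(true, () -> new Trainer(1024));
        System.out.println("created, buffer capacity: " + trainer.samples.capacity());
    }
}
```

When the checks fail, the supplier is never called, so no buffer is allocated and nothing has to be discarded.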
yifan-c
left a comment
Some minor nits. Looks good overall. Thanks for the simplification!
src/java/org/apache/cassandra/db/compression/CompressionDictionaryScheduler.java
src/java/org/apache/cassandra/db/compression/CompressionDictionaryTrainingConfig.java
src/java/org/apache/cassandra/db/compression/ZstdDictionaryTrainer.java
010004d to 32a1f78
…ning patch by Stefan Miklosovic; reviewed by Yifan Cai for CASSANDRA-21209
32a1f78 to a54d227