Provide configuration option to declare CQL scripts that create a "named" Keyspace on startup [DATACASS-723]
John Blum opened DATACASS-723 and commented
The o.s.d.c.core.cql.session.init.SessionFactoryInitializer is not sufficient to replace the o.s.d.c.config.AbstractSessionConfiguration class's now deprecated getStartupScripts() and getShutdownScripts() methods.
As an application developer using Spring Data for Apache Cassandra, if I want to specify the Cassandra Keyspace used by my application, then in my application configuration I might start with the following, which is very useful...
package ...;

import org.springframework.data.cassandra.config.AbstractCassandraConfiguration;

class MyCassandraApplicationConfiguration extends AbstractCassandraConfiguration {

    @Override
    public String getKeyspaceName() {
        return "MyAppKeyspace";
    }
    ...
}
However, without also specifying the now deprecated methods, for example...
...
class MyCassandraApplicationConfiguration extends AbstractCassandraConfiguration {

    @Override
    protected List<String> getStartupScripts() {
        return Collections.singletonList("schema.cql");
    }

    @Override
    protected List<String> getShutdownScripts() {
        ...
    }
    ...
}
Where schema.cql is defined as...
CREATE KEYSPACE IF NOT EXISTS MyAppKeyspace WITH replication = { 'class':'SimpleStrategy', 'replication_factor':1 };
USE MyAppKeyspace;
CREATE TABLE IF NOT EXISTS customers (id BIGINT PRIMARY KEY, name TEXT);
CREATE INDEX IF NOT EXISTS CustomerNameIdx ON customers(name);
...
Then the application will throw an Exception on startup stating that the named/specified Keyspace (i.e. MyAppKeyspace) does not exist!
java.lang.IllegalStateException: Failed to load ApplicationContext
....
..
.
Caused by: org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'cassandraSessionFactory' defined in example.app.crm.config.CassandraConfiguration: Unsatisfied dependency expressed through method 'cassandraSessionFactory' parameter 0; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'cassandraSession' defined in example.app.crm.config.CassandraConfiguration: Invocation of init method failed; nested exception is com.datastax.oss.driver.api.core.InvalidKeyspaceException: Invalid keyspace customerservice
at org.springframework.beans.factory.support.ConstructorResolver.createArgumentArray(ConstructorResolver.java:798)
at org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolver.java:539)
....
..
.
Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'cassandraSession' defined in example.app.crm.config.CassandraConfiguration: Invocation of init method failed; nested exception is com.datastax.oss.driver.api.core.InvalidKeyspaceException: Invalid keyspace customerservice
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1796)
...
..
.
Caused by: com.datastax.oss.driver.api.core.InvalidKeyspaceException: Invalid keyspace customerservice
at com.datastax.oss.driver.api.core.InvalidKeyspaceException.copy(InvalidKeyspaceException.java:34)
at com.datastax.oss.driver.internal.core.util.concurrent.CompletableFutures.getUninterruptibly(CompletableFutures.java:149)
at com.datastax.oss.driver.api.core.session.SessionBuilder.build(SessionBuilder.java:501)
at org.springframework.data.cassandra.config.CqlSessionFactoryBean.buildSession(CqlSessionFactoryBean.java:456)
at org.springframework.data.cassandra.config.CqlSessionFactoryBean.afterPropertiesSet(CqlSessionFactoryBean.java:427)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1855)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1792)
... 83 more
The SessionFactoryInitializer (or even the KeyspacePopulator registered on the SessionFactoryFactoryBean provided via extension of the AbstractCassandraConfiguration) is far too late in the initialization process to "initialize" the application-defined Keyspace using the Cassandra Session provided by the SD Cassandra SessionFactory, which is created by the CqlSessionFactoryBean definition and is based on the "named", application-defined Keyspace anyway.
Essentially the problem can be reproduced by:
1. First declaring the application Keyspace name.
2. The Keyspace name is then configured (i.e. set) on the CqlSessionFactoryBean bean definition declared in the AbstractSessionConfiguration class, which the AbstractCassandraConfiguration base class, extended by application code to simplify configuration, extends.
   NOTE: Notice that, fortunately, the getStartupScripts() and getShutdownScripts() methods are still used in SD Cassandra despite the deprecation and (e.g. this) comment.
   NOTE: Because the SessionFactoryInitializer is too late in the initialization process, it technically breaks the contract stated in the comment of the getStartupScripts() and getShutdownScripts() methods. However, the comments are also ambiguous because they do not clarify that the startup and shutdown (CQL) scripts are applied before the application "named" Keyspace is created/initialized, as will be witnessed in #3 following...
3. Therefore, the only opportunity to "create" and "initialize" the application "named" Keyspace is directly after an SD Cassandra framework (internal) Session is opened to the Cassandra "system" Keyspace and subsequently initialized, which would then allow additional Keyspaces to be defined, created and initialized via CQL scripts applied on startup.
   NOTE: Indeed, if we trace through the code, we notice that the keyspaceStartupScripts are passed to the executeSpecsAndScripts(..) method, which ultimately executes the CQL statements. The keyspaceStartupScripts were initialized from the now deprecated AbstractCassandraConfiguration.getStartupScripts() method (also see here).
4. However, the very next thing to happen is that now the application "named" Keyspace is created, which leads to the IllegalStateException shown above.
5. If we tried to follow the logic of using a SessionFactoryInitializer to perform the schema actions above, we'd see that the Session connected to the application "named" Keyspace would then be supplied by the CqlSessionFactoryBean bean definition from the AbstractSessionConfiguration class (which, again, the user's application would indirectly extend) to the SessionFactoryFactoryBean bean definition declared in the AbstractCassandraConfiguration class (the class our application configuration class extends).
6. It is the SessionFactoryFactoryBean that supplies the SessionFactory that ultimately is post processed by the declared SessionFactoryInitializers in the Spring context.
7. Yet, as stated above, this is too late in the initialization process.
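The ordering described in the steps above can be modelled with a small, self-contained Java sketch. It is a toy model only: FakeCluster, start(..) and the log messages are hypothetical names invented for illustration and are not Spring Data or DataStax driver classes. It assumes just three phases (startup scripts on the "system" Session, opening the Session on the "named" Keyspace, then initializers) and shows that a CREATE KEYSPACE step only helps if it runs in the first phase.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the startup order; none of these types are real framework classes. */
public class StartupOrderSketch {

    /** Hypothetical stand-in for a cluster that only tracks which keyspaces exist. */
    public static class FakeCluster {
        public final Set<String> keyspaces = new HashSet<>(Collections.singleton("system"));

        /** Mimics opening a session bound to a keyspace; fails if the keyspace is missing. */
        public void connect(String keyspace) {
            if (!keyspaces.contains(keyspace)) {
                throw new IllegalStateException("Invalid keyspace " + keyspace);
            }
        }
    }

    /** Runs the three assumed phases in order and returns a log of what happened. */
    public static List<String> start(FakeCluster cluster, String keyspace,
                                     List<Runnable> keyspaceStartupScripts,
                                     List<Runnable> sessionFactoryInitializers) {
        List<String> log = new ArrayList<>();

        // Phase 1: an internal session on the "system" keyspace applies the startup scripts.
        cluster.connect("system");
        keyspaceStartupScripts.forEach(Runnable::run);
        log.add("startup scripts applied");

        // Phase 2: the session bound to the application "named" keyspace is opened.
        try {
            cluster.connect(keyspace);
            log.add("connected to " + keyspace);
        } catch (IllegalStateException e) {
            log.add("failed: " + e.getMessage());
            return log; // the initializers below are never reached
        }

        // Phase 3: initializers run against the already-open session.
        sessionFactoryInitializers.forEach(Runnable::run);
        log.add("initializers applied");
        return log;
    }

    public static void main(String[] args) {
        String keyspace = "MyAppKeyspace";

        // Keyspace created by a startup script (phase 1).
        FakeCluster first = new FakeCluster();
        Runnable createEarly = () -> first.keyspaces.add(keyspace);
        System.out.println(start(first, keyspace,
                Collections.singletonList(createEarly), Collections.<Runnable>emptyList()));

        // Keyspace creation deferred to an initializer (phase 3).
        FakeCluster second = new FakeCluster();
        Runnable createLate = () -> second.keyspaces.add(keyspace);
        System.out.println(start(second, keyspace,
                Collections.<Runnable>emptyList(), Collections.singletonList(createLate)));
    }
}
```

In this model the second run stops in phase 2, so the deferred creation step never executes, which mirrors the failure reported above.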
See comments below for possible solutions.
Affects: 3.0 M2 (Neumann)
John Blum commented
This problem was first detected in Spring Boot for Apache Geode (SBDG), testing the Inline Caching capabilities provided by SBDG to back an Apache Geode cache with a persistent data store like Apache Cassandra to perform read/write-through or write-behind data access operations.
This test class driving the configuration and test cases lives here.
The configuration of Apache Cassandra using Spring Data for Apache Cassandra lives here.
When trying to use either a KeyspacePopulator or SessionFactoryInitializer, the IllegalStateException noted in the description above was thrown.
John Blum commented
I see 2 possible solutions (there may be others as well):
1. First is to define getKeyspaceStartupScripts() and getKeyspaceShutdownScripts() methods in the AbstractCassandraConfiguration class (again, see here) matching the CqlSessionFactoryBean configuration properties (i.e. keyspaceStartupScripts and keyspaceShutdownScripts) and corresponding setters by the same name.
2. Alternatively (or additively), the SessionFactoryInitializer bean definitions could be applied to the Session object created (internally) by Spring Data for Apache Cassandra in the CqlSessionFactoryBean class for the "system" Keyspace. Again, see here. This would allow the deprecated getStartupScripts() and getShutdownScripts() methods to be truly replaceable by the SessionFactoryInitializer as stated in the comment.
Of course, the SessionFactoryInitializers would need to be looked up in the Spring container and applied manually to the "system" Keyspace Session object built and used by CqlSessionFactoryBean, since no Spring bean exists for the "system" Keyspace Session object, which is therefore not subject to bean post processing where the SessionFactoryInitializers are applied.
If a bean definition were created for the "system" Keyspace Session object created by the CqlSessionFactoryBean in addition to the application "named" Keyspace based Session object (the actual target Session object intended to be created by the CqlSessionFactoryBean), then the application "named" Keyspace Session object bean would need to be made the "primary" bean definition for the Session, since there would then be at least 2 Session beans in the Spring context created by Spring Data for Apache Cassandra. This gets tricky since users may also be defining their own CqlSessionFactoryBean bean definitions to define additional Sessions, most likely to different Keyspaces. As such, I would not recommend this approach, although it is possible.
Mark Paluch commented
Keyspace creation and initialization were intentionally split with the driver upgrade. Previously, factory beans carried a lot of functionality that blurred the lines of responsibility. We have a clear separation now between keyspace creation and keyspace initialization.
In general, keyspace creation on startup through configuration classes is discouraged, as all data frameworks require the database (which corresponds with the keyspace) to be already created.
AbstractSessionConfiguration provides two entrypoints to keyspace creation:
- getKeyspaceCreations()
- getStartupScripts()
getStartupScripts() had no clear responsibility and a verbatim script allows all sorts of CQL to run, therefore it's deprecated as of version 3.0.
Regarding the schema.cql file from above, it already mixes the concerns of keyspace creation and keyspace initialization. To initialize a keyspace, we introduced KeyspacePopulator to safely execute CQL scripts (i.e. an arrangement such as schema.cql and data.cql that define objects and data within a keyspace) within the context of a keyspace. SessionFactoryInitializer and KeyspacePopulator materialize the intentional split from CREATE KEYSPACE statements.
SessionFactoryInitializer is generally a standalone utility to achieve the same from above. That being said, I struggle to see which aspect is missing. Care to elaborate?
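For illustration, the split described in this comment might look roughly like the following in application configuration. This is a sketch only, assuming the Spring Data for Apache Cassandra 3.0 API: getKeyspaceCreations() as named above, plus a keyspacePopulator() hook on AbstractCassandraConfiguration and the ResourceKeyspacePopulator class, whose availability may differ between milestones. The class name and the tables.cql script are hypothetical; tables.cql is assumed to hold only the CREATE TABLE and CREATE INDEX statements from schema.cql, without CREATE KEYSPACE or USE.

```java
import java.util.Collections;
import java.util.List;

import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ClassPathResource;
import org.springframework.data.cassandra.config.AbstractCassandraConfiguration;
import org.springframework.data.cassandra.core.cql.keyspace.CreateKeyspaceSpecification;
import org.springframework.data.cassandra.core.cql.session.init.KeyspacePopulator;
import org.springframework.data.cassandra.core.cql.session.init.ResourceKeyspacePopulator;

// Hypothetical configuration class; requires spring-data-cassandra on the classpath.
@Configuration
class SplitKeyspaceConfiguration extends AbstractCassandraConfiguration {

    private static final String KEYSPACE = "MyAppKeyspace";

    @Override
    protected String getKeyspaceName() {
        return KEYSPACE;
    }

    // Keyspace creation: declared as a specification instead of a verbatim CQL script.
    @Override
    protected List<CreateKeyspaceSpecification> getKeyspaceCreations() {
        return Collections.singletonList(
                CreateKeyspaceSpecification.createKeyspace(KEYSPACE)
                        .ifNotExists()
                        .withSimpleReplication(1));
    }

    // Keyspace initialization: objects inside the keyspace, loaded from a script.
    // Assumption: this hook exists in the version in use.
    @Override
    protected KeyspacePopulator keyspacePopulator() {
        return new ResourceKeyspacePopulator(new ClassPathResource("tables.cql"));
    }
}
```

Whether this arrangement covers the scenario in the description is the open question in this thread, so it should be read as a starting point rather than a confirmed fix.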
John Blum commented
Mark Paluch - Just acknowledging that I saw your reply comments, but I need more time to respond to your questions. Will follow up shortly.