[ZEPPELIN-3438] Passing Z variables to BigQuery
What is this PR for?
This PR enables the interpolation of ZeppelinContext objects into the paragraph text of BigQuery cells. It also introduces a new interpreter-level configuration parameter named zeppelin.bigquery.interpolation. This new parameter is false by default, and must be set to true to enable object interpolation. The default value of false guarantees backward compatibility for users who are not aware of the new feature.
The implementation takes the same approach that was followed in PR 2898 and 2903.
What type of PR is it?
[ Feature ]
Todos
- [ ] - Task
What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-3438
How should this be tested?
Two new unit test classes have been added: BigQueryInterpolationDisabled.java, and BigQueryInterpolationEnabled.java.
Also, the code in this PR merely causes the BigQuery interpreter to "opt-in" to the implementation of z-variable interpolation already existing in the Interpreter base class - described in PR-2898. The unit tests necessary for testing z-variable interpolation in detail are already present in PR-2898
Screenshots (if appropriate)
Questions:
- Does the licenses files need update? No
- Is there breaking changes for older versions? No
- Does this needs documentation? Yes, documentation changes provided in bigquery.md
LGTM
Thanks for pointing out the issue @felixcheung
I've corrected the link as well as some of the preceding words as the ZeppelinContext documentation is now in its own page - not under Spark Interpreter. This correction (the words preceding the link) will have to be made in the Spark and JDBC interpreter documentation too, and I will bundle it later with another PR.
I have also made another correction in the configuration table - the default value false has been moved into the middle column (where it should have been) from its previous location at the end of the description in the last column.
The two failed tests on travis - apparently unrelated to the code from this PR - are really stubborn and wont go away no matter what I do!
@sanjaydasgupta I was thinking would it better to put it in Interpreter base class instead of doing it in every interpreter implementation.
Thanks for the suggestion @zjffdu.
The same idea was given earlier by Leemoonsoo here, and I had then moved all of the required code into the interpolate(...) method of the base class.
So, the PR for each interpreter now contains the following 4 elements:
- The configuration changes (defining the
zeppelin.???.interpolateparameter) in the associatedinterpreter-setting.jsonfile (seen here), - The documentation changes in the associated
docs/interpreter/???.mdfile (seen here), - A 2-line change to the interpreter's
interpret()method causing it to invokeinterpolate(...)in the base class ifzeppelin.???.interpolateis found to be true (seen here), and - The specific unit-test classes as applicable (seen here and here)
Keeping this consistent set of files together in one PR helps the review process IMHO, but I am open to any suggestions for further improvement.
Thanks @sanjaydasgupta I was thinking whether we could do it in Interpreter.java like this.
public abstract InterpreterResult interpret2(String st,
InterpreterContext context)
throws InterpreterException;
public InterpreterResult interpret(String st, InterpreterContext context)
throws InterpreterException {
String cmd = Boolean.parseBoolean(getProperty("zeppelin.shell.interpolation")) ?
interpolate(st, context.getResourcePool()) : st;
return interpret2(cmd, context);
}
So that each sub class can just implement method interpret2 (we can make a more proper name for this method).
This is a great idea @zjffdu.
There are also a few other things to think of - like automatically generating the name of the enabler parameter on a per-interpreter basis (zeppelin.shell.interpolation in your sample code above). This can be done by adding a static map to interpreter.java that translates from interpreter class name to parameter name, but I think we will obtain greater mileage from the following 2-part strategy:
- refactor the
interpret(...)method in all of the existing interpreters once to introduce the 2 lines that have to change in each interpreter (no need to introduce a new method likeinterpret2(...) - add a map to
Interpreter.javathat translates from interpreter class name to parameter name, for use in the updated code in (1) above. This map could be filled with unused names in advance without doing any harm.
This is more or less what I had in mind during this earlier discussion, but I did not want to start it then as there was other interpreter refactoring work going on. Perhaps this is a better time to resume that thought :-) Allow me to ask the opinions of a wider group of reviewers - what do you think @felixcheung @jongyoul @zjffdu ?
do we really need this for all interpreters though? for those we don't have active maintainer it might better not to change them
We won't enable it for all interpreters. By default we could disable it, interpreter can enable it via setting zeppelin.interpreter.interpolation to true to enable it. The purpose is just to provide hook to allow interpreter have chance to update code before doing real interpretation.
Hi this PR have never been merged. So simple question is variable interpolation for bigquery interpreter is working or not at all ? trying to use z.put(var,val) then use in %%bigquery block but no interpolation the string the placeholder is not replaced Thanks
@richiesgr Yes, I don't think interpolation is working for bigquery interpreter right now. The easy way is just to make BigQuery extends AbstractInterpreter. so that user can enable interpolation via local properties, e.g.
%bigquery(interpolate=true)
...
https://github.com/apache/zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/AbstractInterpreter.java#L42