SourceSpec v2
This issue is for discussing the development of SourceSpec v2.
The development takes place in the v2 branch.
The main objectives of this major release concern three areas:
- Make code easier to understand and to maintain
- Provide a single executable, called `sourcespec`, with subcommands
- Officially support using SourceSpec as a Python API, with provided examples
Code improvements
- [x] Use a global `config` object, thus making it unnecessary to pass `config` as a function parameter. This has been implemented through these commits
- [ ] Make logging optional (to improve API usage)
- [ ] Make writing output to disk optional (to improve API usage)
- [ ] Reorganize the Python sources in submodules (subdirectories), dropping the `ssp_` prefix.
  - [ ] Each Python file should expose only one public function or public class (with maybe, as an exception, a `data_types.py` file, exposing many data type classes)
  - [ ] Each submodule should expose, via `__init__.py`, only the public functions and classes that are used by other submodules. An example is the current implementation of `config/__init__.py`
  - [ ] The main script (which calls the subcommands) will go into a file called `main.py` in the main directory
  - [ ] Subcommands will go into a `subcommands` submodule (subdirectory). E.g., `subcommands/spectral_inversion.py`, `subcommands/direct_modelling.py`, `subcommands/source_residuals.py`
- [ ] Improve the structure of the config file / `config` object (see below)
- [ ] Change all the copyright headers to `:copyright: 2018-2024 The SourceSpec Developers`
- [ ] Change the licence to GPLv3
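As a rough illustration of the "make logging optional" item, here is a minimal sketch that follows the usual standard-library convention for libraries: stay silent by default and attach real handlers only when the caller opts in. The logger name `sourcespec_demo` and the `init_logging()` signature are assumptions made for this example, not the actual SourceSpec code.

```python
import io
import logging

# Hypothetical package-level logger; a NullHandler keeps the library silent
# unless the user explicitly opts in.
logger = logging.getLogger('sourcespec_demo')
logger.addHandler(logging.NullHandler())
logger.propagate = False


def init_logging(stream=None, level=logging.INFO):
    """Attach a single stream handler; safe to call more than once."""
    # Remove handlers left over from a previous call, so that repeated
    # runs in the same interpreter do not duplicate messages.
    for handler in list(logger.handlers):
        if not isinstance(handler, logging.NullHandler):
            logger.removeHandler(handler)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter('%(levelname)s: %(message)s'))
    logger.addHandler(handler)
    logger.setLevel(level)
    return handler


# Without init_logging(), messages are discarded.
logger.info('not shown')

# After opting in, messages reach the chosen stream exactly once,
# even if init_logging() is called twice.
buffer = io.StringIO()
init_logging(buffer)
init_logging(buffer)
logger.info('inversion started')
log_text = buffer.getvalue()
```

Making `init_logging()` idempotent is a design choice aimed at notebook use, where the same setup cell may be executed several times.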
Single executable
We will provide a single executable, called sourcespec with subcommands.
Here's a mockup of invoking sourcespec -h:
usage: sourcespec [-h] [-c CONFIGFILE] [-v] <command> [options] ...
sourcespec: Earthquake source parameters from P- or S-wave displacement spectra
options:
-h, --help show this help message and exit
-c CONFIGFILE, --configfile CONFIGFILE
config file (default: sourcespec.conf)
-v, --version show program's version number and exit
commands:
sample_config write sample config file to current directory and exit
update_config update an existing config file to the latest version
update_database update an existing SourceSpec database from a previous version
sample_sspevent write sample SourceSpec Event File and exit
spectral_inversion inversion of P- or S-wave spectra
direct_modelling direct modelling of P- or S-wave spectra, based on user-defined earthquake source parameters
source_residuals compute station residuals from source_spec output
clipping_detection test the clipping detection algorithm
plot_sourcepars 1D or 2D plot of source parameters from a sqlite parameter file
The old commands (source_spec, source_model, source_residual, etc.) will still exist in v2.0 and will print an error message when invoked. They will be removed in v2.1.
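The layout above could be prototyped with `argparse` sub-parsers. The sketch below covers only three of the subcommands and omits `--version`; `build_parser()` and `deprecated_entry_point()` are hypothetical names, and the wording of the deprecation message is an assumption.

```python
import argparse


def build_parser():
    """Build a parser resembling the mockup (only some subcommands shown)."""
    parser = argparse.ArgumentParser(
        prog='sourcespec',
        description='Earthquake source parameters from P- or S-wave '
                    'displacement spectra')
    parser.add_argument(
        '-c', '--configfile', default='sourcespec.conf',
        help='config file (default: %(default)s)')
    subparsers = parser.add_subparsers(
        dest='command', metavar='<command>', required=True)
    subparsers.add_parser(
        'sample_config',
        help='write sample config file to current directory and exit')
    subparsers.add_parser(
        'spectral_inversion', help='inversion of P- or S-wave spectra')
    subparsers.add_parser(
        'source_residuals',
        help='compute station residuals from source_spec output')
    return parser


def deprecated_entry_point(old_name, new_command):
    """Return the message an old v1 command could print before exiting."""
    return (
        f'{old_name} is no longer available: '
        f'use "sourcespec {new_command}" instead'
    )


# Global options come before the subcommand, as in the usage line.
args = build_parser().parse_args(['-c', 'my.conf', 'spectral_inversion'])
message = deprecated_entry_point('source_spec', 'spectral_inversion')
```

Here `args.command` holds the chosen subcommand name, which the main script could use to dispatch to the matching module in `subcommands/`.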
Python API
Here's some ideas from @krisvanneste, which I reworked:
# Import the global config object, pre-filled with default values
from sourcespec.config import config
# Update one or more values and, optionally, validate the configuration
config.update(<DICTIONARY WITH CONF PARAMS>)
config.validate()
# Optionally, init logging
from sourcespec.logging import init_logging
init_logging()
# Functions reading inventory and traces from disk.
# They return standard ObsPy objects and can be replaced by user-defined ones.
# Note that those functions are aware of the global `config` object.
from sourcespec.input import read_station_metadata, read_traces
inv = read_station_metadata()
st = read_traces()
# Read events and picks. Returns SSPEvent() and SSPPick() objects.
# Can be replaced by the user, who should take care to use SSPEvent() and SSPPick() objects
from sourcespec.input import read_event_and_picks
ssp_event, ssp_picks = read_event_and_picks()
# optionally, the stream can be passed, in case event and/or pick information is in the trace header
ssp_event, ssp_picks = read_event_and_picks(st)
# Functions to further prepare the data for the inversion
from sourcespec.preprocess import augment_event, augment_traces
# add velocity info to hypocenter, add evname, add event to config
augment_event(ssp_event)
# add information in trace objects
st = augment_traces(st, inv, ssp_event, ssp_picks)
# here's what this function does internally:
# for trace in st:
#     _correct_traceid(trace)
#     _add_instrtype(trace)
#     _add_inventory(trace, inv)
#     _check_instrtype(trace)
#     _add_coords(trace)
#     _add_event(trace, ssp_event)
#     _add_picks(trace, ssp_picks)
# process traces, build spectra
from sourcespec.process import process_traces, build_spectra
proc_st = process_traces(st)
spec_st, specnoise_st, weight_st = build_spectra(proc_st)
# Spectral inversion
from sourcespec.spectral_inversion import spectral_inversion
sspec_output = spectral_inversion(spec_st, weight_st)
# Compute summary statistics from station spectral parameters
from statistics import compute_summary_statistics
compute_summary_statistics(sspec_output)
# Other optional things like:
# - radiated energy
# - plotting
# - local magnitude
Reorganize configuration
We will reorganize the configuration into sections that reflect the submodule structure. The new config file should look like the following (comments removed here for simplicity):
[ general ]
author_name = None
author_email = None
agency_full_name = None
agency_short_name = None
agency_url = None
agency_logo = None
[ input ]
mis_oriented_channels = None
instrument_code_acceleration = None
instrument_code_velocity = None
traceid_mapping_file = None
ignore_traceids = None
use_traceids = None
epi_dist_ranges = None
station_metadata = None
sensitivity = None
database_file = None
correct_instrumental_response = True
trace_units = auto
[ processing ]
vp_tt = None
vs_tt = None
NLL_time_dir = None
p_arrival_tolerance = 4.0
s_arrival_tolerance = 4.0
noise_pre_time = 6.0
signal_pre_time = 1.0
win_length = 5.0
variable_win_length_factor = None
...
The parameters will be accessible from the config object, as in the following examples:
config.general.author_name
config.input.station_metadata
config.processing.win_length
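To make the sectioned access concrete, here is a minimal sketch built on `types.SimpleNamespace`. It is not the actual `config` implementation; `build_config()` and `update_config()` are hypothetical helpers, and only a handful of the parameters listed above are included.

```python
from types import SimpleNamespace

# A few default values copied from the proposed config file above.
DEFAULTS = {
    'general': {'author_name': None, 'author_email': None},
    'input': {'station_metadata': None, 'trace_units': 'auto'},
    'processing': {'win_length': 5.0, 'noise_pre_time': 6.0},
}


def build_config(defaults):
    """Turn a dict of sections into an object with attribute access."""
    return SimpleNamespace(**{
        section: SimpleNamespace(**params)
        for section, params in defaults.items()
    })


def update_config(config, values):
    """Update existing parameters; reject unknown sections or keys."""
    for section, params in values.items():
        target = getattr(config, section)  # AttributeError if unknown
        for key, value in params.items():
            if not hasattr(target, key):
                raise KeyError(f'unknown parameter: {section}.{key}')
            setattr(target, key, value)


config = build_config(DEFAULTS)
update_config(config, {'processing': {'win_length': 10.0}})
```

Rejecting unknown keys in `update_config()` is one possible way to catch typos early; whether the real `config.update()` should behave like this is an open design question.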
How to test SourceSpec v2
The easiest way is to clone the git repository to a new directory, called sourcespec2, then immediately switch to the v2 branch:
git clone git@github.com:SeismicSource/sourcespec.git sourcespec2
cd sourcespec2 && git checkout v2
In the v2 branch, the package name has been temporarily renamed to sourcespec2 so that it can be installed alongside the current version. To install it, go to the sourcespec2 directory and run:
pip install -e .
This will install the sourcespec2 package and the command line utils, currently named source_spec2, source_model2, etc.
Keeping the branch up-to-date
The v2 branch is frequently rebased, so make sure to do a git pull --force
Contributing to SourceSpec v2
Contributions are always more than welcome!
Just make sure to create your development branch from the v2 branch and to make your pull requests against the v2 branch 😉.
Looking for feedback
Pinging here @krisvanneste and @rcabdia who are the main API users.
Everybody else is welcome to contribute to the discussion!
Hi Claudio,
I made a first attempt to refactor some functions (in ssp_setup.py, ssp_read_traces.py and source_spec.py) in order to make it possible to run sourcespec as a function. Should I create a new branch for this?
OK, I created a new branch called v2_ssp_func, but now I get a strange error trying to push it to my github fork:
refusing to allow an OAuth App to create or update workflow .github/workflows/github-deploy.yml without workflow scope
I will try to resolve this tomorrow...
> OK, I created a new branch called v2_ssp_func, but now I get a strange error trying to push it to my github fork:
> refusing to allow an OAuth App to create or update workflow .github/workflows/github-deploy.yml without workflow scope
> I will try to resolve this tomorrow...
Hi Kris, maybe the solution is here : https://stackoverflow.com/questions/64059610/how-to-resolve-refusing-to-allow-an-oauth-app-to-create-or-update-workflow-on
I have been able to solve it by changing the repository URL in Sourcetree, as mentioned here. A pull request will follow.
I'm able to run sourcespec2 in a Jupyter notebook, without writing anything to disk, with the modifications I made! Here's a PDF showing the notebook: test_ssp_func.pdf
For now I read all required data from our own servers, which are not open to the outside world. We will need to replace that with FDSN service calls.
Some lessons learned:
- it would be nice to have a default `config.options` mockup of the command-line arguments
- I had to add a `TRACEID_MAP` attribute to `config`
- I noticed that all config attributes that are lists contain strings. In the case of `config.Er_freq_range` (default: `['None', 'None']`), this results in an error; depending on the configuration, there may be other such cases
- it would be nice to add methods to `SSPEvent` / `SSPPick` to construct them from `obspy.core.event.Event` / `obspy.core.event.Pick` objects
- it should be possible to pass an empty inventory if instrument response is already removed or if the metadata are already attached to the traces
There is room for further improvements/streamlining, but I think we are on the right track.
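The list-of-strings problem with `config.Er_freq_range` can be illustrated with a small conversion helper. This is only a sketch of one possible approach, not the fix that went into SourceSpec; `parse_value()` and `parse_list()` are hypothetical names.

```python
def parse_value(text):
    """Convert a config string to None, bool, or float; else keep it as str."""
    if not isinstance(text, str):
        return text
    stripped = text.strip()
    if stripped == 'None':
        return None
    if stripped in ('True', 'False'):
        return stripped == 'True'
    try:
        return float(stripped)
    except ValueError:
        return stripped


def parse_list(values):
    """Apply parse_value() to every element of a list-valued parameter."""
    return [parse_value(value) for value in values]


# The default that caused the error, and a mixed example.
er_freq_range = parse_list(['None', 'None'])
freq_range = parse_list(['0.5', '20', 'None'])
```

A blanket conversion like this would also turn numeric-looking strings into floats, so in practice it should probably be applied only to parameters that are expected to be numeric.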
Thanks, Kris, for the example!
Here's my comment:
> - it would be nice to have a default `config.options` mockup of the command-line arguments
I would like to bring the options into the config object as normal attributes: there is no point, from the point of view of the API usage, to have them separated into a sub-object.
For the CLI usage, the options should be used to override the config parameters: it should be possible to run SourceSpec without any option, and have everything in the config file.
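A minimal sketch of that override logic, assuming options default to `None` when not given on the command line. The option names and the `apply_options()` helper are hypothetical, and a plain dict stands in for the `config` object.

```python
import argparse


def apply_options(config, options):
    """Override config values with the options the user actually set."""
    for key, value in vars(options).items():
        if value is not None:
            config[key] = value
    return config


parser = argparse.ArgumentParser(prog='sourcespec')
# default=None lets us tell "not given" apart from an explicit value.
parser.add_argument('--station_metadata', default=None)
parser.add_argument('--win_length', type=float, default=None)

# Values as they might come from the config file (parameter names from the
# proposed config layout; the file name is a made-up example).
config = {'station_metadata': 'inventory.xml', 'win_length': 5.0}
options = parser.parse_args(['--win_length', '10'])
apply_options(config, options)
```

With this approach, running without any option leaves the config file values untouched, and API users never need to deal with a separate options object.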
> - I had to add a `TRACEID_MAP` attribute to `config`
Ok, I will put it into `Config().__init__()`.
> - I noticed that all config attributes that are lists contain strings. In the case of `config.Er_freq_range` (default: `['None', 'None']`), this results in an error; depending on the configuration, there may be other such cases
Ok, I will fix it in `Config().__init__()`. I made a quick scan: it doesn't seem to me that there are other such cases.
> - it would be nice to add methods to `SSPEvent` / `SSPPick` to construct them from `obspy.core.event.Event` / `obspy.core.event.Pick` objects
Noted 😉
> - it should be possible to pass an empty inventory if instrument response is already removed or if the metadata are already attached to the traces
Noted 😉
> There is room for further improvements/streamlining, but I think we are on the right track.
Great!
> - I had to add a `TRACEID_MAP` attribute to `config`
> - I noticed that all config attributes that are lists contain strings. In the case of `config.Er_freq_range` (default: `['None', 'None']`), this results in an error; depending on the configuration, there may be other such cases
These two points are fixed in this commit: https://github.com/SeismicSource/sourcespec/commit/87ce4f4315a24cc2da3527be6fbd793e3269b27e
Claudio, I tested my notebook after the new rebase, and almost everything still works, except:
- I had to update the call to the `ssp_output` function
- I get an error when plotting stacked spectra because my version of matplotlib does not support `ax.inset_axes()`; I will try to solve this with a local patch
Based on an issue with the main branch that I experienced yesterday, I realized that we also need to have a function to clean up state at the end of each run (or probably better at the beginning), so that no problems occur when ssp_run is called a second time.
So far, I have to do the following:
from sourcespec import ssp_setup
ssp_setup.oldlogfile = None
from sourcespec import ssp_wave_arrival
ssp_wave_arrival.add_arrival_to_trace.pick_cache = dict()
ssp_wave_arrival.add_arrival_to_trace.travel_time_cache = dict()
ssp_wave_arrival.add_arrival_to_trace.angle_cache = dict()
from sourcespec import ssp_plot_traces
ssp_plot_traces.SAVED_FIGURE_CODES = []
ssp_plot_traces.BBOX = None
from sourcespec import ssp_plot_spectra
ssp_plot_spectra.SAVED_FIGURE_CODES = []
ssp_plot_spectra.BBOX = None
I haven't checked yet if this still works in v2, but I guess you know this better than me.
Ok, thanks for the feedback!
Whenever you have time, it would be great if you could contribute this ssp_clean_state() function 😉
Claudio,
What are the further steps needed to finalize a first v2 version?
Some possible things that come to mind:
- do we need a function or method to populate `Options` with all possible keys (but set to None)?
- do we want to allow for interactive plotting or just stick with writing to an output folder?
- I think the logging doesn't fully work in interactive mode currently: a number of messages seem to be missing, and when I run the code a second time there are even fewer messages
- ...
Thanks for the feedback
> - do we need a function or method to populate `Options` with all possible keys (but set to None)?
This is interesting and not too difficult to tackle.
Currently, the following code:
from sourcespec2.setup import config
gives a config object in which all the parameters are set to their defaults, but config.options is empty.
Can you make a PR with that?
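For discussion, here is a minimal sketch of an options container pre-populated with `None`. The option names in `OPTION_NAMES` are placeholders (the real list would come from the command-line parser), and the attribute-access pattern is an assumption modelled on a dict-like config class.

```python
# Placeholder option names, for illustration only.
OPTION_NAMES = ('config_file', 'evid', 'outdir')


class Options(dict):
    """Dict with attribute access, pre-filled with None for known options."""

    def __init__(self, **kwargs):
        # dict.fromkeys() sets every known option to None.
        super().__init__(dict.fromkeys(OPTION_NAMES))
        self.update(kwargs)

    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError as err:
            raise AttributeError(key) from err

    __setattr__ = dict.__setitem__


options = Options()
evid_before = options.evid
options.outdir = 'sspec_out'
```

Code that reads `options.evid` then gets `None` instead of an `AttributeError`, while a misspelled option name still fails loudly.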
> - do we want to allow for interactive plotting or just stick with writing to an output folder?
That would be wonderful. SourceSpec has some sort of interactive plotting (config.plot_show = True). But I never tested it in a Jupyter notebook. We should maybe start from here.
> - I think the logging doesn't fully work in interactive mode currently: a number of messages seem to be missing, and when I run the code a second time there are even fewer messages
I can tackle this later on, once we have finished restructuring the modules.
- ...
Todo:
- [ ] Rebase on current v1 !
- [ ] Create the `processing` subdir and move processing modules in there
- [ ] Move the other modules in their subdirs (e.g., `inversion`, `postprocessing`, `output`, `plotting`)
- [ ] Select the examples to show in the paper
- [ ] Prepare a first quick draft for the paper!
> Can you make a PR with that?
OK, I will look into that.
I will also check what happens if I set config.plot_show = True in a notebook.
> That would be wonderful. SourceSpec has some sort of interactive plotting (`config.plot_show = True`). But I never tested it in a Jupyter notebook. We should maybe start from here.
I tried plotting interactively, but this results in the following error:
E:\Home\_kris\Python\cloned_repos\sourcespec2\sourcespec2\ssp_plot_spectra.py in _make_fig(plot_params)
140 if not stack_plots:
141 textstr += (
--> 142 f'- {config.end_of_run.strftime("%Y-%m-%d %H:%M:%S")} '
143 f'{config.end_of_run} '
144 )
E:\Home\_kris\Python\cloned_repos\sourcespec2\sourcespec2\setup\config.py in __getattr__(self, key)
196 return self.__getitem__(key)
197 except KeyError as err:
--> 198 raise AttributeError(err) from err
199
200 __setattr__ = __setitem__
AttributeError: 'end_of_run'
Apparently, end_of_run and end_of_run_tz are added to the config object in the ssp_output.write_output function. Also, additional information is appended to sspec_output.run_info in this function. I think it would be better 1) to add the end of run information to run_info as well, and 2) add all the necessary information to run_info when the inversion is completed, before ssp_output is called.
> add all the necessary information to `run_info` when the inversion is completed, before `ssp_output` is called.
Maybe in a dedicated function that is called at the end of ssp_run
> > add all the necessary information to `run_info` when the inversion is completed, before `ssp_output` is called.
> Maybe in a dedicated function that is called at the end of `ssp_run`
Do you want me to investigate this?
> > > add all the necessary information to `run_info` when the inversion is completed, before `ssp_output` is called.
> > Maybe in a dedicated function that is called at the end of `ssp_run`
> Do you want me to investigate this?
yes please!
OK, I will try to come up with a solution.
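As a starting point for that discussion, a dedicated function could look roughly like the sketch below. `RunInfo` is a hypothetical stand-in for `sspec_output.run_info`, and `finalize_run_info()` is an assumed name; the real attributes and call site may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class RunInfo:
    """Hypothetical stand-in for sspec_output.run_info."""
    run_id: str = ''
    end_of_run: Optional[datetime] = None
    end_of_run_tz: Optional[str] = None


def finalize_run_info(run_info, now=None):
    """Record end-of-run time; meant to be called once the inversion is done."""
    if now is None:
        now = datetime.now(timezone.utc)
    run_info.end_of_run = now
    run_info.end_of_run_tz = now.tzname()
    return run_info


# A fixed timestamp is passed here only to make the example reproducible.
run_info = finalize_run_info(
    RunInfo(run_id='test'),
    now=datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc))
# Plotting code can now format the timestamp without touching `config`.
label = run_info.end_of_run.strftime('%Y-%m-%d %H:%M:%S')
```

Calling such a function at the end of `ssp_run`, before any output or plotting code, would make `end_of_run` available regardless of whether `ssp_output` is ever called.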