Support URLs in ccget
I would like to help. Can you provide more information about what needs to be done?
The idea is to add the ability to pass URLs for ccget on the command line, so you could load things from the the web. cclib already support this internally, so it's just a question of updating that script.
I suggest using constructor overloading or some sort of implicit URL validation for the input filename.
The second approach can be easily implemented in the openlogfile module of parser/logfileparser.py.
After URL validation, the file can be downloaded either to the working directory or a temporary directory using urllib.request module as follows,
- Saving to a temporary location
tempfile_path, http_messege = urllib.request.urlretrieve(url) - or, Saving to some other path
urllib.request.urlretrieve(url, file_path)
The saved file can be opened and used as any other local file would be used. I can implement this code and create a PR if this approach sounds fine.
@gaursagar I don't think we need to validate anything. cclib already support loading the URLs, we just need to make sure the ccget script support it as well.
I believe ccget has support for file handling after PR #351 (change). Please verify.
The following is working for me,
ccget -l https://raw.githubusercontent.com/cclib/cclib/master/data/Molpro/basicMolpro2012/dvb_gopt.log https://raw.githubusercontent.com/cclib/cclib/master/data/Molpro/basicMolpro2012/h2o_mp2.out
and giving output:
Attempting to read https://raw.githubusercontent.com/cclib/cclib/master/data/Molpro/basicMolpro2012/dvb_gopt.log
cclib can parse the following attributes from https://raw.githubusercontent.com/cclib/cclib/master/data/Molpro/basicMolpro2012/dvb_gopt.log:
aonames
atombasis
atomcoords
atomnos
charge
coreelectrons
gbasis
geotargets
homos
metadata
moments
mult
natom
nbasis
nmo
scfenergies
scftargets
scfvalues
Attempting to read https://raw.githubusercontent.com/cclib/cclib/master/data/Molpro/basicMolpro2012/h2o_mp2.out
cclib can parse the following attributes from https://raw.githubusercontent.com/cclib/cclib/master/data/Molpro/basicMolpro2012/h2o_mp2.out:
atomcoords
atomnos
charge
coreelectrons
geotargets
homos
metadata
moments
mpenergies
mult
natom
nbasis
nmo
scfenergies
scftargets
scfvalues
If this is the expected behavior, this issue can be closed.
@gaursagar Could you make sure we have appropriate unit tests for this? Related to #54
Yes, I will look into making the tests.
On 13-May-2017 5:46 AM, "Karol M. Langner" [email protected] wrote:
@gaursagar https://github.com/gaursagar Could you make sure we have appropriate unit tests for this? Related to #54 https://github.com/cclib/cclib/issues/54
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/cclib/cclib/issues/267#issuecomment-301212040, or mute the thread https://github.com/notifications/unsubscribe-auth/AFpYrq7ui1J6t7H9BNIxECFXzzU-Ihgiks5r5PY_gaJpZM4IMaGI .
ccget calls ccopen via ccread for all sorts of file handling including URL IO. Tests for these already cover URL handling.
If there is any interest in testing ccget as a script, can we consider using bash scripts?
#!/bin/bash
function assertEquals()
{
MESSEGE=$1; shift
EXPECTED=$1; shift
ACTUAL=$1; shift
echo -n "$MESSEGE: "
if [ "$EXPECTED" != "$ACTUAL" ]; then
echo "FAILED: EXPECTED=$EXPECTED ACTUAL=$ACTUAL"
else
echo PASSED
fi
}
ACTUAL=$(ccget -l https://raw.githubusercontent.com/cclib/cclib/master/data/QChem/basicQChem4.2/C_bigbasis.out | xargs | md5)
EXPECTED=$(echo "Attempting to read https://raw.githubusercontent.com/cclib/cclib/master/data/QChem/basicQChem4.2/C_bigbasis.out
cclib can parse the following attributes from https://raw.githubusercontent.com/cclib/cclib/master/data/QChem/basicQChem4.2/C_bigbasis.out: aonames atombasis atomcharges atomcoords atomnos charge coreelectrons enthalpy entropy freeenergy gbasis homos metadata mocoeffs moenergies moments mult natom nbasis nmo pressure scfenergies scftargets scfvalues temperature" | xargs | md5)
assertEquals "checking ccget url IO" $EXPECTED $ACTUAL
If there is any interest in testing ccget as a script, can we consider using bash scripts?
I am definitely interested in this.
Any suggestions regarding what approach to take?
On 15-May-2017 8:17 PM, "Eric Berquist" [email protected] wrote:
If there is any interest in testing ccget as a script, can we consider using bash scripts?
I am definitely interested in this.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/cclib/cclib/issues/267#issuecomment-301498001, or mute the thread https://github.com/notifications/unsubscribe-auth/AFpYrmOL-KLcIMS3WVsWnwXWWEM5tA0dks5r6GVygaJpZM4IMaGI .
I don't know much about testing shell scripts (since I've never done it), so this looks like a good start to me. There are probably "best practices" that I'm unaware of.
I think the exit codes (stored in $? after each command) should also be checked, though I don't know if Python automatically sets the right code upon success or failure (such as when we'd have an AttributeError).
One other thing is that we'd need to generalize the above to take any path (not just a URL) and the attributes, but that shouldn't be too bad.
As much as I love to write bash scripts, I would avoid writing tests in bash. And, why would we write tests in bash if we have Python at our disposal? And, ccget is a Python function, it seems much more straightforward to test that function in Python - perhaps breaking it up into smaller pieces where convenient.
Breaking ccget into functions would be a better idea. Currently it is a single large function, making writing python unittests not so straightforward.
On 16-May-2017 9:25 AM, "Karol M. Langner" [email protected] wrote:
As much as I love to write bash scripts, I would avoid writing tests in bash. And, why would we write tests in bash if we have Python at our disposal? And, ccget is a Python function, it seems much more straightforward to test that function in Python - perhaps breaking it up into smaller pieces where convenient.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/cclib/cclib/issues/267#issuecomment-301668633, or mute the thread https://github.com/notifications/unsubscribe-auth/AFpYrjEiT2KGVQeQ9Q_gW41JIon0e0OBks5r6R5MgaJpZM4IMaGI .
I suppose that since there would be no problems with the Python -> Bash interface, just breakage Python-side, that shell script tests are unnecessary.
Moving any logic that we can to a module would be great, since we could unit test that. Taking a second look at ccget, however, a substantial amount of code would stay there (dealing with arguments and inputs).
Given that we currently have failures with the scripts, I suggest adding a regression test that runs against all logfiles (at least the unit test ones for now) - and including that in Travis CI. Sagar says it runs in 2 minutes, which is on the same scale as the existing regressions.
@berquist If you're interested in testing bash scripts, check out BATS: https://github.com/sstephenson/bats. I've used it a little bit.
Moving to v1.5.4.
@oliver-s-lee Would you consider helping write a test for this? The feature works fine I think, except for #1268.
Sure can do, although I think we already have a test for this (for the backend machinery anyway) in the IO tests.
Added in #1370