cclib icon indicating copy to clipboard operation
cclib copied to clipboard

Support URLs in ccget

Open langner opened this issue 9 years ago • 17 comments

langner avatar Apr 21 '16 07:04 langner

I would like to help. Can you provide more information about what needs to be done?

gaursagar avatar May 02 '16 16:05 gaursagar

The idea is to add the ability to pass URLs for ccget on the command line, so you could load things from the the web. cclib already support this internally, so it's just a question of updating that script.

langner avatar May 03 '16 14:05 langner

I suggest using constructor overloading or some sort of implicit URL validation for the input filename. The second approach can be easily implemented in the openlogfile module of parser/logfileparser.py.

After URL validation, the file can be downloaded either to the working directory or a temporary directory using urllib.request module as follows,

  1. Saving to a temporary location tempfile_path, http_messege = urllib.request.urlretrieve(url)
  2. or, Saving to some other path urllib.request.urlretrieve(url, file_path)

The saved file can be opened and used as any other local file would be used. I can implement this code and create a PR if this approach sounds fine.

gaursagar avatar Mar 04 '17 15:03 gaursagar

@gaursagar I don't think we need to validate anything. cclib already support loading the URLs, we just need to make sure the ccget script support it as well.

langner avatar May 12 '17 07:05 langner

I believe ccget has support for file handling after PR #351 (change). Please verify.

The following is working for me,

ccget -l https://raw.githubusercontent.com/cclib/cclib/master/data/Molpro/basicMolpro2012/dvb_gopt.log https://raw.githubusercontent.com/cclib/cclib/master/data/Molpro/basicMolpro2012/h2o_mp2.out

and giving output:

Attempting to read https://raw.githubusercontent.com/cclib/cclib/master/data/Molpro/basicMolpro2012/dvb_gopt.log
cclib can parse the following attributes from https://raw.githubusercontent.com/cclib/cclib/master/data/Molpro/basicMolpro2012/dvb_gopt.log:
  aonames
  atombasis
  atomcoords
  atomnos
  charge
  coreelectrons
  gbasis
  geotargets
  homos
  metadata
  moments
  mult
  natom
  nbasis
  nmo
  scfenergies
  scftargets
  scfvalues
Attempting to read https://raw.githubusercontent.com/cclib/cclib/master/data/Molpro/basicMolpro2012/h2o_mp2.out
cclib can parse the following attributes from https://raw.githubusercontent.com/cclib/cclib/master/data/Molpro/basicMolpro2012/h2o_mp2.out:
  atomcoords
  atomnos
  charge
  coreelectrons
  geotargets
  homos
  metadata
  moments
  mpenergies
  mult
  natom
  nbasis
  nmo
  scfenergies
  scftargets
  scfvalues

If this is the expected behavior, this issue can be closed.

gaursagar avatar May 12 '17 09:05 gaursagar

@gaursagar Could you make sure we have appropriate unit tests for this? Related to #54

langner avatar May 13 '17 00:05 langner

Yes, I will look into making the tests.

On 13-May-2017 5:46 AM, "Karol M. Langner" [email protected] wrote:

@gaursagar https://github.com/gaursagar Could you make sure we have appropriate unit tests for this? Related to #54 https://github.com/cclib/cclib/issues/54

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/cclib/cclib/issues/267#issuecomment-301212040, or mute the thread https://github.com/notifications/unsubscribe-auth/AFpYrq7ui1J6t7H9BNIxECFXzzU-Ihgiks5r5PY_gaJpZM4IMaGI .

gaursagar avatar May 13 '17 01:05 gaursagar

ccget calls ccopen via ccread for all sorts of file handling including URL IO. Tests for these already cover URL handling.

If there is any interest in testing ccget as a script, can we consider using bash scripts?

#!/bin/bash
function assertEquals()
{
    MESSEGE=$1; shift
    EXPECTED=$1; shift
    ACTUAL=$1; shift
    echo -n "$MESSEGE: "
    if [ "$EXPECTED" != "$ACTUAL" ]; then
        echo "FAILED: EXPECTED=$EXPECTED ACTUAL=$ACTUAL"
    else
        echo PASSED
    fi
}

ACTUAL=$(ccget -l https://raw.githubusercontent.com/cclib/cclib/master/data/QChem/basicQChem4.2/C_bigbasis.out | xargs | md5)
EXPECTED=$(echo "Attempting to read https://raw.githubusercontent.com/cclib/cclib/master/data/QChem/basicQChem4.2/C_bigbasis.out
cclib can parse the following attributes from https://raw.githubusercontent.com/cclib/cclib/master/data/QChem/basicQChem4.2/C_bigbasis.out: aonames atombasis atomcharges atomcoords atomnos charge coreelectrons enthalpy entropy freeenergy gbasis homos metadata mocoeffs moenergies moments mult natom nbasis nmo pressure scfenergies scftargets scfvalues temperature" | xargs | md5)
assertEquals "checking ccget url IO" $EXPECTED $ACTUAL

gaursagar avatar May 15 '17 06:05 gaursagar

If there is any interest in testing ccget as a script, can we consider using bash scripts?

I am definitely interested in this.

berquist avatar May 15 '17 14:05 berquist

Any suggestions regarding what approach to take?

On 15-May-2017 8:17 PM, "Eric Berquist" [email protected] wrote:

If there is any interest in testing ccget as a script, can we consider using bash scripts?

I am definitely interested in this.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/cclib/cclib/issues/267#issuecomment-301498001, or mute the thread https://github.com/notifications/unsubscribe-auth/AFpYrmOL-KLcIMS3WVsWnwXWWEM5tA0dks5r6GVygaJpZM4IMaGI .

gaursagar avatar May 15 '17 14:05 gaursagar

I don't know much about testing shell scripts (since I've never done it), so this looks like a good start to me. There are probably "best practices" that I'm unaware of.

I think the exit codes (stored in $? after each command) should also be checked, though I don't know if Python automatically sets the right code upon success or failure (such as when we'd have an AttributeError).

One other thing is that we'd need to generalize the above to take any path (not just a URL) and the attributes, but that shouldn't be too bad.

berquist avatar May 15 '17 14:05 berquist

As much as I love to write bash scripts, I would avoid writing tests in bash. And, why would we write tests in bash if we have Python at our disposal? And, ccget is a Python function, it seems much more straightforward to test that function in Python - perhaps breaking it up into smaller pieces where convenient.

langner avatar May 16 '17 03:05 langner

Breaking ccget into functions​ would be a better idea. Currently it is a single large function, making writing python unittests not so straightforward.

On 16-May-2017 9:25 AM, "Karol M. Langner" [email protected] wrote:

As much as I love to write bash scripts, I would avoid writing tests in bash. And, why would we write tests in bash if we have Python at our disposal? And, ccget is a Python function, it seems much more straightforward to test that function in Python - perhaps breaking it up into smaller pieces where convenient.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/cclib/cclib/issues/267#issuecomment-301668633, or mute the thread https://github.com/notifications/unsubscribe-auth/AFpYrjEiT2KGVQeQ9Q_gW41JIon0e0OBks5r6R5MgaJpZM4IMaGI .

gaursagar avatar May 16 '17 04:05 gaursagar

I suppose that since there would be no problems with the Python -> Bash interface, just breakage Python-side, that shell script tests are unnecessary.

berquist avatar May 16 '17 12:05 berquist

Moving any logic that we can to a module would be great, since we could unit test that. Taking a second look at ccget, however, a substantial amount of code would stay there (dealing with arguments and inputs).

Given that we currently have failures with the scripts, I suggest adding a regression test that runs against all logfiles (at least the unit test ones for now) - and including that in Travis CI. Sagar says it runs in 2 minutes, which is on the same scale as the existing regressions.

langner avatar May 16 '17 16:05 langner

@berquist If you're interested in testing bash scripts, check out BATS: https://github.com/sstephenson/bats. I've used it a little bit.

TDI-tenderholt avatar May 20 '17 14:05 TDI-tenderholt

Moving to v1.5.4.

langner avatar Apr 04 '18 15:04 langner

@oliver-s-lee Would you consider helping write a test for this? The feature works fine I think, except for #1268.

berquist avatar Aug 29 '23 23:08 berquist

Sure can do, although I think we already have a test for this (for the backend machinery anyway) in the IO tests.

oliver-s-lee avatar Aug 30 '23 07:08 oliver-s-lee

Added in #1370

oliver-s-lee avatar Feb 26 '24 17:02 oliver-s-lee