Peter Inglesby

9 issues by Peter Inglesby

### Describe the bug

There's a nasty interaction between the following:

* some import-time caching in SQLAlchemy (in [`sqlalchemy.inspection._registrars`](https://github.com/sqlalchemy/sqlalchemy/blob/66be1482db06adb908432b2e3b41d9393d1319f7/lib/sqlalchemy/inspection.py#L53)),
* the way Coverage.py handles editable installs, and
* the way...

bug
orm
PRs (with tests!) welcome
external library/application issues

@keaneokelley Can you give me a thumbs up/down if you'd like this merged?

E.g. https://travis-ci.org/inglesp/http-crawler/jobs/285774955

#12 means that now we ignore URL schemes that cannot be handled by `requests`, but we should be able to identify mistyped URL schemes. See discussion in #6.
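One way to spot a mistyped scheme is to compare it against the schemes `requests` can handle and flag near misses. The following is only a sketch of that idea, not how the library does or will do it: `suggest_scheme`, `SUPPORTED_SCHEMES`, and the similarity cutoff of 0.7 are all assumptions made for illustration.

```python
from difflib import get_close_matches
from urllib.parse import urlsplit

# Assumption: these are the only schemes the crawler can fetch.
SUPPORTED_SCHEMES = ("http", "https")


def suggest_scheme(url):
    """Return the supported scheme `url` probably meant, or None.

    None means the scheme is either already supported or is not
    similar enough to a supported one to look like a typo.
    """
    scheme = urlsplit(url).scheme
    if not scheme or scheme in SUPPORTED_SCHEMES:
        return None
    # The 0.7 cutoff is an arbitrary choice: it catches "htp" and "htps"
    # while leaving genuinely different schemes such as "ftp" alone.
    matches = get_close_matches(scheme, SUPPORTED_SCHEMES, n=1, cutoff=0.7)
    return matches[0] if matches else None
```

With this approach `htp://example.com` would be reported as a likely typo for `http`, while `mailto:` and `ftp:` links would still be ignored silently.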

We currently extract links from HTML by looking for `src` and `href` attributes, and from CSS by looking for `@import` rules and `URI` tokens. A user might want to extract...
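For readers unfamiliar with the approach, collecting `src` and `href` attributes from HTML can be sketched with the standard library alone. This is not the library's actual code: `LinkCollector` and `extract_links` are hypothetical names, and the CSS side (`@import` rules and `URI` tokens) is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the value of every `src` and `href` attribute."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return absolute URLs for all src/href attributes in `html`."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links
```

Making the set of attribute names a parameter would be one way to let a user extract other kinds of link.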

We currently use `requests`'s default behaviour of following redirects. A user might not always want this, as they might want to use the library to find unnecessary redirects on a...

We currently follow all links, but in some cases this might not be appropriate. We should find a way to allow the user to configure which links to follow.
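One possible shape for this is a user-supplied predicate that the crawler consults before queueing a link. The sketch below is an assumption about how that could look, not the library's API: `crawl_links`, `get_links`, and `should_follow` are hypothetical names, and pages are modelled as a plain dictionary rather than fetched over HTTP.

```python
from collections import deque


def crawl_links(start, get_links, should_follow=lambda url: True):
    """Breadth-first walk that yields each visited URL once.

    `get_links(url)` returns the links found on a page, and
    `should_follow(url)` decides whether a discovered link is queued.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        yield url
        for link in get_links(url):
            if link not in seen and should_follow(link):
                seen.add(link)
                queue.append(link)
```

The default predicate accepts everything, so existing behaviour would be unchanged unless the user opts in.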

We currently extract links from all pages that are on the same domain as the original URL that is passed to `crawl`. This might be too narrow (for instance, a...
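The same-domain rule can be expressed as a host comparison, which also shows where a wider scope could plug in. This is a sketch under assumptions: `in_scope` and its `extra_hosts` parameter are hypothetical and only illustrate one way the rule might be relaxed.

```python
from urllib.parse import urlsplit


def in_scope(url, start_url, extra_hosts=()):
    """Return True if `url` is on a host we should extract links from.

    By default only the host of `start_url` is in scope; `extra_hosts`
    is a hypothetical way for the user to allow additional hosts.
    """
    allowed = {urlsplit(start_url).hostname, *extra_hosts}
    return urlsplit(url).hostname in allowed
```

For example, `in_scope("https://static.example.net/x", "https://example.com/")` is false by default and becomes true once `"static.example.net"` is passed in `extra_hosts`.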

I have kept the same directory structure for content, templates, and media.