documentation missing
I can't seem to find any documentation anywhere, and the majority of the .m files have few or no comments, which makes it difficult to know what some of the functions within them actually do.
Additionally, some code usage examples would be helpful. The examples in the root-level README file appear to be Swift code, which is not very useful if you are required to use Objective-C.
Best regards dragonsKnight5
@dragonsKnight5 Hello! What documentation we have is mostly in the header files, which are in the Sources/include directory. But your point is well taken: your best chance to grok most of the code is keeping a copy of the specification at hand for reference, and that's pretty inconvenient.
For examples, have a look at an old version of the read me. I figure most new code using HTMLReader would be in Swift, so I converted the examples, but that's awesome to hear there's still some new Objective-C being written in the world.
Let me know if there's anything specific you're curious about in the code, or if you have any questions about how to use HTMLReader!
How do I get a list of elements by class? I think it may be HTMLElement's initWithTagName, but I'm still rather confused.
Thank you for getting back to me so quickly
You'll need to make an HTMLDocument, using e.g. [HTMLDocument documentWithString:]. Once you have that, you can ask the document for elements that match CSS selectors, including classes. So if you're looking for elements with the class updog, you might try
HTMLDocument *doc = [HTMLDocument documentWithString:coolString];
NSArray *elements = [doc nodesMatchingSelector:@".updog"];
Let me know if that helps!
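To make the selector lookup above a little more concrete, here is a minimal sketch that collects the text of every element carrying a given class. The helper name TextOfElementsWithClass is hypothetical (it is not part of HTMLReader), and the import line assumes the library is integrated as a framework, so it may need adjusting for your setup.

```objc
#import <Foundation/Foundation.h>
#import <HTMLReader/HTMLReader.h> // assumption: adjust to match how HTMLReader is integrated

// Hypothetical helper, not part of HTMLReader: returns the text content of
// every element in the HTML string that has the given class.
static NSArray<NSString *> *TextOfElementsWithClass(NSString *html, NSString *className) {
    HTMLDocument *doc = [HTMLDocument documentWithString:html];

    // A class selector is the class name prefixed with a dot, e.g. ".updog".
    NSString *selector = [@"." stringByAppendingString:className];

    NSMutableArray<NSString *> *texts = [NSMutableArray array];
    for (HTMLElement *element in [doc nodesMatchingSelector:selector]) {
        [texts addObject:element.textContent];
    }
    return texts;
}
```

Calling TextOfElementsWithClass(html, @"updog") should then give one string per matching element, in document order.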
That was a big help, thank you very much.
I know the next question is outside the scope of HTMLReader, but I'm hoping you can direct me to some appropriate resources. How do I get the processed data out of the NSURLSession instance? I've tried assigning it to a global NSMutableArray variable, but as soon as the NSURLSession finishes, the NSMutableArray reverts to its pre-NSURLSession content.
Thank you again for your patience and help. Best regards, dragonsKnight5
I've worked around the issue using
NSURL *URL = [NSURL URLWithString:@"http://your-url-here"];
NSData *data = [NSData dataWithContentsOfURL:URL];
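// [data bytes] is not NUL-terminated, so decode using the data's own length and an explicit encoding
NSString *html = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
I know this runs on the main thread, which is bad practice, but I can't find a way to get the processed data out of NSURLSession as described in the examples in your readme link above.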
I suspect the issue is that URLSession does its work in the background and calls you later. For example:
[[[NSURLSession sharedSession] dataTaskWithURL:url completionHandler:
    ^(NSData *data, NSURLResponse *response, NSError *error) {
        NSLog(@"hello from the completion handler");
        NSDictionary *headers = [(NSHTTPURLResponse *)response allHeaderFields];
        HTMLDocument *doc = [HTMLDocument documentWithData:data
                                         contentTypeHeader:headers[@"Content-Type"]];
        // you can work with the document inside this block all you like
        // e.g. call a method and pass in the document
        // (you probably want to dispatch_async on to the main queue first)
    }] resume];
// outside the completion block, execution continues without waiting
NSLog(@"hello from after resuming the session task");
If you run that, you should see logs in this order:
hello from after resuming the session task
hello from the completion handler
The completion block is called in the background after the URLSession does its work, so execution in your current queue continues immediately after resuming the task.
I looked around briefly but didn’t find any great NSURLSession tutorials. Search for that (URLSession is the Swift name, so you might have better luck keeping the NS prefix) and look at a few; you'll probably start to see some patterns.
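One common pattern for getting the result "out" of the session is to have your own function take a completion block and call it once the document is ready. The sketch below is only an illustration under assumptions: FetchDocument is a hypothetical helper (not part of HTMLReader or Foundation), and the import line may need adjusting for your setup. Anything that reads the result has to run inside the block you pass in, because code placed after the call runs before the download finishes.

```objc
#import <Foundation/Foundation.h>
#import <HTMLReader/HTMLReader.h> // assumption: adjust to match how HTMLReader is integrated

// Hypothetical helper: downloads the URL, parses the response into an
// HTMLDocument, and hands the result to `completion` on the main queue.
static void FetchDocument(NSURL *url, void (^completion)(HTMLDocument *document, NSError *error)) {
    NSURLSessionDataTask *task =
        [[NSURLSession sharedSession] dataTaskWithURL:url completionHandler:
            ^(NSData *data, NSURLResponse *response, NSError *error) {
                HTMLDocument *document = nil;
                if (data != nil) {
                    // Only HTTP responses carry header fields.
                    NSString *contentType = nil;
                    if ([response isKindOfClass:[NSHTTPURLResponse class]]) {
                        contentType = [(NSHTTPURLResponse *)response allHeaderFields][@"Content-Type"];
                    }
                    document = [HTMLDocument documentWithData:data
                                            contentTypeHeader:contentType];
                }
                // Hop back to the main queue before touching UI or shared state.
                dispatch_async(dispatch_get_main_queue(), ^{
                    completion(document, error);
                });
            }];
    [task resume];
}
```

A caller would then store or display the results from inside the block, for example by assigning [document nodesMatchingSelector:@".updog"] to a property there, rather than reading a variable immediately after calling FetchDocument.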
Thanks for getting back to me so quickly
To retrieve hyperlinks, I think I use the same approach you outlined above, just with the selector string changed to "a" or "a href", is that right?
Exactly right. One way to get all the links might be a[href]. If you've used CSS before to style webpages, it's the part before the { where you say which elements you want styled. The examples at https://developer.mozilla.org/en-US/docs/Learn/CSS/Building_blocks/Selectors might be helpful too.
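To illustrate how those selector strings differ, here is a small sketch that counts the matches for a given selector. MatchCount is a hypothetical helper, and the import line is an assumption about how the library is integrated. Note that in CSS a space is the descendant combinator, so "a href" would look for href elements nested inside a elements, which is why the attribute form a[href] is the one that finds links.

```objc
#import <Foundation/Foundation.h>
#import <HTMLReader/HTMLReader.h> // assumption: adjust to match how HTMLReader is integrated

// Hypothetical helper: how many elements in the HTML string match the selector?
static NSUInteger MatchCount(NSString *html, NSString *selector) {
    HTMLDocument *doc = [HTMLDocument documentWithString:html];
    return [doc nodesMatchingSelector:selector].count;
}

// Selectors worth comparing on the same document:
//   @"a"                  every <a> element, with or without an href
//   @"a[href]"            only <a> elements that have an href attribute
//   @"div.updog a[href]"  links with an href inside a <div class="updog">
//   @"a href"             <href> elements inside <a> elements (normally none)
```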
With your greatly appreciated help, I've managed to filter down to divs with a particular class and then get the hyperlinks within them, but I can't work out how to get the URL string out of the HTMLElement.
The debugger shows me that the value I want is in attributes _map, but nothing I've tried so far has worked to retrieve it.
Thank you for taking the time to help me with this. Objective-C isn't my main language at the moment, so some things aren't translating across very well.
Best regards dragonsKnight5
HTMLElement has an attributes property which returns a dictionary, and it also supports subscripting directly. Either of these should work:
NSString *url = aElement[@"href"];
NSString *url2 = aElement.attributes[@"href"];
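Putting the pieces together, a sketch like the following could pull the link targets out of the matched elements and resolve relative ones against the page's address. LinkURLs and its baseURL parameter are hypothetical names for illustration, and the import line is an assumption about how the library is integrated.

```objc
#import <Foundation/Foundation.h>
#import <HTMLReader/HTMLReader.h> // assumption: adjust to match how HTMLReader is integrated

// Hypothetical helper: returns absolute URLs for every element matching
// `selector` that has a non-empty href attribute.
static NSArray<NSURL *> *LinkURLs(NSString *html, NSString *selector, NSURL *baseURL) {
    HTMLDocument *doc = [HTMLDocument documentWithString:html];
    NSMutableArray<NSURL *> *urls = [NSMutableArray array];
    for (HTMLElement *link in [doc nodesMatchingSelector:selector]) {
        NSString *href = link[@"href"]; // same as link.attributes[@"href"]
        if (href.length == 0) {
            continue; // attribute missing or empty
        }
        // Relative hrefs such as "/x" are resolved against the page URL.
        NSURL *url = [NSURL URLWithString:href relativeToURL:baseURL];
        if (url != nil) {
            [urls addObject:url.absoluteURL];
        }
    }
    return urls;
}
```

For the case described above, the selector might be something like @"div.updog a[href]", with baseURL set to the URL the page was downloaded from.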
And it's my pleasure! Objective-C is kinda crufty (it's a few decades old at this point), but learning it helped me understand objects and methods in a way that other languages didn't, so I'll always be fond of it.