`fuzzed_y': undefined method `keys' for #<Array:0...>
The code is:
=begin
The following requires cp'ed from:
https://github.com/tardate/pdf-reader-turtletext/blob/master/lib/pdf-reader-turtletext.rb
=end
require 'pdf-reader'
require 'pdf/reader/patch/object_hash'
require 'pdf/reader/positional_text_receiver'
require 'pdf/reader/turtletext'
require 'pdf/reader/turtletext/version'
require 'pdf/reader/turtletext/textangle'
=begin
The following from:
"How to instantiate Turtletext in code"
https://github.com/tardate/pdf-reader-turtletext
=end
pdf_filename = '../taxforms/f1065-2017.pdf'
reader = PDF::Reader::Turtletext.new(pdf_filename)
=begin
The following from:
"How to extract text within a region described in relation to other text"
https://github.com/tardate/pdf-reader-turtletext
=end
textangle = reader.bounding_box do
  page 4
end
textangle.text
However, when run with my ruby2.3, it produces the error in the subject line:
make -k
gem list pdf-reader
*** LOCAL GEMS ***
pdf-reader (1.4.0, 1.1.1)
pdf-reader-html (0.1.0)
pdf-reader-markup (0.0.1)
pdf-reader-turtletext (0.2.2)
ruby how_to_instantiate.rb
/var/lib/gems/2.3.0/gems/pdf-reader-turtletext-0.2.2/lib/pdf/reader/turtletext.rb:53:in `fuzzed_y': undefined method `keys' for #<Array:0x000000013d5058> (NoMethodError)
from /var/lib/gems/2.3.0/gems/pdf-reader-turtletext-0.2.2/lib/pdf/reader/turtletext.rb:42:in `content'
from /var/lib/gems/2.3.0/gems/pdf-reader-turtletext-0.2.2/lib/pdf/reader/turtletext.rb:87:in `text_in_region'
from /var/lib/gems/2.3.0/gems/pdf-reader-turtletext-0.2.2/lib/pdf/reader/turtletext/textangle.rb:134:in `text'
from how_to_instantiate.rb:28:in `<main>'
Makefile:4: recipe for target 'how_to_instantiate' failed
Is this a bug in pdf-reader-turtletext or am I at fault?
TIA.
@cppljevans Please check out the fork from tkieley. I made a small patch so it will work with PDF Reader > 1.2
Thanks @MatthewSuttles; I cloned the tkieley fork and passed -I<patch> to the ruby command,
where <patch> is the location of the lib subdirectory in the downloaded fork, and the error
no longer appears.
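For anyone who would rather not pass -I on the command line, a rough in-script equivalent is to prepend the fork's lib directory to $LOAD_PATH before the requires. This is only a sketch: the path below is a hypothetical placeholder for wherever the fork was cloned, and I am assuming that entries at the front of $LOAD_PATH are searched before the installed gem.

```ruby
# Hypothetical location of the cloned fork's lib directory; adjust to your checkout.
fork_lib = File.expand_path('pdf-reader-turtletext/lib')

# Put the fork first on the load path, much like `ruby -I<patch>` does.
$LOAD_PATH.unshift(fork_lib) unless $LOAD_PATH.include?(fork_lib)

# With the fork on the load path, the original require should pick up the
# patched files (left commented out so this sketch runs without the gem):
# require 'pdf/reader/turtletext'
```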
However, the new fuzzed_y code uses .select instead of .find. Wouldn't find be faster, since it stops searching as soon as it finds the first item satisfying the condition? IOW, something like:
hash_sort.each do |precise_y|
  matching_y = output.map(&:first).find { |new_y|
    diff_y = new_y - precise_y
    diff_y.abs < y_precision
  } || precise_y
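To make the difference concrete, here is a small standalone sketch that counts how many comparisons each method performs. The y values and the precision are made-up numbers, not taken from the gem or from any PDF; only the comparison itself mirrors the fragment above.

```ruby
# Hypothetical data: y positions already emitted, and a new y to match.
existing_ys = [700.0, 650.0, 600.0, 550.0]
precise_y   = 701.2
y_precision = 3

# find returns the first element whose block is truthy and then stops.
find_checks = 0
found = existing_ys.find do |new_y|
  find_checks += 1
  (new_y - precise_y).abs < y_precision
end

# select always walks the whole collection and returns an array of matches.
select_checks = 0
selected = existing_ys.select do |new_y|
  select_checks += 1
  (new_y - precise_y).abs < y_precision
end

# find returns nil when nothing matches, so `|| precise_y` works as a fallback.
matching_y = found || precise_y

puts "find:   #{found.inspect} after #{find_checks} check(s)"
puts "select: #{selected.inspect} after #{select_checks} check(s)"
```

One general Ruby point to keep in mind: select returns an empty array when nothing matches, and an empty array is truthy, so a `|| precise_y` fallback only behaves the same way if the result is reduced with something like `.first` beforehand.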
For anyone still having this issue, I submitted a pull request (#10) to fix compatibility with pdf-reader 2.4.0. You can find it here: https://github.com/emmeryn/pdf-reader-turtletext in the update-gem branch.