Trying to Build tesseract - Errors from SW

Open FlyinTeller opened this issue 2 years ago • 13 comments

Describe the bug: I am trying to build tesseract directly from source. I have cloned the GitHub repo and am running sw --trace build in the main directory. See the output in the attached file.

Expected behavior: Installation without error.

To Reproduce Steps to reproduce the behavior:

  1. Clone https://github.com/tesseract-ocr/tesseract.git
  2. cd tesseract
  3. sw --trace build

Information:

  • sw --version output:
sw.client.sw version 1.0.0
git revision 04536355191c090cecdaf978fd06fc2d09f26cf1
assembled on
15.12.2022 13:08:17 UTC
  • OS and version (host/target): Microsoft Windows 10, x64
  • Compiler and version: MSVC\14.36.32532, Microsoft Visual Studio 17 2022
  • (optional) sw logs using the -trace parameter: see attached file a.txt (attached due to length)

FlyinTeller avatar Jun 07 '23 07:06 FlyinTeller

I am integrating tesseract into my project and get a build error, too:

(...)
[131/140] [org.sw.demo.boost.filesystem-1.81.0].lib
Exception in file D:/dev/cppan2/client2/src/sw/builder/command.cpp:840, function execute1: When executing: [pub.egorpugin.primitives.command-0.3.1]/src/command.cpp
C:/Users/Robert/.sw/storage/pkg/9d/a6/8d0d/src/sdir/src/command/src/command.cpp(142): error C2039: "contains" is not a member of "std::basic_string<char,std::char_traits<char>,std::allocator<char>>".
C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include\xstring(4905): note: see declaration of "std::basic_string<char,std::char_traits<char>,std::allocator<char>>"
C:/Users/Robert/.sw/storage/pkg/9d/a6/8d0d/src/sdir/src/command/src/command.cpp(147): note: see reference to function template instantiation "auto primitives::Command::setProgram::<lambda_1>::()::<lambda_1>::operator ()<const char(&)[6]>(_T1) const" being compiled
        with
        [
            _T1=const char (&)[6]
        ]

Relevant part of my CMakeLists.txt:

set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

# Add Tesseract via SW
# Source: https://github.com/SoftwareNetwork/sw/tree/master/test/integrations/cmake/tess
set(SW_BUILD_SHARED_LIBS 1)
set(DEPENDENCIES
    org.sw.demo.google.tesseract.libtesseract-master
)
find_package(SW REQUIRED)
sw_add_package(${DEPENDENCIES})
sw_execute()

rsnitsch avatar Jun 13 '23 08:06 rsnitsch

I resorted to building tesseract from source with -DSW_BUILD=False and manually installing the necessary dependencies.

FlyinTeller avatar Jun 13 '23 08:06 FlyinTeller

Use the latest VS2022. I might need to update sw binary first to make it work.

egorpugin avatar Jun 13 '23 14:06 egorpugin

I get the same error when I compile tesseract 5.3.1 with sw on VS2019 and Windows 11. I then compiled tesseract 5.3.1 (x64) successfully on Windows 11 with vcpkg, cmake, and the command prompt, following workflows/vcpkg.yml:

  • Install cmake and vcpkg. Suppose vcpkg is installed at D:/vcpkg. Add the vcpkg and cmake directories to the Path environment variable.

  • Build and install Leptonica and the image libraries using vcpkg. They will be installed in D:/vcpkg/installed/x64-windows. vcpkg install leptonica:x64-windows

  • Build and install PkgConfig. The workflow doesn't contain this step, but compiling tesseract reports errors without PkgConfig. vcpkg install pkgconf:x64-windows

  • Install libarchive. The workflow doesn't contain this step either, but compiling tesseract reports errors without libarchive. As of 2023/6/26, the latest version of vcpkg could not install libarchive because of a bug, so download the precompiled x64 [libarchive] instead, unzip it, and copy the .h/.dll/.lib files to the corresponding folders in D:/vcpkg/installed/x64-windows.

  • Get the source code of tesseract 5.3.1, unzip it, change to its directory in the command prompt, and run the following commands. -DSW_BUILD=OFF disables SW; -DBUILD_SHARED_LIBS=ON generates a DLL.

cmake . -B build -DCMAKE_BUILD_TYPE=Release -DSW_BUILD=OFF -DOPENMP_BUILD=OFF -DBUILD_SHARED_LIBS=ON -DBUILD_TRAINING_TOOLS=OFF "-DCMAKE_TOOLCHAIN_FILE=D:/vcpkg/scripts/buildsystems/vcpkg.cmake"
cmake --build build --config Release --target install

livezingy avatar Jun 26 '23 07:06 livezingy

> Use the latest VS2022. I might need to update sw binary first to make it work.

Unfortunately, I cannot easily switch to VS 2022, because that breaks compatibility with another library that I use (OpenCV). When I try to use VS 2022 then CMake reports: Found OpenCV Windows Pack but it has no binaries compatible with your configuration. You should manually point CMake variable OpenCV_DIR to your build of OpenCV library.

I could compile OpenCV myself with VS 2022 as well, but currently I don't have the time for that.

For now I have resorted to using the tesseract binaries by Uni Mannheim and calling tesseract.exe as a subprocess (saving image file and providing the path to that).
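A minimal sketch of that subprocess approach, assuming tesseract is on PATH and that passing stdout as the output base makes the CLI print the recognized text to standard output. The helper names build_tesseract_command and run_and_capture are hypothetical, and the quoting is deliberately naive (it does not escape quotes inside the path):

```cpp
#include <array>
#include <cstdio>
#include <stdexcept>
#include <string>

// _popen/_pclose on Windows, popen/pclose on POSIX systems.
#ifdef _WIN32
#define POPEN _popen
#define PCLOSE _pclose
#else
#define POPEN popen
#define PCLOSE pclose
#endif

// Hypothetical helper: builds "tesseract "<image>" stdout -l <lang>".
// Assumes the tesseract executable is reachable via PATH.
std::string build_tesseract_command(const std::string& image_path,
                                    const std::string& lang = "eng") {
    return "tesseract \"" + image_path + "\" stdout -l " + lang;
}

// Hypothetical helper: runs a command through the shell and returns
// everything it wrote to standard output.
std::string run_and_capture(const std::string& command) {
    FILE* pipe = POPEN(command.c_str(), "r");
    if (!pipe) {
        throw std::runtime_error("failed to start: " + command);
    }
    std::string output;
    std::array<char, 4096> buffer{};
    std::size_t n = 0;
    while ((n = std::fread(buffer.data(), 1, buffer.size(), pipe)) > 0) {
        output.append(buffer.data(), n);
    }
    PCLOSE(pipe);
    return output;
}
```

With the image already saved to disk, run_and_capture(build_tesseract_command("C:/tmp/page.png")) would return the OCR text as a string. The approach needs no linking against libtesseract, so it sidesteps the compiler compatibility question at the cost of a process start and a file write per image.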

rsnitsch avatar Jun 30 '23 10:06 rsnitsch

I mean that you need to install VS2022 for sw itself. For your own project you are free to use any version you need.

egorpugin avatar Jun 30 '23 12:06 egorpugin

I get a new error now:

ninja: error: 'C:/Users/Robert/.sw/storage/pkg/b1/62/8e34/obj/bld/485381/lib/org.sw.demo.google.tesseract.libtesseract-5.3.1.lib', needed by 'CMakeFiles/MYPROJECT_autogen_timestamp_deps', missing and no known rule to make it

My CMakeLists.txt:

# Add Tesseract via SW
# Source: https://github.com/SoftwareNetwork/sw/tree/master/test/integrations/cmake/tess
set(SW_BUILD_SHARED_LIBS 1)
set(SW_DEPENDENCIES
    org.sw.demo.google.tesseract.libtesseract-5.3.1
)
find_package(SW REQUIRED)
sw_add_package(${SW_DEPENDENCIES})
sw_execute()

# ...

target_link_libraries(MYPROJECT PRIVATE
    # other deps...
    ${SW_DEPENDENCIES}
)

I tried to remove the line with SW_BUILD_SHARED_LIBS but it doesn't change anything.

sw version:

λ sw --version
sw.client.sw version 1.0.0
git revision 30611200c2168a3ca7f9d888c11ba375b4667d07
assembled on
13.06.2023 21:33:07 UTC
13.06.2023 23:33:07 Mitteleuropäische Sommerzeit

Visual Studio 2019 and 2022 are installed and up-to-date. I also have the Visual Studio 2017 build tools installed and updated.

rsnitsch avatar Aug 04 '23 08:08 rsnitsch

Hi,

I'm not sure that the Ninja generator works with sw. I tried your example and see the same error:

d:\dev\cppan2\client2\test\integrations\cmake\tess\ninja>cmake --build .
ninja: error: 'D:/dev/swst/pkg/b0/7f/f40a/obj/bld/108430/lib/org.sw.demo.google.tesseract.libtesseract-5.3.2.lib', needed by 'tess_example.exe', missing and no known rule to make it

Try building with a VS generator instead.

egorpugin avatar Aug 04 '23 10:08 egorpugin

I've made a fix for Ninja. https://github.com/SoftwareNetwork/sw/commit/7d2916a1aac00edfb0fb917ade42a23a5fcc7a85#diff-680a0d8c76eec8315c509b457b47195fd59aa97dcf5c70cad13e2e080d4f0074R331

Try adding the lines from that commit (without the version change at the top) to C:\users\u\.sw\storage\static\SWConfig.cmake

egorpugin avatar Aug 04 '23 14:08 egorpugin

Sorry @egorpugin, I am now using vcpkg to integrate tesseract.

My CMakeLists.txt now looks like this:

# Make vcpkg install all binaries (DLLs)
set(X_VCPKG_APPLOCAL_DEPS_INSTALL ON)

# Setting the toolchain did not work, so I include the vcpkg CMake script directly.
include(C:/vcpkg/scripts/buildsystems/vcpkg.cmake)

find_package(Tesseract REQUIRED)
include_directories( ${Tesseract_INCLUDE_DIRS} )

target_link_libraries(MYPROJECT PRIVATE
    Tesseract::libtesseract
)

rsnitsch avatar Aug 06 '23 12:08 rsnitsch

> Use the latest VS2022. I might need to update sw binary first to make it work.

Workaround to avoid buying the newest compiler just for sw or tesseract: in your user directory, open .sw/storage/pkg/9d/a6/8d0d/src/sdir/src/command/src/command.cpp and replace line 142

return s.contains("4nt.exe");

with

return s.find("4nt.exe") != std::string::npos;

Some very quick fella included C++23 there.
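The same substitution can be wrapped in a small helper if more than one call site needs it. This is a minimal sketch assuming a C++17 compiler. The free function contains below is a hypothetical helper, not part of sw or the primitives package:

```cpp
#include <string>
#include <string_view>

// Hypothetical C++17-compatible stand-in for std::string::contains,
// which was only added to the standard library in C++23.
inline bool contains(std::string_view haystack, std::string_view needle) {
    // find returns npos when the needle does not occur in the haystack.
    return haystack.find(needle) != std::string_view::npos;
}
```

A call such as contains(s, "4nt.exe") then behaves like s.contains("4nt.exe"), including returning true for an empty needle.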

MK-3PP avatar Sep 12 '23 12:09 MK-3PP

> Workaround to not buying the newest compiler for just sw or tesseract:

Is it possible to use VS 2022 build tools? Are they free?

egorpugin avatar Sep 13 '23 11:09 egorpugin

Idk, the community versions should be available. As for my case, the dev and build systems in our company run on VS 2019 Pro Licenses and I doubt CEOs and admins will be very happy to break the ecosystem for just this one line.

(We will upgrade eventually, but not for this.)

MK-3PP avatar Oct 11 '23 11:10 MK-3PP