
Implement Precompile Shader Archive

Open ptitSeb opened this issue 6 years ago • 83 comments

When using the GLES2 backend, every fixed-pipeline function (so OpenGL 1.x) can lead to the creation of a new shader program. Some can take a while to compile and link (for example when a lot of lights are involved), causing a "hiccup" in the game. That long loading time can be seen when launching Foobillard++ or Neverball, for example.

Because the Fixed Pipeline Emulator (FPE) always generates the same shader programs, those could be saved for later use: by creating an archive containing previously built FPE programs, and using the GL_OES_get_program_binary extension to save / load the program binary and avoid the compiling and linking part.

The PSA (Precompiled Shader Archive) will be available only if the extension is present and supports at least 1 format for program binaries.
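For illustration, the availability rule above could be sketched like this. It is not the gl4es code: `has_extension` and `psa_available` are hypothetical helper names, and the extension string and the format count are assumed to come from glGetString(GL_EXTENSIONS) and glGetIntegerv(GL_NUM_PROGRAM_BINARY_FORMATS_OES) in a real build.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if `name` appears as a whole space-separated token in `extensions`. */
static int has_extension(const char *extensions, const char *name)
{
    assert(name != NULL && name[0] != '\0');
    if (extensions == NULL)
        return 0;
    size_t len = strlen(name);
    const char *p = extensions;
    while ((p = strstr(p, name)) != NULL) {
        int starts_token = (p == extensions) || (p[-1] == ' ');
        int ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return 1;
        p += len;  /* partial match only, keep searching */
    }
    return 0;
}

/* The PSA is usable only with the extension AND at least one binary format. */
static int psa_available(const char *extensions, int num_binary_formats)
{
    return has_extension(extensions, "GL_OES_get_program_binary")
        && num_binary_formats > 0;
}
```

Matching whole tokens rather than using a bare strstr avoids a false positive on a longer extension name that merely starts with the same text.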

On Linux, the archive will be saved in the HOME folder, as a hidden file (named .gl4es.psa). On AmigaOS4, it will be in PROGDIR: as a hidden file (same name as on Linux).

TODO: Where to put the archive on Android
TODO: Where to put the archive on Emscripten

ptitSeb avatar Jul 17 '19 14:07 ptitSeb

Implemented in commit 601184b1b7e63cd1f85b5360207bb763efa8fe82 and fixed with 5a189a1f30ae73a4849a2e4f3f18259e6484ec55. Disabled by default for now.

ptitSeb avatar Jul 17 '19 14:07 ptitSeb

While Daniel is working on adding the necessary functions to the AmigaOS4 driver, I tested how it all behaves currently by setting LIBGL_NOPSA to 0.

So, I just ran Neverball and exited. As a result, it created a .gl4es.psa file in the root of the game (good!), which is 43 bytes in size (so just the header saying GL4ES PrecompiledShaderArchive plus some bytes of structure format or something). When I exit the game I also get "LIBGL: Saved a PSA with 0 Precompiled Programs" at the end (which is of course expected, as the functions are not implemented).

Though, shouldn't there be an error (or a log message) at run time, something like "LIBGL: Forcing to use PSA, but functions didn't work"? Or does that make no sense?

kas1e avatar Jul 17 '19 19:07 kas1e

Mmm, yes, the archive should not be created. I'll check that !

ptitSeb avatar Jul 17 '19 19:07 ptitSeb

@kas1e: with commit 37d3b629580f2289703d5ee83f701c38cabf08ef it should now ignore the PSA if the extension is not present (you will have to delete the previous empty one yourself).

ptitSeb avatar Jul 17 '19 19:07 ptitSeb

Btw, did you test the PSA on Pandora already? I.e. with Friking Shark while playing, and with the loading of Neverball, for example. It would be interesting to know whether everything works as expected now on your side with Pandora's GLES2.

kas1e avatar Jul 18 '19 05:07 kas1e

I tested quickly on the Pandora, and it seems promising yes. Starting time for Neverball is much faster now. I need to do more testing, but that looks good (I haven't tried Friking Shark yet).

ptitSeb avatar Jul 18 '19 05:07 ptitSeb

Does Pandora's GLES2 driver also need to assemble those precompiled shaders from SPIR-V, or are they already saved as machine code (not SPIR-V) in the .gl4es.psa file?

kas1e avatar Jul 18 '19 05:07 kas1e

No SPIR-V on the Pandora; it's already machine code specific to the PowerVR it uses.

ptitSeb avatar Jul 18 '19 06:07 ptitSeb

So if we take into account that Hans will not do it in Nova, then maybe it makes sense to ask Daniel to save the PSA not as SPIR-V but as Nova's assembly code (if Nova exposes such conversion functions publicly).

kas1e avatar Jul 18 '19 06:07 kas1e

yep

ptitSeb avatar Jul 18 '19 06:07 ptitSeb

But note that I don't think it's possible to access Nova's compiled shader programs if Hans doesn't write the equivalent of glProgramBinary(...) and glGetProgramBinary(...) into Nova. That's why Daniel proposed doing it the SPIR-V way.

ptitSeb avatar Jul 18 '19 06:07 ptitSeb

But imho those micro-pauses/hiccups happen because things need to be converted from one format to card-specific code in real time. When we have precompiled shaders (and if they are saved as machine code), there will be no need to convert anything, just send them when needed. So, imho, if only ogles2 has those functions, but the shaders are in machine code, then everything should be fine already, no? :)

But then another issue arises: machine-specific code will probably also be specific to each gfx card. With Pandora, as I understand it, there is just one single gfx card all the time, so all precompiled shaders will always work on all Pandoras. With AmigaOS4 (or any other desktop OS where we can change cards), we have different gfx cards, which means different machine code, which means that in the end we can't save machine-specific code, right? And on Pandora it works because there is one single card everywhere?

On the other hand, it really doesn't matter if it is different for different cards: it only means that we can't ship a release with an included .gl4es.psa, but one generated for the user on the user's machine will be used later with no problems. The only drawback is that the first run will have hiccups.

kas1e avatar Jul 18 '19 08:07 kas1e

Yes, the .gl4es.psa is supposed to be built on the target computer, not brought by the app. That means the 1st run will still get the hiccups, but later runs will not. It's the same on the Pandora: there are 3 different models with slightly different hardware, and there are a lot of driver versions the user can choose from, so I don't plan to put any pre-baked .gl4es.psa in games; one will be built by itself.

ptitSeb avatar Jul 18 '19 08:07 ptitSeb

Aha, got it. So let's wait and see whether Daniel's implementation, with SPIR-V on top of Nova's conversion, gives enough of a boost or not. If it is enough, then our version will even be portable across different gfx cards; if not, then we will probably need to ask Hans to provide a public API to generate machine-ready code.

kas1e avatar Jul 18 '19 08:07 kas1e

Hi :)

So Daniel sent me the first version with his implementation of the necessary functions. This is what the readme says:

  • Implemented the OES_get_program_binary extension as requested by Kasie Kasovich (kas1e). Because Nova doesn't provide any support here, this implementation works with the internal SPIR-V code of all shaders that make up the respective program. So the binaries here are no true machine-code but intermediate code, packed into a proprietary format. Providing such binaries essentially skips the vertex- and fragment-shader's GLSL compilation steps. Anyway, this means:
  • new function GetProgramBinaryOES to extract such a binary representation of the respective successfully linked shader program.
  • new function ProgramBinaryOES to supply the GL with such a cached binary.
  • support for the glGet parameter GL_NUM_PROGRAM_BINARY_FORMATS_OES - which returns 1 now.
  • support for the glGet parameter GL_PROGRAM_BINARY_FORMATS_OES, which returns one single format ID identifying my binary format (0xC00DA675)
  • support for the glGetProgramiv parameter GL_PROGRAM_BINARY_LENGTH_OES which gives you the buffer size required to store the binary for the respective program.
  • consequently add GL_OES_get_program_binary to the extension string.
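Since the driver reports its format IDs through GL_NUM_PROGRAM_BINARY_FORMATS_OES and GL_PROGRAM_BINARY_FORMATS_OES, a loader could check the format stored with a cached blob before handing it to ProgramBinaryOES. A minimal sketch, assuming the list has already been read from the driver; `binary_format_supported` is a hypothetical name, not something from gl4es or ogles2.library.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if `format` is among the `count` format IDs reported by the driver. */
static int binary_format_supported(unsigned int format,
                                   const unsigned int *formats, size_t count)
{
    assert(formats != NULL || count == 0);
    for (size_t i = 0; i < count; ++i)
        if (formats[i] == format)
            return 1;
    return 0;
}
```

With the single format from the readme (0xC00DA675), a blob saved under any other ID would be rejected and the program rebuilt from source instead.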

So, I ran Neverball with the new binary, and gl4es tells me in the output:

LIBGL: Extension GL_OES_get_program detected
LIBGL: Number of supported Program Binary Format: 1

And nothing more related to it.

So, after I exited Neverball, no .gl4es.psa file had been created. I ran a DOS tracer (to see whether it tries to create that file) and no, there was not even an attempt to create the file.

Any clue what to check next? :)

kas1e avatar Jul 23 '19 12:07 kas1e

Do you see "LIBGL: Shuting down" at the end of the program at least?

The function close_gl4es() from src/gl/init.c, at the end of the file and declared as a "destructor", should print it.

ptitSeb avatar Jul 23 '19 13:07 ptitSeb

Yeah, of course I see that one; I only printed the info relevant to the PSA. But here is the full log (supertuxkart 0.6.2a, latest gl4es):

4/0.Work:games/supertuxkart> supertuxkart_gl4es_1916 
LIBGL: Initialising gl4es
LIBGL: v1.1.1 built on Jul 22 2019 23:35:10
LIBGL: Using GLES 2.0 backend
LIBGL: Using Warp3DNova.library v1 revision 65
LIBGL: Using OGLES2.library v2 revision 9
LIBGL: OGLES2 Library and Interface open successfuly
LIBGL: Targeting OpenGL 2.0
LIBGL: Forcing NPOT support by disabling MIPMAP support for NPOT textures 
LIBGL: Not trying to batch small subsequent glDrawXXXX
LIBGL: Current folder is:/Work/games/supertuxkart
Data files will be fetched from: '.'
Highscores will be saved in './.supertuxkart/highscore.data'.
LIBGL: Hardware test on current Context...
LIBGL: Hardware Limited NPOT detected and used
LIBGL: Extension GL_EXT_blend_minmax detected and used
LIBGL: FBO are in core, and so used
LIBGL: PointSprite are in core, and so used
LIBGL: CubeMap are in core, and so used
LIBGL: BlendColor is in core, and so used
LIBGL: Blend Substract is in core, and so used
LIBGL: Blend Function and Equation Separation is in core, and so used
LIBGL: Texture Mirrored Repeat is in core, and so used
LIBGL: Extension GL_OES_mapbuffer detected
LIBGL: Extension GL_OES_element_index_uint detected and used
LIBGL: Extension GL_OES_packed_depth_stencil detected and used
LIBGL: Extension GL_EXT_texture_format_BGRA8888 detected and used
LIBGL: Extension GL_OES_texture_float detected and used
LIBGL: high precision float in fragment shader available and used
LIBGL: Extension GL_EXT_frag_depth detected and used
LIBGL: Max vertex attrib: 16
LIBGL: Extension GL_OES_get_program detected
LIBGL: Number of supported Program Binary Format: 1
LIBGL: Max texture size: 16384
LIBGL: Max Varying Vector: 32
LIBGL: Texture Units: 8(8), Max lights: 8, Max planes: 6
LIBGL: Extension GL_EXT_texture_filter_anisotropic detected and used
LIBGL: Max Anisotropic filtering: 16
LIBGL: Hardware vendor is A-EON Technology Ltd. Written by Daniel 'Daytonta675x' Müßener @ GoldenCode.eu
LIBGL: OGLES2 Library and Interface closed
LIBGL: Shuting down 

kas1e avatar Jul 23 '19 13:07 kas1e

I also asked Daniel to recheck the new stuff; he says that glGetProgramiv with GL_PROGRAM_BINARY_LENGTH_OES works and glGetProgramBinaryOES apparently works too.

EDIT: and glProgramBinaryOES works too

kas1e avatar Jul 23 '19 13:07 kas1e

Btw, measuring things with some test examples on OS4, we found that of course all the cached binaries should be preloaded right at startup, because loading binaries from disk on demand can slow things down just as before.

But I assume that gl4es scans for .gl4es.psa at the beginning, and if it is found and the functions/extensions work, it then preloads the binaries into memory, right?

kas1e avatar Jul 23 '19 14:07 kas1e

Check in https://github.com/ptitSeb/gl4es/blob/master/src/gl/init.c#L569 if the name of the PSA is correct, using some printf

ptitSeb avatar Jul 23 '19 14:07 ptitSeb

> Btw, measuring things with some test examples on OS4, we found that of course all the cached binaries should be preloaded right at startup, because loading binaries from disk on demand can slow things down just as before.
>
> But I assume that gl4es scans for .gl4es.psa at the beginning, and if it is found and the functions/extensions work, it then preloads the binaries into memory, right?

Yes. The PSA file is loaded at gl4es init. Then, when an FPE shader needs to be created, it first checks the PSA (in memory), and creates the shader directly if present. If not present, the shader is created in the traditional way and added to the in-memory PSA (which is then flagged "dirty"). At gl4es shutdown, the PSA is written back to disk if it's flagged dirty.
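The flow described above (look up in memory, add on a miss, flag dirty, write back at shutdown only when dirty) can be sketched roughly as follows. This is an illustration only: the `psa_*` names, the `unsigned long` key and the on-disk layout are all made up here and are not the real gl4es structures or the real .gl4es.psa format.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define PSA_MAX 64  /* arbitrary capacity for this sketch */

typedef struct {
    unsigned long key;  /* hypothetical hash of the fixed-pipeline state */
    size_t size;        /* size of the program binary in bytes */
    void *blob;         /* private copy of the program binary */
} psa_entry;

typedef struct {
    psa_entry entries[PSA_MAX];
    size_t count;
    int dirty;          /* set when something was added since the last save */
} psa_cache;

/* Look up a binary by key; returns NULL on a miss. */
static const psa_entry *psa_find(const psa_cache *c, unsigned long key)
{
    for (size_t i = 0; i < c->count; ++i)
        if (c->entries[i].key == key)
            return &c->entries[i];
    return NULL;
}

/* Store a freshly built binary and flag the cache dirty. Returns 1 on success. */
static int psa_add(psa_cache *c, unsigned long key, const void *blob, size_t size)
{
    assert(c != NULL && blob != NULL && size > 0);
    if (c->count >= PSA_MAX || psa_find(c, key) != NULL)
        return 0;
    void *copy = malloc(size);
    if (copy == NULL)
        return 0;
    memcpy(copy, blob, size);
    c->entries[c->count].key = key;
    c->entries[c->count].size = size;
    c->entries[c->count].blob = copy;
    c->count++;
    c->dirty = 1;
    return 1;
}

/* Write the cache to `path` only if dirty.
   Returns 1 if written, 0 if there was nothing to do, -1 on I/O error. */
static int psa_flush(psa_cache *c, const char *path)
{
    if (!c->dirty)
        return 0;
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    int ok = 1;
    for (size_t i = 0; ok && i < c->count; ++i) {
        const psa_entry *e = &c->entries[i];
        ok = fwrite(&e->key, sizeof e->key, 1, f) == 1
          && fwrite(&e->size, sizeof e->size, 1, f) == 1
          && fwrite(e->blob, 1, e->size, f) == e->size;
    }
    if (fclose(f) != 0)
        ok = 0;
    if (!ok)
        return -1;
    c->dirty = 0;
    return 1;
}

/* Release all copies held by the cache. */
static void psa_clear(psa_cache *c)
{
    for (size_t i = 0; i < c->count; ++i)
        free(c->entries[i].blob);
    c->count = 0;
    c->dirty = 0;
}
```

The dirty flag is what keeps a run that only hits cached programs from rewriting the file at exit.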

ptitSeb avatar Jul 23 '19 14:07 ptitSeb

> Check in https://github.com/ptitSeb/gl4es/blob/master/src/gl/init.c#L569 if the name of the PSA is correct, using some printf

Something is wrong with the whole "if" starting at the if(hardext.prgbin_n>0) { line. Maybe a missing } or something somewhere, because none of the printfs there is reached.

As I see it now, all the file creation happens inside the if(globals4es.nopsa==0) { } block, which is probably not how it should be?

kas1e avatar Jul 23 '19 15:07 kas1e

Look closely: it's nopsa, so it should be 0 to use the PSA.

ptitSeb avatar Jul 23 '19 15:07 ptitSeb

This is how it looks for me now:

    if (getcwd(cwd, sizeof(cwd))!= NULL)
        SHUT(LOGD("LIBGL: Current folder is:%s\n", cwd));

    printf("hardext.prgbin_n = %d\n", hardext.prgbin_n);

    printf("uuuuuuuuuuuuuu\n");
    SHUT(LOGD("OMEGA: 2232323232323ooooo\n"));

    if(hardext.prgbin_n>0) {
        env(LIBGL_NOPSA, globals4es.nopsa, "Don't use PrecompiledShaderArchive");
        printf("eeeeeeeeee\n");

        if(globals4es.nopsa==0) {
            printf("erewrerrrrrr\n");
            cwd[0]='\0';
            // TODO: What to do on ANDROID and EMSCRIPTEN?
#ifdef __linux__
            const char* home = getenv("HOME");
            if(home)
                strcpy(cwd, home);
            if(cwd[strlen(cwd)]!='/')
                strcat(cwd, "/");
#elif defined AMIGAOS4
            strcpy(cwd, "PROGDIR:");
#endif

            SHUT(LOGD("OMEGA: ooooooooooooo\n"));
            if(strlen(cwd)) {
                SHUT(LOGD("OMEGA: bbbbbbbbbbbbbb\n"));
                strcat(cwd, ".gl4es.psa");
                printf("cwd = %s\n", cwd);
                fpe_InitPSA(cwd);
                fpe_readPSA();
            }
        }
    }
}

And in the log, I can see:

hardext.prgbin_n = 0
uuuuuuuuuuuuuu
OMEGA: 2232323232323ooooo

and no other printfs.
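Side note on the `__linux__` branch in the snippet: `cwd[strlen(cwd)]` is always the terminating NUL, so that test is always true and a "/" gets appended even when HOME already ends with one (and an unset HOME yields "/.gl4es.psa"). A possible way to build the path while checking the last character instead, as a sketch only; `build_psa_path` is a hypothetical helper, not a gl4es function, and treating a trailing ':' as "no separator needed" is an assumption made to cover "PROGDIR:".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Build "<dir>/.gl4es.psa" into `out`.
   Returns 1 on success, 0 if `dir` is empty or `out` is too small. */
static int build_psa_path(char *out, size_t out_size, const char *dir)
{
    static const char name[] = ".gl4es.psa";
    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    if (dir == NULL || dir[0] == '\0')
        return 0;
    size_t len = strlen(dir);
    /* Look at the LAST character (len - 1), not at the terminating NUL.
       An AmigaOS-style "PROGDIR:" ends in ':' and needs no separator. */
    char last = dir[len - 1];
    int need_sep = (last != '/' && last != ':');
    /* sizeof name already counts the terminating NUL. */
    if (len + (size_t)need_sep + sizeof name > out_size)
        return 0;
    memcpy(out, dir, len);
    if (need_sep)
        out[len++] = '/';
    memcpy(out + len, name, sizeof name);
    return 1;
}
```

Returning 0 for an empty directory mirrors the `if(strlen(cwd))` guard in the snippet, so no archive path is produced when no base folder is known.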

kas1e avatar Jul 23 '19 15:07 kas1e

Mmmm, wait, on AmigaOS there are no pre-tests, so prgbin_n is 0 by default. Which version of GLES2 has this extension available?

ptitSeb avatar Jul 23 '19 15:07 ptitSeb

The one where those new functions were added? 2.9

kas1e avatar Jul 23 '19 15:07 kas1e

Or do you mean the GLES2 standard?

kas1e avatar Jul 23 '19 15:07 kas1e

> The one where those new functions were added? 2.9

Yes, that. Because I'll probably need to hardcode prgbin_n = 1 on Amiga if the OGLES2 driver is >= 2.9 for now (until I find a way to create an offscreen context on Amiga so I can run some tests at gl4es startup).
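A rough sketch of that version gate, assuming the library version and revision are available as two integers (the log earlier in the thread shows "OGLES2.library v2 revision 9"). `lib_at_least` and `assumed_prgbin_n` are hypothetical names, not gl4es code.

```c
#include <assert.h>

/* Returns 1 if (version, revision) >= (min_version, min_revision). */
static int lib_at_least(int version, int revision,
                        int min_version, int min_revision)
{
    assert(version >= 0 && revision >= 0);
    if (version != min_version)
        return version > min_version;
    return revision >= min_revision;
}

/* Hypothetical: number of program binary formats to assume on AmigaOS4
   when no test context is available to query the driver. */
static int assumed_prgbin_n(int ogles2_version, int ogles2_revision)
{
    return lib_at_least(ogles2_version, ogles2_revision, 2, 9) ? 1 : 0;
}
```

Comparing the version first and the revision only on a tie matters here, because a plain revision check would let 1.22 pass as newer than 2.9.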

ptitSeb avatar Jul 23 '19 15:07 ptitSeb

Yeah, let's do it that way. Or you can simply assume that OGLES2 2.9 is the minimum for gl4es altogether.

kas1e avatar Jul 23 '19 15:07 kas1e

Mmm, wait, I have this in src/agl/agl.c:

#define MIN_OGLES2_LIB_VERSION 1
#define MIN_OGLES2_LIB_REVISION 22

So, 1.9 doesn't work...

ptitSeb avatar Jul 23 '19 15:07 ptitSeb